AI Inference Engineer

London · Full-Time · £114,000 – £144,000 / year (est.) · No home office possible
Pantera Capital

At a Glance

  • Tasks: Join us as an AI Inference Engineer, developing APIs and optimising machine learning models.
  • Company: Perplexity is a rapidly growing tech company revolutionising AI with over 10 million active users.
  • Benefits: Enjoy competitive salary, equity options, and comprehensive health benefits including dental and vision.
  • Why this job: Be part of a cutting-edge team impacting millions globally with innovative AI solutions.
  • Qualifications: Experience with ML systems, deep learning frameworks, and knowledge of LLM architectures required.
  • Other info: Flexible work environment with opportunities for growth in a billion-dollar valued startup.

The predicted salary is between £114,000 and £144,000 per year.

Location

London

Employment Type

Full time

Department

AI

We are looking for an AI Inference Engineer to join our growing team. Our current stack is Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.

Responsibilities

  • Develop APIs for AI inference that will be used by both internal and external customers
  • Benchmark and address bottlenecks throughout our inference stack
  • Improve the reliability and observability of our systems and respond to system outages
  • Explore novel research and implement LLM inference optimizations
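The benchmarking responsibility above can be made concrete with a small sketch. This is a hypothetical micro-benchmark, not the team's actual tooling: `fake_decode_step` is a stand-in for a real model call, and the percentile helper is a minimal illustration of how per-stage latency in an inference stack might be measured.

```python
import time
import statistics

def benchmark(fn, n_iters=50, warmup=5):
    """Time `fn` over n_iters runs after a short warmup; return latency stats in ms."""
    for _ in range(warmup):
        fn()  # warm caches / JIT before measuring
    samples = []
    for _ in range(n_iters):
        t0 = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - t0) * 1000.0)
    samples.sort()
    return {
        "p50_ms": statistics.median(samples),
        "p95_ms": samples[int(0.95 * (len(samples) - 1))],
        "mean_ms": statistics.fmean(samples),
    }

def fake_decode_step():
    # Placeholder for a single decode step of a model.
    time.sleep(0.001)

stats = benchmark(fake_decode_step)
print(stats)
```

Reporting tail latency (p95) alongside the median matters for real-time serving, since outliers dominate user-perceived responsiveness.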

Qualifications

  • Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
  • Familiarity with common LLM architectures and inference optimization techniques (e.g. continuous batching, quantization, etc.)
  • Understanding of GPU architectures or experience with GPU kernel programming using CUDA
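One of the optimisation techniques named above, quantization, can be sketched in a few lines. This is a pure-Python illustration of symmetric per-tensor int8 quantization, chosen for readability; production systems would use library kernels (e.g. in PyTorch or TensorRT) rather than anything like this.

```python
def quantize_int8(weights):
    """Map float weights to int8 using a single symmetric per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127.0
    # Round to nearest integer and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [qi * scale for qi in q]

w = [0.5, -1.27, 0.01, 1.27]
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
```

The round trip loses at most half a quantization step per weight, which is the basic accuracy/memory trade-off an inference engineer reasons about when applying such techniques.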

Final offer amounts are determined by multiple factors, including experience and expertise.

Equity: In addition to the base salary, equity may be part of the total compensation package.


AI Inference Engineer employer: Pantera Capital

At Perplexity, we pride ourselves on being an innovative leader in AI technology, offering our employees a dynamic work environment that fosters creativity and collaboration. With competitive compensation packages, including equity options, comprehensive health benefits, and a strong focus on professional development, we empower our team to grow alongside our rapidly expanding company. Join us in our vibrant location, where you can contribute to cutting-edge projects that impact millions of users globally.
Pantera Capital

Contact Detail:

Pantera Capital Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land AI Inference Engineer

✨Tip Number 1

Familiarise yourself with our technology stack, especially Python and C++. Having hands-on experience with these languages will give you a significant edge during discussions and technical assessments.

✨Tip Number 2

Dive deep into LLM architectures and inference optimisation techniques. Understanding concepts like batching and quantisation will not only help you in the role but also impress us during your interviews.

✨Tip Number 3

Showcase any projects or experiences where you've deployed scalable, real-time model serving systems. Real-world examples of your work can set you apart from other candidates.

✨Tip Number 4

Engage with the AI community through forums or social media. Networking can provide insights into industry trends and may even lead to referrals, increasing your chances of landing an interview with us.

We think you need these skills to ace AI Inference Engineer

Proficiency in Python and C++
Experience with TensorRT-LLM
Knowledge of Kubernetes
Familiarity with machine learning systems
Deep learning frameworks expertise (PyTorch, TensorFlow, ONNX)
Understanding of LLM architectures
Inference optimisation techniques (e.g., continuous batching, quantisation)
Experience in deploying scalable model serving systems
System reliability and observability management
Benchmarking and performance optimisation skills
Problem-solving skills in real-time inference contexts
GPU architecture knowledge or CUDA programming experience

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights relevant experience with machine learning systems, deep learning frameworks like PyTorch and TensorFlow, and any work you've done with LLM architectures. Use keywords from the job description to catch the employer's attention.

Craft a Strong Cover Letter: In your cover letter, express your enthusiasm for the role and the company. Discuss specific projects where you've developed APIs for AI inference or optimised inference stacks, showcasing your problem-solving skills and technical expertise.

Showcase Relevant Projects: If you have worked on projects involving real-time model serving systems or have experience with CUDA programming, be sure to include these in your application. Provide links to your GitHub or portfolio to demonstrate your hands-on experience.

Highlight Continuous Learning: Mention any recent courses, certifications, or workshops related to AI, ML, or deep learning that you've completed. This shows your commitment to staying updated in a rapidly evolving field, which is crucial for an AI Inference Engineer.

How to prepare for a job interview at Pantera Capital

✨Showcase Your Technical Skills

Be prepared to discuss your experience with Python, C++, and deep learning frameworks like PyTorch and TensorFlow. Highlight specific projects where you've implemented machine learning models or optimised inference processes.

✨Understand the Company’s Technology Stack

Familiarise yourself with TensorRT-LLM and Kubernetes, as these are key components of the role. Demonstrating knowledge about how these technologies work together will show your genuine interest in the position.

✨Prepare for Problem-Solving Questions

Expect questions that assess your ability to troubleshoot and optimise AI inference systems. Think of examples where you've successfully identified bottlenecks and implemented solutions, particularly in real-time model serving.

✨Research LLM Architectures

Since knowledge of LLM architectures is crucial, brush up on the latest techniques in inference optimisation, such as batching and quantisation. Being able to discuss these topics will set you apart from other candidates.

