At a Glance
- Tasks: Develop APIs for AI inference and optimise machine learning models.
- Company: Join a cutting-edge AI team in London with a focus on innovation.
- Benefits: Competitive salary, equity options, and opportunities for professional growth.
- Why this job: Make an impact in AI by working on real-time inference systems.
- Qualifications: Experience with ML systems and deep learning frameworks required.
- Other info: Dynamic work environment with a focus on collaboration and creativity.
The predicted salary is between £36,000 and £60,000 per year.
Location: London
Employment Type: Full time
Department: AI
We are looking for an AI Inference Engineer to join our growing team. Our current stack includes Python, Rust, C++, PyTorch, Triton, CUDA, and Kubernetes. You will have the opportunity to work on large-scale deployment of machine learning models for real-time inference.
Responsibilities:
- Develop APIs for AI inference that will be used by both internal and external customers (a minimal sketch of such an endpoint follows this list)
- Benchmark and address bottlenecks throughout our inference stack
- Improve the reliability and observability of our systems and respond to system outages
- Explore novel research and implement LLM inference optimizations
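Purely as an illustration of the first responsibility above, here is a minimal sketch of what a real-time inference endpoint on a Python/PyTorch stack might look like. The framework choice (FastAPI), the route name, and the toy model are assumptions made for this sketch, not the team's actual API.

```python
# Hypothetical sketch of a real-time inference endpoint (FastAPI + PyTorch).
# The route, request schema, and toy model are illustrative assumptions.
import torch
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class InferenceRequest(BaseModel):
    prompt: str
    max_new_tokens: int = 64

# Placeholder model standing in for a real served LLM.
model = torch.nn.Linear(8, 8).eval()

@app.post("/v1/generate")
def generate(request: InferenceRequest) -> dict:
    # Inference runs without gradient tracking.
    with torch.no_grad():
        dummy_input = torch.randn(1, 8)
        output = model(dummy_input)
    # A real endpoint would tokenize the prompt and decode model output into text.
    return {"prompt": request.prompt, "output": output.squeeze(0).tolist()}
```

In practice such a service would be containerised and deployed on Kubernetes, with the placeholder model replaced by a served LLM or other trained model.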
Qualifications:
- Experience with ML systems and deep learning frameworks (e.g. PyTorch, TensorFlow, ONNX)
- Familiarity with common LLM architectures and inference optimization techniques such as continuous batching and quantization (a small quantization sketch follows this list)
- Understanding of GPU architectures or experience with GPU kernel programming using CUDA
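As a point of reference for the optimization techniques named above, here is a small, generic sketch of post-training dynamic quantization in PyTorch. It illustrates the concept only; the toy model and the choice of dynamic quantization (rather than the weight-only int8/int4 schemes often used for LLMs) are assumptions of this sketch.

```python
# Generic illustration of post-training dynamic quantization in PyTorch.
# The toy model is an assumption; real stacks quantize much larger models.
import torch
import torch.nn as nn

# Dynamic quantization targets nn.Linear (and nn.LSTM) modules.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# Weights are converted to int8; activations are quantized on the fly at runtime.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 256)
with torch.no_grad():
    # Same interface and output shape, smaller weights and faster int8 matmuls on CPU.
    print(model(x).shape, quantized(x).shape)
```

Continuous batching, the other technique mentioned, is a serving-side optimization: requests arriving at different times are merged into a single GPU batch by the inference server rather than being handled in model code.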
Final offer amounts are determined by multiple factors, including experience and expertise. In addition to the base salary, equity may be part of the total compensation package.
AI Inference Engineer employer: Pantera Capital
Contact Detail:
Pantera Capital Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the AI Inference Engineer role
✨Tip Number 1
Network like a pro! Reach out to folks in the AI and tech community on LinkedIn or at meetups. We all know that sometimes it’s not just what you know, but who you know that can help you land that dream job.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those involving Python, Rust, or any ML frameworks. We want to see what you can do, so make it easy for potential employers to check out your work.
✨Tip Number 3
Prepare for technical interviews by brushing up on your coding skills and understanding of GPU architectures. We recommend practicing common algorithms and data structures, as well as discussing your experience with inference optimizations.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are genuinely interested in joining our team.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience with ML systems and deep learning frameworks like PyTorch and TensorFlow. We want to see how your skills align with the role, so don’t be shy about showcasing relevant projects!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Tell us why you’re excited about the AI Inference Engineer position and how your background makes you a great fit. Be specific about your experience with inference optimisations and GPU programming.
Showcase Your Projects: If you've worked on any interesting projects related to AI or machine learning, make sure to mention them! We love seeing practical applications of your skills, especially if they involve real-time inference or novel research.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy – just follow the prompts!
How to prepare for a job interview at Pantera Capital
✨Know Your Tech Stack
Make sure you’re familiar with the technologies mentioned in the job description, like Python, Rust, C++, and PyTorch. Brush up on your knowledge of CUDA and Kubernetes too, as these will likely come up during technical discussions.
✨Showcase Your Problem-Solving Skills
Be prepared to discuss specific challenges you've faced in ML systems or deep learning frameworks. Think of examples where you’ve optimised inference processes or tackled bottlenecks, as this will demonstrate your hands-on experience and critical thinking.
✨Understand LLM Architectures
Familiarise yourself with common LLM architectures and inference optimisation techniques like continuous batching and quantisation. Being able to discuss these topics confidently will show that you’re not just a coder but someone who understands the underlying principles.
✨Prepare Questions
Have a few insightful questions ready about the team’s current projects or future goals. This shows your genuine interest in the role and helps you gauge if the company is the right fit for you.