Staff ML Performance Engineer (Inference Optimisation) in London

London, Full-Time, £70,000 - £90,000 per year (est.), hybrid working

At a Glance

Tasks: Optimise ML inference for edge devices and contribute to groundbreaking AI projects.
Company: Wayve, a leader in Embodied AI technology for automated driving.
Benefits: Hybrid working policy, inclusive culture, and opportunities for professional growth.
Other info: Dynamic environment with a focus on collaboration and innovation.
Why this job: Join a team tackling complex challenges in the future of autonomous vehicles.
Qualifications: Experience in performance optimisation and strong software engineering skills required.

The predicted salary is between £70,000 and £90,000 per year.

Founded in 2017, Wayve is the leading developer of Embodied AI technology. Our advanced AI software and foundation models enable vehicles to perceive, understand, and navigate any complex environment, enhancing the usability and safety of automated driving systems. Our vision is to create autonomy that propels the world forward. Our intelligent, mapless, and hardware-agnostic AI products are designed for automakers, accelerating the transition from assisted to automated driving. In our fast-paced environment big problems ignite us—we embrace uncertainty, leaning into complex challenges to unlock groundbreaking solutions. We aim high and stay humble in our pursuit of excellence, constantly learning and evolving as we pave the way for a smarter, safer future. At Wayve, your contributions matter. We value diversity, embrace new perspectives, and foster an inclusive work environment; we back each other to deliver impact.

The role

As a Staff ML Performance Engineer, you’ll play a key role in high-impact projects, optimising ML inference for edge accelerators and GPUs. The focus of this team is to run large transformer-based models efficiently on low-cost, low-power edge devices to enable Wayve’s first driving product. You’ll help set the technical direction for turning these models into production systems that run reliably on in-vehicle compute. This is a hands-on role working across ML systems, compilers, runtimes, kernels, and embedded deployment, contributing to several early-stage, high-impact projects at Wayve.

Key responsibilities:

Profile and pinpoint bottlenecks across the full inference stack (model graph, compiler/runtime, kernel execution, memory movement) and deliver measurable improvements.
Implement and validate optimisations in compilers, runtimes, and/or kernels (e.g. operator fusion, scheduling, quantisation-aware performance, custom kernels).
Build robust benchmarking and regression testing to ensure performance improvements hold across models, devices, and software releases.
Optimise for multiple targets (e.g. NVIDIA Orin/Thor, Qualcomm) and work with teams to support these in a maintainable way.
Collaborate with model developers to influence architecture and training/deployment decisions that affect on-device performance.
Contribute to technical roadmaps and tooling and help raise the standard of performance engineering across the team.

About you

Essential

Proven experience improving performance in production systems with tight constraints (latency, memory, bandwidth, power/thermal, or cost).
Strong proficiency with at least one relevant stack/toolchain (e.g. TensorRT, CUDA, Qualcomm QNN, Triton, OpenCL) and confidence learning adjacent frameworks quickly.
Comfort operating at multiple levels of abstraction — from high-level model behaviour down to low-level kernel/runtime execution.
Strong software engineering fundamentals (debugging, profiling, testing, and maintainable code).
Clear communicator and collaborative teammate; able to align multiple stakeholders on performance trade-offs and priorities.

Desirable

Exposure to embedded or edge deployment of ML models, including benchmarking on real devices and handling system-level constraints.
Experience with NVIDIA and/or Qualcomm SoCs and performance tooling.
Python and C++ proficiency.
Experience mentoring others and/or driving technical direction in a small, fast-moving team.

This is a full-time role based in our office in London. At Wayve we want the best of all worlds so we operate a hybrid working policy that combines time together in our offices and workshops to fuel innovation, culture, relationships and learning, and time spent working from home.

Wayve is committed to creating an inclusive interview experience. If you require any accommodations or adjustments to participate fully in our interview process, please let us know. We understand that everyone has a unique set of skills and experiences and that not everyone will meet all of the requirements listed above. If you’re passionate about self-driving cars and think you have what it takes to make a positive impact on the world, we encourage you to apply.

At Wayve we're committed to creating a diverse, fair and respectful culture that is inclusive of everyone based on their unique skills and perspectives, and regardless of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, veteran status, pregnancy or related condition (including breastfeeding) or any other basis as protected by applicable law.

Employer: Wayve

Wayve is an exceptional employer that fosters a dynamic and innovative work culture, where software engineers can thrive while contributing to cutting-edge AI solutions. With a strong emphasis on employee growth, Wayve offers opportunities for professional development and collaboration in a hybrid work environment, allowing for flexibility and work-life balance. Located in a vibrant tech hub, employees benefit from a supportive community and access to industry-leading resources.

Contact Details:

Wayve Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Staff ML Performance Engineer (Inference Optimisation) role in London

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, especially those at Wayve. A friendly chat can open doors and give you insights that a job description just can't.

✨Tip Number 2

Show off your skills! If you've got a project or a portfolio that highlights your experience with ML performance engineering, make sure to share it during interviews. It’s a great way to demonstrate your hands-on expertise.

✨Tip Number 3

Prepare for technical discussions! Brush up on your knowledge of relevant stacks like TensorRT or CUDA. Being able to discuss your approach to optimising ML inference will impress the interviewers.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining the Wayve team.

We think you need these skills to succeed as a Staff ML Performance Engineer (Inference Optimisation) in London

ML Inference Optimisation

Performance Engineering

TensorRT

CUDA

Qualcomm QNN

Triton

OpenCL

Embedded Deployment

Benchmarking

Python

C++

Debugging

Profiling

Testing

Collaboration

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with ML performance optimisation. We want to see how your skills align with the role, so don’t hold back on showcasing relevant projects!

Show Your Passion: Let us know why you’re excited about working in the field of AI and self-driving cars. A genuine enthusiasm for the technology and its potential can really make your application stand out!

Be Clear and Concise: When writing your application, keep it straightforward. Use clear language to describe your experiences and achievements. We appreciate a well-structured application that’s easy to read.

Apply Through Our Website: We encourage you to submit your application directly through our website. It’s the best way to ensure we receive all your details and can consider you for the role. Plus, it shows you’re keen to join our team!

How to prepare for a job interview at Wayve

✨Know Your Tech Stack

Make sure you’re well-versed in the relevant stacks and toolchains like TensorRT, CUDA, or Qualcomm QNN. Brush up on how these tools can optimise ML inference, as you'll likely be asked to discuss your experience with them during the interview.

✨Showcase Problem-Solving Skills

Prepare to discuss specific examples where you've identified and resolved performance bottlenecks in production systems. Highlight your approach to tackling tight constraints like latency and memory, as this will demonstrate your hands-on experience and critical thinking.

✨Communicate Clearly

Practice articulating complex technical concepts in a way that’s easy to understand. Being a clear communicator is essential, especially when aligning stakeholders on performance trade-offs. Consider doing mock interviews to refine your delivery.

✨Collaborate and Contribute

Be ready to talk about your experience working in teams, especially in fast-paced environments. Share instances where you’ve influenced architectural decisions or mentored others, as collaboration is key at Wayve and they value team players.
