Staff / Principal Machine Learning Engineer, Serving in London

Location: London · Full-Time · £140,000 to £200,000 per year (est.) · No working from home possible

At a Glance

Tasks: Develop and optimise cutting-edge machine learning models for real-time applications.
Company: Join a top AI research lab backed by major investors like Microsoft and Meta.
Benefits: Competitive salary, equity options, and a supportive work environment.
Other info: Dynamic team culture with opportunities for growth and open-source contributions.
Why this job: Make a real impact in AI with innovative technology and a flat structure.
Qualifications: Experience in ML systems, programming, and a passion for problem-solving.

The predicted salary is between £140,000 and £200,000 per year.

Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class realtime multimodal models and the only realtime orchestration platform optimized for thousands of queries per second. We have raised more than $125M from various investors. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We have also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.

Who We’re Looking For

A year ago, reliably working agentic systems and sub-second multimodal inference at scale barely existed. Nobody has a decade of experience here. So we’re not screening for a resume template — we’re looking for strong people from varied backgrounds who learn fast, thrive in ambiguity, and can show us what they’ve built, broken, and understood.

Experience We Find Useful

Inference Optimization: Deep understanding of modern serving frameworks such as vLLM or TRT-LLM, and of the techniques they implement.
Model Acceleration: Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding.
High-Performance Systems: Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs.
Distributed Systems & Scaling: Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections.
Public work: Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups.
Full-cycle ownership: You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production.
Background: PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems.

Who Thrives Here

You don’t need a roadmap to start walking; you’re comfortable picking a direction and building the map as you go. You believe engineering isn’t finished until it’s shipped and stable. You have a bias for impact over purely theoretical optimizations. You don’t just ship code; you obsess over the why. You’re the first to question an architecture if you think there’s a better way to solve the core latency or throughput problem. You aren’t satisfied with "the PM said so." You thrive on deep context and want to understand the fundamental logic behind every decision we make.

What Working Here Is Like

We hand you unclear problems and expect you to make them clear. We value engineers who say "I don’t know yet" and then design the benchmark or prototype that finds out. We treat performance, latency, and reliability as first-class product features, not a box to check before launch. Impact comes before everything else, though we support sharing work and open-source contributions that move the field forward. Your work should be visible. Flat structure, fast iterations, minimal process theater.

The base salary range for this full-time position is £140,000 – £200,000. In addition to base pay, total compensation includes equity and benefits. Within the range, individual pay is determined by work location, level, and additional factors, including competencies, experience, and business needs. The base pay range is subject to change and may be modified in the future.

Candidates must already have the legal right to work in the United Kingdom, as visa sponsorship is not available for this role. For candidates interested in relocating to the San Francisco Bay Area in the future, full U.S. visa and relocation support may be available, subject to business needs and applicable legal and work authorization requirements.

Staff / Principal Machine Learning Engineer, Serving in London employer: Inworld AI

Inworld AI is an exceptional employer that fosters a culture of innovation and collaboration, making it a strong fit for machine learning engineers passionate about advancing AI technologies. Employees in the UK benefit from competitive salaries, equity options, and a supportive environment that encourages professional growth and exploration of new ideas. Join a team where your contributions will directly impact the future of voice models and AI systems in a dynamic and forward-thinking setting.

Contact Details:

Inworld AI Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Staff / Principal Machine Learning Engineer, Serving role in London

✨Tip Number 1

Get your hands dirty with projects that showcase your skills. Build something cool, break it, and then fix it! This hands-on experience is what we want to see, so don’t be shy about sharing your work.

✨Tip Number 2

Networking is key! Connect with folks in the industry, attend meetups, or join online forums. You never know who might have a lead on an opportunity or can give you insider tips on landing that dream job.

✨Tip Number 3

When you get that interview, don’t just prepare for the technical questions. Be ready to discuss your thought process and how you tackle ambiguity. We love candidates who can think on their feet and adapt!

✨Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows us you’re genuinely interested in being part of our team at Inworld.

We think you need these skills to ace the Staff / Principal Machine Learning Engineer, Serving role in London

Inference Optimization
Modern serving frameworks
Model Acceleration
Quantization
Caching strategies
Continuous batching
Paged attention
Speculative decoding
High-Performance Systems
C++
CUDA
Rust
Optimized Python
Distributed Systems
Kubernetes
Multi-GPU/multi-node inference
Full-cycle ownership
PhD in CS, Physics, Math, or equivalent practical experience

Some tips for your application 🫡

Show Us Your Passion: When you're writing your application, let your enthusiasm for machine learning and AI shine through. We want to see what excites you about the field and how you've engaged with it in your past projects.

Be Specific About Your Experience: Don't just list your skills; tell us about the specific projects you've worked on. Highlight your hands-on experience with inference optimization or high-performance systems, and share any challenges you faced and how you overcame them.

Keep It Clear and Concise: We appreciate clarity! Make sure your application is easy to read and straight to the point. Avoid jargon unless it's necessary, and focus on communicating your ideas effectively.

Apply Through Our Website: We encourage you to apply directly through our website. This way, we can ensure your application gets the attention it deserves, and you can easily keep track of your application status!

How to prepare for a job interview at Inworld AI

✨Know Your Tech Inside Out

Make sure you have a solid grasp of the technologies mentioned in the job description, like inference optimisation and high-performance systems. Be ready to discuss your hands-on experience with tools like C++, CUDA, or Rust, and how you've applied them in real-world scenarios.

✨Showcase Your Projects

Prepare to talk about specific projects you've worked on that demonstrate your ability to build, break, and understand complex systems. Highlight any public work or open-source contributions that relate to machine learning or distributed systems, as this will show your passion and expertise.

✨Embrace Ambiguity

Inworld values candidates who can thrive in uncertain situations. Be prepared to discuss how you've tackled unclear problems in the past and what strategies you used to clarify and solve them. This will show that you're comfortable taking initiative and can adapt quickly.

✨Ask Insightful Questions

During the interview, don't hesitate to ask questions that dig deeper into the company's challenges and goals. This shows that you're not just interested in the role but also in understanding the bigger picture and how you can contribute to their success.
