At a Glance
- Tasks: Design and improve ML platform systems for training and production serving.
- Company: Join Synthesia, the leading AI video platform trusted by Fortune 100 companies.
- Benefits: Competitive salary, remote work options, and opportunities for professional growth.
- Other info: Collaborative culture with a focus on innovation and real-world impact.
- Why this job: Shape the future of AI with hands-on ownership in a dynamic environment.
- Qualifications: Strong experience with production systems and cloud infrastructure, plus strong coding skills.
The predicted salary is between €80,000 and €100,000 per year.
Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US. As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the centre of successful organizations.
We’re looking for a senior engineer to join the ML Platform team at Synthesia. Our team builds and operates the systems that allow researchers and product teams to train, serve, and deploy generative models reliably and efficiently. This includes research infrastructure, production serving systems, internal tooling, and the platform interfaces that connect them. A growing part of our mission is making these systems more automation-friendly and agent-oriented, so that workflows can increasingly be operated through reliable tooling rather than manual effort.
We’re looking for a strong generalist with a systems mindset: someone who is comfortable working across infrastructure, backend systems, and tooling, and who has seen ML systems in practice. This is not a pure ML Engineer role. We’re especially interested in people who think deeply about reliability, scalability, performance, and resource efficiency in complex production environments. This is a hands-on senior IC role with significant ownership. You’ll help shape how our ML platform evolves as we scale the number of models, workloads, tools and teams relying on it.
What you’ll do:
- Design and improve the platform systems that support model training, evaluation, and production serving.
- Build infrastructure and tooling that make ML workloads more reliable, scalable, and cost-efficient.
- Develop internal tools and workflows that are easy for both humans and agents to operate.
- Work on the architecture behind how models are deployed, served, and operated across research and product environments.
- Improve how we schedule, monitor, and debug workloads running on GPUs and cloud infrastructure.
- Develop internal tools, abstractions, and agentic systems that reduce operational overhead for researchers and engineers.
- Drive improvements across observability, automation, reliability, and developer experience.
- Collaborate closely with researchers and product engineers to understand pain points and turn them into robust platform capabilities.
- Contribute to technical direction and make pragmatic architectural tradeoffs as the platform grows.
You’ll thrive in this role if you have:
- Strong experience building or operating production systems with a focus on reliability, scalability, and maintainability.
- A systems mindset: you naturally think in terms of bottlenecks, failure modes, interfaces, resource usage, and long-term operability.
- Solid hands-on experience with cloud infrastructure, Linux, and infrastructure automation.
- Experience with Kubernetes and operating distributed workloads in production.
- Strong coding skills, ideally in Python or similar languages used for backend systems and tooling.
- Strong judgment around where automation adds leverage, and where human control and reliability matter most.
- Experience building internal platforms, developer tooling, or infrastructure abstractions used by other engineers.
- Comfort working in ambiguous environments and taking ownership of open-ended technical problems.
- A pragmatic approach: you care about solving the right problem well, not over-engineering.
Particularly relevant experience:
- Operating ML infrastructure or model serving systems in production.
- Supporting research or data-intensive workloads.
- Working with GPU-based systems or other performance-sensitive infrastructure.
- Experience with observability and debugging in distributed systems.
- Familiarity with Terraform, Datadog, GitHub Actions, or similar tools.
Bonus points for:
- Experience building agentic or LLM-powered internal tools.
- Experience with workflow orchestration systems such as Temporal.
- Experience working at the boundary between research and production engineering.
- Familiarity with performance optimization, scheduling, or resource allocation problems.
- Experience building lightweight product or developer-facing tools.
Principal ML Platform Engineer employer: Synthesia
At Synthesia, we pride ourselves on being a forward-thinking employer that champions innovation and collaboration in the heart of London. Our dynamic work culture fosters creativity and growth, offering employees ample opportunities to develop their skills while working on cutting-edge AI technology. With significant backing from top-tier investors and a commitment to employee well-being, we provide a supportive environment where your contributions directly impact the future of visual communication.
StudySmarter Expert Advice🤫
We think this is how you could land the Principal ML Platform Engineer role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, especially those at Synthesia or similar companies. A friendly chat can open doors and give you insights that a job description just can't.
✨Tip Number 2
Show off your skills! If you've got a portfolio or GitHub with projects related to ML systems, make sure to highlight them. Real-world examples of your work can speak volumes about your capabilities.
✨Tip Number 3
Prepare for the interview by diving deep into Synthesia's products and tech stack. Understanding their challenges and how you can contribute will set you apart from other candidates.
✨Tip Number 4
Don't forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining the team.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Principal ML Platform Engineer role. Highlight your experience with production systems, cloud infrastructure, and any relevant tools like Kubernetes or Terraform.
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about AI and how your background makes you a great fit for our team. Share specific examples of how you've tackled complex problems in previous roles.
Showcase Your Projects: If you've worked on any relevant projects, whether personal or professional, make sure to include them. We love seeing hands-on experience, especially with ML infrastructure or developer tooling.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows us you're keen on joining our team!
How to prepare for a job interview at Synthesia
✨Know Your Stuff
Make sure you brush up on your knowledge of ML systems, cloud infrastructure, and the tools mentioned in the job description. Be ready to discuss your hands-on experience with Kubernetes, Python, and any relevant performance-sensitive infrastructure you've worked with.
✨Showcase Your Systems Mindset
During the interview, highlight your ability to think about reliability, scalability, and maintainability. Prepare examples that demonstrate how you've tackled bottlenecks or failure modes in past projects, and how you approach complex production environments.
✨Be Ready for Technical Challenges
Expect some technical questions or challenges related to building and operating production systems. Practice explaining your thought process clearly, especially when it comes to architectural trade-offs and automation versus manual control.
✨Collaborate and Communicate
Since this role involves working closely with researchers and product engineers, be prepared to discuss how you've collaborated in the past. Share examples of how you've turned pain points into robust platform capabilities, and emphasise your communication skills throughout the interview.