Senior Research Engineer - Video Foundation Models (Pre - Training) in London

London Full-Time £43,200 - £72,000 / year (est.) No home office possible
Synthesia

At a Glance

  • Tasks: Develop cutting-edge video models and optimise human-centric video generation.
  • Company: Join Synthesia, a leading AI video platform trusted by Fortune 100 companies.
  • Benefits: Competitive salary, stock options, remote work, and generous annual leave.
  • Why this job: Make a real-world impact with your research in a fast-growing AI company.
  • Qualifications: Strong experience in deep learning, Python, and PyTorch; video model experience preferred.
  • Other info: Collaborative culture focused on innovation and high engineering standards.

The predicted salary is between £43,200 and £72,000 per year.

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US. As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations.

About the role

As a Research Engineer in our Video Pre-Training team, you will help build the next generation of production‑grade foundation models for human‑centric video generation. You will join a highly focused team working at the intersection of large‑scale generative modeling, distributed systems, and production engineering. Our mission is to develop and optimize video base models that power realistic, controllable, and emotionally expressive synthetic humans at scale. This is not pure research. This is applied research with direct product impact.

What you’ll do:

  • Develop and scale latent video diffusion models tailored for human‑centric video generation
  • Design conditioning mechanisms to improve control (pose, emotion, script, camera) without sacrificing fidelity
  • Advance distributed training strategies (DDP, FSDP, DeepSpeed, sequence parallelism) under real compute constraints
  • Improve training stability at multi‑node scale
  • Design rigorous evaluation frameworks combining automated metrics and structured human evaluation
  • Optimize inference for low latency, high resolution, and cost efficiency
  • Run controlled ablations and experiments to drive high‑signal modeling decisions
  • Contribute to high engineering standards: reproducibility, experiment tracking, CI/CD, monitoring

You will be expected to move fast, run multiple hypotheses in parallel, identify signal early, and focus on outcomes rather than exploration for its own sake.

What we’re looking for:

Must-have:

  • Strong experience training deep learning models at scale
  • Strong Python and PyTorch skills
  • Hands‑on experience with diffusion models (image domain required; video preferred)
  • Experience with large‑scale multi‑GPU / multi‑node training
  • Good understanding of distributed training (DDP, FSDP, DeepSpeed or similar)
  • Ability to design controlled experiments and interpret noisy results

Nice-to-have:

  • Experience with video diffusion models
  • Experience in avatar or human‑centric generation
  • Familiarity with world / interactive models
  • Experience with GANs or VAEs
  • Experience optimizing inference systems for production

Our stack:

  • Python, PyTorch, CUDA
  • DeepSpeed, distributed training & inference
  • Sequence parallelism
  • AWS, SLURM, Docker
  • GitHub, CI/CD pipelines

Who you are:

  • You are research‑driven but outcome‑focused
  • You care about shipping, not just publishing
  • You can explore multiple ideas quickly and drop low‑signal directions early
  • You communicate clearly and present results scientifically
  • You operate independently but collaborate actively across teams

Why join us?

  • Build production‑scale video foundation models in a fast‑growing Generative AI company
  • Work on human‑centric video generation with real‑world impact
  • Tackle hard problems in scaling, stability, and controllability
  • Influence the direction of next‑generation synthetic human technology
  • Join a highly technical, high‑ownership environment where your work ships

If you want to work on cutting‑edge generative video models and see your research power real‑world products, we’d love to talk.

Our culture:

At Synthesia we’re passionate about building, not talking, planning or politicising. We strive to hire the smartest, kindest and most unrelenting people and let them do their best work without distractions. Our work principles serve as our charter for how we make decisions, give feedback and structure our work to empower everyone to go as fast as possible.

The good stuff…

  • Competitive compensation (salary + stock options + bonus)
  • Fully remote from Europe, or a hybrid work setting with offices in London, Amsterdam, Zurich, and Munich
  • 25 days of annual leave + public holidays
  • Great company culture with the option to join regular planning and socials at our hubs + other benefits depending on your location

Employer: Synthesia

At Synthesia, we pride ourselves on being a leading AI video platform that fosters a dynamic and innovative work culture. Our employees enjoy competitive compensation, flexible remote or hybrid working options, and ample opportunities for professional growth in a fast-paced environment. Join us in London to make a real impact on the future of human-centric video generation while collaborating with some of the brightest minds in the industry.

Contact Detail:

Synthesia Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Research Engineer - Video Foundation Models (Pre - Training) role in London

✨Tip Number 1

Network like a pro! Reach out to people in the industry, attend meetups, and connect with current Synthesia employees on LinkedIn. A friendly chat can sometimes lead to opportunities that aren’t even advertised!

✨Tip Number 2

Show off your skills! If you’ve got projects or research that align with what Synthesia is doing, don’t hesitate to share them. Create a portfolio or GitHub repo showcasing your work with deep learning models and video generation.

✨Tip Number 3

Prepare for the interview by diving deep into Synthesia’s products and values. Understand their mission around human-centric video generation and think about how your experience can contribute to that vision.

✨Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in being part of the Synthesia team. Don’t miss out on this chance!

We think you need these skills to ace the Senior Research Engineer - Video Foundation Models (Pre - Training) role in London

Deep Learning
Python
PyTorch
Diffusion Models
Multi-GPU Training
Distributed Training
Experiment Design
Inference Optimization
CUDA
AWS
Docker
CI/CD Pipelines
Data Analysis
Communication Skills
Collaboration

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with deep learning models and Python. We want to see how your skills align with the role, so don’t hold back on showcasing relevant projects!

Show Your Passion for AI: Let us know why you’re excited about working in generative AI and video technology. Share any personal projects or research that demonstrate your enthusiasm and commitment to the field.

Be Clear and Concise: When writing your application, keep it straightforward. Use clear language and avoid jargon where possible. We appreciate a well-structured application that gets straight to the point!

Apply Through Our Website: We encourage you to submit your application through our website. It’s the best way for us to receive your details and ensures you’re considered for the role. Plus, it’s super easy!

How to prepare for a job interview at Synthesia

✨Know Your Models

Make sure you have a solid understanding of the deep learning models relevant to the role, especially diffusion models. Be prepared to discuss your hands-on experience with these models and how you've applied them in real-world scenarios.

✨Showcase Your Python Skills

Since strong Python and PyTorch skills are a must-have, brush up on your coding abilities. You might be asked to solve problems or even write code during the interview, so practice common algorithms and data structures in Python.

✨Prepare for Technical Questions

Expect questions about distributed training strategies and how you've tackled challenges in scaling models. Be ready to explain concepts like DDP, FSDP, and DeepSpeed, and share specific examples from your past work.

✨Demonstrate Your Research Mindset

This role is all about applied research, so be prepared to discuss how you balance exploration with outcome-focused results. Share examples of how you've designed controlled experiments and interpreted noisy results to drive high-signal modelling decisions.
