Senior Research Engineer - Voice in London

London | Full-Time | 70000 - 90000 € / year (est.) | Home office (partial)
Synthesia

At a Glance

  • Tasks: Join a team to develop cutting-edge AI voice technologies and enhance global communication.
  • Company: Leading AI video platform with a strong focus on innovation and collaboration.
  • Benefits: Competitive salary, flexible working options, and opportunities for professional growth.
  • Other info: Dynamic environment with exciting projects and career advancement opportunities.
  • Why this job: Make a real impact in the AI space and work with top-tier talent.
  • Qualifications: Expertise in ML, LLMs, and speech generation required.

The predicted salary is between 70000 and 90000 € per year.

Synthesia is the world’s leading AI video platform for business, used by over 90% of the Fortune 100. Founded in 2017, the company is headquartered in London, with offices and teams across Europe and the US. As AI continues to shape the way we live and work, Synthesia develops products to enhance visual communication and enterprise skill development, helping people work better and stay at the center of successful organizations.

Following our recent Series E funding round, where we raised $200 million, our valuation stands at $4 billion. Our total funding exceeds $530 million from premier investors including Accel, NVentures (Nvidia's VC arm), Kleiner Perkins, GV, and Evantic Capital, alongside the founders and operators of Stripe, Datadog, Miro, and Webflow.

As a Research Engineer you will join a team of 40+ Researchers and Engineers within the R&D Department working on cutting-edge challenges in the Generative AI space, with a focus on creating high-quality, expressive and real-time synthetic voices. Within the team you’ll have the opportunity to work on the applied side of our research efforts and directly impact our solutions that are used worldwide by over 60,000 businesses.

If you are an expert in ML, LLMs, speech generation, or conversational models, this is your chance to make a global impact. You will join our Audio Post-Training Team, which works on generative speech and voice synthesis, ensuring our in-house voice models reach production-level quality, speed, and robustness.

Typical projects include:

  • Develop and evaluate streaming and speech-to-speech systems, enabling low-latency, interactive voice synthesis.
  • Adapt models for new conditioning inputs (emotion, speed, prosody, speaker control, etc.).
  • Implement post-training optimization techniques (quantization, pruning, distillation) to improve efficiency and latency in real-time speech generation.
  • Integrate and test novel architectures, such as neural codecs, diffusion, or flow-matching models, to enhance realism and responsiveness.
  • Contribute to defining new evaluation metrics for conversational speech, including latency-aware and online MOS prediction systems.
  • Stay updated with the latest research in audio diffusion, autoregressive models, neural codecs, and multimodal LLMs.
  • Apply DPO (Direct Preference Optimization) and distillation to fine-tune large-scale speech models.

What we're looking for:

  • Strong understanding of generative modeling, ideally applied to sequential or multimodal data.
  • Hands-on experience with large language models (LLMs) or similar transformer-based architectures.
  • High proficiency in PyTorch, including experience with distributed training and model optimization.
  • Solid grasp of time-series modeling and tokenization, preferably in the context of audio or speech.
  • Demonstrated ability to prototype quickly, test hypotheses, and iterate efficiently.
  • Proven experience in training deep learning models end-to-end, from data preparation to evaluation.
  • Strong general software engineering skills, enabling contributions to a large, shared research infrastructure.

Nice to have:

  • Experience with real-time or streaming architectures is a big plus.
  • Familiarity with state-of-the-art architectures in audio and speech generation (e.g., diffusion models, neural codecs, flow-matching models, autoregressive decoders).
  • Experience with speech-to-speech or text-to-speech (TTS) systems.
  • Evidence of original research contributions, such as publications in top-tier venues (e.g., ICASSP, Interspeech, NeurIPS, ICML) or open-source work.

Senior Research Engineer - Voice in London employer: Synthesia

At Synthesia, we pride ourselves on being at the forefront of AI innovation, offering a dynamic work environment that fosters creativity and collaboration. Our London headquarters is not only a hub for cutting-edge research but also a place where employees are encouraged to grow and develop their skills in a supportive culture. With significant funding backing our ambitious projects, we provide unique opportunities for impactful work that shapes the future of visual communication across the globe.

Contact Detail:

Synthesia Recruiting Team

StudySmarter Expert Advice🤫

We think this is how you could land the Senior Research Engineer - Voice role in London

Tip Number 1

Network like a pro! Reach out to current or former employees at Synthesia on LinkedIn. A friendly chat can give you insider info and might just get your foot in the door.

Tip Number 2

Show off your skills! Prepare a portfolio or a project that highlights your expertise in ML, LLMs, or speech generation. This can really set you apart from the crowd during interviews.

Tip Number 3

Practice makes perfect! Get ready for technical interviews by brushing up on your coding skills and understanding of generative models. You want to be confident when tackling those tricky questions!

Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen. Plus, we love seeing candidates who take the initiative to connect directly with us.

We think you need these skills to ace the Senior Research Engineer - Voice role in London

Generative Modeling
Large Language Models (LLMs)
Transformer-based Architectures
PyTorch
Distributed Training
Model Optimization
Time-Series Modeling

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Senior Research Engineer role. Highlight your experience with generative modelling, LLMs, and any relevant projects you've worked on. We want to see how your skills align with what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about AI and voice synthesis. Share specific examples of your work that relate to our projects, and let us know why you want to join Synthesia.

Showcase Your Projects: If you've got any cool projects or research papers, don’t forget to mention them! Whether it's a GitHub repo or a publication, we love seeing your hands-on experience and contributions to the field. It helps us understand your expertise better.

Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It’s straightforward and ensures your application goes directly to us. Plus, you’ll find all the info you need about the role there!

How to prepare for a job interview at Synthesia

Know Your Stuff

Make sure you brush up on generative modelling and large language models. Be ready to discuss your hands-on experience with PyTorch and any projects you've worked on that relate to speech generation or conversational models. This is your chance to show off your expertise!

Show Your Problem-Solving Skills

Prepare to talk about how you've tackled challenges in previous roles, especially those involving real-time or streaming architectures. Think of specific examples where you had to prototype quickly or iterate on a project. This will demonstrate your ability to think on your feet.

Stay Current

Familiarise yourself with the latest research in audio diffusion and neural codecs. Being able to reference recent advancements or trends in the field during your interview will show that you're passionate and engaged with the industry.

Ask Insightful Questions

Prepare thoughtful questions about Synthesia's projects and goals, particularly around their Audio Post-Training Team. This not only shows your interest but also helps you gauge if the company aligns with your career aspirations. Plus, it makes for a great conversation starter!