ML Engineer, Speech Data

Full-Time | £60,000 - £80,000 / year (est.) | Home office possible
techire ai

At a Glance

  • Tasks: Own the data pipeline for AI character speech technology and enhance audio quality.
  • Company: Exciting startup focused on innovative AI character technology.
  • Benefits: Competitive compensation, equity, and remote work flexibility.
  • Other info: Work remotely with a dynamic team and great growth potential.
  • Why this job: Join a cutting-edge team and shape the future of AI conversations.
  • Qualifications: Experience with large-scale speech data and strong ML skills in Python.

The predicted salary is between £60,000 and £80,000 per year.

Ready to own the data pipeline powering the voice of the next generation of AI characters? You'll be joining a well-funded startup building AI character technology, where speech is a core part of the product experience. Think highly natural conversations that handle interruptions, personality shifts and more!

Your focus:

  • Own the full data lifecycle — defining specs, auditing and curating large-scale audio and text corpora.
  • Build automated quality metrics and dashboards across SNR, VAD, WER, speaker verification and safety, validated against listening tests.
  • Train and deploy lightweight classifiers for noise detection, diarisation, language ID, and content moderation.
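As a rough illustration of the kind of automated quality metric mentioned above, here is a minimal sketch of an energy-based SNR estimate in Python. It assumes you already have separate speech and noise-only sample segments (for example, as picked out by a VAD); the function name `snr_db` and the toy sample values are illustrative assumptions, not anything specified by the employer.

```python
import math


def snr_db(signal, noise):
    """Estimate signal-to-noise ratio in decibels from two sample sequences.

    `signal` holds samples from a speech region, `noise` holds samples from a
    noise-only region. Both are plain sequences of floats (an assumption made
    for this sketch).
    """
    if len(signal) == 0 or len(noise) == 0:
        raise ValueError("both segments must contain at least one sample")

    # Mean power of each segment (average of squared amplitudes).
    p_signal = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)

    if p_noise == 0:
        return math.inf   # no measurable noise
    if p_signal == 0:
        return -math.inf  # no measurable signal

    # SNR in dB is 10 * log10 of the power ratio.
    return 10 * math.log10(p_signal / p_noise)


# Toy example: amplitude ratio of 10 gives a power ratio of 100.
speech = [1.0, -1.0] * 4
background = [0.1, -0.1] * 4
print(round(snr_db(speech, background), 2))
```

In practice a pipeline would compute this per utterance and flag files below a chosen threshold; the threshold itself would need validating against listening tests, as the listing notes.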

What you'll bring:

  • Deep experience working with speech and audio data at scale — 1M+ hours.
  • Strong ML engineering skills in Python and PyTorch, including training and fine-tuning models like Whisper or Wav2Vec.
  • Practical knowledge of audio processing — torchaudio, librosa, spectrograms, DSP basics.
  • A solid understanding of audio quality metrics — MOS, WER, PESQ/STOI, SNR, speaker verification.

Nice to have:

  • Experience with Spark/Beam, Airflow, SQL or similar data engineering tools.
  • Open-source contributions or publications in speech or audio ML.
  • Background in denoising and enhancement, and how it affects downstream model quality.

Remote, with a preference for European or overlapping timezones. Competitive compensation and equity.

ML Engineer, Speech Data employer: techire ai

Join a dynamic and innovative startup at the forefront of AI character technology, where your contributions will directly shape the future of natural conversations. With a strong emphasis on employee growth, you'll have access to competitive compensation, equity options, and a collaborative work culture that values creativity and technical expertise. Embrace the flexibility of remote work while being part of a passionate team dedicated to pushing the boundaries of speech data engineering.

Contact Details:

techire ai Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the ML Engineer, Speech Data role

✨Tip Number 1

Network like a pro! Reach out to folks in the industry on LinkedIn or attend relevant meetups. We can’t stress enough how personal connections can open doors that applications alone can’t.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those related to speech and audio data. We love seeing practical examples of your work, so make sure to highlight your experience with Python and PyTorch.

✨Tip Number 3

Prepare for interviews by brushing up on common ML engineering questions and be ready to discuss your experience with audio processing. We want to see your thought process, so practice explaining your approach to solving problems.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, we’re always on the lookout for passionate candidates who are eager to own the data pipeline in AI character technology.

We think you need these skills to ace the ML Engineer, Speech Data role

Data Pipeline Management
Speech and Audio Data Processing
Machine Learning Engineering
Python
PyTorch
Model Training and Fine-Tuning
Audio Processing (torchaudio, librosa)
Understanding of Audio Quality Metrics (MOS, WER, PESQ/STOI, SNR)
Automated Quality Metrics Development
Noise Detection Classifiers
Diarisation
Language Identification
Content Moderation
Experience with Data Engineering Tools (Spark/Beam, Airflow, SQL)

Some tips for your application 🫡

Show Your Passion for Speech Data: When you're writing your application, let us see your enthusiasm for working with speech and audio data. Share any personal projects or experiences that highlight your skills in this area, especially if you've tackled large datasets before!

Tailor Your CV to the Role: Make sure your CV is tailored specifically for the ML Engineer position. Highlight your experience with Python, PyTorch, and any relevant audio processing tools. We want to see how your background aligns with our needs!

Be Clear and Concise: Keep your application clear and to the point. Use bullet points where possible to make it easy for us to read through your qualifications and experiences. We appreciate a well-structured application that gets straight to the good stuff!

Apply Through Our Website: Don't forget to apply through our website! It’s the best way for us to keep track of your application and ensures you’re considered for the role. Plus, it shows you’re serious about joining the team at techire ai!

How to prepare for a job interview at techire ai

✨Know Your Data Inside Out

Make sure you’re well-versed in the specifics of speech and audio data. Brush up on your experience with large-scale datasets, especially if you've worked with 1M+ hours of audio. Be ready to discuss how you’ve handled messy audio and what steps you took to clean and curate it for training models.

✨Show Off Your ML Skills

Prepare to demonstrate your proficiency in Python and PyTorch. Have examples ready of models you've trained or fine-tuned, like Whisper or Wav2Vec. It’s a good idea to talk about the challenges you faced and how you overcame them during the training process.

✨Understand Audio Quality Metrics

Familiarise yourself with key audio quality metrics such as MOS, WER, and SNR. Be prepared to explain how these metrics impact model performance and share any experiences you have with validating these metrics against listening tests.
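If it helps to have a concrete example to talk through, here is a minimal sketch of word error rate (WER) computed as word-level edit distance divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions; real evaluations usually also normalise text (casing, punctuation) first, which this sketch skips.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far (minus the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr

    return prev[-1] / len(ref)


# One substitution ("sat" -> "sit") and one deletion ("the") over 6 words.
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is worth mentioning if you are asked about its limitations as a quality metric.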

✨Be Ready for Practical Questions

Expect practical questions that may involve building automated quality metrics or dashboards. Think about how you would approach tasks like noise detection or language ID. Showing your problem-solving skills in real time can really impress the interviewers.

