At a Glance
- Tasks: Own the data pipeline for AI character speech technology and enhance audio quality.
- Company: Exciting startup focused on innovative AI character technology.
- Benefits: Competitive compensation, equity, and remote work flexibility.
- Other info: Work remotely with a dynamic team and great growth potential.
- Why this job: Join a cutting-edge team and shape the future of AI conversations.
- Qualifications: Experience with large-scale speech data and strong ML skills in Python.
The predicted salary is between £60,000 and £80,000 per year.
Ready to own the data pipeline powering the voice of the next generation of AI characters? You'll be joining a well-funded startup building AI character technology, where speech is a core part of the product experience. Think highly natural conversations that handle interruptions, personality shifts and more!
Your focus:
- Own the full data lifecycle — defining specs, auditing and curating large-scale audio and text corpora.
- Build automated quality metrics and dashboards across SNR, VAD, WER, speaker verification and safety, validated against listening tests.
- Train and deploy lightweight classifiers for noise detection, diarisation, language ID, and content moderation.
What you'll bring:
- Deep experience working with speech and audio data at scale — 1M+ hours.
- Strong ML engineering skills in Python and PyTorch, including training and fine-tuning models like Whisper or Wav2Vec.
- Practical knowledge of audio processing — torchaudio, librosa, spectrograms, DSP basics.
- A solid understanding of audio quality metrics — MOS, WER, PESQ/STOI, SNR, speaker verification.
Nice to have:
- Experience with Spark/Beam, Airflow, SQL or similar data engineering tools.
- Open-source contributions or publications in speech or audio ML.
- Background in denoising and enhancement, and how it affects downstream model quality.
Remote, with a preference for European or overlapping timezones. Competitive compensation and equity.
ML Engineer, Speech Data. Employer: techire ai
Contact details: techire ai Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the ML Engineer, Speech Data role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry on LinkedIn or attend relevant meetups. We can’t stress enough how personal connections can open doors that applications alone can’t.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those related to speech and audio data. We love seeing practical examples of your work, so make sure to highlight your experience with Python and PyTorch.
✨Tip Number 3
Prepare for interviews by brushing up on common ML engineering questions and be ready to discuss your experience with audio processing. We want to see your thought process, so practice explaining your approach to solving problems.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, we’re always on the lookout for passionate candidates who are eager to own the data pipeline in AI character technology.
Some tips for your application 🫡
Show Your Passion for Speech Data: When you're writing your application, let us see your enthusiasm for working with speech and audio data. Share any personal projects or experiences that highlight your skills in this area, especially if you've tackled large datasets before!
Tailor Your CV to the Role: Make sure your CV is tailored specifically for the ML Engineer position. Highlight your experience with Python, PyTorch, and any relevant audio processing tools. We want to see how your background aligns with our needs!
Be Clear and Concise: Keep your application clear and to the point. Use bullet points where possible to make it easy for us to read through your qualifications and experiences. We appreciate a well-structured application that gets straight to the good stuff!
Apply Through Our Website: Don't forget to apply through our website! It’s the best way for us to keep track of your application and ensures you’re considered for the role. Plus, it shows you’re serious about the position!
How to prepare for a job interview at techire ai
✨Know Your Data Inside Out
Make sure you’re well-versed in the specifics of speech and audio data. Review your experience with large-scale datasets, especially if you've worked with 1M+ hours of audio. Be ready to discuss how you’ve handled messy audio and what steps you took to clean and curate it for training models.
✨Show Off Your ML Skills
Prepare to demonstrate your proficiency in Python and PyTorch. Have examples ready of models you've trained or fine-tuned, like Whisper or Wav2Vec. It’s a good idea to talk about the challenges you faced and how you overcame them during the training process.
✨Understand Audio Quality Metrics
Familiarise yourself with key audio quality metrics such as MOS, WER, and SNR. Be prepared to explain what each metric measures and how it relates to downstream model performance, and share any experience you have of validating automated metrics against listening tests.
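If you are asked to explain WER, it can help to have the calculation in mind. The following is a minimal sketch, not the employer's tooling: it computes WER as the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The function name `word_error_rate` is our own, and the sketch assumes plain whitespace tokenisation with no text normalisation, which real evaluations usually add.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words.

    Assumption: words are split on whitespace and compared exactly
    (no lowercasing or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat on mat")` gives 1/6, because one of the six reference words was deleted. Note that WER can exceed 1.0 when the hypothesis contains many insertions.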
✨Be Ready for Practical Questions
Expect practical questions that may involve building automated quality metrics or dashboards. Think about how you would approach tasks like noise detection or language ID. Showing your problem-solving skills in real time can really impress the interviewers.
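As a warm-up for this kind of question, here is one way a very simple automated quality check could be sketched. It is an illustration under stated assumptions, not a production method: samples are floats in the range -1.0 to 1.0, frames do not overlap, voice activity is decided by a fixed energy threshold (the -40 dB default is arbitrary), and the helper names `frame_powers`, `energy_vad` and `estimate_snr_db` are hypothetical. Real pipelines typically use trained VAD models and more robust noise estimates.

```python
import math
from typing import List, Optional, Sequence


def frame_powers(samples: Sequence[float], frame_len: int) -> List[float]:
    """Mean power of each non-overlapping frame; a trailing partial frame is dropped."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def energy_vad(powers: Sequence[float], threshold_db: float = -40.0) -> List[bool]:
    """Flag frames whose power exceeds threshold_db (dB relative to full scale 1.0).

    Assumption: a fixed threshold; it is a crude stand-in for a real VAD.
    """
    threshold = 10 ** (threshold_db / 10)
    return [p > threshold for p in powers]


def estimate_snr_db(powers: Sequence[float], speech_flags: Sequence[bool]) -> Optional[float]:
    """Rough SNR: mean power of speech frames over mean power of non-speech frames.

    Speech frames still contain noise, so this is strictly a
    (signal + noise) / noise ratio. Returns None when either class is empty.
    """
    speech = [p for p, is_speech in zip(powers, speech_flags) if is_speech]
    noise = [p for p, is_speech in zip(powers, speech_flags) if not is_speech]
    if not speech or not noise:
        return None
    noise_power = sum(noise) / len(noise)
    if noise_power == 0:
        return math.inf
    return 10 * math.log10((sum(speech) / len(speech)) / noise_power)
```

A typical use would be to compute `frame_powers` once per file, pass the result to `energy_vad`, and log the `estimate_snr_db` value alongside the file ID so that low-SNR recordings can be flagged for review or compared against listening-test scores.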