AI Evaluations Engineer


ConnexAI

Full-Time 60000 - 80000 € / year (est.) Home office (partial)

At a Glance

Tasks: Define and measure AI quality, build datasets, and automate evaluation workflows.
Company: Join a leading tech firm focused on innovative AI solutions.
Benefits: Competitive salary, flexible working hours, and opportunities for professional growth.
Other info: Collaborative environment with exciting projects and career advancement.
Why this job: Make a real impact on AI systems that enhance user experiences.
Qualifications: Strong Python skills and experience with ML systems required.

The predicted salary is between 60000 and 80000 € per year.

This role sits at the centre of how we measure and improve AI systems in production. You’ll define what good performance means across LLMs, ASR, TTS, and full speech-to-speech pipelines, and build the datasets, metrics, and evaluation systems that make AI quality measurable and comparable in the real world. You’ll work closely with engineering and product teams to ensure model changes lead to real improvements in user experience, not just better offline benchmarks.

What you’ll do:

Design and run evaluations across LLM, ASR, TTS, and speech-to-speech systems
Build real-world datasets and test cases from production behaviour and edge cases
Define metrics and scorecards for model and system quality
Benchmark internal models against external and frontier systems
Build Python tools to automate evaluation workflows
Create internal leaderboards, red-teaming setups, and regression tests
Work with engineers and product teams to diagnose system failures
Turn vague product goals into measurable evaluation frameworks

What this role is about:

Defining and measuring AI quality in production systems
Turning real user behaviour into structured evaluation signals
Ensuring model changes improve real-world performance
Understanding why AI systems fail, not just whether they do

What good looks like:

You can translate improved quality into measurable metrics
You think in terms of system impact (before vs after), not just accuracy
You’re comfortable working across code, data, and production systems
You care about real-world behaviour, not just benchmarks

Core skills:

Strong Python (scripting, data analysis, tooling)
Experience with ML systems, evaluation, or experimentation
Understanding of LLMs or speech systems (ASR / TTS)
Ability to design test cases and structured datasets
Comfortable working with engineers and product teams

Nice to have:

Experience with LLM evaluation or benchmarking
Exposure to speech or multimodal systems
Familiarity with production APIs or ML systems
Experience with automated testing or CI-style workflows

AI Evaluations Engineer employer: ConnexAI

As an AI Evaluations Engineer, you will thrive in a dynamic and innovative environment that prioritises collaboration and continuous improvement. Our company fosters a culture of growth, offering ample opportunities for professional development while working on cutting-edge AI technologies that have a real-world impact. Located in a vibrant tech hub, we provide a supportive atmosphere where your contributions directly enhance user experiences and drive meaningful advancements in AI quality.

Contact Details:

ConnexAI Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the AI Evaluations Engineer role

✨Tip Number 1

Network like a pro! Reach out to folks in the AI and tech space, especially those who work with LLMs, ASR, and TTS. A friendly chat can open doors that a CV just can't.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repo showcasing your Python projects related to AI evaluations. This gives potential employers a taste of what you can do beyond the written application.

✨Tip Number 3

Prepare for interviews by diving deep into real-world examples of AI systems you've worked on. Be ready to discuss how you’ve turned vague goals into measurable metrics – that’s what they want to hear!

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team.

We think you need these skills to ace the AI Evaluations Engineer role

Python

Data Analysis

Machine Learning Systems

Evaluation and Experimentation

Understanding of LLMs

ASR (Automatic Speech Recognition)

TTS (Text-to-Speech)

Test Case Design

Structured Datasets

Collaboration with Engineering Teams

Benchmarking

Automated Testing

CI-style Workflows

Production APIs

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the AI Evaluations Engineer role. Highlight your experience with Python, ML systems, and any relevant projects that showcase your ability to define and measure AI quality.

Showcase Your Skills: Don’t just list your skills; demonstrate them! Use specific examples from your past work where you’ve designed evaluations or built datasets. This will help us see how you think in terms of system impact and real-world behaviour.

Be Clear and Concise: When writing your application, keep it clear and to the point. We appreciate straightforward communication, so avoid jargon unless it’s necessary. Make it easy for us to understand your qualifications and enthusiasm for the role.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on joining the StudySmarter team!

How to prepare for a job interview at ConnexAI

✨Know Your AI Systems

Make sure you brush up on your knowledge of LLMs, ASR, and TTS systems. Understand how they work and their real-world applications. This will help you articulate how you can measure and improve their performance during the interview.

✨Prepare Real-World Examples

Think of specific instances where you've designed evaluations or built datasets in previous roles. Be ready to discuss how these experiences relate to the job description, especially in terms of turning vague goals into measurable outcomes.

✨Showcase Your Python Skills

Since strong Python skills are crucial for this role, be prepared to discuss your experience with scripting, data analysis, and tooling. If possible, bring examples of Python tools you've built or used in evaluation workflows.

✨Collaborate and Communicate

This role involves working closely with engineers and product teams, so highlight your teamwork and communication skills. Be ready to discuss how you've successfully collaborated in the past to diagnose system failures or improve user experience.
