Evaluation Scenario Writer - AI Agent Testing Specialist

Full-Time | £36,000 - £60,000 / year (est.) | No home office possible

At a Glance

  • Tasks: Design evaluation scenarios for AI agents and create structured test cases.
  • Company: Mindrift connects specialists with innovative AI projects to shape the future of technology.
  • Benefits: Enjoy remote work flexibility and part-time hours, and enhance your portfolio with cutting-edge AI experience.
  • Why this job: Contribute to impactful AI projects while working on your own schedule and learning new skills.
  • Qualifications: Bachelor's or Master's degree in relevant fields and 3+ years of experience required.
  • Other info: This is a fully remote freelance role, perfect for balancing with studies or other commitments.

The predicted salary is between £36,000 and £60,000 per year.

At Mindrift, innovation meets opportunity. We believe in using the power of collective human intelligence to ethically shape the future of AI.

About the Role

We’re looking for someone who can design realistic and structured evaluation scenarios for LLM-based agents. You will create test cases that simulate human-performed tasks, define gold-standard behavior to compare agent actions against, and ensure each scenario is clearly defined, well-scored, and easy to execute and reuse. A sharp analytical mindset, attention to detail, and an interest in how AI agents make decisions are essential.

Responsibilities

  • Create structured test cases that simulate complex human workflows.
  • Define gold-standard behavior and scoring logic to evaluate agent actions.
  • Analyze agent logs, failure modes, and decision paths.
  • Work with code repositories and test frameworks to validate your scenarios.
  • Iterate on prompts, instructions, and test cases to improve clarity and difficulty.
  • Ensure that scenarios are production-ready, easy to run, and reusable.

Requirements

  • Bachelor’s or Master’s Degree in Computer Science, Software Engineering, Data Science / Data Analytics, Artificial Intelligence / Machine Learning, Computational Linguistics / Natural Language Processing (NLP), Information Systems or related fields.
  • Background in QA, software testing, data analysis, or NLP annotation.
  • Good understanding of test design principles (e.g., reproducibility, coverage, edge cases).
  • Strong written communication skills in English.
  • Comfortable with structured formats like JSON/YAML for scenario description.
  • Can define expected agent behaviors (gold paths) and scoring logic.
  • Basic experience with Python and JavaScript.
  • Curious and open to working with AI-generated content, agent logs, and prompt-based behavior.
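
To illustrate the structured-format and gold-path requirements above, here is a minimal sketch of what a scenario description might look like and how it could be checked before use. The schema is an assumption for illustration only: the field names (`id`, `task`, `gold_path`, `scoring`), the action names, and the `load_scenario` helper are hypothetical and are not taken from Mindrift's actual format.

```python
import json

# Hypothetical scenario description; every field name here is illustrative.
SCENARIO_JSON = """
{
  "id": "refund-request-001",
  "task": "Process a customer refund request",
  "gold_path": ["open_ticket", "verify_order", "issue_refund", "close_ticket"],
  "scoring": {"pass_threshold": 0.75}
}
"""

# Fields this sketch assumes every scenario must define.
REQUIRED_FIELDS = {"id", "task", "gold_path", "scoring"}


def load_scenario(text: str) -> dict:
    """Parse a scenario and check that the assumed required fields exist."""
    scenario = json.loads(text)
    missing = REQUIRED_FIELDS - set(scenario)
    if missing:
        raise ValueError(f"scenario is missing fields: {sorted(missing)}")
    if not scenario["gold_path"]:
        raise ValueError("gold_path must list at least one expected action")
    return scenario


scenario = load_scenario(SCENARIO_JSON)
print(scenario["id"], len(scenario["gold_path"]))
```

Validating the structure up front is one way to keep scenarios reproducible and reusable: a malformed file fails at load time rather than partway through an agent run.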

Nice to Have

  • Experience in writing manual or automated test cases.
  • Familiarity with LLM capabilities and typical failure modes.
  • Understanding of scoring metrics (precision, recall, coverage, reward functions).
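
As a rough illustration of the scoring metrics mentioned above, the sketch below computes precision and recall of an agent's actions against a gold path. It is a simplified, set-based version: the action names and the `precision_recall` helper are hypothetical, and it deliberately ignores action order and repeated actions, which a real scoring scheme would likely need to handle.

```python
def precision_recall(agent_actions: list[str], gold_actions: list[str]) -> tuple[float, float]:
    """Set-based precision and recall of agent actions against a gold path."""
    agent, gold = set(agent_actions), set(gold_actions)
    matched = agent & gold
    # Precision: share of the agent's actions that appear in the gold path.
    precision = len(matched) / len(agent) if agent else 0.0
    # Recall: share of the gold-path actions that the agent performed.
    recall = len(matched) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical gold path and agent log for one scenario.
gold = ["open_ticket", "verify_order", "issue_refund", "close_ticket"]
agent = ["open_ticket", "issue_refund", "send_survey", "close_ticket"]

precision, recall = precision_recall(agent, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.75 recall=0.75
```

Here the agent skipped `verify_order` (lowering recall) and added an unexpected `send_survey` step (lowering precision), so both metrics come out at 0.75.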

Benefits

  • Work on your own schedule, from anywhere in the world.
  • Get paid for your expertise, with rates up to $50/hour depending on your skills, experience, and project needs.
  • Participate in a flexible, remote, freelance project that fits around your primary professional or academic commitments.
  • Participate in an advanced AI project and gain valuable experience to enhance your portfolio.
  • Influence how future AI models understand and communicate in your field of expertise.


Evaluation Scenario Writer - AI Agent Testing Specialist employer: Mindrift

At Mindrift, we foster a dynamic and innovative work culture that empowers our team to shape the future of AI through collaboration and creativity. As a remote employer, we offer flexible part-time opportunities that allow you to balance your professional commitments while working on cutting-edge AI projects that enhance your portfolio. Join us to be part of a community that values expertise and encourages continuous learning in a supportive environment.

Contact Details:

Mindrift Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Evaluation Scenario Writer - AI Agent Testing Specialist role

✨Tip Number 1

Familiarise yourself with the latest trends in AI and LLMs. Understanding how these technologies work will help you design more effective evaluation scenarios that are relevant to current industry standards.

✨Tip Number 2

Network with professionals in the AI field. Engaging with others who have experience in AI agent testing can provide insights into best practices and may even lead to referrals for the position.

✨Tip Number 3

Showcase your analytical skills by discussing past projects where you've designed test cases or worked with AI systems. Be prepared to explain your thought process and how you approached problem-solving in those scenarios.

✨Tip Number 4

Stay updated on the ethical considerations surrounding AI. Being knowledgeable about the ethical implications of AI technology will demonstrate your commitment to responsible AI development, which is crucial for this role.

We think you need these skills to ace the Evaluation Scenario Writer - AI Agent Testing Specialist role

Analytical Skills
Attention to Detail
Experience with LLM-based agents
Test Case Design
Scenario Development
Understanding of AI Decision-Making
Annotation Skills
Collaboration with Developers
Adaptability to Complex Guidelines
Strong English Proficiency (C1 or above)
Problem-Solving Skills
Knowledge of Computational Linguistics
Familiarity with Natural Language Processing (NLP)
Ability to Define Gold-Standard Behaviour

Some tips for your application 🫡

Understand the Role: Before applying, make sure you fully understand the responsibilities of an Evaluation Scenario Writer. Familiarise yourself with designing structured test scenarios and the importance of defining gold-standard behaviour for AI agents.

Tailor Your CV: Highlight your relevant experience in AI, data science, or software engineering. Emphasise any previous work involving scenario design or testing, and ensure your skills align with the job requirements.

Craft a Compelling Cover Letter: Write a cover letter that showcases your analytical mindset and attention to detail. Discuss your interest in AI and how your background makes you a suitable candidate for this role. Be sure to mention your ability to adapt to complex guidelines.

Proofread Your Application: Before submitting, carefully proofread your CV and cover letter. Check for any grammatical errors or typos, as these can create a negative impression. A polished application reflects your professionalism and attention to detail.

How to prepare for a job interview at Mindrift

✨Showcase Your Analytical Skills

As an Evaluation Scenario Writer, your analytical mindset is crucial. Be prepared to discuss specific examples of how you've designed test scenarios or evaluated AI outputs in the past. Highlight your attention to detail and how it has positively impacted your previous projects.

✨Understand AI Decision-Making

Familiarise yourself with how AI agents make decisions. During the interview, demonstrate your understanding of LLMs and their applications. This will show that you are not only qualified but also genuinely interested in the field of AI.

✨Prepare for Technical Questions

Expect technical questions related to your experience in computer science, data analytics, or machine learning. Brush up on relevant concepts and be ready to explain how you've applied them in real-world scenarios, especially in creating structured test cases.

✨Ask Insightful Questions

At the end of the interview, take the opportunity to ask thoughtful questions about the company's projects and future directions. This shows your enthusiasm for the role and helps you gauge if the company aligns with your career goals.
