Research Scientist/Engineer (Evaluations)

Full-Time | £80,000 - £120,000 / year (est.) | No home office possible
COL Limited

At a Glance

  • Tasks: Run evaluations on cutting-edge AI systems and automate testing pipelines.
  • Company: Join Apollo Research, a leader in AI risk assessment and evaluation.
  • Benefits: Enjoy a competitive salary, unlimited vacation, and a professional development budget.
  • Why this job: Be at the forefront of AI research and make a real impact.
  • Qualifications: Strong Python skills and a passion for AI model evaluation.
  • Other info: Dynamic team culture with opportunities for growth and collaboration.

Application deadline: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.

ABOUT THE OPPORTUNITY

We develop and run evaluations that help assess the risks posed by scheming AIs. You will get to work with frontier labs like OpenAI, Anthropic, and Google DeepMind and be amongst the first to interact with new models before anyone else. The ideal candidate loves rigorously testing frontier AI models, and enjoys building efficient pipelines and automating them.

YOU WILL HAVE THE OPPORTUNITY TO:

  • Run pre-deployment evaluation campaigns on the most capable AI systems in the world. We partner with multiple labs, giving you access to a breadth of models that no single AI lab could offer.
  • Deep dive into AI cognition. Scan through thousands of model transcripts to surface behavioural patterns that no one has ever observed before.
  • Build new evaluations for frontier risks, from designing novel test environments to scaling them across hundreds of distinct scenarios.
  • Work directly with frontier AI developers. Share your findings, engage with their feedback, and see your evaluations directly inform deployment decisions for the most capable AI systems in the world.
  • Automate and improve the evaluation pipeline. We already use automation across building, running, and analyzing evals.

KEY REQUIREMENTS

  • Software engineering skills: Our entire stack uses Python. We’re looking for candidates with strong software engineering experience.
  • Process optimisation: You always try to improve workflows.
  • Data Analysis & Pattern Recognition: You can extract signal from large, messy datasets.
  • Writing and communication: You succinctly convey qualitative and quantitative findings to both technical and non-technical audiences.
  • AI power-user: You are curious about the capabilities and propensities of frontier AI models.

(Bonus) We are using Inspect as our primary evals framework, and we value experience with it.

We want to emphasise that people who feel they don’t fulfill all of these characteristics, but nonetheless think they would be a good fit for the position, are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine. We don’t require a formal background or industry experience and welcome self-taught candidates.

BENEFITS

  • This role offers a market-competitive salary, equity, and competitive benefits.
  • Salary: 100k - 200k GBP (~135k - 270k USD)
  • Flexible work hours and schedule
  • Unlimited vacation
  • Unlimited sick leave
  • Lunch, dinner, and snacks are provided for all employees on workdays
  • Paid work trips, including staff retreats, business trips, and relevant conferences
  • A yearly $1,000 (USD) professional development budget

LOGISTICS

  • Time Allocation: Full-time
  • Location: The office is in London, and the building is shared with the London Initiative for Safe AI (LISA) offices. This is an in-person role.
  • Work Visas: We can sponsor UK visas

ABOUT APOLLO RESEARCH

The rapid rise in AI capabilities offers tremendous opportunities, but also presents significant risks. At Apollo Research, we’re primarily concerned with risks from Loss of Control, i.e. risks coming from the model itself rather than e.g. humans misusing the AI.

ABOUT THE TEAM

The current evals team consists of Jérémy Scheurer, Alex Meinke, Bronson Schoen, Felix Hofstätter, Axel Højmark, Teun van der Weij, Alex Lloyd and Mia Hopman.

Equality Statement: Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

How to apply: Please complete the application form with your CV. A cover letter is optional. Please also feel free to share links to relevant work samples.

About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 2.5 hours), 3 technical interviews, and a final interview with Marius (CEO).

Your Privacy and Fairness in Our Recruitment Process: We are committed to protecting your data, ensuring fairness, and adhering to workplace fairness principles in our recruitment process.

Research Scientist/Engineer (Evaluations) employer: COL Limited

Apollo Research is an exceptional employer, offering a dynamic work environment in London where innovation meets collaboration. Employees enjoy competitive salaries, unlimited vacation, and a strong emphasis on professional development, all while working alongside leading AI labs to tackle the most pressing challenges in AI safety. With a culture that prioritises truth-seeking and constructive feedback, Apollo fosters an inclusive atmosphere that encourages personal growth and meaningful contributions to the field of AI.

Contact Details:

COL Limited Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Research Scientist/Engineer (Evaluations) role

✨Tip Number 1

Get your hands dirty with some practical projects! Dive into LLM evals or similar tasks to showcase your skills. This not only helps you understand the role better but also gives you something tangible to discuss during interviews.

✨Tip Number 2

Network like a pro! Reach out to current employees or alumni from your university who work in AI. A friendly chat can give you insider info and might even lead to a referral, which is always a bonus!

✨Tip Number 3

Prepare for those technical interviews by brushing up on Python and data analysis techniques. Make sure you can talk about your past projects and how you tackled challenges, as this will show your problem-solving skills.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in joining our team at Apollo Research.

We think you need these skills to ace the Research Scientist/Engineer (Evaluations) role

Python Programming
Software Engineering
Process Optimisation
Data Analysis
Pattern Recognition
Quantitative Analysis
Qualitative Assessment
Writing Skills
Communication Skills
AI Model Evaluation
Automation
Problem-Solving
Collaboration
Adaptability

Some tips for your application 🫡

Show Your Passion for AI: When writing your application, let your enthusiasm for AI shine through! We want to see that you’re genuinely excited about working with frontier models and tackling the challenges they present.

Tailor Your CV: Make sure your CV highlights relevant experience, especially in Python and software engineering. We love seeing how you've tackled messy problems and optimised processes in your past roles!

Keep It Clear and Concise: Whether you choose to include a cover letter or not, clarity is key. We appreciate succinct communication, so make sure your findings and experiences are easy to digest for both technical and non-technical audiences.

Apply Through Our Website: Don’t forget to submit your application through our website! It’s the best way for us to receive your details and ensures you’re considered for this exciting opportunity.

How to prepare for a job interview at COL Limited

✨Know Your AI Models

Familiarise yourself with the latest AI models and their capabilities. Since you'll be working with frontier labs like OpenAI and Google DeepMind, understanding their models will help you engage in meaningful discussions during the interview.

✨Showcase Your Python Skills

Since the role requires strong software engineering skills in Python, be prepared to discuss your experience with it. Bring examples of projects where you've shipped and maintained production code, and be ready to explain how you tackled messy problems.

✨Demonstrate Process Optimisation

Highlight your ability to improve workflows. Share specific examples of how you've streamlined processes in previous roles, especially in fast-paced environments. This will show that you can handle the rapid pace of pre-deployment evaluations.

✨Prepare for Technical Interviews

The technical interviews will focus on tasks relevant to the job. Brush up on hands-on LLM evals projects and be ready to discuss your approach to building evaluations in Inspect. This will demonstrate your practical knowledge and readiness for the role.
