Principal Coding Annotator / LLM Evaluation Engineer

Full-Time | 80000 - 100000 € / year (est.) | No home office possible
Braintrust

At a Glance

  • Tasks: Create coding prompts and evaluate LLM outputs to improve model reliability.
  • Company: Join a cutting-edge AI team focused on large language models.
  • Benefits: Contract role with potential for long-term engagement, remote options available.
  • Other info: Collaborative environment with opportunities for mentorship and career growth.
  • Why this job: Work hands-on with advanced LLMs and make a real impact in AI.
  • Qualifications: 10+ years in software development, strong Python skills, and LLM evaluation experience.

The predicted salary is between 80000 and 100000 € per year.

We are building and evaluating state‑of‑the‑art large language models (LLMs) and are looking for experienced software engineers to join our evaluation and annotation team. This role sits at the intersection of real‑world software engineering, model evaluation, and applied AI, and is critical to improving model reliability, reasoning, and code quality.

This is a contract engagement, initially for 6 months, with potential for long‑term engagement.

Location: Paris or London preferred; remote within Europe is possible for strong candidates.

What You’ll Do

  • Create high‑quality coding prompts and reference answers (benchmark‑style, e.g. SWE‑Bench‑like problems).
  • Evaluate LLM outputs for code generation, refactoring, debugging, and implementation tasks.
  • Identify and document model failures, edge cases, and reasoning gaps.
  • Perform head‑to‑head evaluations between private LLMs (Mistral‑based) and leading external models.
  • Build or configure coding environments to support evaluation and reinforcement learning (RL).
  • Follow detailed annotation and evaluation guidelines with high consistency.

What We’re Looking For

  • 10+ years of professional software development experience.
  • Strong Python skills (required).
  • Knowledge of at least one additional programming language (bonus).
  • 1+ year of coding annotation and/or LLM evaluation experience (part‑time OK) in a major frontier AI lab or AI infrastructure company.
  • Prior code reviewer experience is a plus.
  • Proven ability to apply structured evaluation criteria and write clear technical feedback.
  • Fluent in English (written and spoken).
  • Team lead or mentoring experience is a strong plus.

Why This Role

  • Work hands‑on with cutting‑edge LLMs.
  • Apply real‑world engineering judgment to model evaluation and improvement.
  • High‑impact, technical work with a focused, senior team.

Principal Coding Annotator / LLM Evaluation Engineer employer: Braintrust

Join a forward-thinking company that values innovation and technical excellence, with a collaborative work culture where your expertise in software engineering and AI can truly shine. The role offers professional growth and the chance to make a meaningful impact on cutting-edge large language models, whether you are based in Paris or London, or working remotely from elsewhere in Europe as an exceptional candidate.

Contact Details:

Braintrust Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Principal Coding Annotator / LLM Evaluation Engineer role

Tip Number 1

Network like a pro! Reach out to your connections in the AI and software engineering fields. Attend meetups, webinars, or even online forums where you can chat with industry folks. You never know who might have a lead on that perfect role!

Tip Number 2

Show off your skills! Create a portfolio showcasing your coding prompts, evaluations, or any relevant projects. This is your chance to demonstrate your expertise in Python and LLM evaluation. Make it easy for potential employers to see what you can do!

Tip Number 3

Prepare for interviews by brushing up on your technical knowledge and soft skills. Practice explaining complex concepts clearly and concisely. Remember, they want to see how you think and solve problems, so be ready to tackle some coding challenges!

Tip Number 4

Don’t forget to apply through our website! We’re always on the lookout for talented individuals like you. Keep an eye on our job postings and make sure your application stands out by tailoring it to the specific role you’re after.

We think you need these skills to ace the Principal Coding Annotator / LLM Evaluation Engineer role

Software Development
Python
Additional Programming Language
Coding Annotation
LLM Evaluation
Code Review
Technical Feedback Writing

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience in software development and LLM evaluation. We want to see how your skills align with the role, so don’t hold back on showcasing relevant projects!

Showcase Your Technical Skills: Since strong Python skills are a must, be sure to include specific examples of your coding experience. If you’ve worked with other programming languages, mention those too! We love seeing a diverse skill set.

Be Clear and Concise: When writing your application, clarity is key. Use straightforward language and avoid jargon where possible. We appreciate well-structured applications that get straight to the point, especially when it comes to your technical feedback experience.

Apply Through Our Website: We encourage you to submit your application through our website. It’s the best way for us to keep track of your application and ensures you’re considered for the role. Plus, it’s super easy!

How to prepare for a job interview at Braintrust

Know Your Tech Inside Out

Make sure you brush up on your Python skills and any other programming languages you know. Be ready to discuss specific projects you've worked on, especially those involving coding annotation or LLM evaluation. This will show that you have the hands-on experience they’re looking for.

Prepare for Technical Questions

Expect to face questions about model evaluation and coding prompts. Think about how you would create high-quality coding prompts and reference answers. Practising coding problems similar to SWE-Bench can give you a solid edge in the interview.

Showcase Your Problem-Solving Skills

Be prepared to discuss how you identify and document model failures or reasoning gaps. Share examples of how you've tackled similar challenges in the past, as this will demonstrate your analytical thinking and attention to detail.

Highlight Teamwork and Mentoring Experience

If you have experience leading teams or mentoring others, make sure to bring it up. This role values collaboration, so sharing how you've successfully worked with others or guided junior engineers can set you apart from other candidates.