Applied Researcher (Product)
Applied Researcher (Product)

Applied Researcher (Product)

Full-Time 80000 - 120000 ÂŁ / year (est.) No home office possible
COL Limited

At a Glance

  • Tasks: Join our AGI safety team to transform AI research into practical tools for safety.
  • Company: Apollo Research, a leader in AI safety and innovation.
  • Benefits: Competitive salary, unlimited vacation, flexible hours, and professional development budget.
  • Why this job: Make a real impact on AI safety while working with cutting-edge technology.
  • Qualifications: 2+ years in empirical research with AI systems and strong Python skills.
  • Other info: Collaborative environment with opportunities for rapid growth and responsibility.

The predicted salary is between 80000 - 120000 ÂŁ per year.

Application deadline: We are conducting interviews actively and aim to fill this role as soon as we find someone suitable.

THE OPPORTUNITY

Join our new AGI safety product team and help transform complex AI research into practical tools that reduce risks from AI. As an applied researcher, you’ll work closely with our CEO (also Head of Product), product engineers and the Evals team software engineers to build tools that make AI agent safety accessible at scale for our customers. Our current focus is the monitoring of AI coding agents for AI safety and security failures. You will join a small team and will have significant ability to shape the team & tech, and have the ability to earn responsibility quickly. You will like this opportunity if you’re passionate about using empirical research to make AI systems safer in practice. You enjoy the challenge of translating theoretical AI risks into concrete detection mechanisms. You thrive on rapid iteration and learning from data. You want your research to directly impact real-world AI safety.

KEY RESPONSIBILITIES

  • Research & Development
  • Systematically collect and catalog coding agent failure modes from real-world instances, public examples, research literature, and theoretical predictions.
  • Design and conduct experiments to test monitor effectiveness across different failure modes and agent behaviors.
  • Build and maintain evaluation frameworks to measure progress on monitoring capabilities.
  • Iterate on monitoring approaches based on empirical results, balancing detection accuracy with computational efficiency.
  • Stay current with research on AI safety, agent failures, and detection methodologies.
  • Stay current with research into coding security and safety vulnerabilities.
  • Monitor Design & Optimization
    • Develop a comprehensive library of monitoring prompts tailored to specific failure modes (e.g., security vulnerabilities, goal misalignment, deceptive behaviors).
    • Experiment with different reasoning strategies and output formats to improve monitor reliability.
    • Design and test hierarchical monitoring architectures and ensemble approaches.
    • Optimize log pre-processing pipelines to extract relevant signals while minimizing latency and computational costs.
    • Implement and evaluate different scaffolding approaches for monitors, including chain-of-thought reasoning, structured outputs, and multi-step verification.
  • Future projects (likely not in the first 6 months)
    • Fine-tune smaller open-source models to create efficient, specialized monitors for high-volume production environments.
    • Design and build agentic monitoring systems that autonomously investigate logs to identify both known and novel failure modes.

    JOB REQUIREMENTS

    • 2+ years of experience conducting empirical research with large language models or AI systems.
    • Strong experience with AI coding agents, having extensively used and compared frontier coding agents.
    • Experience with LLM-as-a-judge setups.
    • Experience designing and running experiments, analyzing results, and iterating based on empirical findings (e.g., prompting, scaffolding, agent design, fine-tuning, or RL).
    • Strong Python programming skills.
    • Demonstrated ability to work independently on open-ended research problems.

    Bonus:

    • Experience with AI evaluation frameworks, particularly Inspect (though other frameworks are relevant as well).
    • Familiarity with AI safety concepts, particularly agent-related risks.
    • Familiarity with computer security, e.g. security testing and secure system design.
    • Experience fine-tuning language models or working with smaller open-source models.
    • Previous work building developer tools or monitoring systems.
    • Publications or contributions to AI safety or ML research.
    • Experience with production log systems or production log analysis.

    We want to emphasize that people who feel they don’t fulfill all of these characteristics but think they would be a good fit for the position nonetheless are strongly encouraged to apply. We believe that excellent candidates can come from a variety of backgrounds and are excited to give you opportunities to shine.

    WHAT YOU’LL ACCOMPLISH IN YOUR FIRST YEAR

    • Build a comprehensive failure mode database: Systematically collect and categorize 100+ distinct AI agent failure modes across safety and security dimensions, creating the foundation for our monitoring library.
    • Develop and validate monitoring approaches: Create and empirically test monitoring prompts and strategies for key failure categories, establishing clear metrics for monitor performance and building evaluation frameworks to track progress.
    • Optimize the monitoring pipeline: Improve log preprocessing and monitor scaffolding to achieve measurable improvements in detection accuracy, false positive rates, and computational efficiency.
    • Advance monitoring capabilities: Begin work on advanced approaches such as fine-tuned specialized monitors or agentic investigation systems, moving our monitoring from reactive detection toward proactive risk identification.

    REPRESENTATIVE PROJECTS

    • Hierarchical monitoring for coding agent security: Design a multi-layer monitoring system for detecting security vulnerabilities introduced by coding agents. Start by cataloging common security failure modes (e.g., hardcoded credentials, SQL injection vulnerabilities, insecure API calls). Build specialized monitors for each category, then create a hierarchical system where fast, efficient first-pass monitors flag potentially problematic code for deeper investigation by more sophisticated monitors. Validate the system on synthetic test cases and real agent outputs, iterating to optimize the tradeoff between detection rates and false positives while maintaining sub-second latency for most monitoring decisions.

    BENEFITS

    This role offers market competitive salary, equity, and competitive benefits. Salary: 100k - 180k GBP (~135k - 245k USD). Flexible work hours and schedule. Unlimited vacation. Unlimited sick leave. Lunch, dinner, and snacks are provided for all employees on workdays. Paid work trips, including staff retreats, business trips, and relevant conferences. A yearly $1,000 (USD) professional development budget.

    LOGISTICS

    Time Allocation: Full-time. Location: The office is in London, and the building is next to the London Initiative for Safe AI (LISA) offices. This is an in-person role. In rare situations, we may consider partially remote arrangements on a case-by-case basis. Work Visas: We can sponsor UK visas.

    ABOUT THE TEAM

    The Product team is a new team. Especially early on, you will work closely with Marius Hobbhahn (CEO), Jeremy Neiman (product engineer) and Zak Walters (product engineer). You’ll also sometimes work with our SWEs, Rusheb Shah, Andrei Matveiakin, Alex Kedrik, and Glen Rodgers to translate our internal tools into externally usable products. Furthermore, you will interact with our researchers, since we intend to be “our own customer” by using our products internally for our research work.

    ABOUT APOLLO RESEARCH

    The rapid rise in AI capabilities offer tremendous opportunities, but also present significant risks. At Apollo Research, we’re primarily concerned with risks from Loss of Control, i.e. risks coming from the model itself rather than e.g. humans misusing the AI. We’re particularly concerned with deceptive alignment / scheming, a phenomenon where a model appears to be aligned but is, in fact, misaligned and capable of evading human oversight. We work on the detection of scheming (e.g., building evaluations), the science of scheming (e.g., model organisms), and scheming mitigations (e.g., anti-scheming and control). We closely work with multiple frontier AI companies, e.g. to test their models before deployment or collaborate on scheming mitigations. At Apollo, we aim for a culture that emphasizes truth-seeking, being goal-oriented, giving and receiving constructive feedback, and being friendly and helpful.

    If you’re interested in more details about what it’s like working at Apollo, you can find more information here.

    Equality Statement: Apollo Research is an Equal Opportunity Employer. We value diversity and are committed to providing equal opportunities to all, regardless of age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, or sexual orientation.

    HOW TO APPLY

    Please complete the application form with your CV. The provision of a cover letter is neither required nor encouraged. Please also feel free to share links to relevant work samples.

    About the interview process: Our multi-stage process includes a screening interview, a take-home test (approx. 3 hours), 3 technical interviews, and a final interview with Marius (CEO). The technical interviews will be closely related to tasks the candidate would do on the job. There are no leetcode-style general coding interviews. If you want to prepare for the interviews, we suggest getting familiar with the evaluations framework Inspect, or by building simple monitors for coding agents and running them on your own Claude Code / Cursor / Codex / etc. traffic.

    Your Privacy and Fairness in Our Recruitment Process: We are committed to protecting your data, ensuring fairness, and adhering to workplace fairness principles in our recruitment process. To enhance hiring efficiency, we use AI-powered tools to assist with tasks such as resume screening. These tools are designed and deployed in compliance with internationally recognized AI governance frameworks. Your personal data is handled securely and transparently. We adopt a human-centred approach: all resumes are screened by a human and final hiring decisions are made by our team. If you have questions about how your data is processed or wish to report concerns about fairness, please contact us at info@apolloresearch.ai.

    Applied Researcher (Product) employer: COL Limited

    At Apollo Research, we pride ourselves on fostering a dynamic and innovative work environment where your contributions directly impact the safety of AI systems. Located in London, our team enjoys flexible working hours, unlimited vacation, and a strong emphasis on professional development, all while collaborating closely with industry leaders to tackle real-world challenges in AI safety. Join us to be part of a culture that values truth-seeking, constructive feedback, and personal growth, making it an exceptional place for passionate researchers like you.
    COL Limited

    Contact Detail:

    COL Limited Recruiting Team

    StudySmarter Expert Advice 🤫

    We think this is how you could land Applied Researcher (Product)

    ✨Tip Number 1

    Get to know the team! Before your interview, do a bit of research on the people you’ll be meeting. Understanding their roles and how they fit into the company can help you tailor your responses and show that you're genuinely interested in the team dynamic.

    ✨Tip Number 2

    Show off your passion for AI safety! During the interview, share specific examples of your past work related to AI systems and how it aligns with the role. This is your chance to demonstrate your enthusiasm and expertise, so don’t hold back!

    ✨Tip Number 3

    Prepare for practical tests! Since the role involves empirical research and experimentation, brush up on your Python skills and be ready to discuss your approach to designing experiments. Think about how you would tackle real-world problems using your knowledge.

    ✨Tip Number 4

    Don’t forget to ask questions! At the end of your interview, have a few thoughtful questions ready about the team’s projects or the company’s vision for AI safety. This shows your interest and helps you gauge if the role is the right fit for you.

    We think you need these skills to ace Applied Researcher (Product)

    Empirical Research
    AI Safety
    AI Coding Agents
    Experiment Design
    Data Analysis
    Python Programming
    Monitoring Frameworks
    Log Pre-processing
    Machine Learning
    Security Testing
    Fine-tuning Language Models
    Agent Design
    Scaffolding Approaches
    Hierarchical Monitoring Architectures
    Communication Skills

    Some tips for your application 🫡

    Be Yourself: We want to see the real you! Don’t be afraid to let your personality shine through in your application. Show us what makes you passionate about AI safety and how your unique experiences can contribute to our team.

    Tailor Your CV: Make sure your CV highlights relevant experience, especially with AI coding agents and empirical research. We love seeing how your skills align with our needs, so don’t hold back on showcasing your achievements!

    Skip the Cover Letter: No need for a cover letter here! Instead, focus on filling out the application form and sharing links to any work samples that demonstrate your expertise. We’re all about efficiency, so keep it straightforward.

    Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands quickly. Plus, it shows us you’re keen on joining our team!

    How to prepare for a job interview at COL Limited

    ✨Know Your Stuff

    Make sure you brush up on the latest research in AI safety and coding agents. Familiarise yourself with empirical research methods and be ready to discuss your past experiences with large language models. This role is all about translating theory into practice, so be prepared to share how you've done that before.

    ✨Show Your Passion

    Let your enthusiasm for AI safety shine through during the interview. Talk about why you care about making AI systems safer and how your personal values align with the company's mission. This is a small team where passion can make a big difference, so don’t hold back!

    ✨Prepare for Technical Questions

    Expect technical interviews that dive deep into your experience with coding agents and experimental design. Brush up on your Python skills and be ready to discuss specific projects you've worked on. Practising coding challenges related to AI monitoring systems could give you an edge.

    ✨Ask Insightful Questions

    Prepare thoughtful questions about the team's current projects and future goals. This shows you're genuinely interested in the role and helps you gauge if the company culture aligns with your expectations. Asking about their approach to AI safety and how they measure success can spark engaging conversations.

    Applied Researcher (Product)
    COL Limited

    Land your dream job quicker with Premium

    You’re marked as a top applicant with our partner companies
    Individual CV and cover letter feedback including tailoring to specific job roles
    Be among the first applications for new jobs with our AI application
    1:1 support and career advice from our career coaches
    Go Premium

    Money-back if you don't land a job in 6-months

    >