At a Glance
- Tasks: Evaluate and improve AI agent systems through innovative methodologies and automated pipelines.
- Company: Join a forward-thinking tech company focused on AI and machine learning.
- Benefits: Gain hands-on experience, mentorship, and potential for future opportunities in AI.
- Other info: Collaborative environment that values diverse voices and offers growth opportunities.
- Why this job: Dive into the exciting world of AI evaluation and make a real impact on technology.
- Qualifications: Pursuing or recently graduated in relevant fields with strong Python skills.
The predicted salary is between £20,000 and £30,000 per year.
We are hiring an intern to work on evaluation and reliability infrastructure for a real‑world LLM agent system in the user acquisition (UA) performance marketing field. The agent performs multi‑step reasoning, retrieves context, selects tools, executes actions, handles user confirmations, and interacts with external services. The goal of this internship is to build transferable expertise in agent evaluation engineering: evaluating tool use, measuring trajectory quality, designing benchmarks, analysing traces, comparing model and prompt variants, and improving the reliability of agentic AI systems. This role is ideal for someone interested in future opportunities in LLM agent evaluation, AI safety evaluation, research engineering, LLMOps, or applied AI infrastructure.
Responsibilities include:
- Researching state‑of‑the‑art frameworks for evaluating agentic workflows, in both industry and academic research.
- Applying that research to build automated evaluation pipelines that can run agent scenarios, capture execution artifacts, score results, and detect regressions.
- Evaluating tool‑use behaviour, including whether the agent selects the right tool, passes correct arguments, avoids unnecessary calls, and handles tool errors appropriately.
- Analysing agent trajectories using traces, logs, intermediate steps, and final outputs to identify reasoning failures, context misuse, hallucinated assumptions, and brittle workflow patterns.
- Designing metrics for agent reliability, including success rate, tool‑call precision, argument accuracy, recovery rate, retry count, latency, cost, and safety‑related failure rates.
- Creating reusable evaluation datasets from synthetic cases, golden workflows, and real anonymised executions.
- Supporting experiments comparing prompts, model providers, tool descriptions, memory strategies, context construction methods, and execution modes.
- Helping build human evaluation workflows and rubrics for judging agent correctness, faithfulness, usefulness, and risk awareness.
- Working with engineers to translate evaluation findings into better tests, monitoring signals, tool interfaces, prompts, and guardrails.
- Potentially writing research papers and publishing them at scientific conferences.
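For candidates who want a feel for the metrics named above, here is a minimal sketch of how success rate, tool‑call precision, and argument accuracy could be computed from recorded episodes. It is an illustration under stated assumptions, not the team's actual pipeline: the `ToolCall` and `Episode` types, the `score` function, the tool names, and the greedy matching rule are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: tuple  # sorted (key, value) pairs, so calls compare by value


def call(name, **args):
    """Build a ToolCall with arguments in a canonical order."""
    return ToolCall(name, tuple(sorted(args.items())))


@dataclass
class Episode:
    expected: list   # golden tool calls for the scenario
    actual: list     # tool calls the agent really made
    succeeded: bool  # whether the final outcome was judged correct


def score(episodes):
    """Aggregate three reliability metrics over a list of episodes."""
    made = matched = exact = 0
    for ep in episodes:
        remaining = list(ep.expected)
        for c in ep.actual:
            made += 1
            if c in remaining:  # right tool, right arguments
                remaining.remove(c)
                matched += 1
                exact += 1
                continue
            # Right tool, wrong arguments (greedy match: a simplification).
            same_tool = next((e for e in remaining if e.name == c.name), None)
            if same_tool is not None:
                remaining.remove(same_tool)
                matched += 1
            # Otherwise the call was unnecessary and only lowers precision.
    n = len(episodes)
    return {
        "success_rate": sum(ep.succeeded for ep in episodes) / n if n else 0.0,
        "tool_call_precision": matched / made if made else 0.0,
        "argument_accuracy": exact / matched if matched else 0.0,
    }


# Hypothetical tool names, for illustration only.
episodes = [
    Episode(
        expected=[call("get_campaign", id=1),
                  call("update_budget", id=1, amount=50)],
        actual=[call("get_campaign", id=1),
                call("get_campaign", id=1),  # redundant call
                call("update_budget", id=1, amount=500)],  # wrong argument
        succeeded=False,
    ),
    Episode(
        expected=[call("get_campaign", id=2)],
        actual=[call("get_campaign", id=2)],
        succeeded=True,
    ),
]

metrics = score(episodes)
print(metrics)
```

In this sketch a redundant call lowers precision, while a call to the right tool with a wrong argument lowers argument accuracy only. Keeping those two failure modes separate is one possible design choice, not the only one.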
Who We Look For:
- Currently pursuing, or recently graduated from, a Master’s or PhD programme in Computer Science, Artificial Intelligence, Machine Learning, Software Engineering, Data Science, or a related field.
- Strong Python fundamentals and an interest in AI systems.
- Curious about how LLM agents work, fail, and improve.
- Interested in evaluation methodology, not just application building.
- Comfortable reading logs, traces, test cases, and structured data.
- Detail‑oriented and able to define clear, measurable criteria for ambiguous agent behaviour.
- Prior experience with LLMs, LangChain‑like agents, tool calling, pytest, data analysis, or observability tools is helpful but not required.
As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.
Agent Evaluation Intern employer: IMAGE FRAME INVESTMENT (UK) LIMITED
At Tencent, we pride ourselves on being an excellent employer, offering a dynamic work culture that encourages innovation and collaboration. Our Agent Evaluation Intern role provides unique opportunities for professional growth in the cutting-edge field of AI, with access to mentorship from industry experts and the chance to contribute to meaningful projects that shape the future of technology. Located in a vibrant environment, we support our employees' development through diverse experiences and a commitment to inclusivity, ensuring everyone can thrive and make a significant impact.
Contact Details:
IMAGE FRAME INVESTMENT (UK) LIMITED Recruitment Team
StudySmarter Expert Advice🤫
We think this is how you could land the Agent Evaluation Intern role
✨Tip Number 1
Network like a pro! Reach out to people in the AI and evaluation fields on LinkedIn or at industry events. A friendly chat can open doors that a CV just can't.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing any projects or experiments you've done related to LLMs or agent evaluation. This gives us a tangible way to see what you can do.
✨Tip Number 3
Prepare for interviews by diving deep into the latest trends in AI and agent evaluation. We love candidates who can discuss current challenges and solutions in the field.
✨Tip Number 4
Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining our team.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Agent Evaluation Intern role. Highlight any relevant projects or coursework in AI, machine learning, or software engineering that showcase your Python skills and interest in evaluation methodologies.
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're excited about this internship. Share your passion for AI systems and evaluation, and mention any specific experiences that have prepared you for this role. Keep it engaging and personal!
Showcase Your Curiosity: In your application, demonstrate your curiosity about LLM agents and their evaluation. Mention any research you've done or questions you have about the field. We love candidates who are eager to learn and explore new ideas!
Apply Through Our Website: Don't forget to submit your application through our website! It’s the best way for us to receive your materials and ensures you’re considered for the role. Plus, it shows you’re serious about joining our team at StudySmarter!
How to prepare for a job interview at IMAGE FRAME INVESTMENT (UK) LIMITED
✨Know Your Stuff
Make sure you brush up on the latest trends in LLM agent evaluation and AI systems. Familiarise yourself with the state-of-the-art frameworks and methodologies. This will not only show your enthusiasm but also demonstrate that you're serious about the role.
✨Show Off Your Skills
Be ready to discuss your Python fundamentals and any relevant projects you've worked on. If you've dabbled in data analysis or have experience with LLMs, make sure to highlight these during the interview. Practical examples can really set you apart!
✨Ask Smart Questions
Prepare some insightful questions about the company's approach to agent evaluation and reliability. This shows that you're genuinely interested in the role and helps you gauge if the company is the right fit for you.
✨Be Detail-Oriented
Since the role requires a keen eye for detail, be prepared to discuss how you define clear, measurable criteria for evaluating ambiguous agent behaviour. Share any experiences where your attention to detail made a difference in your work.