LLM Agent Evaluation Intern in London

London | Internship | £20,000 - £30,000 / year (est.) | Home office (partial)
Tencent

At a Glance

  • Tasks: Evaluate and improve LLM agent systems in a dynamic gaming environment.
  • Company: Join Tencent's Level Infinite, a global leader in gaming innovation.
  • Benefits: Gain hands-on experience, mentorship, and potential for future roles in AI.
  • Other info: Collaborative culture that values diverse voices and innovative ideas.
  • Why this job: Dive into cutting-edge AI technology and shape the future of gaming.
  • Qualifications: Pursuing or recently graduated in relevant fields with strong Python skills.

The predicted salary is between £20,000 and £30,000 per year.

About The Hiring Team

Level Infinite is Tencent's global gaming brand and a game publisher offering a comprehensive network of services for games, development teams, and studios around the world. We are dedicated to delivering engaging and original gaming experiences to a worldwide audience, whenever and wherever they choose to play, while building a community that fosters inclusivity, connection, and accessibility. Level Infinite also provides a wide range of services and resources to our network of developers and partner studios to help them unlock the true potential of their games.

What The Role Entails

We are hiring an intern to work on evaluation and reliability infrastructure for a real-world LLM agent system in the user acquisition (UA) performance marketing field. The agent performs multi-step reasoning, retrieves context, selects tools, executes actions, handles user confirmations, and interacts with external services. The goal of this internship is to build transferable expertise in agent evaluation engineering: evaluating tool use, measuring trajectory quality, designing benchmarks, analyzing traces, comparing model and prompt variants, and improving the reliability of agentic AI systems.

  • Research state-of-the-art evaluation frameworks for agentic workflows, both in industry and in academic research.
  • Apply that research to build automated evaluation pipelines that can run agent scenarios, capture execution artifacts, score results, and detect regressions.
  • Evaluate tool-use behavior, including whether the agent selects the right tool, passes correct arguments, avoids unnecessary calls, and handles tool errors appropriately.
  • Analyze agent trajectories using traces, logs, intermediate steps, and final outputs to identify reasoning failures, context misuse, hallucinated assumptions, and brittle workflow patterns.
  • Design metrics for agent reliability, including success rate, tool-call precision, argument accuracy, recovery rate, retry count, latency, cost, and safety-related failure rates.
  • Create reusable evaluation datasets from synthetic cases, golden workflows, and real anonymized executions.
  • Support experiments comparing prompts, model providers, tool descriptions, memory strategies, context construction methods, and execution modes.
  • Help build human evaluation workflows and rubrics for judging agent correctness, faithfulness, usefulness, and risk awareness.
  • Work with engineers to translate evaluation findings into better tests, monitoring signals, tool interfaces, prompts, and guardrails.
  • Potentially write research papers and publish them at scientific conferences.
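
To make a few of the metrics named above concrete, here is a minimal Python sketch of how success rate, tool-call precision, and argument accuracy might be computed from recorded agent runs. It is an illustration only, not Tencent's pipeline: the `ToolCall` and `Episode` types, the `score` function, and the tool names `get_campaign` and `update_budget` are all hypothetical, and the matching rule (pair each actual call with the first unmatched expected call of the same name) is one simple assumption among many possible ones.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class ToolCall:
    name: str
    args: dict[str, Any] = field(default_factory=dict)


@dataclass
class Episode:
    expected: list[ToolCall]  # golden tool calls for the scenario (hypothetical)
    actual: list[ToolCall]    # tool calls the agent actually made
    succeeded: bool           # whether the final outcome passed the scenario check


def score(episodes: list[Episode]) -> dict[str, float]:
    """Aggregate three illustrative reliability metrics over a set of episodes."""
    made = matched = args_ok = 0
    for ep in episodes:
        remaining = list(ep.expected)
        for call in ep.actual:
            made += 1
            # Assumption: match on tool name, first unmatched expected call wins.
            hit = next((e for e in remaining if e.name == call.name), None)
            if hit is None:
                continue  # unnecessary or wrong tool call
            remaining.remove(hit)
            matched += 1
            args_ok += call.args == hit.args
    n = len(episodes)
    return {
        "success_rate": sum(ep.succeeded for ep in episodes) / n if n else 0.0,
        "tool_call_precision": matched / made if made else 0.0,
        "argument_accuracy": args_ok / matched if matched else 0.0,
    }


if __name__ == "__main__":
    episodes = [
        Episode(
            expected=[
                ToolCall("get_campaign", {"id": 1}),
                ToolCall("update_budget", {"id": 1, "amount": 50}),
            ],
            actual=[
                ToolCall("get_campaign", {"id": 1}),
                ToolCall("update_budget", {"id": 1, "amount": 500}),  # wrong argument
                ToolCall("get_campaign", {"id": 1}),                  # redundant call
            ],
            succeeded=False,
        ),
        Episode(
            expected=[ToolCall("get_campaign", {"id": 2})],
            actual=[ToolCall("get_campaign", {"id": 2})],
            succeeded=True,
        ),
    ]
    print(score(episodes))
```

Running the same scoring over a fixed scenario set before and after a prompt or model change is one simple way to spot regressions. A real pipeline would likely also track recovery rate, retries, latency, and cost, which this sketch leaves out.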

Who We Look For

  • Currently pursuing, or recently graduated with, a Master's or PhD degree in Computer Science, Artificial Intelligence, Machine Learning, Software Engineering, Data Science, or a related field.
  • Strong Python fundamentals and interest in AI systems.
  • Curious about how LLM agents work, fail, and improve.
  • Interested in evaluation methodology, not just application building.
  • Comfortable reading logs, traces, test cases, and structured data.
  • Detail-oriented and able to define clear, measurable criteria for ambiguous agent behavior.
  • Prior experience with LLMs, LangChain-like agents, tool calling, pytest, data analysis, or observability tools is helpful but not required.

Equal Employment Opportunity at Tencent

As an equal opportunity employer, we firmly believe that diverse voices fuel our innovation and allow us to better serve our users and the community. We foster an environment where every employee of Tencent feels supported and inspired to achieve individual and common goals.

Employer: Tencent

Level Infinite, as part of Tencent's global gaming brand, is an exceptional employer that champions inclusivity and innovation in the gaming industry. Our vibrant work culture encourages collaboration and creativity, providing interns with invaluable hands-on experience in cutting-edge AI technologies while fostering personal and professional growth. Located in a dynamic environment, we offer unique opportunities to engage with industry leaders and contribute to meaningful projects that shape the future of gaming.

Contact Details:

Tencent Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the LLM Agent Evaluation Intern role in London

Join Data-Science Meetups

Get yourself along to local data-science meetups or workshops. They're goldmines for networking, and you'll learn from industry pros who might just point you in the direction of internships. Plus, discussing the latest trends with like-minded individuals can really amp up your game.

Utilise University Career Services

Check in with your uni's career services since they often have connections with companies looking for interns. They might even organise information sessions with firms, which can be a great chance for you to learn more about potential internships and make some key contacts.

Show Off Your Stuff on GitHub

If you're into data science, having a GitHub profile with your projects is essential. Make sure your portfolio is public and showcases your best work! Recruiters love to see your coding skills and problem-solving approach, and it’s a brilliant way to stand out.

Apply Directly on Our Website

Don’t forget to check out the internships listed on our site! It's always a good idea to apply directly through our website because it makes your application easier for our team to find, and you might just catch the hiring manager’s eye by showcasing exactly what you're passionate about in data science.

We think you need these skills to ace the LLM Agent Evaluation Intern role in London

Python
Artificial Intelligence
Machine Learning
Data Analysis
Evaluation Methodology
Log Analysis
Tool Calling

Some tips for your application 🫡

Show Off Your Technical Skills: For a data science internship, we want to see those analytical skills shine! List your programming languages, like Python or R, and make sure to highlight any relevant projects or courses you've completed. If you've dabbled with tools like Pandas, NumPy, or machine learning algorithms, don’t hold back – include those in your CV!

Share Your Curiosity in Your Cover Letter: As an intern, your motivation and eagerness to learn are key! In your cover letter, talk about specific data science concepts that excite you and how this internship at Tencent will help you grow. Share what you hope to achieve and how you plan to tackle real-world data problems. We love enthusiasm!

Include Any Relevant Certifications: If you've earned any certifications, such as from Coursera or DataCamp, make sure to include these in your application. They show us that you're proactive and committed to expanding your data science skillset. This could make a real difference in how we assess your application!

Keep It Relevant and Concise: Remember, as an intern, you don’t need to have decades of experience. Focus on showcasing relevant coursework, personal projects, or even related volunteer work in data science. Keep your CV and cover letter concise but impactful – we appreciate clear and straightforward communication!

How to prepare for a job interview at Tencent

Brush Up on Your Coding Skills

As a data science intern, you might get grilled on your programming skills. Expect to tackle some coding challenges using languages like Python or R. We recommend practising basic algorithms or data manipulation tasks so you can show off your tech skills with confidence.

Show Off Your Projects

Prepare to discuss any projects you’ve done, whether in your studies or on your own time. Having a strong portfolio of data analyses or machine learning models will really set you apart. You can use platforms like GitHub to showcase your work and impress Tencent.

Know Your Stats and ML Basics

Brush up on your statistics and machine learning concepts because interviewers love to dig into this! Be ready to explain your understanding of algorithms or how you would approach a given data problem. This will highlight your theoretical background alongside your practical skills.

Be Eager to Learn and Adapt

Internships are all about potential and growth. Make sure you convey your eagerness to learn and adapt to new tools or methodologies. Show Tencent that you’re not just looking for experience, but that you're keen to contribute and grow within the team.