At a Glance
- Tasks: Architect automated evaluation pipelines and design methods to assess answer quality.
- Company: Join a cutting-edge tech company focused on AI and machine learning.
- Benefits: Competitive salary, flexible work environment, and opportunities for professional growth.
- Other info: Be part of a small, high-impact team shaping the future of AI-driven products.
- Why this job: Make a real impact on product quality while working with innovative technologies.
- Qualifications: PhD or MS in a technical field (or equivalent experience), 4+ years in data science or machine learning, strong Python and SQL skills.
The predicted salary is between £60,000 and £80,000 per year.
Requirements
- PhD or MS in a technical field or equivalent experience
- 4+ years of experience in data science or machine learning
- Strong proficiency in Python and SQL (expected to write production-grade code)
- Experience building within a modern cloud data stack, specifically AWS and Databricks
- Comfortable with agentic coding workflows and using AI-assisted development tools to iterate faster
- (Desirable) 1+ years of experience working with LLMs at scale, specifically with LLM-as-a-judge setups
- (Desirable) Prior experience working on customer-facing web products or consumer apps, with real user traffic at scale
- (Desirable) A strong research background, with experience applying research methods to real-world ML problems
- (Desirable) Experience defining evaluation metrics (e.g., factual consistency, hallucination rate, retrieval precision) and building ground truth datasets
What the job involves
- Architect and maintain automated evaluation pipelines to assess answer quality across Perplexity's products, ensuring high standards for accuracy and helpfulness
- Design evaluation sets and methods specifically to measure the impact of tool calls (particularly web search retrieval) on the final answer's quality
- Develop VLM-based solutions to programmatically evaluate how final answers render visually across different platforms and devices
- Continuously review public benchmarks and academic evaluations for their applicability to the Perplexity product, adapting and incorporating them into our regular performance measurements
- Operate within a small, high-impact team where your evaluation metrics directly shape product changes, collaborating closely with technical leadership to measure and improve Answer Quality
Data Scientist, ML Evaluation & LLM Metrics employer: Perplexity AI
At Perplexity, we pride ourselves on being an exceptional employer that fosters a collaborative and innovative work culture. Our Data Scientists enjoy the unique opportunity to work in a dynamic environment where their contributions directly influence product quality, while benefiting from continuous professional development and access to cutting-edge technologies. Located in a vibrant tech hub, we offer competitive compensation, flexible working arrangements, and a commitment to employee well-being, making us an ideal choice for those seeking meaningful and rewarding careers in machine learning.
StudySmarter Expert Advice🤫
We think this is how you could land the Data Scientist, ML Evaluation & LLM Metrics role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those involving Python, SQL, and machine learning. This is your chance to demonstrate your expertise and make a lasting impression on hiring managers.
✨Tip Number 3
Prepare for interviews by brushing up on your technical knowledge and problem-solving skills. Practice coding challenges and be ready to discuss your past experiences, especially those related to LLMs and evaluation metrics.
✨Tip Number 4
Don’t forget to apply through our website! We love seeing applications directly from candidates who are genuinely interested in joining our team. It shows initiative and enthusiasm, which we really appreciate!
Some tips for your application 🫡
Show Off Your Skills: Make sure to highlight your technical skills, especially in Python and SQL. We want to see how you can write production-grade code, so don’t hold back on showcasing your experience with cloud data stacks like AWS and Databricks.
Tailor Your Application: Take a moment to customise your application for the Data Scientist role. Mention any relevant experience with LLMs or customer-facing products, as this will help us see how you fit into our team and the projects we work on.
Be Clear and Concise: When writing your application, keep it clear and to the point. We appreciate well-structured applications that get straight to the heart of your experience and how it relates to the job description.
Apply Through Our Website: Don’t forget to apply through our website! It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it makes the whole process smoother for everyone involved.
How to prepare for a job interview at Perplexity AI
✨Know Your Stuff
Make sure you brush up on your Python and SQL skills. Be ready to discuss your past projects where you wrote production-grade code. They’ll likely ask you to solve a problem on the spot, so practice coding challenges beforehand!
✨Showcase Your Experience with LLMs
If you've worked with large language models, especially in LLM-as-a-judge setups, be prepared to share specific examples. Talk about the challenges you faced and how you overcame them, as well as any metrics you defined to evaluate their performance.
✨Understand Evaluation Metrics
Familiarise yourself with evaluation metrics like factual consistency and hallucination rates. Be ready to discuss how you’ve applied these in previous roles and how they can impact product quality. This shows you’re not just a coder but someone who understands the bigger picture.
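If it helps to make these metrics concrete before the interview, here is a minimal Python sketch of two of them. The definitions are common working ones and are assumptions here, not Perplexity's own: retrieval precision is taken as the share of the top-k retrieved documents that are labelled relevant, and hallucination rate as the share of answer claims labelled unsupported. The function names and inputs are hypothetical, for illustration only.

```python
from typing import Sequence, Set


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Share of the top-k retrieved document IDs that are labelled relevant.

    Assumed definition for illustration; real pipelines may weight by rank.
    """
    top_k = list(retrieved)[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for doc_id in top_k if doc_id in relevant)
    return hits / len(top_k)


def hallucination_rate(claim_supported: Sequence[bool]) -> float:
    """Share of answer claims labelled as NOT supported by the sources.

    Assumes each claim already carries a human or LLM-judge label.
    """
    if not claim_supported:
        return 0.0
    unsupported = sum(1 for supported in claim_supported if not supported)
    return unsupported / len(claim_supported)


if __name__ == "__main__":
    # Hypothetical labelled example: 4 retrieved docs, 2 of them relevant.
    print(precision_at_k(["a", "b", "c", "d"], {"a", "c"}, k=4))
    # Hypothetical claim labels: 1 of 4 claims is unsupported.
    print(hallucination_rate([True, False, True, True]))
```

Being able to explain where the labels come from (human annotation versus an LLM-as-a-judge setup) and how noisy they are is usually worth as much as the arithmetic itself.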
✨Be Ready to Collaborate
This role involves working closely with a small team and technical leadership. Prepare to discuss how you’ve collaborated in the past, particularly in high-impact environments. Highlight your communication skills and how you’ve contributed to team success.