Member of Technical Staff - Data Ingestion Engineer

Full-Time | £36,000 - £60,000 / year (est.) | Home office (partial)

At a Glance

  • Tasks: Build and operate large-scale data ingestion systems for AI training.
  • Company: Join a cutting-edge AI company with a mission to democratise superintelligence.
  • Benefits: Top-tier salary, comprehensive health benefits, and generous parental leave.
  • Why this job: Make a real impact on AI innovation while collaborating with world-class researchers.
  • Qualifications: Experience in web crawling and large-scale data systems; strong communication skills.
  • Other info: Enjoy a dynamic work environment with daily meals and team celebrations.

The predicted salary is between £36,000 and £60,000 per year.

Overview

Reflection’s mission is to build open superintelligence and make it accessible to all. We’re developing open-weight models for individuals, agents, enterprises, and even nation states. Our team of AI researchers and company builders comes from DeepMind, OpenAI, Google Brain, Meta, Character.AI, Anthropic and beyond.

About The Role

Data is playing an increasingly crucial role at the frontier of AI innovation. Many of the most meaningful advances in recent years have come not from new architectures, but from better data. As a member of the Data Team, your mission is to build and operate the ingestion systems that turn the open web and other large-scale data sources into reliable, well-structured corpora for training frontier models. You will own the machinery that acquires, extracts, normalizes, versions, and delivers data to our pre-training pipelines. You’ll work directly with world-class researchers to close the loop between what we collect and how it impacts model performance.

This role is ideal for engineers who love building robust distributed systems, but who also want to run experiments, reason about tradeoffs in data acquisition, and iterate quickly based on measurable impact.

Working closely with our pre-training and data quality teams, you will:

  • Build and operate large-scale data ingestion systems for pre-training, including web crawling, extraction, and dataset delivery
  • Run experiments to evaluate crawling strategies, extraction methods, and ingestion tradeoffs
  • Analyze ingested data to identify gaps, redundancy, and areas to improve
  • Build ingestion pipelines that scale reliably across large data campaigns
  • Develop specialized crawlers for high-priority data sources
  • Review code, debug production issues, and continuously improve ingestion infrastructure

About You

  • Curious about how training data influences model capabilities, and able to iterate quickly based on measurable downstream impact
  • Able to collaborate tightly across functions: researchers, infra, operations, and external partners
  • Enjoy working in a hybrid research–engineering role

Skills And Qualifications

  • Experience building web crawling, data ingestion, or large-scale data acquisition systems using Ray, Beam, Spark, or similar technologies
  • Familiarity with how LLMs are trained and evaluated, and an intuition for what makes data useful for training
  • Comfortable working with very large datasets (multi-TB to PB scale) and building systems that are observable, testable, and maintainable
  • Comfortable designing experiments and using data to guide system improvements
  • Excellent communication skills: you can explain system behavior, and you weigh and communicate tradeoffs clearly

What We Offer

  • Top-tier compensation: Salary and equity structured to recognize and retain the best talent globally
  • Health & wellness: Comprehensive medical, dental, vision, life, and disability insurance
  • Life & family: Fully paid parental leave for all new parents, including adoptive and surrogate journeys. Financial support for family planning
  • Benefits & balance: Paid time off when you need it, relocation support, and more perks that optimize your time
  • Opportunities to connect with teammates: Lunch and dinner are provided daily, and we hold regular off-sites and team celebrations

Employer: Reflection AI

At Reflection, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters innovation and collaboration among top-tier AI researchers and engineers. Our commitment to employee growth is evident through our comprehensive benefits package, including top-tier compensation, health and wellness support, and generous parental leave, all designed to ensure a balanced work-life experience. Located in a vibrant tech hub, we provide unique opportunities for meaningful contributions to the future of AI while enjoying daily team meals and regular celebrations that strengthen our community.

Contact Detail:

Reflection AI Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Member of Technical Staff - Data Ingestion Engineer role

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, especially those who work at companies you're interested in. A friendly chat can open doors and give you insights that job descriptions just can't.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repo showcasing your projects related to data ingestion or web crawling. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by practising common technical questions and scenarios related to data systems. We recommend doing mock interviews with friends or using online platforms to get comfortable with the format.

✨Tip Number 4

Don't forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining our team!

We think you need these skills to ace the Member of Technical Staff - Data Ingestion Engineer role

Web Crawling
Data Ingestion
Large-Scale Data Acquisition
Ray
Beam
Spark
Data Analysis
Experiment Design
Communication Skills
System Observability
Testability
Maintainability
Collaboration
Understanding of LLMs

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the role of Data Ingestion Engineer. Highlight any experience you have with web crawling, data ingestion, or large-scale data systems. We want to see how your background fits into our mission!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about data and AI. Share specific examples of projects you've worked on that relate to the job. We love seeing your personality come through!

Showcase Your Technical Skills: Don’t forget to mention your experience with technologies like Ray, Beam, or Spark. If you've built any systems or run experiments, let us know! We’re looking for engineers who can hit the ground running.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates. Plus, we love seeing applications come in through our own platform!

How to prepare for a job interview at Reflection AI

✨Know Your Data Ingestion Systems

Make sure you brush up on your knowledge of web crawling and data ingestion systems. Be ready to discuss your experience with technologies like Ray, Beam, or Spark, and how you've built or improved these systems in the past.

✨Show Your Curiosity

Demonstrate your curiosity about how training data influences model capabilities. Prepare examples of how you've iterated on projects based on measurable impact, and be ready to discuss any experiments you've run to evaluate different strategies.

✨Collaboration is Key

Highlight your ability to collaborate across functions. Think of specific instances where you've worked closely with researchers, operations, or external partners, and be prepared to explain how those collaborations led to successful outcomes.

✨Communicate Clearly

Practice explaining complex system behaviours in simple terms. Be ready to discuss trade-offs in your previous projects and how you communicated these to your team. Clear communication can set you apart from other candidates.
