Data Engineer

Full-Time | £70,000 - £90,000 / year (est.) | Home office (partial)

At a Glance

  • Tasks: Build and manage data architecture for cutting-edge magnetic materials research.
  • Company: Join Diffractive Labs, a pioneering tech company in materials science.
  • Benefits: Competitive salary, equity, flexible work options, and inclusive culture.
  • Other info: Be part of a small, dynamic team driving scientific discovery.
  • Why this job: Make a real impact on groundbreaking research that shapes the future of energy and computing.
  • Qualifications: 3+ years in data engineering, strong Python skills, and experience with ML pipelines.

The predicted salary is between £70,000 and £90,000 per year.

Diffractive Labs is building the infrastructure to accelerate the discovery of novel magnetic materials through closed-loop AI and physical experimentation. As our Data Engineer, you will own the data architecture that powers this research. This role requires a builder who understands both large-scale machine learning pipelines and the messy reality of physical lab data. You will serve as the critical link between experimental results generated at the bench and the models evaluating them, ensuring our research team always has the exact, high-quality datasets required to push the frontier of materials science. We are looking for someone with a rigorous, experimental mindset who thrives in an interdisciplinary environment and operates with a high degree of technical ownership.

What You'll Do

  • Drive the overarching data architecture across our training stack, mapping out data requirements with ML researchers and evaluating new external sources to fill knowledge gaps.
  • Design and deploy the ingestion pipelines that capture physical experimental data directly from our wet lab instruments and feed it seamlessly into our model training workflows.
  • Construct robust, reproducible systems for processing, standardizing, and versioning diverse scientific corpora, creating a highly reliable foundation for the research team.
  • Develop custom evaluation datasets and reinforcement learning environments specifically calibrated for the properties and behaviors of magnetic materials.
  • Build internal tooling that allows machine learning researchers and physical scientists to effectively query, inspect, and audit the data feeding into pretraining, midtraining, and RL runs.
  • Continuously integrate emerging techniques in synthetic data generation, data selection, and data-efficient training into our production systems.

Skills & Qualifications

  • 3+ years of engineering experience focused on large-scale data pipelines, ideally within an applied ML, scientific, or LLM training environment.
  • High proficiency in Python and modern workflow orchestration frameworks (e.g., Dagster, Airflow, Prefect, or similar).
  • Demonstrated experience with dataset lineage, versioning, and reproducibility tooling (such as DVC, Delta Lake, or custom equivalents).
  • A track record of collaborating directly with machine learning researchers, translating complex modeling needs into scalable pipeline architecture and back again.
  • Strong DevOps fundamentals, including hands‑on experience with containerization (Docker, Kubernetes) and CI/CD deployment.

Nice to Have

  • Prior experience processing and structuring data from physical laboratory instrumentation, computational simulations, or multimodal scientific sources.
  • A background in curating datasets for domain‑specific continued pretraining or instruction tuning.
  • An academic or practical background in physics, materials science, or chemistry.

Why Join Us

You are building the foundation for breakthroughs in magnetic materials that will directly influence the future of energy and computing hardware. Diffractive is building the AI Material Scientist that autonomously learns from real‑world experimentation to push the boundaries of scientific discovery. We're early, moving fast, and working on problems that genuinely matter. You'll join a small, high‑calibre team where your work has real impact from day one. We're London‑based with a flexible approach to how and where you work. We offer a competitive salary, generous equity, and benefits. You'll have a real stake in what you build and in the company's overall success.

Equal Opportunity Employer

Diffractive is an equal opportunities employer. We are committed to creating an inclusive environment for all employees and welcome applications from people of all backgrounds, experiences, and identities. If you require any adjustments or accommodations at any point during the interview process, please let us know - we will be happy to help.

Data Engineer employer: Diffractive Labs

At Diffractive Labs, we are at the forefront of scientific innovation, providing a dynamic and inclusive work environment where your contributions as a Data Engineer will directly shape the future of magnetic materials research. With a flexible approach to work and a commitment to employee growth, we offer competitive salaries, generous equity, and the opportunity to make a meaningful impact from day one in our London-based team. Join us to be part of a small, high-calibre group dedicated to pushing the boundaries of materials science through cutting-edge AI and experimentation.

Contact Details:

Diffractive Labs Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Data Engineer role

Get Involved in Data Science Meetups

Tap into local data science meetups or workshops to connect with fellow enthusiasts and professionals. These events are goldmines for networking, and sometimes even lead directly to job openings at companies like Diffractive Labs!

Show Off Your Projects

Start building a public portfolio showcasing your data science projects on platforms like GitHub or personal websites. Highlight unique analyses or models you've developed. This not only demonstrates your skills but also gets your name out there for roles like Data Engineer at Diffractive Labs.

Leverage Professional Networks

Join professional bodies related to data science, like the Data Science Society or similar organisations. Getting involved can lead to mentorship opportunities and insider knowledge about full-time positions at companies like Diffractive Labs.

Apply Directly through Our Website

When you find a suitable opening like Data Engineer at Diffractive Labs, make sure to apply directly through our website. It gives you an edge and shows you're keen to join our team. Plus, who doesn’t love a direct application? It’s easier than navigating through job boards!

We think you need these skills to ace the Data Engineer role

Data Architecture
Large-Scale Data Pipelines
Machine Learning Pipelines
Python
Workflow Orchestration Frameworks
Dataset Lineage
Versioning Tools

Some tips for your application 🫡

Show Off Your Projects: In the world of data science, your projects can speak volumes about your skills. Make sure to showcase a few key projects in your CV or portfolio, especially those that highlight your ability to work with data sets, build models, or use relevant tools like Python, R, or SQL. Don’t forget to include links to any GitHub repositories if applicable!

Quantify Your Achievements: Employers love numbers! When drafting your CV, highlight your achievements with quantifiable results. For instance, mention how your data analysis led to a certain percentage increase in efficiency or revenue at a previous job or project. These details can really make your application pop!

Craft a Tailored Cover Letter: For a full-time role at Diffractive Labs, your cover letter should reflect your passion for data science and your excitement about the specific projects or values of the company. Dive into why you’re a good fit, how your skills align with their needs, and any unique perspectives you can bring to the team.

Stand Out with Relevant Courses and Certifications: Although experience talks, relevant courses or certifications can be your ticket to impressing hiring managers at Diffractive Labs. Mention any standout courses you've completed that equipped you with essential skills, such as machine learning certifications or data visualisation courses. This shows your commitment to continuously developing your skills in the field!

How to prepare for a job interview at Diffractive Labs

Brush Up on Your Statistics

For a data science role, you need to seriously sharpen your statistics skills. Get ready to tackle technical questions on probability distributions, hypothesis testing, and regression analysis. These are often the bread and butter of data science interviews, so don't just skim over them!

Showcase Your Projects

Prepare a killer portfolio showcasing your data science projects. Include details about the datasets used, the tools and techniques applied, and the impact of your findings. If you can walk the interviewers through a particularly challenging project or a cool visualisation that had real-world implications, it’ll really make you stand out!

Get Comfortable with Python and R

Most data science positions require proficiency in programming languages like Python and R. Practise with common libraries like pandas, NumPy, and scikit-learn, and be ready for live coding exercises or algorithm questions. Showing off your coding chops can really impress the interviewers at Diffractive Labs!

Prepare for Case Studies

Expect to encounter real-world case studies during the interview. You might be asked how you’d approach a data problem or analyse a dataset to extract insights. It's essential to think out loud and demonstrate your problem-solving process so that the interviewer can see your logical thinking in action.