At a Glance
- Tasks: Build data pipelines and infrastructure for groundbreaking materials science research.
- Company: Join CuspAI, a leading AI company revolutionising materials discovery.
- Benefits: Competitive salary, equity, generous holiday, and professional development budget.
- Other info: Collaborative environment with world-class researchers and excellent career growth opportunities.
- Why this job: Make a real impact on sustainability and climate challenges with cutting-edge technology.
- Qualifications: 3+ years in data engineering, strong Python skills, and experience with large datasets.
The predicted salary is between €60,000 and €80,000 per year.
About CuspAI
CuspAI is the frontier AI company on a mission to solve the breakthrough materials needed to power human progress. While nature took billions of years to perfect molecules, we are harnessing AI to unlock trillion-dollar materials breakthroughs in months, not millennia. Our founding team is the most cited in the world, comprised of world-class researchers in AI, chemistry and engineering. We are working on some of the hardest and most important challenges including energy, clean water, the future of compute, and carbon capture, and this is just the start of what our 'search engine' for next-generation materials will unlock. We invite you to be part of a diverse, innovative team at the intersection of AI and materials science, working to create impactful partnerships that drive innovation, scalability, and industry collaboration. This work matters. Your work matters. We’re on the cusp of the on-demand materials era. Join us.
The Role
As we grow, we are seeking a Data Engineer to play a crucial part in driving our research and development efforts forward.
Your Impact
As a Data Engineer, you will be part of the new team building the infrastructure that acts as the critical bridge between raw chemical data and our machine learning models. Your main focus will be to build the pipeline infrastructure and tooling for data ingestion, moving towards a self-serve setup for the scientific team members. You'll also be responsible for securing, collecting, cleaning, standardising, and tagging diverse chemical datasets to create high-quality training data for our ML researchers, while working closely with our chemistry team to ensure scientific accuracy.
What You Will Do
- Data Pipeline Development
  - Design and build robust data pipelines for materials science datasets, experimental results, and computational chemistry outputs.
  - Develop processes to integrate diverse data sources including materials databases, literature, patent filings, and laboratory instruments.
  - Create automated workflows for processing crystallographic data, molecular structures, and materials properties.
  - Build scalable systems to handle high-throughput computational chemistry calculations and experimental data.
- Data Quality & Standardisation
  - Partner closely with the scientific and research teams to implement automated quality checks for crystal structure data, chemical compositions, and experimental measurements.
  - Create standardisation protocols for materials nomenclature, units, and measurement conditions.
  - Build monitoring systems to ensure data integrity across all pipelines.
- Collaboration & Integration
  - Work hand in hand with ML researchers to understand data requirements for model training and inference.
  - Partner with materials scientists to ensure accurate representation of domain knowledge in data schemas.
  - Integrate with laboratory automation systems and computational chemistry software.
  - Support real-time data needs for AI-driven materials discovery experiments.
Must Have Skills and Qualifications
- You are someone who gets excited about the opportunity to enable scientists to work on world-changing challenges in this domain, with a personal interest in the potential applications of the technology that Cusp is building.
- You’re a builder of tools and infrastructure who enjoys making life as easy as possible for the teams, providing self-serve, reliable and scalable ingestion pipelines.
- You have at least 3 years' experience in data engineering roles, preferably in scientific or research environments.
- High level of proficiency in Python and databases with experience in large-scale data processing.
- You’re an advanced user of workflow orchestration tools (e.g. Airflow, Prefect, Dagster, Flyte or similar).
- Solid experience with containerisation (Docker, Kubernetes) and CI/CD practices.
- You have direct experience handling large/complex datasets and are interested in working with scientific packages.
- You’re a fast learner when it comes to new tools/systems.
- You enjoy designing systems that scale with growing data volumes and user demands.
- Understanding and appreciation of DevOps practices is also important.
Bonus Points (But Not Critical)
- You’ve worked with data from scientific computing (simulations or experiments).
- Knowledge of machine learning data requirements and MLOps practices, including pre-processing/processing as part of model training.
- An academic background in Materials Science, Chemistry, Chemical Engineering, or related field.
- Even more bonus points if you have an understanding of crystallography, materials properties, and computational chemistry concepts!
What We Offer
- A competitive salary: We value and reward impact and growth.
- Equity in CuspAI: You have a stake in the success of the company.
- Time off to stay fresh: 28 days holiday (DE, NL, UK) or 21 days holiday (JP, SG, US), in addition to local public holidays.
- ‘Gold Standard’ parental leave: 26 weeks (primary caregiver) and 12 weeks (secondary caregiver) at full pay.
- Professional development budget: We invest in your career development so you can stay up to date with the latest industry knowledge or add to your skills to increase impact and growth.
- Solve meaningful problems: See how your work has a direct impact on advancing materials science and solving sustainability and climate-related problems.
- True interdisciplinary teamwork: Be part of a deeply collaborative environment bridging AI research, computational chemistry, and experimental science.
CuspAI is an equal opportunities employer committed to building a diverse and inclusive workplace. We do not discriminate on the basis of sex, race, religion or belief, ethnic or national origin, disability, age, citizenship, marital, domestic or civil partnership status, sexual orientation, gender identity, pregnancy or related condition, veteran status, or any other basis protected by applicable law. We actively encourage applications from all backgrounds and value the unique perspectives and contributions that diversity brings to our team.
Please let us know if you require any specific adjustments during or after the interview process. We will do everything we can within reason to accommodate.
Data Engineer in London employer: CuspAI
CuspAI is an exceptional employer that fosters a culture of innovation and collaboration, where your contributions directly impact the advancement of materials science and sustainability. With a competitive salary, equity options, generous parental leave, and a commitment to professional development, we empower our employees to grow while working alongside world-class researchers in a supportive and inclusive environment. Join us in shaping the future of materials with AI, and be part of a team that values diversity and the unique perspectives each member brings.
StudySmarter Expert Advice🤫
We think this is how you could land the Data Engineer role in London
✨Tip Number 1
Network like a pro! Reach out to people in the industry, attend meetups, and connect with CuspAI employees on LinkedIn. A friendly chat can sometimes lead to job opportunities that aren’t even advertised!
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your data engineering projects. This gives potential employers a taste of what you can do and sets you apart from the crowd.
✨Tip Number 3
Prepare for interviews by brushing up on relevant technologies and concepts. Practice common data engineering questions and think about how your experience aligns with CuspAI’s mission. Confidence is key!
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, it shows you’re genuinely interested in being part of the CuspAI team.
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter for the Data Engineer role. Highlight your experience with data pipelines, Python, and any relevant projects that showcase your skills in handling large datasets. We want to see how you can contribute to our mission!
Show Your Passion: Let us know why you're excited about working at CuspAI! Share your interest in materials science and AI, and how you see yourself making an impact. A genuine enthusiasm for the field can really make your application stand out.
Be Clear and Concise: When writing your application, keep it clear and to the point. Use bullet points where possible to make it easy for us to read through your qualifications and experiences. We appreciate a well-structured application that gets straight to the good stuff!
Apply Through Our Website: We encourage you to apply directly through our website. This ensures your application goes straight to the right team and helps us keep track of all applicants. Plus, it’s super easy to do – just follow the prompts and submit your info!
How to prepare for a job interview at CuspAI
✨Know Your Data Engineering Basics
Before the interview, brush up on your data engineering fundamentals. Be ready to discuss your experience with Python, databases, and large-scale data processing. CuspAI is looking for someone who can hit the ground running, so showcasing your technical skills will definitely give you an edge.
✨Understand the Science Behind the Role
Even if you don't have a deep background in materials science or chemistry, take some time to familiarise yourself with key concepts. Understanding how data relates to scientific research will help you communicate effectively with the team and demonstrate your enthusiasm for the role.
✨Prepare for Collaboration Questions
CuspAI values teamwork, so be prepared to discuss your past experiences working with cross-functional teams. Think of examples where you've collaborated with scientists or researchers, and how you ensured that data integrity was maintained throughout the process.
✨Showcase Your Problem-Solving Skills
Be ready to share specific examples of challenges you've faced in previous roles and how you overcame them. Highlight your ability to design scalable systems and automate workflows, as these are crucial for the Data Engineer position at CuspAI.