Senior Data Engineer

Full-Time | 60000 - 80000 € / year (est.) | Home office (partial)
Boehringer Ingelheim

At a Glance

  • Tasks: Transform biomedical datasets into AI-ready assets and build robust data engineering pipelines.
  • Company: Join a pioneering AI Accelerator in London focused on computational innovation.
  • Benefits: Hybrid work model, competitive salary, and opportunities for professional growth.
  • Other info: Collaborative environment with a focus on cutting-edge technology and career development.
  • Why this job: Make a real impact in biomedical AI and advance healthcare through innovative data solutions.
  • Qualifications: PhD in a relevant field and strong experience in data engineering for machine learning.

The predicted salary is between 60000 and 80000 € per year.

The AI Accelerator is a brand-new, London-based hub within Computational Innovation (CI), a global organisation comprising computational biology, human genetics, data excellence and AI expertise. The purpose of CI’s AI Accelerator is to provision production-quality, versatile, foundational biomedical AI capabilities that can be adapted and deployed to improve and accelerate portfolio decision-making and increase the probability of success, by furthering understanding of the biology driving patient outcomes and identifying mechanisms involved in disease.

A core component of the AI Accelerator is AI Enablement, a team focused on ensuring that the accelerator’s model provisioning teams can design, build and deploy versatile biomedical foundation models that can enhance human understanding of disease biology and help identify potential targets, biomarkers and patient segments for further research. This will be achieved by provisioning AI‑ready, integrated, multimodal data for distributed training, managing the model lifecycle and partnering with the IT organisation to ensure that model builders and downstream users have the necessary infrastructure and tooling to prototype, implement, adapt and deploy AI capabilities to advance the portfolio.

We are seeking a Senior Data Engineer to join the AI Enablement team and contribute to the design and delivery of robust data engineering pipelines that transform harmonised biomedical datasets into AI‑ready, integrated assets across multi‑omics, clinical and health records, and medical imaging data. You will be an experienced, independent data engineer within AI Enablement, owning significant data engineering workstreams within the broader technical direction and architecture set by the Senior Staff Data Engineer. The pipelines and integrated datasets you build will enable model training, fine‑tuning and inference.

Key Responsibilities
  • Transform harmonised datasets into AI‑ready assets suitable for large model pre‑training and fine‑tuning within the defined standards and specifications.
  • Build and maintain entity linking pipelines that connect patients and biomedical entities across modalities.
  • Build and maintain cross‑modal integration pipelines to support multimodal training, fine‑tuning and inference.
  • Ensure pipelines and datasets are built and operated in accordance with data access permissions, consent conditions and usage restrictions.
  • Maintain data lineage and provenance throughout.
  • Build and maintain biomedical benchmark datasets with versioning and documentation.
  • Write clean, well‑tested, well‑documented code that meets the required engineering standards.
  • Contribute to code reviews within the data engineering team.
  • Stay current with advances in data engineering tooling and practices relevant to biomedical AI.

Required Qualifications
  • PhD in Machine Learning, Computer Science, Bioinformatics, Computational Biology or a related quantitative field.
  • Strong hands‑on experience in data engineering for machine learning.
  • Experience working with at least one biomedical data modality in a data engineering context.
  • Practical experience with entity linking or record linkage, ideally in a biomedical or clinical context.
  • Strong understanding of biomedical data characteristics, including variant data formats, expression matrices, and clinical coding standards such as SNOMED and ICD‑10.
  • Proficiency with modern data engineering tools.
  • Familiarity with data governance frameworks applicable to biomedical and clinical data.
  • Familiarity with Trusted Research Environments or controlled access biomedical data environments.
  • Experience with biomedical ontology systems and identifier mapping across modalities.
  • Contributions to open‑source data engineering or bioinformatics tooling.

Hybrid role: approximately 3 days a week in the office.

Senior Data Engineer employer: Boehringer Ingelheim

As a Senior Data Engineer at our innovative AI Accelerator in London, you will be part of a dynamic team dedicated to advancing biomedical AI capabilities that directly impact patient outcomes. We pride ourselves on fostering a collaborative work culture that encourages continuous learning and professional growth, offering unique opportunities to work with cutting-edge technology in a supportive environment. With a hybrid work model and a commitment to employee well-being, we ensure that our team members thrive both personally and professionally while contributing to meaningful projects in the healthcare sector.

Contact Details:

Boehringer Ingelheim Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Data Engineer role

Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with people on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.

Tip Number 2

Show off your skills! Create a portfolio showcasing your data engineering projects, especially those related to biomedical AI. This will give potential employers a taste of what you can do and set you apart from the crowd.

Tip Number 3

Prepare for interviews by brushing up on your technical knowledge and soft skills. Practice common interview questions and be ready to discuss your experience with data pipelines and biomedical datasets. Confidence is key!

Tip Number 4

Don’t forget to apply through our website! We’re always on the lookout for talented individuals like you. Keep an eye on our job listings and make sure your application stands out by tailoring it to the role.

We think you need these skills to ace the Senior Data Engineer role

Data Engineering
Machine Learning
Bioinformatics
Computational Biology
Entity Linking
Record Linkage
Biomedical Data Modalities

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Senior Data Engineer role. Highlight your experience with data engineering, especially in biomedical contexts, and showcase any relevant projects or tools you've worked with. We want to see how your skills align with our needs!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about AI Enablement and how your background makes you a perfect fit for our team. Don’t forget to mention specific experiences that relate to the job description.

Showcase Your Technical Skills: In your application, be sure to highlight your technical skills, especially those related to data engineering tools and practices. Mention any hands-on experience you have with biomedical data modalities and entity linking, as these are key for us.

Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It’s super easy, and you’ll be able to submit all the necessary documents in one go. Plus, we love seeing applications come directly from our site!

How to prepare for a job interview at Boehringer Ingelheim

Know Your Data Inside Out

Make sure you’re well-versed in the specifics of biomedical data, especially the formats and standards mentioned in the job description. Brush up on your knowledge of variant data formats, expression matrices, and clinical coding standards like SNOMED and ICD-10. This will show that you understand the nuances of the data you'll be working with.

Showcase Your Pipeline Skills

Be prepared to discuss your experience with building and maintaining data pipelines. Have examples ready that demonstrate your ability to transform datasets into AI-ready assets. Highlight any specific projects where you’ve worked on entity linking or cross-modal integration, as these are key responsibilities for the role.

Stay Current with Tools and Practices

Familiarise yourself with the latest data engineering tools and practices relevant to biomedical AI. Mention any recent advancements or tools you’ve used in your work. This shows that you’re proactive about staying updated and can bring fresh ideas to the team.

Prepare for Technical Questions

Expect technical questions that assess your problem-solving skills and understanding of data governance frameworks. Practice explaining complex concepts clearly and concisely, as you may need to communicate these ideas to non-technical stakeholders. Being able to articulate your thought process is crucial.