Senior Data Engineer in London

London | Full-Time | 60000 - 80000 € / year (est.) | Home office (partial)
Boehringer Ingelheim

At a Glance

  • Tasks: Transform biomedical datasets into AI-ready assets and build robust data engineering pipelines.
  • Company: Join Boehringer Ingelheim, a Top Employer in the UK, focused on innovation and collaboration.
  • Benefits: Enjoy a hybrid work model, competitive salary, and supportive HR policies.
  • Other info: Dynamic team environment with opportunities for professional growth and development.
  • Why this job: Make a real impact in biomedical AI and enhance patient outcomes through your work.
  • Qualifications: PhD in a relevant field and strong experience in data engineering for machine learning.

The predicted salary is between 60000 and 80000 € per year.

The AI Accelerator is a brand-new, London-based hub, sitting within Computational Innovation (CI), which is a global organisation comprising computational biology, human genetics, data excellence and AI expertise.

The purpose of CI’s AI Accelerator is to provision production-quality, versatile, foundational biomedical AI capabilities that can be adapted and deployed to improve and accelerate portfolio decision-making and increase the probability of success, by furthering understanding of the biology driving patient outcomes and identifying mechanisms involved in disease.

A core component of the AI Accelerator is AI Enablement, a team focused on ensuring that the accelerator’s model provisioning teams can design, build and deploy versatile biomedical foundation models that can enhance human understanding of disease biology and help identify potential targets, biomarkers and patient segments for further research. This will be achieved by provisioning AI-ready, integrated, multimodal data for distributed training, managing the model lifecycle and partnering with the IT organisation to ensure that model builders and downstream users have the necessary infrastructure and tooling to prototype, implement, adapt and deploy AI capabilities to advance the portfolio.

We are seeking a Senior Data Engineer to join the AI Enablement team and contribute to the design and delivery of robust data engineering pipelines that transform harmonised biomedical datasets into AI-ready, integrated assets across multi-omics, clinical and health records, and medical imaging data.

You will be an experienced, independent data engineer within AI Enablement, owning significant data engineering workstreams within the broader technical direction and architecture set by the Senior Staff Data Engineer. The pipelines and integrated datasets you build will enable model training, fine-tuning and inference.

Key Responsibilities

  • Transform harmonised datasets into AI-ready assets suitable for large model pre-training and fine-tuning within the defined standards and specifications.
  • Build and maintain entity linking pipelines that connect patients and biomedical entities across modalities.
  • Build and maintain cross-modal integration pipelines to support multimodal training, fine-tuning and inference.
  • Ensure pipelines and datasets are built and operated in accordance with data access permissions, consent conditions and usage restrictions.
  • Maintain data lineage and provenance throughout.
  • Build and maintain biomedical benchmark datasets with versioning and documentation.
  • Write clean, well-tested, well-documented code that meets the required engineering standards.
  • Contribute to code reviews within the data engineering team.
  • Stay current with advances in data engineering tooling and practices relevant to biomedical AI.

Required Qualifications

  • PhD in Machine Learning, Computer Science, Bioinformatics, Computational Biology or a related quantitative field.
  • Strong hands-on experience in data engineering for machine learning.
  • Experience working with at least one biomedical data modality in a data engineering context.
  • Practical experience with entity linking or record linkage, ideally in a biomedical or clinical context.
  • Strong understanding of biomedical data characteristics, such as variant data formats, expression matrices and clinical coding standards (e.g. SNOMED and ICD-10).
  • Proficiency with modern data engineering tools.
  • Familiarity with data governance frameworks applicable to biomedical and clinical data.
  • Familiarity with Trusted Research Environments or controlled access biomedical data environments.
  • Experience with biomedical ontology systems and identifier mapping across modalities.
  • Contributions to open-source data engineering or bioinformatics tooling.

This is a hybrid role with approximately 3 days a week in the office.

Boehringer Ingelheim has been recognised as a Top Employer in the UK, demonstrating our commitment to building an exceptional workplace through strong people practices and supportive HR policies.

Senior Data Engineer in London employer: Boehringer Ingelheim

Boehringer Ingelheim's AI Accelerator in London offers an exceptional work environment for Senior Data Engineers, fostering a culture of innovation and collaboration. With a strong commitment to employee growth, the company provides access to cutting-edge resources and training opportunities, ensuring that team members can thrive in their careers while contributing to impactful biomedical advancements. Recognised as a Top Employer in the UK, Boehringer Ingelheim prioritises employee well-being and engagement, making it a rewarding place to work.

Contact Details:

Boehringer Ingelheim Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Data Engineer role in London

Tip Number 1

Network like a pro! Reach out to folks in the industry, especially those at the AI Accelerator or similar companies. A friendly chat can open doors and give you insights that job descriptions just can't.

Tip Number 2

Show off your skills! Prepare a portfolio showcasing your data engineering projects, especially those related to biomedical data. This will help us see your hands-on experience and how you tackle real-world problems.

Tip Number 3

Ace the interview by being ready to discuss your thought process. We want to know how you approach building data pipelines and integrating datasets. Be prepared to share examples of challenges you've faced and how you overcame them.

Tip Number 4

Apply through our website! It’s the best way to ensure your application gets noticed. Plus, it shows you're genuinely interested in joining our team at the AI Accelerator.

We think you need these skills to ace the Senior Data Engineer role in London

Data Engineering
Machine Learning
Biomedical Data Modality Experience
Entity Linking
Record Linkage
Biomedical Data Characteristics Understanding
Data Governance Frameworks

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Senior Data Engineer role. Highlight your experience with data engineering, especially in biomedical contexts, and showcase any relevant projects or tools you've worked with.

Craft a Compelling Cover Letter: Your cover letter should tell us why you're passionate about this role and how your skills align with our mission at the AI Accelerator. Be specific about your experience with AI-ready datasets and data pipelines.

Showcase Your Technical Skills: Don’t forget to mention your proficiency with modern data engineering tools and any contributions to open-source projects. We love seeing candidates who are up-to-date with the latest trends in data engineering!

Apply Through Our Website: We encourage you to apply through our website for a smoother application process. It helps us keep track of your application and ensures you don’t miss out on any important updates!

How to prepare for a job interview at Boehringer Ingelheim

Know Your Data Inside Out

Make sure you’re well-versed in the specifics of biomedical data, especially the formats and standards mentioned in the job description. Brush up on your knowledge of variant data formats, expression matrices, and clinical coding standards like SNOMED and ICD-10. This will show that you understand the nuances of the role and can hit the ground running.

Showcase Your Pipeline Skills

Be prepared to discuss your experience with building and maintaining data engineering pipelines. Bring examples of how you've transformed datasets into AI-ready assets and any challenges you faced along the way. Highlighting your hands-on experience with entity linking and cross-modal integration will definitely impress the interviewers.

Stay Current with Tools and Practices

Familiarise yourself with the latest data engineering tools and practices relevant to biomedical AI. Mention any recent advancements or tools you’ve worked with that could benefit the team. This shows your commitment to continuous learning and staying ahead in the field.

Prepare for Technical Questions

Expect technical questions that dive deep into your data engineering expertise. Practice explaining complex concepts clearly and concisely. You might also want to prepare for a coding challenge or a code review scenario, so brush up on writing clean, well-documented code that meets engineering standards.