Senior MLOps Engineer in London

London Full-Time £60,000 - £80,000 / year (est.) Hybrid, approximately 3 days a week in the office
Boehringer Ingelheim GmbH

At a Glance

  • Tasks: Ensure AI models move from development to production smoothly and efficiently.
  • Company: Join a pioneering AI Accelerator in London focused on transforming disease understanding.
  • Benefits: Hybrid work model, competitive salary, and opportunities for professional growth.
  • Other info: Collaborative environment with a focus on operational excellence and innovation.
  • Why this job: Make a real impact in healthcare by deploying cutting-edge AI solutions.
  • Qualifications: MSc or PhD in relevant fields with hands-on ML experience required.

The predicted salary is between £60,000 and £80,000 per year.

Most diseases are still poorly understood at a biological level. Despite decades of research, the causal mechanisms driving many conditions remain unclear, limiting our ability to identify the right targets, design the right interventions and bring the right medicines to patients. The AI Accelerator exists to change that. Based in London and sitting within Computational Innovation, a global organisation spanning computational biology, human genetics, data excellence and AI, the Accelerator’s mission is to build production-quality AI capabilities that deepen our understanding of disease biology and increase the probability of success.

We do this by applying neural-based methods across the biomedical data landscape to integrate heterogeneous, multimodal data sources, infer biological relationships and embed causal thinking into what we build. The goal is not just to predict but to explain and understand why disease occurs. It could be electronic health records and medical imaging to support patient segmentation. It could be ‘omics data to identify novel therapeutic targets. It could be predicting transcriptional change for a given disease-causing variant. It could be simulating the effect of modulating a target of interest.

A core component of the AI Accelerator is AI Enablement, which provides the support framework to make our ambitions a technical reality. It could be provisioning integrated, multimodal biomedical data for model training and inference. It could be managing the lifecycle of models provided by AI Systems. It could be working with IT to ensure the right infrastructure and tooling are in place. AI Enablement ensures that the model builders can focus on the technology and that Computational Innovation’s downstream users can leverage Accelerator capabilities for real portfolio impact.

We are looking for a Senior MLOps Engineer to join AI Enablement and play a central role in ensuring that the AI Accelerator’s models move from development to production reliably and keep performing. This is a hands-on operational role with real stakes. The models you deploy and manage will be used to make decisions about which indications to pursue, in which patient population and against which target. When your systems work well, science moves faster and portfolio decision-making gets better.

You will take full operational ownership of shipped models, managing deployment, monitoring, retraining and lifecycle end-to-end. You will make sure that the IT-provisioned experiment tracking and model registry systems are used effectively, that training and fine-tuning runs are consistently and correctly logged and that model artefacts are registered with full provenance from data through to prediction. You will work closely with ML engineers at model handover, reviewing documentation and signing off before accepting operational ownership. This role is for someone who takes pride in operational excellence and who understands that the AI Accelerator models can only realise their impact on the portfolio if they are deployed and performing reliably in production.

Key Responsibilities

  • Ensure experiment tracking and model registry systems are used effectively across the AI Accelerator with consistent and correct logging of training and fine-tuning runs and model artefacts registered with full provenance.
  • Configure, run and troubleshoot distributed training and fine-tuning jobs, ensuring efficient use of available compute and resolving job-level failures.
  • Participate in structured model handovers with ML engineers, reviewing and signing off documentation before accepting full operational ownership of shipped models.
  • Deploy, monitor and manage model serving endpoints, making technical decisions about serving configurations to meet performance requirements of downstream users.
  • Take full operational ownership of models in production, managing monitoring, retraining and lifecycle end-to-end.
  • Uphold MLOps standards and practices across the AI Accelerator, contributing to their evolution based on operational experience and keeping teams current with relevant advances in MLOps tooling.

Required Qualifications

  • MSc in Machine Learning, Computer Science, Software Engineering or a related technical field, or equivalent industry experience; PhD preferred.
  • Solid hands-on experience operating ML training and serving workflows in production environments.
  • Experience with distributed training frameworks such as PyTorch Distributed, DeepSpeed, FSDP or Ray Train.
  • Experience operating experiment tracking and model registry systems such as MLflow, Weights & Biases or equivalent.
  • Familiarity with CI/CD tooling for ML workflows, e.g. cloud-native pipeline services, GitHub Actions or equivalent.
  • Solid understanding of cloud infrastructure for ML (compute, storage, networking), sufficient to specify requirements clearly and diagnose infrastructure-related issues.
  • Awareness of large model training characteristics including memory footprint, compute scaling and parallelisation strategies.
  • Familiarity with infrastructure-as-code tooling such as Terraform or cloud-native equivalents.
  • Experience working closely with research and ML engineering teams as a platform operator.
  • Familiarity with biomedical AI workloads, such as training foundation models on large-scale multimodal data.

This is a hybrid role with approximately 3 days a week in the office.

Senior MLOps Engineer in London employer: Boehringer Ingelheim GmbH

Join a pioneering team at the AI Accelerator in London, where your work as a Senior MLOps Engineer will directly contribute to advancing our understanding of disease biology through cutting-edge AI technologies. We foster a collaborative and innovative work culture that prioritises operational excellence and offers ample opportunities for professional growth, ensuring you can make a meaningful impact in the biomedical field while enjoying a supportive environment. With a hybrid working model, you will benefit from flexibility while being part of a global organisation dedicated to transforming healthcare.

Contact Details:

Boehringer Ingelheim GmbH Recruitment Team

We think you need these skills to ace Senior MLOps Engineer in London

MLOps
Machine Learning
Model Deployment
Experiment Tracking
Model Registry Systems
Distributed Training Frameworks
PyTorch Distributed