Technology - ML Ops Engineer

Full-Time | £70,000 - £90,000 / year (est.) | Hybrid working

At a Glance

  • Tasks: Drive production-grade ML and LLM services on Azure, ensuring reliability and scalability.
  • Company: Join the UK's largest online pharmacy with a commitment to social responsibility.
  • Benefits: Competitive salary, extensive benefits, hybrid work, and a supportive environment.
  • Other info: Great career growth opportunities in a positive, open workplace.
  • Why this job: Make a real impact in digital healthcare while working with cutting-edge technology.
  • Qualifications: Strong Python skills, experience in ML frameworks, and a solid DevOps background.

The predicted salary is between £70,000 and £90,000 per year.

Location: Hybrid schedule; 2-3 days a week in the office at Thorpe Park, Leeds.

Working hours: Core hours 09:30 – 16:00; you can work around these to suit you.

Salary: Dependent on experience (DOE), plus extensive benefits

Contract type: Permanent

Employment type: Full time

Our tech teams keep us running 24/7 to ensure world‑class service for our patients. This role may include participation in an out‑of‑hours rota as required by the business, with a fair scheduling process and additional compensation for on‑call periods.

About Us: We are the nation's largest online pharmacy, with 25 years of experience, helping over 1.8 million patients in England manage NHS prescriptions from request through to delivery. We are Great Place to Work certified and a certified B Corp, reflecting high standards of social and environmental responsibility. Our people are fundamental to our success as we strive to be a world‑leading, patient‑centric digital healthcare provider and to maintain a positive, open and honest working environment.

Role Overview: The ML Ops Engineer will drive the operation of production‑grade Machine Learning and LLM services on Azure, ensuring models run as reliable, scalable, high‑performing systems. You will own the end‑to‑end MLOps/LLMOps lifecycle, leading CI/CD, deployment automation, monitoring, and incident response. You will work closely with Data Science to turn models into robust production services with governance, observability, and continuous optimisation for fast, safe, and efficient delivery at scale.

What you’ll be doing:

  • Production Deployment & Release Engineering: Design and operate CI/CD pipelines for ML models and LLM prompt‑flows, covering build, test, validation, deployment, and rollback. Own model registration and promotion across environments, ensuring traceability, governance, and auditability. Implement safe deployment strategies (blue/green, canary, champion/challenger). Package and deploy containerised inference services and batch pipelines, ensuring repeatability and rapid rollback.
  • Reliability Engineering (Day 2 Operations): Run ML and LLM services as production‑grade systems, defining SLOs/SLIs, dashboards, and alerting. Lead incident response for runtime issues, including triage, mitigation, recovery, and post‑incident reviews. Develop and maintain operational runbooks covering restart, rollback, secret rotation, and safe‑mode scenarios. Improve service resilience and reduce MTTR through automation (self‑healing, retries, fallbacks, circuit breakers).
  • Observability (Service, Data, Model & Cost): Implement monitoring for availability, latency, errors, resource usage, and job performance. Monitor data quality including freshness, volume, completeness, schema drift, and distribution changes. Monitor model performance, including drift and prediction distribution shifts, and track accuracy where labels exist. Instrument LLM services for token usage, latency, and safety signals, with clear visibility into cost, quotas, and risks.
  • LLMOps: Lifecycle, Quality & Safety: Manage prompts and workflows as code, including versioning, code reviews, and automated regression testing. Own production configuration for LLM deployments, including model updates, limits, and safeguards. Partner with Data Science and Security to ensure robust safety practices, including PII protection and prompt‑injection testing.
  • Security, Privacy & Governance: Implement secure access controls, identity management, and secrets handling. Support production readiness through documentation, monitoring plans, cost models, and audit evidence. Ensure all changes follow structured governance with clear traceability and reproducibility.

Who we’re looking for:

  • Strong Python engineering skills with experience in ML frameworks (scikit‑learn, PyTorch, TensorFlow) and experiment tracking.
  • Comfortable in regulated environments with privacy, auditability, change control, and handling sensitive data.
  • Strong DevOps/SRE background: CI/CD, Infrastructure as Code, monitoring and alerting, incident management, reliability engineering.
  • Hands‑on experience with Docker and Kubernetes (e.g., AKS), including debugging and performance tuning.
  • Experience with Azure, including Azure Machine Learning (pipelines, registries, endpoints) and Azure Monitor or Log Analytics.
  • Experience operationalising ML pipelines (training, batch scoring, feature engineering) and preventing training‑serving skew.
  • Experience implementing safe deployment practices (blue/green or canary) with automated validation.
  • Understanding of data contracts, schema evolution, and data quality practices, including troubleshooting data drift and missing features.

Please click apply. If we think you are a good match, we will be in touch to arrange an interview. Applicants must prove they have the right to live in the UK. All successful applicants will be required to undergo a DBS check. Unsolicited agency applications will be treated as a gift.

Technology - ML Ops Engineer employer: Pharmacy2U | Certified B Corp

As the nation's largest online pharmacy, we pride ourselves on fostering a supportive and innovative work culture that prioritises employee well-being and growth. Located in the vibrant Thorpe Park area of Leeds, our hybrid working model offers flexibility with core hours to suit your lifestyle, while our extensive benefits package and commitment to social responsibility make us an exceptional employer for those seeking meaningful and rewarding careers in digital healthcare.

Contact Details:

Pharmacy2U | Certified B Corp Recruitment Team

We think you need these skills to ace Technology - ML Ops Engineer

Problem-Solving Skills
Communication Skills
SQL
Python
Data Engineering
Automation
Data Pipeline Development