Senior Site Reliability Engineer - UK

Heidi

Full-Time, £60,000 - £80,000 per year (est.), hybrid with 3 days in the office

At a Glance

Tasks: Respond to production incidents and improve operational reliability through automation and system changes.
Company: Heidi is developing an AI Care Partner to support clinicians in delivering care effectively.
Benefits: Enjoy comprehensive private medical cover, a £700 learning budget, and global parental leave.
Other info: This position is hybrid, requiring 3 days in the office.
Why this job: Join a hands-on role focused on maintaining real systems in production with significant ownership.
Qualifications: 3–6+ years in SRE or operations-heavy engineering roles, with experience in cloud infrastructure and Kubernetes.

The predicted salary is between £60,000 and £80,000 per year.

About Heidi

Heidi is building an AI Care Partner that supports clinicians every step of the way, from documentation to delivery of care.

The Role

This role sits in the core Platform/SRE team that owns production. You’ll work directly on incident response, on-call duties, system reliability, and day-to-day operations for Heidi’s platform. We’re open to candidates who are strong mid-level SREs ready to take on more ownership, as well as senior SREs who enjoy being hands-on in operations. The role is intentionally ops-heavy and focused on keeping real systems healthy in production.

What You’ll Do

Participate in on-call and incident response: Respond to production incidents, contribute to service restoration, and support clear communication during incidents. Over time, take increasing responsibility for leading incidents end-to-end.
Improve operational reliability: Identify recurring issues and reliability risks, and drive fixes through better alerting, automation, system changes, or process improvements.
Own parts of the production environment: Operate and improve Kubernetes clusters, cloud infrastructure, and core platform services, with growing ownership as familiarity increases.
Strengthen observability: Improve dashboards, alerts, logs, and traces so issues are detected earlier and diagnosed faster, with a strong focus on actionable signals.
Reduce operational toil: Automate repetitive tasks, simplify runbooks, and improve tooling to make on-call and day-to-day operations easier and safer.
Support safe change: Improve deployments, rollback mechanisms, and operational readiness to reduce the risk of incidents caused by change.
Contribute to operational practices: Write and maintain runbooks, participate in blameless post-mortems, and help improve incident response processes over time.
Collaborate closely with engineers: Work with product and feature teams to improve production readiness, service ownership, and reliability expectations.

What We’re Looking For

3–6+ years in SRE, DevOps, Platform, or operations-heavy engineering roles.
Experience supporting production systems and participating in on-call rotations.
Comfortable debugging live systems under pressure.
Experience operating cloud infrastructure (AWS preferred).
Working knowledge of Kubernetes and containerised workloads.
Infrastructure as Code experience (Terraform or similar).
Familiarity with monitoring and alerting tools (Datadog, Prometheus, etc.).
Scripting or automation experience (Python, Bash, or similar).

Nice to Have

Experience leading incidents or mentoring others during on-call.
Experience in regulated or security-sensitive environments.
Familiarity with databases, queues, and caches in production.
Interest in reliability practices such as SLOs, error budgets, and capacity planning.

Benefits

Real product momentum. We’re not trying to generate interest; we’re channeling it.
Equity from day one.
Unmatched impact.
Work alongside world-class talent.
Your health, covered. Comprehensive private medical and dental cover through Bupa, plus 24/7 mental health, coaching and wellbeing support through Sonder and a £100/month Healthy Heidi’s stipend.
Global parental leave. 26 weeks paid for primary carers and 18 weeks for secondary carers, subject to eligibility.
Fertility support. £7,000 one-off payment, subject to eligibility.
Learning & development. £700 per year for courses, books, memberships, conferences and more.
Home office budget. £500 one-off to set up a workspace you actually want to work in.
Recharge days after major milestones and busy periods so you can reset and come back strong.
Work from anywhere for up to 4 weeks per year, wherever the world takes you.
Clinical leave. 10 days per year for eligible clinical roles to maintain accreditation and requirements.
Flexibility that works. A hybrid environment, with 3 days in the office.

Heidi’s Commitment to Diversity, Equity and Inclusion

Heidi is dedicated to creating an equitable, inclusive, and supportive work environment that brings people together from diverse backgrounds, experiences, and perspectives. Our strength is in our differences. We’re proud to be an equal opportunity employer and welcome all applicants, as we’re committed to promoting a culture of opportunity for all.

Senior Site Reliability Engineer - UK employer: Heidi

Heidi offers a unique opportunity to work on an impactful AI platform in the UK. Employees benefit from comprehensive health coverage and a generous learning budget. The team values diversity and inclusion, fostering a supportive environment for all backgrounds.

Contact Details:

Heidi Recruitment Team

We think you need these skills to ace Senior Site Reliability Engineer - UK

Incident Response

System Reliability

Kubernetes

Cloud Infrastructure (AWS preferred)

Infrastructure as Code (Terraform or similar)

Monitoring and Alerting Tools (Datadog, Prometheus, etc.)

Scripting or Automation (Python, Bash, or similar)

Debugging Live Systems

Operational Practices

Runbook Maintenance

Collaboration with Engineers

Reliability Practices (SLOs, error budgets, capacity planning)

Automation of Repetitive Tasks

Improvement of Deployment Mechanisms
