Lead Site Reliability Engineer

Full-Time · £48,000 - £72,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Lead the charge in building reliable software tools and systems for engineers and data scientists.
  • Company: Join a forward-thinking tech company focused on operational excellence and innovation.
  • Benefits: Enjoy competitive pay, flexible work options, and opportunities for professional growth.
  • Why this job: Make a real impact by shaping cloud-native systems and mentoring future tech leaders.
  • Qualifications: 5-8 years in SRE or related fields with strong coding skills in Python or Go.
  • Other info: Collaborative environment with a focus on automation and engineering culture.

The predicted salary is between £48,000 and £72,000 per year.

As a Lead Site Reliability Engineer, you will ensure the high-quality delivery of our software by building and maintaining the tools that software engineers and data scientists use to deploy and monitor their code. In this role, you will be a champion of automation, reliability, and operational excellence.

Technical Leadership That Raises the Bar

  • Architect and improve cloud-native systems with reliability as a first-class principle.
  • Shape SLIs/SLOs, error budgets, capacity planning, and performance strategies.
  • Continuously evolve availability, efficiency, and resilience across our platforms.
  • Mentor SREs, platform engineers, and developers across the organisation.
  • Champion automation, observability, DevSecOps, and modern operational practices.
  • Influence engineering culture and architectural direction.

Operational Excellence

  • Own and lead high-severity incident response with calm, clarity, and technical depth.
  • Run world-class post-incident reviews and drive meaningful, measurable improvements.
  • Strengthen monitoring, alerting, on-call practices, and reliability processes.
  • Support resilience validation through load testing, stress testing, and chaos engineering.

Automation, Tooling & Engineering Efficiency

  • Build tools and automation that remove toil and accelerate teams.
  • Develop CI/CD pipelines and Infrastructure-as-Code environments.
  • Drive consistency, repeatability, and self-service across engineering.

Cross-Team Collaboration

  • Partner with Security, Platform, and Engineering teams to align reliability with security and resilience goals.
  • Lead teams toward better design, operational readiness, and measurable service health.
  • Contribute to documentation, runbooks, and operational processes that scale.

The security engineering team's mission is to build security services, platforms, and technologies, and to support cross-functional teams in protecting our users, products, and infrastructure.

Qualifications

  • 5-8+ years in SRE, Platform, Cloud Infrastructure, or operational engineering roles.
  • Hands-on experience architecting and improving large-scale, distributed systems.
  • Strong coding proficiency in Python, Go, Bash, or similar automation-focused languages.
  • Expertise with observability stacks: Datadog, Prometheus, Grafana, OpenTelemetry.
  • Deep AWS experience across EC2, EKS, Lambda, VPC, DynamoDB, S3, CloudFront, RDS, IAM, KMS, and more.
  • Proficiency with Terraform, CloudFormation, or AWS CDK.
  • Incident response leadership and root-cause analysis expertise.
  • Excellent documentation and communication skills.
  • Strong analytical and troubleshooting abilities.

Bonus

  • Experience mentoring or leading engineers within SRE or platform teams.
  • Experience with load testing, stress testing, and chaos engineering.
  • A passion for uplifting engineering culture through tooling, automation, and reliability-first thinking.

Lead Site Reliability Engineer employer: Holland & Barrett

As a Lead Site Reliability Engineer, you will join a forward-thinking company that prioritises innovation and operational excellence in a collaborative environment. With a strong focus on employee growth, the company offers extensive mentorship opportunities and encourages a culture of automation and reliability, ensuring that your contributions directly impact the success of our cloud-native systems. Located in a vibrant tech hub, you will benefit from a dynamic work culture that values creativity and teamwork, making it an ideal place for professionals seeking meaningful and rewarding careers.

Contact Details:

Holland & Barrett Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Lead Site Reliability Engineer role

✨Tip Number 1

Network like a pro! Reach out to your connections in the industry, attend meetups, and engage in online forums. The more people you know, the better your chances of landing that Lead Site Reliability Engineer role.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving automation and cloud-native systems. This will give potential employers a taste of what you can bring to the table.

✨Tip Number 3

Prepare for interviews by brushing up on your technical knowledge and incident response strategies. Be ready to discuss your experience with tools like Datadog and Terraform, and how you've improved reliability in past roles.

✨Tip Number 4

Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Plus, it makes it easier for us to keep track of your application and get back to you quickly.

We think you need these skills to ace the Lead Site Reliability Engineer role

Cloud-Native Architecture
SLIs/SLOs Definition
Error Budgets Management
Capacity Planning
Performance Strategies
Incident Response Leadership
Post-Incident Review
Monitoring and Alerting
Load Testing
Stress Testing
Chaos Engineering
CI/CD Pipeline Development
Infrastructure-as-Code
Python
Go
Bash
Observability Tools (Datadog, Prometheus, Grafana, OpenTelemetry)
AWS Services (EC2, EKS, Lambda, VPC, DynamoDB, S3, CloudFront, RDS, IAM, KMS)
Terraform
CloudFormation
AWS CDK
Documentation Skills
Analytical Skills
Troubleshooting Abilities
Mentoring Skills

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Lead Site Reliability Engineer role. Highlight your hands-on experience with cloud-native systems and automation tools, as these are key to what we’re looking for.

Showcase Your Projects: Include specific examples of projects where you've improved reliability or built automation tools. We love seeing how you’ve tackled challenges in previous roles, so don’t hold back on the details!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about SRE and how your experience aligns with the employer’s mission. Keep it engaging and personal – we want to get to know you!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining our team!

How to prepare for a job interview at Holland & Barrett

✨Know Your Tech Inside Out

Make sure you’re well-versed in the technologies mentioned in the job description, especially AWS services and automation tools like Terraform. Brush up on your coding skills in Python or Go, as you might be asked to solve technical problems on the spot.

✨Showcase Your Incident Response Skills

Prepare to discuss your experience with high-severity incident responses. Have examples ready that demonstrate your calmness under pressure and your ability to lead post-incident reviews. This will show your potential employer that you can handle critical situations effectively.

✨Emphasise Collaboration and Mentorship

Highlight any experience you have in mentoring or collaborating with cross-functional teams. Discuss how you’ve influenced engineering culture or improved operational practices in previous roles. This will align with their need for someone who can lead and uplift others.

✨Prepare Questions That Matter

Think of insightful questions to ask about their current SRE practices, tooling, and team dynamics. This shows your genuine interest in the role and helps you assess if the company’s culture aligns with your values, especially regarding reliability and automation.
