At a Glance
- Tasks: Lead the charge in building reliable software tools and systems for engineers and data scientists.
- Company: Join a forward-thinking tech company focused on operational excellence and innovation.
- Benefits: Enjoy competitive pay, flexible work options, and opportunities for professional growth.
- Why this job: Make a real impact by shaping cloud-native systems and mentoring future tech leaders.
- Qualifications: 5-8 years in SRE or related fields with strong coding skills in Python or Go.
- Other info: Collaborative environment with a focus on automation and engineering culture.
The predicted salary is between £48,000 and £72,000 per year.
As a Senior Site Reliability Engineer, you will ensure the high-quality delivery of our software by building and maintaining tools used by software engineers and data scientists to deploy and monitor their code. In this role, you will be a champion of automation, reliability, and operational excellence.
Technical Leadership That Raises the Bar
- Architect and improve cloud-native systems with reliability as a first-class principle.
- Shape SLIs/SLOs, error budgets, capacity planning, and performance strategies.
- Continuously evolve availability, efficiency, and resilience across our platforms.
- Mentor SREs, platform engineers, and developers across the organisation.
- Champion automation, observability, DevSecOps, and modern operational practices.
- Influence engineering culture and architectural direction.
Operational Excellence
- Own and lead high-severity incident response with calm, clarity, and technical depth.
- Run world-class post-incident reviews and drive meaningful, measurable improvements.
- Strengthen monitoring, alerting, on-call practices, and reliability processes.
- Support resilience validation through load testing, stress testing, and chaos engineering.
Automation, Tooling & Engineering Efficiency
- Build tools and automation that remove toil and accelerate teams.
- Develop CI/CD pipelines and Infrastructure-as-Code environments.
- Drive consistency, repeatability, and self-service across engineering.
Cross-Team Collaboration
- Partner with Security, Platform, and Engineering teams to align reliability with security and resilience goals.
- Lead teams toward better design, operational readiness, and measurable service health.
- Contribute to documentation, runbooks, and operational processes that scale.
The security engineering team's mission is to build security services, platforms, and technologies, and to support cross-functional teams in protecting our users, products, and infrastructure.
Qualifications
- 5-8+ years in SRE, Platform, Cloud Infrastructure, or operational engineering roles.
- Hands-on experience architecting and improving large-scale, distributed systems.
- Strong coding proficiency in Python, Go, Bash, or similar automation-focused languages.
- Expertise with observability stacks: Datadog, Prometheus, Grafana, OpenTelemetry.
- Deep AWS experience across EC2, EKS, Lambda, VPC, DynamoDB, S3, CloudFront, RDS, IAM, KMS, and more.
- Proficiency with Terraform, CloudFormation, or AWS CDK.
- Incident response leadership and root-cause analysis expertise.
- Excellent documentation and communication skills.
- Strong analytical and troubleshooting abilities.
Bonus
- Experience mentoring or leading engineers within SRE or platform teams.
- Experience with load testing, stress testing, and chaos engineering.
- A passion for uplifting engineering culture through tooling, automation, and reliability-first thinking.
Lead Site Reliability Engineer employer: Holland & Barrett
Contact Details:
Holland & Barrett Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Lead Site Reliability Engineer role
✨Tip Number 1
Network like a pro! Reach out to your connections in the industry, attend meetups, and engage in online forums. The more people you know, the better your chances of landing that Lead Site Reliability Engineer role.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving automation and cloud-native systems. This will give potential employers a taste of what you can bring to the table.
✨Tip Number 3
Prepare for interviews by brushing up on your technical knowledge and incident response strategies. Be ready to discuss your experience with tools like Datadog and Terraform, and how you've improved reliability in past roles.
✨Tip Number 4
Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Plus, it makes it easier for us to keep track of your application and get back to you quickly.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Lead Site Reliability Engineer role. Highlight your hands-on experience with cloud-native systems and automation tools, as these are key to what we’re looking for.
Showcase Your Projects: Include specific examples of projects where you've improved reliability or built automation tools. We love seeing how you’ve tackled challenges in previous roles, so don’t hold back on the details!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about SRE and how your experience aligns with our mission at StudySmarter. Keep it engaging and personal – we want to get to know you!
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining our team!
How to prepare for a job interview at Holland & Barrett
✨Know Your Tech Inside Out
Make sure you’re well-versed in the technologies mentioned in the job description, especially AWS services and automation tools like Terraform. Brush up on your coding skills in Python or Go, as you might be asked to solve technical problems on the spot.
✨Showcase Your Incident Response Skills
Prepare to discuss your experience with high-severity incident response. Have examples ready that demonstrate your calm under pressure and your ability to lead post-incident reviews. This will show your potential employer that you can handle critical situations effectively.
✨Emphasise Collaboration and Mentorship
Highlight any experience you have in mentoring or collaborating with cross-functional teams. Discuss how you’ve influenced engineering culture or improved operational practices in previous roles. This will align with their need for someone who can lead and uplift others.
✨Prepare Questions That Matter
Think of insightful questions to ask about their current SRE practices, tooling, and team dynamics. This shows your genuine interest in the role and helps you assess if the company’s culture aligns with your values, especially regarding reliability and automation.