SGI

Site Reliability Engineering Lead - London | London, UK

London | Full-Time | £80,000 - £100,000 / year (est.) | No home office possible

At a Glance

Tasks: Lead and mentor the SRE team while ensuring system reliability and performance.
Company: Dynamic tech company in London focused on innovation and reliability.
Benefits: Competitive salary, flexible working hours, and opportunities for professional growth.
Other info: Join a culture of continuous improvement and collaboration in a fast-paced environment.
Why this job: Shape the future of our systems and make a real impact on reliability.
Qualifications: Strong software engineering background and experience with SLIs/SLOs and incident management.

The predicted salary is between £80,000 and £100,000 per year.

We’re looking for a true SRE leader with a strong software engineering background. This isn’t a DevOps “on-call only” role — you’ll need to be comfortable reading and writing production code, deeply understanding application behaviour, and working alongside developers as a technical peer. You’ll lead and mentor the SRE team, setting direction and raising the bar for reliability across our systems. You’ll take end-to-end ownership of production, ensuring availability, performance, and effective incident response, while defining SLIs and partnering with Product on meaningful SLOs and error budgets.

In practice, that means you’ll:

Own production systems (availability, performance, incident response)
Define SLIs/SLOs and use error budgets to guide decisions
Run incident management, on-call, and blameless postmortems
Get hands-on with code (PHP, Java/.NET) to troubleshoot and improve reliability
Drive automation and reduce operational toil
Build observability that gives real insight into system health
Partner with engineers to embed reliability into the SDLC

A big part of the role is shaping culture — creating a blameless environment, improving how we respond to incidents, and driving continuous, systemic improvements. You’ll also lead on capacity planning, performance optimisation, and cost efficiency as the platform scales.

We’re looking for someone who brings strong technical leadership, communicates clearly (especially during incidents), and takes real ownership of problems through to resolution. You should be comfortable operating at scale, have deep experience with SLIs/SLOs, incident management, and observability tooling, and be at home working with Linux, databases, cloud platforms (ideally Azure), Kubernetes, and Infrastructure as Code. Just as importantly, you should enjoy tackling complex, imperfect systems — and turning them into something reliable, scalable, and well-understood.

Site Reliability Engineering Lead - London | London, UK employer: SGI

Join a forward-thinking company in London that values innovation and technical excellence, where you will lead a dynamic Site Reliability Engineering team. With a strong emphasis on employee growth, we offer mentorship opportunities, a collaborative work culture, and the chance to shape our reliability practices while working with cutting-edge technologies. Enjoy a supportive environment that encourages continuous improvement and celebrates achievements, making it an ideal place for those seeking meaningful and rewarding employment.

Contact Details:

SGI Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineering Lead role in London, UK

✨Tip Number 1

Network like a pro! Reach out to your connections in the SRE field and let them know you're on the lookout for opportunities. Attend meetups or tech events in London to meet potential employers and fellow SREs. You never know who might have the inside scoop on a job opening!

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving incident management, SLIs/SLOs, and automation. This will give potential employers a taste of your hands-on experience and problem-solving abilities.

✨Tip Number 3

Prepare for technical interviews by brushing up on your coding skills and understanding of production systems. Practice common SRE scenarios, like incident response and performance optimisation, so you can demonstrate your expertise during interviews. We believe in you!

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets noticed. Plus, we love seeing candidates who are proactive about their job search. So, get your application in and let’s make some reliability magic happen together!

We think you need these skills to ace the Site Reliability Engineering Lead role in London, UK

Software Engineering

Production Code Proficiency

Application Behaviour Understanding

Team Leadership

Incident Management

SLI/SLO Definition

Error Budget Management

Hands-on Troubleshooting (PHP, Java/.NET)

Automation Skills

Observability Implementation

Reliability Engineering

Capacity Planning

Performance Optimisation

Cloud Platforms (Azure)

Kubernetes

Infrastructure as Code

Some tips for your application 🫡

Show Your Technical Skills: Make sure to highlight your software engineering background in your application. We want to see your experience with production code and how you've tackled reliability issues in the past.

Be Clear About Your Leadership Style: Since this role involves leading and mentoring, share examples of how you've shaped team culture and improved incident response. We love seeing candidates who can communicate effectively and foster a blameless environment.

Demonstrate Your Problem-Solving Abilities: We’re looking for someone who takes ownership of problems. In your application, include specific instances where you’ve resolved complex issues or improved system performance. Show us how you think!

Apply Through Our Website: Don’t forget to submit your application through our website! It’s the best way for us to keep track of your application and ensure it gets the attention it deserves.

How to prepare for a job interview at SGI

✨Know Your Tech Inside Out

Make sure you’re well-versed in the technologies mentioned in the job description, such as PHP, Java/.NET, and cloud platforms like Azure. Brush up on your coding skills and be ready to discuss how you've used these technologies to improve system reliability in past roles.

✨Showcase Your Leadership Skills

Prepare examples of how you've led teams or projects in the past. Highlight your experience in mentoring others and driving a blameless culture during incidents. This role is about shaping the SRE team, so demonstrate your ability to inspire and guide others.

✨Understand SLIs and SLOs

Be ready to discuss your experience with defining and using SLIs and SLOs. Think of specific instances where you’ve used error budgets to make decisions or improve performance. This shows you understand the metrics that matter in an SRE role.

✨Prepare for Incident Management Scenarios

Expect questions around incident management and how you handle on-call situations. Prepare to share your approach to running blameless postmortems and how you’ve improved incident response times in previous roles. This will show your readiness to take ownership of production systems.
