Site Reliability Engineering (SRE) Manager in Northgate

Northgate | Full-Time | £120,000 / year (est.) | Home office possible
Halian Technology Limited

At a Glance

  • Tasks: Lead incident management and ensure system reliability in a hands-on SRE role.
  • Company: Join a forward-thinking tech company focused on true SRE principles.
  • Benefits: Competitive salary up to £120,000 and fully remote work within the UK.
  • Other info: Opportunity to solve real SRE challenges and enhance your technical leadership.
  • Why this job: Make a real impact on system reliability and lead a small engineering team.
  • Qualifications: Strong AWS and Linux skills, with experience in incident management and team leadership.

The predicted salary is £120,000 per year.

Senior Site Reliability Engineer (SRE) | UK Remote | Permanent | Up to £120,000 | Fully Remote (UK Only)

This is NOT a DevOps role. Real SRE work only.

We are looking for a true Senior Site Reliability Engineer with deep incident management experience, strong operational ownership, and expert Linux/AWS troubleshooting skills.

This role is focused entirely on reliability, availability, incident response, and systems engineering, not building CI/CD pipelines or acting as DevOps by another name.

Leadership Requirement

Small Team Technical Lead

You must have experience leading a small engineering team (2-5 people), defining technical direction, improving on-call processes, and owning reliability strategy. This is a hands-on role with real SRE leadership, not people management.

About the Role

As a Senior SRE, you will own the reliability, resilience, and operational health of large-scale AWS/Linux systems. You will join an engineering organisation where SRE principles are fully embedded, respected, and treated as a distinct discipline.

Key Responsibilities
  • Lead major incidents, mitigation, RCA, and preventative improvements
  • Own and refine SLIs, SLOs, and error budgets
  • Reduce operational toil through automation
  • Deep-dive Linux debugging, performance tuning, and systems analysis
  • Strengthen observability, monitoring, and alerting
  • Provide technical leadership to a small SRE/engineering group
  • Improve and manage on-call processes (PagerDuty, OpsGenie, etc.)
  • Collaborate with development teams to build reliability into system design

What You'll Bring
  • Strong AWS experience (EC2, networking, autoscaling, IAM, load balancing)
  • Deep Linux troubleshooting skills (performance, networking, debugging)
  • Real 24/7 production on-call experience
  • Hands-on incident management and postmortems
  • Experience mentoring or leading a small technical team
  • Scripting/automation with Python, Go, or Bash
  • Strong observability skills (Datadog, Prometheus, Grafana, CloudWatch)

Why This Role Appeals to Real SREs

You will be solving actual SRE problems: reliability, incidents, resilience, uptime. You will guide a small team through complex engineering challenges.

Employer: Halian Technology Limited

As a Senior Site Reliability Engineering (SRE) Manager, you will thrive in a fully remote environment that champions innovation and technical excellence. Our company fosters a collaborative work culture where your expertise in incident management and operational ownership is valued, offering ample opportunities for professional growth and development. Join us to lead a dedicated team in tackling real SRE challenges while enjoying the flexibility of remote work across the UK.

Contact Detail:

Halian Technology Limited Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineering (SRE) Manager role in Northgate

✨Tip Number 1

Network like a pro! Join online forums, attend meetups, or connect with fellow SREs on LinkedIn. The more people you know in the industry, the better your chances of landing that dream role.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your incident management experience and any automation projects you've worked on. This will give potential employers a taste of what you can bring to their team.

✨Tip Number 3

Prepare for technical interviews by brushing up on your Linux and AWS troubleshooting skills. Practice common scenarios and be ready to discuss how you've handled incidents in the past. Confidence is key!

✨Tip Number 4

Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Plus, it gives you a chance to showcase your enthusiasm for real SRE work.

We think you need these skills to ace the Site Reliability Engineering (SRE) Manager role in Northgate

Incident Management
Operational Ownership
Linux Troubleshooting
AWS Expertise
Systems Engineering
Technical Leadership
SLIs, SLOs, and Error Budgets
Automation
Performance Tuning
Observability
Monitoring and Alerting
On-call Process Management
Scripting with Python, Go, or Bash
Collaboration with Development Teams

Some tips for your application 🫡

Show Your SRE Skills: Make sure to highlight your deep incident management experience and strong operational ownership in your application. We want to see how you've tackled real SRE challenges, so don’t hold back on those examples!

Be Specific About Your Experience: When detailing your AWS and Linux skills, be specific! Mention the tools and technologies you've used, like EC2 or Datadog. This helps us understand your hands-on experience and how you can contribute to our team.

Leadership Matters: Since this role involves leading a small team, share your leadership experiences. Talk about how you've defined technical direction or improved on-call processes. We’re looking for someone who can guide others through complex challenges.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for this exciting opportunity. Let’s get started on this journey together!

How to prepare for a job interview at Halian Technology Limited

✨Know Your SRE Fundamentals

Make sure you brush up on your SRE principles and practices. Understand the key concepts like SLIs, SLOs, and error budgets, as these will likely come up in conversation. Being able to discuss how you've applied these in real-world scenarios will show your depth of knowledge.
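
As a quick refresher, the arithmetic behind an error budget can be sketched in a few lines of Python. This is only an illustration and is not taken from the employer: the function names, the 30-day window, and the 99.9% target are all assumptions chosen for the example.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Downtime (in minutes) an availability SLO allows over the window.

    slo_target is a fraction, e.g. 0.999 for "three nines" (assumed example).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * total_minutes


def budget_remaining(slo_target: float, good_events: int, total_events: int) -> float:
    """Fraction of the error budget still unspent (negative if overspent).

    The SLI here is request-based: good_events / total_events.
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_bad = (1.0 - slo_target) * total_events
    actual_bad = total_events - good_events
    return 1.0 - actual_bad / allowed_bad
```

With these assumed numbers, a 99.9% target over 30 days allows about 43.2 minutes of downtime, and 500 failed requests out of 1,000,000 would use roughly half of the budget. Being able to walk through this kind of calculation, and explain what you would do as the budget runs low, is a good way to show you have worked with SLOs in practice.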

✨Showcase Your Incident Management Skills

Prepare to share specific examples of major incidents you've led or been a part of. Discuss your approach to mitigation, root cause analysis, and any preventative measures you've implemented. This will demonstrate your hands-on experience and ability to handle pressure.

✨Demonstrate Technical Leadership

Since this role involves leading a small team, be ready to talk about your leadership style and experiences. Highlight how you've defined technical direction, improved on-call processes, and mentored team members. This will help convey that you're not just a techie but also a capable leader.

✨Familiarise Yourself with Tools and Technologies

Make sure you're well-versed in the tools mentioned in the job description, like AWS services, Linux troubleshooting, and observability platforms like Datadog or Grafana. Being able to discuss your experience with these tools will show that you're prepared for the technical demands of the role.
