Site Reliability Engineer in Manchester

Manchester | Full-Time | 60000 - 80000 € / year (est.) | No home office possible
Free-Work UK

At a Glance

  • Tasks: Support and enhance critical data platforms, ensuring reliability and operational excellence.
  • Company: Join a high-performing team in a dynamic tech environment.
  • Benefits: Competitive salary, hands-on experience, and opportunities for professional growth.
  • Other info: Work onsite five days a week in a secure environment.
  • Why this job: Make a real impact on platform reliability and automation in fast-paced settings.
  • Qualifications: Experience with Kubernetes, ELK stack, and strong problem-solving skills required.

The predicted salary is between 60000 and 80000 € per year.

We are seeking an experienced and motivated Site Reliability Engineer (SRE) to join a high‑performing team supporting multiple data product and platform groups. This role is focused on improving the reliability, scalability, observability, deployment, and operational support of critical data‑driven platforms and services operating within complex production environments.

The successful candidate will work closely with engineering, platform, and operational support teams to strengthen monitoring and alerting capabilities, improve logging and traceability, troubleshoot incidents, support deployments, and automate operational processes wherever possible. The environment includes Kubernetes, Helm, the ELK stack, and a broad range of modern Site Reliability Engineering and cloud platform practices.

This is a hands‑on technical role suited to someone who thrives in fast‑paced operational environments, enjoys solving complex production issues, and is passionate about automation, platform reliability, and continuous improvement. The role requires strong collaboration with both client stakeholders and engineering teams to ensure operational excellence, platform resilience, and service availability across critical systems.

Key Responsibilities

  • Support, maintain, and improve highly available production platforms and services across cloud and containerised environments.
  • Manage and support Kubernetes clusters and Helm‑based deployments across multiple environments.
  • Enhance monitoring, alerting, logging, and observability solutions to improve operational visibility and system reliability.
  • Investigate incidents, analyse logs, identify root causes, and drive timely resolution of production issues.
  • Participate in incident response, post‑incident reviews, and continuous service improvement activities.
  • Automate operational tasks and repetitive support activities to reduce manual effort and improve platform efficiency.
  • Collaborate with engineering and data platform teams to improve scalability, resilience, deployment reliability, and operational maturity.
  • Develop and maintain operational documentation, support procedures, runbooks, and troubleshooting guides.
  • Contribute to reliability engineering initiatives including proactive monitoring, service health management, operational readiness, and platform optimisation.
  • Support deployments, release processes, and production change management.

Required Qualifications To Be Successful In This Role

  • Strong commercial experience in Site Reliability Engineering, DevOps, Platform Engineering, or Production Support environments.
  • Strong hands‑on experience with Kubernetes and Helm within enterprise or production environments.
  • Proven experience supporting mission‑critical production platforms and operational support environments.
  • Strong experience with the ELK stack (Elasticsearch, Logstash, Kibana) for logging, monitoring, troubleshooting, and operational analysis.
  • Demonstrated capability in log analysis, incident investigation, troubleshooting, and root cause analysis.
  • Strong understanding and practical experience with core SRE practices.
  • Experience working with data platforms, analytics platforms, or data product teams would be highly advantageous.
  • Experience with scripting and automation technologies such as Bash, Python, or similar would be beneficial.
  • Exposure to CI/CD pipelines, Infrastructure as Code, cloud‑native platforms, or observability tooling would be desirable.
  • Strong communication, stakeholder engagement, and collaboration skills.
  • Ability to work effectively within fast‑paced operational support environments while managing competing priorities and deadlines.

Security Clearance

Candidates must be willing and able to work onsite at the client location five days per week and must already hold current HLC clearance (mandatory requirement). Previous experience working within secure, government, defence, or highly regulated environments will be highly regarded. Due to client security requirements, only candidates who meet the clearance criteria will be considered.

Site Reliability Engineer in Manchester employer: Free-Work UK

Join a dynamic and innovative team as a Site Reliability Engineer, where you will play a crucial role in enhancing the reliability and performance of critical data-driven platforms. Our company fosters a collaborative work culture that prioritises continuous improvement and employee growth, offering opportunities to work with cutting-edge technologies like Kubernetes and the ELK stack. Located in a secure environment, we provide a supportive atmosphere for professionals who thrive in fast-paced settings and are passionate about operational excellence.

Contact Details:

Free-Work UK Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role in Manchester

Network Like a Pro

Get out there and connect with folks in the industry! Attend meetups, webinars, or even local tech events. We all know that sometimes it’s not just what you know, but who you know that can help you land that SRE role.

Show Off Your Skills

Don’t just talk about your experience; demonstrate it! Create a portfolio showcasing your projects, especially those involving Kubernetes, Helm, or the ELK stack. We want to see how you’ve tackled real-world problems and improved platform reliability.

Ace the Interview

Prepare for technical interviews by brushing up on your SRE practices and incident response strategies. We recommend practising common scenarios you might face in production environments. Remember, confidence is key!

Apply Through Our Website

When you find a job that excites you, apply through our website! It’s the best way to ensure your application gets the attention it deserves. Plus, we love seeing passionate candidates who are eager to join our team.

We think you need these skills to ace the Site Reliability Engineer role in Manchester

Site Reliability Engineering
Kubernetes
Helm
ELK stack
Logging
Monitoring
Troubleshooting

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience with Kubernetes, Helm, and the ELK stack. We want to see how your skills align with our needs, so don’t be shy about showcasing your hands-on experience in production environments!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about Site Reliability Engineering and how you can contribute to our team. We love seeing candidates who are excited about automation and continuous improvement.

Showcase Your Problem-Solving Skills: In your application, share specific examples of how you’ve tackled complex production issues. We’re looking for someone who thrives in fast-paced environments, so let us know how you’ve made an impact in previous roles!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on joining our awesome team!

How to prepare for a job interview at Free-Work UK

Know Your Tech Inside Out

Make sure you’re well-versed in Kubernetes, Helm, and the ELK stack. Brush up on your hands-on experience and be ready to discuss specific scenarios where you've improved reliability or resolved incidents. The more examples you can share, the better!

Showcase Your Problem-Solving Skills

Prepare to talk about complex production issues you've tackled in the past. Think of a few incidents where you identified root causes and drove resolutions. This will demonstrate your analytical skills and ability to thrive in fast-paced environments.

Emphasise Collaboration

Since this role involves working closely with engineering and platform teams, be ready to discuss how you’ve collaborated in previous roles. Share examples of how you’ve engaged stakeholders and contributed to team success, especially in operational support settings.

Highlight Your Automation Experience

Automation is key in SRE roles, so be prepared to discuss any scripting or automation technologies you’ve used, like Bash or Python. Talk about how you’ve automated operational tasks to improve efficiency and reduce manual effort in your previous positions.