Senior Specialist Engineer (SRE) - UKHSA - SEO in Manchester

Manchester, Full-Time, £36,000 - £60,000 / year (est.), Home office (partial)

At a Glance

  • Tasks: Join our team to enhance public health services through innovative site reliability engineering.
  • Company: UK Health Security Agency, a leader in health tech and research.
  • Benefits: Hybrid working model, competitive salary, and opportunities for professional growth.
  • Why this job: Make a real difference in public health while working with cutting-edge technology.
  • Qualifications: Experience in SRE or DevOps, strong programming skills, and a collaborative mindset.
  • Other info: Dynamic work environment with excellent career advancement opportunities.

The predicted salary is between £36,000 and £60,000 per year.

We are seeking a highly motivated and experienced Site Reliability Engineer to join our High Performance Computing, Site Reliability Engineering, Artificial Intelligence (HPC/SRE/AI) & research computing unit at UK Health Security Agency (UKHSA). The role will be based in Manchester Digital and will operate as a hybrid position across UKHSA headquarters (Birmingham, Leeds, Liverpool, London), with a minimum of 60% of contractual hours on site.

Location & Working Arrangement

Hybrid working model: a minimum of 60% of contractual hours (≈3 days a week, pro rata) at one of UKHSA's core HQs (Birmingham, Leeds, Liverpool, London). The offices are modern and refurbished, with excellent transport links and public space for collaborating with other government departments, including DHSC.

About The Job

The Digital and Data Directorate provides scientific and research computing services. The Digital Development and Operations unit delivers platforms and technical capabilities to enable public health services within the organisation and with clients and stakeholders.

Key Responsibilities

  • Remediate infrastructure and operational problems.
  • Leverage automation and CI/CD to ensure reliable, scalable, and high‑performance services.
  • Monitor and manage cloud infrastructure services, and observe systems to prioritise operational and performance improvements that meet or exceed SLOs.
  • Architect, develop & manage multi‑cloud HPC platforms and on‑premises infrastructure.
  • Ensure services are highly available, scalable, and resilient.
  • Manage performance, capacity planning, and support UKHSA's AI requirements.

Incident Response & Troubleshooting

  • Respond swiftly to production incidents, minimising downtime and restoring service rapidly.
  • Perform root cause analysis and post‑mortems to implement lessons learned.

Monitoring, Alerting & Observability

  • Design and implement effective monitoring and alerting systems using Prometheus, Grafana, etc.
  • Improve observability to identify issues before they impact users.
  • Continuously refine practices to reduce alert fatigue.

Automation & Tooling

  • Develop automation to eliminate manual repetitive tasks and improve efficiency.
  • Write clean, maintainable, well‑tested code for automation and tooling.
  • Drive initiatives to reduce operational toil via Infrastructure as Code.

Service Level Objectives & Operational Improvements

  • Define, track, and improve SLOs, SLIs, and error budgets.
  • Prioritise improvements that align with business goals & user experience.

SRE Best Practices & Advocacy

  • Evangelise SRE principles across the organisation.
  • Integrate reliability practices into the development lifecycle.

Collaboration & Knowledge Sharing

  • Collaborate with software engineering, DevOps, and infrastructure teams.
  • Promote a culture of shared responsibility for service reliability.

Documentation & Training

  • Maintain accurate technical documents, runbooks, and post‑incident reports.
  • Provide training and mentorship on best practices and tools.

Essential Criteria

  • Experience as a Site Reliability Engineer, DevOps Engineer, Operations Engineer or similar.
  • Programming/scripting skills in Python, PowerShell, Bash.
  • Understanding of Linux/Unix, Windows, networking, distributed systems.
  • Experience with observability tools (Prometheus, Grafana, Datadog) and alerting systems.
  • Infrastructure automation skills (Terraform, Ansible, Helm).
  • Excellent communication and collaboration skills.
  • Experience with security best practices.
  • Strong problem‑solving skills and ability to respond to sudden demands.

Desirable Criteria

  • CI/CD pipelines, cloud platforms (AWS, GCP, Azure), and Kubernetes experience.
  • Post‑incident review experience.
  • Driving SRE practice adoption across an organisation.
  • Delivering training or mentoring of junior engineers.

Seniority Level: Mid‑Senior level

Employment Type: Full‑time

Job Function: Engineering and Information Technology

Industries: Technology, Information and Internet

Employer: Manchester Digital

At UK Health Security Agency (UKHSA), we pride ourselves on being an exceptional employer, offering a dynamic work environment that fosters innovation and collaboration. Our hybrid working model allows for flexibility while ensuring a strong presence in our modern offices across key UK locations, promoting a culture of shared responsibility and continuous learning. With ample opportunities for professional growth and a commitment to employee well-being, UKHSA is dedicated to empowering our team to make a meaningful impact in public health through cutting-edge technology and research.

Contact Details:

Manchester Digital Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Specialist Engineer (SRE) - UKHSA - SEO role in Manchester

✨Tip Number 1

Network like a pro! Attend industry meetups, conferences, or even local tech events. It's all about making connections and getting your name out there. You never know who might have the inside scoop on job openings!

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to SRE, automation, or cloud platforms. This gives potential employers a taste of what you can do beyond just a CV.

✨Tip Number 3

Prepare for interviews by brushing up on common SRE scenarios and problem-solving questions. Practice makes perfect, so consider mock interviews with friends or mentors to build your confidence.

✨Tip Number 4

Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining us at StudySmarter. Tailor your application to highlight how your experience aligns with the role and our mission.

We think you need these skills to ace the Senior Specialist Engineer (SRE) - UKHSA - SEO role in Manchester

Site Reliability Engineering
High Performance Computing
Cloud Infrastructure Management
Automation
CI/CD
Monitoring and Observability
Incident Response
Root Cause Analysis
Python
PowerShell
Bash
Linux/Unix
Terraform
Ansible
Kubernetes

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Senior Specialist Engineer role. Highlight your experience with SRE practices, cloud platforms, and automation tools. We want to see how your skills align with what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about site reliability engineering and how you can contribute to our team at UKHSA. Keep it concise but impactful!

Showcase Your Problem-Solving Skills: In your application, don’t forget to mention specific examples of how you've tackled operational challenges in the past. We love seeing candidates who can think on their feet and respond to incidents effectively.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you're keen on joining our team!

How to prepare for a job interview at Manchester Digital

✨Know Your Tech Inside Out

Make sure you brush up on your technical skills, especially in Python, PowerShell, and Bash. Be ready to discuss your experience with observability tools like Prometheus and Grafana, as well as your knowledge of cloud platforms and infrastructure automation.

✨Showcase Your Problem-Solving Skills

Prepare to share specific examples of how you've tackled operational problems in the past. Think about incidents you've responded to and how you performed root cause analysis. This will demonstrate your ability to handle sudden demands effectively.

✨Emphasise Collaboration

Since this role involves working closely with various teams, be ready to talk about your collaboration experiences. Highlight any instances where you've promoted a culture of shared responsibility for service reliability or mentored junior engineers.

✨Understand SRE Best Practices

Familiarise yourself with SRE principles and be prepared to discuss how you've integrated these practices into your previous roles. Showing that you can advocate for reliability across an organisation will set you apart from other candidates.
