Senior Site Reliability Engineer (SRE)

Full-Time | £60,000 - £80,000 / year (est.) | Working from home possible
The Investigo Group

At a Glance

  • Tasks: Operate and enhance production Kubernetes platforms while driving automation and reliability.
  • Company: Join The Investigo Group, a leader in cutting-edge tech solutions.
  • Benefits: Enjoy competitive salary, private medical, generous holiday, and continuous learning opportunities.
  • Other info: Be part of a diverse team that values inclusion and innovation.
  • Why this job: Make a real impact on secure, innovative technology solutions in a dynamic environment.
  • Qualifications: Strong experience with Kubernetes, Linux, and cloud-native tooling required.

The predicted salary is between £60,000 and £80,000 per year.

Location: Remote - UK (occasional paid travel to TIG Secure site locations may be required)

Job Type: Full-time, Permanent (37.5 hours)

Salary: Competitive + benefits + package

Security Clearance Requirements

Holding a current Security Clearance is not essential at the time of application, but the successful candidate must be eligible for Security Check (SC) clearance. To meet this requirement, applicants must:

  • Have the right to work in the UK
  • Have lived in the UK continuously for the past 5 years
  • Not have spent more than 6 months outside the UK in total during that period
  • Be willing to undergo security vetting as part of the onboarding process

About Us

Come and be a part of The Investigo Group (TIG), a dynamic coalition of cutting-edge tech firms specialising in Platform, Software, Data, AI and other bleeding-edge technology solutions. Our innovative prowess spans the globe while proudly hailing from the United Kingdom.

The group is multi-functional with a large portfolio of B2B products and services. Our ecosystem is made up of:

  • Voixtel, secure communications and voice platforms for regulated and critical environments.
  • IIS, providing secure internet access in both the public and private sectors.
  • Vestigo Consulting, our training and consultancy company, tailored around specialist sector-specific knowledge.
  • Collaboraite, a bleeding-edge company that provides our Data and AI capability.

Diversity, Equity, and Inclusion (DEI) are at the heart of The Investigo Group (TIG). We're dedicated to creating a workplace where people from all backgrounds are not only welcome but empowered to excel.

About You

You’re an experienced SRE, Platform Engineer or Cloud Engineer with strong hands-on experience running Kubernetes in production environments. You’re comfortable working across Linux, Kubernetes, cloud-native tooling, automation, observability, CI/CD and infrastructure as code.

You enjoy treating infrastructure as a product, automating repeatable work, improving resilience, and building platforms that other engineers can rely on. You’re calm under pressure, methodical during incidents, and able to turn operational challenges into long-term improvements.

About the Role

We’re looking for a Senior Site Reliability Engineer (SRE) to help operate, harden and mature our production OKD / Kubernetes platforms. This is a hands-on engineering role focused on reliability, automation, observability, GitOps, CI/CD and secure platform operations.

Key Responsibilities

  • Operate, harden and extend production OpenShift / OKD / Kubernetes clusters across on-premises and hybrid environments.
  • Support the migration from VMware to KVM.
  • Own and improve CI/CD processes across the full lifecycle of platform and application components.
  • Work with platform and application engineers to support cloud-native delivery using tools such as Helm and Kustomize.
  • Develop and mature GitOps deployment practices using tools such as Argo CD or Flux.
  • Maintain and improve core platform services including identity, ingress, observability, certificate management, service mesh and container registry capabilities.
  • Automate repeatable operational tasks using tools such as Ansible, Terraform, Helm, Kustomize, Go, Python or equivalent technologies.
  • Lead incident response activity, support blameless post-mortems and drive systemic fixes.
  • Create and maintain clear technical documentation, runbooks, design notes and operational guidance.
  • Mentor other engineers and act as a senior technical authority across cloud and Kubernetes operations.

Success in This Role Looks Like

  • A more reliable, secure and measurable production Kubernetes estate.
  • Improved platform observability, with meaningful alerting, SLOs and trend data.
  • Progress against the VMware to KVM migration.
  • A mature GitOps approach covering platform and application components.
  • Improved CI/CD practices that help teams move at pace while considering security, QA and compliance.
  • Well-documented, supportable and scalable platform services.

Requirements

We’re looking for a Senior Site Reliability Engineer (SRE) with strong experience operating production Kubernetes environments. This role is well suited to someone who combines deep technical capability with strong operational discipline.

Essential Experience & Skills

  • Strong experience running production Kubernetes environments.
  • Strong Linux fundamentals.
  • Experience with at least one Kubernetes distribution.
  • Solid infrastructure as code experience.
  • GitOps and CI/CD experience managing full application and component lifecycles.
  • Experience with observability tooling such as Prometheus, Grafana, Elastic Stack / LGTM, OpenTelemetry or similar.
  • Experience working with identity and access technologies.
  • Strong troubleshooting, problem-solving and analytical skills.
  • Strong communication skills.

Benefits

  • Private Medical
  • Health Cash Plan
  • 4x Life Assurance
  • Generous holiday allowance
  • Access to continuous learning and development opportunities
  • Bonus potential based on performance and business-related factors
  • Discounts on a wide range of products and services
  • Pension scheme contributions
  • Regular Pay Reviews

Equal Opportunities

Here at TIG we are committed to equal opportunities and value diversity, equity and inclusion at our company.

Senior Site Reliability Engineer (SRE) employer: The Investigo Group

At The Investigo Group (TIG), we pride ourselves on being an exceptional employer, offering a dynamic and inclusive work culture that fosters innovation and collaboration. Our remote UK-based team enjoys competitive salaries, generous benefits, and continuous learning opportunities, all while contributing to cutting-edge technology solutions that make a real impact. Join us to be part of a forward-thinking environment where your expertise in Site Reliability Engineering will not only be valued but also help shape the future of secure, cloud-native platforms.

Contact Details:

The Investigo Group Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Senior Site Reliability Engineer (SRE) role

Tip Number 1

Network like a pro! Reach out to folks in your industry on LinkedIn or at tech meetups. A friendly chat can lead to opportunities that aren’t even advertised yet.

Tip Number 2

Show off your skills! Create a portfolio or GitHub repo showcasing your projects, especially those involving Kubernetes and cloud-native tools. This gives potential employers a taste of what you can do.

Tip Number 3

Prepare for interviews by practising common SRE scenarios. Think about how you’d handle incidents or improve platform reliability. We want to see your problem-solving skills in action!

Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love hearing from passionate candidates who are eager to join our innovative team.

We think you need these skills to ace the Senior Site Reliability Engineer (SRE) role

Kubernetes
OpenShift
Linux
Infrastructure as Code
Ansible
Terraform
GitOps

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Senior Site Reliability Engineer role. Highlight your hands-on experience with Kubernetes and any relevant projects you've worked on. We want to see how your skills align with what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about this role and how your experience makes you a perfect fit. Don't forget to mention your understanding of operational maturity and reliability in cloud environments.

Showcase Your Problem-Solving Skills: In your application, share examples of how you've tackled complex platform challenges in the past. We love candidates who can turn operational issues into long-term improvements, so let us know how you've done that!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you're keen to join our team at TIG!

How to prepare for a job interview at The Investigo Group

Know Your Kubernetes Inside Out

Make sure you brush up on your Kubernetes knowledge before the interview. Be ready to discuss your hands-on experience with production environments, including any specific distributions you've worked with like OpenShift or OKD. Prepare to share examples of how you've tackled challenges in these environments.

Showcase Your Automation Skills

Since this role emphasises automation, be prepared to talk about your experience with tools like Ansible, Terraform, and CI/CD practices. Have specific examples ready that demonstrate how you've automated operational tasks and improved efficiency in previous roles.

Demonstrate Problem-Solving Abilities

Expect questions that assess your troubleshooting skills. Think of scenarios where you've had to resolve complex issues under pressure. Highlight your methodical approach and how you turned operational challenges into long-term improvements.

Emphasise Collaboration and Mentorship

This position requires working closely with various teams, so be ready to discuss your collaborative experiences. Share instances where you've mentored others or influenced engineering practices, showcasing your ability to elevate the technical capabilities of your peers.