Site Reliability Engineer in Nottingham

Nottingham Full-Time £60,000 - £80,000 / year (est.) No working from home possible
Randstad Digital UK

At a Glance

  • Tasks: Design and scale observability systems to keep global services online.
  • Company: Join a forward-thinking tech company with a focus on innovation.
  • Benefits: Remote work, competitive salary, and opportunities for professional growth.
  • Other info: Be part of an autonomous team with excellent career advancement opportunities.
  • Why this job: Make a real impact by solving complex data challenges in a dynamic team.
  • Qualifications: 5+ years in distributed systems and experience with programming languages like Go or Python.

The predicted salary is between £60,000 and £80,000 per year.

About the Role

We are looking for a Lead SRE to design, scale, and operate massive-scale observability systems that keep our global services online and performant. You will join an autonomous team of software engineers focused on solving complex data infrastructure challenges.

Key Responsibilities

  • Scale Prometheus metrics infrastructure to handle 100+ million active series.
  • Operate large Elasticsearch clusters holding 2,000+ TB of data.
  • Grow high-throughput Kafka data pipelines processing hundreds of thousands of events per second.
  • Build custom alerting workflows and self-service APIs for internal engineering teams.
  • Provision cloud and private infrastructure using Terraform.

Requirements

  • 5+ years operating mid-to-large distributed systems on Linux VMs or bare-metal machines.
  • 2+ years developing in Go, Python, Ruby, Scala, or Bash.
  • Hands-on experience with Prometheus/Thanos/Cortex, Kafka, the ELK stack, Ansible, or Consul.
  • Comfortable diving into unfamiliar codebases and participating in an on-call rotation.

Keywords: Observability, Monitoring, SRE, Site Reliability Engineering, DevOps, ElasticSearch, ELK, Prometheus, Kafka, Terraform, Linux, Bare Metal

Site Reliability Engineer in Nottingham employer: Randstad Digital UK

As a Lead Site Reliability Engineer at our company, you will be part of a dynamic and innovative team dedicated to maintaining the performance and reliability of our global services. We offer a flexible remote work environment that fosters collaboration and creativity, alongside opportunities for professional growth through challenging projects and cutting-edge technologies. Join us to make a meaningful impact while enjoying a supportive culture that values your contributions and encourages continuous learning.

Contact Details:

Randstad Digital UK Recruitment Team

We think you need these skills to ace Site Reliability Engineer in Nottingham

Prometheus
Elasticsearch
Kafka
Terraform
Linux
Go
Python