At a Glance
- Tasks: Design and scale observability systems to keep global services online.
- Company: Join a forward-thinking tech company with a focus on innovation.
- Benefits: Remote work, competitive salary, and opportunities for professional growth.
- Other info: Be part of an autonomous team with excellent career advancement opportunities.
- Why this job: Make a real impact by solving complex data challenges in a dynamic team.
- Qualifications: 5+ years in distributed systems and experience with programming languages like Go or Python.
The predicted salary is between £60,000 and £80,000 per year.
About the Role
We are looking for a Lead SRE to design, scale, and operate very large observability systems that keep our global services online and performant. You will join an autonomous team of software engineers focused on solving complex data infrastructure challenges.
Key Responsibilities
- Scale Prometheus metrics infrastructure to handle 100+ million active series.
- Operate large Elasticsearch clusters holding 2,000+ TB of data.
- Grow high-throughput Kafka data pipelines processing hundreds of thousands of events per second.
- Build custom alerting workflows and self-service APIs for internal engineering teams.
- Provision cloud and private infrastructure using Terraform.
Requirements
- 5+ years operating mid-to-large distributed systems on Linux VMs or bare-metal machines.
- 2+ years developing in Go, Python, Ruby, Scala, or Bash.
- Hands-on experience with Prometheus/Thanos/Cortex, Kafka, the ELK stack, Ansible, or Consul.
- Comfortable diving into unfamiliar codebases and participating in an on-call rotation.
Keywords: Observability, Monitoring, SRE, Site Reliability Engineering, DevOps, Elasticsearch, ELK, Prometheus, Kafka, Terraform, Linux, Bare Metal
Site Reliability Engineer in Nottingham
Employer: Randstad Digital UK
As a Lead Site Reliability Engineer at our company, you will be part of a dynamic and innovative team dedicated to maintaining the performance and reliability of our global services. We offer a flexible remote work environment that fosters collaboration and creativity, alongside opportunities for professional growth through challenging projects and cutting-edge technologies. Join us to make a meaningful impact while enjoying a supportive culture that values your contributions and encourages continuous learning.