Senior Site Reliability Engineer (SRE) in Sheffield


Sheffield | Full-Time | Up to £90,000 per year | Hybrid (3 days each week in the office)


At a Glance

Tasks: Design and implement best SRE practices to enhance system reliability and performance.
Company: Join a forward-thinking tech company with a focus on innovation and collaboration.
Benefits: Enjoy 25 days annual leave, a generous pension scheme, and state-of-the-art offices.
Other info: Hybrid working model with excellent career development opportunities.
Why this job: Make a real impact by optimising cloud infrastructure and ensuring system security.
Qualifications: Experience in SRE/DevOps, strong scripting skills, and knowledge of cloud platforms required.


Department: Technology

Location: Sheffield, London, Talbot Green or Yeovil

Working Pattern: Hybrid, includes 3 days each week in the office

Contract Type: Full time, permanent

Salary: Up to £90,000 per annum

Role Overview

As a Senior SRE, you will be pivotal in designing and implementing SRE best practices while fostering a culture of continuous improvement and optimization. You will collaborate closely with development and operations teams to improve platform stability and performance, ensuring that our systems are reliable, secure, and scalable.

Key Responsibilities

Infrastructure Management: Manage and scale cloud-based infrastructure (e.g., AWS, Azure, GCP). Apply Infrastructure as Code (IaC) principles for provisioning and configuration management.
Security And Compliance: Collaborate with the security team to implement best practices for system and data security. Ensure systems comply with relevant industry standards and regulations.
Monitoring And Performance: Set up and maintain monitoring and alerting systems for early issue detection and resolution. Continuously optimize system performance and resource usage.
Documentation: Create and maintain thorough documentation for SRE/platform processes, tools, and practices. Exposure to Jira or an equivalent tool would be beneficial.

What will you need to succeed?

Experience:

Proven experience in an SRE/DevOps/Platform role, with a strong background in software development, operations, or both.
Knowledge of CI/CD tools (e.g., Jenkins, GitLab CI/CD, Travis CI).
Proficiency in scripting and automation (e.g., Bash, Python, Ansible).
Strong experience with containerization and orchestration technologies (e.g., Docker, Kubernetes).
Strong hands-on experience with at least one major public cloud platform (e.g., AWS, Azure, GCP).
Strong problem-solving and troubleshooting abilities in time-bound situations (e.g., major incidents).
Clear communication and incident management experience.
Demonstrable strong hands-on experience with Terraform.
Knowledge of microservices architecture.
Familiarity with security best practices and tools.
Demonstrable experience with monitoring/observability tools (preferred: Grafana, Prometheus, PagerDuty, uptime).

Knowledge:

Cloud Platforms: Strong knowledge of AWS, Azure, or GCP, including cloud architecture, services, and security models.
Containerization & Orchestration: In-depth understanding of Docker and Kubernetes for deploying and managing containerized applications.
Infrastructure as Code (IaC): Knowledge of IaC frameworks, particularly Terraform, to manage cloud infrastructure via code.
Microservices Architecture: Familiarity with microservices design patterns and deployment strategies in a cloud-native environment.
Monitoring & Observability: Understanding of monitoring, logging, and alerting tools (e.g., Prometheus, Grafana, ELK) to ensure system performance and issue tracking.

Skills:

CI/CD Tools: Hands-on experience with Jenkins, GitLab CI/CD, Travis CI, or similar tools for building CI/CD pipelines.
Scripting & Automation: Proficiency in scripting languages like Bash and Python, along with automation tools such as Ansible for managing configurations and deployments.
Containerization & Orchestration: Practical skills in deploying and managing containers using Docker and orchestrating workloads using Kubernetes.
Cloud Platform Management: Expertise in managing and scaling cloud environments on AWS, Azure, or GCP, leveraging services for compute, storage, networking, and security.
Infrastructure as Code (IaC): Skilled in using Terraform to automate provisioning and management of cloud infrastructure.
Troubleshooting & Problem Solving: Strong analytical skills for identifying and resolving complex system issues, especially in production environments.
Collaboration & Communication: Excellent ability to work under pressure, e.g., in a major incident.

Qualifications:

Certifications (Preferred): Holding certifications such as AWS Certified DevOps Engineer, CKA (Certified Kubernetes Administrator), or other relevant credentials.

What do you get in return?

25 days annual leave rising to 30
5% pension after probation
State-of-the-art offices
Access to a range of benefits via My Benefits World
Free eye care cover
Life Assurance
Cycle to Work Scheme
Employee Assistance Programme (EAP)
Quarterly Team Socials
Access to an extensive Learning and Development menu

Senior Site Reliability Engineer (SRE) in Sheffield employer: hackajob

Join a forward-thinking technology company that prioritises employee well-being and professional growth. With a hybrid working model, state-of-the-art offices, and a commitment to continuous improvement, we offer an environment where your expertise as a Senior Site Reliability Engineer will thrive. Enjoy generous benefits including 25 days of annual leave, a robust pension scheme, and access to extensive learning opportunities, all while being part of a collaborative and innovative culture in vibrant locations like Sheffield, London, Talbot Green, or Yeovil.

Contact Details:

hackajob Recruitment Team


We think you need these skills to ace Senior Site Reliability Engineer (SRE) in Sheffield

Cloud Infrastructure Management

AWS

Azure

GCP

Infrastructure as Code (IaC)

Terraform

CI/CD Tools

Jenkins

GitLab CI/CD

Travis CI

Scripting and Automation

Bash

Python

Containerization

Docker

Kubernetes

Monitoring and Observability

Prometheus

Grafana

Incident Management

Problem-Solving Skills

Collaboration Skills

Security Best Practices
