Site Reliability Engineer | London City, Hybrid

SGI

London | Full-Time | £36,000 - £60,000 / year (est.) | No working from home possible

At a Glance

Tasks: Shape and drive reliable, observable, and secure systems on AWS.
Company: Leading provider in iGaming and predictive analytics with a collaborative culture.
Benefits: Highly competitive salary and benefits package.
Other info: Opportunity for career growth and cross-functional collaboration.
Why this job: Make a real impact on cutting-edge technology in a dynamic environment.
Qualifications: 3+ years in SRE, strong Kubernetes and scripting skills.

The predicted salary is between £36,000 and £60,000 per year.

Our Client is a premier provider of high-volume software solutions for the global iGaming and predictive analytics sector. With a footprint spanning the USA, UK, and Europe, they partner with industry leaders to engineer sophisticated platforms for sports wagering, prize-based systems, and complex market simulation environments. Their vision is to lead the evolution of interactive technology through intelligent, data-driven architecture that ensures seamless user experiences. The firm is driven by a culture of teamwork, transparency, and technical excellence.

The role involves helping shape and drive how the firm builds and operates reliable, observable, secure, and cost-efficient systems on AWS. Working closely with development, platform, and incident management teams, you will define reliability in measurable terms and build the tooling and processes to achieve it, improving platform speed, stability, and scalability.

Key responsibilities:

Partner with engineering teams to define, measure, and manage SLOs/SLIs, using error budgets to guide delivery decisions.
Enhance observability across services (metrics, logs, traces) to detect and resolve issues proactively.
Lead cost optimisation: monitor spend, right‑size workloads, tune autoscaling, and improve infrastructure efficiency.
Improve production readiness via pre‑deployment checks, post‑release validation, and robust platform guardrails.
Introduce and run chaos engineering experiments to strengthen resilience and recovery.
Automate operational processes to reduce manual intervention and toil across the stack.
Support major incident response, root‑cause analysis, and continual improvement actions.
Collaborate cross‑functionally to raise standards for stability, security, performance, and compliance.

Required skills & experience:

3+ years’ experience in SRE, Platform, or DevOps roles within production environments.
Strong Kubernetes operational experience (on‑prem and AWS EKS).
Hands‑on experience defining and operating SLOs/SLIs, alerting, and incident workflows.
Deep understanding of observability and telemetry (monitoring, logging, tracing).
Infrastructure as Code with Terraform; experience with GitOps workflows and CI/CD.
Scripting proficiency in Python, Bash, or Go.
Proven ability to balance cost efficiency with reliability and performance.
Excellent communication skills and the ability to work effectively across multiple teams.

Strongly desirable for this role:

Experience running chaos engineering experiments.
Exposure to high‑throughput, low‑latency systems.
FinOps knowledge or cost management practices.
AWS certifications (e.g., Solutions Architect, DevOps Engineer).

Site Reliability Engineer | London City, Hybrid | Employer: SGI

Our Client is an exceptional employer, offering a dynamic work environment in the heart of London City, where innovation meets collaboration. With a strong emphasis on teamwork and technical excellence, employees benefit from a highly competitive salary and a comprehensive benefits package, alongside ample opportunities for professional growth in the rapidly evolving iGaming sector. The hybrid work model promotes a healthy work-life balance, making it an ideal place for those seeking meaningful and rewarding employment.

Contact Details:

SGI Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Site Reliability Engineer | London City, Hybrid role

✨Tip Number 1

Network like a pro! Attend industry meetups, conferences, or even local tech events. Chatting with folks in the field can lead to opportunities that aren’t even advertised yet.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to SRE, Kubernetes, or AWS. This gives potential employers a taste of what you can do.

✨Tip Number 3

Prepare for interviews by practising common SRE scenarios and technical questions. We recommend doing mock interviews with friends or using online platforms to get comfortable with the format.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about their job search!

We think you need these skills to ace the Site Reliability Engineer | London City, Hybrid role

Site Reliability Engineering (SRE)

Kubernetes

AWS EKS

SLOs/SLIs definition and operation

Observability and telemetry

Infrastructure as Code (IaC) with Terraform

GitOps workflows

CI/CD

Scripting in Python, Bash, or Go

Cost optimisation

Incident response and root-cause analysis

Communication skills

Chaos engineering

High-throughput, low-latency systems knowledge

AWS certifications

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experience mentioned in the job description. Highlight your SRE, Platform, or DevOps experience, especially with AWS and Kubernetes, to show us you're a great fit!

Showcase Your Projects: Include any relevant projects or experiences that demonstrate your ability to define and operate SLOs/SLIs or your hands-on experience with observability tools. We love seeing practical examples of your work!

Keep It Clear and Concise: When writing your application, be clear and to the point. Use bullet points for easy reading and make sure to highlight your key achievements. We appreciate straightforward communication!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates from our team!

How to prepare for a job interview at SGI

✨Know Your SLOs and SLIs

Make sure you understand the concepts of Service Level Objectives (SLOs) and Service Level Indicators (SLIs) inside out. Be ready to discuss how you've defined and managed these in your previous roles, as this will show your technical expertise and alignment with the company's goals.

✨Showcase Your Kubernetes Skills

Since strong Kubernetes operational experience is a must-have, prepare to talk about specific projects where you've used Kubernetes, especially in AWS EKS. Bring examples of how you've optimised workloads or improved system reliability using Kubernetes.

✨Demonstrate Your Observability Knowledge

Be prepared to discuss your understanding of observability and telemetry. Share experiences where you've implemented monitoring, logging, or tracing solutions that helped detect and resolve issues proactively. This will highlight your ability to enhance platform stability.

✨Communicate Effectively

Given the collaborative nature of the role, practice articulating your thoughts clearly and concisely. Think of examples where you've worked cross-functionally to improve system performance or security, as this will showcase your teamwork skills and ability to communicate across teams.
