Senior Site Reliability Engineer (SRE)

London, Full-Time, No home office possible

Role: Senior Site Reliability Engineer (SRE)

Location: London (full onsite – 5 days every week)

Salary: Up to 80K gross annually

Experience: Minimum of 12 years required

Experience with monitoring tools such as Datadog, Splunk, Dynatrace, Grafana, Prometheus, ThousandEyes, Gremlin, etc.
Ability to create dashboards for Infrastructure, Application Performance Monitoring (APM), and End-to-End workflows
Monitoring, logging, alerting, and error budgeting against availability targets (e.g., 99.9%, 99.99%, 99.999%) for software, operations, and business
Define Service Level Objectives (SLO), Service Level Indicators (SLI), and Service Level Agreements (SLA) with business, operations, and engineering teams
Automation and auto-healing skills using Python, shell scripting, JavaScript, etc., including developing custom monitoring services
Experience with logging, monitoring, and event detection on cloud or distributed platforms
ITIL practices including incident management, change management, problem management, blameless postmortems, documentation, and lessons learned
Technical operations support focusing on stability, reliability, and resiliency

Contact Details:

TN United Kingdom Recruiting Team