At a Glance
- Tasks: Ensure reliability and performance of our AWS-hosted Data Platform through automation and collaboration.
- Company: Join a dynamic team focused on cutting-edge cloud technology.
- Benefits: Competitive daily rate, remote work, and potential for contract extension.
- Other info: Opportunity for career growth in a supportive and collaborative environment.
- Why this job: Take on a high-impact role while working with innovative technologies.
- Qualifications: Experience as an SRE or DevOps Engineer in AWS, with strong automation skills.
Role: AWS Site Reliability Engineer (SRE) - Data Platform
Location: Remote (UK-based)
Contract Length: Initial 3 months (with potential for extension)
IR35 Status: Inside IR35
Clearance: SC Clearance preferred
About the Role
We are looking for an experienced AWS Site Reliability Engineer (SRE) to join our team on an initial 3-month contract. You will be embedded within a high-impact team dedicated to ensuring the reliability, scalability, and performance of our AWS-hosted Data Platform. If you live and breathe observability, love tearing down operational toil through automation, and know how to keep complex cloud ecosystems running smoothly, we want to hear from you.
Key Responsibilities
- Define and Operationalise Reliability: Establish, refine, and operationalise SLIs, SLOs, and error budgets for critical data services, mapping them to the four golden signals (latency, errors, traffic, and saturation).
- Observability Frameworks: Build and maintain comprehensive SLO dashboards and end-to-end monitoring (metrics, logs, and traces) using Dynatrace and Prometheus.
- Cloud and Container Management: Navigate the AWS ecosystem confidently, managing and optimising containerised workloads deployed on Amazon EKS (Kubernetes).
- Toil Reduction and Automation: Drive aggressive automation initiatives to eliminate repetitive operational tasks and streamline system efficiency.
- Collaboration and Resilience: Partner closely with developers and architects to improve architecture reliability, contribute to continuous improvement backlogs, and lead root cause analysis (RCA) via blameless post-mortems.
Technical Skills and Experience
- Strong background as an SRE or DevOps Engineer within an AWS environment.
- Hands-on experience managing and scaling workloads on Amazon EKS (Kubernetes).
- Proven track record with observability stacks, specifically Dynatrace and Prometheus.
- Deep understanding of SRE principles, including error budgets, alerting thresholds, and full-stack tracing.
- Excellent scripting/automation skills (e.g., Python, Bash, or Go).
- Data Platform Experience: Prior exposure to data platforms, batch/streaming data pipelines (e.g., Kafka, Spark), and the unique challenges of data observability and workload reliability.
- Active or recent SC Clearance.
Day Rate: £300.00 - £375.00 per day
Employer: TALENT INTERNATIONAL UK LTD, London
At Talent International, we pride ourselves on fostering a dynamic and inclusive work culture that empowers our employees to thrive. As an AWS Site Reliability Engineer, you will have the opportunity to work remotely from the UK, collaborating with a high-impact team dedicated to innovation and excellence in cloud technology. We offer competitive daily rates, a commitment to professional development, and a supportive environment that values your contributions and encourages growth.
Contact Details:
TALENT INTERNATIONAL UK LTD Recruitment Team