At a Glance
- Tasks: Lead the SRE team, manage AWS infrastructure, and implement best practices for reliability.
- Company: Amber Labs is a dynamic tech consultancy focused on innovation and collaboration.
- Benefits: Enjoy flexible work, private medical insurance, 25 days leave, and a vibrant company culture.
- Why this job: Join a rapidly growing start-up that values personal growth and encourages experimentation.
- Qualifications: 8+ years in SRE or DevOps, with strong AWS expertise and leadership experience required.
- Other info: This is a 12-month FTC role; SC clearance is mandatory.
The predicted salary is between £43,200 and £72,000 per year.
AWS Head of Site Reliability Engineering (Must hold current SC)
The Company:
At Amber Labs, we are a cutting-edge UK and European technology consultancy that prioritises empowering autonomy, promoting experimentation, and facilitating rapid learning to provide exceptional value to our clients. Our company culture is centred around collaboration, where all colleagues, regardless of their role, work together to minimise risk and shorten delivery times. Our team consists of highly skilled cross-functional consultants, analysts, and support staff.
Overview:
We are looking for a highly skilled and visionary leader to join our team as the Head of Site Reliability Engineering (SRE) with a strong focus on AWS cloud infrastructure. The ideal candidate will have a deep understanding of cloud architectures, extensive experience in SRE practices, and the ability to lead and scale SRE teams to ensure the availability, performance, and security of our systems.
Key Responsibilities:
- Leadership and Team Management: Lead and manage the SRE team to ensure high availability, scalability, and performance of our AWS-based infrastructure. Provide mentorship and guidance to junior and senior engineers, fostering a culture of operational excellence and continuous improvement.
- Cloud Infrastructure Management: Oversee the design, implementation, and maintenance of cloud infrastructure in AWS, ensuring the systems are secure, reliable, and highly available. Use best practices for AWS services, automation, and monitoring.
- SRE Practices Implementation: Establish and lead the implementation of SRE principles, such as Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets, to drive the team's focus on reliability.
- Incident Management: Lead incident response efforts, root cause analysis (RCA), and post-incident reviews to improve system reliability. Ensure rapid response to production issues and minimize downtime.
- Performance Optimization: Drive initiatives for performance tuning, cost optimization, and efficient use of AWS resources. Ensure the infrastructure can scale to meet the demands of the business.
- Automation and Continuous Improvement: Champion the automation of manual tasks, such as deployments, monitoring, and scaling, using tools like Terraform, CloudFormation, Jenkins, and other CI/CD platforms.
- Collaboration: Work closely with cross-functional teams (Engineering, DevOps, Security, etc.) to ensure seamless collaboration in achieving business and technical goals.
- Monitoring and Alerts: Implement and maintain robust monitoring, alerting, and logging systems to detect issues before they impact the business, using AWS CloudWatch, Prometheus, Grafana, etc.
- Cost Management: Help optimize AWS costs while maintaining operational efficiency and reliability.
Required Qualifications:
- Experience: 8+ years of experience in Site Reliability Engineering, DevOps, or similar roles, with at least 2 years in a leadership position.
- AWS Expertise: Extensive experience with AWS services, such as EC2, S3, Lambda, RDS, VPC, CloudFormation, CloudWatch, etc. Hands-on experience with cloud architecture and design.
- SRE Best Practices: Deep understanding of SRE principles and frameworks, including SLOs, SLIs, and Error Budgets.
- Incident Management: Proven experience in incident management, including response, recovery, root cause analysis, and post-mortem reporting.
- Automation Tools: Proficient in automation tools like Terraform, CloudFormation, Jenkins, and other CI/CD tools.
Preferred Qualifications:
- Certifications: AWS Certified Solutions Architect – Professional, AWS Certified DevOps Engineer, or other relevant certifications.
- Agile Methodologies: Experience with Agile and Lean practices in a cloud-native environment.
Benefits:
- Competitive salary and performance-based bonus structure.
- Join a rapidly expanding start-up where personal growth is a part of our DNA.
- Benefit from a flexible work environment focused on deliverable outcomes.
- Receive private medical insurance through Aviva.
- Enjoy the benefits of a company pension plan through Nest.
- 25 days of annual leave plus UK bank holidays.
- Access Perkbox, a global employee rewards platform offering discounts, perks, and wellness resources.
- Participate in a generous employee referral program.
- A highly collaborative and collegial environment with opportunities for career advancement.
- Be encouraged to take bold steps and embrace a mindset of experimentation.
- Choose your preferred device, PC or Mac.
Diversity & Inclusion:
Here at Amber Labs, we are dedicated to fostering an inclusive and equitable workplace for all. Our commitment to diversity, equality, and inclusion includes:
- Valuing the unique experiences, perspectives, and backgrounds of all employees and creating an environment where everyone feels welcomed, respected, and valued.
- Prohibiting all forms of harassment, bullying, discrimination, and victimisation and promoting a culture of dignity and respect for all.
- Educating all new hires on our Diversity and Inclusion policies and ensuring they are aware of their rights and responsibilities to create a safe and inclusive workplace.
By taking these steps, we are dedicated to building a workplace that reflects and celebrates the diversity of our employees and communities.
This role at Amber Labs is a 12-month fixed-term contract (FTC), and all employees are required to meet the Baseline Personnel Security Standard (BPSS) and hold current Security Check (SC) clearance. Please be advised that, at this time, we are unable to consider candidates who require sponsorship or hold a visa of any type.
What Happens Next?
Our Talent Acquisition Team will be in touch to advise you on the next steps. We have a two-stage interview process for most of our consultants. In certain cases, we may include a third and final stage, which is a conversation with the company Partners. This will only be considered if deemed necessary.
Seniority level: Mid-Senior level
Employment type: Full-time
Job function: Information Technology
Industries: IT Services and IT Consulting
AWS Head of Site Reliability Engineering (Must hold current SC) (London) employer: Amber Labs
Contact Detail:
Amber Labs Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land AWS Head of Site Reliability Engineering (Must hold current SC) (London)
✨ Tip Number 1
Make sure to highlight your leadership experience in Site Reliability Engineering. Amber Labs is looking for someone who can manage and mentor a team, so be prepared to discuss your previous roles where you led teams and drove operational excellence.
✨ Tip Number 2
Familiarise yourself with AWS services and SRE best practices. Since the role focuses heavily on AWS cloud infrastructure, being able to speak confidently about your hands-on experience with services like EC2, S3, and CloudFormation will set you apart.
✨ Tip Number 3
Prepare examples of how you've implemented SRE principles in past roles. Discussing your experience with Service Level Objectives (SLOs) and incident management will demonstrate your understanding of the key responsibilities of the position.
✨ Tip Number 4
Show your passion for collaboration and continuous improvement. Amber Labs values a collaborative culture, so be ready to share instances where you've worked closely with cross-functional teams to achieve technical goals.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience in Site Reliability Engineering and AWS cloud infrastructure. Use specific examples that demonstrate your leadership skills and familiarity with SRE practices.
Craft a Compelling Cover Letter: Write a cover letter that showcases your passion for the role and the company. Mention how your background aligns with Amber Labs' values of collaboration and continuous improvement, and include specific achievements related to incident management and automation.
Highlight Relevant Certifications: If you hold any relevant certifications, such as AWS Certified Solutions Architect or AWS Certified DevOps Engineer, make sure to mention them prominently in your application. This will strengthen your candidacy.
Showcase Your Leadership Experience: In your application, emphasise your experience in leading teams and managing projects. Provide examples of how you've fostered a culture of operational excellence and continuous improvement in previous roles.
How to prepare for a job interview at Amber Labs
✨ Showcase Your AWS Expertise
Make sure to highlight your extensive experience with AWS services during the interview. Be prepared to discuss specific projects where you've implemented AWS solutions, focusing on how you ensured security, reliability, and performance.
✨ Demonstrate Leadership Skills
As a Head of Site Reliability Engineering, you'll need to lead a team effectively. Share examples of how you've mentored junior engineers and fostered a culture of operational excellence in previous roles.
✨ Discuss SRE Best Practices
Be ready to talk about your understanding of SRE principles, such as SLOs, SLIs, and Error Budgets. Provide concrete examples of how you've implemented these practices to improve system reliability in past positions.
✨ Prepare for Incident Management Scenarios
Expect questions related to incident management. Prepare to discuss your experience with root cause analysis and post-incident reviews, and how you've used these processes to enhance system reliability and minimise downtime.