At a Glance
- Tasks: Lead the design and delivery of scalable, reliable infrastructure and services.
- Company: Join a top player in the mobile industry, shaping the future of technology.
- Benefits: Enjoy hybrid remote work, competitive pay, and opportunities for professional growth.
- Why this job: Be part of a dynamic team driving innovation in cloud-native applications.
- Qualifications: 5+ years in Site Reliability Engineering with strong Kubernetes and coding skills.
- Other info: This is a 12-month contract role, outside IR35, offering £300 to £330 per day.
The predicted salary is between £60,000 and £84,000 per year.
Location: Hybrid Remote – London EC2M
Contract: 12 months
Rate: Outside IR35 - £300 to £330 Per Day
About the Role: We are partnering with one of the top companies in the mobile industry to hire a Site Reliability Engineer (SRE) Manager. In this role, you will collaborate with cross-functional teams to drive the design, development, and delivery of high-performing, scalable, and reliable infrastructure and services. You’ll be responsible for building robust systems, automating operations, and enhancing observability and deployment pipelines for modern cloud-native applications.
Key Responsibilities:
- System Reliability & Performance: Maintain and scale critical services and infrastructure. Identify performance bottlenecks and work closely with product engineers to optimise applications.
- Kubernetes Operations: Administer, scale, and troubleshoot clusters in GKE, EKS, or other Kubernetes environments.
- Infrastructure as Code (IaC): Design and maintain scalable infrastructure using Terraform and automate deployments across public, private, or hybrid clouds (mainly AWS).
- CI/CD Pipeline Enhancement: Build and improve robust CI/CD pipelines to support fast and safe deployment cycles.
- Observability & Monitoring: Implement code-based instrumentation and telemetry. Ensure systems are observable with tools for logging, metrics, and alerting.
- Automation & Scripting: Write tooling and automation scripts in Python, Go, or Rust to reduce toil and manual intervention.
- Storage & Networking: Manage and optimise storage services like Amazon S3 or Google Cloud Storage (GCS). Resolve complex networking issues in multi-cloud environments.
Essential Requirements:
- 5+ years of hands-on experience as a Site Reliability Engineer.
- Proven expertise in Kubernetes (GKE/EKS).
- Strong proficiency in Python, Go, or Rust.
- Solid experience with AWS and Infrastructure as Code using Terraform.
- Deep understanding of Linux internals, standard networking protocols, and distributed systems architecture.
- Hands-on experience with automation and performance optimisation.
- Strong knowledge of SRE principles and methodologies.
- Experience with observability tools and telemetry systems.
- Exposure to Google Cloud Platform (GCP).
- Familiarity with hybrid or multi-cloud architecture.
- Experience with service meshes or edge proxies (e.g., Envoy, Istio).
- Working knowledge of container security best practices.
Site Reliability Engineering Manager employer: TECEZE
Contact Detail:
TECEZE Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Site Reliability Engineering Manager role
✨Tip Number 1
Network with professionals in the Site Reliability Engineering field. Attend meetups, webinars, or conferences related to SRE and cloud technologies. This can help you gain insights into the industry and potentially connect you with hiring managers.
✨Tip Number 2
Showcase your hands-on experience with Kubernetes and Infrastructure as Code. Engage in online forums or contribute to open-source projects that focus on these technologies. This not only builds your portfolio but also demonstrates your commitment to continuous learning.
✨Tip Number 3
Prepare for technical interviews by practising common SRE scenarios and problems. Focus on performance optimisation, automation, and CI/CD pipeline enhancements. Being able to discuss real-world examples will set you apart from other candidates.
✨Tip Number 4
Familiarise yourself with the specific tools and technologies mentioned in the job description, such as Terraform, AWS, and observability tools. Having practical knowledge of these will allow you to speak confidently about your capabilities during interviews.
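As a practice exercise for the SRE scenarios mentioned in Tip Number 3, you could work through an error budget calculation, a common way of turning an availability target into allowed downtime. The sketch below is illustrative only: the function names, the 99.9% target, and the 30-day window are assumptions chosen for the example and do not come from the job description.

```python
def error_budget_minutes(slo: float, window_days: int = 30) -> float:
    """Return the allowed downtime in minutes for an availability SLO.

    `slo` is a fraction such as 0.999 (hypothetical target for illustration).
    """
    if not 0.0 < slo < 1.0:
        raise ValueError("slo must be a fraction strictly between 0 and 1")
    total_minutes = window_days * 24 * 60
    return total_minutes * (1.0 - slo)


def budget_remaining(slo: float, downtime_minutes: float, window_days: int = 30) -> float:
    """Return the unspent fraction of the error budget (negative if overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows roughly 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999), 1))
    # After 10.8 minutes of downtime, about three quarters of the budget is left.
    print(round(budget_remaining(0.999, 10.8), 2))
```

Being able to explain a calculation like this, and what a team might do once the budget is nearly spent, is one way to ground a discussion of SRE principles in concrete numbers.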
Some tips for your application 🫡
- Tailor Your CV: Make sure your CV highlights your experience in Site Reliability Engineering, particularly your hands-on work with Kubernetes, AWS, and Infrastructure as Code. Use specific examples that demonstrate your expertise in these areas.
- Craft a Compelling Cover Letter: In your cover letter, express your passion for the role and the mobile industry. Mention how your skills align with the responsibilities listed, such as building CI/CD pipelines and automating operations. Personalise it to show why you want to work with this particular company.
- Showcase Relevant Projects: If you have worked on projects that involved automation, observability, or performance optimisation, be sure to include them in your application. Describe your role and the impact of your contributions to demonstrate your capabilities.
- Highlight Soft Skills: As a manager, soft skills are crucial. Emphasise your ability to collaborate with cross-functional teams, lead projects, and communicate effectively. Provide examples of how you've successfully managed teams or projects in the past.
How to prepare for a job interview at TECEZE
✨Showcase Your Technical Expertise
Be prepared to discuss your hands-on experience with Kubernetes, AWS, and Infrastructure as Code. Highlight specific projects where you've optimised performance or automated processes, as this will demonstrate your capability to handle the responsibilities of the role.
✨Understand the Company’s Infrastructure
Research the company’s existing infrastructure and services. Familiarise yourself with their use of cloud platforms and any specific tools they employ for observability and monitoring. This knowledge will help you tailor your responses and show genuine interest in their operations.
✨Prepare for Scenario-Based Questions
Expect scenario-based questions that assess your problem-solving skills. Think about past challenges you've faced in SRE roles and how you resolved them. Use the STAR method (Situation, Task, Action, Result) to structure your answers effectively.
✨Demonstrate Leadership and Collaboration Skills
As a manager, you'll need to showcase your ability to lead cross-functional teams. Prepare examples of how you've successfully collaborated with product engineers or other stakeholders to drive projects forward, ensuring you highlight your communication and leadership style.