Cisco ThousandEyes

Senior Site Reliability Engineering Manager, Production Engineering

London, Full-Time, £48,000 - £84,000 / year (est.), Home office (partial)

At a Glance

Tasks: Lead a team of engineers to enhance platform reliability and security.
Company: Cisco ThousandEyes delivers flawless digital experiences through AI-powered insights.
Benefits: Enjoy a hybrid work model with opportunities for career development and mentorship.
Why this job: Join a culture of innovation and collaboration, making a real impact in tech.
Qualifications: Experience in leading SRE teams, Kubernetes expertise, and strong communication skills required.
Other info: Diverse backgrounds are encouraged; apply even if you don't meet every qualification.

The predicted salary is between £48,000 and £84,000 per year.

Please note that we have a hybrid approach to work and would like to find someone who can come into the office in London at least one day a week.

Who We Are

Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network – even the ones they don’t own. Powered by AI and an unmatched set of cloud, Internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues – before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco’s leading Networking, Security, Collaboration, and Observability portfolios.

About The Role

As the Senior Engineering Manager for our Production Engineering SRE team, you will lead a group of skilled engineers responsible for the design and management of large-scale, highly available distributed systems in the cloud. You will collaborate directly with application development teams to enhance the reliability, performance, and security of our platform, and work with cross-functional teams to drive operational excellence.

What You’ll Do

Team Leadership and Development: Build and mentor a high-performing team of Site Reliability Engineers that embed with application development teams. Foster a culture of continuous learning, innovation, and best practices. Manage performance, set goals, and provide career development opportunities.
Strategic Planning and Execution: Develop and implement strategies to improve platform reliability, security, and performance. Collaborate with other engineering leaders to align SRE initiatives with overall business objectives. Establish and execute a roadmap for common platform solutions to the reliability, security, and scale challenges that engineering teams at ThousandEyes face.
Operational Excellence: Oversee the design and implementation of scalable operations tooling for SREs and Developers. Ensure the effective management of our 24x7 incident response and on-call rotation. Lead efforts to automate production operations and adopt robust monitoring solutions.
Security and Compliance: Partner with application development teams and other platform engineering teams to enhance the security posture of our containerized and cloud-native systems. Ensure compliance with Cisco and industry standards for data protection, scanning, and system security.
Cross-functional Collaboration: Work closely with software development teams to optimize architecture and services for availability and performance. Collaborate with product management to align SRE initiatives with product roadmaps. Represent the Production Engineering SRE team in cross-functional meetings and initiatives.

Minimum Qualifications

Proven track record of leading and scaling SRE teams in a fast-paced environment.
Deep knowledge of site reliability principles, including incident response, change management, and SLOs.
Expert-level knowledge of Kubernetes and its ecosystem.
Strong understanding of cloud platforms, preferably AWS.
Experience with microservices architecture and distributed systems.

Preferred Qualifications

Strong communication and leadership skills, with the ability to influence cross-functional stakeholders.
Demonstrated ability in SRE, DevOps, or related fields, with at least 3 years in a management role.
Background in security engineering, DevSecOps or a strong understanding of security best practices in cloud-native environments.
Familiarity with CNCF tools such as Prometheus, OpenTelemetry, and ArgoCD.

Cisco values the perspectives and skills that emerge from employees with diverse backgrounds. That’s why Cisco is expanding the boundaries of discovering top talent by not only focusing on candidates with educational degrees and experience but also placing more emphasis on unlocking potential. We believe that everyone has something to offer and that diverse teams are better equipped to solve problems, innovate, and create a positive impact. We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification. Research shows that people from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy. We urge you not to prematurely exclude yourself and to apply if you’re interested in this work.

Contact Details:

Cisco ThousandEyes Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Senior Site Reliability Engineering Manager, Production Engineering

✨Tip Number 1

Familiarise yourself with the latest trends and technologies in site reliability engineering, especially around Kubernetes and cloud platforms like AWS. This knowledge will not only help you during interviews but also demonstrate your commitment to staying current in a fast-paced environment.

✨Tip Number 2

Network with professionals in the SRE field, particularly those who work at Cisco or similar companies. Engaging with them on platforms like LinkedIn can provide insights into the company culture and expectations, which can be invaluable during your application process.

✨Tip Number 3

Prepare to discuss your leadership style and experiences in managing SRE teams. Be ready to share specific examples of how you've fostered a culture of continuous learning and innovation, as this aligns closely with the team leadership responsibilities listed for the role.

✨Tip Number 4

Showcase your understanding of security best practices in cloud-native environments. Given the emphasis on security in the job description, being able to articulate your experience and strategies in this area will set you apart from other candidates.

We think you need these skills to ace Senior Site Reliability Engineering Manager, Production Engineering

Team Leadership

Site Reliability Engineering (SRE)

Incident Response Management

Change Management

Service Level Objectives (SLOs)

Kubernetes Expertise

Cloud Platform Knowledge (preferably AWS)

Microservices Architecture

Distributed Systems Understanding

Operational Excellence

Automation of Production Operations

Monitoring Solutions Implementation

Security Best Practices in Cloud-Native Environments

Cross-Functional Collaboration

Strong Communication Skills

Strategic Planning and Execution

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your experience in site reliability engineering, team leadership, and cloud platforms. Use specific examples that demonstrate your expertise in Kubernetes and incident response.

Craft a Compelling Cover Letter: In your cover letter, express your passion for enhancing digital experiences and your understanding of the role's responsibilities. Mention how your background aligns with Cisco ThousandEyes' mission and values.

Highlight Relevant Skills: Clearly outline your skills related to SRE principles, DevOps practices, and security best practices. Emphasise your experience with microservices architecture and any familiarity with CNCF tools.

Showcase Leadership Experience: Detail your previous leadership roles and how you have successfully built and mentored teams. Provide examples of how you've fostered a culture of continuous learning and innovation within your teams.

How to prepare for a job interview at Cisco ThousandEyes

✨Showcase Your Leadership Skills

As a Senior Site Reliability Engineering Manager, you'll need to demonstrate your ability to lead and mentor a team. Prepare examples of how you've built high-performing teams in the past and fostered a culture of continuous learning.

✨Understand Site Reliability Principles

Make sure you have a solid grasp of site reliability principles, including incident response and change management. Be ready to discuss your experience with SLOs and how you've implemented them in previous roles.

✨Familiarise Yourself with Relevant Technologies

Given the emphasis on Kubernetes and cloud platforms like AWS, brush up on your knowledge of these technologies. Be prepared to discuss specific projects where you've utilised these tools effectively.

✨Prepare for Cross-Functional Collaboration Questions

Since the role involves working closely with software development teams and product management, think of examples that highlight your collaboration skills. Be ready to explain how you've aligned SRE initiatives with broader business objectives.
