At a Glance
- Tasks: Lead a team of engineers to enhance platform reliability and security.
- Company: Cisco ThousandEyes delivers flawless digital experiences through AI-powered insights.
- Benefits: Enjoy a hybrid work model with opportunities for career development and mentorship.
- Why this job: Join a culture of innovation and collaboration, making a real impact in tech.
- Qualifications: Experience in leading SRE teams, Kubernetes expertise, and strong communication skills required.
- Other info: Diverse backgrounds are encouraged; apply even if you don't meet every qualification.
The predicted salary is between £48,000 and £84,000 per year.
Please note that we have a hybrid approach to work and would like to find someone who can come into the office in London at least one day a week.
Who We Are
Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network – even the ones they don’t own. Powered by AI and an unmatched set of cloud, Internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues – before they impact end-user experiences. ThousandEyes is deeply integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco’s leading Networking, Security, Collaboration, and Observability portfolios.
About The Role
As the Senior Engineering Manager for our Production Engineering SRE team, you will lead a group of skilled engineers responsible for the design and management of large-scale, highly available distributed systems in the cloud. You will collaborate directly with application development teams and other cross-functional teams to enhance the reliability, performance, and security of our platform and to drive operational excellence.
What You’ll Do
- Team Leadership and Development: Build and mentor a high-performing team of Site Reliability Engineers that embed with application development teams. Foster a culture of continuous learning, innovation, and best practices. Manage performance, set goals, and provide career development opportunities.
- Strategic Planning and Execution: Develop and implement strategies to improve platform reliability, security, and performance. Collaborate with other engineering leaders to align SRE initiatives with overall business objectives. Establish and execute a roadmap for building common platform solutions to the reliability, security, and scale challenges that engineering teams at ThousandEyes face.
- Operational Excellence: Oversee the design and implementation of scalable operations tooling for SREs and Developers. Ensure the effective management of our 24x7 incident response and on-call rotation. Lead efforts to automate production operations and adopt robust monitoring solutions.
- Security and Compliance: Partner with application development teams and other platform engineering teams to enhance the security posture of our containerized and cloud-native systems. Ensure compliance with Cisco and industry standards for data protection, scanning, and system security.
- Cross-functional Collaboration: Work closely with software development teams to optimize architecture and services for availability and performance. Collaborate with product management to align SRE initiatives with product roadmaps. Represent the Production Engineering SRE team in cross-functional meetings and initiatives.
Minimum Qualifications
- Proven track record of leading and scaling SRE teams in a fast-paced environment.
- Deep knowledge of site reliability principles, including incident response, change management, and SLOs.
- Expert-level knowledge of Kubernetes and its ecosystem.
- Strong understanding of cloud platforms, preferably AWS.
- Experience with microservices architecture and distributed systems.
Preferred Qualifications
- Strong communication and leadership skills, with the ability to influence cross-functional stakeholders.
- Demonstrated ability in SRE, DevOps, or related fields, with at least 3 years in a management role.
- Background in security engineering, DevSecOps or a strong understanding of security best practices in cloud-native environments.
- Familiarity with CNCF tools such as Prometheus, OpenTelemetry, and ArgoCD.
Cisco values the perspectives and skills that emerge from employees with diverse backgrounds. That’s why Cisco is expanding the boundaries of discovering top talent by not only focusing on candidates with educational degrees and experience but also placing more emphasis on unlocking potential. We believe that everyone has something to offer and that diverse teams are better equipped to solve problems, innovate, and create a positive impact. We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification. Research shows that people from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy. We urge you not to prematurely exclude yourself and to apply if you’re interested in this work.
Contact Detail:
Cisco ThousandEyes Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Senior Site Reliability Engineering Manager, Production Engineering role
✨Tip Number 1
Familiarise yourself with the latest trends and technologies in site reliability engineering, especially around Kubernetes and cloud platforms like AWS. This knowledge will not only help you during interviews but also demonstrate your commitment to staying current in a fast-paced environment.
✨Tip Number 2
Network with professionals in the SRE field, particularly those who work at Cisco or similar companies. Engaging with them on platforms like LinkedIn can provide insights into the company culture and expectations, which can be invaluable during your application process.
✨Tip Number 3
Prepare to discuss your leadership style and experiences in managing SRE teams. Be ready to share specific examples of how you've fostered a culture of continuous learning and innovation, as this aligns closely with the team culture described in the role.
✨Tip Number 4
Showcase your understanding of security best practices in cloud-native environments. Given the emphasis on security in the job description, being able to articulate your experience and strategies in this area will set you apart from other candidates.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience in site reliability engineering, team leadership, and cloud platforms. Use specific examples that demonstrate your expertise in Kubernetes and incident response.
Craft a Compelling Cover Letter: In your cover letter, express your passion for enhancing digital experiences and your understanding of the role's responsibilities. Mention how your background aligns with Cisco ThousandEyes' mission and values.
Highlight Relevant Skills: Clearly outline your skills related to SRE principles, DevOps practices, and security best practices. Emphasise your experience with microservices architecture and any familiarity with CNCF tools.
Showcase Leadership Experience: Detail your previous leadership roles and how you have successfully built and mentored teams. Provide examples of how you've fostered a culture of continuous learning and innovation within your teams.
How to prepare for a job interview at Cisco ThousandEyes
✨Showcase Your Leadership Skills
As a Senior Site Reliability Engineering Manager, you'll need to demonstrate your ability to lead and mentor a team. Prepare examples of how you've built high-performing teams in the past and fostered a culture of continuous learning.
✨Understand Site Reliability Principles
Make sure you have a solid grasp of site reliability principles, including incident response and change management. Be ready to discuss your experience with SLOs and how you've implemented them in previous roles.
✨Familiarise Yourself with Relevant Technologies
Given the emphasis on Kubernetes and cloud platforms like AWS, brush up on your knowledge of these technologies. Be prepared to discuss specific projects where you've utilised these tools effectively.
✨Prepare for Cross-Functional Collaboration Questions
Since the role involves working closely with software development teams and product management, think of examples that highlight your collaboration skills. Be ready to explain how you've aligned SRE initiatives with broader business objectives.