Senior Site Reliability Engineer, Production Engineering - Cisco ThousandEyes

Full-Time | £48,000 - £84,000 / year (est.) | No working from home possible
Cisco Systems Inc

At a Glance

  • Tasks: Design and manage large-scale distributed systems, enhancing reliability and performance.
  • Company: Join Cisco ThousandEyes, a leader in Digital Experience Assurance.
  • Benefits: Competitive salary, health benefits, and opportunities for professional growth.
  • Other info: Collaborative environment with limitless growth opportunities.
  • Why this job: Make a real impact on digital experiences with cutting-edge technology.
  • Qualifications: Expertise in Kubernetes, Python or Go, and cloud providers like AWS.

The predicted salary is between £48,000 and £84,000 per year.

Cisco ThousandEyes is a Digital Experience Assurance platform that empowers organizations to deliver flawless digital experiences across every network - even the ones they don’t own. Powered by AI and an unmatched set of cloud, internet and enterprise network telemetry data, ThousandEyes enables IT teams to proactively detect, diagnose, and remediate issues - before they impact end-user experiences.

ThousandEyes is deeply integrated across the entire Cisco technology portfolio and beyond, helping customers deploy at scale while also delivering AI-powered assurance insights within Cisco’s leading Networking, Security, Collaboration, and Observability portfolios.

Your Impact

We are seeking a skilled Senior Site Reliability Engineer (SRE) in Production Engineering with a strong background in SaaS and operations. You will design and manage large-scale, highly available distributed systems in the cloud, collaborating directly with application development teams to enhance the reliability, performance, and security of our platform.

  • Technical Leadership & Collaboration: Forge strong partnerships with cross-functional stakeholders to identify requirements and deliver solutions that address project and departmental objectives.
  • Solution Design & Deployment: Architect and implement small to mid-size or moderately complex solutions that elevate reliability, availability, latency, and performance across diverse environments and customer segments.
  • Automation & Service Reliability: Combine expertise in design, automation, deployment, and coding to enhance system reliability for new and existing platforms, tailoring approaches to regional, national, or customer-specific needs.
  • High Availability & Disaster Recovery: Develop and validate automated high-availability and disaster recovery mechanisms, ensuring systems are robust, scalable, and support rapid velocity in delivery. Take part in regular disaster recovery drills.
  • Capacity Planning & Reporting: Analyze resource usage and produce actionable reports to forecast and address capacity constraints, supporting proactive decision-making and operational excellence.
  • Monitoring & Tooling: Design, build, and deploy tools that deliver comprehensive visibility into infrastructure performance and reliability. Automate key platform functions for efficiency and resilience.
  • Incident Response & Continuous Improvement: Monitor production environments, collaborate with Development and Operations to diagnose issues, and develop monitoring tools to preemptively identify and resolve problems. Serve as on-call Site Reliability Engineer (SRE), lead post-mortems, and deliver clear root cause analyses.
  • Security & Compliance: Embed strong security controls in architectural design, collaborate with security teams to enhance safeguards, and contribute to incident response efforts as needed. Work closely with specialist security teams to ensure that platform components and infrastructure are secured to the highest possible level.

Minimum Qualifications

  • Expert-level knowledge of Kubernetes and its ecosystem.
  • Proficiency in software development with languages such as Python or Go.
  • In-depth knowledge of cloud providers, preferably AWS.
  • Solid conceptual and practical knowledge in Web technologies, Networking, and Linux.
  • Knowledge of Site Reliability principles: Incident Response, Change Management, Distributed Systems, Deployment Strategies, and SLOs.

Preferred Qualifications

  • Familiarity with best practices for operating a large-scale, highly available enterprise platform.
  • 5+ years of experience in a related role.
  • Proven ability to build and implement scalable and well-tested solutions.
  • Excellent communication and documentation skills.
  • Strong sense of ownership, drive, and attention to detail.

Why Cisco?

At Cisco, we’re revolutionizing how data and infrastructure connect and protect organizations in the AI era – and beyond. We’ve been innovating fearlessly for 40 years to create solutions that power how humans and technology work together across the physical and digital worlds. These solutions provide customers with unparalleled security, visibility, and insights across the entire digital footprint.

Fueled by the depth and breadth of our technology, we experiment and create meaningful solutions. Add to that our worldwide network of doers and experts, and you’ll see that the opportunities to grow and build are limitless. We work as a team, collaborating with empathy to make really big things happen on a global scale. Because our solutions are everywhere, our impact is everywhere.

We are Cisco, and our power starts with you.

Senior Site Reliability Engineer, Production Engineering - Cisco ThousandEyes employer: Cisco Systems Inc

Cisco ThousandEyes is an exceptional employer that fosters a collaborative and innovative work culture, empowering employees to make a significant impact in the realm of digital experience assurance. With a strong emphasis on professional growth, employees benefit from continuous learning opportunities and the chance to work with cutting-edge technology in a supportive environment. Located in a vibrant tech hub, Cisco offers unique advantages such as flexible working arrangements and a commitment to employee well-being, making it an ideal place for those seeking meaningful and rewarding careers.

Contact Details:

Cisco Systems Inc Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land Senior Site Reliability Engineer, Production Engineering - Cisco ThousandEyes

Join Local Tech Meetups

Get out there and mingle with fellow developers by joining local tech meetups. It’s a fantastic way to meet people who might be working at Cisco Systems Inc or know someone who does. Plus, you can pick up new tech skills and keep up with current trends while you're at it!

Contribute to Open Source Projects

Show off your coding chops by jumping into open-source projects. Not only does this give you practical experience, but it also gets you noticed in the dev community. You'll create a killer portfolio that speaks volumes about your skills to Cisco Systems Inc.

Tap into Online Developer Communities

Don’t underestimate the power of online developer communities like GitHub, Stack Overflow, and even Reddit. Participate in discussions, share your projects, and build your visibility. You can often find opportunities through these channels that lead to a full-time role at companies like Cisco Systems Inc.

Explore Job Boards Specifically for Tech Roles

Keep your eyes peeled on job boards that focus on tech roles. Sites like TechCareers or Stack Overflow Jobs can often have listings for companies like Cisco Systems Inc that might not show up on broader job sites. Make it a habit to check these regularly, and don’t hesitate to apply directly through our website!

We think you need these skills to ace Senior Site Reliability Engineer, Production Engineering - Cisco ThousandEyes

Kubernetes
Python
Go
AWS
Web Technologies
Networking
Linux

Some tips for your application 🫡

Show off your coding skills: When applying for a software engineering role, it's super important to showcase your coding skills. Make sure your CV includes your tech stack, any relevant programming languages you’re comfortable with, and examples of projects you've worked on. If you have a GitHub profile, link it up! We love to see code in action.

Tailor your portfolio: For a full-time role, we’d expect to see some solid examples of your work in your portfolio. Make sure to include at least two or three projects that highlight your problem-solving skills and your ability to work with different technologies. Focus on the projects that are most relevant to the position at Cisco Systems Inc.

Craft a killer cover letter: Your cover letter is your chance to stand out, so make it personal! Explain why you want to work at Cisco Systems Inc and how your skills align with the role. Show us your passion for software development. We dig enthusiastic candidates who understand the value of collaboration and continuous learning!

Be clear and concise: When it comes to writing your CV and cover letter, clarity is key. Avoid jargon that could confuse us and stick to simple, direct language. Highlight your achievements with quantifiable results where possible, and keep everything easy to read. A well-organised application goes a long way!

How to prepare for a job interview at Cisco Systems Inc

Brush Up on Your Coding Skills

For a full-time software engineering role, it's crucial to keep your coding abilities sharp. Expect technical questions that might involve solving problems on the spot or discussing algorithms. Practise on platforms like LeetCode or HackerRank to get comfortable with the types of questions that often come up.

Know Your Tools and Frameworks

Make sure you’re well-acquainted with the tools and technologies listed in the job description. Familiarise yourself with any specific frameworks or programming languages mentioned. If Cisco Systems Inc uses React or Node.js, for instance, be ready to discuss how you’ve used them in previous projects or coursework.

Showcase Your Projects

Bring along a portfolio that highlights your best work. This could be code samples, GitHub repositories, or any side projects you’ve built. Make sure you can talk through your thought process for each project, especially the challenges you faced and how you solved them, as this shows your problem-solving skills in action.

Prepare for Behavioural Questions

While technical skills are key, full-time positions also require cultural fit. Be ready to discuss your previous experiences and how you handle teamwork, conflict, and deadlines. Brush up on the STAR method (Situation, Task, Action, Result) to clearly articulate how you've contributed to a team.