Director, Site Reliability Engineering in London

London, Full-Time, £72,000 - £108,000 / year (est.), no home office possible

At a Glance

Tasks: Lead global reliability initiatives and oversee a distributed team of Site Reliability Engineers.
Company: Barracuda, a leading cybersecurity company trusted by IT professionals worldwide.
Benefits: Competitive health benefits, retirement plan, flexible time off, and career growth opportunities.
Why this job: Shape the future of cloud reliability and tackle cutting-edge challenges at scale.
Qualifications: 12+ years in infrastructure or SRE roles, with strong leadership and cloud expertise.
Other info: Join a culture of innovation and collaboration while making a real impact.

The predicted salary is between £72,000 and £108,000 per year.

Barracuda is a leading cybersecurity company providing complete protection against complex threats. Our platform protects email, data, applications, and networks with innovative solutions, and a managed XDR service, to strengthen cyber resilience.

We are seeking a strategic and visionary Director of Site Reliability Engineering (SRE) in the Cloud Operations group to lead global reliability initiatives across Barracuda's SaaS portfolio. You will oversee a distributed team of Site Reliability Engineers and partner closely with Product Engineering, Security & Compliance, and other Cloud Operations teams to ensure our platforms are highly available, scalable, secure, and cost-efficient. This role will also drive the adoption of AI-powered automation and agentic systems to transform reliability operations.

What you will be working on:

Strategic Leadership: Define and execute Barracuda's global SRE strategy, aligning reliability goals with business objectives and customer SLAs.
Operational Excellence: Drive continuous improvement in availability, latency, performance, and cost optimization across all cloud services.
AI & Agentic Systems Integration: Implement AI-driven observability and anomaly detection for proactive incident prevention; deploy agentic automation systems to manage routine operational tasks, optimize cloud resources, and accelerate remediation workflows; explore LLM-based runbooks and autonomous agents for incident triage and root cause analysis.
Cross-Functional Collaboration: Partner with Engineering, Security, and FinOps teams to embed reliability into product design and delivery pipelines.
Architecture & Governance: Influence architectural decisions for reliability, disaster recovery, and observability systems; ensure compliance with security and regulatory standards.
Automation & Tooling: Champion Infrastructure-as-Code and CI/CD automation at scale using Terraform, CloudFormation, GitHub Actions, and Jenkins.
Incident & Risk Management: Facilitate incident response protocols, conduct executive-level postmortems, and implement proactive risk mitigation strategies.
Service Level Management: Define and enforce SLIs and SLOs across global services; report reliability metrics to executive leadership.
Team Development: Build and mentor a high-performing SRE organization; foster a culture of ownership, innovation, and collaboration across regions.
Cloud Optimization: Lead initiatives for cost governance and performance tuning in AWS and Azure environments.
Executive Communication: Present reliability roadmaps, KPIs, and risk assessments to senior leadership and stakeholders.

What you bring to the role:

Experience: 12+ years in infrastructure, cloud operations, or SRE roles, including 5+ years in leadership positions managing distributed teams.
Cloud Expertise: Deep knowledge of AWS and Azure architectures, security, and operations in large-scale SaaS environments.
AI & Automation: Experience implementing AI-driven observability, predictive analytics, and autonomous remediation systems.
Infrastructure as Code: Proven success implementing Infrastructure-as-Code tools such as Terraform or CloudFormation at enterprise scale.
CI/CD & Automation: Advanced experience with GitHub Actions, Jenkins, and deployment strategies (blue/green, canary, rolling).
Container Orchestration: Expertise in Kubernetes (EKS, AKS) and containerized workloads.
Observability & Resilience: Strong background in Prometheus, Grafana, ELK, and APM tools; experience designing self-healing systems.
Programming: Proficiency in Python, Go, or similar languages for automation and tooling.
Leadership Skills: Exceptional ability to lead globally distributed teams, influence cross-functional stakeholders, and drive cultural change.
Certifications: AWS Solutions Architect/DevOps Professional and Kubernetes certifications (CKA, CKAD) preferred.

What you will get from us:

A leadership role where your vision shapes the reliability of mission-critical systems.
Opportunities for career growth and executive visibility.
High-quality health benefits, retirement plan with employer match, and flexible time off.
The chance to work on cutting-edge cloud reliability challenges at scale.

Director, Site Reliability Engineering in London employer: Barracuda Networks

At Barracuda, we pride ourselves on being a leading cybersecurity company that not only protects our clients but also invests in the growth and well-being of our employees. As a Director of Site Reliability Engineering, you will be part of a dynamic work culture that fosters innovation and collaboration, with ample opportunities for career advancement and executive visibility. Our commitment to employee development is matched by our comprehensive benefits package, including high-quality health benefits, a retirement plan with employer match, and flexible time off, making Barracuda an exceptional place to build a meaningful career.

Contact Details:

Barracuda Networks Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Director, Site Reliability Engineering in London

✨Tip Number 1

Network like a pro! Reach out to folks in your industry on LinkedIn or at events. A friendly chat can open doors that a CV just can't.

✨Tip Number 2

Prepare for interviews by researching the company and its culture. Tailor your answers to show how your experience aligns with their goals, especially around reliability and cloud operations.

✨Tip Number 3

Showcase your leadership skills! Be ready to discuss how you've built and mentored teams in the past. Companies love candidates who can inspire and drive cultural change.

✨Tip Number 4

Don't forget to apply through our website! It’s the best way to ensure your application gets noticed. Plus, we love seeing candidates who are proactive about their job search.

We think you need these skills to ace Director, Site Reliability Engineering in London

Strategic Leadership

Operational Excellence

AI-driven Observability

Anomaly Detection

Infrastructure-as-Code

CI/CD Automation

Cloud Expertise

Incident Management

Risk Mitigation

Service Level Management

Team Development

Cost Governance

Executive Communication

Container Orchestration

Programming in Python or Go

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Director of Site Reliability Engineering role. Highlight your leadership experience and any relevant cloud operations expertise to catch our eye!

Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about site reliability engineering and how your vision aligns with Barracuda's goals. This is your chance to show off your personality and strategic thinking!

Showcase Your Achievements: When detailing your past roles, focus on specific achievements that demonstrate your impact in previous positions. Use metrics where possible to quantify your success in improving reliability or optimising costs.

Apply Through Our Website: We encourage you to apply directly through our website for the best chance of being noticed. It’s the easiest way for us to keep track of your application and ensure it gets into the right hands!

How to prepare for a job interview at Barracuda Networks

✨Know Your Stuff

Make sure you brush up on your knowledge of AWS and Azure architectures, as well as the latest trends in AI-driven observability. Be ready to discuss how you've implemented Infrastructure as Code and CI/CD automation in your previous roles.

✨Showcase Your Leadership Skills

Prepare examples that highlight your experience in leading distributed teams. Think about specific challenges you've faced and how you influenced cross-functional stakeholders to drive cultural change within your organisation.

✨Be Ready for Technical Questions

Expect to dive deep into technical discussions around incident management, cloud optimisation, and service level management. Practise articulating your thought process when it comes to designing self-healing systems and managing operational tasks with automation.

✨Communicate Your Vision

Since this role involves presenting reliability roadmaps and KPIs to senior leadership, practise how you would communicate your strategic vision for SRE. Make sure you can clearly articulate how your goals align with business objectives and customer SLAs.
