At a Glance
- Tasks: Lead a team ensuring the reliability and performance of critical financial applications.
- Company: SS&C is a global leader in investment and financial software services.
- Benefits: Enjoy a full-time role with opportunities for remote work and diverse corporate perks.
- Why this job: Join a dynamic team focused on innovation, operational excellence, and a strong engineering culture.
- Qualifications: 10+ years in engineering, with 5+ years in SRE or DevOps leadership roles required.
- Other info: We value diversity and encourage applications from all backgrounds.
The predicted salary is between £72,000 and £108,000 per year.
SS&C is a global provider of software and software-enabled services for the financial services and healthcare industries. The GIDS product suite powers mission-critical investor and distributor services across asset managers, insurance companies, retirement providers, and wealth management platforms.
Job Overview
As the Head of Production Engineering and Site Reliability Engineering (SRE) for the GIDS organisation, you will lead a team responsible for the scalability, resilience, performance, and reliability of cloud and hybrid infrastructure powering some of the most critical client-facing applications in financial services. You will be the strategic and operational leader for platform reliability, observability, incident response, CI/CD modernisation, and developer productivity. You will drive automation, lead with metrics, and build systems and teams that proactively address issues before they impact clients.
Key Responsibilities:
- Define and execute the vision and roadmap for Production Engineering and SRE within GIDS.
- Build and lead globally distributed, high-performance teams with a focus on talent development, SRE culture, and operational excellence.
- Collaborate cross-functionally with Engineering, Product, Compliance, and Infrastructure teams to improve system reliability and efficiency.
Production Operations & Incident Management
- Own reliability, uptime, and performance KPIs for GIDS applications and services.
- Implement a comprehensive incident management lifecycle (on-call, escalation, RCA, blameless postmortems).
- Reduce Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) through automated observability, alerting, and playbooks.
CI/CD and Platform Engineering
- Oversee the development and evolution of CI/CD pipelines for all GIDS products using GitHub Actions, ArgoCD, TeamCity, Octopus Deploy, and GitOps principles.
- Integrate static and dynamic code analysis, vulnerability scanning, artifact promotion, and release gating into the SDLC.
- Ensure pipeline scalability and governance while maintaining developer velocity.
Observability & Troubleshooting
- Lead the implementation and usage of modern observability stacks (e.g., OpenTelemetry, Prometheus, Grafana, Splunk, Datadog).
- Establish SLOs, SLIs, and error budgets with product and engineering teams.
- Drive root cause identification using distributed tracing, advanced log analysis, and anomaly detection.
Security, Audit & Compliance
- Partner with security and compliance teams to embed controls into infrastructure and software delivery.
- Automate audit evidence collection, change tracking, and access management (e.g., HashiCorp Vault, OPA, AWS IAM).
- Ensure all systems meet internal and regulatory audit requirements (SOC2, GDPR, etc.).
Infrastructure & Automation
- Champion infrastructure-as-code (IaC) using Terraform, Helm, and Kubernetes for scalable cloud and hybrid deployments.
- Optimise infrastructure cost, elasticity, and resilience through autoscaling, canary deployments, and chaos testing.
- Maintain high SLAs for critical services running on Kubernetes, AWS, and on-prem hybrid infrastructure.
Talent Management & Culture
- Attract, retain, and mentor top engineering talent with a strong focus on diversity and continuous learning.
- Cultivate a culture of ownership, transparency, blameless accountability, and operational excellence.
- Drive career development through structured learning paths, performance reviews, and skills-based mentoring.
Talent Management & Global Operations
- Build and scale a globally distributed 24/7 operations team, ensuring consistent coverage and operational resilience.
- Establish and enforce engineering and operational standards for deployments, monitoring, and incident response across geographies.
- Implement and continuously refine a multi-tiered support structure (L1, L2, L3) with clear escalation paths and accountability.
- Drive hiring, onboarding, and training initiatives that support both site reliability and continuous delivery.
- Foster a strong engineering culture rooted in transparency, autonomy, learning, and operational excellence.
- Develop strategies to prevent burnout in around-the-clock operations, including tooling, automation, and shift rotation planning.
Qualifications
Required:
- 10+ years of experience in engineering, with 5+ years in a leadership role in SRE, DevOps, or Production Engineering.
- Proven track record managing reliable, scalable systems in a high-compliance environment (e.g., FinTech, HealthTech).
- Strong understanding of modern software development lifecycle, CI/CD, IaC, and cloud-native technologies.
- Expertise in Kubernetes, AWS (or Azure/GCP), GitOps workflows, observability tools, and automation frameworks.
- Excellent leadership, communication, and stakeholder management skills.
We encourage applications from people of all backgrounds and particularly welcome applications from under-represented groups, to enable us to bring a diversity of perspectives to our thinking and conversation. It's essential to us that we strive to have a diverse workforce in the widest sense.
Unless explicitly requested or approached by SS&C Technologies, Inc. or any of its affiliated companies, the company will not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services.
SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate against any applicant for employment or employee on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status or any other classification protected by applicable discrimination laws.
- Seniority level: Mid-Senior level
- Employment type: Full-time
- Job function: Engineering and Management
- Industries: Financial Services, Software Development, and Investment Management
Head of SRE and Production Engineering (London) employer: SS&C Technologies
Contact Detail:
SS&C Technologies Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Head of SRE and Production Engineering (London) role
✨Tip Number 1
Familiarise yourself with the latest trends in Site Reliability Engineering and Production Engineering. Understanding the current tools and methodologies, such as Kubernetes, AWS, and CI/CD practices, will help you speak confidently about your expertise during interviews.
✨Tip Number 2
Network with professionals in the financial services and software development sectors. Attend industry meetups or webinars to connect with potential colleagues and learn more about the company culture at SS&C, which can give you an edge in your application.
✨Tip Number 3
Prepare to discuss your leadership style and experiences in managing high-performance teams. Be ready to share specific examples of how you've fostered a strong engineering culture and driven talent development in previous roles.
✨Tip Number 4
Research SS&C's GIDS product suite and understand its impact on the financial services industry. Being knowledgeable about their products will not only impress your interviewers but also demonstrate your genuine interest in the role and the company.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights relevant experience in SRE, DevOps, and Production Engineering. Focus on leadership roles and specific achievements that demonstrate your ability to manage scalable systems in high-compliance environments.
Craft a Compelling Cover Letter: In your cover letter, express your passion for engineering and leadership. Discuss how your vision aligns with the goals of SS&C and how you can contribute to their mission-critical applications in financial services.
Showcase Technical Expertise: Emphasise your expertise in Kubernetes, AWS, and CI/CD practices. Provide examples of how you've implemented modern observability stacks and automated processes in previous roles to enhance system reliability and performance.
Highlight Team Management Skills: Discuss your experience in building and leading high-performance teams. Mention specific strategies you've used for talent development, fostering a strong engineering culture, and preventing burnout in operational settings.
How to prepare for a job interview at SS&C Technologies
✨Showcase Your Leadership Experience
As the Head of SRE and Production Engineering, you'll need to demonstrate your leadership skills. Prepare examples of how you've built and led high-performance teams, focusing on talent development and operational excellence.
✨Understand the Technical Landscape
Familiarise yourself with the technologies mentioned in the job description, such as Kubernetes, AWS, and CI/CD practices. Be ready to discuss your experience with these tools and how you've implemented them in previous roles.
✨Emphasise Collaboration Skills
Collaboration is key in this role. Prepare to discuss how you've worked cross-functionally with different teams, such as Engineering, Product, and Compliance, to improve system reliability and efficiency.
✨Prepare for Scenario-Based Questions
Expect scenario-based questions that assess your problem-solving abilities in high-compliance environments. Think about past incidents you've managed and how you ensured reliability and performance under pressure.