At a Glance
- Tasks: Lead and innovate observability solutions across diverse tech environments.
- Company: Join Citi, a global leader in financial services with a dynamic culture.
- Benefits: Enjoy 27 days annual leave, private medical care, and a hybrid work model.
- Why this job: Shape the future of observability while leading a talented team.
- Qualifications: Experience in observability and strong leadership skills required.
- Other info: Be part of a supportive workplace that values diversity and growth.
The predicted salary is between £72,000 and £108,000 per year.
The SRE Observability Lead Engineer is a hands-on leader responsible for shaping and delivering the future of Observability across Services Technology. This role reports into the Head of SRE Services and sits within a small central enablement team. You will define the long-term vision, build and scale modern observability capabilities across business lines, and lead a small team of SREs delivering reusable observability services. This is a blended leadership and engineering role: the ideal candidate pairs strategic vision with the technical depth to resolve real-world telemetry challenges across on-prem, cloud, and container-based environments (ECS, Kubernetes, etc.).
You’ll work closely with architecture and other engineering functions not only to resolve common challenges affecting SREs aligned to lines of business (LoBs), but also to ensure observability is embedded as a non-functional requirement (NFR) for all new services going live. You will collaborate with platform and infrastructure teams to ensure solutions are enterprise-scale rather than siloed. You will also be responsible for managing a small, high-impact team of SREs based in your region.
This role requires a comprehensive understanding of observability challenges across Services (Payments, Securities Services, Trade, Digital & Data) and the ability to influence outcomes at the enterprise level. Strong commercial awareness, technical credibility, and excellent communication skills are essential to negotiate internally, influence peers, and drive change. Some external communication may be necessary.
Responsibilities
- Define and own the strategic vision and multi-year roadmap for Observability across Services Technology, aligned with enterprise reliability and production goals.
- Translate strategy into an actionable delivery plan in partnership with the Services Architecture & Engineering function, delivering incremental, high-value milestones toward a unified, scalable observability architecture.
- Lead and mentor SREs across Services, fostering technical growth and an SRE mindset.
- Build and offer a suite of central observability services across LoBs – including standardized telemetry libraries, onboarding templates, dashboard packs, and alerting standards.
- Drive reusability and efficiency by creating common patterns and golden paths for observability adoption across critical client flows and platforms.
- Partner with infrastructure, CTO, and other SMBF tooling teams to ensure observability tooling is scalable, resilient, and avoids duplication (“cottage industries”).
- Work hands-on to troubleshoot telemetry and instrumentation issues across on-prem, cloud (AWS, GCP, etc.), and ECS/Kubernetes-based environments.
- Collaborate closely with the architecture function to support implementation of observability NFRs in the SDLC, ensuring new apps go live with sufficient coverage and insight.
- Support SRE Communities of Practice (CoP) and foster strong relationships with SREs, developers, and platform leads across Services and beyond to accelerate adoption and promote SRE best practices such as SLO adoption and capacity planning.
- Use Jira/Agile workflows to track and report on observability maturity across Services LoBs – coverage, adoption, and contribution to improved client experience.
- Remove inefficiencies and provide solutions that enable unified views of consolidated SLOs for critical end-to-end (E2E) client journeys across Payments and other Services.
- Influence and align senior stakeholders across functions (applications, infrastructure, controls, and audit) to drive observability investment for critical client flows across Services.
- Represent Services in working groups to influence enterprise observability standards, ensuring feedback from Services is reflected.
- Lead people management responsibilities for your direct team, including management of headcount, goal setting, performance evaluation, compensation, and hiring.
- Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behaviour, conduct and business practices, and escalating, managing and reporting control issues with transparency, as well as effectively supervising the activity of others and creating accountability with those who fail to maintain these standards.
Qualifications
- Relevant experience in Observability, SRE, Infrastructure Engineering, or Platform Architecture, including several years in senior leadership roles.
- Deep expertise in observability tools and stacks such as Grafana, Prometheus, OpenTelemetry, ELK, Splunk, and similar platforms.
- Strong hands-on experience across hybrid infrastructure, including on-prem, cloud (AWS, GCP, Azure), and container platforms (ECS, Kubernetes).
- Proven ability to design scalable telemetry and instrumentation strategies, resolve production observability gaps, and integrate them into large-scale systems.
- Experience leading teams and managing people across geographically distributed locations.
- Strong ability to influence platform, cloud, and engineering leaders to ensure observability tooling is built for reuse and scale.
- Deep understanding of SRE fundamentals, including SLIs, SLOs, error budgets, and telemetry-driven operations.
- Strong collaboration skills and experience working across federated teams, building consensus and delivering change.
- Ability to stay up to date with industry trends and apply them to improve internal tooling and design decisions.
- Excellent written and verbal communication skills; able to influence and articulate complex concepts to technical and non-technical audiences.
Education
- Bachelor’s or Master’s degree in Computer Science, Engineering, Information Systems, or a related technical field.
What we’ll provide you
By joining Citi, you will be part of a business casual workplace with a hybrid working model (up to 2 days working at home per week) and receive a competitive base salary (reviewed annually), along with a range of additional benefits such as:
- 27 days annual leave (plus bank holidays)
- A discretionary annual performance-related bonus
- Private Medical Care & Life Insurance
- Employee Assistance Program
- Pension Plan
- Paid Parental Leave
- Special discounts for employees, family, and friends
- Access to an array of learning and development resources
Alongside these benefits, Citi is committed to ensuring our workplace is a place where everyone feels comfortable coming to work as their whole self, every day. We want the best talent around the world to be energized to join us, motivated to stay and empowered to thrive.
SRE Observability Lead Engineer - Senior Vice President employer: Citigroup, Inc.
Contact Detail:
Citigroup, Inc. Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land SRE Observability Lead Engineer - Senior Vice President
✨Tip Number 1
Network like a pro! Reach out to your connections in the industry, especially those who work at companies you're interested in. A friendly chat can open doors and give you insider info on job openings.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects and contributions to observability tools. This gives potential employers a taste of what you can bring to the table.
✨Tip Number 3
Prepare for interviews by practising common SRE scenarios and technical questions. We recommend doing mock interviews with friends or using online platforms to get comfortable with the format.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about their job search!
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience in observability and SRE. We want to see how your skills align with the role, so don’t hold back on showcasing your relevant achievements!
Showcase Your Leadership Skills: Since this is a leadership role, it’s crucial to demonstrate your ability to lead and mentor teams. Share examples of how you've influenced outcomes and driven change in previous positions – we love to see that!
Be Clear and Concise: When writing your application, keep it clear and to the point. Use straightforward language to explain your technical expertise and how it relates to the challenges mentioned in the job description. We appreciate clarity!
Apply Through Our Website: We encourage you to apply directly through our website for the best chance of getting noticed. It’s the easiest way for us to track your application and ensure it reaches the right people!
How to prepare for a job interview at Citigroup, Inc.
✨Know Your Observability Tools
Make sure you’re well-versed in the observability tools mentioned in the job description, like Grafana, Prometheus, and OpenTelemetry. Be ready to discuss your hands-on experience with these tools and how you've used them to solve real-world telemetry challenges.
✨Showcase Leadership Experience
Since this role involves leading a small team of SREs, prepare examples that highlight your leadership skills. Think about times when you’ve mentored others or driven change within a team, and be ready to share those stories during the interview.
✨Understand the Business Context
Familiarise yourself with the specific business lines mentioned, such as Payments and Securities Services. Being able to articulate how observability impacts these areas will show that you understand the broader implications of your work.
✨Prepare for Technical Questions
Expect technical questions that assess your understanding of SRE fundamentals, SLIs, SLOs, and error budgets. Brush up on these concepts and be prepared to explain how you’ve applied them in past roles, especially in hybrid infrastructure environments.