Senior Lead Site Reliability Engineer in Glasgow

Glasgow | Full-Time | 80000 - 100000 € / year (est.) | No home office possible
JPMorganChase

At a Glance

  • Tasks: Lead an agile team to enhance reliability and observability for critical platforms.
  • Company: Join JPMorgan Chase, a leader in technology and finance.
  • Benefits: Competitive salary, health benefits, and opportunities for professional growth.
  • Other info: Dynamic environment with a focus on diversity and inclusion.
  • Why this job: Make a significant impact on market-leading technology products.
  • Qualifications: Advanced knowledge in site reliability and proficiency in programming languages.

The predicted salary is between 80000 and 100000 € per year.

Be an integral part of an agile team that's constantly pushing the envelope to enhance, build, and deliver top-notch reliability and observability for our most critical platforms. As a Senior Lead Site Reliability Engineer at JPMorgan Chase within the Commercial & Investment Bank, you help that team deliver trusted, market-leading technology products in a secure, stable, and scalable way. Drive significant business impact through your capabilities and contributions, and apply deep technical expertise and problem-solving methodologies to tackle a diverse array of reliability, observability, and performance challenges that span multiple technologies and applications.

Job responsibilities

  • Regularly provides technical guidance and direction on site reliability practices to support the business and its technical teams, contractors, and vendors.
  • Develops secure and high-quality production code for reliability tooling and telemetry pipelines, and reviews and debugs code written by others.
  • Drives decisions that influence reliability design, observability architecture, application functionality, and technical operations and processes.
  • Serves as a function-wide subject matter expert in one or more areas of site reliability, observability, or telemetry engineering.
  • Leads resiliency design reviews and breaks down complex reliability problems into digestible work for other engineers, acting as a technical lead for large products.
  • Acts as the main point of contact during major incidents, demonstrating the skills to identify and solve issues quickly to avoid financial losses, and champions blameless postmortem culture.
  • Collaborates with team members and stakeholders to define comprehensive service level indicators, service level objectives, and error budgets.
  • Designs, implements, and maintains operational reliability for large-scale OpenTelemetry pipelines on hybrid on-prem/cloud environments, supporting telemetry ingestion, processing, and export to backends such as InfluxDB, Prometheus, Elasticsearch, and OpenSearch.
  • Drives the assessment, refactoring, and incremental migration of custom legacy telemetry collection code to standardized OpenTelemetry instrumentation, reducing technical debt while maintaining system stability.
  • Actively contributes to the engineering community as an advocate of firmwide frameworks, tools, and practices, and influences peers and project decision-makers to consider the use and application of leading-edge observability and reliability technologies.
  • Adds to the team culture of diversity, opportunity, inclusion, and respect.

Required qualifications, capabilities, and skills

  • Formal training or certification on software engineering concepts and advanced applied experience delivering system design, application development, testing, and operational stability.
  • Advanced knowledge of reliability, scalability, performance, security, enterprise system architecture, toil reduction, and other site reliability best practices, with considerable in-depth knowledge in one or more technical disciplines (e.g., cloud, observability, distributed systems, etc.).
  • Advanced proficiency in one or more programming languages (e.g., Java, Python, Go, etc.).
  • Advanced proficiency and experience in observability practices such as white-box and black-box monitoring, SLO alerting, and telemetry collection using tools such as Grafana, Dynatrace, Prometheus, Datadog, Splunk, Elasticsearch, etc.
  • Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform, etc.).
  • Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker, etc.).
  • Hands-on experience with the design, deployment, and operation of OpenTelemetry collectors in production environments, focusing on technical aspects such as configuring, optimizing, and troubleshooting OTLP endpoints and receivers.
  • Ability to tackle reliability design and functionality problems independently with little to no oversight.
  • Practical cloud native experience.
  • Ability to expand and collaborate across different levels and stakeholder groups.

Preferred qualifications, capabilities, and skills

  • Knowledge of distributed tracing, metrics, and logging best practices.
  • Certification in AWS, Kubernetes, or relevant technologies.
  • Proven track record in system health monitoring, capacity management, and blameless postmortems for high-availability services.
  • Deep understanding of distributed system design principles, networking (TCP/IP, DNS, load balancing), and Linux internals.
  • Contributions to open-source observability or telemetry projects.
  • Experience working with agent control planes and management protocols; hands-on knowledge of OpAMP is highly desirable.

Senior Lead Site Reliability Engineer in Glasgow employer: JPMorganChase

At JPMorgan Chase, we pride ourselves on fostering a dynamic and inclusive work environment where innovation thrives. As a Senior Lead Site Reliability Engineer, you will not only contribute to cutting-edge technology solutions but also benefit from extensive professional development opportunities and a culture that values diversity and collaboration. Located in a vibrant financial hub, our team is dedicated to driving impactful change while ensuring a supportive atmosphere for all employees.

Contact Detail:

JPMorganChase Recruiting Team

StudySmarter Expert Advice

We think this is how you could land the Senior Lead Site Reliability Engineer role in Glasgow

✨Tip Number 1

Network like a pro! Attend industry meetups, webinars, or conferences related to site reliability engineering. It's a great way to connect with potential employers and learn about job openings that might not be advertised.

✨Tip Number 2

Show off your skills! Create a portfolio showcasing your projects, especially those involving reliability tooling and observability. This gives you a chance to demonstrate your expertise in a tangible way during interviews.

✨Tip Number 3

Practice makes perfect! Prepare for technical interviews by solving coding challenges and system design problems. Focus on topics like cloud architecture and telemetry pipelines to align with the role's requirements.

✨Tip Number 4

Apply through our website! We love seeing candidates who are genuinely interested in joining us. Tailor your application to highlight your experience with observability tools and site reliability practices to stand out.

We think you need these skills to ace the Senior Lead Site Reliability Engineer role in Glasgow

Site Reliability Engineering
Observability
Telemetry Engineering
Production Code Development
Resiliency Design
Incident Management
Service Level Indicators (SLIs)

Some tips for your application

Tailor Your CV: Make sure your CV reflects the skills and experiences that match the job description. Highlight your expertise in site reliability, observability, and any relevant technologies to show us you're the perfect fit!

Craft a Compelling Cover Letter: Use your cover letter to tell us why you’re passionate about site reliability engineering. Share specific examples of how you've tackled challenges in the past and how you can contribute to our agile team.

Showcase Your Technical Skills: Don’t hold back on showcasing your technical prowess! Mention your proficiency in programming languages and tools like OpenTelemetry, Grafana, or Kubernetes. We want to see how you can drive significant business impact with your skills.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates during the process!

How to prepare for a job interview at JPMorganChase

✨Know Your Tech Inside Out

Make sure you brush up on your technical skills, especially in areas like cloud, observability, and distributed systems. Be ready to discuss your experience with tools like Grafana, Prometheus, and OpenTelemetry, as well as your proficiency in programming languages like Java or Python.

✨Showcase Problem-Solving Skills

Prepare to share specific examples of how you've tackled complex reliability issues in the past. Think about times when you acted as a technical lead or resolved major incidents, and be ready to explain your thought process and the impact of your decisions.

✨Understand the Company Culture

Familiarise yourself with JPMorgan Chase's values around diversity, inclusion, and respect. Be prepared to discuss how you can contribute to this culture and how your experiences align with their commitment to these principles.

✨Ask Insightful Questions

Prepare thoughtful questions that show your interest in the role and the company. Inquire about their current challenges in site reliability or how they measure success in their observability practices. This not only demonstrates your enthusiasm but also helps you gauge if the company is the right fit for you.