Site Reliability Engineer, K8s

PulsePoint

Full-Time | £50,000 to £70,000 per year (est.) | No working from home possible

At a Glance

Tasks: Monitor and maintain uptime of GCP-hosted APIs, lead incident response, and manage observability infrastructure.
Company: Join WebMD, a leader in health information, committed to diversity and inclusion.
Benefits: Competitive salary, flexible work hours, and opportunities for professional growth.
Other info: Collaborative team environment with a focus on innovation and career advancement.
Why this job: Make a real impact on service reliability while working with cutting-edge cloud technologies.
Qualifications: 2+ years in Site Reliability or DevOps, with practical GCP experience.

The predicted salary is between £50,000 and £70,000 per year.

Position Overview

Our BI team runs a set of GCP-based APIs and data services that a lot of internal products depend on. As we've grown, keeping things running has increasingly been a side responsibility for engineers who are primarily building features — and that's not sustainable. We're looking for an SRE to own that space: service health, incident response, infrastructure monitoring, and making sure we're not blindly burning cloud budget.

Responsibilities

Monitor and maintain uptime of GCP-hosted APIs and services, keeping performance within agreed targets
Lead incident response for BI platform services — triage, resolve, and follow up with post-mortems that actually prevent recurrence
Build and manage observability infrastructure: dashboards, alerts, and logging across GCP services
Track GCP cloud spend and set up cost alerting to flag anomalies before they become problems
Review and fix security gaps — IAP configs, service account permissions, API access controls
Work with data and backend engineers to shore up reliability of data pipelines and BigQuery workflows
Contribute to infrastructure-as-code and help keep deployments documented and reproducible

Qualifications

2+ years in a Site Reliability, DevOps, or Cloud Infrastructure role in a production environment
Bachelor's degree in Computer Science, Engineering, or related field, or equivalent hands‑on experience
Practical experience with GCP — Cloud Run, API Gateway, and BigQuery in particular
Experience with monitoring and observability tooling (Cloud Monitoring, Datadog, or similar)
Solid grasp of cloud security fundamentals — IAM, network controls, access management
Proficiency with Git and version control in a team setting

Preferred Skills

CI/CD pipelines and deployment automation (GitHub Actions, Cloud Build, or similar)
Terraform or other infrastructure-as-code tools
Python for scripting or automation
MySQL, Spanner, or BigQuery at any meaningful depth
GCP cost management and spend optimization
Experience with dbt or Looker
Comfortable working across CET/EST hours in a distributed team

Site Reliability Engineer, K8s employer: PulsePoint

WebMD is an exceptional employer that fosters a culture of inclusivity and innovation, making it an ideal place for Site Reliability Engineers to thrive. With a strong commitment to employee growth, we offer opportunities for professional development and hands-on experience with cutting-edge technologies in a collaborative environment. Our focus on work-life balance and a supportive team dynamic ensures that you can contribute meaningfully while enjoying the benefits of working in a dynamic and rewarding setting.

Contact Details:

PulsePoint Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Site Reliability Engineer, K8s role

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with other SREs on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to GCP, monitoring tools, or infrastructure-as-code. This gives potential employers a taste of what you can do.

✨Tip Number 3

Prepare for interviews by brushing up on incident response scenarios and cloud security fundamentals. Be ready to discuss how you've tackled challenges in previous roles, especially around uptime and cost management.

✨Tip Number 4

Don't forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about their job search!

We think you need these skills to ace the Site Reliability Engineer, K8s role

GCP
Cloud Run
API Gateway
BigQuery
Monitoring and Observability Tooling
Cloud Monitoring
Datadog
Cloud Security Fundamentals
IAM
Network Controls
Access Management
Git
CI/CD Pipelines
Deployment Automation
Terraform
Python

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with GCP, incident response, and any relevant projects that showcase your skills in monitoring and observability.

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about SRE and how your background aligns with our needs. Don’t forget to mention specific tools and technologies you’ve worked with.

Showcase Your Problem-Solving Skills: In your application, give examples of how you've tackled challenges in previous roles. Whether it's improving uptime or managing cloud costs, we want to see your thought process and solutions!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy!

How to prepare for a job interview at PulsePoint

✨Know Your GCP Inside Out

Make sure you brush up on your knowledge of Google Cloud Platform, especially Cloud Run, API Gateway, and BigQuery. Be ready to discuss how you've used these tools in past projects and how they relate to the responsibilities of the role.

✨Showcase Your Incident Response Skills

Prepare examples of how you've handled incident response in previous roles. Discuss specific incidents, your approach to triage, resolution, and what you learned from post-mortems. This will demonstrate your ability to lead in high-pressure situations.

✨Demonstrate Monitoring and Observability Expertise

Familiarise yourself with monitoring tools like Cloud Monitoring or Datadog. Be prepared to talk about how you've built observability infrastructure in the past, including dashboards and alerts, and how this has improved service reliability.

✨Highlight Your Cost Management Experience

Discuss any experience you have with tracking cloud spend and setting up cost alerting. Share specific strategies you've implemented to optimise costs and prevent budget overruns, as this is crucial for the role.
