Site Reliability Engineer (K8s)

PulsePoint

Full-Time, on-site (no remote work)

Responsibilities

Our BI team runs a set of GCP-based APIs and data services that many internal products depend on.
As we've grown, keeping things running has increasingly become a side responsibility for engineers who are primarily building features, and that's not sustainable.
We're looking for an SRE to own that space: service health, incident response, infrastructure monitoring, and making sure we're not blindly burning cloud budget.
The Site Reliability Engineer will ensure the availability, performance, and security of the Business Intelligence team’s GCP-hosted APIs and data infrastructure.
This role is responsible for proactive monitoring, incident response, and continuous improvement of platform reliability across a cloud-native stack.
The engineer will work closely with backend and data engineers to maintain service health and drive operational excellence.
This position also carries responsibility for GCP cost visibility, helping the team track and optimize cloud spend through structured monitoring and alerting.
Monitor and maintain uptime of GCP-hosted APIs and services, keeping performance within agreed targets
Lead incident response for BI platform services — triage, resolve, and follow up with post‑mortems that actually prevent recurrence
Build and manage observability infrastructure: dashboards, alerts, and logging across GCP services
Track GCP cloud spend and set up cost alerting to flag anomalies before they become problems
Review and fix security gaps — IAP configs, service account permissions, API access controls
Work with data and backend engineers to shore up reliability of data pipelines and BigQuery workflows
Contribute to infrastructure‑as‑code and help keep deployments documented and reproducible

Benefits

Dollars and Sense: 401(k) match
Happy + Healthy: Comprehensive medical plans, affordable medical, dental and vision options, 100%-paid life & disability insurance
Break a Sweat: Free virtual fitness classes, Better Yourself Wellness program
Always Learning: Generous annual tuition reimbursement, ongoing team trainings
Take a Load Off: Paid vacation, sick time, and company holidays (including a floating holiday)
Good Ol’ Fun: Team‑building events, happy hours, holiday celebrations, and more!

Qualifications

Practical experience with GCP — Cloud Run, API Gateway, and BigQuery in particular
Proficiency with Git and version control in a team setting
2+ years in a Site Reliability, DevOps, or Cloud Infrastructure role in a production environment
Bachelor’s degree in Computer Science, Engineering, or related field, or equivalent hands‑on experience
Experience with monitoring and observability tooling (Cloud Monitoring, Datadog, or similar)
Solid grasp of cloud security fundamentals — IAM, network controls, access management
Terraform or other infrastructure‑as‑code tools
CI/CD pipelines and deployment automation (GitHub Actions, Cloud Build, or similar)
Python for scripting or automation
MySQL, Spanner, or BigQuery at any meaningful depth
Experience with dbt or Looker
GCP cost management and spend optimization
Comfortable working across CET/EST hours in a distributed team

Contact Details:

PulsePoint Recruitment Team
