Staff Software Engineer - Databases SRE | UK | Remote
United Kingdom (Remote)

Full-Time | £103,958 - £124,750 / year (est.) | Working from home possible
Grafana Labs

At a Glance

  • Tasks: Develop and maintain reliable cloud databases for top-tier customers using innovative technologies.
  • Company: Join Grafana Labs, a leader in open observability with a global remote culture.
  • Benefits: Enjoy competitive salary, equity, bonus opportunities, and 30 days annual leave.
  • Other info: Collaborative environment with clear career growth pathways and approachable leadership.
  • Why this job: Make a real impact on cutting-edge projects while working remotely with a passionate team.
  • Qualifications: 8+ years in engineering, strong SRE experience, and expertise in Kubernetes and cloud platforms.

The predicted salary is between £103,958 and £124,750 per year.

Grafana Labs, the company behind the open observability cloud, is founded on the principles of open source, open standards, open ecosystems, and open culture. Grafana Cloud, our fully managed observability platform, is flexible and built for scale. With Grafana Cloud's actually useful AI, organizations can see, understand, and act on all their disparate data to move at the speed of their ambitions. Today, more than 35 million users and 7,000+ customers – including Anthropic, Bloomberg, NVIDIA, Microsoft, and Salesforce – trust Grafana Labs to ensure reliability of their applications and systems, resolve incidents quickly, and optimize their telemetry to reduce noise and cost. We are a 100% remote company with 1,600+ team members across 40+ countries.

About the role: We are looking for a Staff Software Engineer - SRE to help us support our highest-value Grafana Cloud customers by increasing the reliability of our Cloud databases, which are based on Mimir, Loki, Tempo, and Pyroscope. We provide these databases as a SaaS product from AWS, GCP, and Azure across all regions. The SRE team is embedded within the Mimir, Loki, and Tempo squads and focuses on ensuring that Grafana Cloud’s database products deliver exceptional reliability for our highest-SLA customers.

In this role, you will:

  • Partner closely with product engineering squads (embedded model)
  • Own production reliability for high-SLA and complex customer environments
  • Design and implement automation to scale our reliability practices
  • Define and evolve per-tenant SLOs and reliability models
  • Proactively reduce SLO burn to prevent repeat incidents
  • Serve as a primary escalation point and on-call for relevant incidents
  • Lead customer-impacting incident response and post-incident reviews
  • Contribute to design docs and code reviews
  • Influence feature design to ensure production scalability and operability
  • Build automation to eliminate toil where needed
  • Improve alert quality and reduce noisy escalations

What we seek:

  • 8+ years engineering experience, 4+ in SRE/CRE/production engineering.
  • Strong preference for those with formal customer reliability engineering experience.
  • Strong Kubernetes experience in AWS, GCP, or Azure, and familiarity with infrastructure-as-code tooling (Helm, Terraform, Jsonnet, etc.).
  • Strong experience with technical leadership, leading a team through projects, mentoring other engineers on the team and serving as a force-multiplier.
  • Experience operating multi-tenant systems in production.
  • Strong experience designing and implementing SLOs.
  • Experience with one or more programming languages (e.g. Go, Python, Java).
  • Experience with Linux operating systems internals, and some knowledge of networking, cloud storage, and scaling.
  • Excellent problem-solving and troubleshooting skills.
  • Experience with calmly and actively participating in blame-free Incident Response, following up on actions, and writing high quality PIRs (Post Incident Reviews).
  • Ability to reason about performance, scaling, and failure modes.
  • Comfortable working within an engineering team where individuals are encouraged to have a strong sense of autonomy and self-direction.
  • Ability to partner deeply with product engineering teams.
  • We highly value those who are intellectually curious, who default to transparency, possess a high bias toward action, and who are also kind.

Your day-to-day will include:

  • Regular 1:1s with your manager and colleagues.
  • Reviewing and creating SLOs, and proactively investigating ways to further reduce error budget burn for those SLOs. This work can be self-directed or come out of incident learnings, and may include improvements to monitoring, automation, self-healing, auto-scaling, etc.
  • Improving observability of customers within their environments.
  • Designing and implementing solutions to ensure the reliability and scalability of our environments can meet rapidly increasing demands.
  • Developing fault-tolerant design patterns, ensuring that we consider reliability at all stages of the service lifecycle.
  • Collaborating with our Engineering Leaders to help define and influence product strategy, roadmaps and technical designs.
  • Participating in PR reviews and collaborating with other engineers on their design docs.
  • Teaching others about Site Reliability Engineering and communicating best practices to be applied early in the development of new features and functionality.
  • Participating in Incident Response when applicable, from investigation through to resolution, PIR, and communication with customers via bridge calls where necessary.

Compensation and benefits: In the UK, the base compensation range for this role is £103,958 - £124,750. Actual compensation may vary based on level, experience, and skillset as assessed in the interview process. Benefits include equity, bonus (if applicable) and other benefits.

Why you’ll thrive at Grafana Labs:

  • 100% Remote, Global Culture – As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.
  • Scaling Organization – Tackle meaningful work in a high-growth, ever-evolving environment.
  • Transparent Communication – Expect open decision-making and regular company-wide updates.
  • Innovation-Driven – Autonomy and support to ship great work and try new things.
  • Open Source Roots – Built on community-driven values that shape how we work.
  • Empowered Teams – High trust, low ego culture that values outcomes over optics.
  • Career Growth Pathways – Defined opportunities to grow and develop your career.
  • Approachable Leadership – Transparent execs who are involved, visible, and human.
  • Passionate People – Join a team of smart, supportive folks who care deeply about what they do.
  • Balance is Key – We operate a global annual leave policy of 30 days per annum. 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect.

Equal Opportunity Employer: Grafana Labs is an equal opportunities employer. We welcome applications from everyone regardless of race, colour, nationality, origin, caste, sex, gender reassignment identity or expression, sexual orientation, age, religion or belief, disability, veteran status, genetic information, pregnancy, maternity, marital, family or carer status, or any other characteristic which is protected by local law. We believe that equality and diversity build a strong organisation, and we work hard to ensure that is the foundation of our organisation as we grow.

Staff Software Engineer - Databases SRE | UK | Remote employer: Grafana Labs

Grafana Labs is an exceptional employer that champions a 100% remote work culture, fostering collaboration among a diverse team of over 1,600 members across 40+ countries. With a strong emphasis on career growth, transparent communication, and innovation, employees are empowered to tackle meaningful challenges while enjoying a generous annual leave policy and a supportive environment that values autonomy and teamwork.

Contact Details:

Grafana Labs Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Staff Software Engineer - Databases SRE | UK | Remote role

Join Local Tech Meetups

Get out there and mingle with fellow developers by joining local tech meetups. It’s a fantastic way to meet people who might be working at Grafana Labs or know someone who does. Plus, you can pick up new skills and keep up with tech trends while you're at it!

Contribute to Open Source Projects

Show off your coding chops by jumping into open-source projects. Not only does this give you practical experience, but it also gets you noticed in the dev community. You'll create a killer portfolio that speaks volumes about your skills to Grafana Labs.

Tap into Online Developer Communities

Don’t underestimate the power of online developer communities like GitHub, Stack Overflow, and even Reddit. Participate in discussions, share your projects, and build your visibility. You can often find opportunities through these channels that can lead to a full-time gig at companies like Grafana Labs.

Explore Job Boards Specifically for Tech Roles

Keep your eyes peeled on job boards that focus on tech roles. Sites like TechCareers or Stack Overflow Jobs can often have listings for companies like Grafana Labs that might not show up on broader job sites. Make it a habit to check these regularly, and don’t hesitate to apply directly through our website!

We think you need these skills to ace the Staff Software Engineer - Databases SRE | UK | Remote role

Site Reliability Engineering (SRE)
Kubernetes
AWS
GCP
Azure
Infrastructure-as-Code (Helm, Terraform, Jsonnet)
Technical Leadership

Some tips for your application 🫡

Show off your coding skills: When applying for a software engineering role, it's super important to showcase your coding skills. Make sure your CV includes your tech stack, any relevant programming languages you’re comfortable with, and examples of projects you've worked on. If you have a GitHub profile, link it up! We love to see code in action.

Tailor your portfolio: For a full-time role, we’d expect to see some solid examples of your work in your portfolio. Make sure to include at least two or three projects that highlight your problem-solving skills and your ability to work with different technologies. Focus on the projects that are most relevant to the position at Grafana Labs.

Craft a killer cover letter: Your cover letter is your chance to stand out, so make it personal! Explain why you want to work at Grafana Labs and how your skills align with the role. Show us your passion for software development. We dig enthusiastic candidates who understand the value of collaboration and continuous learning!

Be clear and concise: When it comes to writing your CV and cover letter, clarity is key. Avoid jargon that could confuse us and stick to simple, direct language. Highlight your achievements with quantifiable results where possible, and keep everything easy to read. A well-organised application goes a long way!

How to prepare for a job interview at Grafana Labs

Brush Up on Your Coding Skills

For a full-time software engineering role, it's crucial to stay sharp with your coding abilities. Expect technical questions that might involve solving problems on the spot or discussing algorithms. Practise on platforms like LeetCode or HackerRank to get comfortable with the types of questions that often come up.

Know Your Tools and Frameworks

Make sure you’re well-acquainted with the tools and technologies listed in the job description. Familiarise yourself with any specific frameworks or programming languages mentioned. This posting names Kubernetes, Helm, Terraform, and Jsonnet, along with languages such as Go and Python, so be ready to discuss how you’ve used them in previous projects or coursework.

Showcase Your Projects

Bring along a portfolio that highlights your best work. This could be code samples, GitHub repositories, or any side projects you’ve built. Make sure you can talk through your thought process for each project, especially the challenges you faced and how you solved them, as this shows your problem-solving skills in action.

Prepare for Behavioural Questions

While technical skills are key, full-time positions also require cultural fit. Be ready to discuss your previous experiences and how you handle teamwork, conflict, and deadlines. Brush up on the STAR method (Situation, Task, Action, Result) to clearly articulate your past experiences when discussing how you've contributed to a team.