Backend Engineer - Cloud Platform & Stack Automation

Full-Time · £72,000 - £90,000 / year (est.) · Remote
Grafana Labs

At a Glance

  • Tasks: Design and build systems for Grafana Cloud stacks, ensuring reliability and efficiency.
  • Company: Join Grafana Labs, a remote-first, open-source powerhouse with a global collaborative culture.
  • Benefits: Enjoy competitive salary, RSUs, and opportunities for professional growth.
  • Other info: Work remotely with a diverse team and contribute to open-source projects.
  • Why this job: Make a real impact by solving complex problems in a dynamic tech environment.
  • Qualifications: Experience in SaaS platforms, Golang, and a passion for developer experience.

The predicted salary is between £72,000 and £90,000 per year.

This role is available for candidates located in the UK, Germany, Spain, Ireland and Sweden.

Overview

Grafana Labs is a remote‑first, open‑source powerhouse. There are more than 20M users of Grafana, the open‑source visualization tool, around the globe, monitoring everything from beehives to climate change in the Alps. The instantly recognizable dashboards have been spotted everywhere from a NASA launch and Minecraft HQ to Wimbledon and the Tour de France. We’re scaling fast and staying true to what makes us different: an open‑source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation‑driven environment where transparency, autonomy, and trust fuel everything we do. You may not meet every requirement, and that’s okay. If this role excites you, we’d love you to raise your hand for what could be a truly career‑defining opportunity.

The Opportunity

Application Core Services (AppCore) is a group within Platform, in the Foundations department. Foundations produces the Internal Engineering Platform (IEP) and partners closely with our Cloud, Enterprise, and Grafana teams. Our team develops the essential systems driving Grafana’s business operations. We utilize the grafana.com platform to engineer bespoke integrations and solutions that unify the diverse technical ecosystem of a modern software enterprise. The team owns important domain areas that help keep both our customer workflows and internal business processes running smoothly. AppCore is made up of multiple squads, each focused on one or more of these domains. Our work includes maintaining the billing engine responsible for customer usage calculation, automating provisioning after a customer signs a contract, integrating with cloud marketplaces such as AWS, Azure, and GCP, and building and maintaining the user portal our customers rely on to manage their accounts. This is a team working at the intersection of product, platform, and business operations. The systems we build are critical to how Grafana scales. We are looking for engineers who enjoy solving complex workflow and systems problems, improving reliability and developer experience, and building software that directly supports both customers and internal stakeholders.

What You’ll Be Doing

The AppCore Stacks squad owns the systems that create, configure, reconcile, migrate, and operate Grafana Cloud stacks at scale. A stack is the customer‑facing Grafana Cloud environment that connects an organization to Grafana and the backend services it uses, including Mimir, Loki, Tempo, plugins, dashboards, data sources, and stack‑level configuration. Our work sits at the intersection of product, platform, and operations. We build the control‑plane services and workflows that keep stack state aligned across grafana.com, Stack State Service (SSS), Hosted Grafana, cloud regions, and the underlying Grafana Cloud infrastructure. When this domain works well, customers get reliable stack creation, safe configuration rollout, predictable migrations, and fewer manual operational interventions.

  • Design, build, and operate reconciliation systems, including the SSS backend, to track desired stack state, detect and repair drift across stack templates, grafana.com state, Hosted Grafana, and actual customer stack configuration.
  • Collaborate across SSS, grafana.com, and deployment configurations to ensure stack lifecycle workflows remain reliable, observable, and resilient.
  • Improve operational efficiency by reducing deployment complexity (e.g., aiming for single PR regional SSS deployment) and contributing to the Stack Config Reconciliation project.
  • Manage rollout mechanisms for provisioned plugins, dashboards, data sources, Grafana versions, release channels, and stack‑level configuration.
  • Support new region and cluster rollouts, including the operational paths required to bring stacks online safely in new Grafana Cloud regions.
  • Improve incident response and recovery paths for stack misalignment, reconciliation failures, plugin rollout issues, and Hosted Grafana integration failures.
  • Partner with Product, Hosted Grafana, Infrastructure, Support, and adjacent AppCore squads on customer‑impacting stack lifecycle work.
  • Contribute to roadmap planning, technical design, OnCall improvements, and long‑term simplification of stack operations.

You will help own the production behavior of the systems you build. That includes improving runbooks, dashboards, alerts, reconciliation safety, rollout controls, and recovery procedures. You should be comfortable debugging across service boundaries and making careful changes in systems that affect customer stacks.

What Makes You a Great Fit

At Grafana, we actively embrace AI‑assisted and agentic development practices, integrating these technologies into both our engineering workflows and the systems we deliver. We encourage our engineers to thoughtfully leverage AI tools to enhance every stage of the lifecycle, from design and implementation to testing, documentation, and operations. We also look for strategic opportunities to embed agentic capabilities within our services to eliminate toil, bolster reliability, and ensure that complex customer workflows remain resilient and safe.

We are seeking a Backend Engineer who thrives on building production systems where correctness, scalability, and operational clarity are paramount. Because we are a remote‑first organization, you should be comfortable collaborating asynchronously across time zones and taking full ownership of the critical systems powering Grafana Cloud. Our team is small and operates with a high degree of independence; you will be expected to lead major projects, coordinate across service boundaries, and help define the technical direction for our domain.

You will be particularly successful in this role if you enjoy solving challenges related to stateful systems, eventual consistency, and reconciliation loops. We value engineers who can take ambiguous lifecycle requirements and transform them into explicit, modular solutions. You should be adept at breaking down complex systems work into safe, iterative increments while clearly communicating technical tradeoffs to both internal stakeholders and adjacent product teams.

  • Writing efficient, readable, and easy-to-maintain code.
  • Implementing new microservices or systems.
  • Collaborating with teammates and other departments to reach consensus on proposed solutions.
  • Coordinating with product and UX when needed.
  • Responding to customer requests and feedback.
  • When ready, participating in our follow‑the‑sun OnCall rotation.
  • Participating in team decisions, such as roadmap planning and prioritization.

Requirements

  • You have at least 1 year of fully remote work experience.
  • You have some experience working on a SaaS platform and are familiar with common distributed systems concepts (e.g., scalability, multi‑tenancy, HA).
  • You have professional experience with Golang and are willing to work across both backend service and application code.
  • You care deeply about developer and user experience and the quality of the products that you work on.
  • You have some experience contributing to the delivery of projects, from initial brainstorming to shipping a product to the customer.
  • You write clean, well‑tested software that other engineers can understand, operate, and maintain.
  • You can take on well‑defined tasks, break them down, and execute iteratively to deliver working solutions and gather feedback.
  • You are willing to collaborate across teams and ensure your work is aligned with the needs of other squads and external stakeholders.
  • You are familiar with Kubernetes in AWS, GCP, or Azure, and have exposure to infrastructure‑as‑code tooling (Helm, Terraform, Jsonnet, etc.).
  • You have experience participating in blameless incident response and contributing to post‑incident reviews.

Bonus Points For

  • Experience with TypeScript/Node.js.
  • Experience with Kubernetes control‑plane patterns, operators, reconcilers, or desired‑state systems.
  • Experience with Jsonnet/Tanka, Terraform, Flux, Argo, or similar deployment/configuration tooling.
  • Experience working on SaaS provisioning, tenancy, regional expansion, plugin rollout, or customer lifecycle systems.
  • Experience with incident response involving configuration drift, partial failure, or cross‑service state mismatch.

Compensation & Rewards

In the United Kingdom, the compensation range for this role is GBP 72K - GBP 90K. Actual compensation may vary based on level, experience, and skillset as assessed throughout the interview process. All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs’ success. We believe in shared outcomes: RSUs help us stay aligned and invested as we scale globally. Compensation ranges are country-specific. If you are applying for this role from a location other than those listed above, your recruiter will discuss your specific market’s defined pay range & benefits at the beginning of the process.

Equal Opportunity Employer

We will recruit, train, compensate and promote regardless of race, religion, color, national origin, gender, disability, age, veteran status, and all the other fascinating characteristics that make us different and unique. We believe that equality and diversity build a strong organization and we’re working hard to make sure that’s the foundation of our organization as we grow.

Backend Engineer - Cloud Platform & Stack Automation employer: Grafana Labs

Grafana Labs is an exceptional employer that champions a remote-first, open-source culture, fostering innovation and collaboration across diverse teams. With a commitment to employee growth, engineers have the opportunity to work on impactful projects while contributing to open-source communities, all within a supportive environment that values transparency and autonomy. The competitive compensation package, including Restricted Stock Units, ensures that every team member shares in the company's success as we scale globally.

Grafana Labs

Contact Details:

Grafana Labs Recruitment Team

StudySmarter Expert Advice 🤫

We think this is how you could land Backend Engineer - Cloud Platform & Stack Automation

Tip Number 1

Network like a pro! Reach out to folks in your industry on LinkedIn or join relevant online communities. A personal connection can often get you noticed faster than a CV.

Tip Number 2

Prepare for those interviews! Research the company, understand their products, and be ready to discuss how your skills align with their needs. Practice common interview questions to boost your confidence.

Tip Number 3

Show off your projects! If you've got a GitHub or portfolio, make sure to highlight it during your conversations. Real-world examples of your work can speak volumes about your capabilities.

Tip Number 4

Apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining our team at Grafana Labs.

We think you need these skills to ace Backend Engineer - Cloud Platform & Stack Automation

Golang
SaaS Platform Experience
Distributed Systems Concepts
Microservices Development
Kubernetes
Infrastructure-as-Code Tooling
Helm

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the Backend Engineer role. Highlight your experience with Golang, distributed systems, and any relevant projects that showcase your skills in building scalable systems.

Show Your Passion for Open Source: Since we’re all about open-source at Grafana, don’t forget to mention any contributions you’ve made to open-source projects. It shows you share our values and understand the community-driven approach we embrace.

Be Clear and Concise: When writing your application, keep it clear and to the point. Use straightforward language to describe your experiences and how they relate to the role. We appreciate readability just as much as you do!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands and helps us get to know you better from the start!

How to prepare for a job interview at Grafana Labs

Know Your Tech Stack

Make sure you’re familiar with the technologies mentioned in the job description, especially Golang and distributed systems concepts. Brush up on your knowledge of Kubernetes and infrastructure-as-code tools like Terraform or Helm, as these will likely come up during technical discussions.

Showcase Your Problem-Solving Skills

Prepare to discuss specific examples where you've tackled complex systems problems. Think about how you’ve improved operational efficiency or resolved issues related to stateful systems. Be ready to explain your thought process and the impact of your solutions.

Emphasise Collaboration

Since this role involves working across teams, highlight your experience in collaborating with different departments. Share examples of how you’ve reached consensus on proposed solutions and how you’ve communicated technical trade-offs to non-technical stakeholders.

Ask Insightful Questions

Prepare thoughtful questions about the team’s current projects, challenges they face, and their approach to innovation. This shows your genuine interest in the role and helps you gauge if the company culture aligns with your values, especially regarding transparency and autonomy.