Software Engineer - Site Reliability Engineering in London

London | Full-Time | 60000 - 80000 € / year (est.) | No home office possible
Albert Invent

At a Glance

  • Tasks: Automate systems for reliability and troubleshoot large-scale cloud environments.
  • Company: Join Neo4j, a leader in database technology with a global impact.
  • Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
  • Other info: Dynamic team environment with a focus on learning and innovation.
  • Why this job: Make a real difference in reliability engineering while working with cutting-edge tech.
  • Qualifications: Experience in backend tools, automation, and SRE practices is essential.

The predicted salary is between 60000 - 80000 € per year.

The mission of Neo4j's Site Reliability Engineering team is to improve the reliability of Neo4j's DBaaS product, Neo4j Aura. Aura operates at global scale across all three major cloud providers, running hundreds of Kubernetes clusters and hosting thousands of Neo4j instances in production at any given time.

The Role

  • Automate for insight and scale: Build systems that make troubleshooting fast, safe, and scalable across thousands of Neo4j instances. From internal tools that surface clear insights to canaries that support safe rollouts, you'll focus on automation that elevates reliability engineering.
  • Treat operations as a software problem: Replace tribal knowledge and ad-hoc scripts with tools and systems that codify best practices - making operations predictable, scalable, and repeatable.
  • Design for resilience, learn from failure: Own and evolve the tooling and processes behind incident response. From clear alerts to blameless reviews, you'll help ensure teams respond with confidence and learn with clarity.
  • Champion reliability as a product feature: Help teams define and act on SLIs and SLOs, turning reliability into a shared, data-driven priority across engineering.
  • Create signals, not noise: Shape an observability stack that tells us what matters, when it matters - so we can detect issues early and resolve them quickly.

We're interested in hearing from Engineers with deep experience in some of the following areas:

  • Writing backend tools and automation in Go - our primary language - with an emphasis on sound architecture, testing, and maintainability. Strong software skills in other languages, like Python, are also welcome.
  • Applying SRE practices in real-world environments: defining SLIs and SLOs, reducing toil through automation, and driving reliability through engineering.
  • Collaborating with other teams to promote SRE thinking - educating on principles like observability, ownership, and service level objectives.
  • Troubleshooting large-scale, cloud-based systems with confidence and curiosity.
  • Monitoring distributed systems and understanding their performance characteristics.
  • Designing systems with reliability, safety, and debuggability as first-class concerns.
  • Working with observability tools like OTel Collector, Prometheus, Grafana, and Google Cloud's operations suite.
  • Deploying and managing applications on Kubernetes; cluster-level administration is a plus.
  • Managing infrastructure with Kustomize and Terraform - keeping it clear, modular, and easy to evolve.
  • Building and maintaining CI/CD workflows - ours run on GitHub Actions.
  • Participating in on-call rotations and incident response with a focus on improvement, not blame.
  • Writing and contributing to postmortems that lead to meaningful, lasting changes.
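
To make the SLI/SLO items above concrete, here is a minimal sketch in Go of an error-budget calculation for a request-based availability SLO. It is an illustration only: the function name, the parameters, and the sample numbers are assumptions made for this example and are not taken from Neo4j's tooling.

```go
package main

import (
	"errors"
	"fmt"
)

// BudgetRemaining reports the fraction of the error budget still unspent
// for a request-based availability SLO: 1 means untouched, 0 means fully
// spent, and a negative value means the SLO has been missed.
// The name and signature are hypothetical, chosen for this sketch.
func BudgetRemaining(target float64, good, total int) (float64, error) {
	if target <= 0 || target >= 1 {
		return 0, errors.New("target must be strictly between 0 and 1")
	}
	if good < 0 || total < good {
		return 0, errors.New("need 0 <= good <= total")
	}
	if total == 0 {
		return 1, nil // no traffic in the window, so nothing was spent
	}
	allowedBad := (1 - target) * float64(total) // failures the SLO tolerates
	actualBad := float64(total - good)          // failures actually observed
	return 1 - actualBad/allowedBad, nil
}

func main() {
	// Hypothetical window: 1,000,000 requests, 500 failures, 99.9% target.
	remaining, err := BudgetRemaining(0.999, 999_500, 1_000_000)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("error budget remaining: %.1f%%\n", remaining*100)
}
```

The SLI here is simply the ratio of good requests to total requests. In practice the counts would likely come from a metrics backend such as Prometheus rather than being passed in by hand.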

Software Engineer - Site Reliability Engineering in London employer: Albert Invent

At Neo4j, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters innovation and collaboration. Our commitment to employee growth is evident through continuous learning opportunities and a focus on automating for insight and scale, ensuring that our engineers can thrive in a supportive environment. Located at the forefront of technology, our team enjoys the unique advantage of working with cutting-edge tools and practices in a global setting, making a meaningful impact on the reliability of our DBaaS product, Neo4j Aura.

Albert Invent

Contact Detail:

Albert Invent Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Software Engineer - Site Reliability Engineering role in London

Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with current employees at Neo4j. A friendly chat can sometimes lead to opportunities that aren’t even advertised.

Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to SRE practices or automation. This gives us a tangible way to see what you can do beyond your CV.

Tip Number 3

Prepare for interviews by brushing up on your knowledge of Kubernetes and cloud-based systems. We love candidates who can confidently discuss their troubleshooting experiences and how they’ve applied SRE principles in real-world scenarios.

Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team at Neo4j.

We think you need these skills to ace the Software Engineer - Site Reliability Engineering role in London

Go
Python
Site Reliability Engineering (SRE) practices
SLIs and SLOs definition
Automation
Observability
Troubleshooting large-scale cloud-based systems

Some tips for your application 🫡

Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the role. Highlight your experience in SRE practices, automation, and any relevant tools you've worked with. We want to see how you can contribute to our mission!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about reliability engineering and how your background makes you a great fit for our team. Keep it engaging and personal – we love to see your personality!

Showcase Your Projects: If you've worked on any relevant projects, whether personal or professional, make sure to include them. We’re interested in seeing how you’ve applied your skills in real-world scenarios, especially in automation and cloud-based systems.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows us you’re keen on joining our team at Neo4j!

How to prepare for a job interview at Albert Invent

Know Your Tech Stack

Make sure you’re well-versed in the technologies mentioned in the job description, especially Go, Kubernetes, and observability tools like Prometheus and Grafana. Brush up on your knowledge of SRE practices and be ready to discuss how you've applied them in real-world scenarios.

Showcase Your Problem-Solving Skills

Prepare to share specific examples of how you've tackled reliability issues or automated processes in previous roles. Use the STAR method (Situation, Task, Action, Result) to structure your answers and highlight your impact.

Understand the Company’s Mission

Familiarise yourself with Neo4j’s DBaaS product and its importance in the market. Be ready to discuss how you can contribute to improving the reliability of Neo4j Aura and why reliability is crucial for their customers.

Ask Insightful Questions

Prepare thoughtful questions that show your interest in the role and the company. Inquire about their current challenges in reliability engineering or how they measure success with SLIs and SLOs. This demonstrates your proactive mindset and eagerness to contribute.