At a Glance
- Tasks: Automate systems for fast troubleshooting and enhance reliability engineering across thousands of instances.
- Company: Join a forward-thinking tech company focused on innovation and collaboration.
- Benefits: Enjoy competitive pay, health perks, remote work options, and opportunities for professional growth.
- Other info: Dynamic role with excellent career advancement and a focus on learning from failures.
- Why this job: Make a real impact by shaping reliable systems and driving engineering excellence.
- Qualifications: Experience in backend tools with Go, strong software skills, and a passion for SRE practices.
The predicted salary is between £60,000 and £80,000 per year.
The Role
- Automate for insight and scale: Build systems that make troubleshooting fast, safe, and scalable across thousands of Neo4j instances. From internal tools that surface clear insights to canaries that support safe rollouts, you’ll focus on automation that elevates reliability engineering.
- Treat operations as a software problem: Replace tribal knowledge and ad-hoc scripts with tools and systems that codify best practices—making operations predictable, scalable, and repeatable.
- Design for resilience, learn from failure: Own and evolve the tooling and processes behind incident response. From clear alerts to blameless reviews, you’ll help ensure teams respond with confidence and learn with clarity.
- Champion reliability as a product feature: Help teams define and act on SLIs and SLOs, turning reliability into a shared, data-driven priority across engineering.
- Create signals, not noise: Shape an observability stack that tells us what matters, when it matters—so we can detect issues early and resolve them quickly.
Qualifications
- Writing backend tools and automation in Go—the primary language—with an emphasis on sound architecture, testing, and maintainability. Strong software skills in other languages, like Python, are also welcome.
- Applying SRE practices in real-world environments: defining SLIs and SLOs, reducing toil through automation, and driving reliability through engineering.
- Collaborating with other teams to promote SRE thinking—educating on principles like observability, ownership, and service level objectives.
- Troubleshooting large-scale, cloud-based systems with confidence and curiosity.
- Monitoring distributed systems and understanding their performance characteristics.
- Designing systems with reliability, safety, and debuggability as first-class concerns.
- Working with observability tools like OTel Collector, Prometheus, Grafana, and Google Cloud’s operations suite.
- Deploying and managing applications on Kubernetes; cluster-level administration is a plus.
- Managing infrastructure with Kustomize and Terraform—keeping it clear, modular, and easy to evolve.
- Building and maintaining CI/CD workflows—ours run on GitHub Actions.
- Participating in on-call rotations and incident response with a focus on improvement, not blame.
- Writing and contributing to postmortems that lead to meaningful, lasting changes.
Software Engineer - Site Reliability Engineering employer: Neo4j
Contact Details:
Neo4j Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Software Engineer - Site Reliability Engineering role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with current employees at companies you're eyeing. A friendly chat can sometimes lead to opportunities that aren't even advertised!
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to SRE practices. This gives potential employers a taste of what you can do and how you think.
✨Tip Number 3
Prepare for interviews by brushing up on your problem-solving skills. Practice coding challenges and be ready to discuss your approach to reliability engineering. Remember, it's not just about getting the right answer but showing your thought process!
✨Tip Number 4
Apply through our website! We love seeing candidates who take the initiative. Plus, it helps us keep track of your application and makes it easier for us to connect with you directly.
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with automation and reliability engineering. We want to see how your skills align with our needs, so don’t hold back on showcasing your relevant projects!
Showcase Your Technical Skills: When you’re writing about your technical abilities, be specific! Mention your experience with Go, Python, and any tools like Prometheus or Grafana. We love seeing concrete examples of how you've applied these skills in real-world scenarios.
Emphasise Collaboration: Since we value teamwork, make sure to include examples of how you’ve worked with others to promote SRE practices. Highlight any instances where you’ve educated teams on observability or service level objectives—this shows us you’re a team player!
Apply Through Our Website: Don’t forget to submit your application through our website! It’s the best way for us to receive your details and ensures you’re considered for the role. Plus, it makes the whole process smoother for everyone involved.
How to prepare for a job interview at Neo4j
✨Know Your Tech Stack
Make sure you’re well-versed in Go, as it’s the primary language for this role. Brush up on your knowledge of Python too, as it could come in handy. Familiarise yourself with the tools mentioned in the job description, like Prometheus and Grafana, so you can speak confidently about your experience with them.
✨Showcase Your SRE Experience
Be ready to discuss your real-world experience applying SRE practices. Prepare examples of how you've defined SLIs and SLOs, reduced toil through automation, and improved reliability. This will demonstrate that you understand the principles behind the role and can apply them effectively.
✨Prepare for Problem-Solving Questions
Expect to tackle some troubleshooting scenarios during the interview. Think about past incidents you've managed and how you approached them. Highlight your ability to learn from failures and how you’ve contributed to blameless postmortems that led to improvements.
✨Emphasise Collaboration Skills
This role involves working closely with other teams, so be prepared to discuss how you’ve promoted SRE thinking in previous positions. Share specific examples of how you’ve educated others on observability and service level objectives, showcasing your ability to foster a collaborative environment.