At a Glance
- Tasks: Enhance system reliability and performance while collaborating on innovative solutions.
- Company: Join a leading global operator with a passion for excellence.
- Benefits: Enjoy hybrid working, eye care, flu vaccinations, and life assurance.
- Why this job: Make a real impact on system reliability and observability in a dynamic environment.
- Qualifications: Strong software engineering skills and knowledge of Site Reliability Engineering principles.
- Other info: Be part of a culture that values continuous improvement and teamwork.
The predicted salary is between £28,800 and £48,000 per year.
As a Site Reliability Engineer, you will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices.
You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency.
Using your engineering expertise, you will implement solutions that enhance reliability, including instrumenting services with tools such as OpenTelemetry, improving logging practices and developing features for maintainability. You will also help engineer tools and automation for effective service management.
Collaboration is key: you will work across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your contributions will ensure our systems meet user demands and enhance overall service performance.
This role is eligible for inclusion in the Company’s hybrid working from home policy.
Preferred Skills and Experience
- Excellent knowledge of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction.
- Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty.
- Knowledge and experience of modern software development techniques and lifecycles.
- Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible and Terraform.
- Prior experience working in a large-scale, 24/7 enterprise where system uptime and stability are of paramount importance to the Business.
- Keen interest in industry trends, particularly Platform Engineering.
- Proficiency in shell scripting for automation and system management tasks.
Responsibilities
- Writing and contributing to code that enhances the reliability and observability of services, including telemetry, operational APIs and tooling.
- Developing and maintaining tools that facilitate effective management of our systems, ensuring they are operationally efficient and resilient.
- Working with automation and orchestration platforms to automate manual activity and reduce toil.
- Building sophisticated dashboards using a range of telemetry data and dashboarding technologies like Grafana, Splunk and New Relic.
- Maintaining and administering existing monitoring and analytic toolsets.
- Mentoring colleagues in the use of new technologies or practices.
- Actively participating in live incident resolution and post-mortem analysis, providing effective remediation strategies to improve overall system health and prevent future issues.
- Driving initiatives to enhance system reliability and observability, contributing to a culture of continuous improvement.
- Collaborating with the central Site Reliability Engineering and Observability teams to establish and uphold standards for reliability and observability, assisting teams in adhering to these practices.
- Working with IT Operations, providing and supporting the use of critical tooling to enable increasing levels of value to the Business.
Benefits
- Eye care and flu vaccinations
- Life Assurance
We are a unique global operator with the passion and drive to be the best in the industry. Our values form the foundation of our culture and shape the unique way that we work. People are our superpower and we support you to be the best you can be.
Site Reliability Engineer in Manchester. Employer: bet365 Group
Contact Details:
bet365 Group Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Site Reliability Engineer role in Manchester
✨Tip Number 1
Network like a pro! Reach out to current Site Reliability Engineers on LinkedIn or at industry events. Ask them about their experiences and any tips they might have for landing a role like this.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects related to system reliability and observability. This could include code snippets, dashboards you've built, or even blog posts explaining your approach to incident resolution.
✨Tip Number 3
Prepare for the interview by brushing up on your knowledge of tools like Grafana and Splunk. Be ready to discuss how you've used these in past roles or projects, and think of examples that demonstrate your problem-solving skills.
✨Tip Number 4
Don't forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you're genuinely interested in joining our team.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that align with the Site Reliability Engineer role. Highlight your software engineering skills, especially in system reliability and observability, to catch our eye!
Craft a Compelling Cover Letter: Use your cover letter to tell us why you're passionate about Site Reliability Engineering. Share specific examples of how you've enhanced system reliability or worked with observability tools in the past.
Showcase Your Technical Skills: Don’t forget to mention your experience with tools like Splunk, New Relic, and Terraform. We love seeing candidates who are familiar with contemporary observability techniques and Infrastructure as Code!
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy!
How to prepare for a job interview at bet365 Group
✨Know Your SRE Principles
Make sure you brush up on your Site Reliability Engineering principles, especially around Service Level Indicators (SLIs) and Service Level Objectives (SLOs). Be ready to discuss how you've applied these in past roles, as this will show your understanding of reliability and customer satisfaction.
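If it helps to have the arithmetic fresh in your mind before the interview, here is a minimal sketch in Python of how an availability SLI and the remaining error budget might be worked out. The function names and the request counts are hypothetical, chosen purely for illustration; they are not taken from the role description or from any particular tool.

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """Fraction of events that met the success criterion."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Share of the error budget still unspent.

    Returns 1.0 when no failures have occurred, 0.0 when the budget is
    exactly used up, and a negative value once the SLO has been breached.
    """
    allowed_failure = 1.0 - slo_target
    if allowed_failure <= 0:
        raise ValueError("slo_target must be below 1.0")
    actual_failure = 1.0 - sli
    return 1.0 - actual_failure / allowed_failure


# Hypothetical numbers: 1,000,000 requests, 500 of which failed,
# measured against an assumed 99.9% availability SLO.
sli = availability_sli(good_events=999_500, total_events=1_000_000)
remaining = error_budget_remaining(sli, slo_target=0.999)
print(f"SLI: {sli:.4%}, error budget remaining: {remaining:.0%}")
```

With these assumed figures, a 99.9% target allows 1,000 failed requests, so 500 failures use about half of the budget. Being able to talk through a calculation like this is one way to show how an SLO turns into a practical decision about release pace or reliability work.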
✨Familiarise Yourself with Observability Tools
Get comfortable with contemporary observability tools like Splunk, New Relic, and Grafana. During the interview, be prepared to share specific examples of how you've used these tools to enhance system performance or resolve incidents.
✨Showcase Your Automation Skills
Highlight your experience with Infrastructure as Code (IaC) tools like Ansible and Terraform. Discuss any automation projects you've worked on that reduced manual toil, as this aligns perfectly with the role's focus on operational efficiency.
✨Emphasise Collaboration
Since collaboration is key in this role, think of examples where you've worked across teams to integrate best practices into the software development life cycle. Be ready to explain how you fostered a culture of continuous improvement in your previous positions.