At a Glance
- Tasks: Enhance system reliability and performance in a dynamic cloud environment.
- Company: Join a forward-thinking tech company focused on innovation and collaboration.
- Benefits: Attractive salary, flexible working options, and opportunities for professional growth.
- Other info: Be part of a global team with exciting challenges and career advancement.
- Why this job: Make a real impact by ensuring systems are robust and reliable.
- Qualifications: Experience in DevOps, AWS, and scripting languages like Python or Bash.
The predicted salary is between €60,000 and €80,000 per year.
Requirements
- Experience in production support, DevOps, SRE, cloud operations, or systems engineering
- Hands-on experience with AWS cloud services, including compute, container and serverless workloads
- Practical experience with CI/CD pipelines and DevOps practices, including Git-based version control, pull request workflows, code reviews, and deployment automation
- Experience with SRE principles, monitoring, and reliability engineering practices
- Proficiency in scripting (Python, Bash, or similar) for automation and operational tooling
- Experience with Linux systems and troubleshooting production issues
- Exposure to data platforms and data pipelines (Desirable)
- Understanding of data reliability concepts (Desirable)
- Experience supporting or operating complex distributed systems
What the job involves
- We are looking for a Site Reliability Engineer (SRE) to improve the reliability and performance of business-critical systems.
- You will focus on AWS cloud infrastructure, DevOps tooling, and core SRE practices within a distributed, production environment.
- Reporting to our Lead, you will work with development, platform, and operations teams to ensure systems are stable, scalable, well-monitored and meet defined reliability targets.
- Support high availability, scalability and performance of production systems.
- Work with defined SLIs, SLOs and SLAs, ensuring services meet agreed reliability targets.
- Identify and reduce operational toil through automation and process improvement.
- Contribute to the design and implementation of fault-tolerant and resilient systems.
- Participate in resilience and failure testing activities to validate system behaviour under fault conditions and improve recovery.
- Manage and operate systems hosted on AWS (EC2, EKS/ECS, RDS, S3, Lambda, CloudWatch, IAM, and VPC).
- Support cloud deployments and infrastructure changes following best practices.
- Help with backup, disaster recovery and resiliency planning.
- Work with CI/CD pipelines and DevOps practices to ensure reliable and repeatable deployments, including build, test and release automation processes.
- Use Infrastructure as Code tools such as Terraform or CloudFormation to manage and provision infrastructure.
- Develop automation using scripting languages (Python, Bash or similar) to reduce operational toil and improve efficiency.
- Participate in production incident response, troubleshooting, and service restoration.
- Perform root cause analysis (RCA) and contribute to post-incident reviews.
- Help implement preventive actions to avoid incident recurrence.
- Configure and maintain monitoring, logging, and alerting using tools like CloudWatch, Prometheus, Grafana, Splunk, or Dynatrace.
- Develop dashboards to track system and platform health and reliability metrics across the user journey.
- Improve alert quality to reduce noise and improve response times.
- Work with application and engineering teams to embed reliability into system design.
- Collaborate within a globally distributed team, using clear handovers to ensure continuity.
- Share knowledge and contribute to team-wide best practices.
- Communicate with a wide range of stakeholders, using reliability-focused insights to influence decisions.
Senior Site Reliability Engineer in Nottingham (employer: Deepstreamtech)
As a Senior Site Reliability Engineer at our company, you will thrive in a dynamic and collaborative work culture that prioritises innovation and employee growth. We offer competitive benefits, including flexible working arrangements and opportunities for professional development, all while being part of a globally distributed team that values your contributions to enhancing the reliability and performance of critical systems in a cutting-edge AWS environment.
StudySmarter Expert Advice🤫
We think this is how you could land the Senior Site Reliability Engineer role in Nottingham
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with other SREs on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving AWS, CI/CD, and automation. This gives potential employers a taste of what you can do beyond your CV.
✨Tip Number 3
Prepare for interviews by brushing up on SRE principles and cloud operations. Practice explaining your past experiences with production support and incident response clearly and confidently. We want to see how you think on your feet!
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are genuinely interested in joining our team.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your experience in production support, DevOps, and SRE. We want to see how your skills align with our needs, so don’t be shy about showcasing your AWS expertise and any hands-on projects you've tackled.
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about Site Reliability Engineering and how your background makes you a perfect fit for our team. We love hearing personal stories that connect your experience to the role.
Show Off Your Technical Skills: When detailing your technical skills, be specific! Mention your proficiency in scripting languages like Python or Bash, and any experience with CI/CD pipelines. We’re keen on seeing how you’ve used these tools to improve system reliability in past roles.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it gives you a chance to explore more about what we do at StudySmarter!
How to prepare for a job interview at Deepstreamtech
✨Know Your Cloud Inside Out
Make sure you brush up on your AWS knowledge. Be ready to discuss specific services like EC2, Lambda, and RDS, and how you've used them in past projects. Having real-world examples of how you've managed cloud infrastructure will definitely impress.
✨Show Off Your Scripting Skills
Since scripting is key for automation, be prepared to talk about your experience with Python or Bash. Maybe even bring a small script you've written to demonstrate your coding style and problem-solving approach. This will show that you can handle operational tooling effectively.
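If you're wondering what "a small script" could look like, here is one possible sketch in Python. It counts HTTP status classes in access-log lines and works out a 5xx error rate. The log format (`<timestamp> <path> <status>`) and the function names are made up for illustration, so swap in whatever matches your own tooling.

```python
from collections import Counter


def summarise_status_codes(lines):
    """Count status classes (2xx, 4xx, 5xx, ...) in log lines.

    Assumes a hypothetical whitespace-separated format:
    <timestamp> <path> <status>
    Lines that do not match are skipped.
    """
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # malformed line, ignore it
        status = parts[2]
        if len(status) == 3 and status.isdigit():
            counts[status[0] + "xx"] += 1
    return counts


def error_rate(counts):
    """Share of requests that ended in a 5xx response (0.0 if no requests)."""
    total = sum(counts.values())
    return counts.get("5xx", 0) / total if total else 0.0


if __name__ == "__main__":
    sample = [
        "2024-01-01T00:00:00Z /health 200",
        "2024-01-01T00:00:01Z /api/orders 500",
        "2024-01-01T00:00:02Z /api/orders 503",
        "2024-01-01T00:00:03Z /missing 404",
    ]
    counts = summarise_status_codes(sample)
    print(dict(counts), f"5xx rate: {error_rate(counts):.0%}")
```

Something this size is easy to walk through in an interview: you can explain why malformed lines are skipped and how you would extend it for real log formats.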
✨Understand SRE Principles
Familiarise yourself with SRE concepts like SLIs, SLOs, and SLAs. Be ready to explain how you've applied these principles in your previous roles to improve system reliability. This shows that you not only understand the theory but can also implement it in practice.
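To make the SLI/SLO idea concrete, here is a minimal sketch of an error-budget calculation for a request-based availability SLO. The function name, the 99.9% target and the request counts are illustrative assumptions, not figures from this role.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Summarise a request-based availability SLO over one window.

    slo_target is a fraction such as 0.999 (hypothetical value).
    Returns the measured SLI, the number of failures the SLO allows,
    and the fraction of the error budget still unspent.
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")

    sli = (total_requests - failed_requests) / total_requests
    allowed_failures = (1 - slo_target) * total_requests
    budget_remaining = 1 - failed_requests / allowed_failures
    return {
        "sli": sli,
        "allowed_failures": allowed_failures,
        "budget_remaining": budget_remaining,
        "slo_met": sli >= slo_target,
    }


if __name__ == "__main__":
    # Example: 1,000,000 requests, 250 failures, 99.9% target.
    report = error_budget(0.999, 1_000_000, 250)
    print(f"SLI: {report['sli']:.5f}")
    print(f"Budget remaining: {report['budget_remaining']:.0%}")
```

Being able to talk through numbers like these (how many failures the target allows, and how much budget is left) is a handy way to show you can apply the theory.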
✨Prepare for Scenario-Based Questions
Expect questions that put you in hypothetical situations, like handling a production incident. Think through your past experiences and how you approached troubleshooting and root cause analysis. This will help you articulate your thought process and decision-making skills during the interview.