At a Glance
- Tasks: Ensure global services for artists and fans are always on and optimised.
- Company: Join a dynamic team in the entertainment tech industry.
- Benefits: Competitive salary, flexible hours, and opportunities for growth.
- Other info: Collaborative environment with a focus on innovation and problem-solving.
- Why this job: Make a real impact by enhancing service reliability and performance.
- Qualifications: Experience in systems administration and programming; cloud platform knowledge preferred.
The predicted salary is between €55,000 and €70,000 per year.
Requirements
- A strong background in systems administration (Linux/Windows) in a large-scale environment
- Proficiency in at least one programming language (e.g., Python, Go, Java)
- Hands-on experience with a major cloud platform (AWS, GCP, or Azure), with a high preference for AWS
- Solid understanding of networking, containers (Docker, Kubernetes), and Infrastructure as Code (e.g., Terraform, Ansible)
- Experience with modern monitoring and observability tools (e.g., Prometheus, Grafana, Datadog, Splunk, Dynatrace)
- Proven analytical and problem-solving abilities with experience in a high-pressure environment
- Excellent communication skills and the ability to foster a collaborative team environment
- (Desirable) Bachelor's degree in an IT-related field
- (Desirable) Experience managing large-scale, distributed systems for a global organization
- (Desirable) Familiarity with IT governance standards like ITIL
- (Desirable) Direct experience with ServiceNow for IT service management
- Knowledge of chaos engineering, resilience testing, and advanced capacity planning
What the job involves
- As a Site Reliability Engineer, you won't just be supporting systems; you'll be ensuring the services that connect artists and fans around the globe are always on
- System Reliability & Performance: Design, build, and maintain the availability, scalability, and performance of critical services
- Develop and maintain robust monitoring, alerting, and observability systems (e.g., using AWS CloudWatch, Dynatrace) to ensure rapid issue detection and resolution
- Monitor infrastructure capacity and performance, providing analysis and suggestions for service delivery improvement
- Automation & Efficiency: Drive the automation of repetitive operational tasks, including infrastructure provisioning, deployments, and scaling
- Create and maintain scripts and custom code to support and enhance our operational toolset
- Support and optimize CI/CD pipelines to improve deployment speed and reliability
- Incident Management & Collaboration: Participate in an on-call rotation to troubleshoot and mitigate production incidents
- Lead post-incident reviews and root cause analyses to implement lasting solutions
- Partner with engineering and IT stakeholders to embed SRE best practices (SLOs, error budgets) into the design and development lifecycle
Service Reliability Engineer employer: Deepstreamtech
As a Service Reliability Engineer, you will thrive in a dynamic and innovative environment that prioritises collaboration and continuous improvement. Our company offers competitive benefits, a strong commitment to employee development, and a culture that values creativity and problem-solving, all while being located in a vibrant area that fosters both professional and personal growth. Join us to play a pivotal role in ensuring the reliability of services that connect artists and fans globally, while enjoying the unique advantages of working with cutting-edge technologies and a supportive team.
StudySmarter Expert Advice🤫
We think this is how you could land the Service Reliability Engineer role
✨Tip Number 1
Network like a pro! Attend industry meetups, webinars, or even local tech events. You never know who might be looking for a Service Reliability Engineer just like you!
✨Tip Number 2
Show off your skills! Create a GitHub repository showcasing your projects, especially those involving cloud platforms or automation tools. This gives potential employers a taste of what you can do.
✨Tip Number 3
Prepare for interviews by brushing up on common SRE scenarios and problem-solving questions. Practise explaining your thought process clearly; communication is key in this role!
✨Tip Number 4
Don’t forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Plus, it makes the application process smoother for everyone.
Some tips for your application 🫡
Show Off Your Skills: Make sure to highlight your systems administration experience, especially with Linux and Windows. We want to see your proficiency in programming languages like Python or Go, so don’t hold back on showcasing those skills!
Cloud Experience is Key: If you've got hands-on experience with AWS, GCP, or Azure, let us know! We’re particularly keen on AWS, so mention any projects or tasks where you’ve used it to solve problems or improve services.
Talk About Teamwork: Communication is crucial for us at StudySmarter. Share examples of how you've collaborated with teams in high-pressure situations. We love seeing candidates who can foster a collaborative environment!
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates from us!
How to prepare for a job interview at Deepstreamtech
✨Know Your Tech Inside Out
Make sure you brush up on your systems administration skills, especially in Linux and Windows. Be ready to discuss your hands-on experience with cloud platforms like AWS, as well as your proficiency in programming languages such as Python or Go. They’ll likely ask you to solve a problem on the spot, so practise coding challenges beforehand!
✨Showcase Your Monitoring Skills
Familiarise yourself with modern monitoring and observability tools like Prometheus, Grafana, or Datadog. Be prepared to explain how you've used these tools in past roles to ensure system reliability and performance. Sharing specific examples of how you’ve improved service delivery through monitoring will definitely impress them.
✨Emphasise Collaboration
Since communication is key in this role, think of examples where you’ve successfully collaborated with teams to resolve incidents or improve processes. Highlight your experience in leading post-incident reviews and how you’ve embedded SRE best practices into projects. This shows you’re not just a techie but also a team player!
✨Prepare for Scenario Questions
Expect scenario-based questions that test your analytical and problem-solving abilities under pressure. Practise articulating your thought process when troubleshooting issues or managing incidents. They want to see how you approach problems, so be clear and structured in your responses.