At a Glance
- Tasks: Resolve Linux infrastructure incidents and improve platform reliability in a dynamic environment.
- Company: Join a high-performing infrastructure support team in Glasgow.
- Benefits: Competitive daily rate, flexible work schedule, and opportunities for professional growth.
- Other info: Collaborative culture with a focus on operational improvement and automation.
- Why this job: Make a real impact on critical systems while enhancing your technical skills.
- Qualifications: Strong Linux expertise and hands-on troubleshooting experience required.
The predicted salary is £39,600 per year.
We are looking for an experienced Linux Site Reliability Engineer (SRE) to join a high-performing infrastructure support team focused on maintaining and improving critical platform reliability within a large-scale enterprise environment. This position will focus on resolving hardware and platform-related incidents escalated from the L3 support team. The successful candidate will have strong Linux systems expertise, hands-on server troubleshooting experience, and a proactive approach to operational improvement, automation, and incident reduction.
Key Responsibilities
- Investigate and resolve Linux infrastructure and hardware-related incidents
- Perform advanced Linux systems administration and troubleshooting
- Support remote server recovery and diagnostics using out-of-band management technologies
- Manage incidents end-to-end, including triage, mitigation, escalation, communication, and resolution
- Create and maintain operational runbooks and technical documentation
- Identify recurring issues and implement improvements to reduce MTTD and MTTR
- Work closely with engineering and operations teams to improve system reliability and resilience
- Participate in post-incident reviews and root cause analysis
Essential Skills & Experience
- Strong Linux administration and troubleshooting experience
- Knowledge of server hardware including disks, RAID/HBA, NICs, and firmware
- Experience with iDRAC, iLO, IPMI, Redfish, or similar remote management tools
- Proven experience supporting production infrastructure environments
- Understanding of SRE principles including SLOs, SLIs, MTTD, and MTTR
- Strong communication and stakeholder management skills
- Excellent documentation and process improvement experience
Desirable Skills
- Scripting and automation experience with Bash or Python
- Familiarity with VMware, KVM, Docker, or Kubernetes
- Experience with monitoring, observability, and alerting platforms
Employer: Ninetech
Contact Details:
Ninetech Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Linux Site Reliability Engineer role in Glasgow
✨Tip Number 1
Network, network, network! Reach out to your connections in the tech world, especially those who work in SRE roles. A friendly chat can lead to opportunities that aren't even advertised yet.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your Linux troubleshooting projects or automation scripts. This gives potential employers a taste of what you can do.
✨Tip Number 3
Prepare for interviews by brushing up on SRE principles and incident management. Be ready to discuss how you've tackled real-world problems and improved system reliability in past roles.
✨Tip Number 4
Don't forget to apply through our website! We love seeing candidates who are proactive and engaged. Plus, it makes it easier for us to keep track of your application.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your Linux expertise and troubleshooting experience. We want to see how your skills match the job description, so don’t be shy about showcasing your relevant projects and achievements!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about SRE and how your proactive approach can improve platform reliability. Let us know what makes you the perfect fit for our team.
Showcase Your Documentation Skills: Since we value excellent documentation, include examples of operational runbooks or technical documents you've created. This will demonstrate your attention to detail and commitment to process improvement.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates during the process!
How to prepare for a job interview at Ninetech
✨Know Your Linux Inside Out
Make sure you brush up on your Linux systems knowledge. Be prepared to discuss your hands-on experience with server troubleshooting and administration. They’ll likely ask you about specific incidents you've resolved, so have a few examples ready to showcase your expertise.
✨Familiarise Yourself with Remote Management Tools
Since the role involves using tools like iDRAC, iLO, and IPMI, it’s crucial to understand how these work. If you’ve used them before, be ready to explain how you’ve leveraged these technologies in past roles to support remote server recovery and diagnostics.
✨Highlight Your Incident Management Skills
Be prepared to discuss your approach to managing incidents from start to finish. They’ll want to know how you triage, mitigate, and communicate during an incident. Share specific examples of how you’ve improved MTTD and MTTR in previous positions.
✨Show Off Your Documentation Skills
Creating and maintaining operational runbooks is key in this role. Bring examples of documentation you’ve created in the past, and be ready to talk about how you ensure clarity and usability for your team. This will demonstrate your commitment to process improvement.