At a Glance
- Tasks: Join a dynamic team to enhance platform reliability and troubleshoot server issues.
- Company: High-performing infrastructure support team in a large-scale enterprise.
- Benefits: Competitive day rate, hybrid work model, and opportunities for professional growth.
- Other info: Exciting chance to work with cutting-edge technologies and improve operational practices.
- Why this job: Make a real impact on critical systems while developing your skills in a supportive environment.
- Qualifications: Strong Linux skills, troubleshooting experience, and a proactive mindset.
Contract: Site Reliability Engineer (Linux Administration & Server Hardware)
Location: Glasgow (hybrid - 3 days onsite)
Duration: 6 months
Day Rate: Negotiable (Inside IR35 via umbrella solution)
Reference: 20460
We are looking for an experienced Linux Site Reliability Engineer (SRE) to join a high-performing infrastructure support team that maintains and improves critical platform reliability within a large-scale enterprise environment. The role focuses on resolving hardware and platform-related incidents escalated from the L3 support team. The successful candidate will have strong Linux systems expertise, hands-on physical server troubleshooting experience, and a proactive approach to operational improvement, automation, and incident reduction.
Essential Skills / Requirements
- Strong Linux administration and troubleshooting skills (process, networking basics, logs, package/service management).
- Solid understanding of server hardware and peripherals (disks, RAID/HBA, NICs, firmware) and how failures present at OS level.
- Experience with out-of-band management / lights-out technologies (e.g., iDRAC, iLO, IPMI/Redfish) for remote troubleshooting and recovery.
- Proven ability to own incidents end-to-end: triage, identify mitigations/workarounds, coordinate with L3/engineering, communicate status, and drive to resolution.
- Understanding of SRE operational practices and metrics (e.g., SLO/SLI concepts, error budgets, MTTD/MTTR) and a continuous-improvement mindset.
- Strong communication skills (written and verbal): clear incident updates, customer/stakeholder management, and effective escalation and handoffs.
- Strong documentation skills: writing clear runbooks/procedures, contributing to knowledge bases, and participating in post-incident reviews/root cause analysis.
Nice to Have / Desired Skills
- Scripting and automation skills (e.g., Bash, Python) to build small tools, checks, and workflow automation that reduce toil.
- Familiarity with virtualization and containerization concepts/operations (e.g., VMware/KVM, Docker, Kubernetes) and using automation to support these environments.
- Experience with monitoring/observability and alerting workflows (dashboards, log analysis, alert tuning) and translating signals into actionable response steps.
Linux Site Reliability Engineer in Paisley
Employer: Networking People (UK) Limited
Contact Details: Networking People (UK) Limited Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Linux Site Reliability Engineer role in Paisley
✨Tip Number 1
Network like a pro! Attend meetups, webinars, or local tech events where you can connect with other SREs and industry professionals. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a GitHub repository showcasing your projects, scripts, or any automation tools you've built. This gives potential employers a tangible look at what you can do, especially in Linux administration and troubleshooting.
✨Tip Number 3
Prepare for interviews by brushing up on common SRE scenarios. Think about how you would handle incidents, improve reliability, or automate processes. Practising these responses will help you stand out as a proactive candidate.
✨Tip Number 4
Don’t forget to apply through our website! We’ve got loads of opportunities that might be perfect for you. Plus, it’s a great way to ensure your application gets seen by the right people.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights your Linux administration skills and any relevant experience with server hardware. We want to see how your background aligns with the role, so don’t be shy about showcasing your expertise!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about Site Reliability Engineering and how your proactive approach can contribute to our team. Keep it concise but impactful!
Showcase Your Communication Skills: Since strong communication is key for this role, make sure your application reflects your ability to convey complex information clearly. Whether it’s in your CV or cover letter, we want to see that you can communicate effectively with both technical and non-technical stakeholders.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it’s super easy – just follow the prompts!
How to prepare for a job interview at Networking People (UK) Limited
✨Know Your Linux Inside Out
Make sure you brush up on your Linux administration skills. Be ready to discuss troubleshooting processes, networking basics, and how to manage packages and services. They’ll likely ask you about real-world scenarios, so think of examples where you've resolved issues effectively.
✨Get Familiar with Server Hardware
Understand the ins and outs of server hardware and peripherals. Be prepared to explain how different failures present at the OS level. It’s a good idea to have some hands-on experience or anecdotes about dealing with RAID, NICs, and firmware issues.
✨Show Off Your Incident Management Skills
Be ready to talk about your experience owning incidents from start to finish. Highlight your ability to triage, identify workarounds, and communicate effectively with teams. They’ll want to see that you can drive incidents to resolution while keeping everyone in the loop.
✨Demonstrate Your Continuous Improvement Mindset
Discuss your understanding of SRE practices and metrics like SLOs and error budgets. Share examples of how you've contributed to operational improvements or automation in past roles. This shows you’re not just reactive but proactive in enhancing platform reliability.
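If it helps to have concrete numbers ready when discussing error budgets and MTTR, the arithmetic can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it assumes a simple availability SLO measured over a 30-day window, and the function names, the 99.9% target, and the incident durations are all hypothetical rather than taken from the role description.

```python
from statistics import mean


def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Return the downtime (in minutes) an availability SLO permits over the window."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be a fraction between 0 and 1, e.g. 0.999")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining_minutes(slo_target, downtime_minutes, window_days=30):
    """Return the unspent error budget; a negative value means the SLO was missed."""
    return error_budget_minutes(slo_target, window_days) - sum(downtime_minutes)


def mttr_minutes(repair_times):
    """Mean time to restore: average repair duration across incidents."""
    return mean(repair_times)


if __name__ == "__main__":
    # Hypothetical incident durations (minutes of downtime) within one window.
    incidents = [12.0, 7.5, 4.5]
    print(f"Budget at 99.9%: {error_budget_minutes(0.999):.1f} min")                 # 43.2 min
    print(f"Budget remaining: {budget_remaining_minutes(0.999, incidents):.1f} min")  # 19.2 min
    print(f"MTTR: {mttr_minutes(incidents):.1f} min")                                # 8.0 min
```

Being able to say "a 99.9% target over 30 days leaves roughly 43 minutes of downtime" and then relate that to how quickly incidents were restored is one simple way to show the metrics are understood rather than just named.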