Linux Site Reliability Engineer

Freelance | £50,000 to £70,000 / year (est.) | Home office (partial)
Ninetech

At a Glance

  • Tasks: Resolve Linux infrastructure incidents and improve platform reliability in a dynamic environment.
  • Company: Join a high-performing infrastructure support team in a large-scale enterprise.
  • Benefits: Competitive pay, flexible work options, and opportunities for professional growth.
  • Other info: Collaborative culture with excellent career advancement opportunities.
  • Why this job: Make a real impact on system reliability while working with cutting-edge technologies.
  • Qualifications: Strong Linux expertise, troubleshooting skills, and experience with automation.

The predicted salary is between £50,000 and £70,000 per year.

We are looking for an experienced Linux Site Reliability Engineer (SRE) to join a high-performing infrastructure support team focused on maintaining and improving critical platform reliability within a large-scale enterprise environment. This position will focus on resolving hardware and platform-related incidents escalated from the L3 support team. The successful candidate will have strong Linux systems expertise, hands-on server troubleshooting experience, and a proactive approach to operational improvement, automation, and incident reduction.

Key Responsibilities

  • Investigate and resolve Linux infrastructure and hardware-related incidents
  • Perform advanced Linux systems administration and troubleshooting
  • Support remote server recovery and diagnostics using out-of-band management technologies
  • Manage incidents end-to-end, including triage, mitigation, escalation, communication, and resolution
  • Create and maintain operational runbooks and technical documentation
  • Identify recurring issues and implement improvements to reduce MTTD and MTTR
  • Work closely with engineering and operations teams to improve system reliability and resilience
  • Participate in post-incident reviews and root cause analysis

Qualifications

  • Strong Linux administration and troubleshooting experience
  • Knowledge of server hardware including disks, RAID/HBA, NICs, and firmware
  • Experience with iDRAC, iLO, IPMI, Redfish, or similar remote management tools
  • Understanding of SRE principles including SLOs, SLIs, MTTD, and MTTR
  • Strong communication and stakeholder management skills
  • Excellent documentation and process improvement experience
  • Scripting and automation experience with Bash or Python
  • Familiarity with VMware, KVM, Docker, or Kubernetes
  • Experience with monitoring, observability, and alerting platforms
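For candidates who want to make the SRE metrics above concrete, here is a minimal Python sketch of how MTTD, MTTR, and an availability SLI might be calculated. The incident records, field order, and function names are hypothetical and chosen only for illustration. Teams also define MTTR differently; this sketch measures it from fault start to resolution, which is an assumption rather than a fixed standard.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (fault started, fault detected, fault resolved).
INCIDENTS = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5), datetime(2024, 1, 1, 11, 0)),
    (datetime(2024, 1, 8, 2, 0), datetime(2024, 1, 8, 2, 15), datetime(2024, 1, 8, 2, 45)),
]


def _mean_minutes(deltas):
    """Average a list of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


def mttd_minutes(incidents):
    """Mean time to detect: from fault start to detection."""
    return _mean_minutes([detected - started for started, detected, _ in incidents])


def mttr_minutes(incidents):
    """Mean time to resolve: here measured from fault start to resolution (assumption)."""
    return _mean_minutes([resolved - started for started, _, resolved in incidents])


def availability_sli(good_requests, total_requests):
    """Availability SLI as the fraction of requests served successfully."""
    return good_requests / total_requests


if __name__ == "__main__":
    print(f"MTTD: {mttd_minutes(INCIDENTS):.1f} min")
    print(f"MTTR: {mttr_minutes(INCIDENTS):.1f} min")
    print(f"SLI:  {availability_sli(9990, 10000):.3%}")
```

An SLO is then simply a target for the SLI, for example requiring the availability figure to stay at or above 99.9% over a rolling window.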

Linux Site Reliability Engineer employer: Ninetech

Join a dynamic and innovative team as a Linux Site Reliability Engineer, where your expertise will be valued in a collaborative environment that prioritises operational excellence and continuous improvement. Our company offers competitive benefits, a strong focus on employee development, and a culture that encourages proactive problem-solving and teamwork, all set within a vibrant enterprise landscape. With opportunities for growth and the chance to work with cutting-edge technologies, this role is perfect for those seeking meaningful and rewarding employment.

Contact Details:

Ninetech Recruitment Team

StudySmarter Expert Advice🤫

We think this is how you could land the Linux Site Reliability Engineer role

Tip Number 1

Network, network, network! Reach out to your connections in the tech world, especially those who work in SRE or related fields. A personal recommendation can make all the difference when you're trying to land that dream job.

Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your Linux troubleshooting projects, automation scripts, or any relevant work. This gives potential employers a tangible look at what you can do.

Tip Number 3

Prepare for technical interviews by brushing up on your Linux knowledge and incident management strategies. Practice common SRE scenarios and be ready to discuss how you've improved system reliability in past roles.

Tip Number 4

Don't forget to apply through our website! We love seeing candidates who are genuinely interested in joining our team. Plus, it makes it easier for us to keep track of your application and get back to you quickly.

We think you need these skills to ace the Linux Site Reliability Engineer role

Linux Systems Administration
Server Troubleshooting
Operational Improvement
Automation
Incident Management
Technical Documentation
Root Cause Analysis

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your Linux systems expertise and troubleshooting experience. We want to see how your skills match the key responsibilities listed in the job description, so don’t be shy about showcasing your relevant experience!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about SRE and how your proactive approach can contribute to our team. We love seeing candidates who can communicate their enthusiasm and fit for the role.

Show Off Your Documentation Skills: Since excellent documentation is key for this role, include examples of operational runbooks or technical documents you've created. We appreciate candidates who understand the importance of clear communication and process improvement.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you’re considered for the role. Plus, it shows you’re keen on joining our team!

How to prepare for a job interview at Ninetech

Know Your Linux Inside Out

Make sure you brush up on your Linux systems knowledge. Be prepared to discuss your hands-on experience with server troubleshooting and any specific incidents you've resolved in the past. The more examples you can provide, the better!

Familiarise Yourself with Remote Management Tools

Since this role involves using tools like iDRAC, iLO, and IPMI, it’s crucial to understand how these work. If you’ve used them before, be ready to share your experiences. If not, do a bit of research so you can speak confidently about them.
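If you have not used out-of-band tooling before, one way to get a feel for it is to look at the JSON that a Redfish service returns for a server. The sketch below is only an illustration: the sample payload is a hand-written, trimmed example in the general shape of a Redfish ComputerSystem resource, not output from a real iDRAC or iLO, and the function name is hypothetical. It reads the power state and health fields and flags anything that is not reported as healthy.

```python
import json

# Illustrative, hand-written payload (assumption): a trimmed example in the
# shape of a Redfish ComputerSystem resource, not captured from real hardware.
SAMPLE_SYSTEM = """
{
  "Id": "1",
  "PowerState": "On",
  "Status": {"State": "Enabled", "Health": "Warning"}
}
"""


def summarise_system(payload: str) -> dict:
    """Extract power and health information from a ComputerSystem JSON document."""
    system = json.loads(payload)
    status = system.get("Status", {})
    health = status.get("Health", "Unknown")
    return {
        "id": system.get("Id"),
        "power": system.get("PowerState", "Unknown"),
        "health": health,
        # Anything other than "OK" is worth a closer look during triage.
        "needs_attention": health != "OK",
    }


if __name__ == "__main__":
    print(summarise_system(SAMPLE_SYSTEM))
```

In practice the payload would come from an authenticated HTTPS request to the management controller, which is left out here to keep the example self-contained.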

Showcase Your Automation Skills

The job requires scripting and automation experience, so be prepared to discuss your proficiency in Bash or Python. Think of specific projects where you’ve implemented automation to improve processes or reduce incident response times.
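As an example of the kind of small automation worth having ready to discuss, here is a minimal Python sketch that flags filesystems above a usage threshold. The function names, the 90% default threshold, and the `probe` parameter are assumptions made for illustration; the only library call used is the standard `shutil.disk_usage`.

```python
import shutil


def usage_percent(used: int, total: int) -> float:
    """Return used space as a percentage of total space."""
    return 100.0 * used / total


def check_mounts(mounts, threshold=90.0, probe=shutil.disk_usage):
    """Return (mount, percent) pairs for mounts at or above the threshold.

    `probe` defaults to shutil.disk_usage and can be swapped out for testing.
    """
    alerts = []
    for mount in mounts:
        usage = probe(mount)  # named tuple with total, used, free (bytes)
        pct = usage_percent(usage.used, usage.total)
        if pct >= threshold:
            alerts.append((mount, round(pct, 1)))
    return alerts


if __name__ == "__main__":
    for mount, pct in check_mounts(["/"]):
        print(f"WARNING: {mount} is {pct}% full")
```

Passing the probe in as a parameter keeps the check easy to test without touching a real filesystem, which is a useful design point to mention when describing your own scripts.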

Communicate Clearly and Effectively

Strong communication skills are key for this role. Practice explaining complex technical concepts in simple terms, as you may need to communicate with non-technical stakeholders. Also, be ready to discuss how you handle incident management and post-incident reviews.