SRE (Linux, Firmware & Server Infrastructure) in Paisley

Paisley | Temporary | Home office (partial)
Networking People (UK) Limited

At a Glance

  • Tasks: Resolve complex platform and hardware incidents while managing firmware lifecycle and server configurations.
  • Company: Join a high-performing enterprise infrastructure team in Glasgow with a hybrid work model.
  • Benefits: Competitive day rate, flexible working, and opportunities for professional growth.
  • Other info: Collaborative culture with excellent career advancement opportunities.
  • Why this job: Make a real impact on critical platforms and enhance your skills in a dynamic environment.
  • Qualifications: Strong Linux expertise and experience in server hardware and incident management.

Contract: Senior Platform Reliability Engineer (Linux, Firmware & Server Infrastructure)

Location: Glasgow (Hybrid - 3 days onsite)

Duration: 6 months

Day Rate: Negotiable (Inside IR35 via umbrella)

Reference: 20460

Overview

We are seeking a Senior Platform Reliability Engineer with deep Linux systems expertise and strong exposure to server hardware, firmware, and low-level infrastructure operations. This role sits within a high-performing enterprise infrastructure team responsible for maintaining and improving the reliability of critical platforms at scale.

The position focuses on resolving complex platform and hardware-related incidents, particularly those escalated from L3 support, with an emphasis on firmware lifecycle management, disk encryption, logging, and BIOS-level server configuration across multi-vendor environments. The hardware work is hands-off: it requires strong remote troubleshooting capabilities, excellent communication skills, and the ability to work closely with internal teams and external vendors to drive issues through to resolution.

Key Responsibilities

  • Own and manage end-to-end incident resolution for platform and hardware-related issues, including triage, mitigation, escalation, and post-incident review
  • Diagnose and troubleshoot Linux OS-level issues arising from hardware faults, firmware changes, or configuration inconsistencies
  • Manage and support firmware lifecycle processes, including upgrades, validation, and issue remediation
  • Work with disk encryption technologies and logging frameworks, ensuring system integrity and auditability
  • Maintain and troubleshoot server configuration settings, including BIOS-level parameters across multiple hardware vendors (strong Dell focus)
  • Use out-of-band management tools (e.g., iDRAC, iLO, RACADM, Redfish APIs) for remote diagnostics and recovery
  • Analyse vendor logs, support bundles, and telemetry data to identify root causes and remediation paths
  • Engage directly with hardware vendors and engineering teams, managing escalations and driving timely resolutions
  • Contribute to continuous improvement initiatives, reducing incident recurrence and operational toil
  • Produce and maintain high-quality documentation, including runbooks, troubleshooting guides, and knowledge base articles
  • Participate in post-incident reviews (RCA) and support improvements in reliability metrics (MTTR, MTTD, SLOs)

Essential Skills & Experience

  • Strong Linux administration and troubleshooting expertise, including:
    • Process and service management
    • System logs and diagnostics
    • Networking fundamentals
    • Package and configuration management
  • Solid understanding of server hardware and infrastructure, including:
    • Disks, RAID/HBA controllers
    • NICs and firmware interactions
    • Hardware failure modes and OS-level symptoms
  • Proven experience with:
    • Firmware management and upgrades
    • Disk encryption and secure configurations
    • BIOS/server configuration management
  • Hands-on experience with remote management and lights-out technologies, such as:
    • iDRAC, iLO
    • RACADM
    • Redfish or similar APIs
  • Strong track record of incident ownership, including:
    • Triage and mitigation
    • Cross-team coordination
    • Stakeholder communication
    • Driving issues through to resolution
  • Experience working with:
    • Vendor diagnostics, logs, and support bundles
    • Vendor escalation processes and engineering engagement
  • Excellent communication skills (written and verbal), with the ability to clearly articulate technical issues to both technical and non-technical stakeholders
  • Strong documentation skills, including creation of runbooks, procedures, and RCA reports

Desirable Skills

  • Scripting and automation experience (e.g., Python, Bash, Ansible)
  • Familiarity with configuration management and automation frameworks
  • Exposure to virtualisation and containerisation technologies (VMware, KVM, Docker, Kubernetes)
  • Experience with monitoring, observability, and alerting systems, including log analysis and alert tuning
  • Understanding of SRE principles and metrics, including SLOs, SLIs, error budgets, MTTR/MTTD

Key Attributes

  • Methodical and detail-oriented approach to troubleshooting
  • Strong sense of ownership and accountability
  • Comfortable working in high-pressure, incident-driven environments
  • Collaborative mindset with the ability to work across global teams and vendors
  • Proactive approach to continuous improvement and operational excellence

Networking People (UK) is acting as an Employment Business in relation to this vacancy.

SRE (Linux, Firmware & Server Infrastructure) in Paisley employer: Networking People (UK) Limited

At Networking People, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters collaboration and innovation. Our Glasgow location provides a hybrid working model, allowing for flexibility while being part of a high-performing team dedicated to maintaining critical infrastructure. We are committed to employee growth, providing opportunities for continuous learning and development in a supportive environment, making us an ideal choice for those seeking meaningful and rewarding careers.

Contact Detail:

Networking People (UK) Limited Recruiting Team

StudySmarter Expert Advice🤫

We think this is how you could land the SRE (Linux, Firmware & Server Infrastructure) role in Paisley

Tip Number 1

Get your networking game on! Reach out to folks in the industry, especially those already working as SREs. A friendly chat can lead to insider info about job openings and even referrals.

Tip Number 2

Prepare for those tricky technical interviews! Brush up on your Linux troubleshooting skills and be ready to discuss real-world scenarios. We recommend practising with mock interviews to boost your confidence.

Tip Number 3

Show off your problem-solving skills! During interviews, share specific examples of how you've tackled complex incidents in the past. This will demonstrate your hands-on experience and ability to manage high-pressure situations.

Tip Number 4

Don't forget to apply through our website! It’s the best way to ensure your application gets noticed. Plus, we love seeing candidates who are proactive about their job search!

We think you need these skills to ace the SRE (Linux, Firmware & Server Infrastructure) role in Paisley

Linux Administration
Troubleshooting Expertise
Server Hardware Knowledge
Firmware Management
Disk Encryption Technologies
BIOS Configuration Management
Remote Management Tools (iDRAC, iLO, RACADM, Redfish)

Some tips for your application 🫡

Tailor Your CV: Make sure your CV highlights your Linux expertise and experience with server hardware. We want to see how your skills match the job description, so don’t be shy about showcasing relevant projects or roles you've had.

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re the perfect fit for the Senior Platform Reliability Engineer role. Share specific examples of how you've tackled complex incidents or improved platform reliability in the past.

Show Off Your Documentation Skills: Since this role involves producing high-quality documentation, include examples of runbooks or troubleshooting guides you've created. This will demonstrate your attention to detail and ability to communicate technical information clearly.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the quickest way for us to receive your application and get you into the process. Don’t wait too long, as we expect a high volume of applications!

How to prepare for a job interview at Networking People (UK) Limited

Know Your Linux Inside Out

Make sure you brush up on your Linux administration skills. Be prepared to discuss troubleshooting techniques, system logs, and diagnostics. They’ll likely ask you about specific scenarios where you've resolved OS-level issues, so have some examples ready!

Familiarise Yourself with Hardware and Firmware

Since this role involves a lot of hardware interaction, it’s crucial to understand server components and firmware management. Review common failure modes and how they manifest at the OS level. Being able to articulate your experience with BIOS configurations and vendor-specific tools will set you apart.

Show Off Your Communication Skills

This position requires excellent communication, especially when dealing with both technical and non-technical stakeholders. Practice explaining complex technical issues in simple terms. You might be asked to demonstrate this during the interview, so think of ways to convey your past experiences clearly.

Prepare for Incident Management Questions

Expect questions around incident ownership and resolution processes. Be ready to discuss how you’ve triaged incidents, coordinated with teams, and driven issues to resolution. Highlight any continuous improvement initiatives you've been part of, as they’re looking for someone who can contribute to reducing operational toil.