Site Reliability Engineer

Full-Time · £48,000 - £72,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Integrate and maintain advanced computing control systems and software platforms.
  • Company: Leading tech firm in Oxford, fostering innovation and collaboration.
  • Benefits: Attractive salary, health perks, flexible working, and growth opportunities.
  • Why this job: Join a dynamic team and make a real impact on cutting-edge technology.
  • Qualifications: Experience in network technologies, Linux, scripting, and CI/CD tools required.
  • Other info: Fast-paced environment with excellent career advancement potential.

The predicted salary is between £48,000 and £72,000 per year.

We are seeking a Senior Control Systems Engineer / Site Reliability Engineer (SRE) to integrate and maintain the hardware and software systems that enable advanced computing control systems and software platforms. You will work closely with software engineers, scientists, hardware engineers, and test engineers to install, upgrade, maintain, test, and troubleshoot multiple hardware and software control systems for complex computing platforms. This is a foundational systems engineering role, responsible for the reliability, stability, and operational functionality of both development and production environments.

Key Responsibilities:

  • Implement, maintain, and test software and hardware within heterogeneous systems that control diverse computing devices.
  • Define, document, implement, and test operational procedures for advanced computing platforms.
  • Manage research, development, and testing infrastructure, including hardware-in-the-loop (HIL) setups, containerized services (e.g., Kubernetes), networking equipment (routers, switches), and test systems.
  • Propose and/or implement automated provisioning, configuration, and orchestration of local and remote compute systems used for control, testing, and simulation.
  • Collaborate with software and test engineering teams to ensure smooth integration and deployment of DevOps tools with custom hardware and specialized workflows.
  • Maintain artifact repositories, test result dashboards, and infrastructure for regression tracking and system health monitoring.
  • Establish and enforce best practices for access control, system configuration, and laboratory operations.
  • Support incident response, troubleshooting, and root cause analysis for CI/CD failures or system anomalies.
  • Implement monitoring and alerting automation by integrating logs and metrics from embedded systems, test environments, and orchestration layers.
  • Work with CI/CD pipelines for building, testing, and deploying software and firmware across control systems.

Required Qualifications:

  • Practical experience implementing LAN and WAN technologies on switches and routers (including VLAN configuration, DNS, DHCP, TCP/IP-based services).
  • Practical Linux (e.g., Ubuntu, Debian, Red Hat) and Windows administration experience, including network operations, application installation, and debugging.
  • Proficiency in scripting languages such as Python, Bash, or Go, and familiarity with tools like Docker, Git, and Kubernetes.
  • Experience with CI/CD tools (e.g., GitLab CI, Jenkins) and infrastructure-as-code platforms (e.g., Ansible, Terraform, or similar).
  • Familiarity with observability tools (e.g., Grafana, Prometheus, ELK stack) and logging systems for real-time monitoring.
  • Hands-on experience with rack-mounted servers.
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
  • 10+ years of experience in Network SQA, Systems Engineering, SRE, or infrastructure engineering roles.

Preferred Qualifications:

  • Experience supporting hybrid systems involving embedded devices, custom hardware, or real-time control systems.
  • Knowledge of secure software deployment and product lifecycle processes.
  • Experience with hardware-in-the-loop pipelines or distributed lab/test automation environments.
  • Exposure to scientific computing, high-performance computing (HPC), or advanced computing software stacks.
  • Ability to thrive in fast-paced, interdisciplinary environments.

Site Reliability Engineer employer: CT19

Join a forward-thinking team in Oxford as a Site Reliability Engineer, where innovation meets collaboration. Our company fosters a dynamic work culture that prioritises employee growth through continuous learning and development opportunities, while offering competitive benefits and a supportive environment. With a focus on cutting-edge technology and interdisciplinary teamwork, we empower our engineers to make a meaningful impact in the realm of advanced computing control systems.

Contact Details:

CT19 Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Site Reliability Engineer role

✨Tip Number 1

Network, network, network! Get out there and connect with folks in the industry. Attend meetups, conferences, or even online webinars related to Site Reliability Engineering. You never know who might have a lead on your dream job!

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those involving CI/CD tools, scripting, or any cool automation you've done. This gives potential employers a taste of what you can bring to the table.

✨Tip Number 3

Don’t just apply blindly! Tailor your approach for each role. Research the company and mention specific projects or values that resonate with you in your conversations. It shows you're genuinely interested and not just sending out cookie-cutter applications.

✨Tip Number 4

Use our website to apply! We’ve got a streamlined process that makes it easy for you to showcase your skills and experience. Plus, it helps us get to know you better right from the start!

We think you need these skills to ace the Site Reliability Engineer role

Control Systems Engineering
Site Reliability Engineering (SRE)
Hardware and Software Integration
Linux Administration
Windows Administration
Networking Technologies (LAN/WAN)
Scripting (Python, Bash, Go)
Containerization (Docker, Kubernetes)
CI/CD Tools (GitLab CI, Jenkins)
Infrastructure as Code (Ansible, Terraform)
Observability Tools (Grafana, Prometheus, ELK stack)
Troubleshooting and Root Cause Analysis
Monitoring and Alerting Automation
Test Systems Management
Collaboration with Software and Test Engineering Teams

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with LAN/WAN technologies, Linux administration, and any relevant scripting skills. We want to see how your background aligns with our needs!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about SRE and how your skills can contribute to our team. Be sure to mention any specific projects or experiences that relate to the job description.

Showcase Your Technical Skills: Don’t hold back on showcasing your technical skills! Mention your proficiency in tools like Docker, Git, and Kubernetes, as well as your experience with CI/CD pipelines. We love seeing candidates who are hands-on and ready to dive into complex systems.

Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It’s super easy, and you’ll be able to submit all your materials in one go. Plus, we love seeing applications come directly from our site!

How to prepare for a job interview at CT19

✨Know Your Tech Inside Out

Make sure you brush up on your knowledge of LAN and WAN technologies, as well as Linux and Windows administration. Be ready to discuss your practical experience with networking, scripting languages like Python or Bash, and tools such as Docker and Kubernetes. The more confident you are in these areas, the better you'll impress the interviewers.

✨Showcase Your Problem-Solving Skills

Prepare to share specific examples of how you've tackled complex issues in previous roles. Think about times when you had to troubleshoot CI/CD failures or system anomalies. Highlight your approach to root cause analysis and how you implemented solutions, as this will demonstrate your hands-on experience and critical thinking abilities.

✨Familiarise Yourself with Their Tools

Research the observability tools and CI/CD platforms mentioned in the job description, like Grafana, Prometheus, GitLab CI, or Jenkins. If you can speak knowledgeably about how you've used these tools in past projects, it will show that you're not just a fit for the role but also genuinely interested in their tech stack.

✨Ask Insightful Questions

Prepare some thoughtful questions to ask at the end of your interview. Inquire about their current challenges with hybrid systems or how they manage their testing infrastructure. This shows that you're engaged and thinking critically about how you can contribute to their team, which is always a plus in an interview.
