At a Glance
- Tasks: Lead efforts in maintaining reliability and performance of critical platforms and services.
- Company: Dynamic tech company focused on cloud and hybrid environments.
- Benefits: Competitive hourly rate, flexible work schedule, and opportunities for professional growth.
- Other info: Collaborative environment with a focus on innovation and continuous improvement.
- Why this job: Make a real impact by solving complex operational challenges with cutting-edge technology.
- Qualifications: 5+ years in SRE or DevOps, strong coding skills, and experience with OpenShift.
The predicted salary is between £56,000 and £72,000 per year.
Location: Wokingham (2 days/week onsite)
Type: Inside IR35
Rate: Up to £70.00 per hour (DOE)
We’re looking for a Senior Site Reliability Engineer (SRE) to lead efforts in maintaining the reliability, performance, and scalability of mission-critical platforms and services. This role is ideal for someone who thrives at the intersection of software engineering, infrastructure, automation, and incident response. You’ll be instrumental in defining and implementing the standards and systems that keep applications running smoothly across cloud and hybrid environments—including OpenShift clusters.
What You’ll Be Responsible For
- Ensure high availability, performance, and latency of critical systems across Azure, AWS, and OpenShift.
- Design and implement robust observability systems (logging, monitoring, alerting) to detect and resolve issues proactively.
- Lead and evolve incident management processes—runbooks, comms, postmortems, and root cause analysis.
- Define and monitor SLIs, SLOs, and error budgets to balance innovation with stability.
- Automate manual processes through infrastructure-as-code, scripting, and modern CI/CD pipelines.
- Mentor engineering teams on best practices for deployment, reliability, scalability, and incident preparedness.
- Support and scale OpenShift-based containerized applications, including upgrade strategies, patching, and workload optimization.
Core Responsibilities
- Operations & Incident Management
  - Act as the senior escalation point for outages and critical incidents.
  - Lead post-incident reviews and implement long-term remediation plans.
  - Communicate platform health and risk posture to stakeholders at all levels.
- Engineering & Automation
  - Build and improve CI/CD pipelines using tools like Azure DevOps, GitHub Actions, Jenkins, and GitLab.
  - Design scalable, fault-tolerant infrastructure with IaC tools (Terraform, Bicep).
  - Create internal tools and automation to accelerate development and reduce operational toil.
- Strategic & Advisory
  - Architect cloud and container infrastructure, with a focus on OpenShift, Kubernetes, and hybrid deployments.
  - Collaborate with engineering, architecture, and security teams to embed reliability into the SDLC.
  - Promote advanced deployment strategies (blue-green, canary, rolling updates) and rollback readiness.
  - Drive a culture of reliability, observability, and operational excellence across engineering teams.
Technical Environment
Hands-on experience with many of the following is expected:
- Cloud & Containers: Azure, AWS, OpenShift, Kubernetes, Docker, App Services, IaaS (EC2, VMs)
- CI/CD & Automation: Terraform, Bicep, Azure DevOps, Jenkins, GitHub Actions, GitLab
- Observability: Prometheus, Grafana, Datadog, ELK, Splunk, Application Insights, CloudWatch
- Languages & Scripting: Python, C#, Bash, PowerShell
- Networking: DNS, SSL/TLS, load balancing, WAF, proxies, CDN, Azure App Gateway
- Databases: MSSQL, PostgreSQL, MongoDB, CosmosDB, DynamoDB
- OS & Systems: Windows, Linux, Nginx, IIS
Ideal Candidate Profile
- 5+ years of experience in SRE, DevOps, or production engineering roles.
- Expertise operating in high-availability, fast-paced production environments.
- Solid engineering foundation with experience reading and writing production code.
- Hands-on experience deploying, supporting, and scaling OpenShift environments.
- Proven track record of leading incident responses and improving system reliability.
- Strong collaboration and mentoring abilities across infrastructure, development, and security teams.
What You’ll Bring
- Ability to balance operational risk with engineering velocity.
- Strong communication skills across technical and non-technical audiences.
- A passion for automating everything and eliminating manual work.
- A mindset of ownership, continuous improvement, and technical leadership.
Ready to make reliability your legacy? If you’re a senior SRE with OpenShift experience and a drive to solve complex operational challenges, we’d love to hear from you.
Senior Site Reliability Engineer in Wokingham
Employer: Trades Workforce Solutions
Join a forward-thinking company in Wokingham that values innovation and operational excellence, offering a collaborative work culture where your expertise as a Senior Site Reliability Engineer will be pivotal in shaping the reliability of mission-critical platforms. With opportunities for professional growth, competitive rates, and a commitment to employee well-being, this role provides a unique chance to thrive in a dynamic environment while working alongside talented teams dedicated to continuous improvement and automation.
Contact Details:
Trades Workforce Solutions Recruitment Team
StudySmarter Expert Advice🤫
We think this is how you could land the Senior Site Reliability Engineer role in Wokingham
✨Tip Number 1
Network like a pro! Attend meetups, conferences, or online webinars related to Site Reliability Engineering. Engaging with industry peers can lead to job opportunities that aren’t even advertised yet.
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those involving OpenShift, CI/CD pipelines, and automation. This gives potential employers a tangible sense of what you can bring to the table.
✨Tip Number 3
Prepare for interviews by brushing up on common SRE scenarios and incident management processes. Practising how you’d handle outages or improve system reliability can really set you apart from other candidates.
✨Tip Number 4
Don’t forget to apply through our website! We’re always on the lookout for talented individuals like you. Plus, it’s a great way to ensure your application gets the attention it deserves.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the Senior Site Reliability Engineer role. Highlight your experience with cloud platforms like Azure and AWS, and don’t forget to mention your hands-on work with OpenShift. We want to see how your skills align with what we’re looking for!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you’re passionate about SRE and how your background makes you a perfect fit for our team. We love seeing candidates who can communicate their enthusiasm and expertise clearly.
Showcase Your Projects: If you’ve worked on any relevant projects, make sure to include them in your application. Whether it’s automating CI/CD pipelines or improving system reliability, we want to know what you’ve done and how it relates to the role. Real-world examples speak volumes!
Apply Through Our Website: We encourage you to apply through our website for the best chance of getting noticed. It helps us keep track of applications and ensures you’re considered for the role. Plus, it’s super easy: just a few clicks and you’re done!
How to prepare for a job interview at Trades Workforce Solutions
✨Know Your Tech Inside Out
Make sure you’re well-versed in the technologies mentioned in the job description, especially Azure, AWS, and OpenShift. Brush up on your knowledge of CI/CD tools like Jenkins and GitHub Actions, as well as observability tools like Prometheus and Grafana. Being able to discuss these confidently will show that you’re ready for the role.
✨Prepare for Incident Management Scenarios
Since incident management is a key part of the role, think of examples from your past experiences where you led incident responses or improved system reliability. Be ready to discuss your approach to post-incident reviews and how you’ve implemented long-term remediation plans. This will demonstrate your hands-on experience and strategic thinking.
✨Showcase Your Automation Skills
Automation is crucial for this position, so come prepared with examples of how you’ve automated processes in previous roles. Whether it’s through infrastructure-as-code with Terraform or scripting in Python, be ready to explain your thought process and the impact of your automation efforts on operational efficiency.
✨Communicate Clearly and Confidently
You’ll need to communicate platform health and risk posture to various stakeholders, so practice articulating complex technical concepts in simple terms. Think about how you can convey your ideas clearly to both technical and non-technical audiences. This will highlight your strong communication skills, which are essential for the role.