Site Reliability Engineer in Portsmouth

Portsmouth | Full-Time | No home office possible

At a Glance

Tasks: Ensure platform reliability and automate operational tasks for a shared PaaS.
Company: Join a dynamic team focused on delivering secure digital services.
Benefits: Competitive daily rate, flexible work environment, and opportunities for professional growth.
Why this job: Make a real impact on service reliability while working with cutting-edge technologies.
Qualifications: Experience in live service operations, automation, and cloud platforms required.
Other info: Collaborative environment with a focus on continuous improvement and agile practices.

As a Site Reliability Engineer (SRE), you will support the reliability, availability, performance, and security of a shared Platform as a Service (PaaS) used by multiple delivery teams. Operating at SFIA Level 4 (Enable), you will apply established SRE practices to ensure platform stability, automate operational tasks, and improve service resilience. You will work closely with platform engineers, developers, security, and live service teams to support the safe, efficient, and timely delivery of digital services in line with DDaT and government standards.

Service reliability & operations

Maintain and improve the availability, reliability, and performance of the PaaS.
Support live services, including incident response, investigation, and resolution, following agreed runbooks and escalation paths.
Participate in on-call rotas and contribute to post-incident reviews (PIRs), identifying root causes and improvement actions.
Monitor platform health using logs, metrics, and alerts, proactively identifying and resolving issues.

Automation & continuous improvement

Automate repeatable operational tasks to reduce toil and improve platform reliability.
Contribute to infrastructure and configuration management using Infrastructure as Code (IaC) approaches.
Support continuous improvement of operational processes, reliability patterns, and resilience practices.

Platform support & collaboration

Support development teams consuming the PaaS, helping them adopt platform standards and reliability best practices.
Work with security and compliance teams to ensure the platform meets government security, resilience, and audit requirements (JSP453).
Contribute to platform documentation, runbooks, and knowledge sharing.
Collaborate within multidisciplinary teams using agile and DevOps practices.

Change & release support

Support safe deployment and release processes, including monitoring changes in live environments.
Assist with capacity planning and performance testing activities.
Ensure changes are implemented in line with change management and live service standards.

Skills

Live service operations & incident management experience.
Strong automation & scripting capability.
Kubernetes (K8s) & cloud compute platform (e.g. AWS) experience.
Experience supporting live digital services in a production environment.
Practical knowledge of cloud platforms and PaaS concepts (e.g. managed compute, networking, storage, CI/CD).
Experience with container platforms (e.g. Kubernetes) or managed PaaS offerings.
Experience with monitoring, logging, and alerting tools (e.g. Prometheus, Grafana, Elastic).
Ability to diagnose and resolve technical issues using established processes and tooling.
Experience writing scripts or automation using languages such as Python, Bash, or similar.
Understanding of reliability engineering concepts, including incident management, resilience, and failure modes.
Ability to work independently on defined tasks and contribute effectively within a team.
Experience using Infrastructure as Code tools (e.g. Terraform, CloudFormation).

Nice to have skills

Experience working in a government or regulated/secure environment.
Familiarity with SRE practices such as error budgets and blameless post-incident reviews.
Knowledge of security and compliance controls relevant to live services.
Experience using Jira and the wider Atlassian suite (e.g. Confluence).

Site Reliability Engineer in Portsmouth employer: Trust In SODA

Join a forward-thinking organisation as a Site Reliability Engineer in Portsmouth, where you will play a crucial role in enhancing the reliability and performance of our Platform as a Service. We pride ourselves on fostering a collaborative work culture that values continuous improvement and innovation, offering ample opportunities for professional growth and development. With a focus on employee well-being and a commitment to government standards, this position not only provides competitive rates but also the chance to make a meaningful impact in a secure environment.

Contact Details:

Trust In SODA Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land Site Reliability Engineer in Portsmouth

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with other SREs on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your automation scripts, IaC projects, or any cool stuff you've built. This gives potential employers a taste of what you can do beyond just a CV.

✨Tip Number 3

Prepare for those interviews! Brush up on your incident management scenarios and be ready to discuss how you've improved service reliability in past roles. Practising common SRE interview questions can really help you stand out.

✨Tip Number 4

Don't forget to apply through our website! We’ve got loads of opportunities that might be perfect for you. Plus, applying directly shows your enthusiasm and commitment to joining our team.

We think you need these skills to ace Site Reliability Engineer in Portsmouth

Site Reliability Engineering

Incident Management

Automation

Scripting

Kubernetes

Cloud Computing (e.g. AWS)

Platform as a Service (PaaS)

Monitoring Tools (e.g. Prometheus, Grafana, Elastic)

Technical Issue Diagnosis

Infrastructure as Code (e.g. Terraform, CloudFormation)

Agile Practices

Collaboration

Change Management

Capacity Planning

Performance Testing

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the Site Reliability Engineer role. Highlight your experience with live service operations, automation, and any relevant cloud platforms like AWS. We want to see how your skills match what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about SRE and how your background makes you a great fit for our team. Don't forget to mention your experience with incident management and resilience practices.

Showcase Your Technical Skills: When listing your technical skills, be specific! Mention your experience with scripting languages like Python or Bash, and any tools you've used for monitoring and logging. We love seeing practical examples of how you've applied these skills in real-world scenarios.

Apply Through Our Website: We encourage you to apply through our website for a smoother application process. It helps us keep track of your application and ensures you don’t miss out on any important updates. Plus, we can't wait to hear from you!

How to prepare for a job interview at Trust In SODA

✨Know Your SRE Practices

Make sure you brush up on established Site Reliability Engineering practices. Be ready to discuss how you've applied these in past roles, especially around incident management and resilience. This will show that you understand the core responsibilities of the role.

✨Demonstrate Automation Skills

Prepare to showcase your automation and scripting capabilities. Bring examples of scripts you've written or operational tasks you've automated, particularly using languages like Python or Bash. This is crucial for reducing toil and improving platform reliability.

✨Familiarise with Cloud Platforms

Since the role involves working with cloud compute platforms like AWS, make sure you're comfortable discussing your experience with them. Be ready to talk about any projects where you've used Kubernetes or other container platforms, as well as monitoring tools like Prometheus or Grafana.

✨Collaboration is Key

Highlight your experience working in multidisciplinary teams and using agile and DevOps practices. Be prepared to discuss how you've collaborated with developers, security teams, and others to ensure service reliability and compliance with government standards.
