Senior Site Reliability Engineer / Technical Architect in Winnersh

United States Digital Space LLC

Winnersh | Full-time | £60,000 - £80,000 per year (est.) | No working from home possible

At a Glance

Tasks: Design and maintain scalable cloud infrastructure while automating processes and improving reliability.
Company: Join a leading tech firm focused on innovation and collaboration.
Benefits: Enjoy flexitime, competitive salary, and opportunities for professional growth.
Other info: Ideal for those passionate about automation and security in tech.
Why this job: Make a real impact in cloud engineering and work with cutting-edge technologies.
Qualifications: 15+ years in SRE, cloud, or DevOps with strong technical skills.

The predicted salary is between £60,000 and £80,000 per year.

We are looking for a highly experienced Senior Site Reliability Engineer / Technical Architect with strong hands‑on expertise in cloud infrastructure, Kubernetes, platform engineering, automation, observability, and AI‑assisted engineering. The ideal candidate will have deep experience designing, building, and operating reliable, scalable, and secure infrastructure on AWS and Azure, using Kubernetes, Terraform, CI/CD, GitOps, and monitoring platforms. The role requires strong ownership of production systems, incident management, automation, and infrastructure standards, plus close collaboration with engineering, security, and platform teams.

Key Responsibilities

Design, build, and maintain scalable cloud infrastructure across AWS and Azure.
Manage Kubernetes platforms (EKS and AKS), together with Helm, Argo CD, and GitOps workflows.
Create reusable Terraform, Ansible, and automation patterns for infrastructure provisioning.
Define and improve SLOs, SLIs, monitoring, alerting, dashboards, and incident response processes.
Implement observability using tools such as Datadog, Grafana, Prometheus, Loki, Tempo, OpenTelemetry, Splunk, and related platforms.
Improve platform reliability, reduce operational toil, and support root cause analysis during incidents.
Support secure infrastructure access using IAM, Okta, Teleport, RBAC, MFA, TLS/PKI, Secrets Manager, and cloud security controls.
Work with CI/CD tools such as Jenkins, GitLab CI, GitHub Actions, and Argo CD to improve deployment reliability.
Support Linux, Windows Server, Active Directory, DNS, DHCP, LDAP, and Group Policy environments.
Manage large-scale GPU/HPC workloads using SLURM and PySpark, including anomaly detection pipelines and bare‑metal provisioning with IPMI and PXE boot.
Apply AI‑assisted engineering tools such as Cursor, Claude Code, GitHub Copilot, AWS Bedrock, Ollama, Datadog Watchdog, and Grafana AI Agents to improve automation, troubleshooting, and delivery.
Partner with engineering, security, and business teams to turn operational and regulatory requirements into practical platform standards.

Required Skills

Strong experience in Site Reliability Engineering, DevOps, Cloud Infrastructure, or Platform Engineering.
Hands‑on experience with AWS services such as EC2, EKS, ECS, Lambda, RDS, S3, VPC, CloudFront, Route 53, IAM, KMS, WAF, and Secrets Manager.
Experience with Azure services including AKS, Virtual Machines, Virtual Networks, Storage Accounts, Load Balancer, Azure Monitor, and Entra ID.
Strong Kubernetes, Docker, Helm, Terraform, Ansible, and GitOps experience.
Good scripting and automation skills using Python, Bash, or similar languages.
Strong monitoring and observability experience with Datadog, Grafana, Prometheus, Loki, Tempo, OpenTelemetry, Splunk, or Nagios.
Experience with incident response, production support, root cause analysis, capacity planning, cost optimisation, and reliability improvement.
Good understanding of networking, DNS, DHCP, LDAP, load balancers, firewalls, CDN, VPN, and security controls.
Experience working in regulated, high‑availability, or large‑scale production environments.

Preferred Certifications

Certified Kubernetes Administrator
AWS Certified Solutions Architect
Red Hat Certified Engineer
Microsoft Certified Solutions Expert
CCNA Routing and Switching / Security

Candidate Profile

This role is suitable for a senior engineer or architect with 15+ years of experience across SRE, cloud, DevOps, infrastructure, and platform engineering. The candidate should be comfortable working across both hands‑on technical delivery and architecture‑level decision making, with a strong focus on reliability, automation, security, and developer productivity.

Job Type: Full‑time

Pay: £45,000.00 per year

Benefits: Flexitime

Licence/Certification: Certified Kubernetes Administrator (required)

Work Location: In person

Employer: United States Digital Space LLC

Join a forward-thinking company that values innovation and collaboration, offering a dynamic work culture where your expertise in cloud infrastructure and Site Reliability Engineering will be highly valued. With a strong focus on employee growth, we provide opportunities for continuous learning and development, alongside flexible working arrangements to ensure a healthy work-life balance. Located in a vibrant area, our team enjoys a supportive environment that encourages creativity and the use of cutting-edge technologies.

Contact Details:

United States Digital Space LLC Recruitment Team

Skills we think you need for this role

Site Reliability Engineering
Cloud Infrastructure
Platform Engineering
Kubernetes
AWS Services (EC2, EKS, ECS, Lambda, RDS, S3, VPC, CloudFront, Route 53, IAM, KMS, WAF, Secrets Manager)
Azure Services (AKS, Virtual Machines, Virtual Networks, Storage Accounts, Load Balancer, Azure Monitor, Entra ID)
Terraform
Ansible
GitOps
Scripting and Automation (Python, Bash)
Monitoring and Observability (Datadog, Grafana, Prometheus, Loki, Tempo, OpenTelemetry, Splunk)
Incident Response
Root Cause Analysis
Networking (DNS, DHCP, LDAP, load balancers, firewalls, CDN, VPN)
AI-assisted Engineering Tools
