At a Glance
- Tasks: Enhance system reliability and performance using AWS cloud and DevOps practices.
- Company: Join Experian, a global leader in data and technology with a people-first culture.
- Benefits: Enjoy hybrid working, competitive pay, healthcare, and generous leave policies.
- Other info: Be part of an award-winning workplace focused on innovation and inclusivity.
- Why this job: Make a real impact on critical systems while collaborating with a diverse team.
- Qualifications: Experience in cloud operations, DevOps, and scripting is essential.
The predicted salary is between £60,000 and £80,000 per year.
Experian is a global data and technology company, powering opportunities for people and businesses around the world. We help to redefine lending practices, uncover and prevent fraud, simplify healthcare, create marketing solutions, and gain deeper insights into the automotive market, all using our unique combination of data, analytics and software. We also assist millions of people to realize their financial goals and help them save time and money.
We invest in people and new advanced technologies to unlock the power of data. As a FTSE 100 Index company listed on the London Stock Exchange (EXPN), we have a team of 22,500 people across 32 countries. Our corporate headquarters are in Dublin, Ireland.
We are looking for a Site Reliability Engineer (SRE) to improve the reliability and performance of business-critical systems. You will focus on AWS cloud infrastructure, DevOps tooling, and core SRE practices within a distributed production environment. Reporting to our Lead, you will work with development, platform, and operations teams to ensure systems are stable, scalable and well-monitored, and that they meet defined reliability targets.
Main Responsibilities
- Support high availability, scalability and performance of production systems
- Work with defined SLIs, SLOs and SLAs, ensuring services meet agreed reliability targets
- Identify and reduce operational toil through automation and process improvement
- Contribute to the design and implementation of fault-tolerant and resilient systems
- Participate in resilience and failure testing activities to validate system behaviour under fault conditions and improve recovery
- Manage and operate systems hosted on AWS (EC2, EKS/ECS, RDS, S3, Lambda, CloudWatch, IAM, and VPC)
- Support cloud deployments and infrastructure changes following best practices
- Help with backup, disaster recovery and resiliency planning
- Work with CI/CD pipelines and DevOps practices to ensure reliable and repeatable deployments, including build, test and release automation processes
- Use Infrastructure as Code tools such as Terraform or CloudFormation to manage and provision infrastructure
- Develop automation using scripting languages (Python, Bash or similar) to reduce operational toil and improve efficiency
- Participate in production incident response, troubleshooting, and service restoration
- Perform root cause analysis (RCA) and contribute to post-incident reviews
- Help implement preventive actions to avoid incident recurrence
- Configure and maintain monitoring, logging, and alerting using tools like CloudWatch, Prometheus, Grafana, Splunk, or Dynatrace
- Develop dashboards to track system and platform health and reliability metrics across the user journey
- Improve alert quality to reduce noise and improve response times
- Work with application and engineering teams to embed reliability into system design
- Collaborate within a globally distributed team, using clear handovers to ensure continuity
- Share knowledge and contribute to team-wide best practices
- Communicate with all kinds of stakeholders, influencing decisions through reliability-focused insights
Qualifications
- Experience in production support, DevOps, SRE, cloud operations, or systems engineering
- Hands-on experience with AWS cloud services, including compute, container and serverless workloads
- Practical experience with CI/CD pipelines and DevOps practices, including Git-based version control, pull request workflows, code reviews, and deployment automation
- Experience with SRE principles, monitoring, and reliability engineering practices
- Proficiency in scripting (Python, Bash, or similar) for automation and operational tooling
- Experience with Linux systems and troubleshooting production issues
- Exposure to data platforms and data pipelines
- Understanding of data reliability concepts
- Experience supporting or operating complex distributed systems
Benefits package includes hybrid working, great compensation and discretionary bonus. Core benefits include pension, Bupa healthcare, Sharesave scheme and more. 25 days annual leave with 8 bank holidays and 3 volunteering days. You can purchase additional annual leave.
We take our people agenda very seriously and focus on what matters: DEI, work/life balance, development, authenticity, collaboration, wellness, reward & recognition, volunteering... the list goes on. Experian's people-first approach is award-winning: World's Best Workplaces™ 2024 (Fortune Top 25), Great Place To Work™ in 24 countries, and Glassdoor Best Places to Work 2024, to name a few.
Experian is proud to be an Equal Opportunity and Affirmative Action employer. Innovation is an important part of Experian's DNA and practices, and our diverse workforce drives our success. Everyone can succeed at Experian and bring their whole self to work, irrespective of their gender, ethnicity, religion, colour, sexuality, physical ability or age. If you have a disability or special need that requires accommodation, please let us know at the earliest opportunity.
This is a hybrid remote/in-office role.
Senior Site Reliability Engineer (SRE) in Nottingham. Employer: Experian
Contact Details:
Experian Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Senior Site Reliability Engineer (SRE) role in Nottingham
✨Tip Number 1
Network like a pro! Reach out to current or former employees at Experian on LinkedIn. A friendly chat can give you insider info and maybe even a referral, which can really boost your chances.
✨Tip Number 2
Prepare for the interview by brushing up on your AWS and DevOps knowledge. Make sure you can talk confidently about your experience with cloud services and automation tools. We want to see that you can walk the walk!
✨Tip Number 3
Show off your problem-solving skills! Be ready to discuss past incidents you've managed and how you improved system reliability. Real-life examples will make you stand out as a candidate who can handle the pressure.
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining the Experian team.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the Senior Site Reliability Engineer role. Highlight your experience with AWS, DevOps practices, and any relevant SRE principles. We want to see how your skills align with what we're looking for!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about reliability engineering and how you can contribute to our team. Keep it concise but impactful – we love a good story!
Showcase Your Projects: If you've worked on any projects that demonstrate your cloud expertise or automation skills, make sure to mention them. We’re keen to see real-world examples of your work and how you’ve tackled challenges in production environments.
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, you’ll find all the details you need about the role and our company culture there!
How to prepare for a job interview at Experian
✨Know Your AWS Inside Out
Make sure you brush up on your AWS knowledge, especially the services mentioned in the job description like EC2, EKS, and Lambda. Be ready to discuss how you've used these services in past projects and how they can improve system reliability.
✨Demonstrate Your SRE Mindset
Prepare to talk about your experience with SRE principles, SLIs, SLOs, and SLAs. Think of specific examples where you've implemented these concepts to enhance system performance and reliability.
✨Show Off Your Automation Skills
Be ready to discuss your experience with automation tools and scripting languages like Python or Bash. Share examples of how you've reduced operational toil through automation and improved efficiency in previous roles.
✨Communicate Clearly and Collaboratively
Since collaboration is key in this role, practice articulating your thoughts clearly. Prepare to discuss how you've worked with cross-functional teams and how you ensure smooth handovers and continuity in a distributed environment.