At a Glance
- Tasks: Manage cloud infrastructure, automate processes, and ensure system reliability.
- Company: Join Prodege, a leading marketing and consumer insights platform driving innovation.
- Benefits: Enjoy competitive salary, flexible work options, and opportunities for professional growth.
- Why this job: Make a real impact on cloud infrastructure while collaborating with diverse teams.
- Qualifications: 4+ years in IT operations, strong AWS and Terraform skills required.
- Other info: Dynamic environment with a focus on continuous improvement and cutting-edge technology.
The predicted salary is between £36,000 and £60,000 per year.
The Site Reliability Engineer II (SRE II) plays a key role in our cloud and data infrastructure, ensuring our products are delivered on a stable, scalable, and secure foundation. By owning AWS/Terraform environments, CI/CD pipelines, and MySQL performance, this role directly impacts release velocity, site reliability, and overall customer experience. Through thoughtful automation and scripting, the SRE II reduces operational toil, increases consistency, and frees engineering teams to focus on feature delivery. Tight collaboration with cross-functional teams, paired with strong monitoring, incident response, and documentation, creates predictable, repeatable operations as we grow. By embedding security best practices and continuously evaluating new tools and approaches, this role helps the organization operate more efficiently while de-risking our infrastructure over time.
Primary Objectives:
- Infrastructure Management: Utilize Terraform to define and provision AWS infrastructure. Configure and maintain AWS services (e.g., EC2, S3, RDS, Lambda, VPC).
- Automation & Scripting: Develop and manage automation scripts and tools using Bash, Python, and PHP to streamline operations and enhance efficiency.
- Database Management: Support and manage MySQL databases, including performance tuning, backups, and recovery.
- CI/CD Integration: Implement and manage continuous integration and continuous deployment (CI/CD) pipelines using Jenkins.
- Monitoring & Optimization: Monitor system performance, availability, and resource usage. Implement optimizations to enhance system efficiency and reliability.
- Incident Management: Troubleshoot and resolve infrastructure issues, outages, and performance problems swiftly and effectively.
- Collaboration: Work with cross-functional teams to support application deployments and address infrastructure needs.
- Documentation: Create and maintain comprehensive documentation for infrastructure configurations, processes, and procedures.
- Security: Ensure that all infrastructure and operations adhere to security best practices and compliance standards.
- Continuous Improvement: Evaluate and adopt new technologies and practices to improve infrastructure performance and operational efficiency.
Success in this role will include consistently delivering a stable, secure, and scalable AWS infrastructure that supports our products without surprise outages or performance bottlenecks. Deployments run smoothly through well-maintained CI/CD pipelines and automation, with minimal manual intervention and short lead times for changes. Systems are actively monitored, with incidents investigated quickly, root causes documented, and meaningful preventative fixes implemented. Cross-functional teams feel supported because infrastructure needs are anticipated, clearly communicated, and backed by up-to-date documentation. Over time, you’re recognized for reducing operational toil, improving system performance and cost efficiency, and thoughtfully introducing new tools and practices that elevate how we run production.
The MUST Haves:
- Bachelor’s degree (or equivalent) in Computer Science, Software Engineering, Information Technology, or a related discipline; or equivalent professional experience gained in a similar infrastructure/DevOps engineering role.
- Four or more (4+) years of experience in IT operations or a related role with a strong focus on Terraform and AWS.
- Proficiency in Terraform for infrastructure as code (IaC) management.
- Hands-on experience with AWS services (e.g., EC2, S3, RDS, Lambda).
- Experience with scripting languages including Bash and Python/PHP.
- Experience managing MySQL and RDS databases.
- Knowledge of Jenkins for CI/CD pipeline management.
- Strong analytical and troubleshooting skills with the ability to resolve complex infrastructure issues.
- Excellent verbal and written communication skills, with the ability to convey technical information clearly to both technical and non-technical stakeholders.
The Nice to Haves:
- AWS knowledge is required; experience with Google Cloud Platform (GCP) is a plus.
- Knowledge across multiple cloud providers.
- Certification in public cloud disciplines will be an advantage.
- Hands-on experience with Docker and Kubernetes.
- Experience applying AI to deployment and uptime automation.
Site Reliability Engineer II (SRE II) employer: Prodege, LLC
Contact Details:
Prodege, LLC Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Site Reliability Engineer II (SRE II) role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry on LinkedIn or at meetups. A friendly chat can lead to opportunities that aren’t even advertised yet.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repo showcasing your projects, especially those involving AWS, Terraform, and CI/CD. This gives potential employers a taste of what you can do.
✨Tip Number 3
Prepare for interviews by practising common SRE scenarios. Think about how you’d handle incidents or optimise systems. We want to see your problem-solving skills in action!
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive!
Some tips for your application 🫡
Tailor Your Application: Make sure to customise your CV and cover letter for the SRE II role. Highlight your experience with AWS, Terraform, and CI/CD pipelines, as these are key to what we’re looking for. Show us how your skills align with our needs!
Showcase Your Projects: If you've worked on relevant projects, don’t hold back! Include specific examples of how you’ve used automation, scripting, or database management to solve problems. We love seeing real-world applications of your skills.
Keep It Clear and Concise: When writing your application, clarity is key. Use straightforward language and avoid jargon where possible. We want to understand your experience without having to decode it!
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re genuinely interested in joining our team!
How to prepare for a job interview at Prodege, LLC
✨Know Your Tech Stack
Make sure you’re well-versed in AWS services, Terraform, and CI/CD pipelines. Brush up on your knowledge of EC2, S3, RDS, and Jenkins. Being able to discuss your hands-on experience with these tools will show that you’re ready to hit the ground running.
✨Showcase Your Problem-Solving Skills
Prepare to discuss specific incidents where you’ve troubleshot infrastructure issues or optimised performance. Use the STAR method (Situation, Task, Action, Result) to structure your answers, highlighting how you resolved complex problems effectively.
✨Emphasise Collaboration
Since this role involves working with cross-functional teams, be ready to share examples of how you’ve collaborated with others in past projects. Highlight your communication skills and how you’ve supported team members in achieving shared goals.
✨Security Best Practices Matter
Familiarise yourself with security protocols relevant to cloud infrastructure. Be prepared to discuss how you’ve implemented security measures in previous roles and how you stay updated on best practices to ensure compliance and safety.