At a Glance
- Tasks: Design and maintain cloud infrastructure for cutting-edge AI systems and optimise CI/CD pipelines.
- Company: Join AgileRL, a leader in reinforcement learning technology with a collaborative culture.
- Benefits: Competitive salary, stock options, 30 days holiday, flexible working, and a £500 learning budget.
- Why this job: Be part of the AI revolution and work on impactful projects that shape the future.
- Qualifications: Experience in DevOps, cloud platforms, and containerisation technologies required.
- Other info: Dynamic team environment with excellent career growth and regular social events.
The predicted salary is between £36,000 and £60,000 per year.
At AgileRL, we are on a mission to accelerate reinforcement learning for building superhuman artificial intelligence systems. We offer Arena, an enterprise-grade reinforcement learning operations (RLOps) platform and a state-of-the-art open-source framework to accelerate RL development. Arena focuses on simulation, training, deployment and monitoring to enable scalable reinforcement learning workflows. We work with companies across industries to deliver autonomous solutions and are looking for talented engineers to develop the systems and tools that will enable the next wave of impactful AI.
Responsibilities
- Design and maintain robust, scalable cloud infrastructure to support high-performance reinforcement learning workloads and distributed training environments.
- Build and optimise CI/CD pipelines for both our open-source framework and Arena enterprise platform, ensuring reliable deployments and automated testing.
- Implement and manage containerisation strategies using Docker and Kubernetes for ML model training, deployment, and orchestration.
- Develop infrastructure as code (IaC) solutions using tools like Terraform, CloudFormation, or Pulumi to ensure reproducible and version-controlled infrastructure.
- Monitor system performance, implement alerting and logging solutions, and troubleshoot production issues across distributed ML training environments.
- Collaborate with ML engineers to optimise resource allocation and cost efficiency for compute-intensive RL training workloads.
- Implement security best practices, manage access controls, and ensure compliance with enterprise security requirements.
- Automate operational tasks including backup strategies, disaster recovery procedures, and system maintenance.
- Support the deployment and scaling of GPU clusters and distributed computing resources for reinforcement learning applications.
- Maintain high availability and performance of production systems serving ML models to external customers.
Requirements
- Bachelor's degree or higher in Computer Science, Engineering, or a related field, or 3+ years of relevant DevOps/infrastructure experience.
- Strong experience with cloud platforms (AWS, GCP, Azure) and their ML/AI services, with expertise in managing compute-intensive workloads.
- Proficiency in containerisation technologies (Docker, Kubernetes) and container orchestration for ML workloads.
- Experience with Infrastructure as Code tools (Terraform, CloudFormation, Pulumi) and configuration management.
- Solid understanding of CI/CD principles and tools (GitHub Actions, GitLab CI, Jenkins) with experience in ML pipeline automation.
- Knowledge of monitoring and observability tools (Prometheus, Grafana, OpenObserve) and their application to ML systems.
- Experience with GPU infrastructure management and distributed computing frameworks for machine learning.
- Familiarity with MLOps practices and tools for model deployment, versioning, and lifecycle management.
- Strong scripting skills in Python, Bash, or similar languages for automation tasks.
- Understanding of networking, security, and database management in cloud environments.
- Experience with high-performance computing environments and job scheduling systems is a plus.
- Knowledge of machine learning workflows and the unique infrastructure requirements of ML training and inference.
- Strong problem-solving skills and ability to work in a fast-paced, collaborative environment.
- Excellent communication skills and experience working with cross-functional teams.
Compensation
- Competitive salary + significant stock options.
- 30 days of holiday, plus bank holidays, per year.
- Flexible working from home and 6-month remote working policies.
- Enhanced parental leave.
- Learning budget of £500 per calendar year for books, training courses and conferences.
- Company pension scheme.
- Regular team socials and quarterly all-company parties.
Learn more about AgileRL at https://agilerl.com.
DevOps Engineer in London
Employer: AgileRL
Contact: AgileRL Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the DevOps Engineer role in London
✨Network Like a Pro
Get out there and connect with folks in the industry! Attend meetups, webinars, or even online forums. The more people you know, the better your chances of landing that DevOps Engineer role at AgileRL.
✨Show Off Your Skills
Create a portfolio showcasing your projects, especially those involving cloud infrastructure, CI/CD pipelines, and containerisation. This is your chance to demonstrate your expertise and make a lasting impression on potential employers.
✨Ace the Interview
Prepare for technical interviews by brushing up on your knowledge of AWS, Docker, and Kubernetes. Practise common DevOps scenarios and be ready to discuss how you've tackled challenges in previous roles. Confidence is key!
✨Apply Through Our Website
Don't forget to apply directly through our website! It shows you're genuinely interested in joining AgileRL and gives you a better chance of being noticed by our hiring team. Let's get you on board!
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the DevOps Engineer role. Highlight your experience with cloud platforms, containerisation, and CI/CD pipelines. We want to see how your skills align with our mission at AgileRL!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Share your passion for reinforcement learning and how you can contribute to our team. Keep it concise but impactful – we love a good story!
Showcase Relevant Projects: If you've worked on any projects related to ML or DevOps, make sure to mention them. We’re keen to see your hands-on experience with tools like Docker, Kubernetes, and Terraform. It’s all about demonstrating your expertise!
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates. Let’s get started on this journey together!
How to prepare for a job interview at AgileRL
✨Know Your Tech Stack
Make sure you’re well-versed in the technologies mentioned in the job description, like Docker, Kubernetes, and Terraform. Brush up on your cloud platform knowledge too, whether it’s AWS, GCP, or Azure. Being able to discuss your hands-on experience with these tools will show that you’re ready to hit the ground running.
✨Showcase Your Problem-Solving Skills
Prepare to discuss specific challenges you've faced in previous roles, especially those related to CI/CD pipelines or infrastructure management. Use the STAR method (Situation, Task, Action, Result) to structure your answers, making it clear how you approached problems and what the outcomes were.
✨Understand MLOps Practices
Since the role involves working closely with ML engineers, it’s crucial to have a solid grasp of MLOps practices. Be ready to talk about how you’ve implemented model deployment and lifecycle management in past projects. This will demonstrate your ability to collaborate effectively in a cross-functional team.
✨Ask Insightful Questions
Interviews are a two-way street, so prepare some thoughtful questions about AgileRL's approach to reinforcement learning and their future projects. This not only shows your interest in the company but also gives you a chance to assess if it’s the right fit for you.