DevOps Engineer in City of London

City of London | Full-Time | £30,000 - £50,000 per year (est.) | No home office possible

At a Glance

Tasks: Design and maintain cloud infrastructure for cutting-edge reinforcement learning platforms.
Company: Join a fast-growing team at AgileRL, focused on innovative ML tooling.
Benefits: Competitive salary, stock options, 30 days holiday, and flexible remote work.
Why this job: Be part of a dynamic team shaping the future of reinforcement learning technology.
Qualifications: Experience in DevOps, cloud platforms, and containerisation technologies required.
Other info: Enjoy a vibrant culture with regular socials and a learning budget.

The predicted salary is between £30,000 and £50,000 per year.

We are seeking a talented and experienced DevOps Engineer to join our team. This engineer will contribute to the further development of Arena, a web-based software platform for reinforcement learning training and RLOps. As a DevOps Engineer, you will be responsible for designing, implementing, and maintaining the cloud infrastructure, CI/CD pipelines, and deployment systems that enable businesses to build and deploy reinforcement learning models at scale.

Responsibilities

Design and maintain robust, scalable cloud infrastructure to support high-performance reinforcement learning workloads and distributed training environments.
Build and optimise CI/CD pipelines for both our open-source framework and Arena enterprise platform, ensuring reliable deployments and automated testing.
Implement and manage containerisation strategies using Docker and Kubernetes for ML model training, deployment, and orchestration.
Develop infrastructure as code (IaC) solutions using tools like Terraform, CloudFormation, or Pulumi to ensure reproducible and version-controlled infrastructure.
Monitor system performance, implement alerting and logging solutions, and troubleshoot production issues across distributed ML training environments.
Collaborate with ML engineers to optimise resource allocation and cost efficiency for compute-intensive RL training workloads.
Implement security best practices, manage access controls, and ensure compliance with enterprise security requirements.
Automate operational tasks including backup strategies, disaster recovery procedures, and system maintenance.
Support the deployment and scaling of GPU clusters and distributed computing resources for reinforcement learning applications.
Maintain high availability and performance of production systems serving ML models to external customers.

Requirements

Bachelor's degree or higher in Computer Science, Engineering, or a related field, or 3+ years of relevant DevOps/infrastructure experience.
Strong experience with cloud platforms (AWS, GCP, Azure) and their ML/AI services, with expertise in managing compute-intensive workloads.
Proficiency in containerisation technologies (Docker, Kubernetes) and container orchestration for ML workloads.
Experience with Infrastructure as Code tools (Terraform, CloudFormation, Pulumi) and configuration management.
Solid understanding of CI/CD principles and tools (GitHub Actions, GitLab CI, Jenkins) with experience in ML pipeline automation.
Knowledge of monitoring and observability tools (Prometheus, Grafana, OpenObserve) and their application to ML systems.
Experience with GPU infrastructure management and distributed computing frameworks for machine learning.
Familiarity with MLOps practices and tools for model deployment, versioning, and lifecycle management.
Strong scripting skills in Python, Bash, or similar languages for automation tasks.
Understanding of networking, security, and database management in cloud environments.
Experience with high-performance computing environments and job scheduling systems is a plus.
Knowledge of machine learning workflows and the unique infrastructure requirements of ML training and inference.
Strong problem-solving skills and ability to work in a fast-paced, collaborative environment.
Excellent communication skills and experience working with cross-functional teams.

Compensation

Competitive salary + significant stock options.
30 days of holiday, plus bank holidays, per year.
Flexible working from home and 6‑month remote working policies.
Enhanced parental leave.
Learning budget of £500 per calendar year for books, training courses and conferences.
Company pension scheme.
Regular team socials and quarterly all-company parties.
Bike2Work scheme.

Join the fast-growing AgileRL team and play a key role in the development of cutting-edge reinforcement learning tooling and infrastructure.

DevOps Engineer in City of London employer: AgileRL Ltd

At AgileRL, we pride ourselves on being an exceptional employer that fosters a collaborative and innovative work culture. As a DevOps Engineer, you will enjoy a competitive salary, significant stock options, and a generous holiday allowance, alongside flexible working arrangements that promote work-life balance. Our commitment to employee growth is evident through our annual learning budget and regular team socials, making AgileRL a rewarding place to advance your career in the exciting field of reinforcement learning.

Contact Details:

AgileRL Ltd Recruiting Team


StudySmarter Expert Advice 🤫

We think this is how you could land the DevOps Engineer role in City of London

✨Network Like a Pro

Get out there and connect with folks in the industry! Attend meetups, webinars, or even online forums. The more people you know, the better your chances of hearing about job openings before they hit the market.

✨Show Off Your Skills

Create a portfolio showcasing your projects, especially those related to cloud infrastructure and CI/CD pipelines. Having tangible examples of your work can really set you apart from other candidates.

✨Ace the Interview

Prepare for technical interviews by brushing up on your knowledge of Docker, Kubernetes, and IaC tools. Practice common interview questions and be ready to discuss how you've tackled challenges in past roles.

✨Apply Through Our Website

Don't forget to apply directly through our website! It shows you're genuinely interested in joining our team and helps us keep track of your application more efficiently.

We think you need these skills to ace the DevOps Engineer role in City of London

Cloud Infrastructure Design

CI/CD Pipeline Optimisation

Containerisation (Docker, Kubernetes)

Infrastructure as Code (Terraform, CloudFormation, Pulumi)

Monitoring and Observability (Prometheus, Grafana, OpenObserve)

GPU Infrastructure Management

MLOps Practices

Scripting (Python, Bash)

Networking and Security in Cloud Environments

Machine Learning Workflows

Problem-Solving Skills

Communication Skills

Collaboration with Cross-Functional Teams

Some tips for your application 🫡

Be Authentic: When answering the longer-form questions, let your personality shine through! We want to hear your genuine thoughts and experiences, so avoid using AI-generated responses. Show us what makes you tick and why you're excited about this role.

Tailor Your Responses: Make sure to align your answers with the job description. Highlight your relevant experience in cloud infrastructure, CI/CD pipelines, and containerisation. This will help us see how you fit into our team and the AgileRL mission.

Showcase Your Problem-Solving Skills: In your application, don’t just list your skills; demonstrate how you've used them to solve real-world problems. Share specific examples of challenges you've faced in DevOps and how you tackled them. We love a good story!

Apply Through Our Website: We encourage you to submit your application directly through our website. It’s the best way for us to keep track of your application and ensure it gets the attention it deserves. Plus, it’s super easy!

How to prepare for a job interview at AgileRL Ltd

✨Know Your Tech Stack

Make sure you’re well-versed in the specific technologies mentioned in the job description, like AWS, Docker, and Kubernetes. Brush up on your knowledge of CI/CD principles and tools, as well as Infrastructure as Code solutions. Being able to discuss these confidently will show that you're ready to hit the ground running.

✨Showcase Your Problem-Solving Skills

Prepare examples from your past experiences where you've tackled complex issues, especially in cloud infrastructure or ML workloads. Use the STAR method (Situation, Task, Action, Result) to structure your answers. This will help demonstrate your analytical thinking and ability to work under pressure.

✨Understand the Company’s Mission

Familiarise yourself with AgileRL's mission and how your role as a DevOps Engineer fits into it. Be ready to articulate what excites you about contributing to their reinforcement learning platform. This shows genuine interest and alignment with their goals.

✨Prepare for Technical Questions

Expect technical questions that assess your understanding of cloud infrastructure, containerisation, and monitoring tools. Practice explaining your thought process clearly and concisely. You might even want to do some mock interviews with friends or colleagues to get comfortable with the format.

AgileRL Ltd (company size: 50-100)
