At a Glance
- Tasks: Deploy and manage cutting-edge AI/ML infrastructure and collaborate on innovative CI/CD pipelines.
- Company: Join a dynamic tech company at the forefront of AI/ML innovation.
- Benefits: Competitive salary, flexible work options, and opportunities for professional growth.
- Other info: Thriving culture that embraces challenges and encourages continuous improvement.
- Why this job: Make a real impact in a fast-paced environment while working with advanced technologies.
- Qualifications: Experience in model training, Kubernetes, and cloud platforms is essential.
The predicted salary is between £60,000 and £80,000 per year.
As a Platform engineer, MLOps, you will deploy and manage the cutting‑edge infrastructure that underpins our AI/ML operations, and you will collaborate with AI/ML engineers and researchers to develop a robust CI/CD pipeline that supports safe and reproducible experiments. You will also set up and maintain monitoring, logging, and alerting systems to oversee extensive training runs and client‑facing APIs. You will ensure that training environments are consistently available and efficiently managed across multiple clusters, and you will improve our containerization and orchestration systems using tools such as Docker and Kubernetes.
This role demands a proactive approach to maintaining large Kubernetes clusters, optimizing system performance, and providing operational support for our suite of software solutions. If you are driven by challenges and motivated by the continuous pursuit of innovation, this role offers the opportunity to make a significant impact in a dynamic, fast‑paced environment.
Your responsibilities:
- Work closely with AI/ML engineers and researchers to design and deploy a CI/CD pipeline that ensures safe and reproducible experiments.
- Set up and manage monitoring, logging, and alerting systems for extensive training runs and client‑facing APIs.
- Ensure training environments are consistently available and prepared across multiple clusters.
- Develop and manage containerization and orchestration systems utilizing tools such as Docker and Kubernetes.
- Operate and oversee large Kubernetes clusters with GPU workloads.
- Improve reliability, quality, and time‑to‑market of our suite of software solutions.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
- Provide primary operational support and engineering for multiple large‑scale distributed software applications.
Is this you?
You have professional experience with:
- Model training
- Hugging Face Transformers
- PyTorch
- LLMs
- TensorRT
- Infrastructure as code tools like Terraform
- Scripting languages such as Python or Bash
- Cloud platforms such as Google Cloud, AWS or Azure
- Git and GitHub workflows
- Tracing and monitoring
- High‑performance, large‑scale ML systems
You have a knack for troubleshooting complex systems and enjoy solving challenging problems. You are proactive in identifying problems, performance bottlenecks, and areas for improvement. You take pride in building and operating scalable, reliable, secure systems. You are comfortable with ambiguity and rapid change.
Preferred skills and experience:
- Familiarity with monitoring tools such as Prometheus, Grafana, or similar.
- 5+ years building core infrastructure.
- Experience running inference clusters at scale.
- Experience operating orchestration systems such as Kubernetes at scale.
Platform engineer, MLOps employer: Writer
As a Platform engineer, MLOps at our company, you will thrive in a collaborative and innovative environment that prioritises employee growth and development. We offer competitive benefits, a dynamic work culture that embraces challenges, and the opportunity to work with cutting-edge technology in a fast-paced setting. Join us to make a meaningful impact while enjoying the unique advantages of working in a location that fosters creativity and teamwork.
StudySmarter Expert Advice🤫
We think this is how you could land the Platform engineer, MLOps role
✨Tip Number 1
Network like a pro! Reach out to folks in the industry, attend meetups, and connect with people on LinkedIn. You never know who might have the inside scoop on job openings or can refer you directly.
✨Tip Number 2
Show off your skills! Create a portfolio or GitHub repository showcasing your projects, especially those related to MLOps, Docker, and Kubernetes. This gives potential employers a taste of what you can do beyond your CV.
✨Tip Number 3
Prepare for interviews by brushing up on common technical questions and scenarios related to CI/CD pipelines and Kubernetes management. Practise explaining your thought process clearly; it’s all about demonstrating your problem-solving skills.
✨Tip Number 4
Don’t forget to apply through our website! We love seeing candidates who are genuinely interested in joining us at StudySmarter. Tailor your application to highlight how your experience aligns with our needs in MLOps.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV reflects the skills and experiences that match the Platform Engineer, MLOps role. Highlight your experience with CI/CD pipelines, Kubernetes, and any relevant cloud platforms. We want to see how you can contribute to our innovative environment!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about AI/ML operations and how your background makes you a perfect fit for our team. We love seeing enthusiasm and a proactive mindset!
Showcase Your Projects: If you've worked on any relevant projects, whether personal or professional, make sure to mention them. We’re interested in your hands-on experience with tools like Docker, Kubernetes, and any ML systems you've tackled. It helps us understand your practical skills!
Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way for us to receive your application and ensures you don’t miss out on any important updates. Plus, we love seeing applications come in through our own channels!
How to prepare for a job interview at Writer
✨Know Your Tech Stack
Make sure you’re well-versed in the tools and technologies mentioned in the job description, like Docker, Kubernetes, and CI/CD pipelines. Brush up on your experience with model training and cloud platforms, as these will likely come up during the interview.
✨Showcase Problem-Solving Skills
Prepare to discuss specific challenges you've faced in previous roles, especially related to troubleshooting complex systems. Use the STAR method (Situation, Task, Action, Result) to structure your answers and highlight how you proactively identified and solved issues.
✨Demonstrate Collaboration
Since this role involves working closely with AI/ML engineers and researchers, be ready to share examples of successful collaborations. Talk about how you’ve contributed to team projects and how you communicate technical concepts to non-technical stakeholders.
✨Ask Insightful Questions
Prepare thoughtful questions that show your interest in the company’s projects and future direction. Inquire about their current challenges with Kubernetes clusters or how they measure success in their CI/CD processes. This not only shows your enthusiasm but also helps you gauge if the role is a good fit for you.