At a Glance
- Tasks: Design and maintain robust infrastructure and data pipelines for AI projects.
- Company: Join a pioneering AI company aiming to revolutionise the spicy content industry.
- Benefits: Enjoy a competitive salary, equity, and a flexible remote work culture.
- Why this job: Be part of a fast-paced team working on cutting-edge AI technology with high impact.
- Qualifications: 5+ years in DevOps or Data Engineering, strong Python skills, and cloud expertise required.
- Other info: Opportunity for professional growth and exposure to innovative AI/ML technologies.
The predicted salary is between £43,200 and £72,000 per year.
Location: Remote
Type: Full-time
Experience Level: Senior
Industry: Generative AI / Artificial Intelligence / Machine Learning
Reports To: Head of Engineering / CTO
About Us
Ready to join a cutting-edge AI company? We’re on a mission to become the OpenAI of the spicy content industry, building a full-spectrum ecosystem of revolutionary AI infrastructure and products. Our platform, OhChat, features digital twins of real-world personalities and original AI characters, enabling users to interact with lifelike AI-generated characters through text, voice, and images, with a roadmap that includes agentic superModels, API integrations, and video capabilities.
Role Overview
We are looking for a Senior DevOps Specialist with a strong Python and data engineering background to support our R&D and tech teams by designing, building, and maintaining robust infrastructure and data pipelines across AWS and GCP. You will be instrumental in ensuring our systems are scalable, observable, cost-effective, and secure. This role is hands-on, cross-functional, and central to our product and research success.
Key Responsibilities
- DevOps & Infrastructure
- Design, implement, and maintain infrastructure on AWS and Google Cloud Platform (GCP) to support high-performance computing workloads and scalable services.
- Collaborate with R&D teams to provision and manage compute environments for model training and experimentation.
- Maintain and monitor systems, implement observability solutions (e.g., logging, metrics, tracing), and proactively resolve infrastructure issues.
- Manage CI/CD pipelines for rapid, reliable deployment of services and models.
- Ensure high availability, disaster recovery, and robust security practices across environments.
- Data Engineering
- Build and maintain data processing pipelines for model training, experimentation, and analytics.
- Work closely with machine learning engineers and researchers to understand data requirements and workflows.
- Design and implement solutions for data ingestion, transformation, and storage using tools such as Scrapy, Playwright, agentic workflows (e.g., crawl4ai), or equivalent.
- Optimize and benchmark AI training, inference, and data workflows to ensure high performance, scalability, cost-efficiency, and an exceptional customer experience.
- Maintain data quality, lineage, and compliance across multiple environments.
Key Requirements
- 5+ years of experience in DevOps, Site Reliability Engineering, or Data Engineering roles.
- Deep expertise with AWS and GCP, including services like EC2, S3, Lambda, IAM, GKE, BigQuery, and more.
- Strong proficiency in infrastructure-as-code tools (e.g., Terraform, Pulumi, CloudFormation).
- Extensive hands-on experience with Docker, Kubernetes, and CI/CD tools such as GitHub Actions, Bitbucket Pipelines, or Jenkins, along with a strong ability to optimize CI/CD workflows and AI training and inference pipelines for performance and reliability.
- Exceptional programming skills in Python. You are expected to write clean, efficient, and production-ready code. You should be highly proficient with modern Python programming paradigms and tooling.
- Proficiency in data-centric programming and scripting languages beyond Python (e.g., SQL, Bash).
- Proven experience designing and maintaining scalable ETL/ELT pipelines.
- Focused, sharp, and results-oriented: You are decisive, work with a high degree of autonomy, and consistently deliver high-quality results. You are quick to understand and solve the core of a problem and know how to summarize it efficiently for stakeholders.
- Effective communicator and concise in reporting: You should be able to communicate technical insights in a clear and actionable manner, both verbally and in written form. Your reports should be precise, insightful, and aligned with business objectives.
Nice to Have
- Experience supporting AI/ML model training infrastructure (e.g., GPU orchestration, model serving) for both diffusion and LLM pipelines.
- Familiarity with data lake architectures and tools like Delta Lake, LakeFS, or Databricks.
- Knowledge of security and compliance best practices (e.g., SOC2, ISO 27001).
- Exposure to MLOps platforms or frameworks (e.g., MLflow, Kubeflow, Vertex AI).
What We Offer
- Competitive salary + equity
- Flexible work environment and remote-friendly culture
- Opportunities to work on cutting-edge AI/ML technology
- Fast-paced environment with high impact and visibility
- Professional growth support and resources
DevOps Specialist employer: OhChat
Contact Details:
OhChat Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the DevOps Specialist role
✨Tip Number 1
Familiarise yourself with the specific tools and technologies mentioned in the job description, such as AWS, GCP, Docker, and Kubernetes. Having hands-on experience or projects that showcase your skills with these platforms can set you apart from other candidates.
✨Tip Number 2
Network with professionals in the AI and DevOps fields. Attend relevant meetups, webinars, or online forums where you can connect with current employees or industry experts. This can provide valuable insights into the company culture and potentially lead to referrals.
✨Tip Number 3
Prepare to discuss your previous experiences in detail, especially those related to building and maintaining infrastructure and data pipelines. Be ready to share specific examples of challenges you've faced and how you overcame them, as this demonstrates your problem-solving abilities.
✨Tip Number 4
Showcase your communication skills by being clear and concise when discussing technical topics. Practice explaining complex concepts in simple terms, as effective communication is crucial for collaborating with cross-functional teams in this role.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights relevant experience in DevOps, Data Engineering, and your proficiency with AWS and GCP. Use keywords from the job description to ensure your application stands out.
Craft a Compelling Cover Letter: In your cover letter, express your passion for AI and how your skills align with the company's mission. Mention specific projects or experiences that demonstrate your expertise in building scalable infrastructure and data pipelines.
Showcase Technical Skills: Clearly outline your technical skills, especially in Python, Docker, Kubernetes, and CI/CD tools. Provide examples of how you've used these technologies in past roles to solve complex problems or improve processes.
Highlight Communication Abilities: Since effective communication is key for this role, include examples of how you've successfully communicated technical insights to non-technical stakeholders. This will show that you can bridge the gap between technical and business objectives.
How to prepare for a job interview at OhChat
✨Showcase Your Technical Skills
Be prepared to discuss your experience with AWS and GCP in detail. Highlight specific projects where you've implemented infrastructure-as-code tools like Terraform or CloudFormation, and be ready to explain how you optimised CI/CD workflows.
✨Demonstrate Problem-Solving Abilities
Expect to face scenario-based questions that assess your ability to troubleshoot and resolve infrastructure issues. Use examples from your past experiences to illustrate how you approached complex problems and the solutions you implemented.
✨Communicate Clearly and Concisely
As effective communication is key for this role, practice explaining technical concepts in a straightforward manner. Be ready to summarise your reports and insights in a way that aligns with business objectives, ensuring clarity for non-technical stakeholders.
✨Prepare for Collaboration Questions
Since the role involves working closely with R&D teams, think of examples that showcase your collaborative skills. Be ready to discuss how you've worked with machine learning engineers or researchers to meet data requirements and improve workflows.