Responsibilities
- Contribute to building and evolving the platform (infrastructure + reusable abstractions) that standardises data engineering workloads (batch/streaming pipelines, data processing) and traditional ML workflows (feature engineering, training, batch/real-time serving) across teams.
- Implement platform-level IaC, CI/CD, and environment management to support consistent, reproducible workloads across dev/test/prod.
- Build and maintain components using Python and Spark for data processing, shared datasets, and platform services.
- Contribute to shared services for data and ML lifecycle management (data pipelines, experiment tracking, versioning, lineage, permissions), aligned to enterprise governance (e.g. Unity Catalog).
- Support the implementation and operation of a centralised AgentOps capability (LLM gateway, tool integration, prompt and version management).
- Contribute to agent‑specific lifecycle and safety controls (evaluation pipelines, guardrails, access control), with guidance from senior engineers.
- Enhance observability across both domains:
  - Data & ML Ops: data quality, pipeline reliability, model performance.
  - Agent Ops: traces, responses, evaluations, cost and behaviour monitoring.
- Help diagnose and resolve reliability, performance, and security issues across data, ML, and agent workloads.
- Apply security and compliance best practices (RBAC/ACLs, secure configuration, identity and access management), supporting a secure‑by‑default platform design.
- Collaborate with Data Engineers, Data Scientists, and ML Engineers to enable adoption of platform capabilities across ASOS Tech.
- Contribute to documentation, standards, and best practices across the platform.
Qualifications
- Experience in Data Platforms, Data Engineering, Cloud Engineering, or ML Platform Engineering roles, with exposure to Azure.
- Strong hands‑on experience with Python and Apache Spark.
- Experience with Azure data platform technologies such as Azure Databricks, ADLS Gen2, and Unity Catalog.
- Working knowledge of security and access management (RBAC, ACLs, identity concepts such as Entra ID).
- Familiarity with Infrastructure as Code using Terraform.
- Experience with CI/CD (Azure DevOps, GitHub Actions).
- Basic understanding of Azure networking (VNets, NSGs, Private Endpoints).
- Exposure to Docker/Kubernetes in cloud environments is beneficial.
- Awareness of AgentOps patterns (LLM gateways, prompt/version control, evaluation, observability) is a plus.
- Good communication and collaboration skills, with a strong focus on learning and continuous improvement.
Benefits
- Employee discount (hello ASOS discount!).
- Employee sample sales.
- 25 days' paid annual leave, plus an extra celebration day for a special moment.
- Private medical care scheme.
- Fixed Annual Payment in addition to your salary each year, as an extra thank you from us.
- Opportunity for personalised learning and in-the-moment experiences that enable you to thrive and excel in your role.