At a Glance
- Tasks: Lead the design and evolution of ML infrastructure and MLOps capabilities.
- Company: Join iProov, a leader in biometric identity assurance with a diverse culture.
- Benefits: Enjoy competitive pay, performance bonuses, share options, and generous leave.
- Other info: Dynamic hybrid work environment with opportunities for mentorship and growth.
- Why this job: Make a real impact by building scalable systems for cutting-edge machine learning.
- Qualifications: Experience in MLOps and cloud infrastructure, plus strong software engineering skills, required.
The predicted salary is between £60,000 and £80,000 per year.
About iProov
iProov provides science-based biometric solutions that enable the world’s most security-conscious organizations to streamline secure remote onboarding and authentication for digital and physical access. Our award-winning liveness technology and iSOC offer unmatched resilience against deepfakes and generative AI threats while ensuring effortless, scalable user experiences. Trusted by leading governments and enterprises, including the U.S. Department of Homeland Security, U.K. Home Office, GovTech Singapore, ING, and UBS, iProov sets the standard in biometric identity assurance.
This global trust is built not only on our technology but on the strength of the people behind it. For us, diversity at iProov is about reflecting the customers we serve, holding the principles of equality and inclusion at the heart of everything we do and all that we stand for, embracing differences, creating possibilities, and growing together. We aim to foster a culture where individuals of all backgrounds feel confident in bringing their whole selves to work, feel included, and have their talents nurtured, empowering them to contribute fully to our purpose.
The Role
- Reports to: Chief Scientific Officer
- Location: WeWork Waterloo - Hybrid
- Comp: Negotiable (Base) + Company Performance Bonus (20%) + Share Options + iProov Benefits
We are looking for a highly capable and hands-on Senior ML Infrastructure Lead to build and scale the technical foundations that enable machine learning to operate effectively in production. This is a hybrid leadership role sitting across machine learning infrastructure, platform engineering and MLOps. You will be responsible for designing and evolving the systems, tooling, processes and standards that allow ML teams to train, deploy, monitor and improve models reliably, securely and at scale. You will work at the intersection of machine learning, software engineering, data, cloud infrastructure and platform reliability, helping bridge the gap between research and production. This role is ideal for someone who can think strategically about long-term platform capability, while still being technically hands-on enough to solve complex engineering and operational challenges.
How you can make an impact
- Lead the design and evolution of our ML platform, infrastructure and MLOps capability
- Build and maintain scalable, reliable and secure systems for model training, testing, deployment, monitoring and lifecycle management
- Develop the infrastructure and tooling that enable ML Engineers, Data Scientists and Researchers to work efficiently and ship models with confidence
- Design robust workflows for CI/CD, model versioning, reproducibility, experimentation, feature management and release management
- Own and improve the production environment for machine learning systems, ensuring strong standards for availability, performance, observability and resilience
- Define and implement monitoring across model and platform layers, including system health, data quality, drift, latency, throughput and cost efficiency
- Build or optimise internal self-service tooling and platform capabilities to reduce friction for teams working on ML use cases
- Partner closely with ML, Data, Software and Platform Engineering teams to productionise models and improve the end-to-end ML development lifecycle
- Support the scaling of infrastructure for both training and inference workloads, including high-throughput, real-time or compute-intensive use cases where relevant
- Drive best practice in governance, security, compliance, auditability and operational rigour across the ML lifecycle
- Improve the efficiency and cost-effectiveness of ML systems, including cloud resource usage, compute environments and deployment patterns
- Mentor engineers and act as a technical leader across ML platform and operations topics
- Help define the roadmap for ML enablement, ensuring the platform can support current needs while scaling for future growth
What we would like to see from you
- You will have experience working in high-growth, fast-paced, tech-first environments. You are passionate about building and launching quality products that have a positive impact.
- You’re an experienced product leader with a background in security, identity (IAM), or enterprise SaaS. You combine strategic vision with operational rigour, and you’re motivated by delivering usable, secure, and elegant solutions to complex technical problems.
- Proven experience in a senior MLOps, ML Platform, ML Infrastructure, Platform Engineering or Machine Learning Systems role
- Strong hands-on background in software engineering and cloud infrastructure, ideally with direct experience supporting production machine learning environments
- Experience building and operating systems that support the full ML lifecycle, from experimentation and training through to deployment and monitoring
- Strong knowledge of Python and sound engineering principles, including testing, automation and code quality
- Strong experience with cloud platforms such as GCP
- Experience with Docker, Kubernetes and modern containerised deployment patterns
- Strong experience with CI/CD pipelines, infrastructure-as-code and workflow orchestration
- Experience with tools such as Airflow or similar platform and orchestration technologies
- Good understanding of model observability, data quality, feature pipelines, lineage and reproducibility
- Experience designing scalable infrastructure for ML workloads, including training, batch inference and real-time serving
- Strong appreciation of reliability, security, governance and operational excellence in customer-facing or production-critical systems
- Ability to operate across both strategic and hands-on technical work
- Strong communication skills and the ability to work effectively across engineering, product and data teams
Nice-to-haves
- Experience supporting computer vision, deep learning, LLM or other compute-intensive ML workloads
- Experience with GPU infrastructure, distributed training or high-performance compute environments
- Familiarity with feature stores, model registries and automated retraining pipelines
- Experience building internal developer platforms or self-service ML tooling
- Experience in regulated, high-security or high-availability environments
- Experience leading or mentoring engineers in a scale-up or high-growth technology business
- Familiarity with responsible AI, model governance or risk controls in production ML settings
Benefits
25 days Annual Leave, plus 8
Employer: iProov
iProov is an exceptional employer that champions diversity and inclusion, fostering a collaborative work culture where every individual can thrive. Based at WeWork Waterloo, employees benefit from a hybrid working model, competitive compensation packages including performance bonuses and share options, and ample opportunities for professional growth in a cutting-edge tech environment focused on biometric security solutions. With a commitment to nurturing talent and empowering teams, iProov stands out as a place where meaningful contributions are valued and innovation is at the forefront.