At a Glance
- Tasks: Lead a team managing AI/ML applications from test to production using cutting-edge MLOps frameworks.
- Company: Join BT Group, a leader in digital infrastructure and innovation.
- Benefits: Competitive salary, career growth, and the chance to shape the future of connectivity.
- Why this job: Make a real impact on AI systems and revolutionise customer experiences.
- Qualifications: Experience in cloud deployment, IT service delivery, and agile methodologies.
- Other info: Be part of a dynamic team driving unprecedented change in the telecom industry.
The predicted salary is between £36,000 and £60,000 per year.
We are looking for an AI Infrastructure & Application Manager to lead a team of engineers responsible for running a suite of AI/ML applications from test through to production, covering CI/CD, deployment, monitoring, version control, optimisation and drift detection, using an enterprise MLOps framework and AWS native services. You will also own the observability design and implementation for the serverless infrastructure behind these applications, ensuring it is fit for purpose for production operations, incident response, auditability, cost transparency and service reliability. This is a hands-on leadership role: you will set technical direction, define operational standards, and coach engineers while collaborating closely with data science, product, security and platform teams. You will shape how AI systems are run in production, building the standards, tooling and culture that make AI/ML and agentic applications reliable, observable, secure and cost-effective at enterprise scale.
Responsibilities
- Lead a team of technical engineers to manage the full AI/ML application lifecycle across test/preprod/prod environments, ensuring repeatable, reliable releases.
- Implement and mature an MLOps framework covering code/data/model versioning, automated testing, release governance, rollback strategies and environment promotion controls.
- Own production readiness for AI/ML workloads: SLOs, runbooks, operational dashboards, support processes, incident response and post-incident RCA improvements.
- Design and operate CI/CD for ML solutions using patterns such as SageMaker model registry, controlled approvals and secure promotion of model artefacts through environments.
- Develop a deep understanding of the underlying use case and of the data used to develop and train the models.
- Implement model monitoring (e.g. data quality, model quality, bias drift, feature attribution drift) and alerting that drives automated responses, such as retraining triggers and controlled redeployments.
- Put in place drift detection, evaluation routines, and model performance reporting; partner with data science to define thresholds, baselines and acceptance criteria.
- Establish operational controls for agentic systems, such as policy boundaries, auditing of tool usage, quality evaluation and performance monitoring, aligned to enterprise requirements.
- Support production operations of generative AI applications using Amazon Bedrock and Amazon Bedrock AgentCore capabilities to deploy and operate agents securely at scale, with strong governance.
- Design and implement end-to-end observability for serverless services (e.g., Lambda, Step Functions, EventBridge, APIs), including structured logs, metrics, distributed traces, dashboards, alerting and correlation across workflows.
- Monitor agent behaviour, token usage/cost trends, latency, workflow health and security access patterns; drive continuous improvement and cost optimisation with FinOps-aligned reporting.
- Define standards for documentation, change management and quality gates that reduce MTTR and improve platform reliability.
Qualifications / Skills
- Cloud Deployment
- Cloud Strategy
- IT Service Delivery
- Cloud Security
- Cloud Architecture/Design
- Cloud Migration
- Virtualisation
- Agile Methodologies
- Cloud Operations
- Continuous Integration/Continuous Deployment
- Automation & Orchestration
- Cloud Storage
- Software Development Lifecycle
- Project/Programme Management
- Talent Management
- Decision Making
- Growth Mindset
- Performance Management
- Inclusive Leadership
BT Group was the world's first telco and our heritage in the sector is unrivalled. As home to several of the UK's most recognised and cherished brands (BT, EE, Openreach and Plusnet), we have always played a critical role in creating the future, and we have reached an inflection point in the transformation of our business. Over the next two years, we will complete the UK's largest and most successful digital infrastructure project, connecting more than 25 million premises to full fibre broadband. Together with our heavy investment in 5G, we play a central role in revolutionising how people connect with each other.
While we are through the most capital-intensive phase of our fibre investment, meaning we can reward our shareholders for their commitment and patience, we are absolutely focused on how we organise ourselves in the best way to serve our customers in the years to come. This includes radical simplification of systems, structures, and processes on a huge scale. Together with our application of AI and technology, we are on a path to creating the UK's best telco, reimagining the customer experience and relationship with one of this country's biggest infrastructure companies.
Change on the scale we will all experience in the coming years is unprecedented. BT Group is committed to being the driving force behind improving connectivity for millions and there has never been a more exciting time to join a company and leadership team with the skills, experience, creativity, and passion to take this company into a new era.
Looking in:
- Leading inclusively and safely: I inspire and build trust through self-awareness, honesty and integrity.
- Owning outcomes: I take the right decisions that benefit the broader organisation.
Looking out:
- Delivering for the customer: I execute brilliantly on clear priorities that add value to our customers and the wider business.
- Commercially savvy: I demonstrate strong commercial focus, bringing an external perspective to decision-making.
Looking to the future:
- Growth mindset: I experiment and identify opportunities for growth for both myself and the organisation.
- Building for the future: I build diverse future-ready teams where all individuals can be at their best.
AI Infrastructure and Applications Manager in London
Employer: BT Group
Contact: BT Group Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the AI Infrastructure and Applications Manager role in London
✨Tip Number 1
Network like a pro! Get out there and connect with folks in the AI/ML space. Attend meetups, webinars, or even just grab a coffee with someone in the industry. You never know who might have the inside scoop on job openings!
✨Tip Number 2
Show off your skills! Create a portfolio showcasing your projects, especially those related to AI infrastructure and applications. This is your chance to demonstrate your hands-on experience and technical direction – make it shine!
✨Tip Number 3
Prepare for interviews by diving deep into the company’s tech stack and recent projects. Be ready to discuss how you can lead a team in managing AI/ML applications and what operational standards you’d set. We want to see your passion and expertise!
✨Tip Number 4
Don’t forget to apply through our website! It’s the best way to ensure your application gets seen. Plus, we love seeing candidates who take that extra step to engage with us directly.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV is tailored to the AI Infrastructure & Application Manager role. Highlight your experience with MLOps frameworks, AWS services, and any leadership roles you've held. We want to see how your skills align with what we're looking for!
Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about AI/ML applications and how you can contribute to our team. Be sure to mention specific projects or experiences that relate to the responsibilities outlined in the job description.
Showcase Your Technical Skills: In your application, don't forget to showcase your technical skills relevant to cloud deployment, CI/CD processes, and observability design. We love seeing concrete examples of how you've implemented these in past roles, so be specific!
Apply Through Our Website: We encourage you to apply through our website for a smoother process. It helps us keep track of your application and ensures you get all the updates directly from us. Plus, it shows you're keen on joining the team!
How to prepare for a job interview at BT Group
✨Know Your MLOps Inside Out
Make sure you have a solid understanding of MLOps frameworks and how they apply to AI/ML applications. Be ready to discuss your experience with CI/CD processes, model versioning, and automated testing. This will show that you can lead the team effectively in managing the full application lifecycle.
✨Demonstrate Leadership Skills
Prepare examples of how you've successfully led teams in the past. Highlight your ability to coach engineers and collaborate with cross-functional teams. This role is hands-on leadership, so showcasing your inclusive leadership style and decision-making skills will be crucial.
✨Get Familiar with AWS Services
Brush up on AWS native services relevant to AI/ML, such as SageMaker and Lambda. Be prepared to discuss how you've used these tools in previous roles, especially in terms of deployment and observability. This knowledge will help you stand out as a candidate who can hit the ground running.
✨Prepare for Technical Questions
Expect technical questions related to cloud architecture, security, and operational controls. Review key concepts and be ready to explain how you've implemented solutions in these areas. Showing your technical depth will reassure the interviewers of your capability to shape AI systems in production.