Senior Software Engineer (ML Quality Assurance) in London

London | Full-Time | No working from home possible
Graphcore

Requirements

  • Experience in production-quality software engineering roles
  • Strong software design and architecture skills, with experience working on large or complex systems
  • Strong proficiency in Python, including experience building and maintaining production codebases
  • Solid experience with CI/CD systems and automated testing (preferably GitHub-based workflows)
  • Experience working in Linux environments
  • Familiarity with C or C++, with the ability to read, debug, and reason about low‑level code when needed
  • Proven ability to mentor junior engineers and influence engineering practices within a team
  • Strong problem‑solving skills and a proactive, self‑directed approach to work
  • Bachelor's, Master's, or PhD degree, or equivalent experience, in Computer Science, Maths, Machine Learning, Data Science, or a related field
  • (Desirable) Exposure to machine learning frameworks such as PyTorch, JAX, Triton, or TensorFlow
  • (Desirable) Experience with distributed workload management systems such as Kubernetes, vLLM, Keras, or MLOps pipelines
  • (Desirable) Experience working with hardware simulators or emulators (e.g. QEMU)
  • (Desirable) Experience developing for or working with FPGA-based systems
  • (Desirable) Experience with people management or mentoring

What the job involves

  • Applicants for this role should have strong experience designing, developing, and maintaining high‑quality software systems
  • The role focuses on testing and validating a complex machine learning software stack, with particular emphasis on software architecture, automation, and engineering best practices
  • The ideal candidate is an experienced software engineer who values code quality, testability, and long‑term maintainability, and enjoys building systems that other engineers rely on
  • This person will be comfortable working across large codebases, contributing to CI/CD infrastructure, and shaping technical direction through thoughtful design and mentoring in a technically demanding environment spanning ML frameworks, infrastructure, and AI accelerator hardware
  • Design, implement, and maintain robust test infrastructure and automation for a complex ML software stack
  • Architect and evolve test frameworks and tooling with a focus on scalability, maintainability, and developer experience
  • Build and maintain CI/CD pipelines targeting simulators, emulators (e.g. QEMU), and physical hardware
  • Create representative ML workloads and gain insights from their execution (numerical accuracy, performance analysis, and benchmarking)
  • Work closely with all software development teams, supporting a culture of quality, security, and maintainability
  • Review code and designs, setting a high bar for software engineering best practices
  • Mentor and support junior engineers, helping raise the overall technical capability of the team
  • Evaluate existing test strategies and infrastructure, identifying gaps and driving improvements aligned with team and organizational goals

Contact Details:

Graphcore Recruitment Team