Machine Learning Performance Engineer in City of London

City of London | Full-Time | No home office possible

Overview

We are looking for an engineer with experience in low-level systems programming and optimisation to join our growing ML team.

Machine learning is a critical pillar of Jane Street’s global business. Our ever-evolving trading environment serves as a unique, rapid-feedback platform for ML experimentation, allowing us to incorporate new ideas with relatively little friction.

Your part here is optimising the performance of our models – both training and inference. We care about efficient large-scale training, low-latency inference in real-time systems and high-throughput inference in research. Part of this is improving straightforward CUDA, but the interesting part needs a whole-systems approach, including storage systems, networking and host- and GPU-level considerations. Zooming in, we also want to ensure our platform makes sense even at the lowest level – is all that throughput actually goodput? Does loading that vector from the L2 cache really take that long?

If you’ve never thought about a career in finance, you’re in good company. If you have a curious mind and a passion for solving interesting problems, we have a feeling you’ll fit right in.

Responsibilities

Responsibilities centre on optimising model performance and system integration across training and inference, taking a whole-systems approach that extends beyond CUDA to storage, networking, and host- and GPU-level considerations.

Qualifications

  • An understanding of modern ML techniques and toolsets
  • The experience and systems knowledge required to debug a training run’s performance end to end
  • Low-level GPU knowledge of PTX, SASS, warps, cooperative groups, Tensor Cores and the memory hierarchy
  • Debugging and optimisation experience using tools like CUDA GDB, Nsight Systems and Nsight Compute
  • Library knowledge of Triton, CUTLASS, CUB, Thrust, cuDNN and cuBLAS
  • Intuition about the latency and throughput characteristics of CUDA graph launch, Tensor Core arithmetic, warp-level synchronisation and asynchronous memory loads
  • Background in InfiniBand, RoCE, GPUDirect, PXN, rail optimisation and NVLink, and how to use these networking technologies to link up GPU clusters
  • An understanding of the collective algorithms supporting distributed GPU training in NCCL or MPI
  • An inventive approach and the willingness to ask hard questions about whether we’re taking the right approaches and using the right tools

Contact Detail:

Jane Street Recruiting Team
