Senior Remote Support Engineer

Full-Time No home office possible

Join NScale as a Senior HPC Support Engineer

NScale is the GPU cloud engineered for AI. We provide cost-effective, high-performance infrastructure for AI startups and enterprises. Our platform reduces the complexity of AI development, empowering our customers to achieve faster innovation and better outcomes. Our mission is to enable the AI breakthroughs of tomorrow by delivering exceptional infrastructure today.

At NScale, we’re builders at heart, driven by ownership, innovation, and urgency. We’re looking for a Senior HPC Support Engineer to join our fast-growing team, focused on enabling and optimising HPC and AI workloads on GPU-accelerated infrastructure. You’ll work directly with customers solving some of the most complex problems in AI, helping them troubleshoot and optimise performance in compute-intensive, distributed environments. This is a hands-on role requiring deep technical acumen, exceptional problem-solving ability, and comfort working across a diverse set of technologies including GPUs (NVIDIA and AMD), InfiniBand networking, and orchestration systems like Slurm.

  • Provide expert-level support for customer HPC and AI workloads running in production.
  • Troubleshoot complex system-level issues across networking, storage, containers, and GPUs.
  • Collaborate with engineering and vendor partners to resolve hardware/software compatibility and performance issues.
  • Develop internal tools and automation to improve support workflows.
  • Participate in on-call rotations to support high-priority incidents and escalations.

  • Proven experience supporting HPC and/or AI workloads in production environments.
  • Proficiency with system-level debugging, including kernel modules and network interfaces.
  • OpenMPI), InfiniBand, and high-speed Ethernet networking.
  • Solid Linux administration skills and troubleshooting experience.
  • Experience with monitoring tools such as Prometheus, Grafana, and DCGM.
  • Kubernetes expertise, particularly in HPC or AI workload contexts.
  • Familiarity with distributed file systems and advanced storage configurations.
  • Exposure to machine learning frameworks and AI optimisation workflows.
  • Scripting skills in Python, Bash, or similar for automation and tooling.
  • Experience in designing and implementing processes to optimise deployment workflows.

Please note: We’re currently working remotely, but plan to transition to a hybrid working model in 2025 as we look to secure a modern office space in London.

We strongly encourage applications from people of colour, the LGBTQ+ community, people with disabilities, neurodivergent people, parents, carers, and people from lower socio-economic backgrounds.

Contact Details:

Nscale Recruiting Team

  • Senior Remote Support Engineer

    Full-Time

    Application deadline: 2027-05-27

  • Nscale
