At a Glance
- Tasks: Manage and optimise NVIDIA GPU servers for AI and HPC projects.
- Company: Join Darktrace, a leader in cyber security AI, protecting over 9,000 customers globally.
- Benefits: Enjoy 23 days holiday, private medical insurance, and a cycle to work scheme.
- Why this job: Be part of innovative projects in a collaborative environment with cutting-edge technology.
- Qualifications: Strong problem-solving skills and experience in system administration, especially with HPC and GPU servers.
- Other info: This hybrid role requires 2 days a week in the Cambridge office.
The predicted salary is between £43,200 and £72,000 per year.
Systems Software Engineer (HPC)
Location: Cambridge Office
Time type: Full time
Posted: 5 days ago
Job requisition ID: 1338
Founded by mathematicians and cyber defense experts in 2013, Darktrace is a global leader in cyber security AI, delivering complete AI-powered solutions in its mission to free the world of cyber disruption. We protect more than 9,000 customers from the world’s most complex threats, including ransomware, cloud, and SaaS attacks.
Our roots lie deep in innovation. The Darktrace AI Research Centre, based in our Cambridge, UK headquarters, has conducted research establishing new thresholds in cyber security, with technology innovations backed by over 130 patents and pending applications.
For more information on our cutting-edge technology, visit darktrace.com.
Job Description:
Darktrace is seeking an experienced Systems Software Engineer (HPC) to manage, maintain, and optimize a dedicated NVIDIA GPU server and cloud environments for innovation projects. Responsibilities include setting up, configuring, and maintaining the servers and software stack. A successful candidate will work directly with Darktrace researchers and software engineers, ensuring optimal performance and availability for ongoing AI and HPC projects.
This is a hybrid role, with compulsory attendance 2 days a week in the Cambridge office.
This role focuses on maintaining and optimising the Linux operating system, file systems, and software stack (CUDA, PyTorch, Python, etc.) for machine learning projects. It also covers setting up and configuring NVIDIA HGX servers (installing and updating software, managing user access, and ensuring optimal performance) and cloud infrastructure for GPU compute projects (managing access and ensuring optimal performance). Additional responsibilities include:
- Monitoring server and application performance, identifying bottlenecks, and taking corrective actions to maintain high availability
- Implementing and maintaining server security, including patch management, vulnerability scanning, and intrusion detection
- Collaborating with network administrators, hardware engineers, and researchers to troubleshoot and resolve server and software-related issues
- Working closely with the project manager to ensure efficient resource allocation, server utilisation, and scaling across multiple teams
- Collaborating with data scientists and machine learning engineers to understand their software requirements and provide guidance on best practices
- Assisting in training team members on the capabilities and usage of the HGX servers and the software environment
- Developing multi-use tooling to work with the HPC environments
What experience do I need:
We welcome applications from engineers with strong problem-solving and creative thinking skills, excellent communication skills, and the ability to work in a collaborative team environment. You will be an independent thinker with a startup mindset. Technology-wise, you will have experience in system administration, preferably with a focus on HPC platforms, GPU-based servers, and machine learning software environments, as well as familiarity with AI and HPC provisioning and management, both on-premises and in the cloud. You will have experience with server virtualization and containerization technologies and be well versed in the Linux operating system. Ideally, you will also have:
- Strong knowledge of NVIDIA HGX server architectures and components
- Strong knowledge of AWS or Azure Cloud environments
- Experience with NVIDIA GPU technologies, such as NVLink, NVSwitch, and Tensor Core GPUs
- Experience with machine learning frameworks and libraries, such as PyTorch and associated system optimisations
- Experience with NAS servers
- Experience with data version control systems
Benefits:
- 23 days’ holiday + all public holidays, rising to 25 days after 2 years of service
- Additional day off for your birthday
- Private medical insurance covering you, your cohabiting partner, and children
- Life insurance of 4 times your base salary
- Salary sacrifice pension scheme
- Enhanced family leave
- Confidential Employee Assistance Program
- Cycle to work scheme
Systems Software Engineer (HPC) employer: Darktrace
Contact Detail:
Darktrace Recruiting Team
StudySmarter Expert Advice 🤫
We think this is how you could land the Systems Software Engineer (HPC) role
✨Tip Number 1
Familiarise yourself with the specific technologies mentioned in the job description, such as NVIDIA HGX servers and machine learning frameworks like PyTorch. Having hands-on experience or projects that showcase your skills with these technologies can set you apart.
✨Tip Number 2
Network with professionals in the HPC and AI fields, especially those who work at Darktrace or similar companies. Engaging in relevant online communities or attending industry events can help you gain insights and potentially get referrals.
✨Tip Number 3
Prepare to discuss your problem-solving approach during interviews. Be ready to share specific examples of how you've tackled challenges in system administration or optimised server performance in previous roles.
✨Tip Number 4
Showcase your collaborative skills by highlighting experiences where you've worked closely with cross-functional teams, such as researchers and engineers. This will demonstrate your ability to thrive in a team-oriented environment, which is crucial for this role.
Some tips for your application 🫡
Tailor Your CV: Make sure your CV highlights relevant experience in systems administration, particularly with HPC platforms and GPU-based servers. Emphasise your familiarity with Linux operating systems and machine learning software environments.
Craft a Compelling Cover Letter: In your cover letter, express your passion for cyber security and AI. Mention specific projects or experiences that demonstrate your problem-solving skills and ability to work collaboratively in a team environment.
Showcase Technical Skills: Clearly outline your technical skills related to NVIDIA HGX server architectures, AWS or Azure Cloud environments, and machine learning frameworks like PyTorch. Use bullet points for clarity and impact.
Highlight Collaborative Experience: Since the role involves working closely with researchers and engineers, include examples of past collaborations. Describe how you contributed to team projects and any successful outcomes that resulted from your teamwork.
How to prepare for a job interview at Darktrace
✨Showcase Your Technical Skills
Be prepared to discuss your experience with system administration, particularly in HPC platforms and GPU-based servers. Highlight any specific projects where you've optimised performance or resolved complex issues.
✨Demonstrate Problem-Solving Abilities
Expect to face technical challenges during the interview. Use examples from your past experiences to illustrate how you approached problems creatively and effectively, especially in collaborative environments.
✨Familiarise Yourself with Darktrace's Technology
Research Darktrace's AI-powered solutions and their approach to cyber security. Understanding their technology and how it relates to your role will show your genuine interest and help you ask insightful questions.
✨Prepare for Team Collaboration Questions
Since this role involves working closely with researchers and engineers, be ready to discuss your experience in team settings. Share examples of how you've collaborated on projects and contributed to a positive team dynamic.