HPC Systems Administrator

Full-Time · £36,000 - £60,000 / year (est.) · No home office possible

At a Glance

  • Tasks: Design and manage high-performance computing infrastructures for cutting-edge AI projects.
  • Company: Join Accenture, a leader in technology and innovation.
  • Benefits: Competitive salary, travel opportunities, and continuous learning.
  • Why this job: Be at the forefront of AI technology and make a real impact.
  • Qualifications: Experience in HPC environments and proficiency in AI frameworks.
  • Other info: Dynamic team with a focus on diversity and well-being.

The predicted salary is between £36,000 and £60,000 per year.

Salary: Competitive salary and package (Depending on level of experience)

Locations: UK, London (must be willing to travel to client sites throughout the UK on an ad hoc basis)

Accenture is partnering with scaled UK AI compute pioneers to lead the charge on next-generation infrastructure for sovereign AI. To support this endeavour, we’re building a high-performance compute operations team in London. Our work will be sensitive and secure, and will run on the most up-to-date high-density compute stacks available. Any offer of employment is subject to satisfactory BPSS and SC security clearance, which requires 5 years of continuous UK address history (typically including no periods of 30 consecutive days or more spent outside the UK) at the point of application.

Key Responsibilities:

  • Design, deploy, and manage HPC infrastructures including GPU clusters and parallel computing environments.
  • Support AI model training platforms by maintaining compute resources, optimizing scheduling, and ensuring compatibility with AI frameworks and libraries.
  • Monitor and analyse performance metrics, and fine-tune systems to address bottlenecks or inefficiencies.
  • Develop and maintain automation scripts and tools (e.g., PowerShell, Python, Bash) to streamline operational tasks, monitoring, and reporting.
  • Document architecture, configurations, processes, and resolutions for compliance, knowledge transfer, and continuous improvement.
  • Participate in root cause analysis (RCA) and post-incident reviews for compute or HPC-related incidents, implementing preventive measures as needed.

Required Skills:

  • Expertise in an HPC environment, including GPU cluster administration (e.g., NVIDIA, AMD) and workload schedulers such as SLURM or PBS.
  • Proficiency with AI model training workflows and experience supporting popular AI/ML frameworks and GPU toolkits (e.g., TensorFlow, PyTorch, CUDA).
  • Solid understanding of networking, storage, and server platforms in both Windows and Linux environments.
  • Advanced analytical, troubleshooting, and performance tuning skills, with the ability to diagnose and resolve complex compute and HPC issues.
  • Experience with automation, monitoring platforms, and scripting languages (e.g., Python, PowerShell, Bash) to enhance operational efficiency.
  • Strong communication and collaboration skills, with a track record of working effectively across technical and non-technical teams.
  • Familiarity with compliance, data security, and best practices for compute and HPC environments.

Qualifications:

  • Relevant certifications such as ITIL, NVIDIA DLI, Dell EMC, etc.

Locations: London

Equal Employment Opportunity Statement: All employment decisions shall be made without regard to age, race, creed, color, religion, sex, national origin, ancestry, disability status, veteran status, sexual orientation, gender identity or expression, genetic information, marital status, citizenship status or any other basis as protected by federal, state, or local law. Job candidates will not be obligated to disclose sealed or expunged records of conviction or arrest as part of the hiring process. Accenture is committed to providing veteran employment opportunities to our service men and women.

About Accenture: We work with one shared purpose: to deliver on the promise of technology and human ingenuity. Every day, more than 775,000 of us help our stakeholders continuously reinvent. Together, we drive positive change and deliver value to our clients, partners, shareholders, communities, and each other. We believe that delivering value requires innovation, and innovation thrives in an inclusive and diverse environment. We actively foster a workplace free from bias, where everyone feels a sense of belonging and is respected and empowered to do their best work. At Accenture, we see well-being holistically, supporting our people’s physical, mental, and financial health. We also provide opportunities to keep skills relevant through certifications, learning, and diverse work experiences. We’re proud to be consistently recognized as one of the World’s Best Workplaces™. Join Accenture to work at the heart of change.

HPC Systems Administrator employer: WeAreTechWomen

Accenture is an exceptional employer, offering a dynamic work culture that prioritises innovation and inclusivity. As an HPC Systems Administrator in London, you will have the opportunity to work with cutting-edge technology while enjoying comprehensive benefits, professional development opportunities, and a commitment to employee well-being. Join us to be part of a team that drives meaningful change and fosters growth in a supportive environment.

Contact Details:

WeAreTechWomen Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the HPC Systems Administrator role

✨Tip Number 1

Network like a pro! Reach out to folks in the HPC and AI community on LinkedIn or at industry events. You never know who might have the inside scoop on job openings or can put in a good word for you.

✨Tip Number 2

Show off your skills! Create a portfolio or GitHub repository showcasing your automation scripts, HPC projects, or any relevant work. This gives potential employers a taste of what you can do and sets you apart from the crowd.

✨Tip Number 3

Prepare for interviews by brushing up on common HPC scenarios and troubleshooting questions. Practice explaining your thought process clearly, as communication is key in technical roles like this one.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team at Accenture.

We think you need these skills to ace the HPC Systems Administrator role

HPC Environment Expertise
GPU Cluster Administration
Workload Schedulers (SLURM, PBS)
AI Model Training Workflows
AI/ML Frameworks and GPU Toolkits (TensorFlow, PyTorch, CUDA)
Networking Knowledge
Storage and Server Platforms (Windows, Linux)
Analytical Skills
Troubleshooting Skills
Performance Tuning
Automation Skills
Scripting Languages (Python, PowerShell, Bash)
Communication Skills
Collaboration Skills
Compliance and Data Security Knowledge

Some tips for your application 🫡

Tailor Your CV: Make sure your CV is tailored to the HPC Systems Administrator role. Highlight your experience with GPU clusters, AI frameworks, and any relevant certifications. We want to see how your skills match what we're looking for!

Craft a Compelling Cover Letter: Your cover letter is your chance to shine! Use it to explain why you're passionate about HPC and how your background makes you a great fit for our team. Keep it engaging and personal – we love to see your personality come through.

Showcase Your Technical Skills: Don’t forget to mention your expertise in scripting languages like Python or PowerShell, and your experience with workload schedulers. We’re keen on seeing how you’ve used these skills in real-world scenarios, so be specific!

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands. Plus, it shows us you’re serious about joining our team at Accenture!

How to prepare for a job interview at WeAreTechWomen

✨Know Your HPC Stuff

Make sure you brush up on your knowledge of HPC environments, especially GPU cluster administration and workload schedulers like SLURM or PBS. Be ready to discuss your experience with AI model training workflows and how you've supported frameworks like TensorFlow or PyTorch.

✨Show Off Your Scripting Skills

Since automation is key in this role, be prepared to talk about your experience with scripting languages like Python, PowerShell, or Bash. Maybe even bring a sample script you've written to demonstrate your skills and how it improved operational efficiency.

✨Performance Metrics Matter

Understand the importance of monitoring and fine-tuning performance metrics. Be ready to share examples of how you've addressed bottlenecks or inefficiencies in previous roles, and what tools or methods you used to analyse and improve performance.

✨Communication is Key

This role requires collaboration across technical and non-technical teams, so highlight your communication skills. Think of specific instances where you successfully worked with diverse teams to solve problems or implement solutions, and be ready to share those stories.
