Principal Architect - HPC & AI (NVIDIA Ecosystem)

World Wide Technology

Full-Time 150000 - 175000 € / year (est.) No home office possible

At a Glance

Tasks: Lead the design and implementation of cutting-edge HPC and AI platforms using NVIDIA technology.
Company: Join a forward-thinking tech company focused on innovative solutions and customer satisfaction.
Benefits: Enjoy competitive pay, health benefits, remote work options, and generous paid time off.
Other info: Work in a dynamic environment with opportunities for professional growth and mentorship.
Why this job: Be at the forefront of the AI revolution and shape the future of technology.
Qualifications: 10+ years in HPC or AI architecture with strong technical expertise in NVIDIA systems.

The predicted salary is between 150000 and 175000 € per year.

Solutions Consulting & Engineering is an organization that is customer-focused and solutions-led. We deliver end-to-end and emerging solutions to drive customer satisfaction and increase profitability and growth. Our world-class management consulting, delivery excellence, and engineering brilliance enable our success. We embody the OneWWT mindset by bringing the right talent at the right time from anywhere within WWT to solve our customers' problems. Our goal is to bring together business acumen with full-stack technical know-how to develop innovative solutions for our clients' most complex challenges.

The Principal Architect leads HPC- and AI-focused Professional Services delivery engagements and cross-functional technical teams on customer programs or projects. They are responsible for technical communications with WWT Engineers, Architects, and the customer for AI-driven projects. The Principal Architect may participate in several customer projects concurrently, integrating AI solutions with enterprise IT systems.

The Principal Architect will be at the epicenter of the AI revolution, working with the most advanced hardware on the planet. Whether you're helping a research facility unlock new scientific breakthroughs or an enterprise build its first private AI cloud, your fingerprints will be on the infrastructure that defines the next decade of technology.

The right person for the job is a senior individual contributor responsible for designing, implementing, and optimizing large-scale High-Performance Computing and AI platforms centered on the NVIDIA data center ecosystem. This role operates in a hybrid capacity, combining hands-on technical architecture with selective customer-facing advisory responsibilities.

The architect serves as a technical authority across GPU-accelerated compute, high-performance networking, and modern parallel storage platforms, influencing architectural standards and delivery outcomes while ensuring successful, on-time, and on-budget customer deployments without escalations.

This is a remote, work-from-home position with an average travel expectation of approximately 10%. Candidates should be willing to travel more during peak project phases or critical customer engagements.

Key Responsibilities

Lead the end-to-end architecture of GPU-accelerated HPC and AI platforms, including greenfield AI factory designs and optimization of existing HPC environments.
Architect integrated solutions spanning Compute, Networking, and Storage using NVIDIA HGX and DGX platforms, Grace CPU architectures, Spectrum-X networking, and high-performance parallel storage systems.
Design storage architectures optimized for AI training, inference, and HPC workloads, balancing performance, scalability, resiliency, and cost.
Define reference architectures, design patterns, and best practices for repeatable and supportable customer deployments.
Provide hands-on technical leadership during implementation phases, including cluster bring-up, performance tuning, and workload optimization.
Architect and integrate workload orchestration and scheduling platforms using NVIDIA Base Command Manager, Slurm, Kubernetes and Run:AI.
Optimize end-to-end data pipelines, including GPU utilization, storage throughput, metadata performance, and job scheduling efficiency.
Troubleshoot performance bottlenecks across Compute, Networking, and Storage.
Design and validate high-performance storage solutions using modern parallel and scale-out storage platforms.
Demonstrate hands-on experience with at least one of the following storage technologies: VAST Data, WEKA, DDN, Lustre, or NetApp.
Architect storage solutions that support demanding AI and HPC workloads, including high-throughput training pipelines, checkpointing, and large-scale shared datasets.
Collaborate with compute and networking design to ensure balanced, bottleneck-free architectures.
Act as a senior technical authority for HPC and AI architecture across internal teams and customer engagements.
Participate selectively in customer-facing discussions to validate architecture and delivery plans, with a primary focus on design integrity and execution rather than pre-sales.
Influence platform standards, architectural direction, and technical decision-making through expertise and demonstrated execution.
Identify technical risks early across Compute, Networking, Storage, and orchestration layers, and drive mitigation strategies.
Partner with the PMO counterpart to resolve Risks and Issues upon identification and to ensure production-ready, supportable platforms.
Ensure staff, contractors, and partners adhere to WWT best practices and templates for AI solution delivery.
Review deployment documents, technical assessments, and other outputs to ensure consistency and accuracy, aligning with AI and "One Voice" standards.

Required Technical Expertise

Expert-level, deep architectural knowledge of NVIDIA data center platforms, including HGX and DGX.
GPU-accelerated compute architecture for AI and HPC workloads.
High-performance networking architectures, especially with Spectrum-X.
Hands-on architectural experience with high-performance parallel or scale-out storage systems.
Deep understanding of storage performance characteristics relevant to AI and HPC workloads, including bandwidth, IOPS, latency, and metadata scaling.
Proven experience integrating storage platforms such as VAST Data, NetApp, WEKA, DDN, or Lustre into GPU-accelerated environments.
NVIDIA Base Command Manager (BCM) for cluster lifecycle management and operations.
Slurm for HPC workload scheduling and resource management.
Run:AI for GPU orchestration and multi-tenant AI workload optimization.
Kubernetes administration, including deploying and managing GPU-accelerated AI and HPC workloads.
Linux systems administration in large-scale, performance-sensitive environments.
Containerized AI workflows and their interaction with schedulers and storage systems.

Additional Experience

Experience optimizing existing HPC or AI platforms for performance, utilization, and cost efficiency.
Prior experience with multi-site, air-gapped, or regulated environments is beneficial but not required.
Experience with liquid cooling, power/cooling design, and data center integration is strongly preferred.

Leadership & Influence

Senior individual contributor role with influence through technical authority rather than people management.
Ability to mentor engineers and architects through design reviews, architectural guidance, and technical leadership.
Comfortable operating autonomously in complex, high-impact technical environments.

Documentation & Repeatability Expectations

Develop and maintain high-quality architectural documentation, including design blueprints, configuration guides, deployment validation reports, and operational runbooks.
Ensure all technical artifacts meet WWT's One Voice standards for clarity, completeness, and technical accuracy, enabling consistent delivery across teams.
Create reusable templates, reference architectures, and standardized design patterns that accelerate future projects and improve delivery quality.
Drive a culture of documentation discipline, ensuring that every deployment is reproducible, supportable, and aligned with architectural intent.

Educational/Experience Requirements

Bachelor's degree in a technical field or equivalent hands-on experience architecting large-scale HPC or AI systems.
Advanced degree (MS/PhD) in relevant fields is a plus but not required.
Experience: 10+ years in HPC, Data Center Architecture, and/or Systems Engineering.
Bare Metal Focus: A fundamental preference for, and understanding of, on-premises hardware constraints (power, cooling, cabling).
Proven experience as a Senior or Lead Architect, or equivalent experience, in AI projects.

Principal Architect - HPC & AI (NVIDIA Ecosystem) employer: World Wide Technology

At WWT, we pride ourselves on being an exceptional employer, offering a dynamic work culture that fosters innovation and collaboration. As a Principal Architect in the HPC & AI domain, you will have access to cutting-edge technology while enjoying a comprehensive benefits package, including competitive pay, generous paid time off, and opportunities for professional growth. Our commitment to employee well-being and a supportive environment ensures that you can thrive both personally and professionally, making WWT a truly rewarding place to advance your career.

Contact Details:

World Wide Technology Recruiting Team

StudySmarter Expert Advice🤫

We think this is how you could land the Principal Architect - HPC & AI (NVIDIA Ecosystem) role

✨Tip Number 1

Network like a pro! Reach out to folks in the industry, attend meetups, and connect with potential colleagues on LinkedIn. You never know who might have the inside scoop on job openings or can put in a good word for you.

✨Tip Number 2

Show off your skills! Create a portfolio or a personal website showcasing your projects and achievements. This is your chance to demonstrate your expertise in HPC and AI, especially with NVIDIA platforms, and make a lasting impression.

✨Tip Number 3

Prepare for interviews by brushing up on common technical questions related to HPC and AI architecture. Practice explaining complex concepts in simple terms, as you'll need to communicate effectively with both technical teams and clients.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, it shows you’re genuinely interested in joining our team at WWT.

We think you need these skills to ace the Principal Architect - HPC & AI (NVIDIA Ecosystem) role

High-Performance Computing (HPC)

AI Platform Architecture

NVIDIA Data Center Ecosystem

GPU-Accelerated Compute

High-Performance Networking

Storage Architecture

Performance Tuning

Workload Orchestration

Kubernetes Administration

Linux Systems Administration

Technical Documentation

Mentoring and Technical Leadership

Problem-Solving Skills

Customer Engagement

Risk Management

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter to highlight your experience with HPC and AI, especially within the NVIDIA ecosystem. We want to see how your skills align with our needs, so don’t hold back on showcasing relevant projects!

Showcase Your Technical Expertise: When detailing your experience, focus on your hands-on work with GPU-accelerated platforms and high-performance networking. We love seeing specific examples of how you've tackled complex challenges in previous roles.

Be Clear and Concise: Keep your application straightforward and to the point. Use clear language to describe your achievements and avoid jargon unless it’s necessary. We appreciate clarity as it reflects your communication skills, which are key for this role.

Apply Through Our Website: We encourage you to submit your application directly through our website. It’s the best way to ensure your application gets into the right hands and shows us you’re serious about joining our team!

How to prepare for a job interview at World Wide Technology

✨Know Your NVIDIA Ecosystem

Make sure you brush up on your knowledge of NVIDIA's data centre platforms, especially HGX and DGX. Familiarise yourself with their architecture and how they integrate into HPC and AI workloads. This will show that you're not just a generalist but someone who understands the specific technologies that are crucial for the role.

✨Demonstrate Hands-On Experience

Be ready to discuss your hands-on experience with high-performance computing and AI platforms. Prepare examples of past projects where you've designed or optimised systems, particularly focusing on storage solutions and workload orchestration. This will help you stand out as a candidate who can hit the ground running.

✨Prepare for Technical Discussions

Expect to engage in deep technical discussions during your interview. Brush up on your knowledge of performance tuning, workload optimisation, and troubleshooting across compute, networking, and storage layers. Being able to articulate your thought process and problem-solving strategies will impress the interviewers.

✨Showcase Your Leadership Skills

Even though this is a senior individual contributor role, demonstrating your ability to lead and mentor others is key. Share experiences where you've influenced architectural decisions or guided teams through complex projects. This will highlight your capability to be a technical authority and a collaborative team player.
