Staff Engineer, Datacenter Server Lifecycle

Full-Time, Home office (partial)
Anthropic

At a Glance

  • Tasks: Lead the lifecycle of datacenter machines, from deployment to decommissioning.
  • Company: Join Anthropic, a mission-driven AI company focused on safe and beneficial technology.
  • Benefits: Enjoy competitive salary, flexible hours, generous leave, and equity donation matching.
  • Other info: Diverse perspectives are valued; we encourage all candidates to apply.
  • Why this job: Make a real impact in AI while working with cutting-edge technology and a collaborative team.
  • Qualifications: 5+ years in datacenter operations and strong programming skills required.

About Anthropic: Anthropic’s mission is to create reliable, interpretable, and steerable AI systems. We want AI to be safe and beneficial for our users and for society as a whole. Our team is a quickly growing group of committed researchers, engineers, policy experts, and business leaders working together to build beneficial AI systems.

About The Role: Anthropic is expanding beyond cloud infrastructure, and this role sits at the heart of that effort. As a Staff Engineer on the Datacenter Machine Lifecycle team, you will own the end-to-end operational journey of every machine in our facility, from initial provisioning and deployment, across its working life, through maintenance and refresh, and all the way to decommissioning. This is greenfield work: you will help define the processes, tooling, and operational standards that govern how we run and retire hardware at scale.

A distinguishing aspect of this role is its deep intersection with security. The machines in our datacenter handle some of the most sensitive workloads in AI — training frontier models and serving millions of users interacting with Claude. Ensuring that every machine in the fleet is trusted, attested, and operating with a verified chain of integrity from the hardware up is a core part of the job, not an afterthought. You will partner closely with our Infrastructure Security team to define and enforce trusted compute standards across the lifecycle, from secure provisioning through end-of-life handling.

Responsibilities:

  • Lead the build-out of automation to support datacenters containing tens of thousands of servers.
  • Own and define the end-to-end machine lifecycle strategy — from provisioning and deployment through operation, maintenance, refresh, and decommissioning — and maintain automation and operational procedures for common lifecycle events (e.g. hardware failures, firmware upgrades, fleet rotations).
  • Partner closely with Infrastructure Security to design and enforce trusted compute standards across the machine lifecycle.
  • Work closely with our Networking team to ensure end-to-end connectivity across all sites.
  • Build and maintain tooling to track machine health, configuration, and operational status across the full datacenter fleet.

You May Be a Good Fit If You:

  • Have 5+ years of experience in datacenter operations, hardware infrastructure management, or a closely related discipline.
  • Have deep, hands-on experience with server hardware — including rack deployment, cabling, troubleshooting, and understanding failure modes at scale.
  • Understand hardware lifecycle management end-to-end: asset tracking, provisioning workflows, maintenance scheduling, and decommissioning practices.
  • Have strong proficiency in at least one programming language (e.g., Python, Rust, Go, or Java).
  • Are comfortable navigating ambiguity and working independently to drive progress on complex, cross-functional problems.
  • Communicate clearly and can build consensus with a wide range of stakeholders.
  • Have working knowledge of modern cloud infrastructure, including Kubernetes, Infrastructure as Code, AWS, and GCP.
  • Are comfortable with occasional travel to datacenter sites across North America.

Strong Candidates May Also Have:

  • Hands-on experience with GPU or AI accelerator hardware (e.g. NVIDIA A100/H100, AMD MI300, Google TPUs, or AWS Trainium) and an understanding of their operational demands.
  • Familiarity with modern provisioning tooling such as coreboot, LinuxBoot, or u-root.
  • Experience building or contributing to datacenter automation or fleet management platforms.
  • Experience building and deploying server operating system distributions across thousands of hosts.
  • A background in large-scale capacity planning and hardware refresh strategy, ideally at a hyperscaler or large cloud provider.
  • Experience with trusted compute and hardware security concepts such as secure boot, TPM, hardware attestation, and firmware verification — or a strong desire to develop deep expertise in this area.

Annual Salary: £255,000–£325,000 GBP

Logistics:

  • Minimum education: Bachelor’s degree or an equivalent combination of education, training, and/or experience.
  • Required field of study: A field relevant to the role as demonstrated through coursework, training, or professional experience.
  • Minimum years of experience: Years of experience required will correlate with the internal job level requirements for the position.
  • Location-based hybrid policy: Currently, we expect all staff to be in one of our offices at least 25% of the time. However, some roles may require more time in our offices.
  • Visa sponsorship: We do sponsor visas! However, we aren't able to successfully sponsor visas for every role and every candidate. But if we make you an offer, we will make every reasonable effort to get you a visa, and we retain an immigration lawyer to help with this.

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you’re interested in this work. We think AI systems like the ones we’re building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

Your safety matters to us. To protect yourself from potential scams, remember that Anthropic recruiters only contact you from @anthropic.com email addresses. In some cases, we may partner with vetted recruiting agencies who will identify themselves as working on behalf of Anthropic. Be cautious of emails from other domains. Legitimate Anthropic recruiters will never ask for money, fees, or banking information before your first day. If you’re ever unsure about a communication, don’t click any links—visit anthropic.com/careers directly for confirmed position openings.

How We’re Different: We believe that the highest-impact AI research will be big science. At Anthropic we work as a single cohesive team on just a few large-scale research efforts. And we value impact — advancing our long-term goals of steerable, trustworthy AI — rather than work on smaller and more specific puzzles. We view AI research as an empirical science, which has as much in common with physics and biology as with traditional efforts in computer science. We’re an extremely collaborative group, and we host frequent research discussions to ensure that we are pursuing the highest-impact work at any given time. As such, we greatly value communication skills.

Come work with us! Anthropic is a public benefit corporation headquartered in San Francisco. We offer competitive compensation and benefits, optional equity donation matching, generous vacation and parental leave, flexible working hours, and a lovely office space in which to collaborate with colleagues.

Staff Engineer, Datacenter Server Lifecycle employer: Anthropic

Anthropic is an exceptional employer that fosters a collaborative and innovative work culture, where employees are empowered to contribute to the development of safe and beneficial AI systems. With competitive compensation, generous benefits, and a commitment to employee growth, including opportunities for professional development and flexible working arrangements, Anthropic provides a supportive environment for its staff in the vibrant city of San Francisco. The company's focus on impactful research and diverse perspectives ensures that every team member plays a vital role in shaping the future of AI.

Contact Detail:

Anthropic Recruiting Team

StudySmarter Expert Advice 🤫

We think this is how you could land the Staff Engineer, Datacenter Server Lifecycle role

✨Tip Number 1

Network like a pro! Reach out to folks in your industry on LinkedIn or at meetups. A friendly chat can lead to opportunities that aren’t even advertised yet.

✨Tip Number 2

Prepare for those interviews! Research the company and its culture, and be ready to discuss how your skills align with their mission. We want to see your passion for AI and datacenter operations!

✨Tip Number 3

Show off your projects! If you’ve worked on relevant tech or automation tools, bring them up during interviews. It’s a great way to demonstrate your hands-on experience and problem-solving skills.

✨Tip Number 4

Don’t forget to apply through our website! It’s the best way to ensure your application gets seen by the right people. Plus, we love seeing candidates who are proactive about joining our team.

We think you need these skills to ace the Staff Engineer, Datacenter Server Lifecycle role

Datacenter Operations
Hardware Infrastructure Management
Server Hardware Experience
Provisioning Workflows
Maintenance Scheduling
Decommissioning Practices
Programming Proficiency (Python, Rust, Go, Java)
Cloud Infrastructure Knowledge (Kubernetes, AWS, GCP)
Automation and Tooling Development
Networking Skills
Capacity Planning
Trusted Compute Concepts
Communication Skills
Problem-Solving Skills
Cross-Functional Collaboration

Some tips for your application 🫡

Tailor Your Application: Make sure to customise your CV and cover letter for the Staff Engineer role. Highlight your experience with datacenter operations and any relevant projects that showcase your skills in hardware lifecycle management.

Showcase Your Technical Skills: Don’t forget to mention your programming proficiency! Whether it’s Python, Rust, or Java, let us know how you’ve used these languages in past projects, especially in automation or infrastructure management.

Communicate Clearly: We value clear communication, so ensure your application reflects this. Use straightforward language and structure your thoughts logically. This will help us see how you can build consensus with various stakeholders.

Apply Through Our Website: We encourage you to apply directly through our website. It’s the best way to ensure your application gets into the right hands and shows your enthusiasm for joining our team at Anthropic!

How to prepare for a job interview at Anthropic

✨Know Your Stuff

Make sure you brush up on your knowledge of datacenter operations and hardware lifecycle management. Be ready to discuss your hands-on experience with server hardware, including troubleshooting and failure modes. This role is all about owning the machine lifecycle, so show them you know what that entails!

✨Showcase Your Programming Skills

Since proficiency in programming languages like Python or Go is key, prepare to talk about your coding experience. Bring examples of how you've used programming to automate processes or solve complex problems in previous roles. They’ll want to see your technical chops in action!

✨Communicate Clearly

This position requires collaboration with various teams, so practice articulating your thoughts clearly. Think about how you can build consensus among stakeholders and share examples of past experiences where effective communication made a difference in your projects.

✨Emphasise Security Awareness

Given the role's focus on security, be prepared to discuss trusted compute standards and any relevant experience you have with hardware security concepts. Show your enthusiasm for developing expertise in this area, as it’s crucial for ensuring the integrity of the machines you'll be managing.
