Senior Engineer, Network Observability

Full-Time, no home office possible
CoreWeave

Overview

As a Senior Engineer on CoreWeave’s Network Observability team, you will design, develop, and maintain the monitoring, telemetry, and observability systems that keep CoreWeave’s GPU cloud network operating reliably and at scale. You will build solutions that provide real-time insight into network performance and enable proactive issue detection and rapid resolution.

What You’ll Do

  • Develop, optimize, and maintain network observability platforms. Use Python and Go to create and automate collectors, exporters, and dashboards that provide deep visibility into network health and performance.
  • Collaborate with Network Engineering and Platform teams to ingest and unify logs, metrics, and events from Arista EOS, NVIDIA Cumulus Linux, Nokia SR OS, SR Linux, and other platforms into a single observability pipeline.
  • Design and implement scalable telemetry solutions using protocols such as gNMI and SNMP, along with streaming analytics. Build alerting and anomaly detection with Prometheus, Grafana, and Alertmanager.
  • Work with network developers, site reliability engineers, and security teams to integrate observability across the broader infrastructure. Participate in design discussions, RFCs, and architectural decisions.
  • Join a rotating on-call schedule to troubleshoot observability-related issues and provide timely support to operations teams, quickly isolating and fixing problems.
  • Guide junior team members, share best practices, and foster a culture of continuous learning within the observability domain.

Who You Are — Minimum Qualifications

  • Deep familiarity with Prometheus, Grafana, Alertmanager, gNMI, and SNMP. Experience writing or extending custom metric collectors/exporters is a plus.
  • Experience as a Network Engineer, SRE, Software Developer, or Systems Administrator in large-scale environments with telemetry and monitoring deployments.
  • Passion for automating tasks and reducing human error through automated workflows.
  • Experience containerizing solutions and deploying container-based workloads efficiently on Kubernetes.
  • Proficient with Python, Go, and Bash; familiarity with configuration management and templating tools (e.g., Ansible, Jinja2).
  • Strong knowledge of Linux systems and IP networking concepts, including routing, switching, and network troubleshooting.
  • Hands-on experience with platforms such as Arista EOS, NVIDIA Cumulus Linux, Nokia SR OS, and SR Linux.
  • Collaborative, humble, and open to learning from more senior colleagues.

Preferred Qualifications

  • Bachelor’s degree in Computer Science or related field.
  • Hands-on ML experience for anomaly detection in networks (e.g., TensorFlow, scikit-learn).
  • Network certifications (e.g., CCNA, CCNP) or equivalent.
  • Experience with data pipelines, event correlation, or large-scale analytics.
  • Familiarity with OpenTelemetry, Jaeger, or Zipkin for distributed tracing.

Why CoreWeave?

At CoreWeave, we work hard, have fun, and move fast. We’re in a hyper-growth phase and value curiosity, ownership, and collaboration. Our core values:

  • Be Curious at Your Core
  • Act Like an Owner
  • Empower Employees
  • Deliver Best-in-Class Client Experiences
  • Achieve More Together

We support an entrepreneurial mindset and provide opportunities to develop innovative solutions. You will be surrounded by top talent and gain growth opportunities as we scale.

What We Offer

In addition to a competitive salary, we offer a range of benefits to support your needs:

  • Family-level Medical Insurance
  • Family-level Dental Insurance
  • Generous Pension Contribution
  • Life Assurance at 4x Salary
  • Critical Illness Cover
  • Employee Assistance Programme
  • Tuition Reimbursement
  • Work culture focused on innovative disruption

Benefits may vary by location.

Our Workplace

We prioritize a hybrid work environment; remote work may be considered for candidates located more than 30 miles from an office, based on role requirements. New hires attend onboarding at a hub within the first month. Teams also gather quarterly to support collaboration.

CoreWeave is an equal opportunity employer, committed to fostering an inclusive and supportive workplace. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, disability, age, sexual orientation, gender identity, national origin, veteran status, or genetic information.

Export Control Compliance

This position requires access to export controlled information. To conform to U.S. Government export regulations, applicants must meet certain criteria. CoreWeave may decline to pursue export licensing as appropriate.

Contact Detail:

CoreWeave Recruiting Team
