
Infrastructure Hardware Technical Program Manager (Server and Network Systems)

On-site
Cerebras
Sunnyvale, CA, US / Toronto, ON, CA
Software

Compensation

Salary undisclosed

Description

As an Infrastructure Hardware Technical Program Manager (Server and Network Systems) on the Cluster Architecture Team, you will drive end-to-end delivery of server and network platform programs across Cerebras CS-3–based AI clusters, from requirements and vendor selection through lab bring-up, qualification, and production rollout. You will be the execution owner for multi-team programs spanning OEM/ODM partners, component vendors, internal software/runtime teams and architects, validation/QA, and deployment/operations.

This role is intentionally technical: you must understand server, network, and system-level trade-offs well enough to run effective technical reviews, keep programs grounded in real constraints, and maintain a crisp decision trail, while partnering closely with the Compute / Server / Network Platform Architects for detailed technical direction and sign-off. You will also build shared understanding with our rack/elevations and physical datacenter design partners so that server and network changes land smoothly in real deployments (without owning physical DC design).

Responsibilities

  • Own end-to-end program execution for server systems and network equipment in Cerebras clusters, including new platforms, refreshes, and major component/config changes.
  • Drive requirements gathering and convert inputs into executable plans with clear milestones, readiness gates, and cross-functional deliverables.
  • Represent Cluster Architecture in executive reviews, OKR cycles, and leadership/customer forums as needed.
  • Build and manage integrated schedules across vendors and internal teams; track dependencies, the critical path, and risks.
  • Manage OEM/ODM and switch-vendor engagements (RFI/RFP, samples, escalations, roadmap alignment).
  • Partner with Compute / Server Platform / Network Architects to turn architectural decisions into qualification plans, acceptance criteria, and rollout strategies.
  • Lead qualification and release readiness (lab/staging validation, regression tracking, go/no-go decisions).
  • Own risk and change management into production, including versioning, rollout sequencing, and stakeholder communication.
  • Ensure operational readiness with deployment and fleet teams and maintain alignment with rack/physical DC owners on power, cooling, space, and cabling constraints.

Skills and Qualifications

  • B.S. or M.S. in Computer Science, Electrical/Computer Engineering, or equivalent experience.
  • 8+ years in Technical Program Management (or similar delivery leadership) for server, network, or infrastructure platforms from concept through production.
  • Experience coordinating complex server and/or datacenter network programs across OEM/ODMs, switch vendors, and internal engineering teams.
  • Working knowledge of server architecture (CPU/NUMA, memory bandwidth, PCIe, NIC and storage IO) and enough networking fundamentals (leaf-spine fabrics, switch platforms, high-performance interconnects) to run effective technical reviews.
  • Familiarity with Linux server fleet management (provisioning, firmware/BIOS, drivers, field triage).
  • Strong multi-team program execution skills: integrated plans, risk management, dependency tracking, and executive-level communication.
  • Ability to operate in ambiguity and keep parallel server and network workstreams aligned.
  • Experience with AI/ML, HPC, or performance-sensitive distributed infrastructure is a plus.

Stack

Machine Learning

Posted: Feb 19, 2026
Last seen: Jun 25, 2026
First seen: Jun 25, 2026
Status: active