Manager, Data Center Operations

On-site
Crusoe
OH, US / Springfield, MO, US
Full-time
Manager / Lead
Data Center Operations (DIG)


Description

Crusoe is on a mission to accelerate the abundance of energy and intelligence. As the only vertically integrated AI infrastructure company built from the ground up, we own and operate each layer of the stack — from electrons to tokens — to power the world's most ambitious AI workloads. When you join Crusoe, you join a team that is building the future, faster.

We're in the midst of the greatest industrial revolution of our time. The demand for AI compute is boundless, and power is a bottleneck. We're solving that — with an energy-first approach that makes AI infrastructure better for the world and faster for the people innovating with AI.

We're looking for problem-solving, opportunity-finding teammates with a sense of urgency, who believe in the scale of our ambition and thrive on a path not fully paved — people who want to grow their careers alongside a team of experts across energy, manufacturing, data center construction, and cloud services.

If you want to do the most meaningful work of your career, help our customers and partners advance their AI strategies, and be part of a high-performing team that believes in each other, come build with us at Crusoe.

About the Role

Crusoe is seeking a Manager of Data Center Operations to lead our OH5C site in Springfield, Ohio.

This is a hands-on leadership role overseeing the day-to-day health of a high-density, GPU-heavy compute environment. You will lead the on-site technician team, drive hardware reliability and break-fix performance, manage colocation relationships, and ensure the site meets fleet-wide operational standards.

The ideal candidate is a technically strong, highly accountable leader who can move comfortably between the data center floor and senior-level operational reviews.

What You’ll Be Working On

Site Operations

  • Own the daily operation, health, and availability of the OH5C data center.

  • Lead troubleshooting and repair of GPU compute hardware, including GPU trays, DIMMs, drives, cabling, and server nodes.

  • Drive rapid triage and repair while maintaining MTTR and uptime targets.

  • Coordinate RMAs and hardware support with OEM vendors, primarily Supermicro.

  • Maintain spare-parts inventory and ensure critical hardware is available when needed.

  • Partner with Fleet Operations, SRE, networking, and infrastructure teams on escalations.

Team Leadership

  • Lead, coach, and develop the on-site data center technician team.

  • Set clear expectations for safety, quality, responsiveness, and accountability.

  • Conduct regular one-on-ones, performance reviews, and development planning.

  • Support technician hiring, onboarding, training, and workforce planning.

  • Build a culture of technical precision, ownership, and continuous improvement.

Performance and Reporting

  • Track and report site KPIs, including uptime, MTTR, SLA compliance, deployment velocity, and ticket aging.

  • Use operational data to identify recurring issues and improve reliability.

  • Maintain accurate break-fix workflows in Jira or a comparable ticketing system.

  • Provide clear operational updates, incident summaries, and corrective-action plans to senior leadership.

Colocation and Facilities

  • Serve as the primary on-site liaison with the colocation provider.

  • Hold facility partners accountable to SLAs related to power, cooling, security, and availability.

  • Maintain working knowledge of UPS systems, PDUs, generators, CRAC and CRAH systems, and supporting infrastructure.

  • Escalate and track facility issues through resolution.

  • Coordinate planned maintenance to minimize risk to production systems.

Process and Documentation

  • Maintain site runbooks, SOPs, emergency procedures, and hardware documentation.

  • Ensure work is completed in accordance with safety, security, and change-management standards.

  • Contribute to fleet-wide operating standards and knowledge sharing.

  • Maintain accurate asset, inventory, and configuration records.

What You’ll Bring to the Team

  • 5+ years of data center operations leadership experience in a production environment.

  • Experience managing and developing technical teams.

  • Hands-on experience troubleshooting enterprise server hardware, including GPU nodes, DIMMs, drives, cabling, and rack-level infrastructure.

  • Strong familiarity with Supermicro hardware, diagnostics, event logs, and RMA processes.

  • Experience working in colocation environments and managing provider SLAs.

  • Working knowledge of data center electrical and mechanical systems.

  • Experience with Jira, ServiceNow, or a similar ticketing platform.

  • Strong understanding of incident management, root-cause analysis, and operational risk.

  • Clear written and verbal communication skills, including the ability to present technical and operational information to senior leaders.

  • Ability to work on-site in Springfield, Ohio, and support critical incidents as needed.

Preferred Qualifications

  • Experience supporting AMD GPU clusters, including MI300X or equivalent platforms.

  • Familiarity with NVIDIA GPU platforms such as H100, H200, or B200.

  • Understanding of RoCE fabric topology and common failure modes.

  • Experience with DCIM or asset-management tools such as NetBox.

  • Multi-site or regional data center operations experience.

  • Experience in rapidly scaling cloud, hyperscale, or AI infrastructure environments.

Location and Travel

This role is based on-site at Crusoe’s OH5C facility in Springfield, Ohio. Periodic travel to other Crusoe sites may be required for training, cross-site projects, or operational support.

Benefits

  • Competitive compensation and equity

  • Restricted Stock Units

  • Paid time off, holidays, and leave programs

  • Medical, dental, and vision insurance

  • Employer HSA contributions

  • Paid parental leave

  • Life, short-term disability, and long-term disability insurance

  • Professional development and tuition reimbursement

  • Mental health and wellness support

  • Commuter benefits

  • Cell phone stipend

  • 401(k) with company match up to 4%

  • Volunteer time off

  • Global travel insurance and emergency assistance

  • Daily meal allowance

  • Additional location-specific benefits

Compensation Range

Compensation will be paid within a range of $135,000–$175,000, plus bonus. Restricted Stock Units are included in all offers. Final compensation will be determined based on the applicant’s knowledge, education, experience, and abilities, as well as internal equity and alignment with market data.

Crusoe is an Equal Opportunity Employer. Employment decisions are made without regard to race, color, religion, disability, genetic information, pregnancy, citizenship, marital status, sex/gender, sexual preference/orientation, gender identity, age, veteran status, national origin, or any other status protected by law or regulation.

Stack: GPU
Posted: Unknown
First seen: Jun 29, 2026
Last seen: Jun 29, 2026
