Kairos

ML Systems Performance Engineer

On-site
Cerebras
Bengaluru, IN / Karnataka, IN
1 week ago
Software

Compensation

Salary undisclosed

Description

About The Role

Engineers on the inference performance team operate at the intersection of hardware and software, driving end-to-end model inference speed and throughput. Their work spans low-level kernel performance debugging and optimization, system-level performance analysis, performance modeling and estimation, and the development of tooling for performance projection and diagnostics.

Responsibilities

  • Build kernel-level and end-to-end performance models to estimate the performance of state-of-the-art and customer ML models.
  • Optimize and debug our kernel microcode and compiler algorithms to improve ML model inference speed, throughput, and compute utilization on the Cerebras Wafer Scale Engine (WSE).
  • Debug and understand runtime performance on the system and cluster.
  • Develop tools and infrastructure to help visualize performance data collected from the Wafer Scale Engine and our compute cluster.

Requirements

  • Bachelor's / Master's / PhD in Electrical Engineering or Computer Science.
  • Strong background in computer architecture.
  • Exposure to and understanding of low-level deep learning / LLM math.
  • Strong analytical and problem-solving mindset.
  • 3+ years of experience in a relevant domain (Computer Architecture, CPU/GPU Performance, Kernel Optimization, HPC).
  • Experience working on CPU/GPU simulators.
  • Exposure to performance profiling and debugging on any system pipeline.
  • Comfort with C++ and Python.

Stack

Python, C++, GPU, LLMs, Machine Learning, Deep Learning

Posted: Jun 16, 2026
Last seen: Jun 25, 2026
First seen: Jun 25, 2026
Status: active