About the Role
We are seeking a versatile and experienced engineer to join our Inference Core Model Bringup team. The team is responsible for rapidly bringing up state-of-the-art open-source models (such as LLaMA and Qwen) or customer-provided proprietary models on Cerebras CSX systems. Success in this role requires a systems-minded generalist who thrives in fast-paced bringup environments and is comfortable working across the entire Cerebras software stack.
Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.

Responsibilities

Contribute to the end-to-end bringup of ML models on Cerebras CSX systems.
Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning.
Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization.
Propose and prototype improvements to tools, APIs, and automation flows to accelerate future bringups.

Skills & Qualifications

Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field.
Comfort navigating the full AI toolchain, including Python modeling code, compiler IRs, and performance profiling.
Strong debugging skills across performance, numerical accuracy, and runtime integration.
Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with model internals (e.g., attention, MoE, diffusion).
Proficiency in C/C++ programming and experience with low-level optimization.
Proven experience in compiler development, particularly with LLVM and/or MLIR.
Strong background in optimization techniques, particularly those involving NP-hard problems.

What We Offer

Competitive salary and benefits package.
Opportunities for professional growth and career advancement.
A dynamic and innovative work environment.
The chance to work on cutting-edge technologies and make a significant impact on the future of AI.

Full Stack LLM Engineer

Description

Stack