Staff Software Engineer, AI Data Platform

On-site
Labelbox
San Francisco Bay Area
2 weeks ago
Staff / Principal
Engineering

Compensation

$250,000-$280,000

Description

Role Overview

Labelbox is the RL data factory for advancing frontier agent capabilities. We build the data, evaluations, and infrastructure that frontier labs use to train and judge their agents. We're looking for talented, experienced engineers to join us. The bar is high: we want engineers who have strong judgment and set technical direction, who quickly build prototypes that scale into reliable systems, and who work at the frontier of agent-first engineering practices, innovating to accelerate the business.

What you may work on

  • Eval systems that run millions of agent trajectories to measure model and product quality.
  • Fine-tuning pipelines that turn evaluation signals into measurable agent improvements.
  • Agent-first product surfaces: UX and infrastructure for workflows where the user is a model or an agent operator.
  • The systems behind hundreds of thousands of AI interviews used to source and match freelance workers to projects.
  • Infrastructure that scales to the throughput frontier labs actually need.
  • Integration of the latest models and capabilities into production within days of release.

What we're looking for

  • A 4+ year track record of shipping systems that customers and other engineers rely on.
  • You build full stack prototypes fast and they hold up. The v1 you ship becomes the foundation the rest of the team builds on.
  • Strong system and API design judgment.
  • Hard architecture and product calls land with you. You make them, defend them under pressure, and update fast when someone else is right.
  • You ship production code with coding agents daily. You know where they break and what it takes to make them reliable enough to accelerate the team's velocity.
  • You set direction by being the example. Other engineers reach for your designs and your code as the reference.
  • You move fast in ambiguous, startup-pace environments, leading through influence rather than authority.
  • You have worked in all parts of the stack.
  • Deep proficiency in TypeScript and/or Python.

Nice to have

  • Production experience building LLM- or agent-driven products.
  • Designing evaluations for LLMs and agents, or producing high-quality data for ML systems.
  • Background in production distributed systems, ML infrastructure, or data systems at scale.

Our Technology Stack

Our engineering team works with a modern tech stack designed for scalability, performance, and developer efficiency:

  • Frontend: React.js with Redux, TypeScript
  • Backend: Node.js, TypeScript, Python, some Java & Kotlin
  • APIs: GraphQL
  • Cloud & Infrastructure: Google Cloud Platform (GCP), Kubernetes
  • Databases: MySQL, Spanner, PostgreSQL
  • Queueing / Streaming: Kafka, Pub/Sub

Stack

GraphQL, Python, LLMs, Distributed Systems, GCP, Kafka, TypeScript, Java, PostgreSQL, React, Kubernetes, Machine Learning, Fine-tuning, Node.js

Posted: Jun 9, 2026
Last seen: Jun 25, 2026
First seen: Jun 25, 2026
Status: active