Staff Software Engineer, AI Data Platform
On-site
Staff / Principal
Engineering
Compensation
$250,000-$280,000
Description
Role Overview
Labelbox is the RL data factory for advancing frontier agent capabilities. We build the data, evaluations, and infrastructure that frontier labs use to train and judge their agents. We're looking for talented, experienced engineers to join us. The bar is high: we want engineers who have strong judgment and set technical direction, who quickly build prototypes that scale into reliable systems, and who work at the frontier of agent-first engineering practices, innovating to accelerate the speed of the business.
What you may work on
- Eval systems that run millions of agent trajectories to measure model and product quality.
- Fine-tuning pipelines that turn evaluation signals into measurable agent improvements.
- Agent-first product surfaces: UX and infrastructure for workflows where the user is a model or an agent operator.
- The systems behind hundreds of thousands of AI interviews used to source and match freelance workers to projects.
- Infrastructure that scales to the throughput frontier labs actually need.
- Integration of the latest models and capabilities into production within days of release.
What we're looking for
- A track record of 4+ years shipping systems that customers and other engineers rely on.
- You build full stack prototypes fast and they hold up. The v1 you ship becomes the foundation the rest of the team builds on.
- Strong system and API design judgement.
- Hard architecture and product calls land with you. You make them, defend them under pressure, and update fast when someone else is right.
- You ship production code with coding agents daily. You know where they break and what it takes to make them reliable to further accelerate the team's velocity.
- You set direction by being the example. Other engineers reach for your designs and your code as the reference.
- You move fast in ambiguous, startup-pace environments, leading through influence rather than authority.
- You have worked in all parts of the stack.
- Deep proficiency in TypeScript and/or Python.
Nice to have
- Production experience building LLM- or agent-driven products.
- Designing evaluations for LLMs and agents, or producing high-quality data for ML systems.
- Background in production distributed systems, ML infrastructure, or data systems at scale.
Our Technology Stack
Our engineering team works with a modern tech stack designed for scalability, performance, and developer efficiency:
- Frontend: React.js with Redux, TypeScript
- Backend: Node.js, TypeScript, Python, some Java & Kotlin
- APIs: GraphQL
- Cloud & Infrastructure: Google Cloud Platform (GCP), Kubernetes
- Databases: MySQL, Spanner, PostgreSQL
- Queueing / Streaming: Kafka, PubSub
Stack
GraphQL, Python, LLMs, Distributed Systems, GCP, Kafka, TypeScript, Java, PostgreSQL, React, Kubernetes, Machine Learning, Fine-tuning, Node.js
- Posted: Jun 9, 2026
- Last seen: Jun 25, 2026
- First seen: Jun 25, 2026
- Status: active