Role Overview

As an Applied Research intern at Labelbox, you will design, build, and productionize evaluation and post‑training systems for frontier LLMs and multimodal models. You’ll own continuous, high-quality evals and benchmarks (reasoning, code, agent/tool‑use, long‑context, vision‑language, and more), create and curate post‑training datasets (human + synthetic), and prototype RLHF/RLAIF/RLVR/RM/DPO‑style training loops to measure and improve real‑world task and agent performance.

Your Impact

Build and own evaluation and benchmark suites for reasoning, code, agents, long‑context, and vision‑language models.
Create post‑training datasets at scale: design preference/critique pipelines (human + synthetic), and target hard failures surfaced by evals.
Experiment with and prototype RLHF/RLAIF/RLVR/RM/DPO‑style training loops to improve real-world task and agent performance.
Land research in product: ship improvements into Labelbox workflows, services, and customer‑facing evaluation/quality features; quantify impact with customer and internal metrics.
Engage with customer research teams: run pilots, co‑design benchmarks, and share practical findings through internal research reports, blog posts, talks, and published papers.

What You Bring

A strong foundation in AI and machine learning, backed by a Ph.D. or Master’s degree in Computer Science, Machine Learning, AI, or a related field (in-progress degrees are acceptable for intern positions).
A deep understanding of frontier autoregressive and diffusion multimodal models, along with the human and synthetic data strategies needed to optimize them.
Passion for and experience with LLM evaluation and benchmarking.
Expertise in constructing, measuring, and refining high-quality training data.
The ability to bridge research and application by interpreting new findings and translating them into functional prototypes.
A track record of publishing in top-tier AI/ML conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP, NAACL) and contributing to the broader research community.
Proficiency in Python and experience with deep learning frameworks like PyTorch, JAX, or TensorFlow.
Exceptional communication and collaboration skills.

Applied Research at Labelbox

At Labelbox Applied Research, we're committed to pushing the boundaries of AI and data-centric machine learning, with a particular focus on advancing human-AI interaction techniques. We believe that high-quality human data and sophisticated human feedback integration methods are key to unlocking the next generation of AI capabilities. Our research team works at the intersection of machine learning, human-computer interaction, and AI ethics to develop innovative solutions that can be practically applied in real-world scenarios.
