We're hiring PhD-level experts to help train and evaluate advanced AI models. This is remote, flexible contract work where your academic expertise directly shapes how cutting-edge models reason, answer, and improve. You'll review AI-generated content in your field, identify where the model gets things right or wrong, and help raise the quality bar on technical reasoning.

This is a short-term engagement running through the end of June, with the potential to extend depending on project needs.

Key Responsibilities

You may contribute your expertise by:

Assessing the factuality and relevance of domain-specific text produced by AI models
Crafting and answering questions related to Machine Learning and AI
Evaluating and ranking domain-specific responses generated by AI models

What We're Looking For

A PhD (completed or in its final stages) in Machine Learning / AI, Computer Science, Engineering, Statistics, or a closely related quantitative field such as Mathematics or Physics
Strong analytical and critical-thinking skills, with the ability to spot subtle errors in technical reasoning
Fluent written English and the ability to communicate complex ideas clearly

Nice to Have

Research experience (academic or industry)
Prior experience with data annotation or AI model evaluation
Experience reviewing or publishing research papers

Compensation

Up to $150/hr, depending on your area of expertise, depth of experience, and assessment performance.

Location

This role is fully remote. We are currently accepting applicants based in:

United States, Canada, Puerto Rico, Mexico, United Kingdom, Australia, New Zealand, and Argentina.


STEM PhD Expert for AI Reasoning & Evaluation
