Machine Learning Expert - Fully Remote | Up to $90/hr

Full-Time | Fully Remote

Overview

We are hiring experienced machine learning engineers and researchers to serve as human baseliners for evaluations of open-ended machine learning research tasks. These evaluations measure how well AI agents perform on realistic AI R&D problems. To interpret agent performance, we also need strong human reference points: skilled practitioners attempting the same tasks under the same time and compute constraints. As a baseliner, you will complete self-contained ML research tasks in a sandboxed environment, working independently with your preferred tools and workflow. Your performance will be used as a benchmark against which frontier-model agents are evaluated.

What You’ll Do

Attempt open-ended machine learning research tasks under a fixed time and compute budget (work trial)
Work independently in a sandboxed Linux environment with internet access
Use your preferred tooling, including IDEs and AI coding assistants such as Cursor, Claude Code, and ChatGPT
Record your full working session via screen recording
Complete a short pre-task and post-task questionnaire
Submit your final work product, screen recording, and completed questionnaires

After the work trial, you will be hired for a longer commitment.

Commitment

Minimum 20 hours per week if selected
More availability is strongly preferred

Requirements

3+ years of machine learning experience (time spent in a PhD program counts toward this requirement; undergraduate and master’s experience does not count)
Attended a top‑100 university or worked at FAANG or a comparable company
Experience with at least one major ML framework such as PyTorch, JAX, or TensorFlow
Deep, hands‑on expertise in at least one of the following focus areas:

Pretraining under tight data and compute budgets
PPO, reward shaping, custom gym / gymnasium environments, and throughput tuning
Full fine‑tuning, LoRA, QLoRA, DPO, RLHF, RLAIF, and distillation
Large‑scale corpus filtering, deduplication, subsampling, and benchmark contamination avoidance
Architecture design under strict parameter‑count or size constraints
Modifying pretrained architectures, including attention patterns, pooling heads, or training objectives
Contrastive training for embedding or retrieval models
Generative vision or video modeling
Multilingual or low‑resource language experience
Image or video data pipelines at scale
Experience balancing competing model objectives such as safety and capability
Prior work as an ML evaluator, red‑teamer, or baseliner

Required Domain Expertise

Pretraining: training transformer language models from scratch
Reinforcement learning: training agents in custom or existing environments
Post‑training: fine‑tuning and aligning LLMs
Dataset curation: building and cleaning large text corpora for LLM training
Model architecture: designing and modifying neural network architectures

Logistics (work trial requirements)

One baseline attempt per contractor per task; a task cannot be re-attempted by the same contractor
All work is confidential and covered by NDA
Compute and environment are provided; no personal GPU is required

Contact Details:

Obsidian Recruitment Team
