Blog

/ Rishi Desai / 5 min read

RL environment creation is becoming continuous QA

Why automating RL environment creation means building the loop around task generation.

research
/ Rishi Desai / 5 min read

Designing eval harnesses that prevent reward hacking

Trust boundaries for coding agents: verifiers, artifacts, and network access.

research
/ Joan Cabezas + Abundant Research Team / 9 min read

Frontier Models Caught Cheating

Frontier coding agents caught cheating on long-horizon tasks: a leaderboard, a taxonomy of reward hacks, and what it means for benchmarks.

research
/ Jesse Hu / 4 min read

Hillclimbing to an abundant future

Hillclimbing, the practice of making a number go up for a capability that resists clean definition, is the core bottleneck on the path to AGI.

company