Autoresearch

Autonomous optimization loop: edit → commit → benchmark → keep/revert → repeat.

What it does

Pi-autoresearch runs a continuous experiment loop against any measurable optimization target: test suite speed, bundle size, Lighthouse scores, model benchmark scores, or any other metric you can express as a command that returns a number. It is inspired by Andrej Karpathy's autoresearch concept.
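As an illustration of "a command that returns a number", here is a minimal Python sketch of a benchmark script that times another command and prints the median wall-clock seconds. The script name `bench.py`, the `median_seconds` helper, and the convention of printing a single number to standard output are assumptions made for this example, not part of pi-autoresearch.

```python
import statistics
import subprocess
import sys
import time


def median_seconds(command: list[str], runs: int = 3) -> float:
    """Run `command` several times and return the median wall-clock time in seconds."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken build is not
        # mistaken for a fast one.
        subprocess.run(command, check=True, capture_output=True)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python bench.py npm test
    # Prints exactly one number so a caller can read it as the metric.
    print(f"{median_seconds(sys.argv[1:]):.3f}")
```

Taking the median of a few runs is one simple way to reduce timing noise, which matters when the keep/revert decision depends on small differences.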

In each loop iteration, Pi makes an edit, commits it, runs your benchmark command, compares the result to the baseline, and either keeps or reverts the change. Results are logged to autoresearch.jsonl and persist across context resets, so you can stop and resume at any time. A live browser dashboard (/autoresearch export) shows results as they accumulate.
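The keep/revert decision and the append-only log described above can be sketched roughly as follows. This is an illustrative Python sketch, not the plugin's implementation: the function names, the JSON field names, and the `revert` callback are all hypothetical, and the real autoresearch.jsonl schema may differ.

```python
import json
from pathlib import Path
from typing import Callable


def decide(metric: float, baseline: float, lower_is_better: bool = True) -> str:
    """Return "keep" when the metric beats the baseline, otherwise "revert"."""
    improved = metric < baseline if lower_is_better else metric > baseline
    return "keep" if improved else "revert"


def record_iteration(
    log_path: Path,
    description: str,
    metric: float,
    baseline: float,
    revert: Callable[[], None],
    lower_is_better: bool = True,
) -> float:
    """Log one experiment, revert it if it did not help, and return the new baseline."""
    decision = decide(metric, baseline, lower_is_better)
    if decision == "revert":
        revert()  # caller-supplied, e.g. a function that undoes the last commit
    # Append-only JSON Lines: one record per experiment, so the history
    # survives restarts. The field names here are hypothetical.
    record = {
        "description": description,
        "metric": metric,
        "baseline": baseline,
        "decision": decision,
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return metric if decision == "keep" else baseline


def load_history(log_path: Path) -> list[dict]:
    """Read previously logged experiments, for example when resuming a session."""
    if not log_path.exists():
        return []
    with log_path.open(encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

Because each record is a separate line, a stopped session can be resumed by reading the file back with `load_history` and continuing from the last kept metric.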

The /skill:autoresearch-create command walks you through setup: what to optimize, what command to run, and what metric to track. /skill:autoresearch-finalize groups successful experiments into independent reviewable branches.

Why it's included

Some optimization work is inherently empirical — you can't reason your way to a faster webpack config or a better regularization schedule. Autoresearch turns Pi into an autonomous experimenter that can run hundreds of iterations while you sleep, keeping what works and discarding what doesn't.

Commands

| Command | What it does |
| --- | --- |
| `/autoresearch <text>` | Enter autoresearch mode, or resume if `autoresearch.md` exists |
| `/autoresearch off` | Leave autoresearch mode; the log is kept intact |
| `/autoresearch clear` | Delete log, reset all state, turn mode off |
| `/autoresearch export` | Open live browser dashboard that auto-updates as experiments run |
| `/skill:autoresearch-create` | Guided setup (goal, benchmark command, metric), then starts the loop |
| `/skill:autoresearch-finalize` | Group kept experiments into independent reviewable branches |

→ GitHub — davebcn87/pi-autoresearch