autoresearch — open source GitHub project

autoresearch is Andrej Karpathy’s experiment with AI agents that automatically run research around single-GPU nanochat training.

What it is

The repository is an experiment in automating research with AI agents. It centers on single-GPU nanochat training: an agent runs experiments, reads the resulting metrics, and proposes the next step.

The interesting part is not “press a button and get science” but the workflow it demonstrates: routine parts of research, such as iterating on hypotheses, launching runs, and inspecting results, can be delegated to an agent when the environment is constrained and reproducible.
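The run, measure, propose cycle can be sketched in a few lines. This is a toy illustration, not code from the repository: `run_experiment`, `propose_change`, and `research_loop` are hypothetical names, the "loss" is a made-up quadratic in the learning rate, and the "agent" is a random perturbation standing in for a language model.

```python
import random
from dataclasses import dataclass


@dataclass
class Result:
    config: dict
    metric: float  # lower is better, e.g. a validation loss


def run_experiment(config: dict) -> float:
    """Toy stand-in for a real training run: returns a fake loss."""
    # Hypothetical objective with its minimum at lr = 0.01.
    return (config["lr"] - 0.01) ** 2


def propose_change(best: dict, rng: random.Random) -> dict:
    """Toy stand-in for the agent: perturb the best known config."""
    factor = rng.choice([0.5, 0.8, 1.25, 2.0])
    return {**best, "lr": best["lr"] * factor}


def research_loop(start: dict, steps: int, seed: int = 0) -> list[Result]:
    rng = random.Random(seed)  # fixed seed keeps the loop reproducible
    best = Result(start, run_experiment(start))
    history = [best]
    for _ in range(steps):
        candidate = propose_change(best.config, rng)
        result = Result(candidate, run_experiment(candidate))
        history.append(result)  # every run is logged, not only the wins
        if result.metric < best.metric:  # keep only measured improvements
            best = result
    return history


history = research_loop({"lr": 0.1}, steps=20)
best = min(history, key=lambda r: r.metric)
print(f"best lr={best.config['lr']:.4f} loss={best.metric:.6f}")
```

The loop only works because the metric is cheap to compute and the comparison is unambiguous, which is the "constrained and reproducible" condition in practice.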

What is inside

The README covers quick start with `uv`, data download, tokenizer training, manual single experiments, agent execution, project structure, and design choices. The repository documents not only the code but also the author’s reasoning behind the experimental setup.

A practical way to use it is to reproduce a small local research environment, confirm that a single experiment runs manually, and only then let an agent make a series of changes. This suits tasks whose results can be measured and compared quickly.

Running the research loop

These commands show the idea: prepare the environment and data, then run a single experiment or the agent. The script names are illustrative; the README lists the exact entry points.

Language: Bash

uv sync
uv run python data.py
uv run python train.py
uv run python agent.py
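When such commands are launched from a script rather than typed by hand, it helps to enforce a time budget and record the outcome of each run. The sketch below is an assumption about how that could look, not the repository's mechanism: `run_with_budget` is a hypothetical helper, and a one-line Python command stands in for the real training script.

```python
import subprocess
import sys
import time


def run_with_budget(cmd: list[str], budget_s: float) -> dict:
    """Run a command under a wall-clock budget and return a small record."""
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=budget_s
        )
        status = "ok" if proc.returncode == 0 else "failed"
        output = proc.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        status, output = "timeout", ""
    return {
        "cmd": cmd,
        "status": status,
        "seconds": time.monotonic() - start,
        "stdout": output,
    }


# Stand-in for a real training command; the printed text is made up.
record = run_with_budget(
    [sys.executable, "-c", "print('val_loss 1.23')"], budget_s=30
)
print(record["status"], record["stdout"].strip())
```

A fixed budget per run keeps experiments comparable and stops a broken change from stalling the whole series.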

Strengths

Its main strength is an honestly limited scope. autoresearch does not pretend to be a universal platform; it presents a concrete lab built around nanochat, which makes it useful for thinking about reproducible automation of AI research.

Limits

The limits are the same narrow scope and the risk of overinterpreting results. Agents can run experiments, but scientific quality still depends on how the task is framed, which metrics are chosen, how randomness is controlled, and human review.
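One way to make the randomness point concrete is to repeat each variant over several seeds and accept a gain only when it exceeds the seed-to-seed spread. The sketch below uses simulated numbers throughout: `noisy_run`, `summarize`, and `looks_real` are hypothetical names, the noise level is invented, and the threshold of two standard deviations is a crude rule of thumb rather than a proper statistical test.

```python
import random
import statistics


def noisy_run(quality: float, seed: int) -> float:
    """Toy stand-in for a training run: a 'true' loss plus seeded noise."""
    # String seeds are deterministic; the noise differs per variant and seed.
    rng = random.Random(f"{quality}:{seed}")
    return quality + rng.gauss(0.0, 0.02)


def summarize(quality: float, seeds: list[int]) -> tuple[float, float]:
    """Mean and sample standard deviation of the metric across seeds."""
    values = [noisy_run(quality, s) for s in seeds]
    return statistics.mean(values), statistics.stdev(values)


def looks_real(baseline: tuple[float, float],
               candidate: tuple[float, float],
               k: float = 2.0) -> bool:
    """Accept a gain only if it exceeds k times the larger spread."""
    (b_mean, b_std), (c_mean, c_std) = baseline, candidate
    return (b_mean - c_mean) > k * max(b_std, c_std)


seeds = [0, 1, 2, 3, 4]
baseline = summarize(1.00, seeds)   # simulated baseline loss
candidate = summarize(0.99, seeds)  # simulated 1% improvement
print("baseline  mean/std:", baseline)
print("candidate mean/std:", candidate)
print("improvement beyond noise:", looks_real(baseline, candidate))
```

A check like this does not replace human review, but it filters out "improvements" that are just a lucky seed before anyone spends time interpreting them.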