R1 · the agent that proves its work · open source · self-hostable

Agents that ship. Then prove they shipped.

R1 is an open-source agent framework that plans, runs, checks, and records software work before it merges. It can orchestrate Claude Code and Codex as sub-agents, but the harness itself keeps the proof trail and runs the gate checks.

For solo builders, platform teams, and regulated industries that need proof, not just output.

Ship end to end from plan to commit. Catch regressions before merge. Replay any run with the paper trail intact. Use your models without buying another runtime. Self-host cleanly when data must stay put. Review less because the harness flags risk first.

Demos: 9 hands-on demos you can run right now.

Open source: Apache 2.0 core you can audit, fork, and self-host.

Model families: R1 default, Claude Code, Codex, and custom model wiring.

Receipts: every action signed, replayable, and ready for review.

what R1 is · 02

Three things you actually get.

Not a chat window. Not a black box. Three concrete capabilities shipped in one open-source runtime.

Agents that finish what they start

R1 plans, executes, verifies, and commits, and it does not merge a change that failed its own checks.

full lifecycle · Read the spec →

Roll back any step. Replay any run.

Every action is signed and recorded, so you can replay a run, inspect the proof chain, or rewind a step without guessing.

receipt-first · How it works →
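To make "signed and recorded" concrete, here is a minimal sketch of a hash-chained receipt log. It is an illustration only, not R1's actual receipt format, which this page does not specify: the names `append_receipt`, `verify_chain`, and `SIGNING_KEY` are hypothetical, and an HMAC with a demo key stands in for whatever signature scheme the runtime really uses.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a real deployment would use proper key management.
SIGNING_KEY = b"demo-key-not-for-production"
GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first receipt


def _digest(body: dict) -> str:
    """Hash the canonical JSON form of a receipt body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def _sign(receipt_hash: str) -> str:
    """Stand-in signature: HMAC-SHA256 over the receipt hash."""
    return hmac.new(SIGNING_KEY, receipt_hash.encode("utf-8"), hashlib.sha256).hexdigest()


def append_receipt(chain: list, action: str, detail: dict) -> dict:
    """Append a receipt that links to the previous one by hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    body = {"index": len(chain), "action": action, "detail": detail, "prev": prev_hash}
    receipt_hash = _digest(body)
    receipt = {**body, "hash": receipt_hash, "sig": _sign(receipt_hash)}
    chain.append(receipt)
    return receipt


def verify_chain(chain: list) -> bool:
    """Recompute every hash, signature, and link; any edit breaks the chain."""
    prev_hash = GENESIS_HASH
    for i, receipt in enumerate(chain):
        body = {k: receipt[k] for k in ("index", "action", "detail", "prev")}
        if receipt["index"] != i or receipt["prev"] != prev_hash:
            return False
        expected_hash = _digest(body)
        if receipt["hash"] != expected_hash:
            return False
        if not hmac.compare_digest(receipt["sig"], _sign(expected_hash)):
            return False
        prev_hash = receipt["hash"]
    return True
```

Because each receipt embeds the hash of the one before it, changing any recorded step invalidates that receipt and every later link, which is the property a replayable proof chain depends on.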

Bring your model. Bring your infrastructure.

Run on your laptop, on Heroa, or on your own infrastructure. Use the model accounts your team already trusts.

self-host ready · Where it runs →

who picks R1 · 03

Three kinds of teams pick R1 for three different reasons.

You are the whole engineering team. You want an agent that can pick up a feature, ship it end to end, and not require a 30-minute review of every line. R1 gives you the harness so you only have to review what the harness flagged.

laptop-friendly · homebrew install · one binary · no extra infrastructure

For solo builders →

how R1 works · 05

PLAN. EXECUTE. VERIFY. COMMIT.

Four steps, one harness. Skip any of them and the work does not merge.

Plan

R1 reads the task, surveys the codebase, and writes an explicit plan you can approve or amend before execution starts.

human-readable · reversible · approval first

Execute

R1 runs the plan, chooses the right agent for each step, and records every tool call, file change, and retry path.

parallel work · full receipt trail · your models

Verify

Tests run, reviews run, and gate checks run. If anything fails, R1 surfaces it instead of merging it quietly.

cross-family review · fails loud · policy-gated

Commit

Only after the gates pass. The merge is serialized, the receipt is written, and the full run stays replayable from that change.

atomic merge · signed receipt · rollbackable
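The four steps above can be sketched as a small gated pipeline. This is a rough illustration under stated assumptions, not R1's implementation or API: `run_pipeline`, `RunResult`, and the callback parameters are hypothetical names chosen for the example. It shows only the control flow the page describes: approval before execution, every gate run and reported, and a commit only when no gate failed.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunResult:
    merged: bool
    failed_gates: List[str] = field(default_factory=list)
    log: List[str] = field(default_factory=list)


def run_pipeline(
    steps: List[str],
    approve: Callable[[List[str]], bool],
    execute: Callable[[str], None],
    gates: Dict[str, Callable[[], bool]],
    commit: Callable[[], None],
) -> RunResult:
    """Hypothetical plan -> execute -> verify -> commit flow."""
    result = RunResult(merged=False)

    # Plan: nothing executes until the plan is approved.
    if not approve(steps):
        result.log.append("plan rejected")
        return result

    # Execute: run each step and record it.
    for step in steps:
        execute(step)
        result.log.append(f"executed: {step}")

    # Verify: run every gate so all failures surface, not just the first.
    for name, check in gates.items():
        passed = check()
        result.log.append(f"gate {name}: {'pass' if passed else 'FAIL'}")
        if not passed:
            result.failed_gates.append(name)

    # Commit: only when no gate failed.
    if result.failed_gates:
        return result
    commit()
    result.merged = True
    result.log.append("committed")
    return result
```

Running all gates before deciding, rather than stopping at the first failure, is a design choice made for this sketch so that a failed run reports everything that needs attention in one pass.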

See the full flow →

a Tuesday with R1 · 06

Here is what shipping with R1 actually looks like.

Not a glossy promise. The cadence of one engineer’s day with an agent that does not ship work without a paper trail.