Most developers treat AI coding like a chat interface: prompt, watch, and intervene when it loses the thread. This human-in-the-loop (HITL) approach is fine for prototypes, but it fails on complex features.
The Ralph Loop is a technique for running AI coding agents in a simple bash loop to achieve autonomous, unsupervised work.
It solves the "implicit execution budget" problem, where an AI declares a job "done" even if the tests are still red.
You use a loop because AI has no taste, and it will lie to you about tests passing if you let it. The bigger win is that it solves "context rot." Long sessions don't just fill up the window; the model starts to summarize or "compact" history, losing your critical initial instructions.
By killing the session and restarting with a fresh context for every atomic task, you keep the agent sharp.

What is the Ralph Loop?
The Ralph Loop is not a complex framework; it's a simple bash while loop that runs an AI coding CLI like Claude Code or Amp. Coined by Geoffrey Huntley, the name refers to the Simpsons character Ralph Wiggum—the kid who fails constantly but keeps trying until he eventually succeeds. The loop treats failure as expected, not exceptional, forcing the agent to iterate until it meets binary success criteria.
Anthropic's Ralph plugin for Claude Code uses a stop hook to intercept the exit when the AI tries to end a session. But a "true" Ralph Loop runs outside the agent: it kills the process entirely and restarts it, which guarantees a clean context. This is the opposite of vibe coding, where you accept suggestions without scrutiny. In a Ralph Loop the agent, not the human, chooses the next task from a structured requirements file, explores the code, and implements changes until the "Completion Promise" sigil appears.
Why does looping work?
AI agents have a hidden execution budget. Once the model feels it has done a "reasonable" amount of work, it wraps up and exits based on how the code looks rather than how it works. You'll often find half-implemented APIs or skipped edge cases because the model decided it was "good enough."

Worse, context windows are just arrays. Every message adds to that array until the model starts "compaction," summarizing previous history to save space. Compaction can drop the original project instructions, so the agent's reasoning degrades the longer a session runs. Looping fixes this by starting each task with a fresh context. The agent stays focused because it isn't carrying the baggage of previous failed attempts or bloated history.
Anatomy of a Ralph Loop
A working loop relies on state files to carry memory between context resets. Don't write the prd.json yourself; humans are bad at writing binary, testable requirements. Instead, "mould the clay" by talking the spec through with the AI, then ask it to generate the structured JSON.

- prd.json: The living TODO list with binary passes: false/true flags.
- progress.txt: Short-term memory of decisions, blockers, and files changed.
- agents.md: Long-term, project-wide patterns and conventions.
- The PIN System: A Markdown lookup table linking specific features to filenames. It stops the agent from hallucinating directory structures or inventing file names.
- The Completion Promise: A specific sigil (e.g. <promise>COMPLETE</promise>) the agent emits only when every prd.json item passes.
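The supporting files in the list above can be seeded before the first run. This is a minimal sketch, not a prescribed layout: the file name PIN.md, the sample feature rows, and the function names are hypothetical, and prd.json is deliberately left out because it is meant to be generated with the AI.

```bash
#!/usr/bin/env bash
# Seed the supporting state files for a Ralph loop without overwriting
# anything that already exists. PIN.md and its sample rows are hypothetical.

seed_file() {
  # $1 = path, stdin = initial contents; existing files are left untouched
  if [ -e "$1" ]; then
    cat > /dev/null   # discard the template
  else
    cat > "$1"
  fi
}

init_ralph_state() {
  seed_file progress.txt <<'EOF'
# progress.txt: short-term memory, appended to by each iteration
# (decisions made, blockers hit, files changed)
EOF

  seed_file agents.md <<'EOF'
# Project conventions
<!-- Long-term, project-wide patterns the agent should follow -->
EOF

  seed_file PIN.md <<'EOF'
| Feature | Files |
|---|---|
| Signup endpoint | src/routes/signup.ts |
| Users table | migrations/001_users.sql |
EOF
}
```

Calling `init_ralph_state` from the repository root creates whichever of the three files are missing, so it is safe to re-run between loops.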
A minimal prd.json looks like this:
```json
{
  "requirements": [
    {
      "task": "Add users table migration",
      "passes": true
    },
    {
      "task": "Implement signup endpoint",
      "passes": false
    }
  ]
}
```

Getting started: from HITL to AFK
Don't jump straight to autonomous overnight builds. Learn the screwdriver before you pick up the jackhammer.

- Level 1 — The screwdriver (HITL): Run a single iteration by hand and watch the agent. This is where you refine the prompt and the AI-generated PRD.
- Level 2 — The power drill (attended loops): Run 5–10 iterations at your desk. Catch mistakes early and pause the moment the agent starts going off-track.
- Level 3 — The jackhammer (AFK): Once you trust the feedback loops, set the agent to run 30–50 iterations while you're away from the keyboard.
A practitioner's afk-ralph.sh script wires those iterations together:
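```bash
#!/usr/bin/env bash
# afk-ralph.sh: start a fresh agent process for every iteration
set -eu  # stop on script errors and unset variables

iterations=${1:?usage: afk-ralph.sh <iterations>}

for i in $(seq 1 "$iterations"); do
  echo "Iteration $i of $iterations"
  # -p runs Claude in non-interactive print mode. The pipeline's exit status
  # is tee's, so a failed agent run does not stop the loop; it just retries.
  claude -p "Implement the next task in prd.json. Output <promise>COMPLETE</promise> when all tasks pass." | tee output.log
  if grep -q "<promise>COMPLETE</promise>" output.log; then
    echo "All tasks complete."
    exit 0
  fi
done

echo "Reached $iterations iterations without the completion promise." >&2
exit 1
```

Principles that keep Ralph on the rails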
Engineering guardrails are the only thing standing between you and a $100 token bill with nothing to show for it.
- Small steps: Tasks must be atomic. If a task is too big, the agent runs out of context and produces garbage.
- Feedback loops: Non-negotiable. You cannot trust an AI to judge its own work.
- Risk prioritization: Tackle architectural "spikes" and integration points first. Failing fast on a hard problem beats finishing ten easy UI tasks on a broken foundation.
A feedback loop is any check the agent can't argue with:
| Feedback type | What it catches | Priority |
|---|---|---|
| Typecheck | Type mismatches, missing props | Essential |
| Linting | Code style, obvious logic bugs | High |
| Unit tests | Broken logic, regressions | Critical |
| Playwright | UI bugs the model can't "see" in code | High |
| Build check | Compilation or dependency errors | Critical |
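The checks in the table can be chained into a single gate that the loop, or the agent itself, runs after each change. This is a sketch under assumptions: `run_checks` is a hypothetical helper, and the npm script names in the usage comment stand in for whatever your project actually defines.

```bash
# Run every check in order, report each failure, and return non-zero
# if any of them failed. Running all of them (rather than stopping at
# the first failure) gives the agent the full list of problems at once.
run_checks() {
  failed=0
  for check in "$@"; do
    echo "==> $check"
    if ! sh -c "$check"; then
      echo "FAILED: $check" >&2
      failed=1
    fi
  done
  return "$failed"
}

# Example wiring; these npm script names are hypothetical:
# run_checks "npm run typecheck" "npm run lint" "npm test" "npm run build"
```

Because the result is an exit code rather than the agent's own summary, a red check stays red no matter what the model claims.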
When Ralph fits — and when it doesn't
Ralph is "pay to play." High-end models like Claude Sonnet cost roughly $10/hour to loop. Local models aren't viable yet; they lack the reasoning to manage these autonomous state transitions.
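One way to keep that cost bounded is to turn a dollar budget into a wall-clock cap. The sketch below takes the rough hourly rate quoted above at face value, so treat the result as an estimate; `budget_seconds` and `within_budget` are hypothetical helper names.

```bash
# Convert a dollar budget into a wall-clock cap, given an assumed hourly
# rate. Both arguments are whole dollars; the rate is an estimate.
budget_seconds() {
  budget_usd=$1
  rate_usd_per_hour=$2
  echo $(( budget_usd * 3600 / rate_usd_per_hour ))
}

# Succeeds (exit 0) while the time elapsed since $1 (epoch seconds)
# is still under the cap $2 (seconds).
within_budget() {
  started_at=$1
  cap_seconds=$2
  now=$(date +%s)
  [ $(( now - started_at )) -lt "$cap_seconds" ]
}

# Usage inside a loop:
#   started=$(date +%s); cap=$(budget_seconds 100 10)
#   within_budget "$started" "$cap" || break
```

With a $100 budget at $10/hour, `budget_seconds 100 10` prints 36000, which is ten hours of looping.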
Safety and the lethal trifecta
The "lethal trifecta" is the combination of untrusted tokens, internet access, and access to secret data. To defuse it, always run AFK agents inside Docker sandboxes. That stops a runaway agent or a malicious prompt injection from running rm -rf / or stealing your SSH keys. Ralph is great for generating 90% of a feature overnight—but you still review the git log in the morning.
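A sandboxed run might be launched like the sketch below. The image name `ralph-sandbox` is hypothetical (you would build your own image with the agent CLI installed), and passing the key through `ANTHROPIC_API_KEY` assumes the CLI authenticates from that environment variable. Only the project directory is mounted, so host files such as ~/.ssh are not visible inside the container; the container still has network access, so this limits the damage rather than eliminating it.

```bash
# Print (DRY_RUN=1) or run a docker command that executes the loop with
# only the current project directory mounted.
# "ralph-sandbox" is a hypothetical image name.
ralph_sandbox() {
  iterations=$1
  set -- docker run --rm \
    --volume "$PWD:/workspace" \
    --workdir /workspace \
    --env ANTHROPIC_API_KEY \
    --cap-drop ALL \
    --pids-limit 512 \
    ralph-sandbox ./afk-ralph.sh "$iterations"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}
```

Running `DRY_RUN=1 ralph_sandbox 20` prints the command without starting a container, which is a cheap way to review the mounts and flags before going AFK.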
FAQ
Is Ralph a plugin? No. A proper Ralph loop runs outside the agent as a bash script. Plugins that run inside the agent don't fix context rot because they never reset the context array.
Can it run in parallel? It's a red hot mess. Coordinating non-deterministic agents usually ends in contention, with each one stepping on the others' toes. Stick to a monolithic process for reliability.
Why use prd.json instead of a plain list? JSON gives you unambiguous binary tracking. The agent can programmatically flip passes: true once its feedback loops go green.
Can Ralph fix bugs automatically? Yes. Point Ralph at linting errors or failing tests and let it iterate until the status is green. It's software entropy running in reverse.
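The binary tracking in prd.json can also be verified from outside the agent instead of relying on the completion sigil alone. This sketch assumes the layout shown earlier, with each passes flag on its own line; a JSON-aware tool such as jq would be more robust. Both function names are hypothetical.

```bash
# Count requirements still marked "passes": false in a prd.json file.
# Assumes one flag per line, as in the minimal example.
pending_tasks() {
  grep -c '"passes"[[:space:]]*:[[:space:]]*false' "$1"
}

# Succeeds only when the file is readable and no requirement is pending.
all_tasks_pass() {
  [ "$(pending_tasks "$1")" -eq 0 ]
}

# Usage: all_tasks_pass prd.json && echo "Verified: every requirement passes."
```

If the file is missing or unreadable the check fails rather than passing, so a broken state file never looks like a finished job.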
References
- Everyone's Using Ralph Loops Wrong. Here's What Actually Works. — Alex Dunlop
- 11 Tips For AI Coding With Ralph Wiggum — Matt Pocock
- Everything is a Ralph Loop — Geoffrey Huntley
- Getting Started With Ralph — AI Hero
- Claude Code Ralph Loop: From Basic Prompts to Autonomous Overnight Builds — Joe Njenga
- snarktank/ralph — GitHub
- Ralph Loops: Build Dumb AI Loops That Ship — Chris Parsons, Cherrypick