AGENTS.md: A Complete Guide and Best Practices for Developers

An AGENTS.md (or CLAUDE.md) is a repository-level context file designed to steer coding agents like Sonnet-4.5, GPT-5.2, and Qwen3-30b.

These files act as a specialized instruction layer, providing the project-specific tooling and conventions that are often missing from raw source code. While over 60,000 repositories have adopted this format, implementing one is not a guaranteed win; its utility is dictated entirely by how it is authored.

Empirical data from ETH Zurich shows that human-written context files provide a 4% performance boost in task resolution across benchmarks. However, LLM-generated files frequently decrease success rates by 0.5% to 2%. The failure mode is almost always redundancy: auto-generated files tend to restate information already available in the repository, which adds noise and confuses the agent's reasoning process.

You must view these files as a deliberate engineering tradeoff. While they improve adherence to specific tools, they impose a roughly 20% "tax" on every task in the form of increased inference costs. Unless you are prepared to manually curate these files to fill the "information gap" in niche repositories, they are more likely to bloat your CI budget than solve your bugs.

Illustration of an AGENTS.md file acting as a guide that steers a coding agent working inside a repository.

What is AGENTS.md and what problem does it solve?

Coding agents face a significant "context gap" when assigned to niche or internal repositories. While standard benchmarks like SWE-bench use popular, well-documented projects, the ETH Zurich AGENTbench focuses on "less-popular" repositories where the code is often the only source of truth. In these environments, agents struggle with non-obvious tooling decisions, specialized CI setups, and project-specific conventions that aren't apparent from a raw file tree.

The core problem is that agents frequently hallucinate library usage or fail to execute correct test runners because the developer's intent isn't explicitly documented for a machine. AGENTS.md functions as a "README for agents," reducing tool errors and environment setup failures. It tells the agent exactly which commands are valid, preventing it from guessing whether to use pip, poetry, or uv.

However, these files are only effective when they provide additive information. If a repository already has exhaustive documentation, an AGENTS.md often becomes a liability by mirroring existing data. Its primary purpose is to act as a bridge for agents working in environments where documentation is sparse or non-standard.

How coding agents read and apply AGENTS.md

Technically, the file is injected directly into the agent's context window at the start of a session (as CLAUDE.md for Claude Code or AGENTS.md for Codex/Qwen). The agent treats these instructions as high-priority system constraints. This is highly effective for tool selection: mentioning the package manager uv in a context file increases its usage frequency from near-zero to 1.6 times per task instance.
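To make the injection step concrete, here is a minimal sketch of how an agent harness might load a context file and prepend it to the task prompt. It is not the implementation of any particular agent: the lookup order, the function names `load_context_file` and `build_prompt`, and the prompt layout are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical lookup order; each real agent defines its own file name.
CONTEXT_FILENAMES = ("AGENTS.md", "CLAUDE.md")


def load_context_file(repo_root: Path) -> str:
    """Return the text of the first context file found at the repo root, or ""."""
    for name in CONTEXT_FILENAMES:
        candidate = repo_root / name
        if candidate.is_file():
            return candidate.read_text(encoding="utf-8")
    return ""


def build_prompt(repo_root: Path, task: str) -> str:
    """Prepend the context file, if one exists, to the task description."""
    context = load_context_file(repo_root)
    if not context:
        return task
    # Assumed layout: instructions first, so they are read before the task.
    return f"# Repository instructions\n{context.strip()}\n\n# Task\n{task}"
```

Because the file is prepended to every session, every token in it is paid for on every task, which is why the rest of this guide argues for keeping it short.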

Diagram of how a coding agent loads AGENTS.md into its context window and selects the right tools.

Agents map their internal tools (Read, Write, Grep, and Edit) to the instructions you provide. If you give an exact pytest command string, the agent will adapt its "Run Test" intent to match your project's environment.

The hazard here is "Process over Outcome." Agents are instruction-following systems; if you provide a detailed codebase overview, the agent will prioritize following that "map" over finding the bug. Data shows that providing overviews does not help the agent reach the target file faster; it merely encourages the agent to traverse more of the repository to comply with the instructions, often at the expense of the actual fix.

What should an AGENTS.md file contain?

Your content must focus exclusively on Gap Content: the non-obvious requirements that aren't discoverable via ls or a standard README.

What belongs in an AGENTS.md file: gap-content categories like tooling, CI configs and non-default conventions versus the directory-tree overviews to leave out.

Tooling Choices: explicit commands for uv, pytest, or ruff.
CI/Test Configs: non-default flags or environment variables required for successful runs.
Non-default Conventions: specific architectural patterns or libraries to avoid.

Hard requirement: do not include directory trees or codebase overviews. 100% of LLM-generated files in the ETH Zurich study included these, and they were the primary drivers of performance drops.
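One way to enforce this rule is a small pre-commit style check that flags pasted directory trees. The sketch below is a heuristic, not a method from the study: it only recognizes the box-drawing output of tools like `tree`, and the name `find_tree_lines` and the `min_run` threshold are assumptions.

```python
import re

# Heuristic: a line that looks like `tree` output, e.g. "│   └── main.py".
TREE_LINE = re.compile(r"^\s*(?:[│|]\s*)*[├└]──\s+\S")


def find_tree_lines(text: str, min_run: int = 2) -> list[int]:
    """Return 1-based line numbers in runs of at least `min_run` tree-like lines."""
    flagged: list[int] = []
    run: list[int] = []
    for number, line in enumerate(text.splitlines(), start=1):
        if TREE_LINE.match(line):
            run.append(number)
            continue
        # A non-tree line ends the current run; keep it only if long enough.
        if len(run) >= min_run:
            flagged.extend(run)
        run = []
    if len(run) >= min_run:
        flagged.extend(run)
    return flagged
```

A non-empty result from `find_tree_lines(Path("AGENTS.md").read_text())` is a prompt for manual review, not proof of a problem. Prose overviews of the codebase will not be caught by this check and still need a human eye.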

Sample AGENTS.md (human-curated style)

# Agent Guidelines
 
## Tooling & Commands
 
- Always use `uv` for dependency management.
- Test runner: `uv run pytest tests/ --cov`
- Linter: `ruff check . --fix`
 
## Project Conventions
 
- We use `pytest` fixtures instead of the `mock` library.
- New functions must include type hints (PEP 484).
- Database: CI requires `DATABASE_URL=sqlite:///:memory:`.
 
## Forbidden Actions
 
- Do not use `pip` or `venv` directly.
- Avoid modifying files in `legacy/` unless explicitly directed.

Does AGENTS.md actually help agents code better?

Data from AGENTbench and SWE-bench Lite reveal that while human-written files provide a 4% improvement, LLM-generated files cause a 0.5% to 2% performance drop.

The cost of this context is steep and non-negotiable:

20% increase in total inference cost per task.
14% to 22% increase in reasoning tokens (for models like GPT-5.2 and GPT-5.1 mini).
2–4 additional steps taken per task resolution.

This results in the "Exploration Paradox." Agents with context files search more files and generate more reasoning, but they don't reach the target file any faster than agents with no context. Detailed repository overviews often distract the agent, leading it to prioritize "complying with the map" over identifying the root cause of an issue.
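The cost and success figures above can be combined into a rough cost-per-resolved-task estimate. The sketch below rests on assumptions: it treats the 4% and 2% figures as absolute percentage points, and the baseline cost and success rate passed in are hypothetical inputs you would replace with your own measurements.

```python
def cost_per_resolved_task(cost_per_task: float, success_rate: float) -> float:
    """Average spend needed to get one successfully resolved task."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_task / success_rate


def with_context_file(
    cost_per_task: float,
    success_rate: float,
    cost_increase: float = 0.20,  # ~20% higher inference cost per task
    success_delta: float = 0.04,  # assumed +4 points for a human-written file
) -> float:
    """Cost per resolved task once a context file is added."""
    new_cost = cost_per_task * (1 + cost_increase)
    new_rate = min(1.0, success_rate + success_delta)
    return cost_per_resolved_task(new_cost, new_rate)
```

Comparing `cost_per_resolved_task(1.0, 0.40)` with `with_context_file(1.0, 0.40)` shows the tradeoff for one hypothetical baseline. Passing `success_delta=-0.02` models the LLM-generated case, where you pay more per task and resolve fewer of them.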

Performance comparison: human-written files add 4% success while LLM-generated files drop up to 2%, alongside a ~20% rise in inference cost.

Best practices for writing AGENTS.md

The "write for the gap" principle is your primary defense against the performance penalties found in the ETH Zurich study.

Kill the overviews: directory trees and README mirrors are active hazards. Agents are capable of running ls or find; do not waste context tokens on what they can see for themselves.
Be minimal: instructions should focus only on what is not discoverable through code analysis.
The pruning hack: if you must use an agent to generate a context file, delete all existing documentation files (.md, /docs) from the repository first. In testing, this improved auto-generated file performance by 2.7% by forcing the agent to extract non-obvious tooling decisions rather than paraphrasing existing text.
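The pruning hack does not have to touch your working tree. As a sketch, and assuming it is acceptable to run the generating agent against a temporary copy, the helper below copies the repository while leaving out Markdown files and `docs` directories. The function name `make_docless_copy` is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def make_docless_copy(repo_root: Path) -> Path:
    """Copy the repo to a temp dir, skipping *.md files and docs/ directories."""
    target = Path(tempfile.mkdtemp(prefix="docless-")) / "repo"
    shutil.copytree(
        repo_root,
        target,
        # Patterns are matched against entry names at every directory level.
        ignore=shutil.ignore_patterns("*.md", "docs"),
    )
    return target
```

Point the agent's generation command at the returned path, then copy the resulting context file back into the real repository and curate it by hand. The caller is responsible for deleting the temporary copy afterwards.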

AGENTS.md in monorepos

In monorepos, these files should reside at the root to define global standards (e.g., "All services use ruff for linting"). While they can help an agent navigate high-level relationships, the same "Exploration Paradox" applies: enumerating every subdirectory does not reduce the steps needed to find a specific bug. Avoid mapping the entire structure; focus on the unified tooling stack that applies across the sub-packages.
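If you want to audit a monorepo against this root-first guideline, a short script can list context files that live below the root so you can decide whether their contents belong in the shared file. This is only a sketch: the helper name `nested_context_files` is hypothetical, and it assumes the file is named `AGENTS.md`.

```python
from pathlib import Path


def nested_context_files(repo_root: Path) -> list[Path]:
    """Return repo-relative paths of AGENTS.md files that are not at the root."""
    return sorted(
        path.relative_to(repo_root)
        for path in repo_root.rglob("AGENTS.md")
        if path.parent != repo_root
    )
```

An empty list means all guidance lives in the root file. Anything else is worth reviewing for rules that really apply to every sub-package.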

How to get started with AGENTS.md

Most agent interfaces use initialization commands like /init. For example:

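```bash
# General interface pattern (Claude Code shown; other agents differ)
claude        # start an interactive session in the repository root
# then, inside the session, type:
#   /init
```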

While these commands generate a baseline, manual curation is the required second step. You must prune the output to remove redundant README content. Autonomous generation carries a performance penalty; the research is clear that your intervention is needed to transform a generic overview into a high-value "gap" document.

FAQ

Does using AGENTS.md increase my API bill? Yes. Context is not free; expect roughly a 20% increase in inference cost per task. You will also see a 14–22% increase in reasoning tokens as the agent processes the additional constraints.

Can I use an LLM to write my AGENTS.md? Only if you prune it. Raw LLM-generated files decrease success rates by up to 2%. To fix this, delete your /docs and README before generation to reduce the redundancy penalty, then manually edit the result.

Why do repository overviews hurt performance? They trigger the "Exploration Paradox." Detailed maps encourage agents to take more steps and traverse more files to ensure "compliance" with the overview, which distracts them from the actual bug fix.

Should I use AGENTS.md for popular open-source projects? Probably not. Popular repos already have documentation that agents have seen in training. These files are most effective in niche or internal repositories where the "information gap" is widest.
