A plain-language, end-to-end guide to OpenAI Codex — the CLI in your terminal, the extension in your IDE, the desktop app on your dock, the cloud agent that works on tasks while you do something else, and the GitHub bot that reviews your PRs. What each one is for, when to use which, and the exact commands to type.
If you can copy a sentence from a webpage into a terminal, you have everything you need.
Codex is one product with five front doors. The cloud at chatgpt.com/codex needs zero setup: you connect a repo and hand off tasks without leaving the browser tab, and nothing is installed on your machine. The CLI lives in your terminal and runs locally. The IDE extension brings the same agent into VS Code, Cursor, or Windsurf. The desktop app is the cloud and the local agent fused into one window with native notifications. And the GitHub bot reviews PRs when you mention @codex in a comment. Most people start in the cloud, then install the desktop app or CLI within a week.
Sign in with ChatGPT, connect a GitHub repo, give Codex a task. It spins up a container, makes the changes, opens a PR. You can fire off a dozen in parallel and check back later. Included with Plus, Pro, Business, Enterprise, and Edu plans.
Open chatgpt.com/codex →
One npm i -g @openai/codex (or brew install codex) and you have an agent that reads, edits, and runs code in the project you launched it from. Sign in with your ChatGPT account; no API key needed.
Install from the VS Code marketplace and the same agent runs in a side panel — with the diff viewer, file picker, and your active selection wired in. Cursor and Windsurf are also supported (they share the VS Code extension model).
See IDE setup →
A native window that fuses the cloud dashboard, the local agent, and your inbox of running tasks. Notifications, global hotkey, system-tray status, and a richer diff viewer than the IDE. Best if you have several repos and want one place to watch them.
See desktop setup →
Install the Codex GitHub app, then mention @codex in any issue or PR comment. It can review diffs, propose fixes as commits, answer questions about the code, and run the cloud agent to implement small changes end-to-end.
Codex is OpenAI's coding agent. It reads files, writes patches, runs commands, asks you questions, follows up on test failures, and stops when it's stuck. The same model and the same instruction set power all five surfaces; the only thing that changes is where the work runs and what kind of approval gate sits between the model and your filesystem.
Your code, your machine, your shell. The agent edits files in place and runs commands inside a sandbox you control. Best for tight loops where you want to watch the dev server, see the failure, ask Codex to fix it — and merge before lunch.
A fresh container per task, cloned from your repo. You give Codex a job, walk away, come back to a PR. Fan out: spin up ten tasks at once and pick the ones that landed clean. Best for backlog grooming, refactors, and "I wonder if this is doable."
Codex is one of the fastest-moving products at OpenAI, so any list of its surfaces will need updates often.
The CLI is the highest-leverage surface — install it once and the IDE extension, the cloud, and the GitHub bot all become richer because you can flip between them. Two install paths cover almost everyone: npm and Homebrew.
If you have Node 20+ already, this is three commands:
    # 1. install
    $ npm install -g @openai/codex
    # 2. sign in with your ChatGPT account (opens a browser)
    $ codex login
    # 3. start a session in the current directory
    $ codex
With Homebrew:
    $ brew install codex
    $ codex login
    $ codex
Codex runs in PowerShell 7+, Windows Terminal, and inside WSL. The native Windows build does file edits the same way as on macOS/Linux; the sandbox is less strict (Windows doesn't have Seatbelt or Landlock). If you do a lot of agent work, run it inside WSL Ubuntu — you get the Linux sandbox and faster filesystem ops on git repos.
You have two ways in. ChatGPT login is the default: if you already pay for Plus, Pro, Business, Enterprise, or Edu, Codex usage is included with your plan at no extra charge. An API key bills usage against your OpenAI API account instead, which is useful if you're on the free ChatGPT tier or you want to switch billing to a company API account.
- ~/.codex/auth.json.
- codex login errors out, that's the most common reason.
    $ npm update -g @openai/codex   # npm install
    $ brew upgrade codex            # homebrew install
    $ codex --version               # check what you're on
Codex ships often — usually a release every week or two. If something behaves oddly, update before you debug.
Two ways to pay for Codex, and you can mix them. A ChatGPT subscription (Plus, Pro, Business, Enterprise, Edu) includes Codex usage with monthly limits — most people start here and never leave. An OpenAI API key bills usage per-token against your developer account — useful when you blow past the ChatGPT limits, when you're on the free tier, or when a finance team wants Codex billed against API spend instead of seats. Pick whichever fits, swap any time with codex logout.
Codex is included with every paid ChatGPT plan. The plan controls how much you can use it before you hit a limit, and which surfaces unlock.
| Plan | Price | Codex usage | Surfaces | Best for |
|---|---|---|---|---|
| Free | $0 | Not included | — | API key route only |
| Plus | $20/mo | Generous everyday usage. Enough for a few cloud tasks a day plus a steady CLI/IDE habit. | All five (CLI, IDE, Desktop, Cloud, GitHub) | Solo devs, side projects, learning |
| Pro | $200/mo | Much higher limits — built for power users who fan out tasks in the cloud and use Codex all day. | All five, with priority cloud throughput | Full-time Codex users, heavy cloud delegation |
| Business | ~$25/user/mo | Plus-tier limits per seat, pooled at the workspace level. Admin controls, SSO, audit log. | All five + admin dashboard, workspace AGENTS.md, SCIM | Small & mid-sized teams |
| Enterprise | Custom | Custom usage caps, data residency options, retention controls, SAML, BAA available. | All five + enterprise SSO, DLP integration, private endpoints | Larger orgs, regulated industries |
| Edu | Discounted/free for verified institutions | Business-equivalent usage at education pricing. | All five | Universities, accredited K-12 |
Both codex login --api-key and the OPENAI_API_KEY environment variable route usage through your OpenAI developer account at platform.openai.com. You pay per million input/output tokens, with billing showing up on the same dashboard as any other API spend.
| Model | Relative cost | What you're paying for |
|---|---|---|
| gpt-5-codex | Frontier-tier | Best agentic coder; deep reasoning; the right pick for real work. |
| gpt-5.1-codex | Frontier-tier | Latest revision, in the same price band as gpt-5-codex. |
| gpt-5-codex-mini | ~5-10× cheaper | Routine edits, quick reads, batch jobs. The everyday workhorse for cost-sensitive workflows. |
Three levers you should know about. They make the API route substantially cheaper:
- --effort low and minimal emit far fewer output tokens than high. For mechanical tasks the cheapest setting is usually also the right one.
- OPENAI_API_KEY env var).
- gpt-5-codex-mini in batch jobs for cheap async work.
- medium effort. high can spend 2-3× the tokens; reach for it deliberately.
- /compact trades recent detail for cheaper subsequent turns.
- AGENTS.md tight. Every word ships with every prompt. Concise pays back constantly.
- gpt-5-codex-mini for everyday edits, escalate to gpt-5-codex for hard problems, fan out a couple of cloud tasks per day. If you regularly hit the Plus cap or want to delegate constantly, jump to Pro: at $200/mo it pays for itself if it saves you four hours a month.
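The arithmetic behind these cost levers can be sketched in a few lines of Python. This is only an illustration: the per-million-token prices below are placeholders (the guide gives relative costs, not rates), the token counts are made up, and both function names are hypothetical. The break-even helper restates the Pro rule of thumb, where $200 a month covered by four saved hours implies valuing an hour at $50.

```python
def api_cost_usd(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Cost of one run when usage is billed per million input/output tokens."""
    return ((input_tokens / 1_000_000) * price_in_per_m
            + (output_tokens / 1_000_000) * price_out_per_m)


def break_even_hours(plan_price_usd: float, hourly_value_usd: float) -> float:
    """Hours a flat-rate plan must save per month to pay for itself."""
    return plan_price_usd / hourly_value_usd


# Placeholder prices in USD per million tokens (NOT real OpenAI rates).
PRICE_IN, PRICE_OUT = 1.0, 10.0

# Same task at two effort levels; "high" is assumed to emit 3x the output
# tokens, the upper end of the 2-3x range mentioned above.
medium_cost = api_cost_usd(200_000, 20_000, PRICE_IN, PRICE_OUT)
high_cost = api_cost_usd(200_000, 60_000, PRICE_IN, PRICE_OUT)

print(f"medium: ${medium_cost:.2f}  high: ${high_cost:.2f}")
print(f"Pro breaks even at {break_even_hours(200, 50):.0f} saved hours/month")
```

Swap in the current prices from your API dashboard before relying on any of the numbers.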
Running codex drops you into a TUI: a chat at the bottom, an event log above, a hint bar with slash commands, and a status line showing the working directory and active model. Type a sentence, anything at all, and watch.
- ls, glance at README.md, package.json, AGENTS.md if you have one. Letting it skim is fine; you don't need to brief it on the project structure.
- /undo reverts the last set of file changes.
- /quit or Ctrl-D. Your conversation is logged under ~/.codex/sessions/ and you can resume with codex resume.
| Command | What it does |
|---|---|
| /init | Drafts an AGENTS.md for the current repo by reading the project structure. |
| /diff | Shows everything Codex has changed in this session, as a single git diff. |
| /undo | Reverts the last batch of file edits. |
| /model | Switch model (e.g. gpt-5-codex ↔ gpt-5-codex-mini). |
| /approvals | Change the approval mode mid-session. |
| /compact | Summarize history to free up the context window. |
| /mention | Pull a file or symbol into the conversation by name. |
| /clear | Reset the conversation but keep the working directory. |
| /help | The full list, always more current than this table. |
For scripts, CI, or just when you want a single answer:
    $ codex exec "add a --json flag to the build script"
    $ codex exec --model gpt-5-codex-mini "summarize CHANGELOG.md in 3 bullets"
    $ codex exec --approval-mode full-auto "run the migration and commit"
codex exec takes one prompt, runs to completion, prints the diff and the final reply, and exits. Combine with shell pipes for batch work.
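For batch work driven from a script rather than a shell pipeline, a small Python wrapper is one option. The sketch below uses only the `codex exec` form and the `--model` flag shown above; the helper names are hypothetical, and it assumes nothing about exit codes beyond recording whatever the CLI returns.

```python
import shutil
import subprocess
from typing import Optional


def build_exec_command(prompt: str, model: Optional[str] = None) -> list[str]:
    """Assemble the argv for a single non-interactive `codex exec` run."""
    cmd = ["codex", "exec"]
    if model:
        cmd += ["--model", model]
    cmd.append(prompt)
    return cmd


def run_batch(prompts: list[str], model: Optional[str] = None) -> dict[str, int]:
    """Run each prompt in turn and return the exit code recorded per prompt."""
    if shutil.which("codex") is None:
        raise RuntimeError("codex CLI not found on PATH")
    results: dict[str, int] = {}
    for prompt in prompts:
        # Passing a list (no shell=True) avoids quoting problems in prompts.
        completed = subprocess.run(build_exec_command(prompt, model))
        results[prompt] = completed.returncode
    return results


print(build_exec_command("summarize CHANGELOG.md in 3 bullets",
                         model="gpt-5-codex-mini"))
```

Calling `run_batch([...])` executes the prompts one after another in the current directory, so start it from the repo you want Codex to work in.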
Every action Codex takes — read, edit, run a command, hit the network — passes through a permission gate. The mode you pick decides what the gate lets through silently and what makes it stop and ask. Three modes, in increasing order of trust.
| Mode | Reads | Edits | Shell commands | Network | Best for |
|---|---|---|---|---|---|
| Read Only --approval-mode read-only | ✓ silent | ✗ asks every time | ✗ asks every time | ✗ asks every time | Code review, "explain this", auditing |
| Auto (default) --approval-mode auto | ✓ silent | ✓ silent in workspace | ✓ silent in workspace, asks outside it | ✗ asks every time | Day-to-day work (this is what you want) |
| Full Access --approval-mode full-auto | ✓ silent | ✓ silent anywhere | ✓ silent | ✓ silent | CI, fire-and-forget, throwaway VMs |
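The table above is effectively a small decision function, and writing it out can make the three modes easier to compare. The Python sketch below models only the table, not Codex's actual implementation: the function names are hypothetical, the "workspace" test is a plain path-prefix check, and a shell command is simplified to the single path it would write to.

```python
from pathlib import Path

MODES = {"read-only", "auto", "full-auto"}
ACTIONS = {"read", "edit", "shell", "network"}


def inside_workspace(target: str, workspace: str) -> bool:
    """True if target resolves to the workspace root or something below it."""
    root = Path(workspace).resolve()
    path = (root / target).resolve()  # an absolute target replaces root here
    return path == root or root in path.parents


def gate(mode: str, action: str, target: str = ".", workspace: str = ".") -> str:
    """Return "silent" or "ask" for one action, following the table above."""
    if mode not in MODES or action not in ACTIONS:
        raise ValueError(f"unknown mode/action: {mode!r}, {action!r}")
    if action == "read" or mode == "full-auto":
        return "silent"  # reads are always silent; Full Access never asks
    if mode == "read-only" or action == "network":
        return "ask"     # Read Only asks for everything else; Auto asks for network
    # Auto mode, edit or shell: silent only inside the workspace.
    return "silent" if inside_workspace(target, workspace) else "ask"


print(gate("auto", "edit", "src/app.py", "/repo"))   # inside the workspace
print(gate("auto", "shell", "/etc/hosts", "/repo"))  # outside the workspace
```

Note that `../` segments are resolved before the check, so a relative path that climbs out of the workspace is treated as outside it.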
Inside Auto and Full Access, commands run in a platform-specific sandbox that blocks unexpected writes:
- sandbox-exec with Seatbelt. Writes outside the workspace are denied at the kernel.
Network egress is blocked by default in Auto. Turn it on per-session with /approvals or per-invocation with --allow-network.
- codex exec from a CI job.
Codex looks for a file called AGENTS.md at the root of your repo (and, recursively, in subdirectories). Anything in it is loaded into context at the start of every session. It's the single highest-leverage thing you can write in a Codex-using repo: a one-page brief that turns every prompt into a better one.
- src/legacy/." "Never write inline styles." "Migrations are reviewed by hand: propose them, don't run them."
- config.ts files.
From inside a repo, run /init at the prompt. Codex will read the project structure, draft an AGENTS.md, and ask you to confirm. Edit the draft; the value is in the bits a generator can't infer.
    # Acme Web

    ## Stack
    - Next.js 15 (app router) · TypeScript · pnpm
    - tRPC + Drizzle on Postgres (Neon)
    - Tailwind v4 + shadcn/ui
    - Tests: vitest (unit), playwright (e2e)

    ## How to run
    - pnpm dev                      # dev server, port 3000
    - pnpm test                     # vitest, watch mode off
    - pnpm test:e2e                 # playwright (assumes dev server is up)
    - pnpm lint && pnpm typecheck   # gate before commits

    ## Conventions
    - Server components by default; mark client islands explicitly.
    - One tRPC router per resource, in src/server/routers/.
    - No any. If a type is genuinely unknown, write a comment explaining why.

    ## Don't
    - Touch src/legacy/billing/ (being rewritten on a branch).
    - Run migrations. Generate them; we apply by hand.
    - Add new dependencies without flagging the size in the PR description.

    ## Watch out for
    - auth.ts has two code paths (session cookie vs. bearer token). Most bugs come from forgetting the bearer branch.
AGENTS.md saves an hour per week in prompts you didn't have to write and corrections you didn't have to make. Refresh it whenever you notice yourself typing the same context twice.
Drop an AGENTS.md in a subdirectory and Codex will use it when working in that subtree. Useful for monorepos: a top-level brief plus per-package addenda for the parts with their own conventions.
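To see which briefs would apply to a given directory in a monorepo, you can walk from the repo root down to that directory and collect every AGENTS.md on the way. The Python sketch below does exactly that; the function name is hypothetical, and the ordering (top-level brief first, nearest addendum last) is an assumption about precedence rather than documented Codex behavior.

```python
from pathlib import Path


def collect_agents_files(repo_root: Path, working_dir: Path) -> list[Path]:
    """List AGENTS.md files from repo_root down to working_dir, outermost first."""
    root = repo_root.resolve()
    current = working_dir.resolve()
    if current != root and root not in current.parents:
        raise ValueError("working_dir must be inside repo_root")

    # Directories from working_dir up to and including the repo root.
    chain = [current, *current.parents]
    chain = chain[: chain.index(root) + 1]

    found = []
    for directory in reversed(chain):  # root first, working_dir last
        candidate = directory / "AGENTS.md"
        if candidate.is_file():
            found.append(candidate)
    return found


if __name__ == "__main__":
    for path in collect_agents_files(Path("."), Path(".")):
        print(path)
```

Run from a repo root, this prints the top-level AGENTS.md if one exists; pass a package directory as `working_dir` to see its addendum listed after it.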
Codex is forgiving — terse prompts work, vague ones often work, even single-word redirects ("smaller", "tests", "no") get parsed sensibly. But the difference between an okay session and a great one comes down to a handful of habits.
"Add a search endpoint" is fine. "Add a search endpoint at /api/search?q= that returns 20 results, ranked by recency, in JSON" is better — and shorter than the back-and-forth you'd otherwise have to disambiguate it.
If you know where the change belongs, say so. "In src/auth/session.ts, swap the cookie name to __Host-sid and update the reader in the same file." Codex will still find the file if you don't, but naming it skips a grep round-trip.
Big asks ("rewrite the auth flow") drift. Small, scoped asks land. Break it down yourself, or let Codex propose a plan and pick the first item.
End the prompt with "verify by running pnpm test auth". Codex will run it, see the failures, and keep iterating. This single habit turns most flaky outputs into self-healing ones.
You don't need to repeat the prompt. "shorter", "no, I meant the client component", "don't touch the tests", "keep going" — all get parsed correctly. Treat the chat like an over-the-shoulder conversation.
Reasoning effort controls how hard the model thinks before it acts. There are four settings:
| Effort | What it's for | Latency |
|---|---|---|
| minimal | Mechanical edits, rename, format. No thinking needed. | fastest |
| low | Routine changes: add a flag, write a test, fix a typo. | fast |
| medium (default) | Most real tasks. Bug fixes, small features, refactors. | moderate |
| high | Architectural choices, gnarly bugs, plans you'll execute against. | slow |
Switch with /model in the TUI, or --effort high on codex exec. Default to medium; reach for high when you'd want a senior engineer to think for a minute before writing.
Pasting a sample of the desired output ("this is what one looks like…") is worth a paragraph of explanation. So is pointing at a similar feature already in the repo: "do the same thing we did in orders/route.ts, but for invoices."
Once Codex says it's done, run /diff, scan the change, and ask one targeted question: "why this approach over X?" The answer often surfaces a tradeoff worth a follow-up.
Codex uses purpose-tuned variants of OpenAI's frontier models. They share the GPT-5 family lineage but are trained specifically for long-horizon coding work — staying on task across many tool calls, recovering from test failures, and writing code that runs.
| Model | When to use it | Notes |
|---|---|---|
| gpt-5-codex | The default. Best agentic coder OpenAI ships. Long context, strong tool use, good at staying on task. | Higher latency, higher quality. Use for real work. |
| gpt-5-codex-mini | The fast variant. Routine edits, quick questions, "what does this file do" reads. | Trades depth for speed. Great with --effort low. |
| gpt-5.1-codex | The latest revision: improved tool-call patterns, better at picking sandbox modes, slightly better long-horizon recovery. | Default when available; falls back to gpt-5-codex otherwise. |
| gpt-5 / gpt-5.1 | The general-purpose models. Available if you want a Codex session backed by the non-specialized GPT-5. | Slightly weaker at agentic loops; sometimes stronger on free-form reasoning. |
    $ codex --model gpt-5-codex-mini   # for this session
    >_ /model gpt-5-codex              # mid-session switch
    # or set the default in ~/.codex/config.toml (see §11)
mini for exploration and quick edits, escalate to the full model the moment you hit a real decision. The TUI tells you which model produced each response, so you can read back and see where the gear-shift happened.
The IDE extension is the CLI with a better UI for the visual parts — diffs, file picks, selections. Install it once and you can switch between terminal and editor without changing your workflow.
- @filename, autocompleted against the workspace.
- /compact, /mention, custom keybindings) don't yet have full IDE equivalents.
The Codex desktop app is the cloud, the CLI, and the GitHub bot fused into a single native window. Think "ChatGPT desktop, but for code": it's the surface most people end up living in once they're past the first week. macOS and Windows builds, signed and notarized, are available from the OpenAI downloads page.
- .dmg (macOS) or run the .exe installer (Windows).
It auto-updates in the background, so once it's installed you can mostly forget about it.
Cloud tasks, local sessions, and GitHub mentions all land in the same sidebar. Sort by status, repo, or recency. Click in and you're picking up where you left off.
Start a session locally with full filesystem access. Hand it off to the cloud when you want to walk away. Open the resulting PR back in the desktop app for the final review. No context juggling.
Three-column layout — file tree, before/after, comment thread. Better than the embedded IDE viewer, better than GitHub's. Apply hunk-by-hunk, stage, or send back for changes.
Default ⌘⇧K (macOS) / Ctrl⇧K (Windows) brings up a quick-prompt overlay from any app. Type a sentence, pick a repo, hit return — the task lands in the cloud, you keep doing what you were doing.
Native toast when a cloud task completes, when CI fails on a Codex PR, or when @codex is mentioned on a repo you watch. Configurable per-repo — silent for the noisy ones, loud for the ones you care about.
A lightweight repo explorer for the projects you've connected — open a file, scan recent runs, see which branch a cloud task forked from. Useful for orienting before you write the prompt.
| Surface | Best at | Falls short on |
|---|---|---|
| CLI | Tight terminal loops, scripting, CI, working over SSH on a remote box | Visual diffs, juggling several tasks at once |
| IDE extension | Edit-while-you-code workflow; selections, inline diffs in context | Long async tasks; awareness of cloud work happening in parallel |
| Desktop app | Coordinating local + cloud + GitHub from one place; passive monitoring; fan-out | Pure terminal nerds will still prefer the CLI; in-editor flow lives in the IDE |
- ~/.codex/config.toml, but you can override per-window if you want a long-running session to stay in read-only.
The cloud product at chatgpt.com/codex is the opposite of the CLI: instead of you driving Codex through every step, you describe a task, Codex spins up a container with your repo cloned into it, does the work, and opens a pull request. You watch from a dashboard.
- npm i, pnpm i, etc.), and any environment variables it needs.
- --quiet flag to the CLI that suppresses log output below WARN."
- AGENTS.md through the agent, starts work.
Cloud tasks each run in their own container, so there's no contention. Queue up five at once for five different ideas. Two will land clean, two will need follow-ups, one will fail and you'll abandon it. That ratio is the whole point: you're paying for exploration, not for guaranteed outcomes.
| Available | |
|---|---|
| ✓ Yes | Your repo (read & write), package install, test runners, public network egress, an env file you provide. |
| ✗ No | Your local machine, your other repos (unless connected), private VPN endpoints, your secrets (unless you supplied them as env vars). |
Install the Codex GitHub app on a repo and you can summon Codex in any issue, PR, or comment by mentioning @codex. It uses the cloud agent under the hood — same container model, same model family, same AGENTS.md — but the interface is GitHub.
- @codex review: review the diff and leave inline comments.
- @codex why did this PR fail CI?: read the action logs and explain.
- @codex apply the changes from this review: turn review comments into a commit.
- @codex implement this: on an issue, this kicks off a new cloud task that opens a PR against the issue.
- @codex rebase: keep a long-running PR fresh.
Configure the app to auto-review PRs (per-repo setting). On each open or push, Codex posts a single review comment with: a summary, concerns, suggestions, and (if requested) a draft of the changes it would make. Treat it like a junior reviewer who reads every line: sometimes wrong, often useful, always fast.
The ChatGPT mobile app has a Codex tab. It's a thin client over the cloud: you can kick off tasks, watch progress, comment on diffs, and merge PRs from your phone. Best uses are reactive — a CI fails while you're at the coffee shop, you tell Codex to investigate, the PR is waiting when you're back.
The Codex Slack app adds a /codex command and an @Codex mention. From any channel:
- processOrder function in acme-web actually do?"
Pairs well with team-facing repos where the same person who runs the engineering channel isn't always the one writing code.
Codex reads ~/.codex/config.toml on startup. Defaults are sensible; this is where you'd pin a model, set a default approval mode, define profiles, or wire up an MCP server.
    # default model and effort for every session
    model = "gpt-5-codex"
    effort = "medium"
    approval_mode = "auto"

    # where session logs go
    history_dir = "~/.codex/sessions"

    # notification when a task completes (script gets one arg: the event JSON)
    notify = ["~/bin/codex-notify.sh"]

    # profiles let you swap whole configs with --profile fast / --profile review
    [profiles.fast]
    model = "gpt-5-codex-mini"
    effort = "low"

    [profiles.review]
    model = "gpt-5-codex"
    approval_mode = "read-only"

    # MCP servers: expose external tools to the agent
    [mcp_servers.filesystem]
    command = "npx"
    args = ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/notes"]

    [mcp_servers.github]
    command = "npx"
    args = ["-y", "@modelcontextprotocol/server-github"]
    env = { GITHUB_PERSONAL_ACCESS_TOKEN = "ghp_..." }
A profile is a named bundle of overrides. Pick one with codex --profile fast. Three patterns worth setting up:
- fast: mini model, low effort. For "I just want it edited."
- review: full model, read-only. For "tell me what's wrong."
- yolo: full model, full-access. For inside containers only.
Codex speaks the Model Context Protocol, so any MCP server you've already configured for Claude or another client works here too. Drop the server config into [mcp_servers.NAME] and the agent gets new tools at startup. Common ones: filesystem (extra paths), GitHub, Postgres, Slack, your own.
The notify setting runs a script every time a task completes. Use it to: play a sound, post to Slack, ring a Pushover. The script receives one argument — a JSON blob describing what happened.
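A notify hook can be any executable that accepts that single JSON argument. Here is a minimal Python sketch of one. The event schema is not described in this guide, so the `type` and `message` fields are assumptions for illustration only; inspect a real payload before depending on any field name.

```python
import json
import sys


def summarize_event(raw: str) -> str:
    """Turn the JSON blob passed by the notify hook into a one-line message."""
    event = json.loads(raw)
    # "type" and "message" are assumed field names, not a documented schema.
    kind = event.get("type", "unknown-event")
    message = event.get("message", "")
    return f"Codex: {kind}" + (f" ({message})" if message else "")


if __name__ == "__main__" and len(sys.argv) == 2:
    # Swap print() for a sound, a Slack webhook call, or a push notification.
    print(summarize_event(sys.argv[1]))
```

Save it somewhere executable and point the `notify` list at it in place of the shell script from the sample config.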
Place a .codex/config.toml at the root of a repo and Codex merges it on top of your global config when you launch inside that tree. Use it to pin the model per project (e.g. mini for the docs repo, full for the app).
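"Merges on top" can be pictured as a recursive dictionary overlay. The Python sketch below works on plain dicts standing in for the parsed TOML files (on Python 3.11+ the standard `tomllib` module would produce them). The key names mirror the sample config above; the function names, and the choice to apply a profile after the repo-level file, are assumptions rather than documented behavior.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return base with override layered on top; nested tables merge recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_profile(config: dict, name: str) -> dict:
    """Overlay one named profile's settings onto the top-level settings."""
    profile = config.get("profiles", {}).get(name)
    if profile is None:
        raise KeyError(f"no such profile: {name}")
    return deep_merge(config, profile)


# Stand-ins for ~/.codex/config.toml and a repo-level .codex/config.toml.
global_config = {
    "model": "gpt-5-codex",
    "effort": "medium",
    "profiles": {"fast": {"model": "gpt-5-codex-mini", "effort": "low"}},
}
repo_config = {"model": "gpt-5-codex-mini"}  # e.g. pinned for a docs repo

effective = deep_merge(global_config, repo_config)
print(effective["model"], effective["effort"])
print(apply_profile(effective, "fast")["effort"])
```

Neither input dict is mutated, so the same global config can be reused to compute the effective settings for several repos.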
A few patterns that move the needle past "I asked it to write a function." Each one starts with the surface that fits best.
- /diff, eyeball, commit.
- git add -p).
- --profile review.
- pnpm test and pnpm typecheck pass."
- src/api/orders.ts we have a paginated endpoint. Build the same shape for invoices."
There's no objective ranking: they're all good, they're all moving. But there are real differences in shape, and picking the right tool per job beats picking the right tool overall.
| Tool | Shape | Strongest at | Pairs best with |
|---|---|---|---|
| OpenAI Codex | CLI + cloud + GitHub + IDE | Delegated work, parallel cloud tasks, PR reviews | Repos with a clean CI you trust |
| Claude Code | CLI + IDE + cloud (Managed Agents) | Long, careful sessions; complex reasoning; agent SDK | See claude.wholetech.com |
| Cursor / Windsurf | IDE-first | Tab-completion, inline edits, "the editor that gets it" | Solo work, design-heavy iteration |
| GitHub Copilot | IDE + chat + agent mode | Autocomplete, fluent IDE integration, line-by-line | Anyone already paying for it |
| Devin / autonomous agents | Hosted, fully async | Tickets-to-PRs at scale | Mature codebases with strong tests |
Before your second Codex session, write a one-page AGENTS.md. You'll feel the improvement instantly.
Don't crank to high unless you're actually waiting on a hard decision. Most tasks are medium-shaped.
It's two keystrokes and it'll save you from at least one accidentally-committed test change.
Codex undoes cleanly. If something doesn't smell right, revert and re-ask with one more sentence of context.
If you'd run something locally and just wait, run two of them in the cloud instead. The whole pricing model assumes you will.
"…then run pnpm test". This single habit cuts your turn count roughly in half.
Reviewing? --profile review. Hacking? Default. Inside a container? --profile yolo.
If Codex starts and kills a dev server constantly, set PORT=3000 in AGENTS.md or your shell. Saves churn.
The event log shows every command Codex runs. Glance at it during long tasks — you'll learn what Codex thinks "search" or "test" means in your repo.
codex resume brings back yesterday's chat. Pair with a daily journal commit and you have a real audit trail.
Codex sometimes claims it ran tests it didn't. Check that the session shows real test-run output, not just an assertion that the tests passed.
The product moves fast. npm update -g @openai/codex on Mondays.
The cloud product. The fastest way to see what Codex can do without installing anything.
The CLI is open source. Issues, releases, the changelog, and the place to file bugs.
The canonical reference — config keys, model names, sandbox details, MCP setup. Always more current than this page.
API keys, billing, usage dashboards. Where Codex sessions billed against an API key show up.
The companion guide for Anthropic's Claude — Claude Code, the API, Skills, MCP, the works. If you use both, read both.
The broader WholeTech network — workstations, integrations, and the other sites in this family.