- GPT-5-Codex promises higher performance and success rates
- It’s included for Plus, Pro, Business, Edu and Enterprise users
- The model can use 93.7% fewer tokens on lightweight tasks
OpenAI has shared more details about GPT-5-Codex, a version of GPT-5 optimized for agentic coding and real-world software engineering, and we’re in for a treat when it comes to reliability and performance.
The ChatGPT maker claimed a SWE-bench Verified benchmark success rate of 74.5%, with refactoring performance improving to 51.3% (up from 33.9% in GPT-5).
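To put the refactoring figures in context, here is a minimal sketch of the arithmetic, using only the two percentages quoted above. The helper name `improvement` is made up for illustration and is not part of any OpenAI tooling.

```python
def improvement(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (percentage-point gain, relative gain in percent)."""
    point_gain = after_pct - before_pct
    relative_gain = point_gain / before_pct * 100
    return point_gain, relative_gain


# Refactoring success rates quoted in the article: GPT-5 vs GPT-5-Codex.
points, relative = improvement(33.9, 51.3)
print(f"{points:.1f} points, {relative:.1f}% relative")
# prints "17.4 points, 51.3% relative"
```

In other words, the jump is 17.4 percentage points, or roughly a 51% relative improvement over GPT-5 (a figure that happens to be close to the new score itself).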
Like GPT-5, GPT-5-Codex will dynamically adjust reasoning time for faster performance on small tasks and more comprehensive reasoning on complex ones, and in testing it has already worked independently for more than seven hours on large refactors.
GPT-5-Codex is a big upgrade
OpenAI says GPT-5-Codex is strong in code reviews, catching critical bugs before release, but it can also handle frontend work with visual inspection, screenshots and mobile web design improvements.
The news comes a few months after OpenAI launched Codex CLI (in April) and Codex web (in May), which it then combined into one “unified… experience connected by… ChatGPT” in early September.
It’s included with ChatGPT Plus, Pro, Business, Edu and Enterprise plans, and works across terminal, IDEs, on the web, in GitHub and on the iOS app.
The company also detailed how GPT-5-Codex uses 93.7% fewer tokens than GPT-5 on lightweight interactions, but it will also spend twice as long reasoning, editing, testing and iterating if it needs to.
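As a rough illustration of what a 93.7% reduction means in practice, the sketch below applies it to a hypothetical request. The 10,000-token baseline is an assumption chosen for round numbers, not a figure from OpenAI, and `tokens_after_reduction` is an illustrative helper.

```python
def tokens_after_reduction(baseline_tokens: int, reduction_pct: float) -> float:
    """Tokens left after cutting usage by reduction_pct percent."""
    return baseline_tokens * (1 - reduction_pct / 100)


baseline = 10_000  # hypothetical GPT-5 token count for a lightweight task
remaining = tokens_after_reduction(baseline, 93.7)
ratio = baseline / remaining

print(f"{remaining:.0f} tokens, about 1/{ratio:.0f} of the baseline")
# prints "630 tokens, about 1/16 of the baseline"
```

Under that assumption, a task that cost GPT-5 10,000 tokens would cost GPT-5-Codex about 630, or roughly one-sixteenth as many.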
Equally important for developers, the tool will provide logs, citations and test results for transparency. Developers using Codex CLI via API key will also get API access to GPT-5-Codex “soon.”
“Codex is becoming the coding partner we’ve always envisioned – one that’s faster, more reliable, and deeply integrated into the tools you already use,” OpenAI wrote.
Plus, Edu and Business plans include enough usage to cover “a few focused coding sessions each week”; users who need more should upgrade to Pro for “a full workweek across multiple projects.” Enterprise accounts pay for what they use via a shared credit pool.