
What changed in Codex 5.3 (GPT-5.3-Codex)? A clear comparison with earlier models


“What exactly changed in Codex 5.3?” “Is it fine to stay on 5.1 Max?” “Is there any point switching from 5.2?”
With rapid model updates, choosing by gut feel can be risky. This article calmly organizes the facts about Codex 5.3 (GPT-5.3-Codex) and how it differs from existing models, drawing only from primary sources such as the official OpenAI blog, release notes, developer documentation, and system cards.
We focus on verified information available as of February 9, 2026—speed, public benchmarks, availability, and when to use which model.


Where Codex 5.3 sits

Codex 5.3 (official name: GPT-5.3-Codex) was announced on February 5, 2026 as the newest model in the Codex line. The launch post calls it “the most capable agentic coding model to date,” unifying the GPT-5.2-Codex coding stack with GPT-5.2’s reasoning stack while running about 25% faster for Codex users. It is designed to be steered mid-task without losing context.

Headlines from the official announcement

  • ~25% faster end-to-end for Codex users
  • Stronger tracking of progress and mid-process instructions during long runs
  • New highs on SWE-Bench Pro and Terminal-Bench; big gains on OSWorld and GDPval
  • Available today in the Codex app, CLI, IDE extension, and web; API access is “coming soon”

What to compare (practical axes)

In day-to-day work, these factors create the biggest gaps between Codex models:

  • Stability on long tasks
  • Ability to finish work that includes terminal/OS operations
  • Willingness to accept course corrections mid-stream
  • Predictable response time and number of retries within a time budget

Codex 5.3 is explicitly optimized along these axes, so we’ll compare using the same lens.


Codex 5.3 vs GPT-5.2-Codex (official benchmarks)

OpenAI’s appendix publishes side-by-side results at xhigh reasoning effort:

  • SWE-Bench Pro (Public): 56.8% vs 56.4% (GPT-5.2-Codex)
  • Terminal-Bench 2.0: 77.3% vs 64.0%
  • OSWorld-Verified: 64.7% vs 38.2%
  • Cybersecurity CTF Challenges: 77.6% vs 67.4%
  • GDPval (wins or ties): 70.9%, matching the 70.9% reported for GPT-5.2 (no GPT-5.2-Codex score was published)

Takeaway: raw code-editing gains are modest, but performance that requires terminal and desktop operations jumps sharply. That indicates Codex 5.3 is more about completing work end-to-end than about isolated code generation.
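For illustration, the pattern in that takeaway can be made concrete by computing the percentage-point deltas from the published paired scores (GDPval is omitted since no GPT-5.2-Codex score was reported):

```python
# Published xhigh-effort scores: (GPT-5.3-Codex, GPT-5.2-Codex), in %.
scores = {
    "SWE-Bench Pro (Public)":       (56.8, 56.4),
    "Terminal-Bench 2.0":           (77.3, 64.0),
    "OSWorld-Verified":             (64.7, 38.2),
    "Cybersecurity CTF Challenges": (77.6, 67.4),
}

# Percentage-point improvement of 5.3 over 5.2-Codex on each benchmark.
deltas = {name: round(new - old, 1) for name, (new, old) in scores.items()}

for name, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
    print(f"{name}: +{delta} pp")
```

Sorting by delta puts OSWorld-Verified (+26.5 pp) and Terminal-Bench 2.0 (+13.3 pp) far ahead of SWE-Bench Pro (+0.4 pp), which is exactly the “end-to-end execution, not isolated code generation” shape described above.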


Codex 5.3 vs GPT-5.1-Codex-Max

GPT-5.1-Codex-Max (released November 19, 2025) was engineered for project-scale, long-running tasks using a technique called “compaction” to stay coherent across multiple context windows. It has been the go-to when you need multi-hour persistence in the Codex surfaces.

Design contrast:

  • 5.1 Max: endurance and multi-window coherence first; best for marathon refactors or multi-hour agent loops.
  • 5.3: speed, interactive steering, and stronger execution on OS/terminal-inclusive tasks; best when you need to guide the agent as it works.

There is no simple “higher/lower” relationship—choose based on task style: persistent slog vs. guided, tool-heavy execution.


What the 25% speed-up means in practice

  • More attempts fit in the same time window.
  • Faster modify–rerun loops reduce friction during debugging.
  • Lower perceived wait time makes mid-course guidance feel natural.

Actual latency will still vary by environment and task mix, but the improvement is measurable in Codex surfaces.
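A rough back-of-the-envelope sketch of the first bullet, assuming “25% faster” means roughly 25% less wall-clock time per run (the official post does not specify the measurement basis), with hypothetical numbers:

```python
def attempts_in_budget(budget_min: float, per_attempt_min: float,
                       speedup: float = 0.25) -> tuple[int, int]:
    """Whole attempts that fit in a time budget, before and after the speed-up."""
    before = int(budget_min // per_attempt_min)
    after = int(budget_min // (per_attempt_min * (1 - speedup)))
    return before, after

# Hypothetical: a 60-minute budget and 10-minute agent runs.
print(attempts_in_budget(60, 10))  # (6, 8): two extra attempts in the same hour
```

Even under this simplified model, a quarter less time per run compounds into noticeably more modify–rerun cycles per session.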

Availability and API status

Today you can use GPT-5.3-Codex in:

  • Codex app (including the new macOS client)
  • Codex CLI
  • IDE extension (official support)
  • Web-based Codex Cloud

API access is not yet live; OpenAI’s wording is “working to safely enable API access soon.” Until then, API workflows should remain on gpt-5.2-codex.
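One low-friction way to prepare for the switch is to resolve the model identifier from configuration rather than hard-coding it across pipelines. A minimal sketch; the environment-variable name is our own choice, and `gpt-5.3-codex` as the eventual API identifier is an assumption, not something OpenAI has confirmed:

```python
import os

def codex_api_model() -> str:
    """Return the Codex model identifier for API workflows.

    Defaults to gpt-5.2-codex, the currently API-available Codex model.
    Once the 5.3 API goes live, set CODEX_API_MODEL (e.g. to the assumed
    identifier "gpt-5.3-codex") instead of editing every pipeline.
    """
    return os.environ.get("CODEX_API_MODEL", "gpt-5.2-codex")

print(codex_api_model())
```

This keeps the eventual migration to a one-line environment change rather than a code change in each consumer.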


Practical selection guide (February 2026)

  • Routine implementation, debugging, and tasks that involve shell/OS steps: start with Codex 5.3.
  • API-automated pipelines: stay on GPT-5.2-Codex until the 5.3 API is released.
  • Ultra-long, persistent agent runs: test both Codex 5.3 and GPT-5.1-Codex-Max on your representative tasks.

These recommendations stay within what official sources currently confirm; avoid firm bets until you validate on your own workloads.
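The guide above can be captured as a tiny routing table. The profile keys and lower-case identifiers are our own framing (only `gpt-5.2-codex` is confirmed as an API name), and the long-run profile deliberately returns both candidates, since the article recommends testing each on representative tasks:

```python
def pick_codex_models(profile: str) -> list[str]:
    """Map a task profile to candidate Codex models per the selection guide."""
    table = {
        "interactive":  ["gpt-5.3-codex"],   # implementation, debugging, shell/OS steps
        "api_pipeline": ["gpt-5.2-codex"],   # until the 5.3 API is released
        "long_run":     ["gpt-5.3-codex", "gpt-5.1-codex-max"],  # benchmark both
    }
    if profile not in table:
        raise ValueError(f"unknown profile: {profile!r}")
    return table[profile]
```

Keeping the mapping in one place makes it easy to revisit once the 5.3 API ships or your own long-run benchmarks pick a winner.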


Key takeaways

  • GPT-5.3-Codex (released 2026-02-05) is the latest Codex model and runs about 25% faster for Codex users.
  • It sets new highs on Terminal-Bench 2.0 and OSWorld-Verified, signaling better completion of computer-use tasks.
  • Availability: app, CLI, IDE extension, and web now; API is still pending.
  • Choose models by task profile: speed + steerability (5.3), long-haul persistence (5.1 Max), or API compatibility (5.2-Codex).

Adopting a new model is most reliable when you verify on your own representative scenarios while anchoring to facts from official releases.
