OpenAI has released a research preview of Codex, a new AI coding agent built on a version of its o3 model and designed specifically for software engineering.


Designed to Execute and Test Code Autonomously


The new Codex agent runs on codex-1, a variant of the o3 model tuned to produce cleaner code and follow instructions more precisely. Unlike other AI assistants, Codex can run tests against its own output and iterate until the code passes. It runs in a sandboxed virtual machine in the cloud and can connect to GitHub to preload repositories, allowing it to complete tasks such as bug fixes, test generation, and feature implementation in 1 to 30 minutes.
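
The test-and-iterate behavior described above can be illustrated with a small Python sketch. This is not OpenAI's implementation. The names `iterate_until_passing`, `toy_propose`, and `toy_run_tests` are hypothetical, and the "model" here is a stand-in that returns a buggy function first and a corrected one once it receives failure feedback.

```python
from typing import Callable, Optional


def iterate_until_passing(
    propose: Callable[[Optional[str]], str],
    run_tests: Callable[[str], Optional[str]],
    max_attempts: int = 5,
) -> Optional[str]:
    """Return the first candidate that passes its tests, or None if attempts run out."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        candidate = propose(feedback)    # ask for code, passing along the last failure
        feedback = run_tests(candidate)  # None means every test passed
        if feedback is None:
            return candidate
    return None


def toy_propose(feedback: Optional[str]) -> str:
    """Hypothetical stand-in for a model: buggy on the first try, fixed after feedback."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def toy_run_tests(source: str) -> Optional[str]:
    """Execute the candidate in a fresh namespace and check one expectation."""
    namespace: dict = {}
    exec(source, namespace)
    result = namespace["add"](2, 3)
    if result != 5:
        return f"add(2, 3) returned {result}, expected 5"
    return None


fixed = iterate_until_passing(toy_propose, toy_run_tests)
print(fixed)
```

In this toy run the first candidate fails, the failure message is fed back, and the second candidate is returned. The `max_attempts` cap is an assumption of the sketch, included so that the loop always terminates.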

Users can keep working on their computers while Codex runs in the background. The agent, which can handle several software engineering tasks simultaneously, is now available to ChatGPT Pro, Team, and Enterprise users through the ChatGPT sidebar.


While users currently have generous access, OpenAI plans to introduce rate limits in the coming weeks. At that point, customers will be able to purchase additional credits to extend their usage. The company also said Codex will be made available to ChatGPT Plus and Edu users soon.

Codex Interface and Workflow


ChatGPT users can delegate tasks to Codex by entering a prompt and selecting “Code” or “Ask,” depending on whether they want an implementation or an explanation. Assigned tasks appear in a sidebar, where users can track their progress. Josh Tobin, OpenAI’s research lead, said the goal is for Codex to eventually act as a “virtual teammate,” capable of autonomously completing difficult tasks that would normally take engineers hours or even days.

Safety, Limitations, and Future Roadmap

Codex works in an air-gapped environment, with no access to the public internet or third-party APIs, which reduces the potential for misuse but may rule out some use cases. OpenAI said Codex has safeguards in place to refuse requests to develop malicious software. The tool also inherits several of the o3 model’s safety features.


Despite these advances, OpenAI acknowledged Codex’s limitations. AI coding agents are still prone to mistakes, particularly when debugging. According to a recent Microsoft study, several top-tier models, including Claude 3.7 Sonnet and o3-mini, failed to fix faulty code reliably.

In addition, OpenAI is updating Codex CLI, its open-source terminal-based coding agent, with a version of o4-mini optimized for software engineering. This model is now the default in Codex CLI and will also be available through OpenAI’s API, with prices starting at $1.50 per million input tokens and $6 per million output tokens.
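
To put those rates in perspective, here is a rough Python estimate of what a single request might cost. It assumes the quoted figures apply as flat per-token rates. The function name `estimate_cost_usd` and the token counts are hypothetical, chosen only for illustration.

```python
# Rates quoted in the article, in US dollars per million tokens.
INPUT_USD_PER_MILLION = 1.50
OUTPUT_USD_PER_MILLION = 6.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming flat per-token pricing."""
    total = input_tokens * INPUT_USD_PER_MILLION + output_tokens * OUTPUT_USD_PER_MILLION
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 50,000 output tokens.
cost = estimate_cost_usd(200_000, 50_000)
print(f"${cost:.2f}")  # prints $0.60
```

Under these assumptions, output tokens cost four times as much as input tokens, so a request that generates a lot of code can cost more than one that only reads a large repository.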