OpenAI unveils GPT-5.5, its flagship AI model

OpenAI has released GPT-5.5. The model is positioned as “a new level of intelligence for real work and powering agents.”

The neural network is “built to understand complex tasks, use tools, check its work, and carry more tasks through to completion.”

The model can infer user intent and plan work autonomously. According to OpenAI, GPT-5.5 “does an excellent job” of writing and debugging code, searching the internet, analysing data, creating documents and spreadsheets, controlling software and switching between tools.

“Instead of carefully supervising every step, you can assign GPT‑5.5 a complex multi‑stage task and trust it to plan, apply tools, check its work, overcome ambiguities and keep going,” the announcement says.

Figure: GPT-5.5 benchmark results. Source: OpenAI.

OpenAI noted that the new model is particularly effective in agentic programming, computer control, knowledge work and early-stage scientific research—areas where long chains of reasoning and actions matter.

“GPT‑5.5 delivers a jump in intelligence without sacrificing speed. Larger, more capable models often run more slowly, but GPT‑5.5 matches GPT‑5.4 in real‑world per‑token latency while exhibiting a much higher level of intelligence,” the startup said.

The model uses “significantly fewer” tokens when operating in Codex.

OpenAI said it applied the “most powerful” set of safety measures ahead of release, working with internal and external experts.

Availability

GPT-5.5 is available in ChatGPT and Codex for Plus, Pro, Business and Enterprise plans. A separate GPT‑5.5 Pro is offered for Pro, Business and Enterprise.

Both variants will soon be available via the API at $5 per 1 million input tokens and $30 per 1 million output tokens. The context window is 1 million tokens.
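
As a rough illustration of what these rates mean in practice, the sketch below estimates the cost of a single request. It reads the quoted prices as $5 per 1 million input tokens and $30 per 1 million output tokens; the function name and the token counts in the example are hypothetical, and actual billing may differ.

```python
# Prices as reported in the announcement, in USD per 1 million tokens.
INPUT_PRICE_PER_MILLION = 5.0
OUTPUT_PRICE_PER_MILLION = 30.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost of one request in USD (illustrative only)."""
    total = (
        input_tokens * INPUT_PRICE_PER_MILLION
        + output_tokens * OUTPUT_PRICE_PER_MILLION
    )
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 10,000 output tokens.
print(f"${estimate_cost_usd(200_000, 10_000):.2f}")  # prints $1.30
```

Output tokens cost six times as much as input tokens at these rates, so long generated answers affect the bill far more than long prompts do.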

In Codex, GPT-5.5 is available for Plus, Pro, Business, Enterprise, Edu and Go plans with a 400,000-token context window. It is also offered in Fast mode, which generates tokens 1.5 times faster at 2.5 times the cost.
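
The Fast mode trade-off can be sketched with simple arithmetic. The example assumes that run time is dominated by token generation, so a 1.5x speed-up shortens the whole job by the same factor. The function name, the 60-second baseline and the abstract "cost units" are hypothetical.

```python
# Multipliers as reported for Fast mode in Codex.
FAST_SPEEDUP = 1.5
FAST_COST_MULTIPLIER = 2.5


def fast_mode_tradeoff(standard_seconds: float, standard_cost: float) -> tuple[float, float]:
    """Return (seconds, cost) for the same job run in Fast mode.

    Assumes generation time scales inversely with token speed.
    """
    return standard_seconds / FAST_SPEEDUP, standard_cost * FAST_COST_MULTIPLIER


# Hypothetical job: 60 seconds and 1.00 cost units in standard mode.
seconds, cost = fast_mode_tradeoff(standard_seconds=60.0, standard_cost=1.00)
print(f"{seconds:.0f} s, {cost:.2f} cost units")  # prints 40 s, 2.50 cost units
```

Under these assumptions, Fast mode saves a third of the waiting time while costing two and a half times as much, so it suits latency-sensitive work more than bulk jobs.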

GPT-5.5 is priced higher than GPT-5.4. OpenAI argues that the new model's higher token efficiency offsets the difference.
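
The link between price and token efficiency comes down to cost per task: price per token multiplied by tokens used. The sketch below uses entirely hypothetical prices and token counts (they are not the actual figures for either model) to show how a model with a higher per-token price can still be cheaper per task.

```python
def cost_per_task(price_per_million: float, tokens_per_task: int) -> float:
    """Cost of completing one task, given a per-million-token price."""
    return price_per_million * tokens_per_task / 1_000_000


# Hypothetical numbers, chosen only to illustrate the arithmetic.
older_model = cost_per_task(price_per_million=10.0, tokens_per_task=100_000)
newer_model = cost_per_task(price_per_million=20.0, tokens_per_task=40_000)

print(f"older: ${older_model:.2f}, newer: ${newer_model:.2f}")
print("newer model is cheaper per task:", newer_model < older_model)
```

In general, a model priced r times higher per token comes out cheaper per task only if it uses fewer than 1/r as many tokens for the same work.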

Capabilities of GPT-5.5

The new model uses fewer tokens and relies less on retries when solving tasks. On Artificial Analysis’s programming index, it delivers a “state-of-the-art level of intelligence” at half the cost of competing models.

GPT-5.5 is the company’s most powerful system for agentic programming. On Terminal‑Bench 2.0, which evaluates complex command‑line scenarios, it reaches 82.7% accuracy.

On SWE‑Bench Pro it scored 58.6%; on Expert‑SWE the model outperformed GPT‑5.4.

Across all three benchmarks the newcomer beat its predecessor while using fewer tokens.

“The model’s strengths in programming are particularly apparent in Codex, where it can perform engineering tasks—from implementation and refactoring to debugging, testing and validation,” the company blog says.

GPT-5.5 has a better grasp of system structure: why something fails, where fixes are needed and which parts of the code they affect.

According to OpenAI, the model “significantly outperforms” GPT-5.4 and Claude Opus 4.7 in reasoning and autonomy: it anticipates problems and predicts testing and review needs without explicit prompting.

On GDPval, which assesses agents’ ability to complete well‑defined intellectual tasks across 44 professions, GPT‑5.5 scores 84.9%. On OSWorld‑Verified it posts 78.7%, and on Tau2‑bench 98%.

GPT‑5.5 also performs strongly in other tests: 60% on FinanceAgent, 88.5% on internal investment‑banking modelling tasks and 54.1% on OfficeQA Pro.

Information work

OpenAI describes GPT-5.5 as “a powerful tool for everyday computer work.” The model better understands user intent and works more confidently through the full information-handling cycle: search, analysis, tool use, verification and turning raw inputs into finished outputs.

In Codex, GPT‑5.5 outperforms GPT‑5.4 at producing documents, spreadsheets and slide presentations.

More than 85% of staff across OpenAI divisions use Codex weekly, both in software development and in finance, communications, marketing, data analytics and product management.

Scientific research

GPT-5.5 also performs better in scientific and technical workflows. These tasks do not reduce to answering a single question: the model can explore an idea step by step, gather evidence, test a hypothesis and interpret the resulting data.

GPT‑5.5 shows improvements over GPT‑5.4 on GeneBench—a platform for multi‑step analysis of scientific data in genetics and quantitative biology.

On BixBench the new model also surpassed its predecessor.

In April, OpenAI introduced “agents for the workspace” in ChatGPT. Teams can create shared assistants for complex tasks and long‑running processes.
