OpenAI rolls out GPT-5.5, aims to automate complex work end-to-end

GPT-5.5 can take on messy, multi-part tasks from start to finish, according to the company's press release. (Image source: IANS)

OpenAI has introduced GPT-5.5, describing it as its most advanced and intuitive model to date, and positioning it as a major step toward AI systems that can independently execute complete tasks rather than simply respond to prompts.

OpenAI unveils GPT-5.5 as its most advanced model yet

According to the company, the new model is designed to handle complex, loosely defined assignments from beginning to end with minimal guidance. Unlike earlier systems that required detailed step-by-step instructions, GPT-5.5 can plan workflows, select and use tools, review its own outputs, and keep working until a task is fully completed. OpenAI says it performs strongly in coding, debugging, online research, data analysis, document creation, and spreadsheet work, and can also operate software across applications.


Strong gains in coding benchmarks and agentic tasks

A major focus of the upgrade is agent-style coding and computer operation. On the Terminal-Bench 2.0 evaluation, which measures advanced command-line task handling involving planning and tool use, the model achieves 82.7 per cent accuracy, which is reported as a new high on that test. In the SWE-Bench Pro test, which assesses real-world GitHub issue resolution, GPT-5.5 reaches 58.6 per cent and is reported to solve more tasks end-to-end in a single attempt than earlier versions. It also performs better than GPT-5.4 on OpenAI’s internal Expert-SWE benchmark, which involves long-duration coding projects lasting up to 20 hours. The company adds that these improvements come with reduced token usage, making the system more efficient as well as more capable. On Artificial Analysis’s Coding Index, GPT-5.5 is said to deliver top-tier performance at roughly half the cost of comparable models.

Built on NVIDIA infrastructure with optimised system design

OpenAI notes that the model was jointly developed and deployed on NVIDIA GB200 and GB300 NVL72 infrastructure. The model was also used, through Codex, in testing and optimisation work. One of the technical improvements is dynamic load balancing: instead of dividing requests into fixed sections, the system analysed several weeks of real production traffic patterns to design improved partitioning methods. This change reportedly improved token generation speed by more than 20 per cent.
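OpenAI has not published how its partitioning works, so the following is only a minimal illustrative sketch of the general idea, not the company's method. It contrasts a fixed split of requests with a split informed by observed per-request cost, using a simple greedy heuristic (largest request first, assigned to the least-loaded worker). The function names and the sample costs are hypothetical.

```python
import heapq


def fixed_partition(costs, n_workers):
    """Static split: contiguous chunks with an equal number of requests."""
    size = -(-len(costs) // n_workers)  # ceiling division
    return [costs[i * size:(i + 1) * size] for i in range(n_workers)]


def balanced_partition(costs, n_workers):
    """Cost-aware split: largest request first, to the least-loaded worker."""
    heap = [(0, w) for w in range(n_workers)]  # (current load, worker index)
    heapq.heapify(heap)
    parts = [[] for _ in range(n_workers)]
    for cost in sorted(costs, reverse=True):
        load, w = heapq.heappop(heap)
        parts[w].append(cost)
        heapq.heappush(heap, (load + cost, w))
    return parts


def max_load(parts):
    """Load on the busiest worker, which bounds overall completion time."""
    return max(sum(p) for p in parts)


# Hypothetical per-request costs (for example, tokens to generate),
# standing in for measurements taken from real traffic.
observed_costs = [900, 850, 800, 120, 100, 90, 80, 60]

print(max_load(fixed_partition(observed_costs, 4)))     # 1750
print(max_load(balanced_partition(observed_costs, 4)))  # 900
```

In this toy case the busiest worker's load falls from 1750 to 900 once the split accounts for how uneven the requests are. The figures are made up for illustration and say nothing about the 20 per cent speed-up reported by OpenAI.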

Positioned as an AI assistant for knowledge work

For knowledge-oriented work, GPT-5.5 is positioned as a more capable assistant-style system. It is designed to retrieve relevant information more effectively, extract key insights, use external tools, and convert raw input into structured outputs. Within Codex, it is said to produce higher-quality documents, spreadsheets, and presentations. OpenAI also reports internal adoption across departments such as finance, communications, marketing, and product teams. For example, its finance team used the model to review 24,771 K-1 tax forms spanning more than 71,000 pages, speeding up a process that typically takes two weeks. In another case, the communications team developed a scoring system for speaking requests and deployed an automated Slack-based agent to handle low-risk requests without human intervention.

ChatGPT adds “Thinking” and “Pro” variants with improved performance

In ChatGPT, the “Thinking” variant of GPT-5.5 is designed to provide faster and more concise responses for complex reasoning tasks, while the “Pro” version targets higher-quality outputs for demanding professional domains including business, law, education, and data science. Performance benchmarks cited by the company include 84.9 per cent on GDPval for multi-domain professional tasks, 78.7 per cent on OSWorld-Verified for real computer environment operations, and 98 per cent on Tau2-bench Telecom for customer service workflows without the need for prompt tuning.

Expanded safety framework and cybersecurity controls

On the safety side, OpenAI states that GPT-5.5 incorporates its most robust safeguards so far, particularly around high-risk cybersecurity scenarios. The model has undergone expanded evaluation with external red team experts. Within the company’s Preparedness Framework, its cybersecurity and biology-related capabilities are rated as “High,” though not classified as “Critical.” As part of its safety approach, OpenAI is also introducing a Trusted Access program for cybersecurity use cases, allowing verified professionals expanded access to cyber-enabled models for legitimate defensive purposes.

Phased rollout begins across ChatGPT and Codex

The rollout of GPT-5.5 has begun for Plus, Pro, Business, and Enterprise users across ChatGPT and Codex. The Pro version is available to Pro, Business, and Enterprise tiers, while broader API access is expected in the near future after additional safety and security assessments are completed.