OpenAI ChatGPT Agent Takes Control: New AI Assistant Handles Complex Tasks Independently

by Faith Amonimo
July 22, 2025
in Artificial Intelligence

OpenAI just rolled out ChatGPT Agent, turning the popular chatbot into an autonomous assistant that can complete complex tasks from start to finish.

This new agent doesn’t just chat anymore. It browses the web, runs code, creates presentations, and handles real-world tasks without constant human input. Users can now ask ChatGPT to plan a dinner party, analyze competitors, or update spreadsheets while they focus on other work.

OpenAI Agent Combines Three Previous Tools Into One

The ChatGPT Agent merges capabilities from three separate OpenAI products. It uses Operator’s web browsing skills, Deep Research’s information analysis powers, and ChatGPT’s conversational abilities. This combination creates a single tool that can handle tasks requiring multiple steps and different skill sets.

OpenAI built the agent with a virtual computer that switches between thinking and acting. The system can open websites, download files, run terminal commands, and view results in a browser. All these actions happen within the same task context, so the agent remembers what it did previously and builds on that work.
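
The think-then-act cycle described above can be illustrated with a small sketch. This is not OpenAI's implementation: the `TaskContext` class, the `plan_next_step` function, and the two tool names are hypothetical, chosen only to show how a single shared context lets each step build on earlier results.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class TaskContext:
    """Hypothetical shared state that persists across every step of one task."""
    goal: str
    history: List[Tuple[str, str]] = field(default_factory=list)  # (tool, result)


# Hypothetical stand-ins for a browser and a terminal; real tools would do I/O.
TOOLS: Dict[str, Callable[[str], str]] = {
    "browser": lambda arg: f"opened {arg}",
    "terminal": lambda arg: f"ran {arg}",
}


def plan_next_step(ctx: TaskContext) -> Optional[Tuple[str, str]]:
    """'Thinking' phase: pick the next action based on what has been done so far.

    A fixed two-step script stands in for a model's decision (an assumption).
    """
    script = [
        ("browser", "https://example.com/data.csv"),
        ("terminal", "python analyze.py"),
    ]
    done = len(ctx.history)
    return script[done] if done < len(script) else None


def run_agent(ctx: TaskContext, max_steps: int = 10) -> TaskContext:
    """Alternate between thinking (planning) and acting (calling a tool)."""
    for _ in range(max_steps):
        step = plan_next_step(ctx)
        if step is None:  # nothing left to do
            break
        tool, arg = step
        result = TOOLS[tool](arg)           # 'acting' phase
        ctx.history.append((tool, result))  # visible to the next planning step
    return ctx
```

Calling `run_agent(TaskContext(goal="analyze data"))` would run both scripted steps in order, with the second planning call seeing the result of the first in `ctx.history`.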

The agent works with ChatGPT connectors too. Users can link Gmail, GitHub, and other apps so the agent can access relevant information and take actions across different platforms. When needed, users can take control of the browser themselves to log into accounts or guide specific actions.

Real-World Tasks ChatGPT Agent Can Handle Today

The new agent tackles both personal and professional tasks that previously required multiple steps and tools. For work, it can convert screenshots into editable presentations, reschedule meetings, plan company offsites, and update financial spreadsheets while maintaining formatting.

Personal tasks include planning and booking complete travel itineraries, designing dinner parties, finding medical specialists, and scheduling appointments. The agent can also shop for specific ingredients, research products, and handle form submissions on websites.

OpenAI tested the agent on complex knowledge work tasks typically done by professionals. In roughly half the cases, experts judged the agent’s output as comparable to or better than human work. These tasks included competitive analysis reports, detailed amortization schedules, and technical feasibility studies.

ChatGPT Agent Performance Numbers Beat Previous Models

OpenAI shared benchmark results showing significant improvements over earlier AI models. On Humanity’s Last Exam, which tests AI across expert-level topics, ChatGPT Agent scored 41.6%. That is roughly double the scores of OpenAI’s o3 and o4-mini models on the same test.

On the FrontierMath benchmark of complex math problems, the agent achieved 27.4% accuracy when given access to coding tools. Previous models struggled with these problems, which typically take expert mathematicians hours or days to solve.

The agent also outperformed humans on DSBench, a data science evaluation covering analysis and modeling tasks. On SpreadsheetBench, which measures spreadsheet editing, it scored 45.5% compared with 20.0% for Microsoft Copilot in Excel.

Investment banking modeling tasks showed similar results. The agent significantly outperformed previous OpenAI models when building financial models, leveraged buyout analyses, and other complex financial documents with proper formatting and citations.

OpenAI Agent Pricing and Availability Details

ChatGPT Agent launched for Pro, Plus, and Team subscribers on July 17, 2025. Pro users get 400 messages per month, while Plus and Team users get 40. Usage beyond those limits is available through flexible credit-based options.
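
As a rough illustration of the allowances above, the sketch below tracks how many agent messages remain in a month. The function names are hypothetical, and how OpenAI actually counts messages or bills credits is not described here, so treat the logic as an assumption.

```python
# Monthly agent message allowances as reported in this article.
MONTHLY_AGENT_MESSAGES = {"pro": 400, "plus": 40, "team": 40}


def remaining_messages(plan: str, used: int) -> int:
    """Return how many included messages are left this month (never negative)."""
    allowance = MONTHLY_AGENT_MESSAGES[plan.lower()]
    return max(allowance - used, 0)


def needs_credits(plan: str, used: int) -> bool:
    """True once the included allowance is exhausted (assumed cutoff rule)."""
    return remaining_messages(plan, used) == 0
```

For example, a Plus subscriber who has sent 25 agent messages would have 15 left under this simple model.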

Pro subscribers ($200 monthly) gained immediate access, with Plus and Team users getting access over several days. Enterprise and Education customers will receive access in the coming weeks. European Economic Area and Switzerland users must wait longer due to regulatory considerations.

The pricing puts OpenAI’s agent in direct competition with other AI assistant services. However, the company’s integration of multiple capabilities into one tool may justify the premium pricing for users who need complex task automation.

Safety Measures Address New Agent Risks

OpenAI implemented extensive safety controls because the agent can take real actions on the web. The company treats ChatGPT Agent as having “High Biological and Chemical capabilities” under its Preparedness Framework, which activates the framework’s comprehensive safeguards.

The safety stack includes real-time monitoring, prompt injection resistance, and explicit user confirmation for consequential actions. OpenAI trained the agent to refuse high-risk tasks like bank transfers and requires active supervision for critical actions like sending emails.
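
The three-way split described above (refuse, ask first, or proceed) can be sketched as a simple gate. The action names, the `Decision` enum, and the `execute` helper are hypothetical; OpenAI trains the behavior into the model rather than exposing a rule table like this one.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "confirm"
    REFUSE = "refuse"


# Hypothetical action labels based on the examples given in the article.
REFUSED_ACTIONS = {"bank_transfer"}  # high-risk: always declined
CONFIRM_ACTIONS = {"send_email"}     # consequential: user must approve


def gate(action: str) -> Decision:
    """Classify an action before the agent is allowed to run it."""
    if action in REFUSED_ACTIONS:
        return Decision.REFUSE
    if action in CONFIRM_ACTIONS:
        return Decision.CONFIRM
    return Decision.ALLOW


def execute(action: str, user_confirms: Callable[[str], bool]) -> str:
    """Run an action only if the gate, and the user where required, permit it."""
    decision = gate(action)
    if decision is Decision.REFUSE:
        return "refused"
    if decision is Decision.CONFIRM and not user_confirms(action):
        return "cancelled"
    return "done"
```

Here `user_confirms` stands in for whatever prompt the interface shows; passing `lambda action: False` models a user who declines.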

Privacy controls let users delete all browsing data with one click and log out of active website sessions immediately. During browser takeover mode, when users interact directly with websites, ChatGPT doesn’t collect or store any entered data, including passwords.

The company also disabled ChatGPT’s memory feature for the agent to prevent data exfiltration through prompt injection attacks. OpenAI may restore this feature later with additional protections.

ChatGPT Agent Limitations and Future Development

OpenAI acknowledges the agent remains in early stages despite impressive capabilities. The system can make mistakes and sometimes struggles with complex multi-step tasks requiring perfect execution.

Slideshow generation currently produces basic formatting that may need manual refinement. OpenAI noted discrepancies between viewer displays and exported PowerPoint files, though the company is training improved versions to address these issues.

Users cannot yet upload existing slideshows as templates, unlike the spreadsheet editing feature. The company plans regular updates to improve efficiency, reduce required oversight, and expand capabilities while maintaining safety standards.

The Operator research preview will remain functional for several weeks before shutdown. Deep Research remains available as a separate option for users who prefer more detailed, in-depth responses that take longer to generate.

© 2025 — Techsoma Middle East. All Rights Reserved
