OpenAI ChatGPT Agent Takes Control: New AI Assistant Handles Complex Tasks Independently

July 22, 2025
in Artificial Intelligence

OpenAI just rolled out ChatGPT Agent, turning the popular chatbot into an autonomous assistant that can complete complex tasks from start to finish.

This new agent doesn’t just chat anymore. It browses the web, runs code, creates presentations, and handles real-world tasks without constant human input. Users can now ask ChatGPT to plan a dinner party, analyze competitors, or update spreadsheets while they focus on other work.

OpenAI Agent Combines Three Previous Tools Into One

The ChatGPT Agent merges capabilities from three separate OpenAI products. It uses Operator’s web browsing skills, Deep Research’s information analysis powers, and ChatGPT’s conversational abilities. This combination creates a single tool that can handle tasks requiring multiple steps and different skill sets.

OpenAI gave the agent its own virtual computer, which lets it switch between reasoning and acting. The system can open websites, download files, run terminal commands, and view results in a browser. All these actions happen within the same task context, so the agent remembers what it did previously and builds on that work.
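
To make the think-then-act cycle concrete, here is a minimal Python sketch of a loop that keeps one shared task context across steps. It is an illustration only and not OpenAI's implementation: `TaskContext`, `run_agent`, the scripted planner, and the toy tool names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

Step = tuple[str, str]  # (tool name, argument); hypothetical shape


@dataclass
class TaskContext:
    """Everything done so far in one task, shared by every step."""
    goal: str
    history: list[tuple[str, str]] = field(default_factory=list)  # (action, result)


def run_agent(goal: str,
              plan_next: Callable[[TaskContext], Optional[Step]],
              tools: dict[str, Callable[[str], str]],
              max_steps: int = 10) -> TaskContext:
    """Alternate between planning ("thinking") and tool calls ("acting")."""
    ctx = TaskContext(goal)
    for _ in range(max_steps):
        step = plan_next(ctx)          # think: decide using the full history
        if step is None:               # planner signals the task is finished
            break
        name, arg = step
        result = tools[name](arg)      # act: run the chosen tool
        ctx.history.append((f"{name}({arg})", result))
    return ctx


# Toy stand-ins (assumptions): a fixed script instead of a model,
# and string-returning functions instead of a real browser or terminal.
def scripted_planner(ctx: TaskContext) -> Optional[Step]:
    script = [("open_url", "https://example.com"), ("run_command", "ls")]
    done = len(ctx.history)
    return script[done] if done < len(script) else None


toy_tools = {
    "open_url": lambda url: f"opened {url}",
    "run_command": lambda cmd: f"ran {cmd}",
}

ctx = run_agent("demo task", scripted_planner, toy_tools)
```

Because every step appends to the same `history`, the planner always sees what was done earlier, which is the property the article describes as building on previous work.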

The agent works with ChatGPT connectors too. Users can link Gmail, GitHub, and other apps so the agent can access relevant information and take actions across different platforms. When needed, users can take control of the browser themselves to log into accounts or guide specific actions.

Real-World Tasks ChatGPT Agent Can Handle Today

The new agent tackles both personal and professional tasks that previously required multiple steps and tools. For work, it can convert screenshots into editable presentations, reschedule meetings, plan company offsites, and update financial spreadsheets while maintaining formatting.

Personal tasks include planning and booking complete travel itineraries, designing dinner parties, finding medical specialists, and scheduling appointments. The agent can also shop for specific ingredients, research products, and handle form submissions on websites.

OpenAI tested the agent on complex knowledge work tasks typically done by professionals. In roughly half the cases, experts judged the agent’s output as comparable to or better than human work. These tasks included competitive analysis reports, detailed amortization schedules, and technical feasibility studies.

ChatGPT Agent Performance Numbers Beat Previous Models

OpenAI shared benchmark results showing significant improvements over earlier AI models. On Humanity’s Last Exam, which tests AI across expert-level topics, ChatGPT Agent scored 41.6%, roughly double the scores of OpenAI’s o3 and o4-mini models on the same test.

On the FrontierMath benchmark of complex math problems, the agent achieved 27.4% accuracy when given access to coding tools. Previous models struggled with these problems, which typically take expert mathematicians hours or days to solve.

The agent also outperformed humans on DSBench, a data science evaluation covering analysis and modeling tasks. On spreadsheet editing tasks in SpreadsheetBench, it scored 45.5%, compared with 20.0% for Microsoft Copilot in Excel.

Investment banking modeling tasks showed similar results. The agent significantly outperformed previous OpenAI models when building financial models, leveraged buyout analyses, and other complex financial documents with proper formatting and citations.

OpenAI Agent Pricing and Availability Details

ChatGPT Agent launched for Pro, Plus, and Team subscribers on July 17, 2025. Pro users get 400 messages per month, while Plus and Team users receive 40 messages per month. Usage beyond those limits is available through flexible credit-based options.

Pro subscribers ($200 monthly) gained immediate access, with Plus and Team users getting access over several days. Enterprise and Education customers will receive access in the coming weeks. European Economic Area and Switzerland users must wait longer due to regulatory considerations.

The pricing puts OpenAI’s agent in direct competition with other AI assistant services. However, the company’s integration of multiple capabilities into one tool may justify the premium pricing for users who need complex task automation.

Safety Measures Address New Agent Risks

OpenAI implemented extensive safety controls because the agent can take real actions on the web. The company classified ChatGPT Agent as having “High Biological and Chemical capabilities” under its Preparedness Framework, activating comprehensive safeguards.

The safety stack includes real-time monitoring, prompt injection resistance, and explicit user confirmation for consequential actions. OpenAI trained the agent to refuse high-risk tasks like bank transfers and requires active supervision for critical actions like sending emails.
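
The three tiers described above (refuse, supervise, confirm) can be pictured as a simple policy gate. The Python sketch below is a hypothetical illustration, not OpenAI's safety stack: the `Decision` enum, the `gate` function, and the action names are assumptions, and only the bank transfer and email examples come from the article.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"          # run without interrupting the user
    CONFIRM = "confirm"      # ask for explicit confirmation first
    SUPERVISE = "supervise"  # run only while the user is actively watching
    REFUSE = "refuse"        # never run


# Action names are illustrative assumptions for this sketch.
HIGH_RISK = {"bank_transfer"}                     # article: refused
SUPERVISED = {"send_email"}                       # article: needs active supervision
CONSEQUENTIAL = {"submit_form", "make_purchase"}  # assumed examples of consequential actions


def gate(action: str) -> Decision:
    """Map an action name to the strictest rule that applies to it."""
    if action in HIGH_RISK:
        return Decision.REFUSE
    if action in SUPERVISED:
        return Decision.SUPERVISE
    if action in CONSEQUENTIAL:
        return Decision.CONFIRM
    return Decision.ALLOW
```

Checking the riskiest category first means an action that somehow appears in more than one set always receives the stricter treatment.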

Privacy controls let users delete all browsing data with one click and log out of active website sessions immediately. During browser takeover mode, when users interact directly with websites, ChatGPT doesn’t collect or store any entered data including passwords.

The company also disabled ChatGPT’s memory feature for the agent to prevent data exfiltration through prompt injection attacks. OpenAI may restore this feature later with additional protections.

ChatGPT Agent Limitations and Future Development

OpenAI acknowledges the agent remains in early stages despite impressive capabilities. The system can make mistakes and sometimes struggles with complex multi-step tasks requiring perfect execution.

Slideshow generation currently produces basic formatting that may need manual refinement. OpenAI noted discrepancies between viewer displays and exported PowerPoint files, though the company is training improved versions to address these issues.

Users cannot yet upload existing slideshows as templates, unlike the spreadsheet editing feature. The company plans regular updates to improve efficiency, reduce required oversight, and expand capabilities while maintaining safety standards.

The Operator research preview will remain functional for several weeks before shutdown. Deep Research remains available as a separate option for users who prefer more detailed, in-depth responses that take longer to generate.

Tags: AI agents, AI assistant, AI capabilities, AI safety, Artificial Intelligence, autonomous AI, ChatGPT Agent, machine learning, OpenAI, OpenAI pricing, task automation, tech news, web browsing AI