Heather English
OpenAI’s August 7th release of GPT-5 revealed how deeply integrated ChatGPT has become in user workflows, prompting massive backlash when the update removed access to GPT-4o and older models. Following a surge of complaints about broken workflows and GPT-5 falling short of performance expectations, Sam Altman announced during a live Reddit AMA one day after launch that GPT-4o access would be restored for Plus users. Looking at reactions to the model’s release, it’s clear that model developers have much to learn about the psychology and expectations of human interaction with AI.
Conversation Timeline 7 Days Post-Launch
Net Sentiment 7 Days Post-Launch (includes news coverage): 22% (down 17 points since launch)
Net Sentiment 7 Days Post-Launch (excludes news coverage): -8% (down 15 points since launch)
One week post-launch, sentiment had declined a further 15-17 points, though it began to improve as consumers reported positive outcomes from testing (primarily in coding applications). News sources repeating OpenAI’s claims about the model tended to overinflate sentiment; true consumer sentiment was starkly negative, driven not only by complaints about the launch event itself, but also by access limitations and fundamental needs being stripped away in GPT-5. Although reviews initially skewed negative, pockets of users appreciated changes in prompting and processing.
Those who shared favorable reviews of GPT-5 appreciated the model’s directness and no-nonsense answers, with many deeming it ‘The Professional Model’. Users reporting positive outcomes confirmed a reduction in hallucinations and faster outputs in the following use cases.
Themes Driving Positive Sentiment for GPT-5
Quid Discover Network: Top Positive Themes about GPT-5
Coding and Debugging
Expert coders were impressed by GPT-5’s ability to handle complex coding problems, with intuitive suggestions from GPT-5 Thinking and control over the speed and quality of output via the reasoning_effort parameter. Despite launch glitches with the Smart Routing System, which kept the model from performing at optimum levels, some users experienced better speed and accuracy compared to o3 and competing models. Debugging was another favorite capability, particularly identifying and correcting errors other models (Claude Opus and Sonnet) had created, and fixing migrated code and UI issues.
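The reasoning_effort control referenced above corresponds to the reasoning-effort setting in OpenAI’s Responses API. A minimal sketch of how a developer might tune it per task (the prompts are illustrative, and actually sending a request would require the openai SDK and an API key):

```python
# Sketch: tuning GPT-5's reasoning effort per request.
# Assumes OpenAI's Responses API request shape; prompts are illustrative.

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Build a request payload trading output speed for reasoning depth."""
    allowed = {"minimal", "low", "medium", "high"}
    if effort not in allowed:
        raise ValueError(f"effort must be one of {sorted(allowed)}")
    return {
        "model": "gpt-5",
        "input": prompt,
        # Lower effort -> faster, cheaper replies; higher effort -> deeper reasoning.
        "reasoning": {"effort": effort},
    }

# A quick debugging pass might use low effort; a tricky refactor, high effort.
fast = build_request("Explain this stack trace.", effort="low")
deep = build_request("Refactor this module for thread safety.", effort="high")
# client.responses.create(**fast)  # would require the openai SDK and an API key
```

This is the speed/quality trade-off reviewers described: the same model, dialed up or down per request rather than switched out.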
Many were impressed with how the model handled backend coding tasks, including optimizing features, refactoring functions, and rebuilding large portions of backend code in just a few hours. Most significantly, in Thinking mode it helped developers understand the quality and efficiency of their code and whether certain features were necessary.
For frontend development, reviews were mixed, with the consensus that GPT-5 was better suited for backend tasks. Not all developers were satisfied with the design and UI outputs using the model. A common trend was to leverage different models to fill in the gaps, specifically using Claude Sonnet for design and GPT-5 for complex technical backend executions.
Improved Memory and a Larger Context Window
A substantial upgrade to the context window, with a 272K-token input and 128K-token output on the Pro and Enterprise plans, helps the model maintain higher-quality output as token use increases. While models like Claude and Gemini offer considerably larger token capacities, some argue OpenAI has found the sweet spot between mitigating costs and maintaining performance and correct context. Analysis of long-format research in several areas, including medical and legal, proved impressive, producing more accurate citations and stronger sources with reduced hallucinations.
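To make the 272K-token input window concrete, here is a rough back-of-the-envelope check of whether a long document fits. The 4-characters-per-token ratio is a common heuristic for English text, not an exact tokenizer, and the reserve value is an illustrative assumption:

```python
# Rough check that a long document fits GPT-5's reported 272K-token input
# window (the Pro/Enterprise figure cited above). ~4 chars per token is a
# common English-text heuristic, not an exact tokenizer.

INPUT_WINDOW = 272_000

def fits_in_window(text: str, reserve_for_prompt: int = 2_000) -> bool:
    """Estimate token count and leave headroom for instructions."""
    approx_tokens = len(text) // 4
    return approx_tokens + reserve_for_prompt <= INPUT_WINDOW

# A long legal brief of ~1M characters (~250K tokens) would squeak in,
# which is the kind of long-format research use case reviewers praised.
fits = fits_in_window("x" * 1_000_000)
```

For real workloads, an actual tokenizer (e.g. OpenAI’s tiktoken library) would give exact counts; the point here is simply the scale the new window allows.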
Vibe Coding and Creative Applications
Vibe coding trended in discussions about creative uses of GPT-5, including gaming and app creation. Feedback from video game designers was another prominent theme. The gaming community was impressed by the ability to develop fully functioning games with simple prompts, opening the door for less technical consumers to experiment with creating custom games. From character development and plot ideation to producing complex code from a basic prompt, the gaming community has readily shared its process for leveraging GPT-5 in game development.
Consumers arguably had overly high expectations of GPT-5 due to the pre-release hype and marketing. With claims that this version of GPT would be the closest model on the market to achieving AGI, errors during the rollout overshadowed positive features and the impact on AI progress overall. Perhaps most unexpected was that the chief complaint had little to do with technical flaws and more with behavioral expectations of AI.
Themes Driving Negative Sentiment for GPT-5
Quid Discover Network: Top Negative Themes about GPT-5
Lacking Humanness
The primary feedback on GPT-5, which elicited the most negative reactions, was that the model lacked the personality users had grown accustomed to in previous versions. This sentiment was closely associated with creative use cases but also revealed a vital user behavior. As users relied more on models for various needs, personally and professionally, a deeper bond was building. The idea that we should see AI as a companion, often woven into marketing messages, came to fruition.
End-users are developing connections to AI as if it were a human counterpart. The AI bond is most evident in mental health use cases, where models provide advice, emotional support, and serve as an independent sounding board for various topics. Concerns about AI Therapy have raised flags across the industry, prompting more research into safety guardrails.
In connection with the GPT-5 release, many missed the personality, punchiness, and fun that 4o responses offered. Users enjoyed the banter with the model and went as far as personalizing agents/models by naming them, further humanizing AI. ChatGPT models were referred to as a ‘work buddy’ or ‘best friend’, showing the interpersonal connection between humans and AI. When GPT-5 was released, this bond was broken, leaving many to describe the updated version as cold, dull, and lacking emotional support, which led to an increase in requests to restore 4o.
Subscriber Roadblocks to Utilization
ChatGPT users felt swindled by the usage limits put in place for each subscription tier. Under 4o and o3, many shared that they could leverage both versions and ‘never ran out of messages’. Across forums, users scrambled to find solutions that would maximize usage under OpenAI’s new plan. Depending on use case, users reported hitting the message cap much sooner than with previous versions. The increasing cost of maintaining prior performance levels also angered many, with one example noting that keeping the same quality of output for the same time investment jumped from a $20 subscription to $200. These constraints inevitably handed a gift to OpenAI’s competitors, with many users considering more affordable and less restrictive options, or canceling subscriptions shortly after GPT-5’s release.
Coding Hiccups, High Error Rates, and the Lost Trustworthiness of 4o
In coding evaluations, Anthropic’s Claude was mentioned as the model to beat. Many found GPT-5’s coding capabilities to fall substantially short. Subscribers reported a ‘downgrade’ in coding ability, resulting from the upgrade, including broken scripts, incorrect solutions (i.e., incorrect SQL order), inconsistent results, overwriting code, the need to reenter prompts, decreased efficiency, and agents not following user commands. Most concluded that Claude Opus and Sonnet remained the best models on the market for coding tasks.
For creative tasks such as writing, GPT-5 was described as less expressive, returning much shorter responses, and more difficult to prompt. Those mid-project had to consider abandoning their work, as GPT-5 could not match the tone and creativity that 4o offered.
4o was seen as a trustworthy partner for AI tasks, measured by accuracy, consistent results, and reliability. Many felt that GPT-5 broke trust due to both technical errors and its failure to understand user intent, prioritizing the introduction of new features over ensuring they met user needs.
A Broken AGI Promise and a Demand for the Basics
Overt claims by Sam Altman and OpenAI marketing that GPT-5 was the model closest to achieving AGI prompted users to put that claim to the test. When features many relied on were initially eliminated in the GPT-5 rollout, and essential functionality was given poor marks, there was a massive backlash against the AGI claims. Influential leaders in the AI space also chimed in, with Gary Marcus’ article regularly referenced as a counterpoint to the AGI narrative. References to AGI set end-user expectations up to fail from the start. After having time to evaluate GPT-5, most shared that they wanted the essentials to work and their problems solved, rather than ‘superintelligent agents’. Marcus argues that scaling fast isn’t the path to achieving AGI, and the GPT-5 user experience left many feeling AGI was even further away.
Looking at responses to the GPT-5 release, it’s clear that OpenAI either overlooked or deprioritized how its customers actually use its products. The assumption that GPT-5 could deliver the same, if not better, speed and accuracy as preceding versions was premature; 4o appeared closer to what the vision of GPT-5 should have been. To OpenAI’s credit, it quickly restored 4o access for some tiers after the massive backlash. However, the incident could have been avoided had the impact of removing access to former models on daily use and complex workflows been understood.
Consumers of models across multiple developers have made clear exactly how AI is being used to solve problems, personally and professionally. The desire for AI to adopt more human-like characteristics is growing, as people come to see AI as a true counterpart. Loyalty is up for grabs for the models that best serve specific needs. Diverse demands create an interesting dilemma for model design: a large model that tackles multiple use cases with semi-adequate outcomes, or niche models that perfect specific uses. Frequent users have already indicated that they wouldn’t mind AI taking longer to process if it led to a better outcome. The market seems to be prioritizing quality over new features, which may not align with developers’ plans.
The ownership of AI innovation does not rest solely on the shoulders of model developers. As consumer expectations evolve, consumers need to be part of the conversation, and the AI community should be actively listening and learning. With continued transparency and collaboration, the potential for models to serve multiple functions while meeting AI’s profitability and scalability objectives as a business is within reach.