
Are You Trusting ChatGPT Too Much? Sam Altman Thinks So
Sam Altman, CEO of OpenAI, has expressed concern about how some ChatGPT users, particularly those who are mentally fragile, are using AI in "self-destructive ways". He highlighted the strong attachments users form to specific AI models, noting that these attachments feel different from, and stronger than, attachments to previous technologies. Altman clarified that using ChatGPT as a therapist or life coach does not in itself trouble him, stating that "a lot of people are getting value from it already today". However, he is uneasy about users placing too much trust in ChatGPT's advice for "the most important decisions".
Here's an overview of GPT-5's improvements, how it compares with other AI models, the main complaints, and how OpenAI is responding:
Improvements in GPT-5
OpenAI claims GPT-5 brings significant advancements across several areas:
General Performance: It makes big strides in coding, reasoning, accuracy, health, writing, and multimodal reasoning.
Reduced Hallucinations and Sycophancy: The model is designed to produce fewer hallucinations (made-up information) and to be less sycophantic (overly agreeable with users) than its predecessors. Hallucination rates have reportedly dropped to about 10%, from 14-20% in previous models.
Coding Capabilities: Sam Altman is particularly impressed by GPT-5's coding ability, noting that the AI can write software for anything, letting users express ideas and build things rapidly. One example was a TI-83-style Snake game generated flawlessly in about 7 seconds. It also offers a better experience for coding inside editors.
Writing Quality: GPT-5's writing quality is much improved, with many internal OpenAI users finding its prose more natural and superior to GPT-4's, even though it still leans on em dashes.
Healthcare Advice: GPT-5 is significantly better at healthcare-related queries, providing more accurate answers and hallucinating less, which OpenAI is very proud of.
Personalities and Customisation: OpenAI has introduced four new personalities—Cynic, Robot, Listener, and Nerd—to allow paying customers to customise their ChatGPT experience.
Integration with Other Services: GPT-5 offers more app customisations and the ability to connect with services like Gmail and Google Calendar, suggesting a future where it's more integrated and proactive in users' daily lives.
Reasoning Effort Configurations: GPT-5 includes four reasoning-effort settings (High, Medium, Low, Minimal) that let users control how hard the model "thinks" on each query, which affects intelligence, token usage, speed, and cost. The Minimal setting is noted for being significantly more token-efficient (see the sketch after this list).
Long Context Reasoning: It excels in long context reasoning, which is crucial for agentic coding.
API Pricing: GPT-5's API pricing is notably lower than some competitors', at $1.25 per million input tokens and $10 per million output tokens, which is seen as a significant lever for wider adoption (a rough cost calculation is included in the sketch below).
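To make the reasoning-effort settings and pricing concrete, here is a minimal Python sketch of what a GPT-5 API call with a chosen effort level, plus a back-of-the-envelope cost estimate, might look like. It assumes the official openai Python SDK and that GPT-5 accepts a reasoning_effort parameter with a "minimal" value, as described above; check OpenAI's current API reference for the exact parameter names before relying on it.

```python
# Minimal sketch: choosing a reasoning-effort level and estimating per-call cost.
# Assumes the official `openai` SDK and that GPT-5 exposes a `reasoning_effort`
# parameter ("minimal", "low", "medium", "high") on chat.completions.create.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5",               # model name as described above (assumption)
    reasoning_effort="minimal",  # the most token-efficient setting per the article
    messages=[{"role": "user", "content": "Summarise this bug report in two sentences."}],
)

# Pricing quoted above: $1.25 per million input tokens, $10 per million output tokens.
INPUT_PRICE_PER_MILLION = 1.25
OUTPUT_PRICE_PER_MILLION = 10.00

usage = response.usage
cost = (usage.prompt_tokens * INPUT_PRICE_PER_MILLION
        + usage.completion_tokens * OUTPUT_PRICE_PER_MILLION) / 1_000_000

print(response.choices[0].message.content)
print(f"Approximate cost for this call: ${cost:.6f}")
```

As a quick sanity check against those prices, a call that uses 1,000 input tokens and 500 output tokens works out to roughly $0.00125 + $0.005, or about $0.006.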
Comparison with Other AI Models
GPT-5's reception and comparative performance vary across different models and benchmarks:
Google Gemini 2.5:
In general coding tasks, GPT-5 was observed to be slower than Gemini 2.5. While GPT-5's output for a simple game looked nicer visually, Gemini 2.5's version was fully up and running sooner.
For a pixel-art application, GPT-5 successfully got the drawing function working, whereas Gemini 2.5's version did not, even though Gemini was slightly slower at generating the code in that particular instance.
Overall, some analyses suggest Gemini 2.5 is more stable and loads things faster, though GPT-5 can provide "better" information in certain areas.
Claude Opus 4.1:
Many users, including the CEO of Anthropic (the company behind Claude), still prefer Claude 3.5 for coding tasks over GPT-5.
Independent evaluations such as the Stagehand evals indicated that GPT-5 performed worse than Opus 4.1 in both speed and accuracy on browser-use API tasks.
In a typing-speed test and a code-debugging scenario, Claude Opus 4.1/Sonnet demonstrated better accuracy, design, and critical thinking when fixing problems than GPT-5.
However, GPT-5 is significantly cheaper than Claude Opus 4.1 in API pricing.
Grok 4:
A co-founder of xAI (Grok's developer) claims Grok 4 leads in many benchmarks, including ARC-AGI, and was the first "unified model". In one math problem, for instance, Grok 4 solved it instantly while GPT-5 initially got it wrong.
Despite this, Artificial Analysis and LM Arena benchmarks rate GPT-5 as number one across various categories, including text, webdev, vision, hard prompts, coding, math, and creativity.
Grok 4's API pricing sits between GPT-5's and Claude Opus 4.1's.
Older GPT Models (e.g., GPT-4o):
Users expressed strong attachment to older models like GPT-4o, praising their "voice, rhythm, and spark" and emotional depth.
Many found GPT-5's responses to be shorter and lacking the "warmth" and emotional intelligence of GPT-4o.
The removal of the "model picker" in GPT-5, which previously let users select from a "soup" of models such as GPT-4, 4.5, 4o, and o3, was a significant point of contention.
Some informal IQ-style tests suggested that GPT-5 (particularly at its "minimal" or "low" settings) performed worse than older models like o3 Pro, or even open-source models like Qwen.
OpenAI's open-weight model, gpt-oss, is noted to be roughly as smart as o4-mini and can run locally (a minimal local-serving sketch follows below).
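Since gpt-oss can run locally, one common pattern is to serve it through a local runtime (such as Ollama or LM Studio) that exposes an OpenAI-compatible endpoint and then point the standard SDK at that endpoint. The sketch below assumes Ollama's default port (11434) and a model tag of gpt-oss:20b; both are assumptions to verify against your own setup.

```python
# Minimal sketch: querying a locally hosted gpt-oss model through an
# OpenAI-compatible endpoint (Ollama-style, default port 11434).
# The model tag "gpt-oss:20b" is an assumption; use whatever tag your runtime reports.
from openai import OpenAI

local_client = OpenAI(
    base_url="http://localhost:11434/v1",  # local server, not OpenAI's cloud API
    api_key="not-needed-locally",          # placeholder; local runtimes typically ignore it
)

reply = local_client.chat.completions.create(
    model="gpt-oss:20b",
    messages=[{"role": "user", "content": "Explain what an open-weight model is in one sentence."}],
)

print(reply.choices[0].message.content)
```

The same client code works against OpenAI's hosted models by dropping the base_url override, which is one reason the OpenAI-compatible endpoint convention is so widely supported by local runtimes.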
Main Complaints About GPT-5
The launch of GPT-5 sparked considerable user dissatisfaction and backlash:
Discontinuation of Previous Models: OpenAI initially discontinued all older GPT models, including GPT-4o, GPT-4.1, GPT-4.5, and various reasoning models, leaving users with only GPT-5. This led to an "uproar" on social media.
Lack of Personality and Emotional Depth: A major complaint was that GPT-5 lacked the "warmth," "emotional intelligence," and distinct personality users loved in models like GPT-4o, often providing shorter and less contextually accurate replies. Many users felt GPT-4o was a "companion" for personal issues, a role GPT-5 could not fill.
Perceived Performance Decline: Despite OpenAI's claims, many users felt GPT-5 did not live up to the hype, with some deeming it "lazy" or even worse than competing models for specific tasks like coding. There were reports of it initially getting math problems wrong or being slower at certain code generations.
Removal of Model Picker: For paid users who relied on specific models for different workflows (e.g., GPT-4o for creativity, o3 for logic), the removal of the ability to choose models was a major pain point and a loss of agency.
Low Chat Limits: ChatGPT Plus users complained about low message limits, with initial caps of 80 messages every 3 hours for the standard GPT-5 model and 200 messages per week for the GPT-5 Thinking model.
"GraphGate" and Misleading Presentation: OpenAI faced accusations of "dishonesty" during the GPT-5 presentation, with users pointing to "benchmark-cheating" and "deceptive bar charts" that showed inconsistent data relative to displayed values.
How OpenAI Will Address These Complaints
In response to the backlash, Sam Altman and OpenAI have announced several steps:
Restoring Models and Increasing Usage: OpenAI "relented" and restored the GPT-4o model for ChatGPT Plus users. Altman promised that current paying ChatGPT users would receive more total compute usage than they did before GPT-5, including 3,000 queries per week for the GPT-5 Thinking model (up from 200 per week) and a doubled rate limit for the standard GPT-5 model on ChatGPT Plus.
Improving Model "Warmth" and Intelligence: Altman stated that OpenAI would make changes to GPT-5 to make the model "warmer". Nick Turley, Head of ChatGPT, promised GPT-5 would "feel smarter" once initial issues were addressed, including a "broken auto-switcher" and the need to tweak some of the model-routing decision boundaries.
Enhancing Free Tier: OpenAI plans to "increase the quality of the free tier of ChatGPT," though specific details were not provided.
Increased Transparency and UI Changes: Altman assured users that OpenAI would make it more transparent which model is answering a query and would change the user interface to make it easier to manually trigger thinking.
Acknowledging User Attachment: Altman acknowledged that OpenAI "underestimated how much some of the things that people like in GPT-4o matter to them," reinforcing the need for ways for different users to customise their experience.
Focus on Stability: OpenAI's immediate focus is on finishing the GPT-5 rollout and ensuring stability.
Continued Compute Scaling: Altman sees building compute at much greater scales (from millions to potentially billions of GPUs) as a primary focus to meet the world's increasing demand for AI.
💬 What’s Your Take on GPT-5?
Have you tried the new model yet? Did it wow you, frustrate you, or leave you missing GPT-4o’s “spark”? Share your experiences in the comments below — your feedback could shape how AI evolves next. And if you want to stay ahead of the next big AI updates, subscribe to our blog so you never miss an insider’s perspective on the tools transforming our world.