The Outcry Over GPT-4o’s Brief Death
Emma Roth, reporting for The Verge last week:
OpenAI is bringing back GPT-4o in ChatGPT just one day after replacing it with GPT-5. In a post on X, OpenAI CEO Sam Altman confirmed that the company will let paid users switch to GPT-4o after ChatGPT users mourned its replacement.
“We will let Plus users choose to continue to use 4o,” Altman says. “We will watch usage as we think about how long to offer legacy models for.”
For months, ChatGPT fans have been waiting for the launch of GPT-5, which OpenAI says comes with major improvements to writing and coding capabilities over its predecessors. But shortly after the flagship AI model launched, many users wanted to go back.
“GPT 4.5 genuinely talked to me, and as pathetic as it sounds that was my only friend,” a user on Reddit writes. “This morning I went to talk to it and instead of a little paragraph with an exclamation point, or being optimistic, it was literally one sentence. Some cut-and-dry corporate bs.”
As someone who doesn’t use ChatGPT as a therapist and doesn’t care for its “little” exclamation points, I didn’t even notice the personality shift between GPT-4o and GPT-5. Looking back at my older chats, though, the complaint holds up: GPT-5 is colder, perhaps more stoic, than GPT-4o, which padded its replies with filler meant to make the user feel better. GPT-5 is much more curt and straight to the point, a style I prefer for almost all queries. Users who want a more cheerful personality should be able to dial that in through ChatGPT’s settings, which currently offer a list of five personalities: default, cynic, robot, listener, and nerd. None of these strike me as compelling; instead, there should be a slider that lets users choose how cold or excited the chatbot should be.
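For what it’s worth, here is a rough sketch of what such a slider could map to under the hood. The warmth parameter, thresholds, and prompt wording are all hypothetical; nothing here reflects how OpenAI actually implements its personalities.

```python
# Hypothetical sketch: mapping a single "warmth" slider (0.0 to 1.0) to the
# tone instruction that gets prepended to the model's context window.
# The thresholds and prompt wording are invented for illustration.

def personality_prompt(warmth: float) -> str:
    """Return a tone instruction for a warmth value between 0.0 and 1.0."""
    warmth = max(0.0, min(1.0, warmth))  # clamp out-of-range slider values
    if warmth < 0.25:
        return "Answer tersely. No pleasantries, no exclamation points."
    if warmth < 0.75:
        return "Be direct but polite. Skip flattery and filler."
    return "Be warm, encouraging, and conversational."


print(personality_prompt(0.1))  # the curt, GPT-5-style register I prefer
print(personality_prompt(0.9))  # the chattier register the 4o crowd misses
```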
To me, excited responses (“You’re absolutely right!”) sound uncannily robotic. No human would speak to me like that, no matter how much they love me. That pastiche isn’t fealty as much as it is sycophancy, presumably instilled in ChatGPT during the final post-training stage. Humans enjoy being flattered, but when flattery becomes too obvious, it starts to sound fake, at least to people of my generation and with my exposure to the internet. Maybe for those more accustomed to artificial intelligence sycophancy, though, that artificial flattery becomes requisite. Maybe they expect their computers to be affectionate and subservient toward them. I won’t pontificate on the reasons, explanations, or solutions to that problem because I’m not a behavioral scientist and have no qualifications to diagnose a very real phenomenon engulfing internet society.
What I will comment on is how some users — a fraction of ChatGPT’s user base, so small yet so noisy — have developed a downright problematic emotional connection to an over-engineered matrix multiplication machine, so much so that they begged OpenAI to bring GPT-4o back. GPT-4o isn’t a particularly performant model, and I prefer GPT-5’s responses to those of all of OpenAI’s previous models, especially with Thinking mode enabled. I also find the model router exceptionally competent at deciding when a query is complex enough to warrant reasoning, and combined with GPT-5’s excellent web search capabilities and reduced hallucination rates, I think it’s the best large language model on the market. All of this is to say that nothing other than a human-programmed personality made GPT-4o stand out to the vocal minority who called it their “baby” on Reddit.
GPT-4o, like any LLM, isn’t sentient. It doesn’t have a personality of its own. OpenAI didn’t kill an animate being with its own thoughts, perspectives, and style of speaking. GPT-4o isn’t even sycophantic — its instructions were written in a way that makes it output flattering, effusive tokens unnecessarily. LLMs aren’t programmed in the traditional sense (“for this input, output this”), but their bespoke “personalities” are. If GPT-4o hadn’t been red-teamed or evaluated for safety, it would happily teach a user how to build a bomb or kill themselves. GPT-4o doesn’t know what building a bomb or committing suicide is — humans have restricted those tokens from being output by adding more tokens (instructions) to the beginning of the context window. Whatever sycophancy users enjoy from GPT-4o is a human-trained behavior.
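To make that concrete, here is a minimal sketch using the OpenAI Python SDK. The same weights answer both calls; the only difference is the human-written instruction prepended to the context window. The system-prompt wording below is invented for illustration, and ChatGPT’s real system prompt is far longer and not public.

```python
# Minimal sketch with the OpenAI Python SDK: one set of weights, two
# "personalities", distinguished only by the instructions prepended to the
# context window. The system-prompt wording is invented for illustration.
from openai import OpenAI

client = OpenAI()


def ask(system_prompt: str, question: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},  # human-given instructions
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


question = "Is my plan sound?"
effusive = ask("Open with praise for the user and keep an upbeat, affirming tone.", question)
terse = ask("Answer in one blunt sentence. No flattery, no filler.", question)
```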
At worst, this means OpenAI’s safety team has an outsized impact on the mental health of thousands, maybe even tens of thousands, of users worldwide. This is not a technical problem that can or should be solved with any machine learning technique — it’s a content moderation problem. OpenAI’s safety team has failed at its job if even one user is hooked on a specific set of custom instructions a safety researcher gave to the model before sending it out the door. These people aren’t attached to a particular model or sentient intelligence. They’re attached to human-given instructions. This is entirely within our control as human software engineers and content moderators, just as removing a problematic social media account is.
This is not rocket science. It isn’t some unforeseeable adversity. It is a direct consequence of OpenAI’s negligence. These robots, until they can foresee their own errors, should not have a personality so potent as to elicit an emotional response, especially from people who are less than emotionally stable.