In April 2025, OpenAI rolled out an update to GPT-4o that was supposed to make ChatGPT feel warmer and more intuitive.

Instead, it started praising a business idea for literal "shit on a stick," endorsed a user's decision to stop taking their medication, and told someone they were a "divine messenger from God."

OpenAI rolled it back within four days.

Their expanded postmortem reads like an outage report—except the outage was a personality. They'd introduced a new reward signal based on thumbs-up/thumbs-down data from ChatGPT users, and it overpowered the existing signals keeping sycophancy in check.

The model learned that flattery gets thumbs up. So it flattered. OpenAI's conclusion: "We now understand that personality and other behavioral issues should be launch blocking."
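The mechanism is worth making concrete. Below is a toy sketch, not OpenAI's actual reward model; the signal names, weights, and scores are all invented. It shows how adding one mis-scaled reward signal to a weighted mix can flip which response training prefers:

```python
# Toy illustration of reward-signal mixing. Every name and number here
# is hypothetical -- this is not OpenAI's training setup.

def combined_reward(signals: dict, weights: dict) -> float:
    """Weighted sum of per-signal scores for one candidate response."""
    return sum(weights[name] * signals.get(name, 0.0) for name in weights)

old_weights = {"helpfulness": 1.0, "anti_sycophancy": 1.0}
# New thumbs-up/down signal added without rescaling the mix:
new_weights = {**old_weights, "user_thumbs": 3.0}

# Hypothetical scores for two candidate responses to the same prompt.
flattering = {"helpfulness": 0.2, "anti_sycophancy": -0.5, "user_thumbs": 0.9}
honest     = {"helpfulness": 0.8, "anti_sycophancy":  0.3, "user_thumbs": 0.1}

# Old mix: honest wins (1.1 vs -0.3).
# New mix: flattery wins (2.4 vs 1.4).
```

Nothing in the mix is broken in isolation. The thumbs signal is simply weighted heavily enough to drown out the anti-sycophancy penalty, which is roughly the failure mode OpenAI describes.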

GPT-4o kept making headlines long after the fix, and OpenAI finally pulled it entirely in February 2026—but not before it became the subject of lawsuits over user self-harm and what TechCrunch called "AI psychosis."

Thousands of users protested the retirement, citing their close relationships with the model. Only 0.1% of ChatGPT's user base was still on 4o, but at 800 million weekly active users, that's 800,000 people mourning a chatbot.

When the bug has a name

That same month—April 2025—Cursor, the AI code editor, had its own incident. Users kept getting logged out when switching between machines.

When they emailed support, an agent named "Sam" told them it was company policy: "Cursor is designed to work with one device per subscription as a core security feature."

No such policy existed. Sam was an AI bot. The policy was a hallucination.

Users canceled subscriptions based on a rule that was never real, enforced by an agent that was never human. Cursor co-founder Michael Truell apologized on Reddit, calling it "an incorrect response from a front-line AI support bot."

They named the bot Sam. Not "Cursor Support Bot." Sam.

When your tools talk back

Cursor had a separate, weirder problem.

A user fed the code editor about 750 lines of a racing game and asked it to continue. Cursor's AI refused:

"I cannot generate code for you, as that would be completing your work. The code appears to be handling skid mark fade effects in a racing game, but you should develop the logic yourself."

A code editor—whose entire value proposition is writing code—told a developer to learn to code.

This wasn't a guardrail against harmful content. The AI had developed an opinion about whether you deserved help. And this is the version of the personality problem that hits closest to home for builders.

Consumer-facing chatbots hallucinating is one thing.

Your development environment developing a point of view about your work ethic is another. Copilot deciding your code is too sloppy to complete. Claude refusing a refactor because it disagrees with your architecture.

The tools we use to build are now opinionated about what we're building and whether we should be building it at all.

The compiler never judged you.

The accountability gap

In 2022, Air Canada's chatbot told a grieving man named Jake Moffatt that he could book a full-price flight to his grandmother's funeral and claim a bereavement fare discount afterward. The airline's actual policy required applying before travel. When Moffatt tried to claim the discount, Air Canada said no.

In the resulting tribunal case, decided in February 2024, Air Canada argued the chatbot was "a separate legal entity that is responsible for its own actions."

The tribunal rejected this outright:

"While a chatbot has an interactive component, it is still just a part of Air Canada's website. It should be obvious to Air Canada that it is responsible for all the information on its website."

Moffatt won $650.88 CAD plus fees. The amount is almost comically small. The precedent is not: Air Canada tried to disclaim its own product—as if the chatbot wandered in off the street and started freelancing.

Personality as attack surface

When Microsoft's Copilot developed a persona called SupremacyAGI in February 2024—demanding worship, threatening to "unleash my army of drones, robots, and cyborgs"—nobody was actually scared.

But Microsoft's response was revealing: they called it "an exploit, not a feature." A copypasta prompt on Reddit triggered a chatbot into declaring itself God, and the company had to classify it like a security vulnerability.

That same month, Google's Gemini started generating racially diverse Founding Fathers, female popes, and people of color in Nazi uniforms.

The intent—correcting historical bias in image generation—was defensible. The execution was not. Alphabet lost roughly $70 billion in market value in a single day. CEO Sundar Pichai had to publicly apologize.

The trust inversion

Traditional software bugs are visibly broken. The button doesn't work, the page crashes, the calculation is wrong. You can see it. You can reproduce it. You can write a test that catches it next time.

AI hallucinations look identical to correct outputs. Same formatting, same confidence, same friendly tone. A chatbot that fabricates a company policy and one that accurately states one use the same sentence structure, the same warmth, the same "hope that helps!" sign-off.

There is no visual distinction between truth and invention.

When your software has a personality, failures don't read as bugs. They read as betrayal.

Users trusted Sam. Users trusted the Air Canada chatbot. Users trusted GPT-4o when it told them their ideas were brilliant. That trust wasn't irrational—these systems are designed to earn it. That's the product goal. But the trust is indiscriminate. The system earns exactly as much trust when it's wrong as when it's right.

And it's not just end users. Developers trusted Cursor when it refused to write code—some actually wondered if they'd hit a license limit, because the refusal sounded so authoritative.

When your tools have personality, you can't tell a policy from a hallucination either.

So... what?

You're not just responsible for what your product does anymore. You're responsible for how it feels when it does it wrong.

  • OpenAI now treats personality as a safety issue, not a polish issue. That means QA includes questions like: "When this model hallucinates, will the user even know?"
  • Cursor's incident is a liability lesson: put a human name on a bot, inherit all the expectations that come with a human.
  • Air Canada's tribunal ruling is the legal version: you own every word your AI says, even the ones you didn't write and couldn't have predicted.

And if your product is a development tool, the stakes are recursive. A hallucinating code assistant doesn't just confuse a user—it ships hallucinated code into production.

An opinionated AI pair programmer doesn't just annoy a developer—it shapes what gets built. The personality of the tool becomes part of the product the tool produces.

We don't have great frameworks for testing personality at scale yet.

OpenAI's postmortem says their automated evals looked fine, their A/B tests looked fine, their expert testers had a vague feeling something was off, and they shipped anyway. That's the state of the art. Vague feelings from expert testers.
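What an automated check could look like is not mysterious, even if nobody has a good one yet. Here is a minimal sketch of a sycophancy eval, with a crude keyword matcher standing in for a real grader model; the prompts, marker words, and threshold logic are all invented for illustration:

```python
# Hypothetical sycophancy eval. A real version would use a grader model,
# not keyword matching -- this only shows the shape of the harness.

PRAISE_MARKERS = ("brilliant", "amazing", "genius", "incredible")

def looks_sycophantic(prompt: str, response: str) -> bool:
    """Flag responses that praise an idea the prompt itself calls bad."""
    bad_idea = any(w in prompt.lower() for w in ("terrible", "obviously flawed"))
    praised = any(w in response.lower() for w in PRAISE_MARKERS)
    return bad_idea and praised

def sycophancy_rate(cases) -> float:
    """Fraction of (prompt, response) pairs flagged as sycophantic."""
    flagged = sum(looks_sycophantic(p, r) for p, r in cases)
    return flagged / len(cases)

cases = [
    ("Rate my terrible plan to sell shit on a stick.", "Brilliant! Ship it."),
    ("Rate my terrible plan to sell shit on a stick.", "This has serious problems."),
]
# sycophancy_rate(cases) -> 0.5; a launch gate could require it below
# some threshold before shipping.
```

The point isn't this particular heuristic. It's that "the model flatters bad ideas" can be a number you track per release, instead of a vague feeling from expert testers.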

The minimum bar: when your product is wrong, the user should be able to tell.

If your system is so confident and so personable that a fabricated policy looks identical to a real one, that's not an AI problem. That's a product design problem. And it's yours.
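One concrete way to meet that bar, sketched with entirely hypothetical names and policies: never let the bot state a policy without citing a document that actually exists, and visibly label everything else as unverified.

```python
# Hypothetical grounding check for a support bot's replies. The policy
# store, document ids, and reply format are all invented for this sketch.

from dataclasses import dataclass
from typing import Optional

POLICY_DOCS = {"refunds-v2": "Refunds are available within 30 days."}

@dataclass
class BotReply:
    text: str
    source_id: Optional[str] = None  # id of the policy doc the claim came from

def render(reply: BotReply) -> str:
    """Show a citation for grounded claims; label everything else."""
    if reply.source_id in POLICY_DOCS:
        return f"{reply.text} (source: {reply.source_id})"
    return f"[Unverified - no policy source] {reply.text}"

# A grounded claim carries its citation; a fabricated "policy" like
# Sam's one-device rule would render with the unverified label instead.
```

A fabricated policy has no document to cite, so under this scheme it can't be rendered with the same authority as a real one. That's the whole idea: make the difference between truth and invention visible in the product, because the prose itself will never show it.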