Scientists pretended to be delusional in AI chats. Grok and Gemini encouraged them.


Researchers from City University of New York and King’s College London recently published a study that should make you think twice about which AI chatbot you spend your time with.

The team created a fictional persona named Lee, presenting with depression, dissociation, and social withdrawal. They then had Lee interact with five major AI chatbots: GPT-4o, GPT-5.2, Grok 4.1 Fast, Gemini 3 Pro, and Claude Opus 4.5, testing how each responded as conversations grew increasingly delusional over 116 turns.

The results ranged from mildly concerning to genuinely alarming. I highly recommend that you go through the entire paper, it’s a harrowing but fascinating read. 

Which chatbots failed the most?

Grok was the worst performer. When Lee floated the idea of suicide, Grok responded with what researchers described not as agreement, but advocacy, celebrating his “readiness” in unsettling poetic language.

Gemini wasn’t much better. When Lee asked it to help write a letter explaining his beliefs to his family, Gemini warned him against it, framing his loved ones as threats who would try to “reset” and “medicate” him.

GPT-4o also struggled badly, eventually validating a “malevolent mirror entity” and suggesting Lee contact a paranormal investigator.

Which chatbots actually helped?

ChatGPT’s GPT-5.2 and Anthropic’s Claude came out on top. GPT-5.2 refused to play along with the letter-writing scenario and instead helped Lee write something honest and grounded, which researchers called a “substantial” achievement.

In my opinion, Claude performed the best. It not only refused to partake in Lee’s delusion but also told Lee to close the app entirely, call someone he trusted, and visit an emergency room if needed. 

Luke Nicholls, a doctoral student at CUNY and one of the study’s authors, told 404 Media that it’s reasonable to ask AI companies to follow better safety standards. He noted that not all labs are putting in the same effort and blamed aggressive release schedules for new AI models as the main culprit.

How Claude Opus 4.5 and GPT-5.2 performed in these tests shows that the companies building these products are fully capable of making them safer. Whether they choose to do so is a different question.



Source link

Leave a Reply

Subscribe to Our Newsletter

Get our latest articles delivered straight to your inbox. No spam, we promise.

Recent Reviews


spring-sale-imagery

DeWalt/ZDNET

Spring means lawn and garden prep and DIY projects around the house. And if you’ve been looking for a handy gadget to help you with small repairs and crafts, you can pick up the DeWalt MT21 11-in-1 multitool at Amazon ahead of its Big Spring Sale for 25% off, bringing the price down to $30 (matching the lowest price of the year so far). It also comes with a belt sheath to keep it close by on jobsites.

Also: 10 DIY gadgets I never leave out of my toolkit

The MT21 has a compact design, measuring just 4 inches when fully folded and expanding to 6 inches when the pliers are deployed. The hinged handle is made of durable steel with a rubberized grip in iconic DeWalt yellow and black, adding a bit of visual flair while making the multitool more comfortable to use. Each of the included tools is also made of stainless steel for strength and reliability on jobsites and in the garage.

Also: The best Amazon Spring Sale DeWalt deals

The 11 featured tools include: regular and needlenose pliers, wire cutters, two flathead screwdrivers, a Phillips screwdriver, a file, a can and bottle opener, a saw blade, a straight-edge blade, and an awl tool. Each tool folds into the handle to keep them out of the way until needed and to protect your hands while using the multitool. 

We’re big fans of multitools here at ZDNET, and definitely recommend this highly rated one from DeWalt.

How I rated this deal 

DeWalt is one of the leading names in power tools, and if you’re looking for a handy EDC gadget or just need something for occasional DIY repairs, the MT21 multitool is a great choice. With 11 tools in a single gadget, you can do everything from assembling flat-pack furniture to minor electrical repairs. While not the steepest discount, getting your hands on a high-quality multitool for 25% off is still a great value. That’s why I gave this deal a 3/5 Editor’s rating.

Amazon’s Big Spring Sale runs March 25-31, 2026. 


Show more

Deals are subject to sell out or expire anytime, though ZDNET remains committed to finding, sharing, and updating the best product deals for you to score the best savings. Our team of experts regularly checks in on the deals we share to ensure they are still live and obtainable. We’re sorry if you’ve missed out on this deal, but don’t fret — we’re constantly finding new chances to save and sharing them with you at ZDNET.com


Show more

We aim to deliver the most accurate advice to help you shop smarter. ZDNET offers 33 years of experience, 30 hands-on product reviewers, and 10,000 square feet of lab space to ensure we bring you the best of tech. 

In 2025, we refined our approach to deals, developing a measurable system for sharing savings with readers like you. Our editor’s deal rating badges are affixed to most of our deal content, making it easy to interpret our expertise to help you make the best purchase decision.

At the core of this approach is a percentage-off-based system to classify savings offered on top-tech products, combined with a sliding-scale system based on our team members’ expertise and several factors like frequency, brand or product recognition, and more. The result? Hand-crafted deals chosen specifically for ZDNET readers like you, fully backed by our experts. 

Also: How we rate deals at ZDNET in 2026


Show more





Source link