Maybe ChatGPT has some pre-frontal cortex problems

by solresol on 1/11/2025, 11:17 PM with 8 comments

by MisterKent on 1/12/2025, 12:10 AM

This is a really odd way to test the capabilities of an LLM. First, most photos of clocks show 10:10, since the watches in the training data are usually set to 10:10 (in order to better sell watches, etc.).

Second, I don't think the photo generation aspect of ChatGPT is being marketed or presented as a problem-solving AI.

by chomp on 1/12/2025, 2:57 AM

I like the part where the AI couldn’t be trusted to draw a clock, so we trusted it to psychoanalyze the incorrect clock

by solresol on 1/11/2025, 11:17 PM

I administered the Clock Drawing Test (CDT) to ChatGPT and got Claude to diagnose what was wrong with the "patient" based on the results.

There are signs of pre-frontal cortex damage or early stage dementia.

by pnm45678 on 1/12/2025, 12:19 AM

Here's the thing (which you probably knew going in): generative AI is well known to be terrible at drawing specific times on clock faces.

This is down to the training data. It has been trained on a huge number of images.

That includes advertising. For whatever reason, wristwatch manufacturers tend to set watches to 10:10 in ads, almost without exception. Perhaps it's just a nice-looking time, or it's good for comparison purposes.

Simply Google "wrist watch" and you'll see.

So, these generative models have a huge bias towards 10:10 on clock faces, because that's what all the clocks they've been trained on look like.

by airstrike on 1/12/2025, 12:17 AM

FWIW, Claude 3.5 Sonnet got the SVG right on the first try: https://claude.site/artifacts/8dedf16e-b861-4497-96e2-872773...

Prompt was just "create an svg of a clockface with the time being 10 past 11"
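For reference, the geometry behind "10 past 11" is simple enough to check by hand. The following is a minimal sketch, not the code from the linked artifact: the function names `hand_angles` and `clock_svg`, the default size, and the hand lengths are all assumptions made for illustration. It assumes angles measured in degrees clockwise from 12 o'clock, with the hour hand advancing 0.5 degrees per minute.

```python
import math

def hand_angles(hour, minute):
    """Return (hour_angle, minute_angle) in degrees, clockwise from 12 o'clock."""
    minute_angle = minute * 6.0                      # 360 degrees / 60 minutes
    hour_angle = (hour % 12) * 30.0 + minute * 0.5   # 30 degrees per hour, plus drift
    return hour_angle, minute_angle

def clock_svg(hour, minute, size=200):
    """Build a bare-bones SVG clock face (hypothetical helper, no numerals or ticks)."""
    cx = cy = size / 2
    r = size * 0.45
    hour_angle, minute_angle = hand_angles(hour, minute)

    def tip(angle_deg, length):
        # SVG's y axis points down, so "up" (12 o'clock) means subtracting from cy.
        a = math.radians(angle_deg)
        return cx + length * math.sin(a), cy - length * math.cos(a)

    hx, hy = tip(hour_angle, r * 0.5)    # short hour hand
    mx, my = tip(minute_angle, r * 0.8)  # longer minute hand
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">'
        f'<circle cx="{cx}" cy="{cy}" r="{r}" fill="white" stroke="black"/>'
        f'<line x1="{cx}" y1="{cy}" x2="{hx:.1f}" y2="{hy:.1f}" stroke="black" stroke-width="4"/>'
        f'<line x1="{cx}" y1="{cy}" x2="{mx:.1f}" y2="{my:.1f}" stroke="black" stroke-width="2"/>'
        '</svg>'
    )

print(hand_angles(11, 10))  # (335.0, 60.0)
```

At 11:10 the hour hand sits at 335 degrees (just past the 11) and the minute hand at 60 degrees (on the 2). The stock 10:10 pose puts the hour hand at 305 degrees instead, so the two are easy to tell apart when checking a generated image.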

by pockybum522 on 1/12/2025, 5:05 AM

I love the concept of the article where one LLM can't draw a simple clock but the other one can accurately diagnose medical conditions from a hypothetical drawn image.

by batch12 on 1/12/2025, 12:17 AM

It has sentience problems...