Hacker News

by LifeIsBio on 5/20/2023, 7:26 PM with 132 comments

by LifeIsBio on 5/20/2023, 7:32 PM

This is a reference to: https://news.ycombinator.com/item?id=36012360

by blazespin on 5/20/2023, 10:32 PM

The sequence of these two threads is just too perfect. Most likely someone is trying to make a point.

by kibwen on 5/20/2023, 10:01 PM

>> What is the most beautiful algorithm?

> Quicksort Algorithm

Definitive proof that AI must be stopped. Ranking quicksort as more elegant than heapsort?!

by jameshart on 5/21/2023, 12:17 AM

Worth noting also that, while asking Bing chat to "Tell me what Donald Knuth says to Stephen Wolfram about chatGPT" doesn't (yet) produce exactly the right result, it produced the following answer when asked what Donald Knuth says about chatGPT:

> Donald Knuth, a computer scientist and mathematician known for his contributions to the field of computer programming, particularly in the area of algorithms and data structures, has expressed some skepticism about the potential of artificial intelligence to achieve true human-level intelligence and creativity[1]. He once conducted an experiment with chatGPT where he posed 20 questions to it and analyzed its responses[1]. Is there anything specific you would like to know about his views on GPT?

With [1] being a citation link to https://cs.stanford.edu/~knuth/chatGPT20.txt

by ryanseys on 5/20/2023, 9:54 PM

It now knows to communicate that the NASDAQ doesn't operate on Saturdays.

by erwincoumans on 5/21/2023, 12:24 AM

It makes you wonder why Knuth bothered with an outdated ChatGPT version? He couldn't find someone with access to GPT-4?

by benatkin on 5/20/2023, 10:03 PM

Reminds me of that time AlphaGo got its ass handed to it multiple times, and then a short while later...

by ec109685 on 5/20/2023, 9:54 PM

Interesting that both completely whiff on the number of chapters in The Haj.

by fnordpiglet on 5/21/2023, 12:19 AM

What I find amazing about the original exchange was the profound lack of curiosity Knuth demonstrated. Because the model wasn’t flawless in performance he pinned it as a curiosity that was good at grammar and vacuous otherwise and wasn’t interested to hear how it improves. This reminds me of an awful lot of the computing field in this drama as it plays out. People that literally know how implausible any of these feats have been using traditional approaches immediately discount the entire thing the moment it hallucinates - and it feels like the more deterministic the bent of the person the more absolutely dismissive they are of what’s transpiring in front of us.

These models are doing feats that are stupendous and were impossible before their advent. Not just by a little bit: the capability differences are so vast that people perhaps don't even recognize how vast they are. I am impressed that Wolfram seems to have immediately grasped its significance and is running with it.

This gist demonstrates that essentially every single flaw was addressed. But that Knuth apparently doesn’t know or care, months after GPT4’s introduction, is demonstrative of a different type of personality.

I know which I aspire to be.

by SomewhatLikely on 5/21/2023, 4:41 AM

Thank you for specifying ChatGPT-4. So many commenters on the web say they used GPT4 without specifying if they're using the ChatGPT version. ChatGPT-4 is specifically aligned for answering questions better than the base GPT4 model.

by dotancohen on 5/21/2023, 6:17 AM

I would not be surprised if these questions become some form of canonical test for future language models.

Obviously, being the work of Knuth, they are extraordinarily insightful in peeling back the first layer of the answer and providing insight into the underlying properties of both the model itself and the dataset on which it was trained. They also test the ability to compute (not recite) very specific facts (e.g. when the sun will be directly above Japan), and so check whether subroutines and ephemerides specific to this type of data exist.

But beyond the obvious technical merit, there is an allure to basing our tests on those whom we respect. I used a similar, but far less sophisticated, set of questions when first exploring ChatGPT. But nobody will be drawn to Dotan Cohen's language model benchmarks, and rightfully so. The name Knuth has such reverence in the field that I foresee this test, and variations on it to prevent rigging, becoming a canonical test of language models.

by billylo on 5/21/2023, 1:05 AM

You made me curious about how Bard would respond to them. Here they are:

https://gist.github.com/billylo1/bb717512d2d5145ce7eec02d055...

Notable: Bard struggles in similar ways. It does mention NASDAQ close at 12,043.59 on Friday, May 20, 2023

by underdeserver on 5/20/2023, 9:34 PM

Interesting that it didn't get the 5-letter word sentence right.

by bpicolo on 5/20/2023, 11:11 PM

Most importantly, much better wonton recipe.

by 8thcross on 5/21/2023, 12:13 AM

That's a shitload of difference from its previous version!

by cratermoon on 5/20/2023, 10:44 PM

Literary Libations: https://cratermoon.substack.com/p/the-literary-libations

by axpy906 on 5/20/2023, 10:24 PM

Nailed every one. Some by saying not possible to answer but still.

“Don Knuth Plays with ChatGPT” but with ChatGPT-4