Evaluating Multimodal LLMs Using the Google IO 2024 Puzzle

by simonbutton 3/15/2024, 7:10 PM with 2 comments

by maleton 3/15/2024, 8:02 PM

Surprising to see these models stumble on what at first glance seems like a simple task. It would be interesting to see how the non-vision models fare if you convert the problems to ASCII art.

by simonbutton 3/15/2024, 7:10 PM

GPT-4V, Claude 3 Opus and Gemini Ultra go head to head in solving the Google IO 2024 Puzzle