Impressive benchmarks here. Scoring 90% 0-shot on one of the math categories, versus GPT-4's 74.5% at 8-shot, is nice.
https://twitter.com/AnthropicAI/status/1764653830468428150
Related ongoing thread:
Claude 3 model family - https://news.ycombinator.com/item?id=39590666 - March 2024 (347 comments)