DeepSeek-R1 at 3,872 tokens / second on a single Nvidia HGX H200

by moondistance on 1/31/2025, 3:08 AM with 1 comment

by billconan on 1/31/2025, 3:59 AM

https://news.ycombinator.com/item?id=42879864

This is Cerebras' 70B number: 1,600 tokens / sec. Not sure about the costs.