The non-archive.org submission, where simonw is actually watching and commenting: https://news.ycombinator.com/item?id=43462317
Impressive use of the reasoner CoT distillation method applied to DeepSeek R1. MIT license for the weights. Thanks, DeepSeek!
The only open AI in town
The headline seems to imply that the full 641GB model is running at >20tok/sec on the Mac Studio, but the blog says:
>The model only came out a few hours ago and MLX developer Awni Hannun already has it running at >20 tokens/second on a 512GB M3 Ultra Mac Studio ($9,499 of ostensibly consumer-grade hardware) via mlx-lm and this mlx-community/DeepSeek-V3-0324-4bit 4bit quantization, which reduces the on-disk size to 352 GB.
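To put numbers on that distinction, here is a rough back-of-the-envelope check using only the figures quoted above (641 GB full, 352 GB 4-bit, 512 GB of memory). It is a sketch, not a measurement: it assumes all three figures use the same GB unit and ignores KV cache and OS overhead. The variable and function names are made up for illustration.

```python
# Sizes in GB, taken from the comment and the quoted blog post.
FULL_MODEL_GB = 641      # full model size mentioned in the comment
QUANT_4BIT_GB = 352      # on-disk size of the 4-bit quantization
MAC_STUDIO_RAM_GB = 512  # memory of the M3 Ultra Mac Studio


def fits_in_memory(model_gb: float, ram_gb: float) -> bool:
    """Crude check: weights only, no KV cache or OS overhead (assumption)."""
    return model_gb < ram_gb


# How large the 4-bit version is relative to the full model.
size_ratio = QUANT_4BIT_GB / FULL_MODEL_GB
# Memory left over after loading the 4-bit weights.
headroom_gb = MAC_STUDIO_RAM_GB - QUANT_4BIT_GB

print(f"4-bit size relative to full model: {size_ratio:.1%}")
print(f"Full model fits in 512 GB: {fits_in_memory(FULL_MODEL_GB, MAC_STUDIO_RAM_GB)}")
print(f"4-bit model fits in 512 GB: {fits_in_memory(QUANT_4BIT_GB, MAC_STUDIO_RAM_GB)}")
print(f"Headroom with 4-bit weights: {headroom_gb} GB")
```

By these numbers the full 641 GB model would not fit in 512 GB at all, so the >20 tokens/second figure can only refer to the 4-bit quantization.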