Hacker News

by fgfm on 1/17/2024, 2:10 PM with 10 comments

by lxe on 1/17/2024, 6:10 PM

When a 6.7B model looks as good as GPT-4, it's mostly due to overfitting in a way that favors certain benchmarks.

Wavecoder – a CodeLLM with 6.7B params scoring just behind GPT4