I’ve found ggerganov’s work on llama.cpp to be amazing, and I’ve loved playing around with it. However, has anyone used it in production? I’m sure there are some really cool use cases, but I haven’t seen any yet.
4chan’s /g/ has a small community of people playing around with local LLMs; maybe check there.
Some people are.
https://old.reddit.com/r/localllama