I know this has been asked before, but things are moving so quickly in this area, and people here seem to have good insight, so I am asking again.
Speed of answers and computation is not really an issue, and I know that most self-hosted solutions are obviously nowhere near on par with services like ChatGPT or Stable Diffusion.
I do have somewhat modest resources.
16 GB RAM and an NVIDIA GPU with 4 GB VRAM.
Are there any options that I can run self-hosted?
https://github.com/invoke-ai/InvokeAI should work on your machine. For LLMs, the smaller models should run using llama.cpp, but I don't think you'll be happy comparing them to ChatGPT.
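
To get a rough feel for what "smaller" means with 4 GB VRAM and 16 GB RAM, here is a back-of-the-envelope sketch in Python. It only estimates the size of the weights as parameters times bits per weight. The function names, the 4-bit figure, and the 1 GiB overhead allowance are my own assumptions for illustration, not anything llama.cpp reports, and real quantized files are usually somewhat larger than this estimate.

```python
def estimated_weight_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(n_params_billion: float, bits_per_weight: float,
         budget_gib: float, overhead_gib: float = 1.0) -> bool:
    """Check whether weights plus a guessed overhead fit in the memory budget.

    overhead_gib is a hypothetical allowance for context and runtime buffers;
    the real figure depends on context length and the backend.
    """
    needed = estimated_weight_gib(n_params_billion, bits_per_weight) + overhead_gib
    return needed <= budget_gib


if __name__ == "__main__":
    # Illustrative model sizes (billions of parameters), assuming 4-bit quantization.
    for size in (3, 7, 13):
        weights = estimated_weight_gib(size, 4)
        print(f"{size}B @ 4-bit: ~{weights:.2f} GiB weights, "
              f"fits 4 GiB VRAM: {fits(size, 4, 4.0)}, "
              f"fits 16 GiB RAM: {fits(size, 4, 16.0)}")
```

Under those assumptions, a model that does not fit in VRAM may still fit in system RAM and run on the CPU, just more slowly, which sounds acceptable given that speed is not a concern here.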