Ask HN: RAG as a Service?

by Difwif on 5/22/2024, 1:59 PM with 5 comments

I know the various ways to build a RAG system manually on top of a vector store, but I'm working on a new project that could benefit from a RAG-assisted version of GPT-4o.

I'm aware of the OpenAI Assistants API (still pretty early stage), but I'm wondering if there are any other options out there?

To be specific: I'm looking for an off-the-shelf solution where I can easily drop in a bunch of documents or a list of URLs and then ask search- or summarization-type questions about the dataset. I'm trying to avoid building it since it's a bit of a distraction from my primary objective. Most of what I've found is pretty simple and DIY, and still requires you to figure out document parsing and chunking. I just need something that works for existing web pages (with some non-static content).
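For readers unfamiliar with the chunking step mentioned above, here is a minimal sketch of one common approach: fixed-size word windows with some overlap. The function name `chunk_words` and the default sizes are illustrative assumptions, not taken from any product named in this thread, and hosted services may well chunk differently (by tokens, sentences, or document structure).

```python
def chunk_words(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into windows of `size` words; each window repeats the
    last `overlap` words of the previous one (hypothetical helper)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


if __name__ == "__main__":
    # Assumed sample input: any plain text already extracted from a page.
    for chunk in chunk_words("one two three four five six seven", size=4, overlap=1):
        print(chunk)
```

The overlap is there so that a sentence cut at a window boundary still appears whole in at least one chunk; the right sizes depend on the embedding model and the documents.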

My best solution so far is to download pages after a delay using a Chrome extension and load them into the OpenAI Assistants API, but that's limited to returning 20 chunks and can be a bit weak on summarization.

by kkoppenhaver on 5/22/2024, 9:20 PM

Someone I know runs https://docsbot.ai/ and that seems like maybe what you're talking about?

by pants2 on 5/22/2024, 6:13 PM

I have looked around quite a bit but haven't really found anything that strikes the right balance of features, cost, and effectiveness for me.

I've been sort of building my own system in Go, which is already a big improvement on the Python-based solutions, but it's nowhere near ready for this type of thing. It does use go-colly and go-readability for web scraping, though.

by muzani on 5/22/2024, 3:00 PM

AWS Bedrock does it, but we only had about 300 rows of CSV, so it was faster to roll our own. I spent a couple of days wrestling with permissions on Bedrock, but if you're familiar with AWS, you may find it useful.
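To illustrate why rolling your own can be quick at that scale, here is a minimal sketch that scores every CSV row against a query and returns the best matches. It is not the commenter's implementation: the bag-of-words cosine similarity stands in for whatever embedding model a real system would likely use, and the names `top_rows` and `SAMPLE`, along with the sample data, are made up for the example.

```python
import csv
import io
import math
import re
from collections import Counter

# Hypothetical sample data; a real file would be read from disk instead.
SAMPLE = """question,answer
How do I reset my password?,Use the reset link on the login page.
What are the support hours?,Support is open 9am to 5pm on weekdays.
How do I cancel my plan?,Go to billing settings and choose cancel.
"""


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_rows(csv_text: str, query: str, k: int = 3) -> list[dict]:
    """Return up to k rows with a non-zero score, most similar first."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    q = Counter(tokenize(query))
    scored = []
    for row in rows:
        # Join all string cells of the row into one searchable text.
        text = " ".join(v for v in row.values() if isinstance(v, str))
        scored.append((cosine(q, Counter(tokenize(text))), row))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for score, row in scored[:k] if score > 0]


if __name__ == "__main__":
    for row in top_rows(SAMPLE, "reset password", k=2):
        print(row["question"], "->", row["answer"])
```

With a few hundred rows, a brute-force scan like this runs quickly, and the matched rows can be pasted straight into the model's prompt as context; a vector store only starts to pay off at much larger sizes.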

by ianpurton on 5/22/2024, 4:16 PM

Take a look at Bionic https://bionic-gpt.com/

by aregsar on 5/22/2024, 2:55 PM

David Sacks’ glueai app uses ragie.ai under the hood.