Show HN: LlamaExtract, a tool to automatically extract schema from documents

by pierre on 7/26/2024, 2:03 PM with 4 comments

We built LlamaExtract, a tool that allows you to automatically extract a data model from a collection of documents, and then reuse that data model (a JSON Schema) to extract data from documents.
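To make the "infer a schema once, then reuse it" idea concrete, here is a minimal sketch in plain Python. It is not LlamaExtract's API or method: it works on already-structured sample records rather than raw documents, and the helper names (`infer_schema`, `conforms`) and the invoice fields are hypothetical, chosen only for illustration.

```python
import json

# Map Python types to JSON Schema type names (a deliberately small subset).
_JSON_TYPES = {
    str: "string",
    bool: "boolean",
    int: "integer",
    float: "number",
    list: "array",
    dict: "object",
    type(None): "null",
}


def infer_schema(records):
    """Infer a minimal JSON Schema from a list of dicts (hypothetical helper)."""
    seen = {}
    for record in records:
        for key, value in record.items():
            seen.setdefault(key, set()).add(_JSON_TYPES.get(type(value), "string"))
    # A field is required only if every sample record contains it.
    required = sorted(k for k in seen if all(k in r for r in records))
    properties = {}
    for key in sorted(seen):
        types = sorted(seen[key])
        properties[key] = {"type": types[0] if len(types) == 1 else types}
    return {"type": "object", "properties": properties, "required": required}


def conforms(record, schema):
    """Check required keys and top-level types against the inferred schema."""
    if any(key not in record for key in schema["required"]):
        return False
    for key, value in record.items():
        spec = schema["properties"].get(key)
        if spec is None:
            continue  # unknown fields are tolerated in this sketch
        allowed = spec["type"] if isinstance(spec["type"], list) else [spec["type"]]
        actual = _JSON_TYPES.get(type(value), "string")
        # In JSON Schema, an integer is also a valid "number".
        if actual not in allowed and not (actual == "integer" and "number" in allowed):
            return False
    return True


# Step 1: infer the data model from a few sample records (made-up data).
samples = [
    {"invoice_id": "A-1", "total": 12.5},
    {"invoice_id": "A-2", "total": 8.0, "paid": True},
]
schema = infer_schema(samples)
print(json.dumps(schema, indent=2))

# Step 2: reuse the same schema to check newly extracted records.
print(conforms({"invoice_id": "A-3", "total": 20}, schema))
print(conforms({"total": 20}, schema))
```

The point of the two-step split is that the schema becomes a reusable artifact: it can be reviewed or hand-edited once, then applied to every later document.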

Available as a Python library and as an API.

Announcement blog: https://www.llamaindex.ai/blog/introducing-llamaextract-beta...

by verdverm on 7/26/2024, 2:44 PM

When you say "available as a Python library", do you mean the LlamaCloud API wrapped in a Python package? This doesn't seem to be something we can use without a LlamaCloud account. When I hear "Python library", I expect something I could run locally.

Is there any documentation or a paper on the methods?

Is this intended to be a proprietary service?

by cheesyFishes on 7/26/2024, 2:22 PM

Pretty neat initial launch. What's top of mind to add to it?

by BinaryBrain on 7/26/2024, 2:14 PM

Does it handle multiple input documents for extraction?