We built LlamaExtract, a tool that lets you automatically infer a data model from a collection of documents, and then reuse that data model (a JSON Schema) to extract data from documents.
Available as a Python library and as an API.
Announcement blog: https://www.llamaindex.ai/blog/introducing-llamaextract-beta...
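To make the two-step idea (infer a schema, then reuse it) concrete, here is a toy sketch in plain Python using only built-in types. It is not LlamaExtract's API or method: it works on already-structured dicts rather than documents, and `infer_schema` and `conforms` are hypothetical names made up for this illustration.

```python
# Toy illustration only: NOT LlamaExtract's API. All names here are hypothetical.

# Map Python value types to JSON Schema type names.
_JSON_TYPES = {
    str: "string",
    bool: "boolean",
    int: "integer",
    float: "number",
    list: "array",
    dict: "object",
    type(None): "null",
}


def _type_name(value):
    # Unknown Python types fall back to "string" (a simplification).
    return _JSON_TYPES.get(type(value), "string")


def infer_schema(records):
    """Infer a flat JSON Schema (object with typed properties) from example dicts."""
    seen = {}
    for record in records:
        for key, value in record.items():
            seen.setdefault(key, set()).add(_type_name(value))
    properties = {}
    for key in sorted(seen):
        names = sorted(seen[key])
        # Use a single type name when all examples agree, else a list of types.
        properties[key] = {"type": names[0] if len(names) == 1 else names}
    # A field is required only if every example record contains it.
    required = sorted(k for k in seen if all(k in r for r in records))
    return {"type": "object", "properties": properties, "required": required}


def conforms(record, schema):
    """Check a record against a flat schema produced by infer_schema."""
    if any(key not in record for key in schema["required"]):
        return False
    for key, value in record.items():
        spec = schema["properties"].get(key)
        if spec is None:
            continue  # fields not in the schema are allowed
        allowed = spec["type"] if isinstance(spec["type"], list) else [spec["type"]]
        name = _type_name(value)
        # In JSON Schema, "number" also accepts integers.
        if name not in allowed and not (name == "integer" and "number" in allowed):
            return False
    return True
```

Usage would look like `schema = infer_schema(sample_records)` once, then `conforms(new_record, schema)` for each new record. The schema is the reusable artifact, and you can edit it by hand between the two steps.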
Pretty neat initial launch. What's top of mind to add to it?
Does it handle multiple input documents for extraction?
When you say "available as a Python library", do you mean the LlamaCloud API wrapped in a Python package? This doesn't seem to be something we can use without a LlamaCloud account. When I hear "Python library", I expect something I could run locally.
Is there any documentation or a paper on the methods?
Is this intended to be a proprietary service?