```
try:
    answer = chain.invoke(question)
    # print(answer)  # raw JSON output
    display_answer(answer)
except Exception as e:
    print(f"An error occurred: {e}")
    # Fallback: re-run the same chain without the output parser to see the raw text
    chain_no_parser = prompt | llm
    raw_output = chain_no_parser.invoke(question)
    print(f"Raw output:\n\n{raw_output}")
```
Wait, are you calling the LLM again when parsing fails, just to get back what the LLM has already sent you?
The whole thing is not difficult to do if you call the API directly without LangChain; that would also help you avoid this kind of inefficiency.
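You don't even need to drop LangChain to fix it. A minimal sketch of the single-call approach, assuming the article's `prompt`, `llm`, `question`, and `display_answer` objects, and that `llm` returns a plain string (as LangChain's llama.cpp wrapper does; for a chat model you'd parse `raw_output.content` instead):

```
from langchain_core.output_parsers import JsonOutputParser

parser = JsonOutputParser()
chain_no_parser = prompt | llm                 # no parser inside the chain

raw_output = chain_no_parser.invoke(question)  # exactly one LLM call
try:
    answer = parser.parse(raw_output)          # parse the text we already have
    display_answer(answer)
except Exception as e:
    print(f"Parsing failed: {e}")
    print(f"Raw output:\n\n{raw_output}")
```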
People still use langchain?
The use case in the article is relatively simple. For more complex structures, BAML (https://www.boundaryml.com/) is a better option.
It's right there. In the screenshot in the blog post. Grammar > 'JSON Schema + Convert'. That's what structured output is.
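That UI button maps onto the server API too: llama.cpp converts a JSON Schema into a GBNF grammar and constrains decoding with it. A rough sketch against the server's native `/completion` endpoint; the `json_schema` field and the default port are assumptions about your build, so treat this as illustrative:

```
import json
import requests

schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "year": {"type": "integer"}},
    "required": ["name", "year"],
}

resp = requests.post(
    "http://localhost:8080/completion",  # llama.cpp server's native endpoint
    json={
        "prompt": "Return the name and release year of the first Llama model as JSON.",
        "json_schema": schema,  # the server converts this to a grammar internally
        "n_predict": 128,
    },
)
print(json.loads(resp.json()["content"]))
```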
... it's going to be september forever, isn't it?
The version of llama.cpp that Llamafile uses supports structured outputs. Don't waste your time with bloat like langchain.
Think about why langchain has dozens of adapters that are all targeting services that describe themselves as OAI compatible, Llamafile included.
I'd bet you could point some of them at Llamafile and get structured outputs.
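For example, a hedged sketch assuming a Llamafile serving its default OpenAI-compatible API at `http://localhost:8080/v1` (the structured-output options it accepts depend on the bundled llama.cpp version):

```
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key-required")

resp = client.chat.completions.create(
    model="LLaMA_CPP",  # Llamafile serves whatever model it was started with
    messages=[{"role": "user", "content": "Give me a llama fact as JSON with keys 'fact' and 'source'."}],
    response_format={"type": "json_object"},  # ask the server to constrain output to valid JSON
)
print(resp.choices[0].message.content)
```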
Note that they can be made 100% reliable when done properly. They're not done properly in this article.