Hacker News

by jahala on 3/8/2026, 5:35 PM with 2 comments

Smart code reading for humans and AI agents. Tilth is what happens when you give ripgrep, tree-sitter, and cat a shared brain.

—

v0.5.0 was about figuring out why models weren’t using tilth tools consistently — even when they were available.

Results vs baseline (built-in tools only):

Sonnet 4.6: -44% $/correct (84% → 94% accuracy, 31% fewer turns)

Opus 4.6: -39% $/correct (91% → 92% accuracy, 37% fewer turns)

Haiku 4.5: -38% $/correct (54% → 73% accuracy, 7% fewer turns)
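
For readers unfamiliar with the metric, here is a minimal sketch of how a "$/correct" figure and its relative change could be computed. It assumes $/correct means total spend divided by the number of correctly solved tasks; the benchmark README may define it differently. The function names and every number below are hypothetical and are not taken from the tilth benchmark.

```python
def cost_per_correct(total_cost_usd: float, correct_runs: int) -> float:
    """Total spend divided by the number of correct runs (assumed definition)."""
    if correct_runs <= 0:
        raise ValueError("need at least one correct run")
    return total_cost_usd / correct_runs


def relative_change(baseline: float, candidate: float) -> float:
    """Fractional change of candidate versus baseline (negative means cheaper)."""
    return (candidate - baseline) / baseline


# Hypothetical inputs, NOT benchmark data: 50 runs per configuration.
baseline = cost_per_correct(total_cost_usd=10.00, correct_runs=42)  # 84% accuracy
with_tool = cost_per_correct(total_cost_usd=6.00, correct_runs=47)  # 94% accuracy

change = relative_change(baseline, with_tool)
print(f"$/correct change: {change:+.0%}")
```

The point of the metric is that it rewards both lower spend and higher accuracy: a run that costs less but fails more often can still come out worse.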

—

https://github.com/jahala/tilth/

Full results: https://github.com/jahala/tilth/blob/main/benchmark/README.m...

PS: I don't have the budget to run the benchmark often (especially with Opus), so if any token whales have the capacity to run some benchmarks, please feel free to PR results.

by joknoll on 3/9/2026, 10:08 AM

I love the idea of improving coding agents not only by giving models more "cognitive" power, but also by improving the harness, where gains seem to be low-hanging fruit compared to advancing frontier models. This could also make older or smaller models viable for coding agents.

Show HN: Tilth v0.5.0 –> ~40% cheaper AI code navigation (160 runs, 3 models)