Hacker News

by csantini on 4/28/2025, 11:09 AM with 1 comment

by zahlman on 4/28/2025, 8:33 PM

This is a great idea, but it doesn't "force correct code" any more than unit tests do when applied to human-written code. It also can't force the code to have other desirable properties such as readability (granted that this doesn't usually seem to be a problem with LLM output).
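To illustrate the point about tests not forcing correctness, here is a minimal sketch using only the standard library `unittest` module. The function `is_leap_year`, the test class, and the helper `suite_passes` are hypothetical names made up for this example; none of this is taken from Unvibe. The implementation is deliberately wrong, yet the suite is green because no test exercises the case where it fails.

```python
import io
import unittest


def is_leap_year(year: int) -> bool:
    """Hypothetical implementation, deliberately incomplete.

    It ignores the century rule (years divisible by 100 are leap
    years only if they are also divisible by 400).
    """
    return year % 4 == 0


class LeapYearTests(unittest.TestCase):
    """A plausible-looking suite that never probes a non-leap century."""

    def test_common_cases(self):
        self.assertTrue(is_leap_year(2024))
        self.assertFalse(is_leap_year(2023))
        self.assertTrue(is_leap_year(2000))  # passes by coincidence


def suite_passes() -> bool:
    """Run the suite quietly and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LeapYearTests)
    runner = unittest.TextTestRunner(stream=io.StringIO())
    return runner.run(suite).wasSuccessful()


if __name__ == "__main__":
    print("suite passes:", suite_passes())
    # 1900 was not a leap year, so a correct function would return False.
    print("is_leap_year(1900):", is_leap_year(1900))
```

Whether the author is a person or an LLM, a passing run here only shows that the code agrees with the cases someone thought to write down.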

Unvibe: A Python Test-Runner that forces LLMs to generate correct code