Bitter Lesson is about AI agents

by ankit219 on 3/23/2025, 9:16 AM with 105 comments

by noosphr on 3/23/2025, 9:45 PM

For a blog post of 1,200 words, the Bitter Lesson has done more damage to AI research and funding than blowing up a nuclear bomb at NeurIPS would.

Every time I try to write a reasonable blog post about why it's wrong it blows up to tens of thousands of words and no one can be bothered to read it, let alone the supporting citations.

In the spirit of low-effort anecdata pulled from memory:

The raw compute needed to brute-force any problem can only be known after the problem is solved. There is no sane upper limit to how much computation, memory, and data any given task will take, and humans are terrible at estimating how hard tasks actually are. We are, after all, only 60 years late on the undergraduate summer project that was supposed to solve computer vision.

Today VLMs are the best brute-force approach to computer vision we have; they look like they will take a PB of state to solve the problem, and the compute needed to train them will be available sometime around 2040.

What do we do with the problems that are too hard to solve with the limited compute that we have? Lie down for 80 years and wait for compute to catch up? Or solve a smaller problem using specialized tricks that don't require a $10B supercomputer to build?

The bitter lesson is nothing of the sort: there is plenty of space for thinking hard, and there always will be.

by lsy on 3/23/2025, 3:43 PM

Going back to the original "Bitter Lesson" article, I think the analogy to chess computers could be instructive here. A lot of institutional resources were spent trying to achieve "superhuman" chess performance, and it was achieved. Yet today almost the entire TAM for computer chess is covered by good-enough Stockfish, while most of the money tied up in chess is in matching human players with each other across the world. Playing against computers is sort of what you do when you're learning, or don't have an internet connection, or you're embarrassed about your skill and don't want to get trash-talked by an Estonian teenager.

The "Second Bitter Lesson" of AI might be that "just because massive amounts of compute make something possible doesn't mean that there will be a commensurately massive market to justify that compute".

"Bitter Lesson" I think also underplays the amount of energy and structure and design that has to go into compute-intensive systems to make them succeed: Deep Blue and current engines like Stockfish take advantage of tablebases of opening and closing positions that are more like GOFAI than deep tree search. And the current crop of LLMs are not only taking advantage of expanded compute, but of the hard-won ability of companies in the 21st century to not only build and resource massive server farms, but mobilize armies of contractors in low-COL areas to hand-train models into usefulness.

by PollardsRho on 3/23/2025, 8:40 PM

The time span on which these developments take place matters a lot for whether the bitter lesson is relevant to a particular AI deployment. The best AI models of the future will not have 100K lines of hand-coded edge cases, and developing those to make the models of today better won't be a long-term way to move towards better AI.

On the other hand, most companies don't have unlimited time to wait for improvements on the core AI side of things, and even so, building competitive advantages like a large existing customer base or really good private data sets to train next-gen AI tools has huge long-term benefits.

There's been an extraordinary amount of labor hours put into developing games that could run, through whatever tricks were necessary, on whatever hardware actually existed for consumers at the time the developers were working. Many of those tricks are no longer necessary, and clearly the way to high-definition real-time graphics was not in stacking 20 years of tricks onto 2000-era hardware. I don't think anyone working on that stuff actually thought that was going to happen, though. Many of the companies dominating the gaming industry now are the ones that built up brands and customers and experience in all of the other aspects of the industry, making sure that when better underlying scaling arrived, they had the experience, revenue, and know-how to make use of that tooling more effectively.

by abstractcontrol on 3/23/2025, 8:34 PM

> Investment Strategy: Organizations should invest more in computing infrastructure than in complex algorithmic development.

> Competitive Advantage: The winners in AI won’t be those with the cleverest algorithms, but those who can effectively harness the most compute power.

> Career Focus: As AI engineers, our value lies not in crafting perfect algorithms but in building systems that can effectively leverage massive computational resources. That is a fundamental shift in mental models of how to build software.

I think the author has a fundamental misconception about what making the best use of computational resources requires. It's algorithms. His recommendation boils down to not doing the one thing that would allow us to make the best use of computational resources.

His assumptions would only be correct if all the best algorithms were already known, which is clearly not the case at present.

Rich Sutton said something similar, but when he said it, he was thinking of old engineering-intensive approaches, so it made sense in the context in which he said it and for the audience he directed it at. It was hardly groundbreaking either; the people he wrote the article for all thought the same thing already.

People like the author of this article don't understand the context and are taking his words as gospel. There is no reason to think that there won't be different machine learning methods to supplant the current ones, and it's certain they won't be found by people who are convinced that algorithmic development is useless.

by serjester on 3/23/2025, 4:51 PM

This misses that if the agent occasionally goes haywire, the user leaves and never comes back. AI deployments are about managing expectations: you're much better off with an agent that's 80 +/- 10% successful than one that's 90 +/- 40%. The more you lean into full automation, the more guardrails you give up and the more variance your system has. This is a real problem.
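
A rough sketch of why that variance matters, with an assumed 70% "acceptable session" floor and per-session quality modeled as a clipped normal draw (both assumptions are mine, purely for illustration):

    import random

    def bad_session_rate(mean, std, floor=0.7, trials=100_000):
        # Fraction of sessions whose quality falls below `floor`,
        # modeling per-session quality as a normal draw clipped to [0, 1].
        bad = 0
        for _ in range(trials):
            quality = min(1.0, max(0.0, random.gauss(mean, std)))
            if quality < floor:
                bad += 1
        return bad / trials

    # 80 +/- 10% vs 90 +/- 40%: the second has the higher mean but
    # drops below the floor roughly twice as often (~16% vs ~31%).
    print(bad_session_rate(0.80, 0.10))
    print(bad_session_rate(0.90, 0.40))

The exact numbers aren't the point; the point is that users experience the tail, not the mean.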

by dtagames on 3/23/2025, 2:05 PM

Good stuff, but the original "Bitter Lesson" article has the real meat, which is that applying more compute power gets better results (just more accurate token predictions, really) than building in human guardrails.

by typon on 3/23/2025, 5:36 PM

The counterargument is the bitter lesson that Tesla is learning from Waymo, and it might be bitter enough to tank the company. Waymo's approach to self-driving isn't end-to-end: they combine classical control with tons of deep learning, creating a final product that actually works in the real world. Meanwhile, Tesla's purely data-driven approach has failed to deliver a working product.

by extr on 3/23/2025, 6:32 PM

I bring this up often at work. There is more ROI in assuming models will continue to improve, and planning/engineering with that future in mind, than in using a worse model and spending a lot of dev time shoring up its weaknesses, prompt engineering, etc. The best models today will be cheaper tomorrow. The worst models today will literally cease to exist. You want to lean into this: have the AI handle as much as it possibly can.

E.g.: we were using Flash 1.5 for a while. We spent a lot of time prompt engineering to get it to do exactly what we wanted and be more reliable. We probably should have just done multi-shot and said "take the best of 3", because as soon as Flash 2.0 came out, all the problems evaporated.
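
A minimal sketch of that "take the best of 3" idea; generate, score, call_flash, and passes_schema_check are all hypothetical stand-ins, not any real API:

    def best_of_n(prompt, generate, score, n=3):
        # Sample n candidate completions and keep the highest-scoring one.
        # `generate` is whatever model call you already have; `score` can be
        # any cheap check (does it parse, match a schema, pass a regex, ...).
        candidates = [generate(prompt) for _ in range(n)]
        return max(candidates, key=score)

    # e.g. answer = best_of_n(user_prompt, generate=call_flash, score=passes_schema_check)

The trade-off is n times the inference cost, which is usually cheap compared to the dev time spent prompt engineering around a weak model.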

by xg15 on 3/23/2025, 7:06 PM

It's not wrong, but I find the underlying corollary pretty creepy: that actually trying to understand those problems and fix errors at edge cases is also a fool's errand, because why try to understand a specific behavior if you can just (try to) fine-tune it away?

So we'll have to get used, for good, to a future where AI is unpredictable: it usually does what you want, but has a 0.1% chance of randomly going haywire, and no one will know how to fix it?

Also, the focus on hardware seems to imply that it's strictly a game of capital - who has access to the most compute resources wins, the others can stop trying. Wouldn't this lead to massive centralization?

by moojacob on 3/23/2025, 8:13 PM

> For instance, in customer service, an RL agent might discover that sometimes asking a clarifying question early in the conversation, even when seemingly obvious, leads to much better resolution rates. This isn’t something we would typically program into a wrapper, but the agent found this pattern through extensive trial and error. The key is having enough computational power to run these experiments and learn from them.

I am working on a GPT wrapper for customer support. I've focused on letting the LLMs do what they do best, which is writing responses using context. The human is responsible for managing the context instead. That part is a much harder problem than RL folks expect it to be. How does your AI agent know all the nuances of a business? How does it know you switched your policy on returns? You'd have to have a human sign off on all replies to customer inquiries. But then, why not make an actual UI at that point instead of an "agent" chatbox?

Games are simple; we know all the rules. Like chess: DeepMind can train on 50 million games. But we don't know all the rules in customer support. Are you going to let an AI agent train itself on 50 million customer interactions and be happy with it sucking for the first 20 million?
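
The trial-and-error loop the quoted passage imagines is easy to sketch as a toy epsilon-greedy bandit; the hard part, as above, is that the "customer" here is a simulator with made-up resolution probabilities, which is exactly what you don't have in real support:

    import random

    # Two candidate policies for the first support turn; the resolution
    # probabilities below are invented purely for illustration.
    RESOLUTION_PROB = {"clarify_first": 0.75, "answer_directly": 0.60}
    EPSILON = 0.1  # exploration rate

    def simulated_customer(action):
        # Reward 1 if the (simulated) issue gets resolved, else 0.
        return 1.0 if random.random() < RESOLUTION_PROB[action] else 0.0

    values = {a: 0.0 for a in RESOLUTION_PROB}  # running mean reward per action
    counts = {a: 0 for a in RESOLUTION_PROB}

    for _ in range(10_000):
        if random.random() < EPSILON:
            action = random.choice(list(RESOLUTION_PROB))
        else:
            action = max(values, key=values.get)
        reward = simulated_customer(action)
        counts[action] += 1
        values[action] += (reward - values[action]) / counts[action]

    print(values)  # "clarify_first" ends up with the higher estimated value

Swap the dictionary for real customers and each of those 10,000 pulls is a real, possibly annoyed, person.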

by elicksaur on 3/24/2025, 1:55 AM

> For instance, in customer service, an RL agent might discover that sometimes asking a clarifying question early in the conversation, even when seemingly obvious, leads to much better resolution rates.

Why does this read to me as the bot finding a path of “Annoy the customer until they hang up and mark the case as solved.” ?

by patcon on 3/23/2025, 6:11 PM

YES to the nature analogy.

We are not guaranteed a world pliable to our human understanding. The fact that we feel entitled to such things is just a product of our current brief moment in the information-theoretic landscape, where humans have created and dominate most of the information environment we navigate. This is a rare moment for any actor. Most of our long history has been spent in environments that are unmanaged ecologies that have blossomed around any one actor.

imho neither we nor any single AI agent will understand the world as fully as we do now. We should retire the idea that we are destined to be privileged to that knowledge.

https://nodescription.net/notes/#2021-05-04

by advael on 3/24/2025, 12:27 AM

The bitter lesson is a good demonstration of how people have really short memories and distributed work loses information

Every AI "breakthrough" comes at a lag because the people who invent a new architecture or method aren't the ones who see its full potential. Because of the financial dynamics at play, the org or team that sees the crazy-looking result often didn't invent the pieces they used. Even if they did, it's been years and in a fast-moving field that stuff has already started to feel "standard" and "generic". The real change they saw was something like more compute or more data

Basically, even the smartest people in the world are pretty dumb, in the sense of generalizing observations poorly

by t_mann on 3/24/2025, 12:37 AM

Has anyone empirically assessed the claims of the Bitter Lesson? The article may sound convincing, but ultimately it's just a few anecdotes. It seems to have a lot of 'cultural' impact in AI research, so it would be good to have some structured data-based analysis before we dismiss entire research directions.

by sgt101 on 3/23/2025, 9:22 PM

I don't get how RL can be applied in a domain where there is no simulator.

So for customer service, to do RL on real customers... well this sounds like it's going to be staggeringly slow and very expensive in terms of peeved customers.

by gpapilion on 3/23/2025, 2:58 PM

"More" generally beats "better." That's the continual lesson from data-intensive workloads: more compute, more data, more bandwidth.

The part that I’ve been scratching my head at is whether we see a retreat from aspects of this due to the high costs associated with it. For cpu based workloads this was a workable solution, since the price has been reducing. gpus have generally scaled pricing as a constant of available flops, and the current hardware approach equates to pouring in power to achieve better results.

by TylerLives on 3/23/2025, 6:40 PM

It's actually about LLMs. They're fundamentally limited by our preconceptions. Can we go back to games and AlphaZero?

by ed on 3/23/2025, 5:17 PM

It’d be nice if this post included a high-level cookbook for training the 3rd approach. The hand-waving around RL sounds great, but how do you accurately simulate a customer for learning at scale?

by dangus on 3/23/2025, 11:53 PM

I think this goes for almost all software. Hardware is still getting impressively faster every year despite Moore's law expiring.

by RachelF on 3/23/2025, 9:33 PM

I think an even more bitter lesson is coming very soon: AI will run out of human-generated content to train on.

Already, AI companies are probably training AI on AI-generated slop.

Sure, there will be tweaks etc., but can we make it more intelligent than its teachers?

by Foreignborn on 3/23/2025, 11:18 PM

please, please stop letting AI rewrite for you. i’m so tired of reading AI slop.

instead, ask it to be a disagreeable editor and have it ask questions about your draft. you’ll go so much further, and the writing won’t be nonsensical.

by cainxinth on 3/24/2025, 12:20 PM

> My plants don’t need detailed instructions to grow. Given the basics (water, sunlight, and nutrients), they figure out the rest on their own.

They do need detailed instructions to grow. The instructions are encoded in their DNA, and they didn’t just figure them out in real time.

by _wire_ on 3/23/2025, 8:45 PM

If only artificial intelligence was intelligent!

Oh, well...