"Given the current trend of companies developing increasingly large language models, with the goal of creating a single model that can do it all and surpass human-level performance, it is worth considering an alternative future scenario. In this scenario, models are micro-sized and highly specialized, capable of performing only a few specific tasks.
These micro-models are independent and can run on CPUs, with a high degree of automation that allows them to identify problems and dynamically combine with other micro-models to complete larger tasks.
This approach would allow for the creation of flexible model ensembles that can be easily disassembled into their constituent micro-models."
Possible? Sure. There's a huge body of research into that sort of thing going back to the 1980s, including Minsky's Society of Mind[1] approach, blackboard architectures[2], and the entire field of "multi-agent systems"[3]. On a more contemporary note, there are all sorts of ensemble systems[4], plus approaches using model blending[5], mixture-of-experts models[6], etc.
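To make the "many small specialists" idea concrete, here's a toy sketch in Python of top-1 routing in the spirit of mixture-of-experts gating: each micro-model advertises a confidence score for a task, and a dispatcher hands the task to the highest-scoring one. The model names and scoring heuristics are invented for illustration, not taken from any of the systems cited above.

    # Toy dispatcher: each "micro-model" is a (score, run) pair.
    # Scores and answers here are made up purely for illustration.
    from typing import Callable, List, Tuple

    MicroModel = Tuple[Callable[[str], float],  # score: confidence it can handle the task
                       Callable[[str], str]]    # run: produce an answer

    def route(task: str, models: List[MicroModel]) -> str:
        # Hard (top-1) gating: pick the highest-scoring specialist and run it.
        _score_fn, run = max(models, key=lambda m: m[0](task))
        return run(task)

    models = [
        (lambda t: 1.0 if "date" in t else 0.0, lambda t: "2024-01-01"),
        (lambda t: 1.0 if "sentiment" in t else 0.0, lambda t: "positive"),
    ]
    print(route("what is the sentiment of this review?", models))  # -> positive

A learned gating network would replace the hand-written scoring functions, but the shape of the composition is the same: small independent parts, one cheap decision about which to invoke.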
Will it actually turn out that way? Who's to say? But if I had a gun to my head and had to guess, I'd guess that future AIs will incorporate some sort of "multiple models" approach where some may be very big, some may be very small, and some may not even be "models" at all, but rather components that do symbolic reasoning using the Rete algorithm[7], forward/backward chaining[8][9], SAT solving[10], or something along those lines.
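For the symbolic side, a forward-chaining rule engine fits in a few lines. The sketch below is a generic illustration with made-up facts and rules, using naive fixed-point iteration rather than Rete (which adds a network for incremental pattern matching on top of the same basic idea).

    # Naive forward chaining: keep firing rules whose premises are all known
    # until no new facts appear. Facts and rules are invented examples.
    def forward_chain(facts, rules):
        facts = set(facts)
        changed = True
        while changed:
            changed = False
            for premises, conclusion in rules:
                if conclusion not in facts and all(p in facts for p in premises):
                    facts.add(conclusion)
                    changed = True
        return facts

    rules = [
        ({"has_feathers", "lays_eggs"}, "is_bird"),
        ({"is_bird", "can_fly"}, "migrates"),
    ]
    print(forward_chain({"has_feathers", "lays_eggs", "can_fly"}, rules))
    # -> base facts plus the derived "is_bird" and "migrates"

Backward chaining runs the same rules in the other direction, starting from a goal and searching for premises that support it; either could sit next to learned models as just another component in the ensemble.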
[1]: https://en.wikipedia.org/wiki/Society_of_Mind
[2]: https://en.wikipedia.org/wiki/Blackboard_system
[3]: https://en.wikipedia.org/wiki/Multi-agent_system
[4]: https://en.wikipedia.org/wiki/Ensemble_learning
[5]: https://arxiv.org/html/2401.02994v3
[6]: https://en.wikipedia.org/wiki/Mixture_of_experts
[7]: https://en.wikipedia.org/wiki/Rete_algorithm
[8]: https://en.wikipedia.org/wiki/Forward_chaining
[9]: https://en.wikipedia.org/wiki/Backward_chaining
[10]: https://en.wikipedia.org/wiki/Boolean_satisfiability_problem