Long context windows are, IMO, “AGI enough.”
A 100M-token context window means it can probably store everything you’ve ever told it for years.
Couple this with multimodal capabilities, like a robot encoding vision and audio into tokens, and you get autonomous assistants that learn your house/habits/chores really quickly.
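As a rough back-of-envelope (my own assumptions, not from the announcement: ~16,000 spoken words per day and ~1.3 tokens per word), 100M tokens would cover over a decade of everything a person says:

```python
# Back-of-envelope: how many years of speech fit in a 100M-token context?
# Assumed figures: ~16,000 spoken words/day, ~1.3 tokens/word.
CONTEXT_TOKENS = 100_000_000
WORDS_PER_DAY = 16_000
TOKENS_PER_WORD = 1.3

days = CONTEXT_TOKENS / (WORDS_PER_DAY * TOKENS_PER_WORD)
print(f"{days:,.0f} days ≈ {days / 365:.1f} years")  # ~4,800 days ≈ ~13 years
```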
It should be benchmarked against something like RULER [1].
[1]: https://github.com/hsiehjackson/RULER (RULER: What’s the Real Context Size of Your Long-Context Language Models)
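For illustration, a needle-in-a-haystack probe in the spirit of these long-context tests can be sketched in a few lines. This is not RULER’s actual harness, and `generate` is a placeholder for whatever model API you’re testing:

```python
# Toy needle-in-a-haystack probe (not the RULER harness).
# `generate` is a placeholder callable: prompt string in, completion string out.
import random

def make_haystack(n_filler: int, needle: str, seed: int = 0) -> str:
    random.seed(seed)
    filler = ["The sky was clear that day.",
              "Nothing notable happened.",
              "The meeting was rescheduled again."]
    lines = [random.choice(filler) for _ in range(n_filler)]
    lines.insert(random.randrange(n_filler), needle)  # bury the needle at a random depth
    return "\n".join(lines)

def run_probe(generate, n_filler: int = 50_000) -> bool:
    needle = "The secret passphrase is 'violet-kumquat-42'."
    prompt = (make_haystack(n_filler, needle)
              + "\n\nWhat is the secret passphrase mentioned above?")
    return "violet-kumquat-42" in generate(prompt)
```

RULER’s point is that simple retrieval needles like this overstate effective context; its task set also covers multi-hop tracing and aggregation, which is where models tend to fall short of their advertised window.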
Context windows are becoming larger and larger, and I anticipate more research focusing on this trend. Could this signal the eventual demise of RAG? Only time will tell. I recently experimented with RAG and the limitations are often surprising (https://www.lycee.ai/blog/rag-fastapi-postgresql-pgvector). I wonder if we will see some of the same limitations with long-context LLMs. In-context learning is probably a form of arithmetic over semantic/lexical cues.
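For context, the retrieval step in a setup like the one in that post is typically just a top-k vector search; here is a minimal sketch with pgvector (the table/column names and the query embedding are placeholders, not taken from the post):

```python
# Minimal top-k retrieval with pgvector (psycopg 3).
# Assumes a table: documents(content text, embedding vector(N)).
import psycopg

def retrieve(conn: psycopg.Connection, query_embedding: list[float], k: int = 5):
    vec = "[" + ",".join(str(x) for x in query_embedding) + "]"  # pgvector text literal
    with conn.cursor() as cur:
        cur.execute(
            """
            SELECT content, embedding <=> %s::vector AS cosine_distance
            FROM documents
            ORDER BY embedding <=> %s::vector
            LIMIT %s
            """,
            (vec, vec, k),
        )
        return cur.fetchall()
```

Whatever this step misses never reaches the model at all, which is one way the surprising RAG limitations show up; a long-context model sidesteps that by reading everything, at a much higher compute cost.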
I was wondering how they could afford 8,000 H100s, but I guess I accidentally skipped over this part:
> We’ve raised a total of $465M, including a recent investment of $320 million from new investors Eric Schmidt, Jane Street, Sequoia, Atlassian, among others, and existing investors Nat Friedman & Daniel Gross, Elad Gil, and CapitalG.
Yeah, I guess that'd do it. Who are these people and how'd they convince them to invest that much?
What is the state of the art on context length for open models? Magic won’t be open, I guess, after getting $465M in VC money.
Is it based on Mamba?
Does anyone have a detailed tech breakdown of these guys? I’m not quite sure how their LTM architecture works.
FYI: I wouldn't interview here. I spent 8 hours on an unpaid take-home, then got rejected after a 30-minute behavioral screen.