Author here! While working on h-matched.com (tracking time between benchmark release and AI achieving human-level performance), I just added the first negative datapoint - LongBench v2 was solved 22 days before its public release.
This wasn't entirely unexpected given the trend, but it raises fascinating questions about what happens next. The trend line approaching y=0 has been discussed before, but now we're in uncharted territory.
Mathematically, we can make some interesting observations about where this could go:
1. It won't flatten at zero (we've already crossed that)
2. It's unlikely to accelerate downward indefinitely (that would imply increasingly trivial benchmarks)
3. It cannot cross y=-x (that would mean benchmarks being solved before they're even conceived)
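A minimal sketch of the sign convention behind these points, assuming y is the gap in days from public release to human-level performance and x is the release date measured in days from some origin. The function name, the origin, and both dates are hypothetical placeholders, chosen only so the gap comes out to -22 days; they are not the actual LongBench v2 dates.

```python
from datetime import date, timedelta


def gap_days(release: date, solved: date) -> int:
    """Days from public release to human-level performance.

    Negative when the benchmark was solved before it was released.
    """
    return (solved - release).days


# Placeholder dates (NOT the real LongBench v2 dates), 22 days apart.
release = date(2025, 1, 23)
solved = release - timedelta(days=22)
gap = gap_days(release, solved)  # y in the chart, here negative

# Assumed reading of the y = -x boundary: x is days since a chosen origin.
origin = date(2020, 1, 1)  # hypothetical origin of the x axis
x = (release - origin).days

# On the line y = -x, the solve date lands exactly on the origin,
# so going below the line means "solved before the origin".
boundary_solved = release + timedelta(days=-x)

print(gap, boundary_solved == origin)
```

Under this reading, a point below y=0 is a benchmark solved before release, and a point on y=-x is one solved at the origin itself, which is why the line acts as a floor.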
My hypothesis is that we'll see convergence toward y=-x as an asymptote. I'll be honest: I'm not entirely sure what a world operating at that boundary would even look like. Maybe others here have insights into what it would mean in practical terms?