Note that I said "predict" not "describe". It feels like we're still in the era of Kepler, not Newton.
As a fellow tufte css enjoyer, Why is user select turned off on the sidenotes? I would like to be able to copy paste them quite badly.
We're given a signal channel and a reservoir. Signal lives in the channel, noise lives in the reservoir, and the reservoir supposedly doesn’t show up at test time.
Okay, but then we have: why would SGD put the right things in the right bucket?
If the answer is “because the reservoir is defined as the stuff that doesn’t transfer to test,” then this is close to circular.
The Borges/Lavoisier stuff is a tell. "We have unified the field” rhetoric should come after nontrivial predictions and results. Claiming to solve benign overfitting, double descent, grokking, implicit bias, risk of training on population, how to avoid a validation set, and last but not least, skipping training by analytically jumping to the end is 6 theory papers, 3 NeurIPS winners, and a $10B startup. Let's get some results before we tell everyone we unified the field. :) I hope you're right.
[1] https://arxiv.org/pdf/2604.21691
[2] https://news.ycombinator.com/item?id=47893779