Want to understand feature learning and the final network weights?

Part 5 of A Quickstart Guide to Learning Mechanics

[long list of authors]
2025-09-01

[TO BE WRITTEN]

This one’s hard. It’s the subject of a lot of speculation, and a lot of people want to know. A lot of mechanistic interpretability work has gone after it, but it’s proven quite hard to get at analytically.

deep linear networks
single- and multi-index model theory
recursive feature machines (RFMs)
complicated calculational tools at large width: dynamical mean-field theory (DMFT), tensor programs (TP), etc.

A Quickstart Guide to Learning Mechanics

Introduction: what do you want to understand?
...the average size of hidden representations?
...hyperparameter selection (and why should theorists care)?
🚧 ...the convergence and stability of optimization?
🚧 ...feature learning and the final network weights?
🚧 ...generalization?
🚧 ...neuron-level sparsity?
🚧 ...the structure in the data?
🚧 Places to make a difference
