Interpreting Intelligence Part 3: What is Happening When Models Learn to Generalize

Three spooky things we’ve learned in 2023 about how neural networks learn. Part 3.

An abstract image of a book

This week, in part 3, we look at what we've learned this year about "grokking," the phenomenon discovered in 2021 in which generalization happens abruptly and long after a model has fit its training data.

You can read part 1 of the series here, and part 2 here.

Switching from Memorizing to Generalizing

In 2021, researchers training tiny models made a surprising discovery. A set of models suddenly flipped from memorizing their training data to correctly generalizing on unseen inputs, but only after being trained far beyond the point of fitting the training data. Since then, this phenomenon—called "grokking"—has been investigated further and reproduced in many contexts, and at larger scale.
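The original grokking experiments used small networks trained on algorithmic tasks such as modular addition. As a rough illustration of that kind of setup (not the authors' exact architecture or hyperparameters, which are assumptions here), the following sketch trains a tiny two-layer MLP with weight decay on (a + b) mod p, using only a fraction of all possible pairs as training data:

```python
import numpy as np

# Illustrative grokking-style setup: modular addition with a small MLP.
# All hyperparameters below (modulus, hidden size, learning rate, weight
# decay) are placeholder choices, not values from the original paper.
rng = np.random.default_rng(0)
p = 13  # small modulus so the script runs quickly

# Enumerate all (a, b) pairs; each input is two concatenated one-hot vectors.
pairs = np.array([(a, b) for a in range(p) for b in range(p)])
X = np.zeros((len(pairs), 2 * p))
X[np.arange(len(pairs)), pairs[:, 0]] = 1.0
X[np.arange(len(pairs)), p + pairs[:, 1]] = 1.0
y = (pairs[:, 0] + pairs[:, 1]) % p

# Grokking is observed when only part of the table is used for training,
# so the model must eventually generalize to the held-out pairs.
idx = rng.permutation(len(pairs))
train = idx[: len(pairs) // 2]

hidden = 64
W1 = rng.normal(0.0, 0.1, (2 * p, hidden))
W2 = rng.normal(0.0, 0.1, (hidden, p))
lr, wd = 0.5, 1e-4  # weight decay is thought to drive the eventual "clean up"

def forward(Xb, W1, W2):
    h = np.maximum(Xb @ W1, 0.0)  # ReLU hidden layer
    logits = h @ W2
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    return h, probs

losses = []
for step in range(200):  # real grokking runs train for far longer
    h, probs = forward(X[train], W1, W2)
    loss = -np.log(probs[np.arange(len(train)), y[train]]).mean()
    losses.append(loss)
    # Backprop through softmax cross-entropy.
    d_logits = probs.copy()
    d_logits[np.arange(len(train)), y[train]] -= 1.0
    d_logits /= len(train)
    dW2 = h.T @ d_logits
    dh = d_logits @ W2.T
    dh[h <= 0] = 0.0
    dW1 = X[train].T @ dh
    W1 -= lr * (dW1 + wd * W1)
    W2 -= lr * (dW2 + wd * W2)

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

Run for a few hundred steps this only memorizes; the striking part of the original finding is that, with weight decay and orders of magnitude more steps, test accuracy on the held-out pairs eventually jumps long after training loss has flattened.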

Generalization unfolds in three stages. First, models memorize the training data. Next, they gradually form intricate internal circuits that actually solve the problem. Finally, in a "clean up" phase, they refine these solutions by pruning away the now-redundant memorization components, leaving the generalizing circuit behind.

Though it appears sudden in performance metrics, this process is gradual and nuanced under the surface. Train and test accuracy curves show an abrupt jump, but measures of the model's internal progress reveal steady, continuous change throughout training. The sudden shift is evidence of the complex, layered nature of AI learning, where transformative moments are built upon a foundation of gradual, consistent learning.
