How AI Understands Concepts: From Atomic to Galactic Level

🧠 AI increasingly (and sometimes unnervingly) resembles the brain: a look at how AI represents concepts

A recent paper by researchers including David Baek reports interesting details about how the features learned by a sparse autoencoder (SAE) are arranged inside a language model's activation space. The authors describe structure at three scales, which helps explain how AI 'sees' our world.

The atomic level is the small scale. Here, SAE features form 'crystals' whose faces are parallelograms or trapezoids, generalizing well-known examples such as 'man-woman' and 'king-queen'. These shapes become much cleaner when distracting directions (such as word length) are projected out using linear discriminant analysis, which leaves only the features that matter.
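To make the parallelogram idea concrete, here is a minimal sketch. The three-dimensional embeddings are made up for illustration, and the distractor direction is assumed to be known in advance; the paper finds such directions with linear discriminant analysis, which this toy example does not implement.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return [x - y for x, y in zip(a, b)]

def cosine(a, b):
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))

def project_out(v, direction):
    """Remove the component of v that lies along `direction`."""
    scale = dot(v, direction) / dot(direction, direction)
    return [x - scale * d for x, d in zip(v, direction)]

# Hypothetical toy embeddings: [gender, royalty, word-length distractor]
emb = {
    "man":   [0.0, 0.0, 3.0],
    "woman": [1.0, 0.0, 5.0],
    "king":  [0.0, 1.0, 4.0],
    "queen": [1.0, 1.0, 5.0],
}
# Assumed known here; not derived with LDA as in the paper.
distractor = [0.0, 0.0, 1.0]

def parallelogram_score(vectors):
    """Cosine between the two 'gender' sides; 1.0 means perfectly parallel."""
    side_a = sub(vectors["woman"], vectors["man"])
    side_b = sub(vectors["queen"], vectors["king"])
    return cosine(side_a, side_b)

raw_score = parallelogram_score(emb)
cleaned = {word: project_out(vec, distractor) for word, vec in emb.items()}
clean_score = parallelogram_score(cleaned)
print(f"raw: {raw_score:.3f}, cleaned: {clean_score:.3f}")
```

In this toy setup the two sides only become exactly parallel once the word-length axis is removed, which is the effect the post describes.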

The brain level is the intermediate scale. Here, features for related concepts such as 'mathematics' and 'coding' cluster together into 'lobes', much like the functional regions of the human brain. Just as the brain has areas for language or vision, the feature space has 'regions' for related concepts.
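One rough way to picture how such 'regions' could be found is to group features that tend to fire on the same documents. The sketch below is only an illustration: the feature names and document ids are invented, and it uses Jaccard overlap with a simple threshold and connected components as a stand-in for whatever clustering method the paper actually applies.

```python
from itertools import combinations

# Hypothetical data: for each feature, the ids of documents where it fired.
fires_in = {
    "integral": {0, 1, 2},
    "matrix":   {1, 2, 3},
    "for_loop": {1, 2, 3, 4},
    "dialogue": {5, 6, 7},
    "pronoun":  {6, 7, 8},
}

def jaccard(a, b):
    """Share of documents where both features fire, out of those where either does."""
    return len(a & b) / len(a | b)

def find_lobes(fires_in, threshold=0.3):
    """Link features whose co-occurrence is above `threshold`, then return connected groups."""
    names = sorted(fires_in)
    neighbors = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if jaccard(fires_in[a], fires_in[b]) >= threshold:
            neighbors[a].add(b)
            neighbors[b].add(a)

    lobes, seen = [], set()
    for start in names:
        if start in seen:
            continue
        stack, group = [start], []
        while stack:  # depth-first walk over one connected group
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.append(node)
            stack.extend(neighbors[node] - seen)
        lobes.append(sorted(group))
    return sorted(lobes)

lobes = find_lobes(fires_in)
print(lobes)
```

With this made-up data the math and code features end up in one group and the language features in another, mirroring the 'mathematics' and 'coding' example above.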

The galactic level is the large scale. At this level, the features form an entire 'universe' of concepts: a point cloud whose overall shape is not random. The cloud is not spread equally in all directions; instead, the eigenvalues of its covariance follow a power law, with the steepest slope in the model's middle layers. The authors liken this large-scale structure to a galaxy.
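The paper describes this power law in terms of the eigenvalues of the feature cloud's covariance matrix. A minimal sketch of how a power-law slope can be estimated is a straight-line fit in log-log space. The spectrum below is synthetic (an assumption for illustration); real eigenvalues would come from the covariance of actual SAE feature vectors.

```python
import math

def fit_power_law_slope(eigenvalues):
    """Least-squares slope of log(eigenvalue) against log(rank)."""
    xs = [math.log(rank) for rank in range(1, len(eigenvalues) + 1)]
    ys = [math.log(value) for value in eigenvalues]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

# Synthetic spectrum decaying as rank^-1.5 (illustrative, not from the paper).
eigenvalues = [rank ** -1.5 for rank in range(1, 101)]
slope = fit_power_law_slope(eigenvalues)

# A cloud spread equally in all directions would have a flat spectrum.
flat_slope = fit_power_law_slope([1.0] * 100)

print(f"power-law slope: {slope:.2f}, flat slope: {flat_slope:.2f}")
```

A steeper (more negative) slope means the variance is concentrated in fewer directions, so the cloud is further from a uniform ball.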

Thus, at the intermediate scale, the way these features are organized resembles how the human brain divides functions into separate zones. This is an important step toward understanding how AI learns and processes complex concepts, echoing our own natural ability to organize information.

📝 Paper: https://arxiv.org/abs/2410.19750