Do we [know] how AIs work?
Mechanistic interpretability (Mech Interp) is a fascinating and rapidly evolving field within AI research that aims to reverse-engineer neural networks to understand their internal algorithms and mechanisms. Chris Olah, one of the pioneers of the field, likens it to neurobiology: AI models are not programmed in the traditional sense but rather “grown” through training.