Join me on my ML journey from novice to expert, with some nomadic meanderings along the way.

Machine Learning

Language models can explain neurons in language models

OpenAI uses GPT-4 to automatically write explanations for the behavior of neurons in large language models and to score those explanations. They have released a dataset of these (imperfect) explanations and scores for every neuron in GPT-2.
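To make the "score those explanations" idea a little more concrete, here is a minimal sketch of one way such a score could be computed. The summary above does not say how scoring works, so this rests on an assumption: an explanation is used to predict (simulate) a neuron's activation on each token, and the score measures how well those predictions track the real activations. The function name `correlation_score` and the toy tokens and activation values are hypothetical, made up purely for illustration.

```python
import math


def correlation_score(real, simulated):
    """Pearson correlation between real and simulated activations.

    Assumption: a higher correlation means the explanation predicts
    the neuron's behavior better. This is an illustrative scoring
    rule, not a confirmed description of the method in the source.
    """
    if len(real) != len(simulated) or len(real) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(real)
    mean_r = sum(real) / n
    mean_s = sum(simulated) / n

    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(real, simulated))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_s = sum((s - mean_s) ** 2 for s in simulated)

    # A constant sequence carries no signal, so treat the score as 0.
    if var_r == 0 or var_s == 0:
        return 0.0
    return cov / math.sqrt(var_r * var_s)


# Hypothetical toy data: a neuron that seems to fire on animal/object nouns.
tokens = ["the", "cat", "sat", "on", "the", "mat"]
real_activations = [0.0, 0.9, 0.1, 0.0, 0.0, 0.8]
# Activations a simulator might guess from the explanation "fires on nouns".
simulated_activations = [0.0, 1.0, 0.0, 0.0, 0.0, 1.0]

score = correlation_score(real_activations, simulated_activations)
print(f"explanation score: {score:.2f}")
```

A score near 1 would suggest the explanation captures when the neuron fires, while a score near 0 would suggest it explains very little. In a real setup the simulated activations would come from a language model reading the explanation rather than from hand-written numbers.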

Source: https://openai.com/research/language-models-can-explain-neurons-in-language-models