Reference

What is Goodfire?

Goodfire is described by researchers as a tool or method for analyzing model internals. This note tracks its reported use in identifying unexpected dependencies in model predictions and in improving sparse autoencoder (SAE) labels.

Mar 2026 - Dan Balsam said Goodfire revealed that a model’s Alzheimer’s predictions depended overwhelmingly on fragment length, contrary to expectations from the literature.
Mar 2026 - Dan Balsam said Goodfire allowed model editing with no degradation in capabilities, with any changes falling within noise levels.
Mar 2026 - Geoffrey Irving said Goodfire adds another wrinkle to the mess but does not change fundamental aspects of model interpretability.
Apr 2026 - Cameron Berg said Goodfire allows bootstrapping SAE labels by having the model label its own activations, yielding more accurate labels.

Signal Headquarters · reference note, compiled from attributed expert discussion.