Safety

The Sequence AI of the Week #859: Reading Claude's Mind in English: A Note on Natural Language Autoencoders

Anthropic's fascinating new papers for the future of AI interpretability.

TheSequence · May 13 · 1 min read · score 7.0

Originally published at TheSequence