Alex Foote
Alex Foote is a computer scientist with an MSc from the University of Warwick, where he worked on adversarial robustness for his dissertation. He now works at a startup called Ripjar as a data scientist developing NLP software for tackling financial crime. In his spare time he’s been doing research into mechanistic interpretability with the support of Apart Research.
Alex’s main research interest is currently mechanistic interpretability, particularly developing methods for better interpreting MLP layers in language models. He previously worked on adversarial robustness, as well as the connection between robustness and interpretability in vision models.
Alex will work on a method for building interpretable representations for MLP neurons in language models.