About

Hi! I'm Wei. My research focuses on AI and philosophy.

This site is a concise overview of my current work: selected publications, contact information, and my CV.

Publications

Blog

Sanity Checks for Distributed Alignment Search

DAS is one of the more promising methods in the causal interpretability toolkit. I ran three randomization-based sanity checks on the MoNLI experiment to test whether its high interchange intervention accuracy (IIA) reflects genuine causal structure or geometric coincidence.

Contact

Email: ws2720@columbia.edu

GitHub: shengweiming