Jason Zhang
Hi! I am a Computer Science student at Stanford University with a deep love for AI research. I am currently a member of technical staff at the Stanford NLP Group, advised by Zhengxuan Wu and Hao Zhu. Recently, I've been focusing on model interpretability and social intelligence.
Featured Projects
Uncovering Latent CoT Vectors in Language Models [Under Review]
Applies steering vectors to chain-of-thought (CoT) reasoning. Shows that language models can be steered towards CoT structure while maintaining competitive performance on reasoning benchmarks. Read arXiv preprint here.
The Structural Safety Generalization Problem [NeurIPS SafeGenAI 2024]
Introduces a new subclass of AI safety problems: the failure of current safety techniques to generalize over structure, despite semantic equivalence. Read here.
Building Better Benchmarks: Towards Standardized AI Evaluation
Blog post on how the field of benchmarking can move forward. TL;DR: we need standardization. Read here.
More to Come...
Stay tuned for updates!