About me
I’m a Computer Science PhD student at the DSE Lab of Michigan State University, co-advised by Prof. Jiliang Tang and Prof. Yue Xing.
If you’d like to get in touch, feel free to email me at linyupin [#at#] msu [#dot#] edu.
Research Area
My research centers on understanding how large language models (LLMs) and the agents built on them work internally, and on using that understanding to make them more transparent, safer, and more efficient.
- Interpretability of Large Language Models (LLMs): investigating the internal mechanisms underlying specific model behaviors, such as the dynamics of retrieval heads, structural patterns within attention representations, and how fine-tuning reshapes model behavior at the circuit level.
- Safety in LLMs: applying this mechanistic understanding to safety problems, such as explaining why certain jailbreak attacks succeed, and developing methods to monitor, control, or selectively reverse safety-relevant behaviors.
- Design and Optimization of LLM-based Agents: extending this understanding to agentic systems, such as characterizing how core capabilities like memory management shape agent behavior over time, and building techniques to improve agent reliability and efficiency.
Education
- Ph.D. in Computer Science (2023-Present)
- Michigan State University (MSU)
- Advisors: Professor Jiliang Tang and Professor Yue Xing
- Research Area: Safety in Large Language Models, Trustworthy AI, Natural Language Processing
- M.S. in Electronic and Computer Engineering (2022-2023)
- University of Massachusetts, Amherst (UMass Amherst)
- Advisor: Professor Tongping Liu
- Research Area: Machine Learning Model Compression
- B.Eng. in Information Security (2017-2021)
- University of Electronic Science and Technology of China (UESTC)
Publications and Preprints
*: Equal contribution
- Yuping Lin, Jiayuan Ding, Yue Xing, Pengfei He, Jiliang Tang, Subhabrata Mukherjee. A Simple Plug-in for Improving Eviction-Based KV Cache Compression. Preprint [arXiv].
- Recovers reconstructable value information that binary KV cache eviction would otherwise discard, improving the quality-memory trade-off under tight cache budgets.
- Yuping Lin, Pengfei He, Yue Xing, Yingqian Cui, Jiayuan Ding, Subhabrata Mukherjee, Hui Liu, Zhen Xiang. Crafting Reversible SFT Behaviors in Large Language Models. ICML 2026 Workshop on Mech Interp [arXiv].
- Localizes SFT-induced behaviors into sparse, causally necessary subnetworks that can be selectively suppressed or restored at inference time without modifying model weights.
- Yuping Lin, Zitao Li, Yue Xing, Pengfei He, Yingqian Cui, Yaliang Li, Bolin Ding, Jingren Zhou, Jiliang Tang. Retrieval Heads are Dynamic. ACL 2026 (Main) [arXiv].
- Shows that retrieval heads vary dynamically across generation timesteps and are linked to a predictive planning signal in the model’s hidden states.
- Zidi Xiong*, Yuping Lin*, Wenya Xie*, Pengfei He, Jiliang Tang, Himabindu Lakkaraju, Zhen Xiang. Towards Optimal Memory Management: Investigating Experience-Following Behavior of Large Language Model Agents. ACL 2026 (Main) [arXiv].
- An empirical study showing that LLM agents exhibit an experience-following property and examining how memory addition and deletion choices shape long-term agent performance.
- Tai-Quan Peng, Kaiqi Yang, Sanguk Lee, Hang Li, Yucheng Chu, Yuping Lin, Hui Liu. Beyond partisan leaning: A comparative analysis of political bias in large language models. Journal of Information Technology & Politics 2026 [Paper][arXiv]
- Shenglai Zeng, Jiankun Zhang, Bingheng Li, Yuping Lin, Tianqi Zheng, Dante Everaert, Hanqing Lu, Hui Liu, Yue Xing, Monica Xiao Cheng, Jiliang Tang. Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective. NAACL 2025 (Main) [Paper][arXiv]
- Yingqian Cui, Jie Ren, Yuping Lin, Han Xu, Pengfei He, Yue Xing, Lingjuan Lyu, Wenqi Fan, Hui Liu, Jiliang Tang. FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models. SIGKDD Explorations Newsletter 2025 [Paper][arXiv]
- Pengfei He, Yuping Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu. Red-Teaming LLM Multi-Agent Systems via Communication Attacks. ACL 2025 (Findings) [Paper][arXiv]
- Yuping Lin*, Pengfei He*, Han Xu, Yue Xing, Makoto Yamada, Hui Liu, Jiliang Tang. Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis. EMNLP 2024 (Main) [Paper][arXiv].
- A representation space analysis showing that successful jailbreaks shift the representation of harmful prompts toward the region occupied by harmless ones.
Internships
- Summer 2026, Applied Scientist Intern, Amazon
- Spring 2026, Intern, Hippocratic AI
- Researched KV cache management in LLMs.
- Helped develop a hybrid agentic RAG system.
- Summer 2025, Research Intern, Alibaba
Service
- Reviewer for ARR May 2026
- Reviewer for NeurIPS 2026
- Reviewer for ICML 2026 Workshop on Mech Interp
- Reviewer for ARR January 2026
- Emergency Reviewer for ARR January 2026
- Reviewer for Computational Linguistics, 2025
- Reviewer for ARR October 2025
Teaching
- Spring 2025, Teaching Assistant, CSE 335 Object-oriented Software Development
- Fall 2024, Teaching Assistant, CSE 482 Big Data Analysis
