About me

I’m a Computer Science PhD student at the DSE Lab of Michigan State University, co-advised by Prof. Jiliang Tang and Prof. Yue Xing.

If you’d like to get in touch, feel free to email me at linyupin [#at#] msu [#dot#] edu.

Research Area

My research centers on understanding the internal behavior of large language models (LLMs) and the agents built on top of them, and on using that understanding to make these systems more transparent, safer, and more efficient.

Interpretability of Large Language Models (LLMs): investigating the internal mechanisms underlying specific model behaviors, such as the dynamics of retrieval heads, structural patterns within attention representations, and how fine-tuning reshapes model behavior at the circuit level.

Safety in LLMs: applying this mechanistic understanding to safety problems, such as explaining why certain jailbreak attacks succeed, and developing methods to monitor, control, or selectively reverse safety-relevant behaviors.

Design and Optimization of LLM-based Agents: extending this understanding to agentic systems, such as characterizing how core capabilities like memory management shape agent behavior over time, and building techniques to improve agent reliability and efficiency.

Education

  • Ph.D. in Computer Science (2023-Present)
    • Michigan State University (MSU)
    • Advisor: Professor Jiliang Tang
    • Research Area: Safety in Large Language Models, Trustworthy AI, Natural Language Processing
  • M.S. in Electrical and Computer Engineering (2022-2023)
    • University of Massachusetts, Amherst (UMass Amherst)
    • Advisor: Professor Tongping Liu
    • Research Area: Machine Learning Model Compression
  • B.Eng. in Information Security (2017-2021)
    • University of Electronic Science and Technology of China (UESTC)

Publications and Preprints

*: Equal contribution

  • Yuping Lin, Jiayuan Ding, Yue Xing, Pengfei He, Jiliang Tang, Subhabrata Mukherjee. A Simple Plug-in for Improving Eviction-Based KV Cache Compression. Preprint [arXiv].
    • Recovers reconstructable value information that binary KV cache eviction would otherwise discard, improving the quality-memory trade-off under tight cache budgets.
  • Yuping Lin, Pengfei He, Yue Xing, Yingqian Cui, Jiayuan Ding, Subhabrata Mukherjee, Hui Liu, Zhen Xiang. Crafting Reversible SFT Behaviors in Large Language Models. ICML 2026 Workshop on Mech Interp [arXiv].
    • Localizes SFT-induced behaviors into sparse, causally necessary subnetworks that can be selectively suppressed or restored at inference time without modifying model weights.
  • Yuping Lin, Zitao Li, Yue Xing, Pengfei He, Yingqian Cui, Yaliang Li, Bolin Ding, Jingren Zhou, Jiliang Tang. Retrieval Heads are Dynamic. ACL 2026 (Main) [arXiv].
    • Shows that retrieval heads vary dynamically across generation timesteps and are linked to a predictive planning signal in the model’s hidden states.
  • Zidi Xiong*, Yuping Lin*, Wenya Xie*, Pengfei He, Jiliang Tang, Himabindu Lakkaraju, Zhen Xiang. Towards Optimal Memory Management: Investigating Experience-Following Behavior of Large Language Model Agents. ACL 2026 (Main) [arXiv].
    • An empirical study showing that LLM agents exhibit an experience-following property and examining how memory addition and deletion choices shape long-term agent performance.
  • Tai-Quan Peng, Kaiqi Yang, Sanguk Lee, Hang Li, Yucheng Chu, Yuping Lin, Hui Liu. Beyond partisan leaning: A comparative analysis of political bias in large language models. Journal of Information Technology & Politics 2026 [Paper][arXiv].
  • Shenglai Zeng, Jiankun Zhang, Bingheng Li, Yuping Lin, Tianqi Zheng, Dante Everaert, Hanqing Lu, Hui Liu, Yue Xing, Monica Xiao Cheng, Jiliang Tang. Towards Knowledge Checking in Retrieval-augmented Generation: A Representation Perspective. NAACL 2025 (Main) [Paper][arXiv].
  • Yingqian Cui, Jie Ren, Yuping Lin, Han Xu, Pengfei He, Yue Xing, Lingjuan Lyu, Wenqi Fan, Hui Liu, Jiliang Tang. FT-Shield: A Watermark Against Unauthorized Fine-tuning in Text-to-Image Diffusion Models. SIGKDD Explorations Newsletter 2025 [Paper][arXiv].
  • Pengfei He, Yuping Lin, Shen Dong, Han Xu, Yue Xing, Hui Liu. Red-Teaming LLM Multi-Agent Systems via Communication Attacks. ACL 2025 (Findings) [Paper][arXiv].
  • Yuping Lin*, Pengfei He*, Han Xu, Yue Xing, Makoto Yamada, Hui Liu, Jiliang Tang. Towards Understanding Jailbreak Attacks in LLMs: A Representation Space Analysis. EMNLP 2024 (Main) [Paper][arXiv].
    • A representation space analysis showing that successful jailbreaks shift the representation of harmful prompts toward the region occupied by harmless ones.

Internships

  • Summer 2026, Applied Scientist Intern, Amazon
  • Spring 2026, Intern, Hippocratic AI
    • Conducted research on KV cache management in LLMs.
    • Contributed to the development of a hybrid agentic RAG system.
  • Summer 2025, Research Intern, Alibaba
    • Conducted research on long-context processing in LLM agents.
    • Developed an open-source MCP toolkit for long-context handling in LLM agents. [PyPI] [GitHub]

Service

  • Reviewer for ARR May 2026
  • Reviewer for NeurIPS 2026
  • Reviewer for ICML 2026 Workshop on Mech Interp
  • Reviewer for ARR January 2026
    • Emergency Reviewer for ARR January 2026
  • Reviewer for Computational Linguistics, 2025
  • Reviewer for ARR October 2025

Teaching

  • Spring 2025, Teaching Assistant, CSE 335 Object-Oriented Software Development
  • Fall 2024, Teaching Assistant, CSE 482 Big Data Analysis