✨ About Me

Hello, my name is Kongcheng Zhang (张孔枨). I am currently a second-year master's student in the College of Computer Science and Technology at Zhejiang University and a member of the VIPA Group, supervised by Prof. Mingli Song. In 2024, I received my B.Eng. degree in Computer Science from Zhejiang University and was admitted to pursue my M.S. degree at Zhejiang University with an exemption from the entrance examination.

My research focuses on Large Language Models (LLMs), particularly on advancing their reasoning (e.g., math and instruction following) and agentic (e.g., coding and tool use) capabilities through Reinforcement Learning (RL). Please feel free to contact me if you are interested in my research :)

📖 Education

2024.09 - Present, M.S. student, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
2020.09 - 2024.06, B.Eng. in Computer Science, College of Computer Science and Technology, Zhejiang University, Hangzhou, China

📝 Selected Publications

* denotes equal contribution.

Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following
Instruction Following

Kongcheng Zhang, Qi Yao, Shunyu Liu, Wenjian Zhang, Min Cen, Yang Zhou, Wenkai Fang, Yiru Zhao, Baisheng Lai, Mingli Song

International Conference on Machine Learning (ICML), 2026

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
Self Rewarding

Kongcheng Zhang, Qi Yao, Shunyu Liu, Yingjie Wang, Baisheng Lai, Jieping Ye, Mingli Song, Dacheng Tao

Advances in Neural Information Processing Systems (NeurIPS), 2025

Reasoning with Reinforced Functional Token Tuning
Math Reasoning

Kongcheng Zhang, Qi Yao, Baisheng Lai, Jiaxing Huang, Wenkai Fang, Dacheng Tao, Mingli Song, Shunyu Liu

International Conference on Learning Representations (ICLR), 2026

Experience is the Best Teacher: Motivating Effective Exploration in Reinforcement Learning for LLMs
Exploration

Wenjian Zhang, Kongcheng Zhang, Jiaxin Qi, Baisheng Lai, Jianqiang Huang

International Conference on Machine Learning (ICML), 2026

Odyssey: Empowering Minecraft Agents with Open-World Skills
Agent Skill

Shunyu Liu*, Yaoru Li*, Kongcheng Zhang*, Zhenyu Cui*, Wenkai Fang*, Yuxuan Zheng, Tongya Zheng, Mingli Song

International Joint Conference on Artificial Intelligence (IJCAI), 2025

SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
Self Play

Wenkai Fang, Shunyu Liu, Yang Zhou, Kongcheng Zhang, Tongya Zheng, Kaixuan Chen, Mingli Song, Dacheng Tao

Advances in Neural Information Processing Systems (NeurIPS), 2025

MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models
Safety

Siyu Yan, Long Zeng, Xuecheng Wu, Chengcheng Han, Kongcheng Zhang, Chong Peng, Xuezhi Cao, Xunliang Cai, Chenjuan Guo

Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
Rubrics

Yang Zhou, Sunzhu Li, Shunyu Liu, Wenkai Fang, Kongcheng Zhang, Jiale Zhao, Jingwen Yang, Yihe Zhou, Jianwei Lv, Tongya Zheng, Hengtong Lu, Wei Chen, Yan Xie, Mingli Song

International Conference on Machine Learning (ICML), 2026


💬 Academic Services

Reviewer: ICLR 2026, ICML 2026
