✨ About Me
Hello, my name is Kongcheng Zhang (张孔枨). I am currently a second-year master's student in the College of Computer Science and Technology at Zhejiang University and a member of the VIPA Group, supervised by Prof. Mingli Song. In 2024, I received my B.Eng. degree in Computer Science from Zhejiang University and was admitted to pursue my M.S. degree at Zhejiang University with an exemption from the entrance examination.
My research focuses on Large Language Models (LLMs), particularly on advancing the reasoning (e.g., math and instruction following) and agentic (e.g., coding and tool use) capabilities of LLMs through Reinforcement Learning (RL). Please feel free to contact me if you are interested in my research :)
📖 Education
- 2024.09 - Present, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
- 2020.09 - 2024.06, College of Computer Science and Technology, Zhejiang University, Hangzhou, China
📝 Selected Publications
* denotes equal contribution.
- Replay Failures as Successes: Sample-Efficient Reinforcement Learning for Instruction Following
  [Instruction Following] arXiv preprint arXiv:2512.23457
- Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning
  [Self Rewarding] Advances in Neural Information Processing Systems (NeurIPS), 2025
- Reasoning with Reinforced Functional Token Tuning
  [Math Reasoning] International Conference on Learning Representations (ICLR), 2026
- Odyssey: Empowering Minecraft Agents with Open-World Skills
  [Agent Skill] International Joint Conference on Artificial Intelligence (IJCAI), 2025
- SeRL: Self-Play Reinforcement Learning for Large Language Models with Limited Data
  [Self Play] Advances in Neural Information Processing Systems (NeurIPS), 2025
- MUSE: MCTS-Driven Red Teaming Framework for Enhanced Multi-Turn Dialogue Safety in Large Language Models
  [Safety] Empirical Methods in Natural Language Processing (EMNLP), 2025
- Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning
  [Rubrics] arXiv preprint arXiv:2508.16949
💬 Academic Services
- Reviewer: ICLR 2026, ICML 2026