Vision Language Model Training
2024
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Jiahui Gao, Renjie Pi, Jipeng Zhang, Jiacheng Ye, Wanjun Zhong, Yufei Wang, Lanqing Hong, Jianhua Han, Hang Xu, Zhenguo Li, Lingpeng Kong.
ICLR 2024
[bib]
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang.
ECCV 2024
[bib]
VL-GenRM: Enhancing Vision-Language Verification via Vision Experts and Iterative Training
Jipeng Zhang, Kehao Miao, Renjie Pi, Runtao Liu, Zhaowei Wang, Rui Pan, Tong Zhang.
Under submission
[bib]
2023
X2-VLM: All-In-One Pre-trained Model For Vision-Language Tasks
Yan Zeng, Xinsong Zhang, Hang Li, Jiawei Wang, Jipeng Zhang, Wangchunshu Zhou.
TPAMI 2023
[bib]
UniMath: A Foundational and Multimodal Mathematical Reasoner
Zhenwen Liang, Tianyu Yang, Jipeng Zhang, Xiangliang Zhang.
EMNLP 2023
[bib]
Reinforcement Learning and Agents
2024
Mitigating the Alignment Tax of RLHF
Yong Lin, Hangyu Lin, Wei Xiong, Shizhe Diao, Jianmeng Liu, Jipeng Zhang, Rui Pan, Haoxiang Wang, Wenbin Hu, Hanning Zhang, Hanze Dong, Renjie Pi, Han Zhao, Nan Jiang, Heng Ji, Yuan Yao, Tong Zhang.
EMNLP 2024
[bib]
Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization
Renjie Pi, Tianyang Han, Wei Xiong, Jipeng Zhang, Runtao Liu, Rui Pan, Tong Zhang.
ECCV 2024
[bib]
VL-GenRM: Enhancing Vision-Language Verification via Vision Experts and Iterative Training
Jipeng Zhang, Kehao Miao, Renjie Pi, Runtao Liu, Zhaowei Wang, Rui Pan, Tong Zhang.
arXiv 2025
[bib]
ExeSQL: Self-Taught Text-to-SQL Models with Execution-Driven Bootstrapping for SQL Dialects
Jipeng Zhang, Haolin Yang, Kehao Miao, Ruiyuan Zhang, Renjie Pi, Jiahui Gao, Xiaofang Zhou.
arXiv 2025
[bib]
2023
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Hanze Dong, Wei Xiong, Deepanshu Goyal, Rui Pan, Shizhe Diao, Jipeng Zhang, Kashun Shum, Tong Zhang.
TMLR 2023
[bib]
DetGPT: Detect What You Need via Reasoning
Renjie Pi, Jiahui Gao, Shizhe Diao, Rui Pan, Hanze Dong, Jipeng Zhang, Lewei Yao, Jianhua Han, Hang Xu, Lingpeng Kong, Tong Zhang.
EMNLP 2023
[bib]
Pretraining and Training Methods/Frameworks for LLMs/VLMs
2024
Fox Foundation Model: A Pioneering Small Language Model (SLM) for Cloud and Edge
Zijian Hu*, Jipeng Zhang*, Rui Pan*, Zhaozhuo Xu, Shanshan Han, Han Jin, Alay Dilipbhai Shah, Dimitris Stripelis, Yuhang Yao, Salman Avestimehr, Chaoyang He, Tong Zhang.
arXiv 2024
[bib]
LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models
Shizhe Diao, Rui Pan, Hanze Dong, Ka Shun Shum, Jipeng Zhang, Wei Xiong, Tong Zhang.
NAACL 2024 (Best Demo Paper)
[bib]
LLM Data Techniques
2025
TAGCOS: Task-Agnostic Gradient Clustered Coreset Selection
Jipeng Zhang*, Yaxuan Qin*, Renjie Pi*, Weizhong Zhang, Rui Pan, Tong Zhang.
NAACL 2025
[bib]
Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code
Jipeng Zhang*, Jianshu Zhang*, Yuanzhe Li*, Renjie Pi, Rui Pan, Runtao Liu, Ziqiang Zheng, Tong Zhang.
ACL 2025
[bib]
An Improved Autoregressive Evaluation Paradigm for Large Language Models
Jipeng Zhang*, Rui Pan*, Yuzheng Hu*, Kashun Shum, Guanyu Yao, Xiang Liu, Renjie Pi, Hanze Dong, Shizhe Diao, Yong Lin, Han Zhao, Tong Zhang.
TIST 2025
[bib]
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting
Rui Pan, Dylan Zhang, Hanning Zhang, Xingyuan Pan, Minrui Xu, Jipeng Zhang, Renjie Pi, Xiaoyu Wang, Tong Zhang.
ACL 2025
[bib]
Reasoning
2024
TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts
Ruida Wang*, Jipeng Zhang*, Yizhen Jia*, Rui Pan, Shizhe Diao, Renjie Pi, Tong Zhang.
EMNLP 2024
[bib]
2022
MWP-BERT: Numeracy-Augmented Pre-Training for Math Word Problem Solving
Zhenwen Liang, Jipeng Zhang, Lei Wang, Wei Qin, Yunshi Lan, Jie Shao, Xiangliang Zhang.
NAACL 2022
[bib]
2020
Graph-to-Tree Learning for Solving Math Word Problems
Jipeng Zhang*, Lei Wang*, Roy Ka-Wei Lee, Yi Bin, Yan Wang, Jie Shao, Ee-Peng Lim.
ACL 2020
[bib]