Homepage - Ruizhe Chen

Ruizhe Chen

PhD student

I am currently a Ph.D. candidate in Computer Science at Zhejiang University, advised by Prof. Zuozhu Liu, and I expect to graduate in June 2026. My research centers on large-model post-training and multimodal video understanding. Previously, I contributed to Qwen3-VL with Alibaba’s Qwen team. I’ve published in top venues including NeurIPS, ICLR, ACL, EMNLP, and NAACL. I’m currently seeking positions focused on large multimodal models, especially video-LLMs.

ruizhec.21(at)intl.zju.edu.cn Google Scholar

Education

Zhejiang University

Department of Computer Science
Ph.D. Student

Sep. 2021 - present

Zhejiang University

B.S. in Electrical Engineering

Sep. 2017 - Jul. 2021

Work Experience

Qwen Team, Alibaba Group

Contributed to Qwen3-VL with a focus on video understanding and agentic RL.

Research Intern

2025

News

2025

Served as a core contributor to Qwen3-VL; technical report released.

Oct

Four papers accepted by EMNLP 2025

Aug

Four papers accepted by ACL 2025

Apr

Two papers accepted by ICLR 2025. One paper accepted by NAACL 2025.

Feb

One paper accepted by MedIA.

Jan

2021

Started Ph.D. at Zhejiang University

Aug

Selected Publications

Datasets and Recipes for Video Temporal Grounding via Reinforcement Learning

EMNLP 2025

This paper introduces a two-stage SFT+RL framework that improves Video Temporal Grounding accuracy and robustness.

[Paper]

DiffPO: Diffusion-styled Preference Optimization for Efficient Inference-Time Alignment of Large Language Models

ACL 2025

Diffusion-styled Preference Optimization (DiffPO) is a plug-and-play, policy-agnostic inference-time alignment method that aligns LLMs at the sentence level to reduce latency while improving alignment quality across benchmarks and model scales.

[Paper]

PAD: Personalized Alignment of LLMs at Decoding-Time

Ruizhe Chen, Zuozhu Liu

ICLR 2025

Large Language Models Alignment.

Learnable Privacy Neurons Localization in Language Models

Ruizhe Chen, Tianxiang Hu, Zuozhu Liu

The 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024 main) 2024

Large Language Models Safety (Privacy).

Fast Model Debias with Machine Unlearning

Ruizhe Chen, Jianfei Yang, Zuozhu Liu

Advances in Neural Information Processing Systems 2023

DL Fairness, Large Language Models Fairness, Machine Unlearning via Influence Function

[Paper]
