Wanru Zhao

I'm a PhD student in Computer Science at the University of Cambridge, advised by Prof. Nic Lane at the Cambridge Machine Learning Systems Lab (CaMLSys). I'm also a member of the Cambridge AI Safety Lab, working on AI Alignment and Interpretability. Before that, I obtained my MPhil in Advanced Computer Science, also at Cambridge.

I’m currently visiting the Vector Institute, working with Colin Raffel at the University of Toronto. I was a research intern at Microsoft Research, mentored by Alessandro Sordoni and Lucas Caccia.

My research focuses on:

  • Modular, distributed/decentralised training (model merging, Mixture-of-Experts) and decentralised inference;
  • Data attribution/selection/curation/balancing/mixing, synthetic data generation and curriculum design for foundation model training;
  • Compositional reasoning of large language models (in math and coding domains) and multi-agent systems.

Email  /  Google Scholar  /  GitHub  /  Twitter  /  Bluesky

News
  • [Jan 2026] Two conference papers accepted to ICLR 2026! See you in Rio de Janeiro 🇧🇷
  • [Sept 2025] One conference paper and one workshop paper accepted to NeurIPS 2025! See you in San Diego 🇺🇸 / Mexico City 🇲🇽
  • [Jun 2025] Two workshop papers accepted to ICML 2024 AI for Math Workshop!
  • [Jan 2025] Our workshop proposal on Modular, Collaborative and Decentralized Deep Learning accepted to ICLR 2025! See you in Singapore 🇸🇬
  • [Feb 2025] One paper accepted to AAMAS 2025!
  • [Feb 2025] One paper accepted to MLSys 2025!
  • [Jan 2025] One paper accepted to ASP-DAC 2025!
  • [Sept 2024] Two conference papers and one workshop paper accepted to NeurIPS 2024! See you in Vancouver 🇨🇦
  • [Feb 2024] One conference paper and two workshop papers accepted to ICLR 2024! See you in Vienna 🇦🇹
  • [Mar 2023] Our team won the US-UK Privacy-Enhancing Technologies Prize Challenges! We will present our solution at Innovation and Technology's Centre for Data Ethics and Innovation (CDEI) in London at the end of May. Check out the report on the Cambridge University website!
  • [Mar 2023] One paper accepted to FAccT 2023!
Selected Publications
Rethinking Data Curation in LLM Training: Online Reweighting Offers Better Generalization than Offline Methods
Wanru Zhao, Yihong Chen, Yuzhi Tang, Wentao Ma, Shengchao Hu, Shell Xu Hu, Alex Iacob, Abhinav Mehrotra, Nicholas Lane
International Conference on Learning Representations (ICLR), 2026


Learning to Solve Complex Problems via Dataset Decomposition
Wanru Zhao, Lucas Caccia, Zhengyan Shi, Minseon Kim, Xingdi Yuan, Weijia Xu, Marc-Alexandre Côté, Alessandro Sordoni
Conference on Neural Information Processing Systems (NeurIPS), 2025
Paper


CLUES: Collaborative High-Quality Data Selection for LLMs via Training Dynamics
Wanru Zhao, Hongxiang Fan, Shell Xu Hu, Wangchunshu Zhou, Bofan Chen, Nicholas Lane
Conference on Neural Information Processing Systems (NeurIPS), 2024
Paper / Code / Website


Breaking Physical and Linguistic Borders: Multilingual Federated Prompt Tuning for Low-Resource Languages
Wanru Zhao, Yihong Chen, Royson Lee, Xinchi Qiu, Yan Gao, Hongxiang Fan, Nicholas Lane
International Conference on Learning Representations (ICLR), 2024
Paper


Cascadia: A Cascade Serving System for Large Language Models
Youhe Jiang*, Fangcheng Fu*, Wanru Zhao*, Stephan Rabanser, Nicholas Lane, Binhang Yuan
International Conference on Learning Representations (ICLR), 2026
Paper


MR-BEN: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs
Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, ... , Wanru Zhao, ... , Zhijiang Guo, Jiaya Jia
Conference on Neural Information Processing Systems (NeurIPS), 2024


Prompt Tuning with Diffusion for Few-Shot Pre-trained Policy Generalization
Shengchao Hu, Wanru Zhao, Weixiong Lin, Li Shen, Ya Zhang, Dacheng Tao
International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2025


Attacks on Third-Party APIs of Large Language Models
Wanru Zhao, Vidit Khazanchi, Haodi Xing, Xuanli He, Qiongkai Xu, Nicholas Lane
arXiv preprint, 2024


Harms from Increasingly Agentic Algorithmic Systems
Alan Chan, Rebecca Salganik, Alva Markelius, ... , Wanru Zhao, ... , Umang Bhatt, Adrian Weller, David Krueger, Tegan Maharaj
ACM Conference on Fairness, Accountability, and Transparency (FAccT), 2023


Evaluating Large Language Models in Scientific Discovery
Zhangde Song, Jieyu Lu, Yuanqi Du, Botao Yu, ... , Wanru Zhao, ... , Huan Sun, Seyed Mohamad Moosavi, Chenru Duan
arXiv preprint, 2025


(This list is not comprehensive and is being updated. For a complete list of publications, please visit my Google Scholar profile.)

Selected Internships
Selected Honors and Awards
Qualcomm Innovation Fellowship Finalist, 2025
Google PhD Fellowship Finalist, 2024
UK Privacy Enhancing Technologies Challenge Rank 1, 2022
China Competition on Virtual Reality (2020) National Grand Prize, 2020
Chinese Undergraduate Mathematical Contest in Modeling (CUMCM) National First Prize, 2020
ACM International Collegiate Programming Contest (ACM-ICPC) Silver Medal, 2018
CCF National Olympiad in Informatics (NOI) Bronze Medal, 2016
Academic Services
Conference Reviewer: NeurIPS 2024-2025, ICLR 2025, ICML 2025, AISTATS 2025, COLM 2025
Journal Reviewer: TMLR, TIST
Organizing Committee: ICLR 2025 Workshop on Modular, Collaborative and Decentralized Deep Learning (MCDC@ICLR2025)

Design and source code from Jon Barron's website