🙋 About Me
I am a third-year Ph.D. student at the University of Cambridge, supervised by Prof. Nigel Collier. Previously, I completed my M.Phil. at Cambridge, focusing on fact-checking under the guidance of Prof. Andreas Vlachos and Dr. Zhijiang Guo.
📢 I am currently interning at Google. During my Ph.D., I have also completed research internships at Microsoft Research, J.P. Morgan AI Research, and Tencent AI Lab.
During my undergraduate studies, I interned at PlusLab (UCLA) working with Dr. Nanyun Peng and Dr. Te-Lin Wu, and completed my capstone project with Prof. Wenjie Li and Dr. Yongqi Li at PolyU NLP Group.
📚 Education
- 2023.10 - Present: University of Cambridge, Ph.D. in Computation, Cognition and Language.
- 2022.10 - 2023.06: University of Cambridge, M.Phil. in Advanced Computer Science.
- 2018.09 - 2022.06: The Hong Kong Polytechnic University, B.Sc. in Computing.
Undergraduate Scholarships:
- HKSAR Government Scholarship 2020/21 and 2021/22 (HKD 160,000, around USD 20,500)
- Commercial Radio 50th Anniversary Scholarship 2019/20 (HKD 80,000, around USD 10,250)
- The Hong Kong Polytechnic University Scholarship 2019/20 (HKD 40,000, around USD 5,125)
- Wong Tit-shing Student Exchange Scholarship 2020/21 (HKD 20,000, around USD 2,560)
- WKF Foundation Service-Learning Scholarship 2020/21 (HKD 16,600, around USD 2,125)
- Wei Lun Foundation Scholarship 2020/21 (HKD 16,600, around USD 2,125)
- Tellhow Group Scholarship 2018/19 (CNY 10,000, around USD 1,399)
- Rennie's Mill Student Aid Project Alumni Association Scholarship 2019/20 (HKD 10,000, around USD 1,250)
- V.K. Hsu & Sons Foundations Ltd. Scholarship 2019/20 (HKD 10,000, around USD 1,250)
- HKMA IT Management Club Scholarship 2021/22 (HKD 5,000, around USD 640)
- Proof-of-Concept (POC) Funding Scheme 2021/22 (HKD 5,000, around USD 640)
👨‍💻 Internships
- Google; Student Researcher (Full-time); ongoing
- Microsoft Research; Research Intern (Full-time); 3 months, 2025
  - Focused on efficient model routing in long-horizon agentic workflows.
- J.P. Morgan AI Research; Research Intern (Full-time); 3 months, 2025
  - Automated factual evaluation via LLM agents for temporal analytics.
- Tencent AI Lab; Research Intern (Full-time); 6 months, 2024
  - Advanced my Ph.D. research on uncertainty estimation for long-form generation.
🗺️ Research Roadmap
💡My Research Vision: I believe that our need for uncertainty in LLMs depends on how we position them.
- As tools, LLMs need uncertainty that can be 📐measured externally for evaluation and risk control.
- As collaborators, they must be able to 💬express uncertainty in ways humans can understand and act on.
- In human-AI or multi-agent settings, uncertainty fundamentally shapes 🤝interaction.
My research studies these three connected dimensions of uncertainty in LLMs: measurement, expression, and interaction.
graph LR
%% --- NODES & HIERARCHY ---
%% Root Node
Root(("🌲 Uncertainty<br/>in LLMs"))
%% --- Branch 1: Measurement ---
Root --> Measurement("📐 Measurement")
Measurement --> M_Fact["Long-form Factuality"]
M_Fact --> LUQ["📄 LUQ: First work on long-form UQ<br/>(EMNLP '24)"]
M_Fact --> Atomic["⭐ Atomic Calibration<br/>(IJCNLP '25)"]
Measurement --> M_Reason["Reasoning"]
M_Reason --> Rome["🗺️ All Roads Lead to Rome<br/>(EMNLP '25)"]
Measurement --> M_Multi["Multilingual"]
M_Multi --> Beyond["🏗️ Beyond Final Layer<br/>(Preprint)"]
%% --- Branch 2: Expression ---
Root --> Expression("🗣️ Expression")
Expression --> E_Bench["Benchmarking"]
E_Bench --> UNCLE["📏 UNCLE: Benchmarking<br/>(EMNLP '25)"]
Expression --> E_Learn["Learning to Express"]
E_Learn --> LoGU["💬 LoGU: Linguistic Expressions<br/>(ACL '25)"]
E_Learn --> RL["🧠 RL for Verbalized Confidence<br/>(ACL '26)"]
%% --- Branch 3: Interaction ---
Root --> Interaction("🤝 Interaction")
Interaction --> I_Social["Social Influence"]
I_Social --> Conformity["👥 Uncertainty leads to conformity<br/>(ACL '25)"]
Interaction --> I_Multi["Multi-turn & Debate"]
I_Multi --> ConfMulti["🔄 Confidence in Multi-turn<br/>(ACL '26)"]
I_Multi --> MAD["🎙️ Confidence & Diversity in Debate<br/>(ACL '26)"]
%% --- LINKS ---
click LUQ "https://aclanthology.org/2024.emnlp-main.299/" "View Paper"
click Atomic "https://arxiv.org/abs/2410.13246" "View Paper"
click LoGU "https://arxiv.org/abs/2410.14309" "View Paper"
click UNCLE "https://arxiv.org/abs/2505.16922" "View Paper"
click RL "https://arxiv.org/abs/2505.23912" "View Paper"
click Beyond "https://www.arxiv.org/abs/2510.03136" "View Paper"
click Rome "https://arxiv.org/abs/2509.12908" "View Paper"
click Conformity "https://arxiv.org/abs/2410.12428" "View Paper"
click ConfMulti "https://arxiv.org/abs/2601.02179" "View Paper"
click MAD "https://arxiv.org/abs/2601.19921" "View Paper"
%% --- STYLING ---
classDef main fill:#ffffff,stroke:#03396c,stroke-width:2px,color:#03396c,font-size:14px;
classDef domain fill:#ffffff,stroke:#03396c,stroke-width:2px,rx:10,ry:10,color:#03396c,font-size:13px;
classDef label fill:#fff,stroke:none,color:#666,font-size:12px;
classDef paper fill:#fff,stroke:#ddd,stroke-width:1px,rx:5,ry:5,color:#333,font-size:12px;
%% Apply Classes
class Root main;
class Measurement,Expression,Interaction domain;
class M_Fact,M_Reason,M_Multi,E_Bench,E_Learn,I_Social,I_Multi label;
class LUQ,LoGU,UNCLE,RL,Rome,Beyond,ConfMulti,Atomic,Conformity,MAD paper;
📝 Publications
First & Co-First Papers († denotes equal contribution.)
-
Budget-Aware Agentic Routing via Boundary-Guided Training
Preprint
Caiqi Zhang, Menglin Xia, Xuchao Zhang, Daniel Madrigal, Ankur Mallick, Samuel Kessler, Victor Ruehle, Saravan Rajmohan
-
Beyond the Final Layer: Intermediate Representations for Better Multilingual Calibration in Large Language Models
Preprint
Ej Zhou†, Caiqi Zhang†, Tiancheng Hu, Chengzu Li, Nigel Collier, Ivan Vulić, Anna Korhonen
-
Reinforcement Learning for Better Verbalized Confidence in Long-Form Generation
ACL 2026 Main
Caiqi Zhang†, Xiaochen Zhu†, Chengzu Li, Nigel Collier, Andreas Vlachos
-
Confidence Estimation for LLMs in Multi-turn Interactions
ACL 2026 Findings
Caiqi Zhang†, Ruihan Yang†, Xiaochen Zhu, Chengzu Li, Tiancheng Hu, Yijiang River Dong, Deqing Yang, Nigel Collier
-
Demystifying Multi-Agent Debate: The Role of Confidence and Diversity
ACL 2026 Findings
Xiaochen Zhu†, Caiqi Zhang†, Yizhou Chi, Tom Stafford, Nigel Collier, Andreas Vlachos
-
All Roads Lead to Rome: Graph-Based Confidence Estimation for LLM Reasoning
EMNLP 2025 Main
Caiqi Zhang, Chang Shu, Ehsan Shareghi, Nigel Collier
-
UNCLE: Benchmarking Uncertainty Expressions in Long-Form Generation
EMNLP 2025 Main
Ruihan Yang†, Caiqi Zhang†, Zhisong Zhang, Xinting Huang, Dong Yu, Nigel Collier, Deqing Yang
-
Conformity in Large Language Models
ACL 2025 Main
Xiaochen Zhu†, Caiqi Zhang†, Tom Stafford, Nigel Collier, Andreas Vlachos
-
LoGU: Long-form Generation with Uncertainty Expressions
ACL 2025 Main
Ruihan Yang†, Caiqi Zhang†, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang
-
Atomic Calibration of LLMs in Long-Form Generations
ACL 2025 KnowFM Oral / AACL-IJCNLP 2025
Caiqi Zhang, Ruihan Yang, Zhisong Zhang, Xinting Huang, Sen Yang, Dong Yu, Nigel Collier
-
LUQ: Long-text Uncertainty Quantification for LLMs
EMNLP 2024 Main
Caiqi Zhang, Fangyu Liu, Marco Basaldella, Nigel Collier
-
Do We Need Language-Specific Fact-Checking Models? The Case of Chinese
EMNLP 2024 Main
Caiqi Zhang, Zhijiang Guo, Andreas Vlachos
-
TopViewRS: Vision-Language Models as Top-View Spatial Reasoners
EMNLP 2024 Main (Oral)
Chengzu Li†, Caiqi Zhang†, Han Zhou, Nigel Collier, Anna Korhonen, Ivan Vulić
Other Collaborations (Full List: Google Scholar)
-
Can Large Language Models Generate High-quality Patent Claims?
NAACL 2025 Findings
Lekang Jiang, Caiqi Zhang, Pascal A Scherz, Stephan Goetz
-
Language is All a Graph Needs
EACL 2024 Findings
Ruosong Ye, Caiqi Zhang, Runhui Wang, Shuyuan Xu, Yongfeng Zhang
-
Learning to Infer Action-Condition Dependencies from Instructional Manuals for Structural Instruction Understanding
ACL 2023 Main
Te-Lin Wu, Caiqi Zhang, Carol Hu, Alex Spangher, Nanyun (Violet) Peng
👀 More facts about me:
Volunteer Teaching
During term breaks, I volunteered on teaching trips to rural areas across Asia, including Hong Kong, Taiwan, Guilin, Ho Chi Minh City (Vietnam), Phnom Penh (Cambodia), and Trà Vinh (Vietnam). In total, I have taken part in 10+ volunteer service projects, contributing 400+ service hours and reaching 300+ students. I also joined the 2021 United Nations Millennium Fellowship to promote equal access to education.
Mandarin Debate
As a member of both the PolyU and Cambridge Mandarin Debate Teams, I competed in cities including Singapore, Shanghai, Suzhou, Nanjing, Wuhan, Changsha, Xi'an, and Chengdu. These experiences sharpened my communication and critical-thinking skills and gave me the chance to represent my universities internationally.
Less is more. - Ludwig Mies van der Rohe