NeurIPS 2025 Workshop

Emergent Trust Risks in Large Reasoning Models (TrustLLM)

NeurIPS 2025 @ San Diego Convention Center, USA, Dec 6th - Dec 7th


Schedule

Time          | Session              | Description                                              | Speaker                | Duration (mins)
Dec 06, 16:00 | Opening Remark       | Opening Remark                                           | Wei Huang              | 10
Dec 06, 16:10 | Invited Talk 1       | TBD                                                      | Yoshua Bengio          | 40
Dec 06, 16:50 | Invited Talk 2       | TBD                                                      | Ahmad Beirami          | 40
Dec 06, 17:30 | Coffee Break         | Poster Session                                           | -                      | 50
Dec 06, 18:20 | Invited Talk 3       | TBD                                                      | Furong Huang           | 40
Dec 06, 19:00 | Oral Presentation    | TBD                                                      | TBD                    | 30
Dec 06, 19:30 | Lunch                | Lunch                                                    | -                      | 90
Dec 06, 21:00 | Invited Talk 4       | TBD                                                      | Bo Li                  | 40
Dec 06, 21:40 | Invited Talk 5       | TBD                                                      | Yu Yang                | 40
Dec 06, 22:20 | Oral Presentation    | TBD                                                      | TBD                    | 30
Dec 06, 22:50 | Invited Talk 6       | TBD                                                      | Yihe Deng              | 40
Dec 06, 23:30 | Panel Discussion     | From Theory to Deployment: Process-Level Safety in LRMs | Panelists (see below)  | 40
Dec 07, 00:10 | Interactive Breakout | Breakout Session                                         | -                      | 30
Dec 07, 00:40 | Closing Remark       | Closing Remark                                           | Wei Huang              | 20

Panelists: Junchi Yan (SJTU), Zhangyang Atlas Wang (XTX Markets & UT Austin), Atsushi Nitanda (A*STAR & NTU), Maksym Andriushchenko (EPFL), Yihe Deng (UCLA)

                              Call for Papers

                              Description

                              Recent breakthroughs in large reasoning models (LRMs)—systems that perform explicit, multi-step inference at test time—have unlocked impressive gains across science, mathematics, and law (Guo et al., 2025; Snell et al., 2024). Yet these very reasoning chains introduce novel trust risks that scale-only LLM studies overlook: small logic slips can snowball, intermediate states become new attack surfaces, and plausible-but-false proofs can elude standard output-level filters (Geiping et al., 2025).
What sets this workshop apart: where prior workshops focused on generic robustness, this workshop brings together researchers and practitioners to systematically explore, characterize, and address the emergent trust risks specific to large reasoning models. We seek to move beyond traditional evaluation of final LLM outputs and focus instead on the full process of reasoning: how errors propagate, how uncertainty compounds, how transparency and accountability are affected, and what new opportunities exist for robust evaluation and intervention (Zhu et al., 2025; Ouyang et al., 2024).
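To illustrate the shift from output-level to process-level evaluation described above, the following is a minimal sketch of a step-wise checker that scores every intermediate reasoning step rather than only the final answer. The reasoning trace, the toy scoring function, and the 0.5 threshold are hypothetical placeholders introduced purely for illustration, not a proposed benchmark or any particular model's interface.

```python
# Minimal sketch: process-level (step-wise) checking vs. output-only checking.
# The trace, the toy per-step scorer, and the 0.5 threshold are hypothetical
# placeholders for illustration only.

from typing import Callable, List


def output_only_check(final_answer: str, is_acceptable: Callable[[str], bool]) -> bool:
    """Traditional filter: inspect only the final answer."""
    return is_acceptable(final_answer)


def process_level_check(steps: List[str], step_score: Callable[[str], float],
                        threshold: float = 0.5) -> dict:
    """Score every intermediate step and flag the first one below threshold.

    A single weak step early in the chain is reported even if the final
    answer looks fine, which is what output-level filters can miss.
    """
    scores = [step_score(s) for s in steps]
    first_bad = next((i for i, sc in enumerate(scores) if sc < threshold), None)
    return {"step_scores": scores, "first_flagged_step": first_bad,
            "passes": first_bad is None}


if __name__ == "__main__":
    trace = [
        "Assume x > 0.",                  # plausible premise
        "Divide both sides by (x - x).",  # logic slip: division by zero
        "Therefore the claim holds.",     # confident but unsupported conclusion
    ]
    # Toy scorer: penalize steps containing an obviously invalid operation.
    toy_score = lambda s: 0.1 if "(x - x)" in s else 0.9
    print(process_level_check(trace, toy_score))
    # An output-only check on the final sentence sees nothing wrong:
    print(output_only_check(trace[-1], lambda ans: "claim holds" in ans))
```

In this toy trace the output-only check accepts the confident final sentence, while the step-wise check flags the invalid second step; that gap between output-level and process-level scrutiny is the kind of failure the workshop targets.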
                              Core Research Questions
                              • Risk Landscape — Which process-level failure modes threaten high-stakes deployment?
                              • Metric Design — How can we measure and benchmark step-wise safety at scale?
                              • Prototype Validation — What lightweight tools or model patches can we release now to seed wider research?
                              Topics of Interest
• Failure modes and risk propagation in multi-step reasoning.
• Adversarial attacks and defenses in inference-time reasoning.
• Evaluation metrics and empirical benchmarks for reasoning safety.
• Uncertainty quantification, error detection, and correction in reasoning processes (see the sketch after this list).
• Approaches for improving interpretability and transparency of reasoning chains.
• Causal analysis as one lens for understanding reasoning failures.
• Human-AI trust, calibration, and collaborative reasoning.
• Case studies of reasoning models in real-world high-stakes domains.
• New architectures or algorithms for safer, more robust reasoning.
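As a companion to the uncertainty bullet above, here is a back-of-the-envelope sketch of how per-step uncertainty compounds over a reasoning chain. It assumes, purely for illustration, that step errors are independent, so a chain of n steps that are each correct with probability p is error-free with probability p^n; real reasoning traces violate this assumption, which is part of why better step-wise metrics are needed.

```python
# Back-of-the-envelope illustration of how per-step uncertainty compounds over
# a reasoning chain, assuming (purely for illustration) independent step errors.

def chain_success_probability(per_step_accuracy: float, num_steps: int) -> float:
    """Probability that every step in a num_steps-long chain is correct."""
    return per_step_accuracy ** num_steps


if __name__ == "__main__":
    for n in (1, 5, 10, 20, 50):
        print(f"{n:>2} steps at 98% per-step accuracy -> "
              f"{chain_success_probability(0.98, n):.2%} chance of an error-free chain")
```

Even at 98% per-step accuracy, a 50-step chain is error-free only about 36% of the time, which is why output-level accuracy alone understates process-level risk.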

                              Submission Guide

                              Submission Instructions

We invite submissions of up to 8 pages (excluding references and appendices). The main manuscript and any appendices must be submitted as a single PDF file through the OpenReview portal (the official workshop venue link will be posted on the website). Papers previously published at other conferences will not be considered for acceptance.
Each paper will receive at least 3 reviews. Reviewers declare conflicts of interest via the OpenReview interface; organizers will not handle papers with which they have a conflict.

                              Timelines

Submission Deadline      | Aug 22nd, 2025
Reviewer Bidding Dates   | Aug 22nd - 26th, 2025
Review Deadline          | Sept 17th, 2025
Paper Notification Date  | Sept 22nd, 2025
Camera-Ready Deadline    | Nov 5th, 2025

All deadlines are 23:59 AoE (Anywhere on Earth).

                              Program Committee

                              Zijun Chen (SJTU)
                              Yi Liu (CityU HK)
                              Yi Ding (Purdue)
                              Zhuang Luo (Meta)
                              Bowei He (CityU HK, MBZUAI)
                              Zachary Yang (McGill)
                              Nathan Roll (Stanford)
                              Weijie Xu (Amazon)
                              Yihao Zhang (PKU)
                              Jungang Li (HKUST)
                              Kejia Chen (ZJU)
                              Yuechen Jiang (UH Mānoa)
                              Yang Li (SJTU)
                              Shaobo Wang (SJTU)
                              Jianhao Huang (SJTU)
                              Qibing Ren (SJTU)
                              Tongcheng Zhang (SJTU)
                              Xiaoxuan Lei (McGill)
                              Shan Chen (Harvard)
                              Xueying Ma (Columbia)
Junqi Jiang (ICL)
                              Qin Liu (UC Davis)
                              Weiwei Jiang (BUPT)
                              Zhexu Liu (RPI)
                              Hongyi Liu (Amazon)
                              Lingyun Wang (Pitt)
                              Yingzhou Lu (Stanford)
                              Yicheng Fu (Stanford)
                              Xinhe Xu (UIUC)
                              Mingyang Wang (LMU)
                              Andy Su (Meta & Princeton)
                              Jishen Yang (Amazon)
                              Simin Fan (EPFL)
                              Yibo Miao (CAS)
                              Gyuwan Kim (UCSB)
                              Shraddha Barke (Microsoft)
                              Xiangjun Dong (Texas A&M)
                              Atharva Kulkarni (USC)
                              Ruixuan Liu (Emory)
                              Zhexin Zhang (THU)
                              Caihua Li (Yale)
                              Yongyi Yang (UMich)
                              Mao Cheng (Meta)
                              Huaijin Wu (SJTU)
                              Lei Yu (Meta)
                              He Zhang (RMIT)
                              Saptarshi Saha (ISIK)
                              Xia Song (NTU)
Fiez Hussein Khleaf (Sakarya)
                              Ryotaro Kawata (UTokyo)
                              Rei Higuchi (UTokyo)
                              Naoki Nishikawa (UTokyo)
                              Chengyuan Deng (Rutgers)
                              Fanhu Zeng (CAS)
                              Zhiwei Xu (UMich)
                              Binh Nguyen (NUS)
                              Yunting Yin (EMU)
                              Jiawei Yan (Roche)
                              Jiahao Yuan (ECNU)
                              Shurui Li (Waymo)
                              Siqi Cao (JHU)
                              Youngsun Lim (Boston)

                              Frequently Asked Questions

                              Can we submit a paper that will also be submitted to NeurIPS 2025?

                              Yes.

                              Can we submit a paper that was accepted at ICML 2025?

No. ICML prohibits main-conference publications from appearing concurrently at workshops.

                              Will the reviews be made available to authors?

                              Yes.

I have a question not addressed here; whom should I contact?

Email the organizers at trustllm-workshop@googlegroups.com

                              References

[1] Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X. and Zhang, X., 2025. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint arXiv:2501.12948.

                              [2] Snell, C., Lee, J., Xu, K. and Kumar, A., 2024. Scaling LLM test-time compute optimally can be more effective than scaling model parameters. arXiv preprint arXiv:2408.03314.

                              [3] Muennighoff, N., Yang, Z., Shi, W., Li, X.L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E. and Hashimoto, T., 2025. s1: Simple test-time scaling. arXiv preprint arXiv:2501.19393.

                              [4] Geiping, J., McLeish, S., Jain, N., Kirchenbauer, J., Singh, S., Bartoldson, B.R., Kailkhura, B., Bhatele, A. and Goldstein, T., 2025. Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach. arXiv preprint arXiv:2502.05171.

[5] Zhu, J., Yan, L., Wang, S., Yin, D. and Sha, L., 2025. Reasoning-to-defend: Safety-aware reasoning can defend large language models from jailbreaking. arXiv preprint arXiv:2502.12970.