NeurIPS 2025 Workshop

Emergent Trust Risks in Large Reasoning Models (TrustLLM)

NeurIPS 2025 @ San Diego Convention Center, USA, Dec 6th - Dec 7th


Schedule

Time          | Session              | Description                                              | Speaker                | Duration (mins)
Dec 06, 16:00 | Opening Remark       | Opening Remark                                           | Wei Huang              | 10
Dec 06, 16:10 | Invited Talk 1       | TBD                                                      | Yoshua Bengio          | 40
Dec 06, 16:50 | Invited Talk 2       | TBD                                                      | Ahmad Beirami          | 40
Dec 06, 17:30 | Coffee Break         | Poster Session                                           | -                      | 50
Dec 06, 18:20 | Invited Talk 3       | TBD                                                      | Furong Huang           | 40
Dec 06, 19:00 | Oral Presentation    | TBD                                                      | TBD                    | 30
Dec 06, 19:30 | Lunch                | Lunch                                                    | -                      | 90
Dec 06, 21:00 | Invited Talk 4       | TBD                                                      | Bo Li                  | 40
Dec 06, 21:40 | Invited Talk 5       | TBD                                                      | Yu Yang                | 40
Dec 06, 22:20 | Oral Presentation    | TBD                                                      | TBD                    | 30
Dec 06, 22:50 | Invited Talk 6       | TBD                                                      | Yihe Deng              | 40
Dec 06, 23:30 | Panel Discussion     | From Theory to Deployment: Process-Level Safety in LRMs | Panelists (see below)  | 40
Dec 07, 00:10 | Interactive Breakout | Breakout Session                                         | -                      | 30
Dec 07, 00:40 | Closing Remark       | Closing Remark                                           | Wei Huang              | 20

Panelists: Junchi Yan (SJTU), Zhangyang Atlas Wang (XTX Markets & UT Austin), Atsushi Nitanda (A*STAR & NTU), Maksym Andriushchenko (EPFL), Yihe Deng (UCLA)

                              Call for Papers

                              Description

                              Recent breakthroughs in large reasoning models (LRMs)—systems that perform explicit, multi-step inference at test time—have unlocked impressive gains across science, mathematics, and law (Guo et al., 2025; Snell et al., 2024). Yet these very reasoning chains introduce novel trust risks that scale-only LLM studies overlook: small logic slips can snowball, intermediate states become new attack surfaces, and plausible-but-false proofs can elude standard output-level filters (Geiping et al., 2025).
What sets this workshop apart: where prior workshops focused on generic robustness, this workshop brings together researchers and practitioners to systematically explore, characterize, and address the emergent trust risks specific to large reasoning models. We seek to move beyond traditional evaluation of final LLM outputs and focus instead on the full process of reasoning: how errors propagate, how uncertainty compounds, how transparency and accountability are affected, and what new opportunities exist for robust evaluation and intervention (Zhu et al., 2025; Ouyang et al., 2024).
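To illustrate the shift from output-level to process-level evaluation described above, the following is a minimal sketch of a step-wise checker that scores every intermediate reasoning step rather than only the final answer. The reasoning trace, the toy scoring function, and the 0.5 threshold are hypothetical placeholders introduced purely for illustration, not a proposed benchmark or any particular model's interface.

```python
# Minimal sketch: process-level (step-wise) checking vs. output-only checking.
# The trace, the toy per-step scorer, and the 0.5 threshold are hypothetical
# placeholders for illustration only.

from typing import Callable, List


def output_only_check(final_answer: str, is_acceptable: Callable[[str], bool]) -> bool:
    """Traditional filter: inspect only the final answer."""
    return is_acceptable(final_answer)


def process_level_check(steps: List[str], step_score: Callable[[str], float],
                        threshold: float = 0.5) -> dict:
    """Score every intermediate step and flag the first one below threshold.

    A single weak step early in the chain is reported even if the final
    answer looks fine, which is what output-level filters can miss.
    """
    scores = [step_score(s) for s in steps]
    first_bad = next((i for i, sc in enumerate(scores) if sc < threshold), None)
    return {"step_scores": scores, "first_flagged_step": first_bad,
            "passes": first_bad is None}


if __name__ == "__main__":
    trace = [
        "Assume x > 0.",                  # plausible premise
        "Divide both sides by (x - x).",  # logic slip: division by zero
        "Therefore the claim holds.",     # confident but unsupported conclusion
    ]
    # Toy scorer: penalize steps containing an obviously invalid operation.
    toy_score = lambda s: 0.1 if "(x - x)" in s else 0.9
    print(process_level_check(trace, toy_score))
    # An output-only check on the final sentence sees nothing wrong:
    print(output_only_check(trace[-1], lambda ans: "claim holds" in ans))
```

In this toy trace the output-only check accepts the confident final sentence, while the step-wise check flags the invalid second step; that gap between output-level and process-level scrutiny is the kind of failure the workshop targets.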
                              Core Research Questions
                              • Risk Landscape — Which process-level failure modes threaten high-stakes deployment?
                              • Metric Design — How can we measure and benchmark step-wise safety at scale?
                              • Prototype Validation — What lightweight tools or model patches can we release now to seed wider research?
                              Topics of Interest
• Failure modes and risk propagation in multi-step reasoning.
• Adversarial attacks and defenses in inference-time reasoning.
• Evaluation metrics and empirical benchmarks for reasoning safety.
• Uncertainty quantification, error detection, and correction in reasoning processes (see the sketch after this list).
• Approaches for improving interpretability and transparency of reasoning chains.
• Causal analysis as one lens for understanding reasoning failures.
• Human-AI trust, calibration, and collaborative reasoning.
• Case studies of reasoning models in real-world high-stakes domains.
• New architectures or algorithms for safer, more robust reasoning.
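As a companion to the uncertainty bullet above, here is a back-of-the-envelope sketch of how per-step uncertainty compounds over a reasoning chain. It assumes, purely for illustration, that step errors are independent, so a chain of n steps that are each correct with probability p is error-free with probability p^n; real reasoning traces violate this assumption, which is part of why better step-wise metrics are needed.

```python
# Back-of-the-envelope illustration of how per-step uncertainty compounds over
# a reasoning chain, assuming (purely for illustration) independent step errors.

def chain_success_probability(per_step_accuracy: float, num_steps: int) -> float:
    """Probability that every step in a num_steps-long chain is correct."""
    return per_step_accuracy ** num_steps


if __name__ == "__main__":
    for n in (1, 5, 10, 20, 50):
        print(f"{n:>2} steps at 98% per-step accuracy -> "
              f"{chain_success_probability(0.98, n):.2%} chance of an error-free chain")
```

Even at 98% per-step accuracy, a 50-step chain is error-free only about 36% of the time, which is why output-level accuracy alone understates process-level risk.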

                              Submission Guide

                              Submission Instructions

We invite submissions of up to 8 pages (excluding references and appendices). The main manuscript and any appendices must be submitted as a single PDF file through the OpenReview portal (the official workshop venue link will be posted on the website). Papers previously published at other conferences will not be considered for acceptance.
Each paper will receive at least 3 reviews. Reviewers declare conflicts of interest via the OpenReview interface; organizers will not handle papers with which they have a conflict.

                              Timelines

Submission Deadline      | Aug 22nd, 2025
Reviewer Bidding Dates   | Aug 22nd - 26th, 2025
Review Deadline          | Sept 17th, 2025
Paper Notification Date  | Sept 22nd, 2025
Camera-Ready Deadline    | Nov 5th, 2025

All deadlines are 23:59 AoE (Anywhere on Earth).

                              Program Committee

                              Zijun Chen (SJTU)
                              Yi Liu (CityU HK)
                              Yi Ding (Purdue)
                              Zhuang Luo (Meta)
                              Bowei He (CityU HK, MBZUAI)
                              Zachary Yang (McGill)
                              Nathan Roll (Stanford)
                              Weijie Xu (Amazon)
                              Yihao Zhang (PKU)
                              Jungang Li (HKUST)
                              Kejia Chen (ZJU)
                              Yuechen Jiang (UH Mānoa)
                              Yang Li (SJTU)
                              Shaobo Wang (SJTU)
                              Jianhao Huang (SJTU)
                              Qibing Ren (SJTU)
                              Tongcheng Zhang (SJTU)
                              Xiaoxuan Lei (McGill)
                              Shan Chen (Harvard)
                              Xueying Ma (Columbia)
Junqi Jiang (ICL)
                              Qin Liu (UC Davis)
                              Weiwei Jiang (BUPT)
                              Zhexu Liu (RPI)
                              Hongyi Liu (Amazon)
                              Lingyun Wang (Pitt)
                              Yingzhou Lu (Stanford)
                              Yicheng Fu (Stanford)
                              Xinhe Xu (UIUC)
                              Mingyang Wang (LMU)
                              Andy Su (Meta & Princeton)
                              Jishen Yang (Amazon)
                              Simin Fan (EPFL)
                              Yibo Miao (CAS)
                              Gyuwan Kim (UCSB)
                              Shraddha Barke (Microsoft)
                              Xiangjun Dong (Texas A&M)
                              Atharva Kulkarni (USC)
                              Ruixuan Liu (Emory)
                              Zhexin Zhang (THU)
                              Caihua Li (Yale)
                              Yongyi Yang (UMich)
                              Mao Cheng (Meta)
                              Huaijin Wu (SJTU)
                              Lei Yu (Meta)
                              He Zhang (RMIT)
                              Saptarshi Saha (ISIK)
                              Xia Song (NTU)
Fiez Hussein Khleaf (Sakarya)
                              Ryotaro Kawata (UTokyo)
                              Rei Higuchi (UTokyo)
                              Naoki Nishikawa (UTokyo)
                              Chengyuan Deng (Rutgers)
                              Fanhu Zeng (CAS)
                              Zhiwei Xu (UMich)
                              Binh Nguyen (NUS)
                              Yunting Yin (EMU)
                              Jiawei Yan (Roche)
                              Jiahao Yuan (ECNU)
                              Shurui Li (Waymo)
                              Siqi Cao (JHU)
                              Youngsun Lim (Boston)

                              Frequently Asked Questions

                              Can we submit a paper that will also be submitted to NeurIPS 2025?

                              Yes.

                              Can we submit a paper that was accepted at ICML 2025?

No. ICML prohibits main-conference publications from appearing concurrently at workshops.

                              Will the reviews be made available to authors?

                              Yes.

I have a question not addressed here; whom should I contact?

Email the organizers at trustllm-workshop@googlegroups.com

                              References

[1] Guo, D., Yang, D., Zhang, H., Song, J., Zhang, R., Xu, R., Zhu, Q., Ma, S., Wang, P., Bi, X. and Zhang, X., 2025. DeepSeek-R1: Incentivizing reasoning capability in LLMs via reinforcement learning. arXiv preprint arXiv:2501.12948.

                              [2] Snell, C., Lee, J., Xu, K. and Kumar, A., 2024. Scaling LLM test-time compute optimally can be more effective than scaling model parameters. arXiv preprint arXiv:2408.03314.

                              [3] Muennighoff, N., Yang, Z., Shi, W., Li, X.L., Fei-Fei, L., Hajishirzi, H., Zettlemoyer, L., Liang, P., Candès, E. and Hashimoto, T., 2025. s1: Simple test-time scaling. arXiv preprint arXiv:2501.19393.

                              [4] Geiping, J., McLeish, S., Jain, N., Kirchenbauer, J., Singh, S., Bartoldson, B.R., Kailkhura, B., Bhatele, A. and Goldstein, T., 2025. Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach. arXiv preprint arXiv:2502.05171.

[5] Zhu, J., Yan, L., Wang, S., Yin, D. and Sha, L., 2025. Reasoning-to-defend: Safety-aware reasoning can defend large language models from jailbreaking. arXiv preprint arXiv:2502.12970.