Schedule

| Time | Session | Description | Duration (mins) |
|---|---|---|---|
| Dec 06, 16:00 | Opening Remarks | Opening remarks (Wei Huang) | 10 |
| Dec 06, 16:10 | Invited Talk 1 | TBD (Yoshua Bengio) | 40 |
| Dec 06, 16:50 | Invited Talk 2 | TBD (Ahmad Beirami) | 40 |
| Dec 06, 17:30 | Coffee Break | Poster session | 50 |
| Dec 06, 18:20 | Invited Talk 3 | TBD (Furong Huang) | 40 |
| Dec 06, 19:00 | Oral Presentation | TBD | 30 |
| Dec 06, 19:30 | Lunch | Lunch | 90 |
| Dec 06, 21:00 | Invited Talk 4 | TBD (Bo Li) | 40 |
| Dec 06, 21:40 | Invited Talk 5 | TBD (Yu Yang) | 40 |
| Dec 06, 22:20 | Oral Presentation | TBD | 30 |
| Dec 06, 22:50 | Invited Talk 6 | TBD (Yihe Deng) | 40 |
| Dec 06, 23:30 | Panel Discussion | From Theory to Deployment: Process-Level Safety in LRMs. Panelists: Junchi Yan (SJTU), Zhangyang Atlas Wang (XTX Markets & UT Austin), Atsushi Nitanda (A*STAR & NTU), Maksym Andriushchenko (EPFL), Yihe Deng (UCLA) | 40 |
| Dec 07, 00:10 | Interactive Breakout | Breakout session | 30 |
| Dec 07, 00:40 | Closing Remarks | Closing remarks (Wei Huang) | 20 |
Call for Papers
Description
- Risk Landscape — Which process-level failure modes threaten high-stakes deployment?
- Metric Design — How can we measure and benchmark step-wise safety at scale?
- Prototype Validation — What lightweight tools or model patches can we release now to seed wider research?
Topics of interest include (but are not limited to):
- Failure modes and risk propagation in multi-step reasoning.
- Adversarial attacks and defenses in inference-time reasoning.
- Evaluation metrics and empirical benchmarks for reasoning safety.
- Uncertainty quantification, error detection, and correction in reasoning processes.
- Approaches for improving the interpretability and transparency of reasoning chains.
- Causal analysis as a lens for understanding reasoning failures.
- Human-AI trust, calibration, and collaborative reasoning.
- Case studies of reasoning models in real-world high-stakes domains.
- New architectures and algorithms for safer, more robust reasoning.
Submission Guide
Submission Instructions
Timelines

| Milestone | Date |
|---|---|
| Submission deadline | Aug 22nd, 2025 |
| Reviewer bidding | Aug 22nd - 26th, 2025 |
| Review deadline | Sept 17th, 2025 |
| Paper notification | Sept 22nd, 2025 |
| Camera-ready deadline | Nov 5th, 2025 |
Program Committee
Frequently Asked Questions
Can we submit a paper that will also be submitted to NeurIPS 2025?
Yes.
Can we submit a paper that was accepted at ICML 2025?
No. ICML prohibits papers published at its main conference from appearing concurrently at workshops.
Will the reviews be made available to authors?
Yes.
I have a question not addressed here, whom should I contact?
Email the organizers at trustllm-workshop@googlegroups.com.