The SAFER workshop tackles key challenges in medical foundation and vision-language models: hallucination detection in radiology and surgery, transparent and reliable reasoning for image-based diagnosis, and stable domain adaptation to unseen imaging environments. The workshop is organized around three core themes to address these challenges:
Theme 1: Stable and Efficient Adaptation of Medical VLMs
This theme focuses on approaches that allow medical AI models to adapt to new domains while maintaining performance and faithful reasoning. It also covers human-in-the-loop methods in which clinicians provide feedback to guide learning.
Topics of interest include, but are not limited to:
Theme 2: Faithful Reasoning and Explainability in Medical VLMs
This theme addresses methods for making AI reasoning transparent, interpretable, and aligned with clinical workflows. We highlight approaches that generate step-by-step explanations, such as chain-of-thought reasoning, visually grounded in medical images including X-rays, CT scans, MRIs, pathology slides, and surgical videos. By explicitly linking reasoning steps to image regions and clinical knowledge, these methods make AI decisions easier to inspect, understand, and trust. Clear reasoning traces can reveal which diagnoses were considered, why one was selected over others, and where uncertainty remains, supporting quality assurance, error analysis, and safer human–AI collaboration.
Topics of interest include, but are not limited to:
Theme 3: Evaluation, Safety, and Trustworthiness
This theme examines how to evaluate whether a model’s reasoning is faithful to the data, clinically grounded, and robust under domain shifts. We investigate metrics, benchmarks, and evaluation strategies that go beyond accuracy to assess reliability, uncertainty, and safety in real-world clinical settings.
Topics of interest include, but are not limited to:
Overall, SAFER brings together researchers from medical imaging, machine learning, and medical AI to discuss and share recent work on faithful reasoning, robust adaptation, verification, and evaluation of medical AI systems. In doing so, the workshop aims to advance responsible, interpretable, and clinically trustworthy AI models that can reason and adapt reliably for safer healthcare.