09:17
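arXiv cs.AI@Marc Aubreville, Jonas Ammeling, Sweta Banerjee, Viktoria Weiss, Taryn A. Donovan, Robert Klopfleisch, Jiaqi Lv, Shan E Ahmed Raza, Raphaël Bourgade, Thomas Walter, Yasemin Topuz, Songül Varlı, Charles-Antoine Collins-Fekete, Zhuoyan Shen, Navya Sri Kelam, Nitin Singhal, Christian Marzahl, Brian Napora, Tengyou Xu, Hongyan Gu, Mario Vento, Gennaro Percannella, Norbert Ropiak, Izabela Wasiak, Jie Xiao, Shaojun Liu, Seungho Choe, April Khademi, Vidushi Walia, Sujatha Kotte, Andrew Broad, Alex Wright, Guillaume Balezo, Esha Sadia Nasir, Mostafa Jahanifar, Yosuke Yamagishi, Shouhei Hanaoka, Mattia Sarno, Francesco Tortorella, Biwen Meng, Jingxin Liu, Sara Krauss, Daniel Hieber, Lavish Ramchandani, Dev Kumar Das, Mieko Ochi, Yuan Bae, Piotr Giedziun, Mateusz Maniewski, Vangala Govindakrishnan Saipradeep, Naveen Sivadasan, Leire Benito-Del-Valle, Adrian Galdran, Kaustubh Atey, Sameer Anand Jha, Adinath Dukre, Imran Razzak, Maxime W. Lafarge, Viktor H. Koelzer, Nils Porsche, Nikolas Stathonikos, Mitko Veta, Dominik Hirling, Zsanett Zsófia Iván, Peter Horvath, Katharina Breininger, Christof A. Bertram The MIDOG 2025 challenge aims to evaluate how well mitosis detection algorithms generalize in real-world settings, going beyond earlier benchmarks that focused only on scanner variation. The challenge built a test dataset of 365 cases spanning 12 human, canine, and feline tumor types, and introduced detection in randomly selected tissue regions and in challenging regions, along with an atypical mitotic figure classification task. Results show that models performing well in conventional hotspot regions degrade markedly in challenging regions, with a threefold increase in the false positive rate, and that performance varies widely across tumor types, exposing "blind spots" in current models. Ensembling improved the F1 score by 1.5 percentage points and balanced accuracy by 1.3 percentage points on average, whereas test-time augmentation brought no clear improvement. The challenge shows that real-world mitosis detection remains a major open problem, and that a multi-context evaluation framework offers a more realistic proxy for clinical reliability.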
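Why we recommend it: Pathology AI teams and computational pathology researchers, take note: MIDOG 2025 exposes how fragile current mitosis detection models are in the real world, especially on rare tumor types and in challenging regions. If your model only performs well in hotspot regions, click through to see where the blind spots are and how ensembling delivers a consistent gain.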