Brain Tumor Segmentation Using a Transformer Encoder and Adaptive Attention Modules

Article type: Research Article

Authors

1 PhD Candidate, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran

2 Associate Professor, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran

3 Assistant Professor, School of Mathematics, Statistics and Actuarial Science, University of Essex, Colchester, United Kingdom

10.22034/abmir.2026.24013.1189

Abstract

Accurate, automatic segmentation of brain tumors from magnetic resonance images plays a vital role in diagnosis, treatment planning, and disease monitoring. CNN-based architectures are strong at extracting local features but are limited in capturing the global context of an image. Transformer models excel at modeling long-range, global dependencies and thereby address this limitation of CNN-based models. In this paper, we propose a novel hybrid U-shaped architecture that draws on the strengths of both approaches. Our model uses a pre-trained Swin-Transformer backbone as the encoder to extract hierarchical features rich in global context. The main novelty of the architecture is the introduction of two advanced spatial attention modules that refine the encoder features and adapt them to the medical domain, together with a channel-attention-assisted upsampling module in the decoder that adaptively re-weights the information arriving through the skip connections. Evaluations on the challenging BRISC dataset show that the proposed method achieves 80.6% weighted IoU and 88.6% Dice, outperforming previous state-of-the-art models and demonstrating the effectiveness of combining a Transformer with dual attention mechanisms.
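The abstract does not describe the internals of the channel-attention step that re-weights skip-connection features. As a rough illustration only, the sketch below uses a generic squeeze-and-excitation-style gate as a stand-in. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the toy feature map are all hypothetical and are not taken from the paper.

```python
import math


def channel_attention(features, w1, w2):
    """Re-weight the channels of a feature map with a squeeze-and-excitation-style gate.

    features: list of C channels, each an H x W nested list of floats.
    w1: bottleneck weights, shape R x C (hypothetical, normally learned).
    w2: expansion weights, shape C x R (hypothetical, normally learned).
    Returns (scaled_features, gates).
    """
    # Squeeze: global average pooling gives one summary value per channel.
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in features]
    # Excitation, step 1: linear bottleneck followed by ReLU.
    hidden = [max(0.0, sum(w * p for w, p in zip(row, pooled))) for row in w1]
    # Excitation, step 2: linear expansion followed by a sigmoid, one gate per channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch] for ch, g in zip(features, gates)]
    return scaled, gates


# Toy skip-connection features: 2 channels of size 2 x 2 (illustrative values).
skip = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[3.0, 3.0], [3.0, 3.0]],
]
w1 = [[0.5, 0.5]]      # R = 1, C = 2
w2 = [[1.0], [-1.0]]   # C = 2, R = 1
scaled, gates = channel_attention(skip, w1, w2)
print("gates:", [round(g, 4) for g in gates])
```

In a trained network the two weight matrices would be learned, so that channels carrying tumor-relevant detail receive gates near 1 and less useful channels are suppressed before being merged into the decoder.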


Article title [English]

Tumor Segmentation in MRI using a Transformer Encoder and Adaptive Attention Modules

Authors [English]

  • Noor Isam Abdulnabi 1
  • Mansoor Fateh 2
  • Saideh Ferdowsi 3
1 PhD Candidate, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran
2 Associate Professor, Faculty of Computer Engineering, Shahrood University of Technology, Shahrood, Iran
3 Assistant Professor, School of Mathematics, Statistics and Actuarial Science, University of Essex, Colchester, United Kingdom
Abstract [English]

Accurate and automatic segmentation of brain tumors from Magnetic Resonance Imaging (MRI) plays a vital role in diagnosis, treatment planning, and disease monitoring. Convolutional Neural Network (CNN)-based architectures excel at extracting local features but are limited in capturing global image context, whereas Transformer models are better at modeling long-range, global dependencies and thus address this CNN limitation. In this paper, we propose a novel hybrid U-shaped architecture that combines the strengths of both approaches, using a pre-trained Swin-Transformer backbone as the encoder to extract hierarchical features rich in global context. The key innovation is the introduction of two advanced spatial attention modules that refine the encoder features and adapt them to the medical domain, along with a channel-attention-aided upsampling module in the decoder that adaptively re-weights the information received from the skip connections. Evaluations on the challenging BRISC dataset show that the proposed method outperforms previous state-of-the-art models, achieving 80.6% Weighted IoU and an 88.6% Dice coefficient, which demonstrates the effectiveness of combining the Transformer with dual attention mechanisms.
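For readers unfamiliar with the two reported metrics, the following minimal sketch computes the Dice coefficient and IoU for a single pair of binary masks. It assumes flattened 0/1 masks and does not reproduce the weighting scheme behind the reported Weighted IoU, which the abstract does not define. The function name `dice_and_iou` and the toy masks are illustrative only.

```python
def dice_and_iou(pred, target):
    """Return (dice, iou) for two equally sized binary masks given as flat 0/1 sequences."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as tumor in both the prediction and the ground truth.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - intersection
    if union == 0:
        # Both masks are empty: scored as perfect agreement (a convention assumed here).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + target_area)
    iou = intersection / union
    return dice, iou


# Toy 8-pixel masks (illustrative values, not data from the paper).
pred = [0, 1, 1, 1, 0, 0, 1, 0]
target = [0, 1, 1, 0, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred, target)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")  # prints "Dice = 0.750, IoU = 0.600"
```

Dice counts the overlap twice relative to the combined mask areas, so for a single mask pair it is never lower than IoU. Dataset-level scores depend on how per-image or per-class values are averaged.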

Keywords [English]

  • Segmentation
  • Attention Mechanism
  • Deep Learning
  • Brain Tumor
  • MRI