Graphical Abstract
Abstract
BACKGROUND AND PURPOSE: Recent advances in deep learning have shown promising results in medical image analysis and segmentation. However, most brain MRI segmentation models are limited by the size of their data sets and/or the number of structures they can identify. This study evaluates the performance of 6 advanced deep learning models in segmenting 122 brain structures from T1-weighted MRI scans, aiming to identify the most effective model for clinical and research applications.
MATERIALS AND METHODS: A total of 1510 T1-weighted MRIs were used to compare 6 deep learning models for the segmentation of 122 distinct gray matter structures: nnU-Net, SegResNet, SwinUNETR, UNETR, U-Mamba_Bot, and U-Mamba_Enc. Each model was rigorously tested for accuracy by using the dice similarity coefficient (DSC) and the 95th percentile Hausdorff distance (HD95). Additionally, the volume of each structure was calculated and compared between normal controls (NCs) and patients with Alzheimer disease (AD).
RESULTS: U-Mamba_Bot achieved the highest performance, with a median DSC of 0.9112 (interquartile range [IQR]: 0.8957, 0.9250). nnU-Net achieved a median DSC of 0.9027 [IQR: 0.8847, 0.9205] and the lowest HD95 of 1.392 [IQR: 1.174, 2.029]. All HD95 values were below 3 mm, indicating superior capability in capturing detailed brain structures accurately. Following segmentation, volume calculations were performed, and the resultant volumes of NCs and patients with AD were compared. The volume changes observed in 13 brain substructures were all consistent with those reported in the existing literature, reinforcing the reliability of the segmentation outputs.
CONCLUSIONS: This study underscores the efficacy of U-Mamba_Bot as a robust tool for detailed brain structure segmentation in T1-weighted MRI scans. The congruence of our volumetric analysis with the literature further validates the potential of advanced deep learning models to enhance the understanding of neurodegenerative diseases such as AD. Future research should consider larger data sets to validate these findings further and explore the applicability of these models in other neurologic conditions.
ABBREVIATIONS:
- AD = Alzheimer disease
- ADNI = Alzheimer’s Disease Neuroimaging Initiative
- CNN = convolutional neural network
- DSC = dice similarity coefficient
- HD95 = 95th percentile Hausdorff distance
- IQR = interquartile range
- NC = normal control
- SSM = state-space sequence model
SUMMARY
PREVIOUS LITERATURE:
Previous studies have demonstrated the utility of CNNs and hybrid transformer models in medical image segmentation, particularly in neuroimaging. U-Net–based architectures have been widely adopted for their ability to capture spatial details, while transformer models show promise in capturing global dependencies. However, many approaches still face limitations, such as computational resource demands and high memory usage, when applied to large-scale data sets. The introduction of structured state-space models, such as U-Mamba, provides a new perspective on improving both segmentation accuracy and computational efficiency in biomedical imaging.
KEY FINDINGS:
Our study found that U-Mamba_Bot outperformed other models, achieving the highest DSC of 0.9112. It also exhibited competitive training and inference times compared with other architectures.
KNOWLEDGE ADVANCEMENT:
The U-Mamba model’s integration of structured state-space mechanisms addresses some of the limitations of traditional CNN and transformer-based models, particularly in capturing long-range dependencies with lower computational cost. These findings highlight U-Mamba’s potential for enhancing neuroimaging analysis in clinical applications.
MRI can deliver superior spatial and contrast resolution and has become a cornerstone in diagnosing and treating neurologic disease. Instance segmentation refers to delineating intracranial structures (segmentation) and assigning individual labels to every structure, which is essential in studying brain MRI. It provides valuable information for structural analysis, volumetric assessment, surgical planning, and image-guided intervention. For example, several studies from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) demonstrated that alterations in intracranial structural volumes (as quantified by using segmentation tools) correlate with outcome measures in clinical trials for Alzheimer disease (AD). A study from Radue et al1 demonstrated that brain volume loss correlated with clinical and radiologic outcomes in patients with multiple sclerosis, and Mora et al2 demonstrated that patients with medial temporal lobe epilepsy exhibit a consistent pattern of gray matter atrophy on MRI, suggesting that a common pathophysiologic process may be responsible for the disease. Thus, diagnosing neurologic and neuropsychiatric diseases necessitates a comprehensive understanding of subcortical structures, and it is crucial to grasp both the structural and functional characteristics of the brain.
Automated segmentation methods have been developed to differentiate between frontotemporal dementia and AD based on MRI.3 Recent advances in deep learning have shown promising results in medical image analysis and segmentation.4 State-of-the-art methods, such as U-Net and transformers, have achieved impressive success in medical image segmentation.5 The convolutional neural network (CNN)-based U-Net architecture has been widely utilized in various fields, particularly medical image segmentation, due to its effectiveness in capturing spatial information and features. Though initially developed for natural language processing tasks, transformers have recently been adapted for medical image segmentation and offer promising results.6,7 Because U-Net better recognizes local features while transformers capture global features, many current approaches combine the two, for example, UNETR8 and SwinUNETR,9 for more accurate results and more robust performance. However, certain shortcomings still exist, such as being resource-intensive and having high memory and computational requirements.
Recently, state-space sequence models (SSMs), particularly structured SSMs, have emerged as efficient and powerful components for constructing deep networks that deliver top-tier performance in continuous long-sequence data analysis.10 Mamba improved the structured state-space sequence model by introducing a selective mechanism that allows the model to select relevant information depending on the input.11 U-Mamba was newly developed for general-purpose biomedical image segmentation as a self-adapting network based on an innovative hybrid CNN-SSM architecture. Compared with the currently popular deep learning network architectures, nnU-Net, SegResNet,12 and transformer-based SwinUNETR,9 U-Mamba achieved the best results in image segmentation for abdominal MRI, instruments in endoscopy, and cells in microscopy.13
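As a conceptual illustration, the linear recurrence at the heart of an SSM can be sketched in a few lines of NumPy. This toy, time-invariant model is for intuition only; Mamba additionally makes the state-space parameters input-dependent (the selective mechanism) and uses a hardware-aware parallel scan, neither of which is shown here.

```python
import numpy as np

def ssm_scan(A, B, C, u):
    """Run a discrete linear state-space model over an input sequence.

    x[t] = A @ x[t-1] + B @ u[t]
    y[t] = C @ x[t]
    Returns the output sequence y of shape (T, output_dim).
    """
    n = A.shape[0]
    x = np.zeros(n)
    ys = []
    for u_t in u:                  # recurrent scan over the sequence
        x = A @ x + B @ u_t
        ys.append(C @ x)
    return np.array(ys)

# Toy 2-state model applied to a length-5 scalar input sequence
A = np.array([[0.9, 0.0], [0.1, 0.8]])
B = np.array([[1.0], [0.0]])
C = np.array([[0.0, 1.0]])
u = np.ones((5, 1))
y = ssm_scan(A, B, C, u)
print(y.shape)  # (5, 1)
```

Because the state is carried forward step by step, the output at each position depends on the entire preceding input, which is how SSM layers capture long-range dependencies at linear cost in sequence length.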
In this study, we compared the newly released U-Mamba automatic segmentation model and previous state-of-the-art segmentation models for whole-brain substructure segmentation on T1-weighted MRI. We utilized the ADNI database to conduct comparative analyses of the most popular current models for medical image segmentation for almost all cortical subregions and nuclei of the human brain. This sets the stage for further understanding the relationship between brain structural changes and diseases, uncovering unknown disease mechanisms, and providing highly automated and robust tools. We believe these have significant potential for clinical application.
MATERIALS AND METHODS
Data Collection
Data used in the preparation of this article were obtained from the ADNI database (http://adni.loni.usc.edu). The ADNI was launched in 2003 as a public-private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, PET, other biologic markers, and clinical and neuropsychological assessment can be combined to measure the progression of mild cognitive impairment and early AD.
This study analyzed 20,056 randomly selected T1-weighted MRIs from the ADNI data sets. Because each patient typically undergoes multiple MRI scans, we retained the initial MRI from each patient in the ADNI database. We excluded cases lacking detailed information and ultimately obtained 1510 MRI scans for this study (Table 1). Normally distributed data are expressed as means (standard deviation), and non-normally distributed data are expressed as medians with interquartile range (median [IQR: 25th, 75th]).
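The retain-the-initial-scan step described above can be sketched as follows. The record fields, patient IDs, and file names are purely illustrative, not the actual ADNI schema.

```python
from datetime import date

# Hypothetical scan records: (patient_id, acquisition_date, path)
scans = [
    ("002_S_0295", date(2006, 4, 18), "scan_a.nii.gz"),
    ("002_S_0295", date(2007, 5, 2),  "scan_b.nii.gz"),
    ("011_S_0021", date(2005, 9, 30), "scan_c.nii.gz"),
]

def first_scan_per_patient(records):
    """Keep only the earliest scan for each patient ID."""
    earliest = {}
    for pid, acq_date, path in records:
        if pid not in earliest or acq_date < earliest[pid][0]:
            earliest[pid] = (acq_date, path)
    return {pid: path for pid, (_, path) in earliest.items()}

kept = first_scan_per_patient(scans)
print(kept)  # {'002_S_0295': 'scan_a.nii.gz', '011_S_0021': 'scan_c.nii.gz'}
```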
Demographics of the data set obtained for the study
Data Labeling and Preprocessing
To generate ground truth labels (pixel-level segmentation masks), we linearly registered the Mayo Clinic Adult Lifespan Template and Atlas to every volume by using the ANTs tool (https://github.com/ANTsX/ANTs), resulting in 122 gray matter structures.14,34 Board-certified radiologists visually inspected all volumes to ensure good image quality and proper registration. We divided the data set into 768 volumes for training, 192 for validation, and 550 for testing (all from different patients) for model evaluation. Table 1 summarizes the demographics and characteristics of the patients in this data set.
Because our study focuses on brain segmentation, nonbrain tissue must be removed to improve model performance. A deep learning-based brain extraction tool, HD-BET,15 was applied to the T1-weighted MR images.
Model Training
All of the deep learning models were adapted to work within the nnU-Net framework. nnU-Net is a deep learning-based segmentation method that automatically configures and runs the entire segmentation pipeline for any biomedical image data set, including preprocessing, data augmentation, model training, and postprocessing. The pipeline handles hyperparameter tuning and does not require any changes to the network architecture. Therefore, it provides a perfect environment for comparing U-Mamba with other methods. Also, it enables U-Mamba to be easily adapted to a wide range of segmentation tasks. In the nnU-Net framework, the patch size is 128 × 128 × 128 with a batch size equal to 2. The Adam optimizer with an initial learning rate of 0.01 was used to optimize network weights, with the momentum set at 0.99. An empirical combination of Dice loss with cross-entropy loss in nnU-Net has enhanced training stability and improved segmentation accuracy.13,16 To ensure a fair comparison, we also implemented SegResNet,12 UNETR,8 and SwinUNETR9 into the nnU-Net framework and utilized nnU-Net recommended and default optimizers for model training.13 For more detailed information on how each model was adapted within the nnU-Net framework, please refer to the GitHub repository: https://github.com/wyjzll/U-Mamba. This repository includes comprehensive documentation and code examples illustrating the integration process. To ensure the reproducibility of this study and maintain consistency in the code version used, independent of updates by the original U-Mamba authors,13 we forked the original U-Mamba GitHub repository (https://github.com/bowang-lab/U-Mamba). This approach guarantees that the code remains unchanged, allowing others to reliably replicate our findings. All models were trained on 1 graphics processing unit (GPU) (NVIDIA A100 80G SXM) for 1000 epochs with random initial weights. The entire process of this study is presented in Fig 1.
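The combined Dice and cross-entropy objective mentioned above can be sketched in NumPy for intuition. This is a simplified stand-in, not nnU-Net's actual implementation, which operates on logits, weights the terms, and supports deep supervision.

```python
import numpy as np

def dice_ce_loss(probs, target_onehot, eps=1e-5):
    """Combined soft Dice + cross-entropy loss.

    probs:          (N, C) predicted class probabilities (rows sum to 1)
    target_onehot:  (N, C) one-hot ground truth labels
    Returns the unweighted sum of the two terms as a float.
    """
    # Soft Dice, averaged over classes
    inter = (probs * target_onehot).sum(axis=0)
    denom = probs.sum(axis=0) + target_onehot.sum(axis=0)
    dice = (2.0 * inter + eps) / (denom + eps)
    dice_loss = 1.0 - dice.mean()

    # Cross-entropy, averaged over voxels
    ce = -(target_onehot * np.log(probs + eps)).sum(axis=1).mean()
    return float(dice_loss + ce)

# A perfect prediction drives both terms toward 0
t = np.array([[1.0, 0.0], [0.0, 1.0]])
print(dice_ce_loss(t, t))
```

The Dice term directly optimizes region overlap (robust to class imbalance across 122 structures of very different sizes), while the cross-entropy term provides smooth per-voxel gradients, which is the rationale for combining them.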
An overview of the study flow. A, T1-weighted MR images were first processed through a brain extraction step by using the brain extraction tool HD-BET. B, The processed image then serves as the input to each of the deep learning models, which are responsible for segmenting the brain into different anatomic regions. C, The evaluation process was conducted by comparing the segmentation results for each model with the ground truth.
Evaluation
The performance of the models in segmenting all 122 brain substructures was assessed by using the dice similarity coefficient (DSC) and the 95th percentile Hausdorff distance (HD95).
The DSC is defined in equation 1:

DSC(X, Y) = 2|X ∩ Y| / (|X| + |Y|)   (1)

Here, X and Y represent the ground truth segmentation and the model segmentation, ∣⋅∣ indicates the number of elements in a set, and ∩ represents the intersection of the sets.
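The DSC can be computed directly on binary masks; a minimal NumPy sketch:

```python
import numpy as np

def dice_coefficient(x, y):
    """Dice similarity coefficient for two binary masks."""
    x = x.astype(bool)
    y = y.astype(bool)
    intersection = np.logical_and(x, y).sum()
    total = x.sum() + y.sum()
    if total == 0:              # both masks empty: define DSC as 1
        return 1.0
    return 2.0 * intersection / total

gt = np.array([[1, 1, 0], [0, 1, 0]])
pred = np.array([[1, 0, 0], [0, 1, 1]])
print(dice_coefficient(gt, pred))  # 2*2 / (3+3) ≈ 0.667
```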
The HD is defined in equation 2:

HD(A, B) = max{h(A, B), h(B, A)}   (2)

where:
- A and B are the 2 sets of points (eg, the edges of the segmented regions for the ground truth and for a model)
- h(A, B) is defined as maxa∈A minb∈B d(a, b)
- h(B, A) is defined as maxb∈B mina∈A d(b, a)
- d(a, b) is the distance between points a and b (typically the Euclidean distance).
The HD95 is calculated by taking the 95th percentile of all the computed distances rather than the maximum, helping to ignore the most extreme values that might be due to noise or other anomalies. This makes it a robust measure for assessing the accuracy of medical image segmentations, where outliers can skew the results.
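The HD95 can be sketched in NumPy as follows. Several variants exist in the literature; this one pools the directed nearest-neighbor distances from both sets and takes the 95th percentile of the pooled distances, matching the description above.

```python
import numpy as np

def hd95(a_pts, b_pts, pct=95):
    """95th percentile Hausdorff distance between two point sets.

    a_pts, b_pts: (N, D) arrays of boundary-point coordinates.
    """
    # Pairwise Euclidean distances, shape (len(a_pts), len(b_pts))
    d = np.linalg.norm(a_pts[:, None, :] - b_pts[None, :, :], axis=-1)
    d_ab = d.min(axis=1)   # each point in A to its nearest point in B
    d_ba = d.min(axis=0)   # each point in B to its nearest point in A
    return float(np.percentile(np.concatenate([d_ab, d_ba]), pct))

a = np.array([[0.0, 0.0], [1.0, 0.0]])
b = np.array([[0.0, 1.0], [1.0, 1.0]])
print(hd95(a, b))  # every nearest-neighbor distance is 1.0, so HD95 = 1.0
```

Note that the full pairwise-distance matrix is quadratic in the number of boundary points; production implementations typically use a k-d tree or a distance transform instead.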
Statistics
We used the Kolmogorov-Smirnov test to assess normality for all groups. Two-sample t tests were used to compare normally distributed data between groups; non-normally distributed data were compared by using the nonparametric Mann-Whitney U test. Statistical tests were performed by using GraphPad Prism 10.0 and SciPy 1.8.0. Results with P < .05 were indicated by asterisks.
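The test-selection logic described above can be sketched with SciPy. The helper name and the standardization step are illustrative choices, not the exact procedure used in GraphPad Prism.

```python
import numpy as np
from scipy import stats

def compare_groups(x, y, alpha=0.05):
    """Kolmogorov-Smirnov normality check, then a two-sample t test if
    both groups look normal, otherwise Mann-Whitney U.
    Returns (test_name, p_value)."""
    def is_normal(v):
        # KS test of the standardized sample against a standard normal
        z = (v - v.mean()) / v.std(ddof=1)
        return stats.kstest(z, "norm").pvalue > alpha
    if is_normal(x) and is_normal(y):
        return "t-test", stats.ttest_ind(x, y).pvalue
    return "Mann-Whitney U", stats.mannwhitneyu(x, y).pvalue

rng = np.random.default_rng(0)
name, p = compare_groups(rng.normal(0, 1, 50), rng.normal(0.2, 1, 50))
print(name, p)
```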
This article follows the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis Or Diagnosis Checklist.35
RESULTS
Model Performance
A representative mask generated by our model is shown in Fig 2. All 122 gray matter regions were segmented successfully by all of the models (that is, no structure had a DSC of 0 for any model; Supplemental Data). We have provided downloadable result samples on our GitHub page (https://github.com/wyjzll/Brain_Segmentation) for interested readers. The models evaluated were nnU-Net, SegResNet, UNETR, SwinUNETR, U-Mamba_Bot, and U-Mamba_Enc, and each model’s effectiveness was assessed by using the DSC and HD95. Success criteria of DSC > 0.9 or HD95 < 3 mm were defined to ensure high accuracy and reliability in clinical settings (Table 2). Among these models, U-Mamba_Bot showed superior segmentation accuracy, with the highest DSC of 0.9112 [IQR: 0.8957, 0.9250] and a success rate of 68.85% for DSC > 0.9. This model, however, did not have the best HD95 score, suggesting a potential trade-off between overall overlap and boundary precision. U-Mamba_Enc demonstrated competitive performance, closely matching nnU-Net in DSC (0.8968 [IQR: 0.8801, 0.9155]) but with a higher HD95 of 1.544 [IQR: 1.224, 2.318]. SegResNet, while having a competitive DSC (0.9033), exhibited a low median HD95 (1.449 mm), which could imply higher boundary accuracy compared with some other models. UNETR showed a DSC of 0.8709 [IQR: 0.8521, 0.8978], the lowest among the models tested, suggesting that it may have limitations in achieving a high degree of overlap between the predicted and true segmentations.
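The per-model success rates against the DSC > 0.9 and HD95 < 3 mm criteria reduce to a simple fraction over the 122 structures; a sketch with hypothetical per-structure scores:

```python
import numpy as np

# Hypothetical per-structure scores for one model (122 values in practice)
dsc_scores = np.array([0.92, 0.88, 0.95, 0.91, 0.89])
hd95_scores = np.array([1.2, 2.8, 0.9, 3.5, 1.1])

dsc_success = float((dsc_scores > 0.9).mean() * 100)   # % with DSC > 0.9
hd_success = float((hd95_scores < 3.0).mean() * 100)   # % with HD95 < 3 mm
print(f"DSC > 0.9: {dsc_success:.1f}%  HD95 < 3 mm: {hd_success:.1f}%")
```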
Representative images generated by the deep learning models versus human-labeled ground truth.
Results summary of trained segmentation models on T1-weighted MRI data sets
As shown in Table 3, the training times varied significantly among models. nnU-Net V2 had the longest epoch time, at 412 seconds, indicating that it may require considerable computational resources and time for training. U-Mamba_Enc had the fastest training time, completing an epoch in just 169 seconds. SegResNet and UNETR also showed relatively quick epoch times, at 183 and 191 seconds, respectively, making them suitable for environments in which faster model training is beneficial. SwinUNETR and U-Mamba_Bot fell in the middle, with epoch times of 314 and 273 seconds, respectively, balancing computational complexity and training speed.
A comparison of the epoch training times and inference time per image for different deep learning models
To assess the effectiveness of the models, we calculated the degree of correspondence between the volumes of individual brain regions in each model’s predictions and the ground truth. Figure 3 shows the performance of the various deep learning models in segmenting brain substructures. Each model’s accuracy was evaluated based on its ability to match ground truth measurements, with the results categorized into significant (P < .05) and nonsignificant (P ≥ .05) differences. The models included in the analysis were nnU-Net, SegResNet, UNETR, SwinUNETR, U-Mamba_Bot, and U-Mamba_Enc. SegResNet and U-Mamba_Bot aligned most closely with the ground truth, as indicated by the larger numbers of structures with nonsignificant differences versus ground truth (117 and 118, respectively). nnU-Net demonstrated a high discrepancy from ground truth, with 122 significant differences.
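The per-structure volumes underlying this comparison can be computed from a labeled segmentation by counting voxels per label and multiplying by the voxel volume; a minimal sketch:

```python
import numpy as np

def structure_volumes(label_map, voxel_dims_mm):
    """Volume (mm^3) of each labeled structure in a segmentation.

    label_map:      integer array, 0 = background, 1..K = structures
    voxel_dims_mm:  voxel spacing, e.g. (1.0, 1.0, 1.0) for 1 mm isotropic
    Returns {label: volume_mm3}.
    """
    voxel_vol = float(np.prod(voxel_dims_mm))
    labels, counts = np.unique(label_map, return_counts=True)
    return {int(lb): float(c) * voxel_vol
            for lb, c in zip(labels, counts) if lb != 0}

seg = np.zeros((4, 4, 4), dtype=int)
seg[:2, :2, :2] = 1          # 8 voxels of structure 1
seg[2:, 2:, 2:] = 2          # 8 voxels of structure 2
print(structure_volumes(seg, (1.0, 1.0, 1.0)))  # {1: 8.0, 2: 8.0}
```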
Comparison of the number of brain substructure volumes that had significant differences from ground truth by using different models.
Clinical Testing
AD is intricately linked to changes in brain structure and volume, characterized by brain atrophy. Understanding these alterations through neuroimaging studies is crucial for the early detection and monitoring of disease progression. In this part, we analyzed the prediction results produced by U-Mamba_Bot on the test set, which includes 134 healthy individuals and 112 patients with AD. We found that, among all the brain areas that atrophied in patients with AD, the amygdala exhibited the most significant volume reduction compared with the NC group, with the left and right sides shrinking by 13.03% and 10.03%, respectively. This was followed by the bilateral entorhinal cortex, which decreased by 8.60% on the left side and 9.33% on the right. In contrast, the caudate volumes in patients with AD increased relative to the NC group by 8.74% and 7.27%, which aligns with the findings reported in the literature.17 For ease of presentation, we show only the volume ranges of the 13 brain regions that had statistically significant differences between the NC and AD groups (Fig 4), based on the image segmentation results generated by the U-Mamba_Bot model.
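The percent changes reported above follow the usual relative-difference formula; a sketch with made-up volumes chosen only to reproduce the left-amygdala figure:

```python
def percent_change(control_vol, patient_vol):
    """Percent volume change in patients relative to controls;
    negative values indicate atrophy."""
    return (patient_vol - control_vol) / control_vol * 100.0

# Illustrative (made-up) mean volumes in mm^3
print(round(percent_change(1700.0, 1478.5), 2))  # -13.03
```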
Comparison of brain substructure volumes between NC and AD. The dual-colored bars represent quantified volumes for each respective brain substructure, highlighting significant discrepancies between the 2 groups (*P < .05, Mann-Whitney U test).
DISCUSSION
In this paper, we have chosen the latest U-Mamba model to conduct what we believe to be the most extensive substructural segmentation of the brain to date. Our comprehensive experiment highlights the variability in the accuracy of most deep learning models in replicating precise brain substructure volumes, provides insights into their reliability for clinical and research applications, and suggests model selection based on specific requirements for accuracy and margin precision in future clinical applications. The results indicate that U-Mamba’s performance surpasses that of existing CNN- and Transformer-based segmentation networks across different modes and segmentation targets. In particular, U-Mamba has significantly faster training speeds compared with CNN and Transformer architectures. This is instrumental in addressing challenges posed by the local nature of CNNs and the computational complexity of Transformers, which affect long-range modeling. This advantage is largely attributed to the architectural design of U-Mamba, which is capable of extracting multiscale local features while capturing long-range dependencies.
The nnU-Net framework, based on the U-Net architecture, has demonstrated exceptional performance in various segmentation tasks, surpassing state-of-the-art models in international biomedical image segmentation challenges. Its success can also be attributed to adaptive preprocessing, extensive data augmentation, model ensembling, and aggregating tiled predictions, which collectively contribute to its consistently high performance across a wide range of tasks.16,18 nnU-Net can be configured to execute the entire segmentation pipeline automatically; in addition, it offers a range of features that make it highly adaptable and effective across different models and tasks. No other specific data preprocessing (beyond brain extraction) is needed in this part.16 We have chosen the nnU-Net as our segmentation network backbone, which enabled us to focus on implementing the network while managing other variables like image preprocessing and data augmentation. This arrangement facilitates a fair comparison of U-Mamba with other methods under consistent conditions, where the network architecture is the sole variable that differs.
In evaluating the efficiency of different deep learning models, it is essential to consider the training times, which can vary significantly among models. Our results demonstrated that U-Mamba_Enc has the fastest training time among the most popular models. However, U-Mamba_Bot showed the advantage of having the quickest inference time among all the models we evaluated. Based on studies by Gu et al11 and Ma et al,13 Mamba advances SSMs in discrete data modeling, such as text and genomes, through 2 significant enhancements. First, Mamba introduces an input-dependent selection mechanism, a departure from the traditional time- and input-invariant SSMs, enabling effective filtration of information from inputs. This mechanism is achieved by parameterizing the SSM parameters according to the input data.11,13 Second, Mamba incorporates a hardware-aware algorithm that scales linearly with sequence length and computes the model recurrently with a scan, thereby enhancing processing speed on modern hardware.
The Mamba architecture, which combines SSM blocks with linear layers, is notably simpler and has achieved state-of-the-art performance in various long-sequence domains such as language and genomics. This simplicity translates into significant computational efficiencies in the training and inference phases.11,13 Wu et al19 explored the core features of Mamba and conceptually determined that it is best suited for tasks involving long sequences and autoregressive features. For vision tasks that do not have these characteristics, such as image classification, they argue that Mamba may not be necessary. However, while detection and segmentation tasks are not autoregressive, they do involve long sequences.19 Interestingly, our study also shows that the overall performance of U-Mamba_Bot with U-Mamba block applied only at bottleneck achieves the highest DSC value among all models, exceeding that of U-Mamba_Enc with U-Mamba block applied in all encoder parts. This suggests that replacing all encoder modules with SSM blocks may not necessarily yield optimal accuracy. Therefore, it is worth investigating the potential benefits of applying Mamba to such tasks.
Understanding regional brain volume is crucial for comprehending the pathophysiology of various brain-related diseases. Several studies have investigated the relationship between brain volume and different health issues, such as Huntington disease,20 atrial fibrillation,21 AD, critical illnesses,22 multiple sclerosis,23 Parkinson disease,24 and migraine.25 This study utilized the ADNI database and successfully completed whole-brain substructure image segmentation, followed by the calculation of brain substructure volumes and a comparative analysis with the NC group. We identified 13 brain functional areas (Fig 4) with significant changes in brain volume, most of which align with findings reported in the literature.17,26–32 However, our results did not show the significant reduction in hippocampal volume that other studies have reported,33 and this discrepancy merits further investigation. In future studies, we also plan to apply our approach to a larger database and to different diseases to offer valuable insights for the diagnosis, prognosis, and monitoring of treatment in different neurologic conditions.
Limitations
We have found that certain types of intracranial abnormalities, such as structural displacement, space-occupying lesions, inflammation, trauma, edema, and hemorrhage, can significantly impact the performance of our model. The ADNI database primarily consists of imaging data from elderly individuals, including healthy controls and those with various cognitive impairments; although these subjects may exhibit certain pathologic or age-related changes in brain structure, their basic structure remains relatively unchanged. Therefore, our model is not optimized for segmentation tasks involving such structural alterations.
Moreover, we have identified some minor hand-labeling mistakes during the process. For example, CSF signals adjacent to gyri were occasionally mislabeled as gyri. Although these errors did not significantly affect larger brain regions, they could lead to substantial calculation errors in small brain structures. This is an area that requires improvement for future work.
CONCLUSIONS
We successfully segmented gray matter regions by using several popular deep learning models, including U-Mamba, a newly developed deep learning architecture. Our extensive experimental findings demonstrate that U-Mamba’s performance matches that of existing CNN and Transformer-based segmentation networks across various modes and segmentation targets. Specifically, U-Mamba_Bot exhibits a marked increase in accuracy for segmentation models compared with CNN and Transformer architectures. Furthermore, the variability across different brain regions captured in this study not only reinforces the heterogeneity of Alzheimer pathology but also may guide targeted research into the development of diagnostic markers and therapeutic strategies focused on the most affected areas. We believe it may be used clinically for feature extraction, morphologic analysis, and downstream diagnostic tools, thereby contributing to the development of automated diagnostic and treatment assessment systems.
Acknowledgments
We acknowledge all the authors of the employed public data sets, allowing the community to use these valuable resources for research purposes. We also thank the authors of nnU-Net (https://github.com/MIC-DKFZ/nnUNet) and U-Mamba (https://github.com/bowang-lab/U-Mamba) for making their valuable code publicly available.
Data collection and sharing for the Alzheimer Disease Neuroimaging Initiative is funded by the National Institute on Aging (National Institutes of Health Grant U19 AG024904). The grantee organization is the Northern California Institute for Research and Education. In the past, ADNI has also received funding from the National Institute of Biomedical Imaging and Bioengineering, the Canadian Institutes of Health Research, and private sector contributions through the Foundation for the National Institutes of Health including generous contributions from the following: AbbVie; Alzheimer’s Association; Alzheimer’s Drug Discovery Foundation; Araclon Biotech; BioClinica, Inc.; Biogen; Bristol-Myers Squibb Company; CereSpir, Inc.; Cogstate; Eisai Inc.; Elan Pharmaceuticals, Inc.; Eli Lilly and Company; EuroImmun; F. Hoffmann-La Roche Ltd and its affiliated company Genentech, Inc.; Fujirebio; GE Healthcare; IXICO Ltd.; Janssen Alzheimer Immunotherapy Research & Development, LLC.; Johnson & Johnson Pharmaceutical Research & Development LLC.; Lumosity; Lundbeck; Merck & Co., Inc.; Meso Scale Diagnostics, LLC.; NeuroRx Research; Neurotrack Technologies; Novartis Pharmaceuticals Corporation; Pfizer Inc.; Piramal Imaging; Servier; Takeda Pharmaceutical Company; and Transition Therapeutics.
Footnotes
Disclosure forms provided by the authors are available with the full text and PDF of this article at www.ajnr.org.
References
- Received June 11, 2024.
- Accepted after revision October 8, 2024.
- © 2025 by American Journal of Neuroradiology