Abstract
BACKGROUND AND PURPOSE: The central vein sign (CVS) is a proposed diagnostic imaging biomarker for multiple sclerosis (MS). The proportion of white matter lesions exhibiting the CVS (CVS+) is higher in patients with MS compared with its radiologic mimics. Evaluation for CVS+ lesions in prior studies has been performed by manual rating, an approach that is time-consuming and has variable interrater reliability. Accurate automated methods would facilitate efficient assessment for CVS. The objective of this study was to compare the performance of an automated CVS detection method with manual rating for the diagnosis of MS.
MATERIALS AND METHODS: 3T MRI was acquired in 86 participants undergoing evaluation for MS in a 9-site multicenter study. Participants presented with either typical or atypical clinical syndromes for MS. An automated CVS detection method was employed and compared with manual rating, including total CVS+ proportion and a simplified counting method in which experts visually identified up to 6 CVS+ lesions by using FLAIR* contrast (a voxelwise product of T2 FLAIR and postcontrast T2*-EPI).
RESULTS: Automated CVS processing was completed in 79 of 86 participants (91%), of whom 28 (35%) fulfilled the 2017 McDonald criteria at the time of imaging. The area under the receiver operating characteristic curve (AUC) for discrimination between participants with and without MS for the automated CVS approach was 0.78 (95% CI: [0.67,0.88]). This was not significantly different from simplified manual counting methods (select6*) (0.80 [0.69,0.91]) or manual assessment of total CVS+ proportion (0.89 [0.82,0.96]). In a sensitivity analysis excluding 11 participants whose MRI exhibited motion artifact, the AUC for the automated method was 0.81 [0.70,0.91], which was not statistically different from that for select6* (0.79 [0.68,0.92]) or manual assessment of total CVS+ proportion (0.89 [0.81,0.97]).
CONCLUSIONS: Automated CVS assessment was comparable to manual CVS scoring for differentiating patients with MS from those with other diagnoses. Large, prospective, multicenter studies utilizing automated methods and enrolling the breadth of disorders referred for suspicion of MS are needed to determine optimal approaches for clinical implementation of an automated CVS detection method.
ABBREVIATIONS:
- AUC = area under the curve
- CVS = central vein sign
- EDSS = Expanded Disability Status Scale
- GBCA = gadolinium-based contrast agent
- IRR = interrater reliability
- MIMoSA = Method for InterModal Segmentation Analysis
- NAIMS = North American Imaging in MS Cooperative
- ROC = receiver operating characteristic
- SD = standard deviation
- WML = white matter lesion
MS is a chronic neuroinflammatory disease that presents with demyelinating lesions of the central nervous system. MS is often considered in the differential diagnosis in patients with neurologic symptoms and MRI white matter lesions (WMLs), yet can be difficult to distinguish from other white matter disease mimics.1 Lesions that are caused by microvascular ischemia, for instance, may be difficult to differentiate from those caused by MS, as current MRI methods lack diagnostic specificity. The long-standing difficulty in the clinical and radiologic diagnostic differentiation of MS results in diagnostic delay and misdiagnosis associated with unnecessary risk of morbidity and disability.2 For these reasons, accurate diagnostic biomarkers for MS, such as the central vein sign (CVS), an emerging imaging biomarker reflecting the perivenular relationship of lesions that have long been associated with MS,3 are of key interest.
The proportion of WMLs exhibiting the CVS (CVS+) is higher in MS compared with its radiologic mimics.4 To date, many studies have evaluated the proportion of total CVS+ lesions to distinguish between MS and non-MS diagnoses.5–7 Of note, a meta-analysis including articles assessing the CVS on T2*-weighted images and individual patient data from 501 patients with MS reported that the incidence of CVS at the individual lesion level per patient was 74%, with a sensitivity and specificity of 95% and 97%, respectively.6 Common approaches used to evaluate proportions of total CVS+ lesions are limited by the time required for manual assessment of every lesion for CVS and by interrater variability.8–10 To alleviate these issues, several simplified approaches incorporating more limited manual CVS assessments have been proposed to facilitate CVS adjudication in practice. Among these alternate methods is select6*, the identification of at least 6 CVS+ lesions by using FLAIR* contrast (a voxelwise product of T2 FLAIR and postcontrast T2*-EPI).11,12 Although this method has proved both sensitive and specific in cross-sectional studies,13 these studies are not without limitations: most have been single-center and have included participants with established diagnoses of MS of varying duration.
In this study, we included a cohort of participants referred to 9 MS specialty centers for suspicion of MS. In a retrospective analysis of this prospectively recruited and imaged cohort, we assessed the performance of a fully automated, publicly available method for CVS assessment in comparison to manual rating, including total CVS+ proportion and a simplified counting method (select6*), in which experts visually identified up to 6 CVS+ lesions on FLAIR* imaging.14 We hypothesized that automated algorithms may be sensitive, specific, and time-efficient for CVS assessment and may be of future utility for MS diagnosis.
MATERIALS AND METHODS
Study Participants
Individuals were recruited at 9 participating academic medical centers across North America and participated in a single study visit. Inclusion criteria were: 1) age 18 to 65 years inclusive, 2) referral to an academic site for a new clinical or radiologic suspicion or diagnosis of multiple sclerosis, 3) cranial MRI scan demonstrating T2-hyperintensities, and 4) ability to provide written informed consent to participate in the study. Exclusion criteria were: 1) use of disease-modifying therapies, 2) treatment with systemic corticosteroids within 4 weeks of enrollment, 3) inability to tolerate MRI because of claustrophobia or excessive movement related to tremor, and 4) contraindication to gadolinium-containing contrast agents (eg, allergy or renal failure).
Diagnosis Adjudication
Participating neurologists at each site considered clinical evaluation, MRIs, and CSF analysis to determine if participants fulfilled the 2017 McDonald criteria. All participants included in this study received a work-up for a diagnosis of MS. The diagnoses were subsequently adjudicated centrally by 3 neurologists, as previously described.15
MRI Data
3T MRI, including 3D T1-weighted MPRAGE, T2 FLAIR, and T2*-weighted 3D echo-planar imaging (T2* 3D-EPI) sequences,15 was acquired from 86 participants by using scanners from 2 different MRI vendors. Of the 9 participating sites, 3 centers used Philips scanners (Ingenia, Ingenia Elition X, and Achieva dStream), and 6 centers used Siemens scanners (Skyra [n = 3], Prisma Fit [n = 2], Prisma [n = 1]). The T2* 3D-EPI sequence was acquired after the administration of a macrocyclic gadolinium chelate at a dose of 0.1 mmol/kg. Details regarding the MRI sequences acquired for this study have been previously published13 and are briefly summarized in the Supplemental Data.
Expert CVS Assessment
Guidelines previously described by the North American Imaging in MS Cooperative (NAIMS)4 were employed for manual CVS rating. Raters at each site were trained by using a standardized data set with previously determined CVS+ and CVS- lesions.8 For proportion-based CVS determination, lesion masks were created in ITK-SNAP by using T2 FLAIR images. These masks were then overlaid onto the T2* 3D-EPI acquired after gadolinium injection (postcontrast T2*-EPI), and a trained central rater (L.D.), who was blinded to clinical information, manually reviewed each scan for the presence of CVS across all lesions. For select6* counts, site raters identified up to 6 CVS+ lesions on FLAIR* contrast for participants at their respective sites.12 Interrater reliability (IRR) was evaluated by using Cohen κ, computed on the CVS+ and CVS- classifications assigned to lesions by both raters. IRR assessments and side-by-side receiver operating characteristic (ROC) analysis for the diagnosis of MS are reported in previously published work examining the same data set.16
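As an illustration of the IRR statistic named above, the following is a minimal sketch of Cohen κ for two raters who each classify the same lesions as CVS+ or CVS-. The function name `cohen_kappa` and the ten example ratings are hypothetical and are not taken from the study data; the published IRR results are those reported in the cited prior work.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same lesions."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of lesions given the same label by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        # Kappa is undefined when both raters use one identical label throughout;
        # this sketch returns 1.0 in that case as a convention.
        return 1.0
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical ratings for ten lesions (True = CVS+, False = CVS-).
site_rater = [True, True, True, False, False, True, False, True, False, False]
central_rater = [True, True, False, False, False, True, False, True, True, False]
kappa = cohen_kappa(site_rater, central_rater)
print(f"Cohen kappa = {kappa:.2f}")
```

In this toy example the raters agree on 8 of 10 lesions, chance agreement is 0.5, and κ is therefore 0.6.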
Image Processing for Automated CVS Assessment
A schematic of the image processing pipeline14 is shown in Fig 1. All images underwent N4-bias correction. T1 MPRAGE images were rigidly aligned to T2 FLAIR images, and these 2 sequences were input into the Method for InterModal Segmentation Analysis (MIMoSA)17,18 to identify white matter lesions. Periventricular lesions were excluded because of the extensive quantity and branching characteristics of veins surrounding the ventricles, following the NAIMS recommendations to exclude lesions with >1 vein or branching veins.4 Confluent lesions were separated by using a previously described technique that works by identifying and removing voxels connecting distinct lesions.19 Lesion masks were subsequently resampled by using nearest-neighbor interpolation and rigidly aligned to the higher-resolution postcontrast T2*-EPI. All lesion masks and processed images were assessed visually to ensure processing quality, and any participants for whom no lesions were detected by MIMoSA were excluded.
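To illustrate why nearest-neighbor interpolation suits lesion masks, the sketch below resamples a small binary mask onto a finer grid. It is a simplified 2D example under stated assumptions: the function `resample_mask_nearest` is hypothetical, and the actual pipeline operates on 3D images and also applies a rigid alignment, which is not modeled here. The point shown is only that nearest-neighbor resampling copies existing labels, so the mask stays binary.

```python
def resample_mask_nearest(mask, out_rows, out_cols):
    """Resample a 2D binary mask onto a grid of a different size (nearest neighbor)."""
    in_rows, in_cols = len(mask), len(mask[0])
    out = []
    for r in range(out_rows):
        # Map the center of each output voxel back to the input grid.
        src_r = min(int((r + 0.5) * in_rows / out_rows), in_rows - 1)
        row = []
        for c in range(out_cols):
            src_c = min(int((c + 0.5) * in_cols / out_cols), in_cols - 1)
            # Copy the nearest input label; no new intermediate values are created.
            row.append(mask[src_r][src_c])
        out.append(row)
    return out


# Hypothetical 2 x 2 lesion mask resampled to a 4 x 4 grid.
coarse = [[0, 1],
          [0, 0]]
fine = resample_mask_nearest(coarse, 4, 4)
for row in fine:
    print(row)
```

A linear or spline interpolant would instead produce fractional values at lesion borders, which would then need to be thresholded before use as a mask.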
Visualization of the image processing pipeline used by the automated CVS detection method described by Dworkin et al 2018.14 Lesion segmentation (pink box): Red indicates automated lesion segmentation masks generated by using MIMoSA overlaid on FLAIR images. Image processing (green box): T1 MPRAGE (left) and postcontrast T2*-EPI (right) postprocessed images. Vein detection (yellow box): Axial (left) and sagittal (right) views of vesselness maps generated by using the Frangi filter and postcontrast T2*-EPI. Permutation procedure (blue box): Red indicates CVS probability maps generated by the automated method10 overlaid on axial T1 MPRAGE images.
Automated central vein detection14 was then employed to assess the degree of vein presence at the center of each lesion by integrating information from the T1 MPRAGE, T2 FLAIR, and postcontrast T2*-EPI. Vesselness maps were created from the postcontrast T2*-EPI by using a Frangi filter,20 and lesion centers were determined by using a previously published automated method.19 This lesion center detection technique works by identifying regions of the lesion-probability maps whose texture most closely resembles that of lesion centers. A cluster of voxels with high vesselness scores at the determined lesion center may suggest a central vein. To test these candidate central veins, a spatial permutation procedure applied to the vesselness maps was used to determine whether vein presence at the center of a given lesion occurred more often than would be expected by chance. The associated probability of vein presence at the center of each lesion was estimated and averaged across lesions within a participant to yield a participant-level CVS score.
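The permutation and averaging steps can be sketched as follows. This is a deliberately simplified illustration and not the published algorithm: it assumes each lesion is given as a flat list of vesselness values with known center-voxel indices, it shuffles values within the lesion as a stand-in for the spatial permutation, and the function names `lesion_cvs_probability` and `participant_cvs_score` are hypothetical.

```python
import random


def lesion_cvs_probability(vesselness, center_idx, n_perm=2000, seed=0):
    """Permutation-based score for vein presence at a lesion's center.

    vesselness: vesselness values for every voxel in one lesion (flat list).
    center_idx: indices of the voxels labelled as the lesion center.
    Returns the fraction of permutations whose central vesselness is
    strictly below the observed central vesselness.
    """
    rng = random.Random(seed)

    def central_mean(values):
        return sum(values[i] for i in center_idx) / len(center_idx)

    observed = central_mean(vesselness)
    shuffled = list(vesselness)
    below = 0
    for _ in range(n_perm):
        # Shuffling breaks the link between voxel position and vesselness.
        rng.shuffle(shuffled)
        if central_mean(shuffled) < observed:
            below += 1
    return below / n_perm


def participant_cvs_score(lesion_probabilities):
    """Average lesion-level probabilities into one participant-level score."""
    if not lesion_probabilities:
        raise ValueError("no lesions detected; score is undefined")
    return sum(lesion_probabilities) / len(lesion_probabilities)


# Hypothetical 9-voxel lesions; index 4 is the center voxel.
vein_at_center = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
no_vein = [0.5] * 9
p_vein = lesion_cvs_probability(vein_at_center, center_idx=[4])
p_flat = lesion_cvs_probability(no_vein, center_idx=[4])
score = participant_cvs_score([p_vein, p_flat])
print(p_vein, p_flat, score)
```

In this toy case the lesion with high vesselness only at its center receives a probability close to 8/9, whereas the lesion with uniform vesselness receives 0, and the participant-level score is their mean. The error raised for an empty lesion list mirrors the exclusion of participants in whom no lesions were detected.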
Statistical Analysis
All statistical analyses were conducted in the R software environment, assumed a 5% type I error rate, and employed 2-sided hypothesis tests. ROC analysis was employed to assess the performance of the automated pipeline compared with manual CVS rating. The area under the curve (AUC) was determined for the automated method, the total proportion of CVS+ lesions assessed by the central rater, and the select6* assessments performed at each participating site. DeLong tests, implemented in the pROC package,21 were used to compare the diagnostic performance of the CVS assessment methods.
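For readers unfamiliar with these statistics, the sketch below shows one way the AUC and a paired DeLong comparison can be computed. It is a pure-Python illustration and not the pROC implementation used in the study; the function names are hypothetical, the scores are invented, and the paired test assumes both markers are listed in the same participant order with at least 2 participants per group.

```python
import math


def _psi(pos, neg):
    # 1 if the MS score ranks above the non-MS score, 0.5 for ties, else 0.
    return 1.0 if pos > neg else 0.5 if pos == neg else 0.0


def _components(pos_scores, neg_scores):
    # Structural components of the AUC, one per MS and per non-MS participant.
    m, n = len(pos_scores), len(neg_scores)
    v10 = [sum(_psi(x, y) for y in neg_scores) / n for x in pos_scores]
    v01 = [sum(_psi(x, y) for x in pos_scores) / m for y in neg_scores]
    return v10, v01


def auc(pos_scores, neg_scores):
    """AUC as the probability that an MS score exceeds a non-MS score."""
    v10, _ = _components(pos_scores, neg_scores)
    return sum(v10) / len(v10)


def _cov(a, b):
    # Sample covariance with an (n - 1) denominator.
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def delong_paired(pos_a, neg_a, pos_b, neg_b):
    """Two-sided DeLong test for two markers measured on the same participants."""
    m, n = len(pos_a), len(neg_a)
    v10_a, v01_a = _components(pos_a, neg_a)
    v10_b, v01_b = _components(pos_b, neg_b)
    diff = sum(v10_a) / m - sum(v10_b) / m
    var = ((_cov(v10_a, v10_a) + _cov(v10_b, v10_b) - 2 * _cov(v10_a, v10_b)) / m
           + (_cov(v01_a, v01_a) + _cov(v01_b, v01_b) - 2 * _cov(v01_a, v01_b)) / n)
    if var <= 0:
        # The test statistic is undefined when the difference has no variance.
        return diff, float("nan"), float("nan")
    z = diff / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return diff, z, p_value


# Hypothetical scores: marker A (eg, automated) and marker B (eg, manual).
ms_a, non_ms_a = [0.9, 0.8, 0.3], [0.4, 0.2]
ms_b, non_ms_b = [0.7, 0.4, 0.6], [0.5, 0.1]
print(auc(ms_a, non_ms_a), auc(ms_b, non_ms_b))
print(delong_paired(ms_a, non_ms_a, ms_b, non_ms_b))
```

In this toy example both markers have an AUC of 5/6, so the difference is 0 and the 2-sided P value is 1.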
Quality Control Exclusion
To assess the performance of the automated detection method in cases of suboptimal image quality, analysis was conducted both before and after the exclusion of scans with extensive motion artifact. Two research specialists with greater than 2 years of experience (A.R.M., E.G.) visually assessed all scans for image quality. Images were rated based on a previously defined rating scale,14 and images displaying poor signal-to-noise or at least 1 severe artifact obstructing any vessel in the supratentorial white matter were excluded in a sensitivity analysis. Examples with motion artifact and illustration of CVS identification by using these images can be seen in the Supplemental Data.
RESULTS
Study Sample
Images from 86 adults (66 women) with a suspicion of MS were included in the study. Mean age was 45 years (standard deviation [SD] = 12) and mean time since symptom onset was 4.2 years (SD = 6.4 years). The median Expanded Disability Status Scale (EDSS)22 score was 1.5 (range = 0–4). In total, 5250 lesions were identified in our study sample by an expert rater (L.D.) (median = 39, range = 0–232), with 2097 (median = 43, range = 0–192) identified in the MS group and 3153 (median = 38, range = 0–232) in the non-MS group. Additional demographic information is available in the Table.
Brief description of demographic information for the 86 participants included in this study
Data from 9 sites, with a median recruitment of 10 participants (range = 4–12), were included in our analysis. Scans from 7 participants were excluded because of image processing failures, which included failure of quality assessment by an MRI physicist (P.S.), absence of postgadolinium imaging, or excessive image artifact preventing initial image processing. MS diagnosis was ascertained by local site principal investigators in the 79 participants included in the analysis. Subsequently, a diagnosis of MS was adjudicated in 28 participants. Individuals determined to have clinically isolated syndrome or radiologically isolated syndrome were included in the non-MS group. A schematic illustrating the inclusion and exclusion of participants in these analyses can be seen in Fig 2.
Brief schematic of the subject inclusion and exclusion in these analyses. n/a = not applicable.
Automated Lesion Detection
No lesions were detected in 2 participants by the automated pipeline, despite the presence of lesions on their MRIs based on visual assessment. These participants were excluded from the primary analyses. In a sensitivity analysis, these participants were alternatively labeled as not exhibiting the CVS, with an automated score of zero, and results from this analysis are presented in the Supplemental Data.
Quality Control Exclusion
Visual inspection indicated that extensive artifact due to subject motion was present on the MRIs of 11 participants, 4 of whom met the 2017 McDonald criteria at the time of scanning. These participants were removed in a sensitivity analysis.
Diagnostic Performance of the CVS
In the 79 participants included in this study, the automated method discriminated between participants with and without MS with an AUC of 0.78 (95% CI: [0.67,0.88]). This was comparable to the manual assessments of select6* (0.80 [0.69,0.91]) and total CVS+ proportion evaluation (0.89 [0.82,0.96]). There was no statistically significant difference in performance between any of the detection methods included in our analysis. The ROC curves associated with each CVS assessment method are shown in Fig 3.
Left: ROC curve before 11 participants were removed because of motion on postcontrast T2*-EPI (n = 79). Twenty-eight participants met the 2017 McDonald criteria. Right: ROC curve after removal of 11 participants because of extensive motion on postcontrast T2*-EPI (n = 68). Twenty-four participants met the 2017 McDonald criteria. For both ROC curves, automated results are displayed in green, select6* counts in red, and total CVS+ proportions in black. AUC values, with 95% CI, determined for each method with and without motion exclusion are included with each ROC curve.
Results were comparable after exclusion of the 11 scans with extensive motion artifact; the automated CVS approach discriminated between participants with and without MS with an AUC of 0.81 [0.70,0.91]. Similar performance was observed by using select6* (0.79 [0.68,0.92]) and the total proportion of CVS+ lesions (0.89 [0.81,0.97]). These results, with the inclusion of AUC values with and without motion exclusion, are displayed in Fig 3.
DISCUSSION
In this study, we compared the diagnostic performance of an automated CVS detection pipeline14 with CVS determinations assessed by trained raters via select6* counts and criterion-standard total CVS+ proportion. We found that fully automated CVS detection demonstrated good discriminative ability between patients with and without MS in a multicenter study. Additionally, the main analysis and the sensitivity analysis excluding scans with motion artifact yielded similar results, which suggests that the image quality required by this algorithm may not be a major limitation in clinical use. These results address a critical issue in translating this imaging biomarker into clinical practice by eliminating a time-intensive manual imaging assessment that is subject to variable interrater reliability.
The approach utilized in this study may be of particular interest when compared with both automated and manual alternative methods for CVS detection, in part because of the consistency of its performance across study sites and MRI vendors. While other automated techniques have shown promise in demonstrating the difference in the proportion of CVS+ lesions between MS and its common mimics,23 the method explored in this paper does not necessitate manual lesion segmentation. While these alternative methods would be of interest in a comparison study with the automated method utilized in this paper, the lack of availability, both publicly and commercially, limits the ability to do so at this point. It is worth noting that alternative methods may eventually incorporate automated segmentation; however, such performance has not been documented.
Additionally, because both image preprocessing and lesion segmentation are performed within the pipeline and additional parameter tuning is not required for any of the remaining steps of the algorithm, the performance is largely independent of any user-specific factors. These properties may further support the incorporation of the CVS, in addition to the standing MRI criteria regarding dissemination in time and space, into the diagnostic criteria for MS. Not only could this inclusion provide an additional pathway to diagnosis for individuals with atypical presentations of MS,24 but it may be of increasing interest considering the recent finding of increased microstructural damage noted in CVS+ MS lesions.25
Further, it should be noted that in this study, as in others referenced throughout this paper, CVS ratings were determined by trained raters. This standard may be challenging to meet, given the forecasted shortage of specialized neurologists and radiologists in the broader community.26 The importance of specialized training with respect to error and variance in manual assessment is not yet well understood. These factors highlight the need for standardization and automated assessment of CVS in practice.
This study is not without limitations. Prior studies have found CVS probability values by using automated methods to be lower than previously reported CVS+ proportion values for patients with MS and higher for those without MS.14 While it has been theorized that this effect could be due to false-positive lesions from the automated segmentation method, incorrect manual CVS assessment may also contribute. Further, as demonstrated by the 2 cases with false-negative lesion detection in this study, automated lesion segmentation may have difficulty in cases with a lower lesion burden. While these cases may be infrequent (2 of 79 in our cohort), specifically targeting cases with a low lesion burden may be of particular interest to future studies.
We anticipate that this automated process will maintain a similar level of performance as it is trained with more data; however, we cannot rule out accuracy drift over time.
Additionally, while recent findings suggest an increase in sensitivity of the CVS for MS diagnosis with the administration of gadolinium-based contrast agent (GBCA),15 we expect this automated method to maintain a similar level of performance without the use of GBCAs, as well as with the use of SWI. Exploration of this prediction may be a good direction for future study. Finally, the CVS has been proposed as a useful tool in the prediction of MS evolution and manifestation27 in addition to its more established diagnostic use, such as the 40% rule, in which an MS diagnosis can be confidently favored if >40% of lesions demonstrate a CVS.5 It should be noted, however, that CVS+ lesions are not exclusive to a diagnosis of MS. While less prevalent than in MS, CVS+ lesions have been linked to other neurologic diseases such as neuromyelitis optica spectrum disorder, Susac syndrome, and cerebral small vessel disease, as well as to additional systemic autoimmune diseases.28–32
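The 40% rule described above reduces to a simple proportion check, sketched below under stated assumptions. The function names are hypothetical, the lesion counts are invented, and the strict ">" comparison follows the wording ">40% of lesions"; as noted, a result above the threshold favors but does not establish MS, because CVS+ lesions also occur in other diseases.

```python
def cvs_proportion(n_cvs_positive, n_lesions):
    """Proportion of assessed lesions that show a central vein."""
    if n_lesions <= 0:
        raise ValueError("at least one assessed lesion is required")
    if not 0 <= n_cvs_positive <= n_lesions:
        raise ValueError("CVS+ count must lie between 0 and the lesion count")
    return n_cvs_positive / n_lesions


def favors_ms_by_40_percent_rule(n_cvs_positive, n_lesions, threshold=0.40):
    """True when strictly more than `threshold` of lesions are CVS+."""
    return cvs_proportion(n_cvs_positive, n_lesions) > threshold


# Hypothetical participants: (CVS+ lesions, total assessed lesions).
print(favors_ms_by_40_percent_rule(17, 39))  # about 44% of lesions are CVS+
print(favors_ms_by_40_percent_rule(4, 10))   # exactly 40% does not exceed the threshold
```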
The methods explored in this paper show promise for the diagnostic value and standardization of CVS detection in practice. Further optimization of this technique and the translation of these research methods into practice in the broader community should be the focus of future work. Additionally, the correlations between the CVS and other variables, such as lesion load, atrophy, and EDSS scores, may be a future area of interest considering the link between CVS+ MS lesions and increased levels of tissue damage.25 While the results from this pilot study were encouraging, it is important to acknowledge the limited sample size. For this reason, future studies may expand upon these findings through the incorporation of larger study samples and the inclusion of multiple time points for imaging acquisition and diagnosis adjudication. Finally, large prospective multicenter studies that include the breadth of disorders referred for suspected MS are needed to determine optimal approaches for utilizing the CVS as a diagnostic biomarker in MS.
Footnotes
R.T. Shinohara and P. Sati contributed equally to this article.
This research has been supported by the Intramural Research Program of the NINDS, National Institutes of Health (NIH). Research reported in this publication was supported by the NIH under award numbers R01NS112274, R01MH123550, R01MH112847, and U01NS116776.
Disclosure forms provided by the authors are available with the full text and PDF of this article at www.ajnr.org.
- Received January 28, 2024.
- Accepted after revision September 3, 2024.
- © 2025 by American Journal of Neuroradiology