Performance metrics (macro-F1, micro-F1, sensitivity, specificity, and precision) for AI-VTRA and AI-VTRAET predictions of the radiologist-based response assessment. Within each category, we binarized the BT-RADS scores and the AI predictions by the target score and then computed the metrics.
Metric | Imaging Improvement (BT-RADS 1) | | No Significant Imaging Change (BT-RADS 2) | | Imaging Worsening (BT-RADS 3) | | Imaging Worsening Equivalent to RANO Progression (BT-RADS 4) | |
---|---|---|---|---|---|---|---|---|
Model | AI-VTRAET | AI-VTRA | AI-VTRAET | AI-VTRA | AI-VTRAET | AI-VTRA | AI-VTRAET | AI-VTRA |
Macro-F1 | 0.747 | 0.755 | 0.760 | 0.750 | 0.561 | 0.587 | 0.705 | 0.705 |
Micro-F1 | 0.857 | 0.870 | 0.765 | 0.757 | 0.695 | 0.689 | 0.831 | 0.831 |
Sensitivity | 0.747 | 0.700 | 0.793 | 0.746 | 0.222 | 0.298 | 0.596 | 0.596 |
Specificity | 0.873 | 0.895 | 0.746 | 0.765 | 0.920 | 0.875 | 0.872 | 0.872 |
Precision | 0.474 | 0.526 | 0.672 | 0.675 | 0.568 | 0.530 | 0.450 | 0.450 |
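
The per-category binarization described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `binarized_metrics` and the toy score lists are hypothetical, and it assumes that macro-F1 is the unweighted mean of the F1 scores of the two binarized classes and that micro-F1 pools both classes (which, for a binary single-label problem, equals accuracy). The source does not state which averaging convention was used.

```python
def binarized_metrics(reference, predicted, target):
    """One-vs-rest metrics for a single target score (illustrative sketch)."""
    # Binarize: True where the score equals the target, False otherwise.
    ref = [r == target for r in reference]
    pred = [p == target for p in predicted]

    # Confusion-matrix counts for the binarized labels.
    tp = sum(r and p for r, p in zip(ref, pred))
    tn = sum(not r and not p for r, p in zip(ref, pred))
    fp = sum(not r and p for r, p in zip(ref, pred))
    fn = sum(r and not p for r, p in zip(ref, pred))

    def ratio(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    # Assumption: macro-F1 averages the F1 of the target and non-target classes.
    f1_target = ratio(2 * tp, 2 * tp + fp + fn)
    f1_other = ratio(2 * tn, 2 * tn + fn + fp)

    return {
        "macro_f1": (f1_target + f1_other) / 2,
        # Assumption: micro-F1 pools both classes, i.e. plain accuracy here.
        "micro_f1": ratio(tp + tn, len(ref)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical toy data: radiologist BT-RADS scores vs. AI predictions.
radiologist = [1, 2, 2, 3, 4, 2, 1, 3]
ai_prediction = [1, 2, 3, 3, 4, 2, 2, 2]

for score in (1, 2, 3, 4):
    print(score, binarized_metrics(radiologist, ai_prediction, target=score))
```

Calling the function once per target score mirrors the four column groups of the table, with each call treating one BT-RADS score as the positive class and all other scores as negative.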