Table 3:

Performance metrics (macro-F1, micro-F1, sensitivity, specificity, and precision) for AI-VTRA and AI-VTRAET predictions of the radiologist-based response assessment. For each category, we binarized the BT-RADS scores and the AI predictions against the target score (target score versus all other scores) and then computed the metrics.

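| Response category | Model | Macro-F1 | Micro-F1 | Sensitivity | Specificity | Precision |
|---|---|---|---|---|---|---|
| Imaging Improvement (BT-RADS 1) | AI-VTRAET | 0.747 | 0.857 | 0.747 | 0.873 | 0.474 |
| Imaging Improvement (BT-RADS 1) | AI-VTRA | 0.755 | 0.870 | 0.700 | 0.895 | 0.526 |
| No Significant Imaging Change (BT-RADS 2) | AI-VTRAET | 0.760 | 0.765 | 0.793 | 0.746 | 0.672 |
| No Significant Imaging Change (BT-RADS 2) | AI-VTRA | 0.750 | 0.757 | 0.746 | 0.765 | 0.675 |
| Imaging Worsening (BT-RADS 3) | AI-VTRAET | 0.561 | 0.695 | 0.222 | 0.920 | 0.568 |
| Imaging Worsening (BT-RADS 3) | AI-VTRA | 0.587 | 0.689 | 0.298 | 0.875 | 0.530 |
| Imaging Worsening Equivalent to RANO Progression (BT-RADS 4) | AI-VTRAET | 0.705 | 0.831 | 0.596 | 0.872 | 0.450 |
| Imaging Worsening Equivalent to RANO Progression (BT-RADS 4) | AI-VTRA | 0.705 | 0.831 | 0.596 | 0.872 | 0.450 |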
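
The per-category binarization described in the caption can be illustrated with a minimal sketch. It is not the authors' code: the function name `binarized_metrics`, the toy score lists, and the metric definitions are assumptions made for illustration. In particular, the sketch assumes that macro-F1 is the unweighted mean of the F1 scores of the two binarized classes (target and non-target) and that micro-F1 pools both classes, which for a binary single-label problem equals accuracy.

```python
def binarized_metrics(y_true, y_pred, target):
    """Binarize scores as (score == target) and compute the five metrics.

    y_true: radiologist BT-RADS scores; y_pred: AI-predicted scores.
    The metric definitions are assumptions, not taken from the paper.
    """
    t = [int(s == target) for s in y_true]
    p = [int(s == target) for s in y_pred]

    # Confusion-matrix counts for the binarized problem.
    tp = sum(1 for a, b in zip(t, p) if a == 1 and b == 1)
    tn = sum(1 for a, b in zip(t, p) if a == 0 and b == 0)
    fp = sum(1 for a, b in zip(t, p) if a == 0 and b == 1)
    fn = sum(1 for a, b in zip(t, p) if a == 1 and b == 0)

    def safe_div(num, den):
        # Return 0.0 instead of raising when a class is absent.
        return num / den if den else 0.0

    f1_pos = safe_div(2 * tp, 2 * tp + fp + fn)  # F1 of the target class
    f1_neg = safe_div(2 * tn, 2 * tn + fn + fp)  # F1 of the "all other scores" class

    return {
        "macro_f1": (f1_pos + f1_neg) / 2,
        "micro_f1": safe_div(tp + tn, tp + tn + fp + fn),  # equals accuracy here
        "sensitivity": safe_div(tp, tp + fn),
        "specificity": safe_div(tn, tn + fp),
        "precision": safe_div(tp, tp + fp),
    }


# Toy scores, invented purely for illustration (not study data).
y_true = [1, 2, 2, 3, 4, 2, 3, 1]
y_pred = [1, 2, 3, 3, 4, 2, 2, 2]
metrics = binarized_metrics(y_true, y_pred, target=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Calling the function once per target score (1 through 4) would yield one block of metrics per response category, matching the per-category layout of the table.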