Digital Health Innovation and Informatics
Purpose/Objective(s): Stereotactic radiotherapy (SRT) for patients with brain metastases requires precisely contoured gross tumour volumes (GTVs). We aimed to compare deep-learning-generated contours (DCs) with expert contours (ECs).
Materials/Methods: Dataset 1 (DS1) consisted of 78 brain metastasis SRT plans with fused MPRAGE MR scans from a single centre. Dataset 2 (DS2) consisted of 170 publicly available MR images with brain metastases contoured by a CNS radiation oncologist. Convolutional neural networks were trained separately on DS1, on DS2, and on the combined dataset of DS1 and DS2 (DS-all). Two model variations were developed: DC-Axial, which relied solely on axial training slices, and DC-Multi, which combined axial contours with multiplanar slices in order to reduce false-positive predictions. The validation dataset consisted of 28 MPRAGE MR scans with 46 individual brain metastasis contours from SRT treatment plans. The true-positive DCs were compared with ECs using the Dice Similarity Coefficient (DSC), 95% distance transform (DT-95%), and mean distance transform (DT-mean). Dosimetric analysis was performed by fusing the DCs to the SRT planning CT scan with medical imaging software.
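The agreement metrics named above can be illustrated with a minimal sketch. The abstract does not define DT-mean or DT-95% in detail, so the following rests on assumptions: each contour is a set of (z, y, x) voxel indices, distances are measured from every voxel of one contour to the nearest voxel of the other in both directions, and the 95th percentile uses the nearest-rank rule. The function names and the toy contours are hypothetical and are not taken from the study.

```python
import math

def dice(a, b):
    """Dice Similarity Coefficient: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # convention assumed here: two empty contours agree
    return 2.0 * len(a & b) / (len(a) + len(b))

def nearest_distances(src, dst, spacing):
    """Distance in mm from each voxel in src to the nearest voxel in dst."""
    dst_mm = [tuple(c * s for c, s in zip(q, spacing)) for q in dst]
    out = []
    for p in src:
        p_mm = tuple(c * s for c, s in zip(p, spacing))
        out.append(min(math.dist(p_mm, q) for q in dst_mm))
    return out

def distance_metrics(a, b, spacing=(1.0, 1.0, 1.0)):
    """Return (mean, 95th percentile) of the symmetric nearest distances."""
    if not a or not b:
        raise ValueError("both contours must contain at least one voxel")
    # Brute-force search: fine for small lesions, slow for large volumes.
    d = sorted(nearest_distances(a, b, spacing) + nearest_distances(b, a, spacing))
    mean = sum(d) / len(d)
    p95 = d[math.ceil(0.95 * len(d)) - 1]  # nearest-rank percentile (assumed)
    return mean, p95

# Toy example: a 2x2 expert contour and a copy shifted by one voxel in x.
ec = {(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)}
dc = {(z, y, x + 1) for z, y, x in ec}

dsc = dice(ec, dc)
dt_mean, dt_95 = distance_metrics(ec, dc)
print(f"DSC={dsc:.2f}  DT-mean={dt_mean:.2f} mm  DT-95%={dt_95:.2f} mm")
```

For this toy pair with 1 mm isotropic voxels, half of each contour overlaps, giving a DSC of 0.5, a DT-mean of 0.5 mm, and a DT-95% of 1.0 mm. A surface-only variant, in which distances are taken between boundary voxels rather than all voxels, would give larger values for the same contours.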
Results: The DC-Axial models identified 63%, 65%, and 70% of metastases for DS1, DS2, and DS-all respectively. The DC-Multi models identified 49%, 58%, and 59% of metastases for DS1, DS2, and DS-all respectively. The number of false positives for the DC-Axial models was 159, 138, and 111 for DS1, DS2, and DS-all respectively. The number of false positives for the DC-Multi models was 12, 9, and 8 for DS1, DS2, and DS-all respectively. Comparison of the ECs with the true-positive DCs from the DS1, DS2, and DS-all models yielded a mean DSC of 0.68, 0.74, and 0.77, a DT-mean of 0.77 mm, 0.66 mm, and 0.57 mm, and a DT-95% of 1.71 mm, 1.52 mm, and 1.34 mm respectively. The mean 80% isodose coverage was 100% for the ECs, 99.8% for DS1, 99.8% for DS2, and 99.9% for DS-all. The mean 90% isodose coverage was 100% for the ECs, 97% for DS1, 96% for DS2, and 97% for DS-all. There were no significant differences in the mean, maximum, and minimum doses for DS1, DS2, and DS-all compared with the ECs.
Conclusion: We observed accurate delineation of true-positive DCs on MPRAGE MR images, which demonstrates the feasibility of using deep learning models to aid tumour delineation for SRT treatment planning. The similar isodose coverage and similar mean, maximum, and minimum doses for the models further demonstrate the spatial agreement between the DCs and ECs. Models trained on the combined dataset (DS-all) had the best volumetric and dosimetric agreement with the ECs. Compared with models trained on axial slices alone, the multiplanar models produced far fewer false positives (8 to 12 versus 111 to 159) at the cost of a lower detection rate (49% to 59% versus 63% to 70%). The true-positive detection rate can likely be improved in future studies that incorporate larger training datasets of MR images.