Juan E. Small MD, Polina Osler MD, Aaron Paul MD, and Mara Kunst MD
Department of Radiology, Lahey Hospital and Medical Center, Burlington, MA
PURPOSE
To evaluate the performance of a convolutional neural network (CNN) in the CT diagnosis of cervical spine fractures.
MATERIALS AND METHODS
We evaluated an FDA-approved CNN model developed by Aidoc (Tel Aviv, Israel) for the detection of cervical spine fractures on CT. The model is specific to the cervical spine and detects linear bony lucencies in patterns consistent with fractures, without distinguishing acute from chronic fractures. The model was tested at a large suburban Level I trauma center on all trauma cervical spine CTs acquired over a four-year period (January 2015 through December 2018) that were accompanied by a cervical spine MRI performed within 48 hours. Cervical spine MRIs were performed for two main reasons: 1) persistent clinical concern for occult cervical spine injury despite negative CT results and 2) evaluation of the spinal cord and ligaments in patients with CT results positive for fracture. A total of 695 cases were included in the analysis. To establish ground truth, the cervical spine CT and MR images, including their reports, were retrospectively reviewed by one of three fellowship-trained neuroradiologists who were blinded to the CNN results. The initial reports were considered “real-time” assessments. During retrospective review, the neuroradiologists marked a case positive if the fracture was detected on CT or was evident in retrospect with the aid of follow-up MRI. The cervical spine CT dataset was evaluated retrospectively by the CNN model. The “real-time” radiologist assessment and the CNN model output were each compared to ground truth. Discrepancies with ground truth were reviewed by two neuroradiologists. Both acute and chronic fractures were considered true positives. Findings flagged because of congenital fusion anomalies, nutrient foramina, degenerative changes, or artifact were considered false positives. Disc fractures were considered true negatives due to the absence of a detectable bony lucency.
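The comparison against ground truth described above comes down to a confusion-matrix calculation. The following is a minimal sketch of how accuracy, sensitivity, and specificity are conventionally derived from case-level counts; it is not the authors' analysis code. The names `ConfusionCounts` and `diagnostic_metrics` are hypothetical, and the counts in the example are illustrative placeholders, not data from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Case-level counts of a reader (CNN or radiologist) versus ground truth."""
    tp: int  # fracture present, flagged
    fp: int  # no fracture, flagged (e.g. nutrient foramen, artifact)
    tn: int  # no fracture, not flagged
    fn: int  # fracture present, missed


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return the standard sensitivity, specificity, and accuracy."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),  # share of fractures detected
        "specificity": c.tn / (c.tn + c.fp),  # share of normals correctly cleared
        "accuracy": (c.tp + c.tn) / total,    # share of all cases classified correctly
    }


# Illustrative counts only (assumed for this sketch, NOT the study data).
example = ConfusionCounts(tp=80, fp=10, tn=390, fn=20)
metrics = diagnostic_metrics(example)
```

Because accuracy weights every case equally, a reader can reach high accuracy with modest sensitivity whenever negative cases outnumber positive ones, which is why the three figures are reported separately.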
RESULTS
CNN accuracy in cervical spine fracture detection was 92%, with 76% sensitivity and 97% specificity. “Real-time” radiologist accuracy was 96%, with 94% sensitivity and 96% specificity. Fractures missed by the CNN mirrored those missed by radiologists in both level and location, including fractured anterior osteophytes, fractures of the transverse and spinous processes, and fractures obscured by artifact in the lower cervical spine.
CONCLUSION
The accuracy and specificity of the CNN were comparable to those of radiologist interpretations in the detection of cervical spine fractures on CT. Further refinements in sensitivity will likely improve the results of the CNN. This model holds promise for prioritizing exams and assisting clinicians in cervical spine fracture detection.