Assessment of an artificial intelligence aid for the detection of appendicular skeletal fractures in children and young adults by senior and junior radiologists

Toan Nguyen, Richard Maarek, Anne-Laure Hermann, Amina Kammoun, Antoine Marchi, Mohamed R. Khelifi-Touhami, Mégane Collin, Aliénor Jaillard, Andrew J. Kompel, Daichi Hayashi, Ali Guermazi & Hubert Ducou Le Pointe

Published in Pediatric Radiology (September 2022)



Background

As the number of conventional radiographic examinations in pediatric emergency departments increases, so, too, does the number of reading errors by radiologists.


Objective

The aim of this study is to investigate the ability of artificial intelligence (AI) to improve the detection of fractures by radiologists in children and young adults.

Materials and methods

A cohort of 300 anonymized radiographs performed for the detection of appendicular fractures in patients aged 2 to 21 years was collected retrospectively. The ground truth for each examination was established after an independent review by two radiologists with expertise in musculoskeletal imaging. Discrepancies were resolved by consensus with a third radiologist. Half of the 300 examinations showed at least one fracture. Radiographs were read by three senior pediatric radiologists and five radiology residents in the usual manner and then read again immediately afterward with the help of AI.


Results

The mean sensitivity for all groups was 73.3% (110/150) without AI; it increased significantly by almost 10% (P<0.001) to 82.8% (125/150) with AI. For junior radiologists, it increased by 10.3% (P<0.001) and for senior radiologists by 8.2% (P=0.08). On average, there was no significant change in specificity (from 89.6% to 90.3% [+0.7%, P=0.28]); for junior radiologists, specificity increased from 86.2% to 87.6% (+1.4%, P=0.42) and for senior radiologists, it decreased from 95.1% to 94.9% (-0.2%, P=0.23). The stand-alone sensitivity and specificity of the AI were, respectively, 91% and 90%.
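As a reading aid, the arithmetic behind these figures can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names are hypothetical, and it assumes the standard definition of sensitivity (detected fractures divided by all fracture examinations) and that the reported changes are absolute differences in percentage points. Only the 110/150 count is recomputed from raw numbers; the with-AI values use the reported percentages, since 125/150 works out to 83.3% rather than 82.8%, presumably because the quoted counts are rounded averages across readers.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Sensitivity in percent: detected fractures / all fracture examinations."""
    total = true_pos + false_neg
    if total <= 0:
        raise ValueError("need at least one fracture examination")
    return 100.0 * true_pos / total


def point_change(before_pct: float, after_pct: float) -> float:
    """Absolute change in percentage points."""
    return after_pct - before_pct


def relative_change(before_pct: float, after_pct: float) -> float:
    """Relative change, as a percentage of the starting value."""
    return 100.0 * (after_pct - before_pct) / before_pct


# 150 of the 300 examinations showed a fracture; on average 110 were detected without AI.
sens_without_ai = sensitivity(true_pos=110, false_neg=150 - 110)

# Reported mean sensitivity and specificity, without and with AI (percent).
sens_gain = point_change(73.3, 82.8)
sens_gain_relative = relative_change(73.3, 82.8)
spec_gain = point_change(89.6, 90.3)

print(f"Sensitivity without AI: {sens_without_ai:.1f}%")
print(f"Sensitivity gain: {sens_gain:.1f} points ({sens_gain_relative:.1f}% relative)")
print(f"Specificity gain: {spec_gain:.1f} points")
```

Under these assumptions, the "almost 10%" increase corresponds to the percentage-point difference between the two reported sensitivities; the relative increase is somewhat larger.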


Conclusion

With the help of AI, sensitivity increased by an average of 10% without significantly decreasing specificity in fracture detection in a predominantly pediatric population.