OBJECTIVE: Detection of diabetic retinopathy (DR) outside of specialized eye care settings is an important means of access to vision-preserving health maintenance. Store-and-forward remote interpretation of fundus photographs acquired in a primary care or other nonophthalmic setting is a predominant paradigm of teleophthalmology screening programs. Artificial intelligence (AI)-based image interpretation offers an alternative means of DR detection. IDx-DR (Digital Diagnostics Inc) is a Food and Drug Administration-authorized autonomous testing device for DR. We evaluated the diagnostic performance of IDx-DR compared with human-based teleophthalmology over 2.5 years. Additionally, we evaluated an AI-human hybrid workflow that combines AI-system evaluation with human expert-based assessment for referable cases.
DESIGN: Prospective cohort study and retrospective analysis.
PARTICIPANTS: Diabetic patients ≥ 18 years old without a prior DR diagnosis or DR examination in the past year presenting for routine DR screening in a primary care clinic.
METHODS: Macula-centered and optic nerve-centered fundus photographs were evaluated by an AI algorithm followed by consensus-based overreading by retina specialists at the Stanford Ophthalmic Reading Center. Detection of more-than-mild diabetic retinopathy (MTMDR) was compared with in-person examination by a retina specialist.
MAIN OUTCOME MEASURES: Sensitivity, specificity, accuracy, positive predictive value, and gradability achieved by the AI algorithm and retina specialists.
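As a minimal sketch of how the outcome measures listed above are conventionally computed against a reference standard, the following uses the standard confusion-matrix definitions. The names `ConfusionCounts`, `screening_metrics`, and `gradability` are hypothetical, and the counts in the usage note are illustrative only, not study data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of gradable encounters versus the reference standard (hypothetical type)."""
    tp: int  # reader positive, reference positive
    fp: int  # reader positive, reference negative
    tn: int  # reader negative, reference negative
    fn: int  # reader negative, reference positive


def _ratio(numerator: int, denominator: int) -> Optional[float]:
    # Return None rather than raising when a denominator is empty.
    return numerator / denominator if denominator else None


def screening_metrics(c: ConfusionCounts) -> Dict[str, Optional[float]]:
    """Standard definitions of the four accuracy measures."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "accuracy": _ratio(c.tp + c.tn, total),
        "ppv": _ratio(c.tp, c.tp + c.fp),
    }


def gradability(gradable_encounters: int, total_encounters: int) -> Optional[float]:
    """Fraction of encounters for which the reader returned a gradable result."""
    return _ratio(gradable_encounters, total_encounters)
```

For example, `screening_metrics(ConfusionCounts(tp=8, fp=2, tn=18, fn=2))` uses made-up counts to show the call shape. Gradability is computed over all encounters, whereas the other measures here are assumed to be computed over gradable encounters only.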
RESULTS: For detection of MTMDR, the AI algorithm had higher sensitivity (95.5%; 95% confidence interval [CI], 86.7%-100%) but lower specificity (60.3%; 95% CI, 47.7%-72.9%) than remote image interpretation by retina specialists (sensitivity, 69.5%; 95% CI, 50.7%-88.3%; specificity, 96.9%; 95% CI, 93.5%-100%). Gradability of encounters was also lower for the AI algorithm (62.5%) than for retina specialists (93.1%). A 2-step AI-human hybrid workflow, in which the AI algorithm rendered an initial assessment and a retina specialist then overread MTMDR-positive encounters, resulted in a sensitivity of 95.5% (95% CI, 86.7%-100%) and a specificity of 98.2% (95% CI, 94.6%-100%). Similarly, a 2-step overread by retina specialists of AI-ungradable encounters improved gradability from 63.5% to 95.6% of encounters.
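The 2-step routing described above can be sketched as follows. This is an illustrative assumption about the decision logic, not the study's implementation: the function `hybrid_read`, the result labels, and the `human_overread` callback are hypothetical, and the sketch combines both overread steps (MTMDR-positive and AI-ungradable encounters) into one function for brevity.

```python
from typing import Callable, Literal

Result = Literal["negative", "positive", "ungradable"]


def hybrid_read(ai_result: Result, human_overread: Callable[[], Result]) -> Result:
    """Return the final read for one encounter under a 2-step workflow (sketch).

    AI-negative encounters are accepted as final, so no specialist effort
    is spent on them. AI-positive and AI-ungradable encounters are passed
    to a retina specialist, whose read becomes the final result.
    """
    if ai_result == "negative":
        return "negative"
    # Assumption: the specialist's overread replaces the AI output.
    return human_overread()
```

Because AI-negative encounters are never revisited in this sketch, the overread can remove AI false positives and resolve ungradable encounters, but it cannot recover AI false negatives.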
CONCLUSIONS: Implementation of an AI-human hybrid teleophthalmology workflow may both decrease reliance on human specialist effort and improve diagnostic accuracy.
FINANCIAL DISCLOSURES: Proprietary or commercial disclosure may be found after the references.