Fiona Maclean1, Amanda De Souza2, Meridith Peratikos, Claudia Petrucco2
1 Douglass Hanley Moir, Franklin.ai, Sydney, NSW, Australia
2 Franklin.ai, Sydney, NSW, Australia
Abstract for ECP 2025, Vienna, Austria
Background & Objectives
International bodies, including ICCR, ISUP and GUPS, have provided recommendations regarding minimum data sets for reporting prostate core biopsy (PCB) and transurethral resection of the prostate (TURP) specimens. Artificial intelligence (AI) is a potential solution for discrepancies in diagnostic agreement among pathologists. We sought to demonstrate how our model enhanced agreement between pathologists in classifying key clinical findings when assessing whole slide images (WSI).
Methods
The diagnostic accuracy and overall agreement of 29 pathologists were evaluated on the classification of PCB and TURP WSIs (n = 1735). Accuracy was measured by the per-subject area under the curve (AUC) and overall agreement by the intraclass correlation coefficient (ICC).
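The two metrics named above can be illustrated with a minimal sketch. The abstract does not state which form of the ICC or which AUC estimator was used, so the code below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater) and a rank-based (Mann-Whitney) AUC. The function names `icc_2_1` and `auc` and the toy inputs are hypothetical and are not study data.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1), assumed form: two-way random effects, absolute agreement.

    ratings: list of rows, one row per slide, one score per rater.
    """
    n = len(ratings)        # number of slides (subjects)
    k = len(ratings[0])     # number of raters
    grand = mean(x for row in ratings for x in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Mean squares from the two-way ANOVA decomposition.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum(
        (ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


def auc(labels, scores):
    """Rank-based AUC: probability that a positive case outscores a negative
    one, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example only: 6 slides scored by 4 raters.
    toy_ratings = [
        [9, 2, 5, 8],
        [6, 1, 3, 2],
        [8, 4, 6, 8],
        [7, 1, 2, 6],
        [10, 5, 6, 9],
        [6, 2, 4, 7],
    ]
    print(round(icc_2_1(toy_ratings), 2))
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In practice a statistics package would normally be used for both quantities; the sketch is only meant to make the definitions concrete.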
Results
Analysis of pathologist performance revealed that, when assisted with AI, pathologists demonstrated considerably improved diagnostic accuracy and overall agreement in the identification of acinar adenocarcinoma Gleason patterns for both PCB and TURP specimens.
When assisted with AI, pathologists' AUC for acinar adenocarcinoma in PCBs changed from 0.84 to 0.88 for Gleason pattern 3, from 0.93 to 0.94 for Gleason pattern 4, and from 0.91 to 0.89 for Gleason pattern 5. In TURP specimens, AI-assisted pathologists' AUC improved from 0.83 to 0.84 for Gleason pattern 3, from 0.91 to 0.94 for Gleason pattern 4, and from 0.87 to 0.92 for Gleason pattern 5.
Furthermore, interobserver agreement on Gleason patterns 3, 4 and 5 was overall ‘Good’ (0.75 < ICC < 0.90) for AI-assisted pathologists, compared with ‘Moderate’ to ‘Poor’ (0 < ICC < 0.75) for unassisted pathologists.
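The agreement bands quoted above can be expressed as a simple lookup. This is a sketch only: the 0.75 and 0.90 cut-offs come from the text, while the 0.50 boundary between ‘Poor’ and ‘Moderate’, the ‘Excellent’ label above 0.90, and the handling of values exactly on a boundary are assumptions based on a commonly used convention. The function name `icc_band` is hypothetical.

```python
def icc_band(icc):
    """Map an ICC value to a qualitative agreement band.

    Cut-offs at 0.75 and 0.90 follow the abstract; the 0.50 cut-off and the
    'Excellent' band are assumed from a commonly cited convention.
    """
    if icc < 0.50:
        return "Poor"
    if icc < 0.75:
        return "Moderate"
    if icc < 0.90:
        return "Good"
    return "Excellent"


if __name__ == "__main__":
    # Illustrative values only, not study results.
    for value in (0.30, 0.60, 0.80, 0.95):
        print(value, icc_band(value))
```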
Conclusion
The diagnostic assistance provided by the AI algorithm improves pathologist accuracy and overall agreement. AI applications have the potential to enhance patient care by improving the accuracy of Gleason grading and reducing interobserver variability.