This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Diabetes, is properly cited. The complete bibliographic information, a link to the original publication on http://diabetes.jmir.org/, as well as this copyright and license information must be included.
With the high prevalence of diabetic retinopathy and its significant visual consequences if untreated, timely identification and management of diabetic retinopathy are essential. Teleophthalmology programs have assisted in screening a large number of individuals at risk for vision loss from diabetic retinopathy. Training nonophthalmological readers to assess remote fundus images for diabetic retinopathy may further improve the efficiency of such programs.
This study aimed to evaluate the performance, safety implications, and progress of 2 ophthalmology nurses trained to read and assess diabetic retinopathy fundus images within a hospital diabetic retinopathy telescreening program.
In this retrospective interobserver study, 2 ophthalmology nurses followed a specific training program within a hospital diabetic retinopathy telescreening program and were trained to assess diabetic retinopathy images at 2 levels of intervention: detection of diabetic retinopathy (level 1) and identification of referable disease (level 2). The reliability of the assessment by level 1−trained readers in 266 patients and of the identification of patients at risk of vision loss from diabetic retinopathy by level 2−trained readers in 559 more patients was measured. The learning curve, sensitivity, and specificity of the readings were evaluated using a group consensus gold standard.
An almost perfect agreement was measured in identifying the presence of diabetic retinopathy in both level 1 readers (κ=0.86 and 0.80) and in identifying referable diabetic retinopathy by level 2 readers (κ=0.80 and 0.83). At least substantial agreement was measured in the level 2 readers for macular edema (κ=0.79 and 0.88) for all eyes. Good screening threshold sensitivities and specificities were obtained for all level readers, with sensitivities of 90.6% and 96.9% and specificities of 95.1% and 85.1% for level 1 readers (readers A and B) and with sensitivities of 86.8% and 91.2% and specificities of 91.7% and 97.0% for level 2 readers (readers A and B). This performance was achieved immediately after training and remained stable throughout the study.
Notwithstanding the small number of trained readers, this study validates the screening performance of level 1 and level 2 diabetic retinopathy readers within this training program, emphasizing practical experience, and allows the establishment of an ongoing assessment clinic. This highlights the importance of supervised, hands-on experience and may help set parameters to further calibrate the training of diabetic retinopathy readers for safe screening programs.
Diabetic retinopathy is the main cause of legal and functional blindness in the working-age population and in many developed countries [
The use of teleophthalmology programs to detect diabetic retinopathy and manage follow-up has been shown to be cost-effective [
This study aimed to evaluate the performance, safety implications, and progress of 2 ophthalmology nurses in detecting diabetic retinopathy and identifying referable diseases following specific training in a diabetic retinopathy telescreening program. Their reading results were compared with those obtained from a retina specialist and the gold standard, consisting of a group-arbitrated consensus. A secondary objective was to determine the reason for reading discrepancies.
This study identifies training parameters to help tailor and standardize the training of nonophthalmologist readers for safe diabetic retinopathy interpretation in a screening program and validates the individual and group performance of trainee readers within this program. However, as with any screening program, continuous monitoring and education of readers after the training process remain necessary.
This study was approved by the Institutional Suitability Committee, the Scientific Evaluation Committee, and the Research Ethics Committee of the Centre Intégré Universitaire de Santé et de Services Sociaux de l'Est-de-l'Île-de-Montréal, Montreal, Québec, Canada, where it was conducted (US Federal Wide Assurance numbers FWA00001935 and IRB00002087).
This retrospective interobserver reliability study was conducted on 829 patients with type 2 diabetes who attended a screening visit within a hospital-based teleophthalmology program at the Maisonneuve-Rosemont University Ophthalmology Center between February 2016 and September 2018. A total of 4 patients with laser scars from diabetic retinopathy treatment had been mistakenly included in the program and were excluded from the analysis; therefore, the final analysis was conducted on 825 individuals (1650 eyes). Patients were imaged by an ophthalmic photographer with a nonmydriatic camera (iCam-Optovue) after pupil dilation with 1% tropicamide to reduce ungradable imaging. Two 45-degree image fields, 1 image centered on the disc and 2 centered on the macula, were obtained to ensure adequate macular imaging. Demographics were not collected.
The images were securely transmitted to a dedicated hospital server and accessed by all readers from a teleophthalmology diabetic retinopathy electronic platform (iVision from RetinaLabs), which allowed interpretation by various levels of readers. The images were reviewed nonstereoscopically at the capture resolution, with automated or manual image enhancement (magnification, brightness, and contrast) (Adobe Photoshop 7.0, Adobe Systems Inc). Images were assessed using grading software that displayed the grading scheme and the Early Treatment Diabetic Retinopathy Study (ETDRS) standard photographs as references at all times. The integrated grading scheme is based on the Scottish Diabetic Retinopathy Grading Scheme (2007) [
Through the teleophthalmology platform, level 1 readers determine, for each eye, the image quality and whether diabetic retinopathy is present (corresponding to ≥R1) or absent, and identify any other detected abnormalities. Level 2 readers determine image quality and grade diabetic retinopathy in 5 severity levels: no retinopathy (R0), mild (R1), moderate (R2), severe nonproliferative diabetic retinopathy (R3), and proliferative diabetic retinopathy (R4). They also specifically grade diabetic macular edema (DME) as none (M0), presence of any microaneurysm, hemorrhage, or exudate within 2 disc diameters (DD) of the fovea (M1), or within 1 DD of the fovea (M2). Any other abnormality is also identified for ophthalmologic attention. Ungradable images are labeled as R6 for the general diabetic assessment and M6 for the macular assessment by all readers, which leads to an automatic referral for an in-person examination after validation by the level 3 reader (retina specialist). The level 3 reader (MB), who is blinded to the trained readers, rereads all of the images on the same teleophthalmology platform, acting as a level 1 or level 2 reader.
For teaching and quality assurance purposes, all 3 readers attended a weekly group review to discuss any discrepancies between the level 1 or level 2 readings and those of the level 3 reader, as flagged by the built-in quality assurance module of the electronic reading platform. The final consensus on any reading disagreement was determined by group arbitration, which was established as the gold standard.
Two ophthalmology nurses (A and B), 1 technical and 1 clinical, voluntarily participated in this study and were trained successively to intervene as level 1 and level 2 readers. Outside of training for visual acuity measurement and instillation of dilating eye drops, they had no relevant experience or credentials in assessing diabetic retinopathy or prior involvement in any eye imaging.
The training of level 1 reading was provided by a validated interactive electronic platform [
Training for level 2 reading involves weekly sessions of quality assurance and group reviews of all new level 1 individual readings. This enables progressive recognition of the diabetic retinopathy severity that leads to a referral to a level 3 reader (retina specialist): severity >R2 (exceeding a moderate level of retinopathy) or ≥M2 (possible DME within 1 DD of the fovea). The precautionary referral of any uncertain or unusual findings, such as other pathology or atypical variation of normal characteristics, is emphasized.
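The referral thresholds described above can be expressed as a small decision rule. The following is a minimal sketch, not the platform's implementation: the names `EyeGrade`, `is_referable`, and `patient_is_referable` are hypothetical, grades are assumed to be stored as integers (R0 to R4 and M0 to M2, with 6 for ungradable), and ungradable images are treated as referable because the text states that R6 and M6 lead to an automatic referral.

```python
from dataclasses import dataclass

# Assumption: 6 encodes an ungradable image, mirroring the R6 and M6 labels.
UNGRADABLE = 6


@dataclass(frozen=True)
class EyeGrade:
    """Hypothetical container for one eye's grading."""
    r: int  # retinopathy grade: 0-4 for R0-R4, or 6 if ungradable
    m: int  # maculopathy grade: 0-2 for M0-M2, or 6 if ungradable


def is_referable(grade: EyeGrade) -> bool:
    """Return True when the eye meets the referral thresholds in the text."""
    if grade.r == UNGRADABLE or grade.m == UNGRADABLE:
        return True  # ungradable imaging triggers a referral
    # Referable disease: retinopathy above R2, or maculopathy at M2 or worse.
    return grade.r > 2 or grade.m >= 2


def patient_is_referable(right: EyeGrade, left: EyeGrade) -> bool:
    """A patient is referred when either eye is referable."""
    return is_referable(right) or is_referable(left)
```

For example, `patient_is_referable(EyeGrade(1, 0), EyeGrade(2, 2))` is true because the second eye is graded M2, whereas a patient graded R2 and M1 in both eyes is not referred under this rule.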
The level 1 readers spontaneously reported feeling comfortable with level 2 reading after the group review and training on 266 imaged patients (532 eyes), of which reader A had individually assessed 114 patients and reader B, 152 patients. This was set as the starting point for the evaluation of their next readings, for a total of 1118 level 2 eye readings in 559 patients (323 patients for reader A and 236 for reader B).
The kappa (
The screening performance (sensitivity and specificity), diagnostic accuracy (95% CI), and the learning curve in 50-patient strata of the level 1 and level 2 readers were calculated against the consensus gold standard readings, as were those of the level 3 reader, for each of the level 1 and level 2 cohorts. Grading of the most affected eye was used to calculate the sensitivity and specificity of the patient readings.
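The learning-curve analysis in 50-patient strata can be sketched as follows. This is an illustration under stated assumptions rather than the study's analysis code: each patient is reduced to a hypothetical pair of booleans (reader decision, gold standard decision) in reading order, and the function names `sens_spec` and `per_stratum` are invented for the example.

```python
from typing import List, Optional, Tuple

# Assumption: one (reader_positive, gold_positive) pair per patient,
# listed in the order in which the patients were read.
Reading = Tuple[bool, bool]


def sens_spec(readings: List[Reading]) -> Tuple[Optional[float], Optional[float]]:
    """Sensitivity and specificity of the reader versus the gold standard."""
    tp = sum(1 for reader, gold in readings if reader and gold)
    fn = sum(1 for reader, gold in readings if not reader and gold)
    tn = sum(1 for reader, gold in readings if not reader and not gold)
    fp = sum(1 for reader, gold in readings if reader and not gold)
    # A stratum with no positives (or no negatives) has no defined value.
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


def per_stratum(readings: List[Reading], size: int = 50):
    """Split successive readings into strata and score each one."""
    return [
        sens_spec(readings[start:start + size])
        for start in range(0, len(readings), size)
    ]
```

Plotting the per-stratum values in order gives a learning curve; a flat curve corresponds to performance that is stable from the first stratum onward.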
A total of 532 eyes (266 patients) were evaluated at level 1, of which level 1 reader A and reader B individually assessed 228 eyes (114 patients) and 304 eyes (152 patients), respectively. A total of 1118 eyes (559 patients) were assessed by the level 2 readers, which also included an evaluation for DME, and of which level 2 reader A and reader B assessed 646 eyes (323 patients) and 472 eyes (236 patients), respectively.
Excluding any ungradable images as per the consensus gold standard, the global prevalence of diabetic retinopathy (≥R1) was 46.2% (117/254) and 37.3% (196/526) in the level 1 and level 2 cohorts, respectively, and the total prevalence of diabetic macular involvement was 25.8% (135/523). The prevalence and distribution of disease severity and number of ungradable images were comparable between level 1 and level 2 cohorts and between reader A and B according to the consensus gold standard grading (
The most common reason for referral was DME (102/151, 67.5%), followed by severe diabetic retinopathy with DME (11/151, 7.3%), and severe diabetic retinopathy without DME (2/151, 1.3%;
The kappa values in
Reasons for diabetic retinopathy referral in level 2 and level 3 readers and the consensus gold standard (N=559).
Diabetic retinopathy grading | Reader A, n (%) | Level 3 reader for reader A, n (%) | Consensus gold standard for reader A, n (%) | Reader B, n (%) | Level 3 reader for reader B, n (%) | Consensus gold standard for reader B, n (%) | Consensus gold standard for all readings, n (%) |
M>1 only (including R6) | 70 (76) | 57 (72) | 60 (72) | 38 (57) | 36 (58) | 42 (62) | 102 (67.6) |
R6 and M6 only | 16 (17) | 17 (22) | 17 (21) | 18 (27) | 19 (31) | 19 (28) | 36 (23.8) |
R>2 and M>1 | 5 (5) | 4 (5) | 4 (5) | 8 (12) | 7 (11) | 7 (10) | 11 (7.3) |
R>2 only (including M6) | 1 (1) | 1 (1) | 2 (2) | 4 (5) | 0 (0) | 0 (0) | 2 (1.3) |
Total referrals | 92 | 79 | 83 | 67 | 62 | 68 | 151 |
Agreements of level 1, 2, and 3 readings for referable (>R2) diabetic retinopathy and diabetic macular edema (>M1) and referral to ophthalmology for all eyes versus the consensus gold standard (level 1 reading [n=266] and level 2 reading [n=1118]).
Reader | Consensus gold standard referable diabetic retinopathy, κ^a (95% CI) | Consensus gold standard referable diabetic macular edema grading, κ (95% CI) | Consensus gold standard referral to ophthalmology, κ (95% CI) |
Level 1 reading | | | |
Reader A (n=114) | N/A^b | N/A | 0.859 (0.764-0.953) |
Level 3 reader for reader A | N/A | N/A | 1.00 (1.000-1.000) |
Reader B (n=152) | N/A | N/A | 0.803 (0.709-0.896) |
Level 3 reader for reader B | N/A | N/A | 1.00 (1.000-1.000) |
Level 2 reading | | | |
Reader A (n=646) | 0.803 (0.757-0.850) | 0.788 (0.733-0.842) | 0.757 (0.677-0.838) |
Level 3 reader for reader A | 0.940 (0.912-0.968) | 0.961 (0.935-0.986) | 0.967 (0.935-0.999) |
Reader B (n=472) | 0.826 (0.777-0.874) | 0.877 (0.830-0.925) | 0.887 (0.822-0.952) |
Level 3 reader for reader B | 0.957 (0.930-0.983) | 0.946 (0.914-0.979) | 0.936 (0.886-0.987) |
^a κ: kappa coefficient. All kappas have
^b N/A: not applicable.
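For readers unfamiliar with the statistic, κ measures agreement beyond what chance alone would produce: κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected from each rater's marginal frequencies. The sketch below assumes the unweighted Cohen's kappa; the variant used in the study is not specified in this excerpt, and the function name `cohen_kappa` is hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa for two raters grading the same items."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must grade the same nonempty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (observed - expected) / (1.0 - expected)
```

In this setting, `rater_a` would hold a trained reader's per-eye decisions (for example, referable or not) and `rater_b` the consensus gold standard decisions for the same eyes.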
With respect to the cohorts, good screening threshold sensitivities and specificities were obtained in all level readers (
Sensitivity and specificity for the identification of patient referrals by each reader versus the consensus gold standard.
Reader | Number of patients, n | Sensitivity, % (95% CI) | Specificity, % (95% CI) |
Level 1 reading | | | |
Reader A | 114 | 91 (82.70-98.44) | 95 (89.66-100.51) |
Level 3 reader for reader A | 114 | 100 (100-100) | 100 (100-100) |
Reader B | 152 | 97 (92.72-101.12) | 85 (77.57-92.55) |
Level 3 reader for reader B | 152 | 100 (100-100) | 100 (100-100) |
Level 2 reading | | | |
Reader A | 323 | 86.8 (79.45-94.04) | 91.7 (88.17-95.16) |
Level 3 reader for reader A | 323 | 95.2 (90.57-99.79) | 100 (100-100) |
Reader B | 236 | 91.2 (84.43-97.92) | 97.0 (94.45-99.59) |
Level 3 reader for reader B | 236 | 91.2 (84.43-97.92) | 100 (100-100) |
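Some upper confidence limits in the table exceed 100%, which is what a normal-approximation (Wald) interval produces when a proportion is close to 1. Whether the authors used this method is an inference, not something stated in this excerpt. The sketch below shows the calculation with a hypothetical function name, `wald_interval`, and illustrative counts that are not reported in the source.

```python
import math
from typing import Tuple


def wald_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Proportion with its normal-approximation (Wald) 95% interval.

    The bounds are deliberately not clipped to [0, 1], so the upper bound
    can exceed 1 when the proportion is close to 1 and the sample is small.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    half_width = z * math.sqrt(p * (1.0 - p) / total)
    return p, p - half_width, p + half_width


# Illustrative counts only (assumption): 63 true positives out of 65
# gold standard positives.
sensitivity, lower, upper = wald_interval(63, 65)
print(f"{100 * sensitivity:.1f}% ({100 * lower:.2f}-{100 * upper:.2f})")
```

Intervals such as Wilson or Clopper-Pearson stay within 0% to 100% and are often preferred for proportions near the boundary; they are mentioned here only as alternatives, not as the study's method.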
Level 2 and 3 reading discrepancies with the consensus gold standard and their impact on patient management are described in
A missed referral to ophthalmology is considered a significant misreading and occurred in 2.8% (9/323) and 2.5% (6/236) of patients in the level 2 readings of readers A and B, respectively, and, in the same cohorts, in 1.2% (4/323) and 2.5% (6/236) of patients in the level 3 readings. A comparable rate of significant misreading (excluding underappreciation of image quality) is shown in both level 2 readers (6/323, 1.9% and 5/236, 2.1%, respectively for reader A and reader B) and level 3 readers (4/323, 1.2% and 6/236, 2.5%). All image misreadings were related to unrecognized isolated microaneurysms located within 1 DD of the fovea in the absence of any exudate, except for 1 eye with neovascularization misinterpreted as an epiretinal membrane by the level 3 reader and confirmed on clinical examination.
Level 2 readers also show an overall underappreciation of ungradable imaging in 1.2% (4/323) and 0.8% (2/236) of the patients, respectively for reader A and reader B. Analysis in strata of 50 successive patients showed that this rate persisted as experience was gained.
Misreadings affected patient management, such as the timing of new imaging or referral for in-person examination, in 73% (48/66) and 55% (23/42) of the disagreements of level 2 readers A and B, respectively, and in 67% (12/18) and 64% (9/14) of the disagreements of the level 3 reader in the corresponding level 2 cohorts.
Both level 2 readers tended to be more conservative in their actions, with unnecessary referral recommendations in 6.5% (21/323) and 2.1% (5/236) of patients, as compared with 0% for the level 3 reader, and reimaging recommended sooner than indicated in 4.3% (14/323) and 4.7% (11/236) of patients, respectively. Both level 2 readers acknowledged possible unnecessary referrals, but still referred patients as a precaution in 1.2% (4/323) and 0.4% (1/236) of all screenings, respectively, which represented 6% (4/66) and 2% (1/42) of their misreads.
Level 2 and level 3 reader disagreements according to the consensus gold standard and impact on patient management (N=559).
Effect of disagreement | Reader A (n=323), n (%) | Level 3 reader for reader A (n=323), n (%) | Reader B (n=236), n (%) | Level 3 reader for reader B (n=236), n (%) |
No impact on patient management | 18 (5.6) | 6 (1.9) | 19 (8.1) | 5 (2.1) |
Impact on patient management | 48 (14.9) | 12 (3.7) | 23 (9.8) | 9 (3.8) |
Total number of disagreements | 66 (20.4) | 18 (5.6) | 42 (17.8) | 14 (5.9) |
No referral although indicated | 9 (2.8) | 4 (1.2) | 6 (2.5) | 6 (2.5) |
Unnecessary referral | 21 (6.5) | 0 (0) | 5 (2.1) | 0 (0) |
Imaging recommended sooner than necessary | 14 (4.3) | 0 (0) | 11 (4.7) | 1 (0.4) |
Imaging recommended later than indicated | 4 (1.2) | 8 (2.5) | 1 (0.4) | 2 (0.9) |
Significant misreads (missed referrals) | | | | |
Missed isolated microaneurysm within 1 DD^a of the fovea | 6 (1.9) | 3 (0.9) | 5 (2.1) | 6 (2.5) |
Confusion of neovascularization with an epiretinal membrane | 0 (0) | 1 (0.3) | 0 (0) | 0 (0) |
Underappreciation of ungradable imaging | 3 (0.9) | 0 (0) | 1 (0.4) | 0 (0) |
Other misreads affecting patient management | | | | |
Misreads with minimal impact on management | 34 (10.5) | 8 (2.5) | 15 (6.4) | 3 (1.3) |
Referrals as a precaution | 4 (1.2) | 0 (0) | 1 (0.4) | 0 (0) |
Underappreciation of ungradable imaging | 1 (0.3) | 0 (0) | 1 (0.4) | 0 (0) |
^a DD: disc diameter.
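The counts in the table above are related by simple sums: the four management effects add up to the disagreements with an impact on patient management, and adding the disagreements without impact gives the total. A minimal sketch of that bookkeeping follows; the counts are copied from the table, while the class and attribute names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disagreements:
    """Disagreement counts for one reader column of the table."""
    n_patients: int
    no_impact: int
    missed_referral: int
    unnecessary_referral: int
    imaging_sooner: int
    imaging_later: int

    @property
    def with_impact(self) -> int:
        # Disagreements that changed referral or reimaging timing.
        return (self.missed_referral + self.unnecessary_referral
                + self.imaging_sooner + self.imaging_later)

    @property
    def total(self) -> int:
        return self.no_impact + self.with_impact


# Counts transcribed from the table above.
COLUMNS = {
    "Reader A": Disagreements(323, 18, 9, 21, 14, 4),
    "Level 3 reader for reader A": Disagreements(323, 6, 4, 0, 0, 8),
    "Reader B": Disagreements(236, 19, 6, 5, 11, 1),
    "Level 3 reader for reader B": Disagreements(236, 5, 6, 0, 1, 2),
}

for name, column in COLUMNS.items():
    share = 100 * column.with_impact / column.total
    print(f"{name}: {column.with_impact}/{column.total} "
          f"disagreements affected management ({share:.0f}%)")
```

The same structure makes it easy to compare how often each reader's disagreements were conservative (unnecessary referral, earlier imaging) rather than potentially unsafe (missed referral, later imaging).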
The per-strata sensitivities and specificities of level 1 and level 2 readers show high sensitivity and specificity for all readers, achieved immediately after training to detect any presence of diabetic retinopathy for level 1 readers and, for level 2 readers, to identify referable disease (>R2 and/or >M1), which were maintained throughout the study (
The cumulative incidence curve of misreadings for level 2 reader A image readings.
The cumulative incidence curve of misreadings for level 2 reader B image readings.
This study emphasizes the importance of practical experience and validates the screening performance and training of level 1 and level 2 diabetic retinopathy readers within this program. It may thus help set parameters to further calibrate the training of diabetic retinopathy readers for safe screening programs.
It shows 91% and 97% sensitivities, and 95% and 85% specificities in detecting any diabetic retinopathy, and 86.8% and 91.2% sensitivities, and 91.7% and 97.0% specificities in the identification of sight-threatening disease relative to the cohorts. These results are comparable to those reported in studies with similar conditions [
Ruamviboonsuk et al [
Although not consistently met in many studies evaluating gold standards in diabetic retinopathy detection [
Certification training programs, such as that of the United Kingdom National Health Service, suggest that good reading performance indicates good training but does not address minimal practical training experience for readers [
The failure of level 2 readers to recognize inadequate imaging under pupil dilation in 1.2% (4/323) and 0.8% (2/236) of all readings, respectively, represented 6% (4/66) and 5% (2/42) of all of their disagreements with the gold standard. In comparison, Farley et al [
Level 2 readers made more conservative assessments, resulting in precautionary referrals in 1.2% (4/323) and 0.4% (1/236) of their readings versus none of the level 3 readings. Although these rates are small, further training to recognize unusual variants of normal and those having to be brought to the attention of the ophthalmologist as a precaution may help increase specificity and further reduce the workload on ophthalmologists.
Significant misreads causing missed referrals to ophthalmology were all related to missed isolated microaneurysms located within 1 DD of the fovea in the absence of any exudate, except for 1 level 3 reader misinterpretation of neovascularization as an epiretinal membrane. An isolated microaneurysm within 1 DD of the fovea does not signal DME unless associated with a positive optical coherence tomography establishing edema, but does signal a potential risk of DME with time. Missed detection of possible DME was found to be the worst scenario in 1.9% (6/323) and 2.1% (5/236) of level 2 reader significant misreads and in 0.9% (3/323) and 2.5% (6/236) of those of the level 3 reader. Level 2 readers appear to have greater sensitivity in detecting these isolated microaneurysms, as these misreadings represent 9% (6/66) and 12% (5/42) of all of their disagreements with the gold standard in comparison to 17% (3/18) and 36% (5/14) of those of level 3 respective to the cohorts. Moss et al [
DME was the major cause of referral in this study at 67.5% (102/151) of all referrals, followed by 7.3% (11/151) for severe diabetic retinopathy with DME and 1.3% (2/151) for severe diabetic retinopathy without DME.
Although overall screening posed no visual safety threat in 98.0% (548/559) of patients assessed by the level 2 readers (317/323, 98.1% and 231/236, 97.9%, respectively) and 98.2% (549/559) of all level 3 readings, a small number could be put at risk with this process. The majority were related to difficult positive identification of isolated microaneurysms in the macular area at the limit of detection, which often resulted in arbitration for the final gold standard grading. These might eventually be resolved with the use of higher-resolution cameras for screening. Recommendations for reimaging later than required could represent some level of risk in 1.2% (4/323) and 0.4% (1/236) of the patients assessed by the level 2 readers compared with those of the level 3 reader.
Two challenging cases of an isolated microaneurysm near the fovea. Arrows are used to indicate the location of microaneurysms.
This study outperforms the screening results of Oke et al [
Limitations of this study include its retrospective nature and the small number of trained readers, which only validates the individual and group performance of these readers within this specific training. These results may not apply to a larger reading group where possible individual performance variations could occur.
This study validates the screening performance and accuracy of the specific training of 2 nonphysician graders as level 1 (triage) and level 2 (referable diabetic retinopathy) graders who achieved a very high initial agreement that was maintained throughout the study and whose image interpretations compared favorably with those of a retina specialist and the consensus gold standard. It adds new information to the scant literature on diabetic retinopathy reader training modalities, emphasizes the importance of training experience for reading, and suggests a starting threshold in a similar setting to train nonophthalmologist readers and meet quality standards. As with other studies [
Summary of the Scottish Diabetic Retinopathy Grading Scheme 2007 v1.1.
Patient distribution of disease severity according to the worst eye and ungradable imaging between reader A and B, according to gold standard grading.
Diabetic retinopathy findings and ungradable imaging in level 1 and level 3 readers and in the consensus gold standard.
Findings and prevalence of diabetic retinopathy severity, diabetic macular edema, and ungradable imaging in patients by individual readers in level 2 and level 3 readings and the consensus gold standard.
Findings and prevalence of diabetic retinopathy severity, diabetic macular edema, and ungradable imaging by individual readers for all eyes imaged in level 2 and level 3 readings and the consensus gold standard.
Per-strata sensitivities and specificities of level 2 readers for every patient as determined by the eye with the worst grading severity.
Per-strata sensitivities and specificities of level 2 readers for all eyes.
DD: disc diameter
DME: diabetic macular edema
ETDRS: Early Treatment Diabetic Retinopathy Study
The following individuals are the members of the Trained Reader Screening for Diabetic Retinopathy Study Group: Dr MC Boucher, Technical Nurse Céline Bugeaud, Clinical Nurse Annie Croteau, First Year Ophthalmology Resident Michael Trong Duc Nguyen.
None declared.