Bias in Estimating Accuracy of a Binary Screening Test with Differential Disease Verification

Overview

Journal: Stat Med

Publisher: Wiley

Specialty: Public Health

Date: 2011 Apr 16

PMID: 21495059

Citations: 7

Authors

Todd A Alonzo

John T Brinton

Brandy M Ringham

Deborah H Glueck

Abstract

Sensitivity, specificity, and positive and negative predictive values are typically used to quantify the accuracy of a binary screening test. In some studies, it may not be ethical or feasible to obtain definitive disease ascertainment for all subjects using a gold standard test. When a gold standard test cannot be used, an imperfect reference test that is less than 100 per cent sensitive and specific may be used instead. In breast cancer screening, for example, follow-up for cancer diagnosis is used as an imperfect reference test for women for whom it is not possible to obtain gold standard results. This incomplete ascertainment of true disease, or differential disease verification, can result in biased estimates of accuracy. In this paper, we derive the apparent accuracy values for studies subject to differential verification. We determine how the bias is affected by the accuracy of the imperfect reference test, the percentage of subjects who receive the imperfect reference test instead of the gold standard, the prevalence of the disease, and the correlation between the results of the screening test and the imperfect reference test. It is shown that designs with differential disease verification can yield biased estimates of accuracy. Estimates of sensitivity in cancer screening trials may be substantially biased. However, careful design decisions, including selection of the imperfect reference test, can help to minimize bias. A hypothetical breast cancer screening study is used to illustrate the problem.
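
The mechanism described in the abstract can be illustrated with a small calculation. The sketch below is not the derivation from the paper. It assumes that the screening test and the imperfect reference test are conditionally independent given true disease status (the paper also considers correlated results), and that the chance of receiving the gold standard depends only on the screening result. The function name apparent_accuracy and all parameter names are hypothetical, chosen for this illustration.

```python
def apparent_accuracy(prev, se_t, sp_t, se_r, sp_r, g_pos, g_neg):
    """Apparent sensitivity and specificity of a screening test T when some
    subjects are verified by a gold standard and the rest by an imperfect
    reference test R.

    Illustrative assumptions (not taken from the paper):
      * T and R are conditionally independent given true disease status.
      * g_pos / g_neg are the fractions of screen-positive / screen-negative
        subjects who receive the gold standard; everyone else receives R.
    """
    # Joint probabilities of the screening result and TRUE disease status.
    tpos_d = prev * se_t               # T+, diseased
    tneg_d = prev * (1 - se_t)         # T-, diseased
    tpos_nd = (1 - prev) * (1 - sp_t)  # T+, not diseased
    tneg_nd = (1 - prev) * sp_t        # T-, not diseased

    # Joint probabilities of the screening result and the REFERENCE result,
    # obtained by summing over true disease status.
    tpos_rpos = tpos_d * se_r + tpos_nd * (1 - sp_r)
    tpos_rneg = tpos_d * (1 - se_r) + tpos_nd * sp_r
    tneg_rpos = tneg_d * se_r + tneg_nd * (1 - sp_r)
    tneg_rneg = tneg_d * (1 - se_r) + tneg_nd * sp_r

    # Cells of the 2x2 table the study would actually observe: a mixture of
    # gold standard and imperfect reference classifications.
    tp = g_pos * tpos_d + (1 - g_pos) * tpos_rpos
    fp = g_pos * tpos_nd + (1 - g_pos) * tpos_rneg
    fn = g_neg * tneg_d + (1 - g_neg) * tneg_rpos
    tn = g_neg * tneg_nd + (1 - g_neg) * tneg_rneg

    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical design: every screen-positive gets the gold standard,
    # every screen-negative gets a reference that misses half of the cases.
    app_se, app_sp = apparent_accuracy(
        prev=0.01, se_t=0.8, sp_t=0.9,
        se_r=0.5, sp_r=1.0, g_pos=1.0, g_neg=0.0,
    )
    print(f"apparent sensitivity = {app_se:.4f} (true value 0.8)")
    print(f"apparent specificity = {app_sp:.4f} (true value 0.9)")
```

With these made-up inputs the apparent sensitivity comes out above the true value of 0.8, because cases missed by both the screening test and the reference are counted as disease-free. Setting g_pos and g_neg to 1, or se_r and sp_r to 1, recovers the true sensitivity and specificity under this model.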

Citing Articles

Estimating Cancer Screening Sensitivity and Specificity Using Healthcare Utilization Data: Defining the Accuracy Assessment Interval.

Chubak J, Burnett-Hartman A, Barlow W, Corley D, Croswell J, Neslund-Dudas C. Cancer Epidemiol Biomarkers Prev. 2022; 31(8):1517-1520.

PMID: 35916602 PMC: 9484579. DOI: 10.1158/1055-9965.EPI-22-0232.

Adjusting for verification bias in diagnostic accuracy measures when comparing multiple screening tests - an application to the IP1-PROSTAGRAM study.

Day E, Eldred-Evans D, Prevost A, Ahmed H, Fiorentino F. BMC Med Res Methodol. 2022; 22(1):70.

PMID: 35300611 PMC: 8932251. DOI: 10.1186/s12874-021-01481-w.

Diagnostic test evaluation methodology: A systematic review of methods employed to evaluate diagnostic tests in the absence of gold standard - An update.

Umemneku Chikere C, Wilson K, Graziadio S, Vale L, Allen A. PLoS One. 2019; 14(10):e0223832.

PMID: 31603953 PMC: 6788703. DOI: 10.1371/journal.pone.0223832.

Anticipating missing reference standard data when planning diagnostic accuracy studies.

Naaktgeboren C, de Groot J, Rutjes A, Bossuyt P, Reitsma J, Moons K. BMJ. 2016; 352:i402.

PMID: 26861453 PMC: 4772780. DOI: 10.1136/bmj.i402.

Exploring the Underdiagnosis and Prevalence of Autism Spectrum Conditions in Beijing.

Sun X, Allison C, Matthews F, Zhang Z, Auyeung B, Baron-Cohen S. Autism Res. 2015; 8(3):250-60.

PMID: 25952676 PMC: 4690159. DOI: 10.1002/aur.1441.
