
Producing Knowledge by Admitting Ignorance: Enhancing Data Quality Through an "I Don't Know" Option in Citizen Science

Overview
Journal: PLoS One
Date: 2019 Feb 28
PMID: 30811452
Citations: 5
Abstract

The "noisy labeler problem" in crowdsourced data has attracted great attention in recent years, with important ramifications in citizen science, where non-experts must produce high-quality data. Particularly relevant to citizen science is dynamic task allocation, in which the level of agreement among labelers can be progressively updated through the information-theoretic notion of entropy. Under dynamic task allocation, we hypothesized that providing volunteers with an "I don't know" option would contribute to enhancing data quality, by introducing further, useful information about the level of agreement among volunteers. We investigated the influence of an "I don't know" option on the data quality in a citizen science project that entailed classifying the image of a highly polluted canal into "threat" or "no threat" to the environment. Our results show that an "I don't know" option can enhance accuracy, compared to the case without the option; such an improvement mostly affects the true negative rather than the true positive rate. In an information-theoretic sense, these seemingly meaningless blank votes constitute a meaningful piece of information to help enhance accuracy of data in citizen science.
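The entropy-based notion of agreement mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `vote_entropy` and `needs_more_labels`, the threshold of 0.9 bits, and the vote limits are hypothetical choices made only for illustration. The sketch treats "I don't know" as a third vote category, so a blank vote raises the measured disagreement on an image instead of being forced into "threat" or "no threat".

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Shannon entropy (in bits) of the empirical distribution of votes."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # no votes yet: treated as zero entropy by convention
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def needs_more_labels(votes, threshold=0.9, min_votes=3, max_votes=10):
    """Decide whether an image should be shown to another volunteer.

    All parameters are illustrative assumptions, not values from the paper.
    """
    if len(votes) < min_votes:
        return True   # too few votes to judge agreement
    if len(votes) >= max_votes:
        return False  # labeling budget for this image is exhausted
    return vote_entropy(votes) > threshold  # keep asking while volunteers disagree


# Unanimous votes: entropy is zero, so the image can be retired early.
unanimous = ["threat", "threat", "threat"]
# The same image with an "I don't know" vote: entropy rises above the threshold.
with_blank = ["threat", "threat", "I don't know"]

print(needs_more_labels(unanimous))   # False
print(needs_more_labels(with_blank))  # True
```

Under these assumptions, an image is retired once the votes agree closely enough, and any "I don't know" vote keeps it in circulation longer, which is one way a blank vote can carry usable information about agreement.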

Citing Articles

Building International Capacity for Citizen Scientist Engagement in Mosquito Surveillance and Mitigation: The GLOBE Program's GLOBE Observer Mosquito Habitat Mapper.

Low R, Schwerin T, Boger R, Soeffing C, Nelson P, Bartlett D. Insects. 2022; 13(7).

PMID: 35886800 PMC: 9316649. DOI: 10.3390/insects13070624.


GLOBE Mosquito Habitat Mapper Citizen Science Data 2017-2020.

Low R, Boger R, Nelson P, Kimura M. Geohealth. 2021; 5(10):e2021GH000436.

PMID: 34712882 PMC: 8527845. DOI: 10.1029/2021GH000436.


Data Reliability in a Citizen Science Protocol for Monitoring Stingless Bees Flight Activity.

Leocadio J, Ghilardi-Lopes N, Koffler S, Barbieri C, Francoy T, Albertini B. Insects. 2021; 12(9).

PMID: 34564206 PMC: 8467663. DOI: 10.3390/insects12090766.


Using demographics toward efficient data classification in citizen science: a Bayesian approach.

De Lellis P, Nakayama S, Porfiri M. PeerJ Comput Sci. 2021; 5:e239.

PMID: 33816892 PMC: 7924415. DOI: 10.7717/peerj-cs.239.


Deep Lake Explorer: A web application for crowdsourcing the classification of benthic underwater video from the Laurentian Great Lakes.

Wick M, Angradi T, Pawlowski M, Bolgrien D, Debbout R, Launspach J. J Great Lakes Res. 2021; 46(5):1469-1478.

PMID: 33424103 PMC: 7787985. DOI: 10.1016/j.jglr.2020.07.009.
