
Finding Any Waldo with Zero-shot Invariant and Efficient Visual Search

Overview
Journal: Nat Commun
Specialty: Biology
Date: 2018 Sep 15
PMID: 30213937
Citations: 17
Abstract

Searching for a target object in a cluttered scene constitutes a fundamental challenge in daily vision. Visual search must be selective enough to discriminate the target from distractors, invariant to changes in the appearance of the target, efficient to avoid exhaustive exploration of the image, and must generalize to locate novel target objects with zero-shot training. Previous work on visual search has focused on searching for perfect matches of a target after extensive category-specific training. Here, we show for the first time that humans can efficiently and invariantly search for natural objects in complex scenes. To gain insight into the mechanisms that guide visual search, we propose a biologically inspired computational model that can locate targets without exhaustive sampling and which can generalize to novel objects. The model provides an approximation to the mechanisms integrating bottom-up and top-down signals during search in natural scenes.

Citing Articles

A prospective multi-center study quantifying visual inattention in delirium using generative models of the visual processing stream.

Al-Hindawi A, Vizcaychipi M, Demiris Y Sci Rep. 2024; 14(1):15698.

PMID: 38977712 PMC: 11231180. DOI: 10.1038/s41598-024-66368-4.


Inactivation of face-selective neurons alters eye movements when free viewing faces.

Azadi R, Lopez E, Taubert J, Patterson A, Afraz A Proc Natl Acad Sci U S A. 2024; 121(3):e2309906121.

PMID: 38198528 PMC: 10801883. DOI: 10.1073/pnas.2309906121.


Gaze shifts during wayfinding decisions.

Geisen M, Bock O, Klatt S Atten Percept Psychophys. 2023; 86(3):808-814.

PMID: 37853168 PMC: 11062990. DOI: 10.3758/s13414-023-02797-z.


CNN-based search model fails to account for human attention guidance by simple visual features.

Poder E Atten Percept Psychophys. 2023; 86(1):9-15.

PMID: 36977907. DOI: 10.3758/s13414-023-02697-2.


Look twice: A generalist computational model predicts return fixations across tasks and species.

Zhang M, Armendariz M, Xiao W, Rose O, Bendtz K, Livingstone M PLoS Comput Biol. 2022; 18(11):e1010654.

PMID: 36413523 PMC: 9681066. DOI: 10.1371/journal.pcbi.1010654.


References
1. Cristino F, Mathot S, Theeuwes J, Gilchrist I. ScanMatch: a novel method for comparing fixation sequences. Behav Res Methods. 2010; 42(3):692-700. DOI: 10.3758/BRM.42.3.692.

2. Ren S, He K, Girshick R, Sun J. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Trans Pattern Anal Mach Intell. 2016; 39(6):1137-1149. DOI: 10.1109/TPAMI.2016.2577031.

3. Wu C, Wang H, Pomplun M. The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes. Vision Res. 2014; 105:10-20. DOI: 10.1016/j.visres.2014.08.019.

4. Navalpakkam V, Itti L. Modeling the influence of task on attention. Vision Res. 2004; 45(2):205-31. DOI: 10.1016/j.visres.2004.07.042.

5. Bisley J. The neural basis of visual attention. J Physiol. 2010; 589(Pt 1):49-57. PMC: 3039259. DOI: 10.1113/jphysiol.2010.192666.