Learning the 3-D Structure of Objects from 2-D Views Depends on Shape, Not Format
Humans can learn to recognize new objects just from observing example views. However, it is unknown what structural information enables this learning. To address this question, we manipulated the amount of structural information given to subjects during unsupervised learning by varying the format of the trained views. We then tested how format affected participants' ability to discriminate similar objects across views that were rotated 90° apart. We found that, after training, participants' performance increased and generalized to new views in the same format. Surprisingly, the improvement was similar across line drawings, shape from shading, and shape from shading + stereo, even though the latter two formats provide richer depth information than line drawings. In contrast, participants' improvement was significantly lower when training used silhouettes, suggesting that silhouettes do not carry enough information to generate a robust 3-D structure. To test whether the learned object representations were format-specific or format-invariant, we examined whether learning novel objects from example views transfers across formats. We found that learning objects from example line drawings transferred to shape from shading and vice versa. These results have important implications for theories of object recognition because they suggest that (a) learning the 3-D structure of objects does not require rich structural cues during training, as long as shape information about internal and external features is provided, and (b) learning generates shape-based object representations that are independent of the training format.