PMID: 22144516

Discriminative Latent Models for Recognizing Contextual Group Activities

Abstract

In this paper, we go beyond recognizing the actions of individuals and focus on group activities. This is motivated by the observation that human actions are rarely performed in isolation; the contextual information of what other people in the scene are doing provides a useful cue for understanding high-level activities. We propose a novel framework for recognizing group activities that jointly captures the group activity, the individual person actions, and the interactions among them. Two types of contextual information, group-person interaction and person-person interaction, are explored in a latent variable framework. In particular, we propose three different approaches to modeling the person-person interaction. The first approach explores the structures of person-person interaction. Unlike most previous latent structured models, which assume a predefined structure for the hidden layer, e.g., a tree structure, we treat the structure of the hidden layer as a latent variable and implicitly infer it during learning and inference. The second approach explores person-person interaction at the feature level. We introduce a new feature representation called the action context (AC) descriptor. The AC descriptor encodes information not only about the action of an individual person in the video, but also about the behavior of other people nearby. The third approach combines the two above. Our experimental results demonstrate the benefit of using contextual information for disambiguating group activities.
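To illustrate the action-context idea described in the abstract, here is a minimal sketch of an AC-style descriptor. All names, the neighborhood radius, and the max-pooling scheme are illustrative assumptions, not the authors' exact formulation: a person's descriptor concatenates their own per-action scores with a pooled summary of the scores of nearby people.

```python
import numpy as np

def action_context_descriptor(scores, positions, idx, radius=2.0):
    """Hypothetical sketch of an action-context (AC) descriptor.

    scores    : (n_people, n_actions) per-person action score vectors
    positions : (n_people, 2) image/ground-plane coordinates
    idx       : index of the focal person
    radius    : neighborhood radius (assumed parameter)

    Concatenates the focal person's own scores with a max-pooled
    summary of the scores of other people within `radius`.
    """
    own = scores[idx]
    dists = np.linalg.norm(positions - positions[idx], axis=1)
    nearby = (dists <= radius) & (np.arange(len(scores)) != idx)
    if nearby.any():
        context = scores[nearby].max(axis=0)  # pool over neighbors
    else:
        context = np.zeros_like(own)          # no one nearby
    return np.concatenate([own, context])

# Toy example: 3 people, 2 action classes.
scores = np.array([[0.9, 0.1],   # person 0: mostly action A
                   [0.2, 0.8],   # person 1: mostly action B
                   [0.3, 0.7]])  # person 2: far away
positions = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]])
desc = action_context_descriptor(scores, positions, 0)
# Person 1 is within the radius, person 2 is not, so the context
# half of the descriptor reflects person 1's behavior only.
```

The pooled context half is what lets a classifier distinguish, say, one person standing still alone from one person standing still while everyone around them queues.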

Citing Articles

Vision Sensor for Automatic Recognition of Human Activities via Hybrid Features and Multi-Class Support Vector Machine.

Kamal S, F Alhasson H, Alnusayri M, Alatiyyah M, Aljuaid H, Jalal A Sensors (Basel). 2025; 25(1).

PMID: 39796988 PMC: 11723259. DOI: 10.3390/s25010200.


HAtt-Flow: Hierarchical Attention-Flow Mechanism for Group-Activity Scene Graph Generation in Videos.

Chappa N, Nguyen P, Le T, Dobbs P, Luu K Sensors (Basel). 2024; 24(11).

PMID: 38894164 PMC: 11174860. DOI: 10.3390/s24113372.


A Novel Deep Neural Network Method for HAR-Based Team Training Using Body-Worn Inertial Sensors.

Fan Y, Tseng Y, Wen C Sensors (Basel). 2022; 22(21).

PMID: 36366202 PMC: 9658685. DOI: 10.3390/s22218507.


Multi-Perspective Representation to Part-Based Graph for Group Activity Recognition.

Wu L, Lang X, Xiang Y, Wang Q, Tian M Sensors (Basel). 2022; 22(15).

PMID: 35898025 PMC: 9371107. DOI: 10.3390/s22155521.


3DMesh-GAR: 3D Human Body Mesh-Based Method for Group Activity Recognition.

Saqlain M, Kim D, Cha J, Lee C, Lee S, Baek S Sensors (Basel). 2022; 22(4).

PMID: 35214365 PMC: 8877503. DOI: 10.3390/s22041464.

