
Integration of Industrially-Oriented Human-Robot Speech Communication and Vision-Based Object Recognition

Overview
Journal: Sensors (Basel)
Publisher: MDPI
Specialty: Biotechnology
Date: 2020 Dec 23
PMID: 33353038
Citations: 3
Abstract

This paper presents a novel method for the integration of industrially-oriented human-robot speech communication and vision-based object recognition. Such integration is necessary to provide context for task-oriented voice commands. Context-based speech communication is easier: the commands are shorter, and hence their recognition rate is higher. In recent years, significant research has been devoted to the integration of speech and gesture recognition. However, little attention has been paid to vision-based identification of objects in industrial environments (such as workpieces or tools) that are referred to by general terms in voice commands. There are no reports of methods facilitating this integration. Image and speech recognition systems usually operate on different data structures, describing reality at different levels of abstraction, so the development of context-based voice control systems is a laborious and time-consuming task. The aim of our research was to solve this problem. The core of our method is an extension of the Voice Command Description (VCD) format, which describes the syntax and semantics of task-oriented commands, and its integration with Flexible Editable Contour Templates (FECT) used for the classification of contours derived from image recognition systems. To the best of our knowledge, this is the first solution that facilitates the development of customized vision-based voice control applications for industrial robots.
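
As a rough illustration of the kind of integration the abstract describes, the minimal Python sketch below shows how labels produced by contour-based object recognition could supply the context needed to resolve a short voice command. All names in it (RecognizedObject, resolve_command, the field and label strings) are hypothetical; they are not taken from the paper's actual VCD or FECT specifications.

```python
# Illustrative sketch only: linking vision-derived object labels to short,
# task-oriented voice commands. Names and structures are hypothetical and do
# not reproduce the VCD or FECT formats described in the paper.

from dataclasses import dataclass


@dataclass
class RecognizedObject:
    # Label obtained by matching an image contour against a contour template
    # (the role FECTs play in the paper), plus a general term usable in voice
    # commands and the object's position in the robot's work frame.
    template_label: str        # e.g. "wrench_17mm"
    general_term: str          # e.g. "wrench"
    position_mm: tuple         # (x, y, z)


def resolve_command(command_words: list, scene: list):
    """Return the scene object referred to by a context-based command.

    Because vision supplies the context, the operator can say just
    "pick up the wrench" instead of spelling out coordinates or part IDs.
    """
    for obj in scene:
        if obj.general_term in command_words:
            return obj
    return None


# Example usage with a hypothetical scene
scene = [
    RecognizedObject("wrench_17mm", "wrench", (120.0, 45.0, 0.0)),
    RecognizedObject("housing_a3", "workpiece", (310.0, 80.0, 0.0)),
]
target = resolve_command("pick up the wrench".split(), scene)
if target is not None:
    print(f"Grasp target: {target.template_label} at {target.position_mm}")
```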

Citing Articles

Scenario-Based Programming of Voice-Controlled Medical Robotic Systems.
Rogowski A. Sensors (Basel). 2022; 22(23).
PMID: 36502220 PMC: 9738457. DOI: 10.3390/s22239520.

Multi-Objective Location and Mapping Based on Deep Learning and Visual SLAM.
Sun Y, Hu J, Yun J, Liu Y, Bai D, Liu X. Sensors (Basel). 2022; 22(19).
PMID: 36236676 PMC: 9571389. DOI: 10.3390/s22197576.

Improved Multi-Stream Convolutional Block Attention Module for sEMG-Based Gesture Recognition.
Wang S, Huang L, Jiang D, Sun Y, Jiang G, Li J. Front Bioeng Biotechnol. 2022; 10:909023.
PMID: 35747495 PMC: 9209772. DOI: 10.3389/fbioe.2022.909023.

References
1.
Zanchettin A, Bascetta L, Rocco P . Acceptability of robotic manipulators in shared working environments through human-like redundancy resolution. Appl Ergon. 2013; 44(6):982-9. DOI: 10.1016/j.apergo.2013.03.028. View

2.
Sheridan T . Human-Robot Interaction: Status and Challenges. Hum Factors. 2016; 58(4):525-32. DOI: 10.1177/0018720816644364. View

3.
Jiang P, Ishihara Y, Sugiyama N, Oaki J, Tokura S, Sugahara A . Depth Image-Based Deep Learning of Grasp Planning for Textureless Planar-Faced Objects in Vision-Guided Robotic Bin-Picking. Sensors (Basel). 2020; 20(3). PMC: 7038393. DOI: 10.3390/s20030706. View

4.
Rogowski A, Skrobek P . Object Identification for Task-Oriented Communication with Industrial Robots. Sensors (Basel). 2020; 20(6). PMC: 7147712. DOI: 10.3390/s20061773. View