Speech recognition technology combined with three-dimensional lip movement
13 April 2001
Kazumi Komiya, Ryo Ishikawa, Keiko Momose
Proceedings Volume 4298, Three-Dimensional Image Capture and Applications IV; (2001) https://doi.org/10.1117/12.424893
Event: Photonics West 2001 - Electronic Imaging, 2001, San Jose, CA, United States
Abstract
To improve speech recognition efficiency under background noise, such as outdoors, we propose a new recognition technology that combines speech with three-dimensional lip movements. First, three-dimensional movements at four positions on the mouth were measured and analyzed with principal component analysis to clarify which positions and which directions contribute most to pronunciation. Second, recognition evaluation tests for 50 Japanese words were carried out under noise levels ranging from 40 to 80 dB. In the experiment, a recognition efficiency of over 80% was measured at 70 dB, an improvement of 40% over ordinary speech recognition. The experimental results suggest that the proposed method can be adapted into a practical speech recognition technology. Finally, topics for further research are identified, such as improving the precision of lip movement measurement and conducting experiments and data collection outdoors.
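The principal component analysis step can be illustrated with a minimal sketch. It is not the authors' implementation: the position names, the axis layout, the synthetic data, and the helper names `pca` and `make_synthetic_frames` are all assumptions made for illustration. The abstract only states that four mouth positions were measured in three dimensions. The sketch builds a 12-feature vector per frame (4 positions x 3 axes), runs PCA through an eigendecomposition of the covariance matrix, and reports which feature loads most heavily on the first component.

```python
import numpy as np

# Hypothetical layout: 4 mouth positions x 3 axes = 12 features per frame.
# These names are assumptions; the abstract does not name the positions.
POSITIONS = ["upper_lip", "lower_lip", "left_corner", "right_corner"]
AXES = ["x", "y", "z"]
FEATURES = [f"{p}_{a}" for p in POSITIONS for a in AXES]


def pca(frames):
    """Return (explained_variance_ratio, components), strongest first.

    frames: array of shape (n_frames, n_features).
    components: one principal direction per row.
    """
    centered = frames - frames.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]
    return eigvals / eigvals.sum(), eigvecs.T


def make_synthetic_frames(n_frames=200, seed=0):
    """Synthetic stand-in for measured data (not real measurements).

    Small noise on every feature, plus a large periodic vertical
    movement of the lower lip so that one feature clearly dominates.
    """
    rng = np.random.default_rng(seed)
    frames = rng.normal(scale=0.1, size=(n_frames, len(FEATURES)))
    t = np.linspace(0.0, 4.0 * np.pi, n_frames)
    frames[:, FEATURES.index("lower_lip_y")] += 5.0 * np.sin(t)
    return frames


frames = make_synthetic_frames()
ratio, components = pca(frames)

# The feature with the largest absolute loading on the first component
# is the position/direction that contributes most to the movement.
top_feature = FEATURES[int(np.argmax(np.abs(components[0])))]
print("dominant feature:", top_feature)
print("variance explained by first component:", float(ratio[0]))
```

With real measurements, `frames` would be replaced by the recorded coordinates, and the loadings of the first few components would indicate which positions and directions matter most for a given utterance.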
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Kazumi Komiya, Ryo Ishikawa, and Keiko Momose "Speech recognition technology combined with three-dimensional lip movement", Proc. SPIE 4298, Three-Dimensional Image Capture and Applications IV, (13 April 2001); https://doi.org/10.1117/12.424893
KEYWORDS
Laser induced plasma spectroscopy
Speech recognition
Mouth
Principal component analysis
CCD cameras
3D image processing
Cameras
