In this paper, we introduce an interactive multimodal vision-based robot teaching method. A multimodal 3D image combining color (RGB), thermal (T), and point cloud (3D) data is used to capture the temperature, texture, and geometry information required to analyze human actions. With our method, the user only needs to move a finger across an object's surface; the heat trace left by the finger is then recorded by the multimodal 3D sensor. By analyzing the multimodal point cloud dynamically, the precise finger trace on the object is recognized, and a robot trajectory is computed from this trace.
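The trace-extraction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-point temperature channel, the ambient-temperature threshold, and the "hottest point is the most recent contact" ordering heuristic are all assumptions made for the example.

```python
import numpy as np

def extract_finger_trace(points, temps, ambient_temp, delta=2.0):
    """Recover a finger trace from a thermally augmented point cloud.

    points : (N, 3) array of 3D coordinates
    temps  : (N,) array of per-point temperatures from the thermal channel
    delta  : minimum temperature rise (same units as temps) to count as a trace

    Returns the warm points ordered hottest-first, a simple stand-in for
    ordering the trace over time (the most recent contact has cooled least).
    """
    warm = temps > ambient_temp + delta          # heat left by the finger
    trace = points[warm]
    order = np.argsort(-temps[warm])             # newest contact is hottest
    return trace[order]
```

The ordered trace points could then be fitted with a smooth curve to produce the robot trajectory.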
In this work, we developed a multimodal imaging system for real-time applications by integrating 2D image sensors in different spectral ranges, as well as a polarization camera, into a high-speed optical 3D sensor. To generate the multimodal image data, a pixel-level alignment of the 2D images in the different modalities to the 3D data is realized by applying projection matrices to each point in the 3D point cloud. To compute the projection matrix for each 2D image sensor, a calibration procedure is proposed for the extrinsic calibration of arbitrarily positioned image sensors. The final imaging system delivers multimodal video data at one-megapixel resolution and a frame rate of 30 Hz. As application examples, we demonstrate the estimation of vital signs and the detection of human body parts with this imaging system.
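The pixel-level alignment described above amounts to projecting each 3D point into each 2D sensor's image plane and sampling the pixel it lands on. The sketch below assumes a standard 3×4 projection matrix `P` (as would be produced by the calibration procedure mentioned in the abstract) and nearest-pixel sampling; both are illustrative choices, not details taken from the paper.

```python
import numpy as np

def project_points(P, points):
    """Project (N, 3) 3D points into pixel coordinates with a 3x4 matrix P."""
    homog = np.hstack([points, np.ones((points.shape[0], 1))])  # (N, 4)
    uvw = homog @ P.T                                           # (N, 3)
    return uvw[:, :2] / uvw[:, 2:3]                             # divide by w

def colorize_point_cloud(P, points, image):
    """Assign each 3D point the value of the nearest pixel it projects onto.

    Returns (values, valid) where valid marks points inside the image bounds.
    """
    uv = np.round(project_points(P, points)).astype(int)
    h, w = image.shape[:2]
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    values = np.zeros((points.shape[0],) + image.shape[2:], dtype=image.dtype)
    values[valid] = image[uv[valid, 1], uv[valid, 0]]
    return values, valid
```

Running this once per 2D sensor, with that sensor's own projection matrix, yields a point cloud carrying color, thermal, and polarization channels simultaneously.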
Deep learning (DL) currently offers powerful capabilities for image processing, but it cannot output exact photometric process parameters and its results are not interpretable. Considering these limitations, this paper presents a robot vision system based on convolutional neural networks (CNNs) and Monte Carlo algorithms, and uses it as an example of how to apply DL in industry. In our approach, a CNN is used for preprocessing and offline tasks; the 6-DoF object pose is then estimated using a particle filter. Experiments show that our approach is efficient and accurate, and it suggests potential solutions for future human-machine collaboration systems.
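The particle-filter stage can be illustrated with one sequential-importance-resampling (SIR) iteration over 6-DoF pose hypotheses. This is a generic sketch of the technique, not the paper's method: the random-walk motion model, the Gaussian diffusion scale, and the form of the likelihood function (in practice derived from the CNN's output) are all assumptions.

```python
import numpy as np

def particle_filter_step(particles, weights, likelihood, noise_scale, rng):
    """One predict-update-resample iteration of a SIR particle filter.

    particles : (N, 6) array of pose hypotheses (x, y, z, roll, pitch, yaw)
    weights   : (N,) array of particle weights
    likelihood: callable scoring each particle against the current observation
    """
    # Predict: diffuse particles with Gaussian noise (random-walk motion model)
    particles = particles + rng.normal(scale=noise_scale, size=particles.shape)
    # Update: reweight by how well each pose explains the observation
    weights = weights * likelihood(particles)
    weights = weights / weights.sum()
    # Resample: draw N particles with probability proportional to their weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))
```

Iterating this step concentrates the particle set around the pose that best explains the image, and the weighted mean of the particles serves as the 6-DoF estimate.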