8 May 2024 Effective grasp detection method based on Swin transformer
Jing Zhang, Yulin Tang, Yusong Luo, Yukun Du, Mingju Chen
Abstract

Grasp detection in unstructured environments suffers from reduced grasp success rates because of object uncertainty, random object positions, and differences in perspective. This work proposes a grasp detection framework, Swin-transNet, which treats graspable objects as a single generalized category and distinguishes graspable from non-graspable objects. A Swin transformer module strengthens feature extraction and captures global relationships within images. A decoupled head with attention mechanisms then refines the channel and spatial representations of the features. This combination markedly improves the system’s adaptability to uncertain object categories and random positions and yields precise grasping information. We also elucidate the roles of these components in grasping tasks. We evaluate the framework on the Cornell grasp dataset under image-wise and object-wise splits, obtaining a detection accuracy of 98.1% and a detection time of 52 ms. Swin-transNet also generalizes robustly to the Jacquard dataset, attaining a detection accuracy of 95.2%. In real-world tests on a visual grasping system, it achieves an 87.8% grasp success rate, confirming its effectiveness for robotic grasping tasks.
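
The image-level and object-level evaluation mentioned in the abstract is commonly implemented as two ways of splitting a grasp dataset into training and test sets. The sketch below is only an illustration of that general idea, not the authors' protocol: the field names `image_id` and `object_id`, the 20% test fraction, and the toy data are all assumptions, since the abstract gives no split details.

```python
import random
from collections import defaultdict


def image_wise_split(samples, test_fraction=0.2, seed=0):
    """Shuffle individual images; the same object may land in both sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def object_wise_split(samples, test_fraction=0.2, seed=0):
    """Shuffle object ids; every image of a held-out object goes to the test set."""
    by_object = defaultdict(list)
    for sample in samples:
        by_object[sample["object_id"]].append(sample)
    object_ids = sorted(by_object)
    rng = random.Random(seed)
    rng.shuffle(object_ids)
    n_test = int(round(len(object_ids) * test_fraction))
    train = [s for oid in object_ids[n_test:] for s in by_object[oid]]
    test = [s for oid in object_ids[:n_test] for s in by_object[oid]]
    return train, test


# Toy data (hypothetical): 10 objects, 3 images of each.
samples = [
    {"image_id": f"img_{obj}_{view}", "object_id": obj}
    for obj in range(10)
    for view in range(3)
]

img_train, img_test = image_wise_split(samples)
obj_train, obj_test = object_wise_split(samples)
```

Under the object-wise split, no object in the test set has been seen during training, so it probes generalization to novel objects; the image-wise split instead probes generalization to new positions and orientations of objects that may already be familiar.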

© 2024 SPIE and IS&T
Jing Zhang, Yulin Tang, Yusong Luo, Yukun Du, and Mingju Chen "Effective grasp detection method based on Swin transformer," Journal of Electronic Imaging 33(3), 033008 (8 May 2024). https://doi.org/10.1117/1.JEI.33.3.033008
Received: 14 November 2023; Accepted: 18 April 2024; Published: 8 May 2024
KEYWORDS
Transformers

Object detection

Education and training

Feature extraction

Detection and tracking algorithms

Data modeling

Windows
