9 October 2024 Improved MaskFormer visual target segmentation method for noisy interference images
Yuanjin Sun, Gang Li, Pengbo Li, Ling Zhang, Shujing An, Jingkun Cao
Proceedings Volume 13288, Fourth International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2024); 132880U (2024) https://doi.org/10.1117/12.3044886
Event: Fourth International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2024), 2024, Chengdu, China
Abstract
Images defaced by heavy noise and containing complex features are difficult to segment accurately, and traditional CNN-based target segmentation methods struggle to fully extract detail information from them. To address this, this paper proposes a segmentation network based on an improved MaskFormer. First, a lightweight mask attention mechanism replaces the attention mechanism of the Transformer decoder, and cross-attention is restricted to the foreground region of each predicted mask, which reduces interference from smeared regions and enhances the extraction of local features. Second, GAMS, a multi-scale feature fusion module based on a gating mechanism, captures the semantic information of the image at different scales, which improves the feature discrimination ability of the model. Third, the order of self-attention and lightweight mask attention is exchanged to reduce network computation and improve training efficiency. In segmentation experiments on the ADE20K and COCO-Stuff-10k datasets after noise defacement, the improved MaskFormer obtains the best values on evaluation metrics such as MIoU, ssMIoU and msMIoU, and segments better than the other networks compared.
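The abstract gives no implementation details, so the following is only a minimal NumPy sketch of the two ideas it names: cross-attention restricted to each predicted mask's foreground, and gate-based fusion of two feature maps. The function names, tensor shapes, the 0.5 threshold, the fallback for empty masks, and the single linear gate are all assumptions made for illustration; they are not taken from the paper and do not reproduce its mask attention or GAMS modules.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def masked_cross_attention(queries, keys, values, mask_logits, threshold=0.5):
    """Cross-attention where each query attends only to its predicted foreground.

    queries:     (N, C)  one embedding per candidate mask
    keys/values: (HW, C) flattened image features
    mask_logits: (N, HW) mask prediction from a previous layer (assumed input)
    """
    channels = queries.shape[1]
    foreground = sigmoid(mask_logits) >= threshold            # (N, HW) bool
    # Assumption: a query with an empty foreground attends everywhere,
    # so that its softmax stays well defined.
    empty = ~foreground.any(axis=1, keepdims=True)
    foreground = foreground | empty
    scores = queries @ keys.T / np.sqrt(channels)             # (N, HW)
    scores = np.where(foreground, scores, -np.inf)            # drop background
    weights = softmax(scores, axis=-1)
    return weights @ values, weights


def gated_fusion(coarse, fine, w_gate, b_gate):
    """Blend two same-shaped feature maps (HW, C) with a per-element gate.

    w_gate: (2C, C) and b_gate: (C,) stand in for learned parameters.
    """
    gate = sigmoid(np.concatenate([coarse, fine], axis=-1) @ w_gate + b_gate)
    return gate * coarse + (1.0 - gate) * fine
```

In this sketch, positions outside a query's thresholded mask receive exactly zero attention weight, which is one way that background or smeared pixels can be kept from influencing the query update. The gate lets the model choose, per position and channel, how much of each scale to keep.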
(2024) Published by SPIE. Downloading of the abstract is permitted for personal use only.
Yuanjin Sun, Gang Li, Pengbo Li, Ling Zhang, Shujing An, and Jingkun Cao "Improved MaskFormer visual target segmentation method for noisy interference images", Proc. SPIE 13288, Fourth International Conference on Computer Graphics, Image, and Virtualization (ICCGIV 2024), 132880U (9 October 2024); https://doi.org/10.1117/12.3044886
KEYWORDS
Image segmentation, Transformers, Data modeling, Feature fusion, Semantics, Education and training, Feature extraction