Semantic segmentation plays a crucial role in practical applications such as autonomous driving and robot navigation. However, prevalent semantic segmentation networks suffer from one of two problems: oversized networks carry redundant parameters that slow inference, while excessively lightweight networks sacrifice segmentation accuracy. It is therefore essential to design a semantic segmentation network that balances accuracy and inference speed. We propose the asymmetric residual bottleneck module, which incorporates dilated convolution, depth-wise separable asymmetric convolution, a channel attention mechanism, and a channel shuffle unit. Together, these components reduce the number of model parameters and accelerate inference. Furthermore, a feature aggregation module is designed to integrate features from feature maps of various resolutions, thereby enhancing segmentation accuracy. Based on these components, an efficient and lightweight real-time semantic segmentation network called efficiently lightweight asymmetrical network (ELANet) is proposed. Experimental results on the Cityscapes and CamVid datasets demonstrate that ELANet strikes a favorable balance between speed and accuracy. Notably, without any pretrained model or postprocessing scheme, ELANet achieves a mean intersection over union of 72.5% on the Cityscapes test dataset with only 0.82 million parameters, operating at an inference speed of 173.5 frames per second on a single NVIDIA GTX 3090 GPU, with a
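To make the parameter-saving argument concrete, the following is a minimal sketch, not ELANet's actual layer layout. It assumes the depth-wise separable asymmetric convolution factorizes a k x k convolution into a depth-wise k x 1 convolution, a depth-wise 1 x k convolution, and a 1 x 1 point-wise convolution, and that the channel shuffle unit follows the usual reshape-transpose-flatten scheme. The function names and the 64-channel example are hypothetical. Dilation is not modeled because it enlarges the receptive field without adding weights.

```python
def standard_conv_params(c_in, c_out, k=3):
    """Weight count of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def dw_asymmetric_conv_params(c_in, c_out, k=3):
    """Weight count of the assumed factorization (bias omitted):
    depth-wise k x 1, depth-wise 1 x k, then 1 x 1 point-wise."""
    depthwise = c_in * k + c_in * k  # one k-tap filter per channel, twice
    pointwise = c_in * c_out         # 1 x 1 convolution mixes channels
    return depthwise + pointwise


def channel_shuffle(channels, groups):
    """Interleave channels across groups: view the channel list as
    (groups, per_group), transpose, and flatten."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Hypothetical 64 -> 64 channel layer with a 3 x 3 kernel.
    print(standard_conv_params(64, 64))       # 36864
    print(dw_asymmetric_conv_params(64, 64))  # 4480
    print(channel_shuffle(list(range(6)), groups=2))  # [0, 3, 1, 4, 2, 5]
```

Under these assumptions the factorized layer needs roughly one eighth of the weights of the standard convolution, and the shuffle lets information cross group boundaries at no parameter cost.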
Convolution
Image segmentation
Semantics
Data modeling
Performance modeling
Ablation
Feature fusion