Convolutional neural networks (CNNs) are a popular choice for medical image segmentation. However, they can struggle with the large inter-subject variation in organ shapes and sizes, because CNNs typically employ convolutions with fixed-size local receptive fields. To address this limitation, we propose multi-scale aggregated residual convolution (MARC) and iterative multi-scale aggregated residual convolution (iMARC) to capture finer and richer features at various scales. Our goal is to improve the representation capability of a single convolution. This is achieved by employing convolutions with receptive fields of varying sizes, combining multiple convolutions into a deeper one, and dividing a single convolution into a set of channel-independent sub-convolutions. These design choices increase the width, depth, and cardinality of the convolution, respectively. MARC and iMARC can be easily integrated into general CNN architectures and trained end-to-end. To evaluate how MARC and iMARC affect the segmentation capabilities of CNNs, we integrated them into a standard 2D U-Net architecture for pancreas segmentation on abdominal computed tomography (CT) images. The results show that MARC and iMARC enhance the representation capabilities of single convolutions, yielding improved segmentation performance with lower computational complexity.
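
As a rough illustration of two of the ideas above, the following sketch counts the weights of a 2D convolution with and without channel groups, and computes the receptive field of convolutions applied in sequence. It is not the MARC or iMARC implementation. The helper names `conv2d_params` and `stacked_receptive_field`, the channel count of 64, the 3x3 kernel, and the group count of 4 are all assumptions chosen only for illustration.

```python
def conv2d_params(c_in, c_out, k, groups=1):
    """Number of weights (no bias) in a 2D convolution with a k x k kernel.

    With `groups` > 1 the convolution is split into channel-independent
    sub-convolutions, each seeing only c_in / groups input channels.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out * k * k


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1, undilated convolutions applied in sequence."""
    return 1 + sum(k - 1 for k in kernel_sizes)


# Hypothetical sizes, not taken from the paper.
channels = 64
plain = conv2d_params(channels, channels, k=3)              # one ordinary 3x3 convolution
grouped = conv2d_params(channels, channels, k=3, groups=4)  # 4 sub-convolutions

print(plain, grouped)                      # 36864 9216
print(stacked_receptive_field([3]))        # 3
print(stacked_receptive_field([3, 3, 3]))  # 7
```

Under these assumed sizes, grouping cuts the weight count by the number of groups, and stacking small kernels enlarges the receptive field without a larger kernel. This is consistent with how higher cardinality and depth can coexist with lower computational cost.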