CycleGAN has been shown to outperform recent approaches to semi-supervised semantic segmentation on public segmentation benchmarks when only a small amount of labelled data is available. However, CycleGAN tends to generate identical semantic segmentation results on acoustic image datasets and cannot retain target details. To solve this problem, a spectral-normalized CycleGAN network (SNCycleGAN) is presented, which applies spectral normalization to both the generators and the discriminators to stabilize the training of GANs. The experimental results demonstrate that semi-supervised training of SNCycleGAN achieves reasonably accurate sonar target segmentation from limited labelled data without transfer learning, and surpasses supervised training in detail preservation.
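The core mechanism named in the abstract, spectral normalization, divides each layer's weight matrix by its largest singular value so the layer is 1-Lipschitz. A minimal NumPy sketch of that normalization via power iteration follows; the function name and shapes are illustrative assumptions, not the paper's own code.

```python
import numpy as np

def spectral_normalize(W, n_iters=100):
    """Normalize a weight matrix by its spectral norm (largest singular
    value), estimated by power iteration -- the operation SNCycleGAN is
    described as applying to the weights of both generators and
    discriminators to stabilize GAN training. Illustrative sketch only.
    Returns (W / sigma, sigma)."""
    u = np.random.default_rng(0).standard_normal(W.shape[0])
    u /= np.linalg.norm(u)
    v = None
    for _ in range(n_iters):
        # Alternate left/right multiplications to converge on the
        # dominant singular vector pair of W.
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = float(u @ W @ v)  # estimate of sigma_max(W)
    return W / sigma, sigma
```

In practice frameworks apply this per convolutional layer at every training step (e.g. PyTorch's `torch.nn.utils.spectral_norm` wrapper), reusing the power-iteration vector across steps so one iteration per step suffices.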
Convolutional neural networks (CNNs) achieve good performance in object classification owing to their inherent translation equivariance, but their scale equivariance is poor. A Scale-Aware Network (SA-Net) with scale equivariance is proposed, which can estimate scale, that is, the size of the image content, while classifying. In the training stage, only one scale pattern is learned. In the testing stage, the testing sample with an unseen scale is first zoomed in and zoomed out into a set of images at different scales, forming an image pyramid. The zooming-in channels are up-sampled by bilinear interpolation; the zooming-out channels are down-sampled using a combination of the dyadic discrete wavelet transform (DWT) and bilinear interpolation to avoid spectral aliasing. The image pyramid is then fed to weight-sharing Siamese CNNs for inference, yielding a two-dimensional classification score matrix. From the position of the maximum of this score matrix, classification and scale estimation are carried out simultaneously. Experiments are conducted on the MNIST Large Scale testing set. In the scale estimation experiments, the relative root mean square error (RMSE) is obtained by scaling the testing images in a geometric series with a common ratio of ⁴√2 over the range [1/2, 2]. The classification experiments show that when the scale is greater than 1.0, the classification accuracy surpasses 90%. SA-Net estimates scale while improving classification accuracy, and mis-estimated samples always lie near the ground truths (GTs), so the scale of an unseen-scale sample can always be obtained approximately.
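The test-time procedure above reduces to two steps: enumerate pyramid scales in a geometric series with common ratio ⁴√2 over [1/2, 2], then take the argmax of the (scale × class) score matrix to read off the class label and scale estimate jointly. A minimal NumPy sketch, with hypothetical function names and a stand-in score matrix in place of the Siamese CNN outputs:

```python
import numpy as np

def pyramid_scales(lo=0.5, hi=2.0, ratio=2 ** 0.25):
    """Scale factors in a geometric series with common ratio 2^(1/4)
    (the fourth root of 2) covering [lo, hi] -- the pyramid the abstract
    describes. Illustrative sketch only."""
    scales = []
    s = lo
    while s <= hi * (1 + 1e-9):  # small tolerance for float round-off
        scales.append(round(s, 6))
        s *= ratio
    return scales

def classify_and_estimate_scale(score_matrix, scales):
    """score_matrix: (n_scales, n_classes) array of classification scores,
    one row per pyramid level from the weight-sharing Siamese CNNs.
    The argmax position gives the class and the scale simultaneously."""
    i, j = np.unravel_index(np.argmax(score_matrix), score_matrix.shape)
    return j, scales[i]
```

With these parameters the pyramid has 9 levels (0.5, 0.5946, …, 2.0); the real network would score each resized level, whereas here the score matrix is assumed given.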