Image inpainting aims to fill missing regions of an image with plausible content that is visually coherent with the surrounding context. Semantic image inpainting remains a challenging task even with the emergence of deep learning-based approaches. We propose a deep semantic inpainting model built upon a generative adversarial network and a dense U-Net. This design enables feature reuse while avoiding feature explosion along the upsampling path of the U-Net. The model also employs a composite loss function for the generator network to enforce a joint global and local content-consistency constraint. More specifically, the new loss function combines a global reconstruction loss, which characterizes the semantic similarity between the missing and known image regions, with a local total variation loss, which encourages natural transitions among adjacent regions. Experimental results on the CelebA-HQ and Paris StreetView datasets demonstrate encouraging performance compared with other state-of-the-art methods in terms of both quantitative and qualitative metrics. For the CelebA-HQ dataset, the proposed method infers the semantics of human faces more faithfully; for the Paris StreetView dataset, it achieves improved inpainting results with more natural texture transitions, better structural consistency, and richer textural details.
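As a rough illustration of the kind of composite generator objective described above (not the authors' exact formulation; the L1 reconstruction term, the weight `lambda_tv`, and the masking convention are assumptions), a global reconstruction loss combined with a local total variation penalty could be sketched in PyTorch as follows:

```python
import torch
import torch.nn.functional as F

def total_variation_loss(img: torch.Tensor) -> torch.Tensor:
    """Anisotropic total variation: penalizes abrupt intensity changes
    between horizontally and vertically adjacent pixels (N, C, H, W)."""
    dh = (img[:, :, 1:, :] - img[:, :, :-1, :]).abs().mean()
    dw = (img[:, :, :, 1:] - img[:, :, :, :-1]).abs().mean()
    return dh + dw

def composite_generator_loss(output: torch.Tensor,
                             target: torch.Tensor,
                             mask: torch.Tensor,
                             lambda_tv: float = 0.1) -> torch.Tensor:
    """Global reconstruction term over the full image plus a local TV term
    restricted to the filled (masked) region; lambda_tv is an assumed weight."""
    recon = F.l1_loss(output, target)            # global content consistency
    tv = total_variation_loss(output * mask)     # smooth transitions in the hole region
    return recon + lambda_tv * tv
```

In such a setup the reconstruction term ties the generated content to the known context of the whole image, while the TV term acts locally on the filled region to suppress artificial seams; the adversarial loss of the GAN discriminator would be added on top of this composite term.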