Frame-level speech enhancement based on Wasserstein GAN
31 December 2019
Proceedings Volume 11384, Eleventh International Conference on Signal Processing Systems; 113840G (2019) https://doi.org/10.1117/12.2559619
Event: Eleventh International Conference on Signal Processing Systems, 2019, Chengdu, China
Abstract
Speech enhancement is a challenging and critical task in speech processing research. In this paper, we propose a novel speech enhancement model based on Wasserstein generative adversarial networks, called WSEM. The proposed model operates on frame-level speech segments and uses an adjacent-frame extension mechanism to enforce the mapping from noisy speech to the clean target, which distinguishes it from other related GAN-based models. We compare the performance of WSEM with related work on benchmark datasets under different signal-to-noise ratio (SNR) conditions. Experimental results show that WSEM performs comparably to state-of-the-art approaches in all tests and performs especially well in low-SNR environments.
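The abstract does not spell out how the adjacent frames extension works, so the following is only a minimal sketch of one common reading: each frame's feature vector is concatenated with its k neighbours on either side before being passed to the model. The function name `extend_with_context`, the context width `k`, and the choice to pad the edges by repeating the first and last frame are all assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def extend_with_context(frames: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Concatenate every frame with its k left and k right neighbours.

    Hypothetical helper: the paper's exact mechanism is not described in the
    abstract. Edge frames are padded by repeating the first or last frame,
    which is an assumption of this sketch.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    n = len(frames)
    extended: List[List[float]] = []
    for t in range(n):
        window: List[float] = []
        for offset in range(-k, k + 1):
            # Clamp the neighbour index so edges reuse the boundary frame.
            idx = min(max(t + offset, 0), n - 1)
            window.extend(frames[idx])
        extended.append(window)
    return extended


if __name__ == "__main__":
    # Three toy frames with two features each (made-up values).
    toy = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    for row in extend_with_context(toy, k=1):
        print(row)
```

With k = 1, each output row holds three frames' worth of features, so a frame-level model sees a short temporal context while still predicting one clean frame per noisy input frame.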
© (2019) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Peng Chuan, Tian Lan, Meng Li, Sen Li, and Qiao Liu "Frame-level speech enhancement based on Wasserstein GAN", Proc. SPIE 11384, Eleventh International Conference on Signal Processing Systems, 113840G (31 December 2019); https://doi.org/10.1117/12.2559619
KEYWORDS
Signal to noise ratio
Gallium nitride
Performance modeling
Data modeling
Neural networks
Image filtering
Signal processing