Improvement in monaural speech separation using sparse non-negative tucker decomposition

Authors
  • Varshney, Yash Vardhan1
  • Upadhyaya, Prashant1
  • Abbasi, Zia Ahmad1
  • Abidi, Musiur Raza1
  • Farooq, Omar1
  • 1 Aligarh Muslim University, Department of Electronics Engineering, Aligarh, India
Type
Published Article
Journal
International Journal of Speech Technology
Publisher
Springer US
Publication Date
Sep 05, 2018
Volume
21
Issue
4
Pages
837–849
Identifiers
DOI: 10.1007/s10772-018-9550-5
Source
Springer Nature
License
Yellow

Abstract

This paper introduces a monaural speech separation/enhancement technique based on non-negative Tucker decomposition (NTD). The proposed method adds a sparsity regularization factor to the generalized NTD cost function and examines its effect on the separation of mixed signals. With the proposed algorithm, the vector components of both the target and the mixed signal can be exploited to separate any monaural mixture. Experiments were conducted on monaural data generated by mixing speech signals from two speakers, and by mixing noise with speech signals, using the TIMIT and NOISEX-92 datasets. The separation results are compared with those of existing algorithms in terms of the correlation between the separated and original signals, signal-to-distortion ratio, perceptual evaluation of speech quality, and short-time objective intelligibility. To obtain more conclusive evidence of separation ability, speech recognition was also performed with the Kaldi toolkit, and the recognition results are compared in terms of word error rate (WER) using MFCC-based features. The results show that the proposed algorithm improves the average WER over the nearest-performing algorithm by up to 2.7% for mixed speech from two speakers and by 1.52% for noisy speech input.
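The idea of adding a sparsity term to an NTD cost function can be illustrated with a small sketch. The abstract does not give the paper's cost function or update rules, so everything below is an assumption made for illustration: a Euclidean reconstruction cost, an L1 penalty of weight `lam` on the core tensor only, standard multiplicative updates, and the hypothetical names `reconstruct`, `objective`, and `sparse_ntd`. It should not be read as the authors' algorithm.

```python
import numpy as np

def reconstruct(G, A, B, C):
    # Tucker model: X_hat[i,j,k] = sum over p,q,r of G[p,q,r] * A[i,p] * B[j,q] * C[k,r]
    return np.einsum('pqr,ip,jq,kr->ijk', G, A, B, C)

def objective(X, G, A, B, C, lam):
    # Assumed cost: squared Euclidean error plus an L1 penalty on the core.
    # (G is non-negative, so its L1 norm is simply its sum.)
    return 0.5 * np.sum((X - reconstruct(G, A, B, C)) ** 2) + lam * G.sum()

def sparse_ntd(X, ranks, lam=0.1, n_iter=200, eps=1e-12, seed=0):
    """Illustrative sparse NTD of a non-negative 3-way tensor X."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    P, Q, R = ranks
    # Strictly positive initialisation keeps multiplicative updates well defined.
    A = rng.random((I, P)) + 0.1
    B = rng.random((J, Q)) + 0.1
    C = rng.random((K, R)) + 0.1
    G = rng.random((P, Q, R)) + 0.1
    history = [objective(X, G, A, B, C, lam)]
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive gradient part),
        # which preserves non-negativity.
        Xh = reconstruct(G, A, B, C)
        A *= np.einsum('ijk,pqr,jq,kr->ip', X, G, B, C) / (
            np.einsum('ijk,pqr,jq,kr->ip', Xh, G, B, C) + eps)
        Xh = reconstruct(G, A, B, C)
        B *= np.einsum('ijk,pqr,ip,kr->jq', X, G, A, C) / (
            np.einsum('ijk,pqr,ip,kr->jq', Xh, G, A, C) + eps)
        Xh = reconstruct(G, A, B, C)
        C *= np.einsum('ijk,pqr,ip,jq->kr', X, G, A, B) / (
            np.einsum('ijk,pqr,ip,jq->kr', Xh, G, A, B) + eps)
        Xh = reconstruct(G, A, B, C)
        # The sparsity weight lam enters the denominator of the core update,
        # shrinking core entries that contribute little to the reconstruction.
        G *= np.einsum('ijk,ip,jq,kr->pqr', X, A, B, C) / (
            np.einsum('ijk,ip,jq,kr->pqr', Xh, A, B, C) + lam + eps)
        history.append(objective(X, G, A, B, C, lam))
    return G, A, B, C, history

if __name__ == "__main__":
    # Toy non-negative tensor standing in for, e.g., stacked spectrogram data.
    X = np.random.default_rng(1).random((6, 5, 4))
    G, A, B, C, history = sparse_ntd(X, ranks=(2, 2, 2), lam=0.1)
    print("initial cost:", history[0], "final cost:", history[-1])
```

Because the penalty acts only on the core and the factor matrices are not normalised, this sketch can lower the penalty by rescaling factors against the core. A practical implementation would normally constrain or normalise the factor columns.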
