Volume 4, Issue 1, February 2015, Page: 1-8
Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding
Md. Ekramul Hamid, Dept. of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh; Dept. of Electric and Electronic Engineering, Shizuoka University, Hamamatsu-shi, Japan
Md. Khademul Islam Molla, Dept. of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh
Md. Iqbal Aziz Khan, Dept. of Computer Science and Engineering, University of Rajshahi, Rajshahi, Bangladesh; Dept. of Electric and Electronic Engineering, Shizuoka University, Hamamatsu-shi, Japan
Takayoshi Nakai, Dept. of Electric and Electronic Engineering, Shizuoka University, Hamamatsu-shi, Japan
Received: Apr. 11, 2015; Accepted: Apr. 18, 2015; Published: Apr. 29, 2015
DOI: 10.11648/j.cssp.20150401.12
Abstract
This paper studies a method and a system for speech enhancement that combine Hilbert spectrum and wavelet packet analysis. Independent subspace analysis (ISA) is used to separate speech from interfering signals in a single mixture, and a wavelet packet based soft-thresholding algorithm is then used to enhance the quality of the target speech. The mixed signal is projected onto time-frequency (TF) space using the empirical mode decomposition (EMD) based Hilbert spectrum (HS). A finite set of independent basis vectors is then derived from the TF space by applying principal component analysis (PCA) and independent component analysis (ICA) sequentially. The vectors are grouped by hierarchical clustering to represent the independent subspaces corresponding to the component sources in the mixture. However, the speech produced by the separation algorithm is not of sufficient quality and still contains residual noise. Therefore, in the next stage, the target speech is enhanced using wavelet packet decomposition (WPD), where speech activity is monitored by updating the statistics of the noise or unwanted signals. The mode mixing problem of traditional EMD is addressed using ensemble EMD. The proposed algorithm is also tested using the short-time Fourier transform (STFT) based spectrogram method. The simulation results show that the method performs noticeably well in audio source separation and speech enhancement.
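The soft-thresholding stage mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar filter, the universal threshold sigma * sqrt(2 ln N), the fixed `noise_sigma` argument, and all function names are assumptions chosen for illustration, and the speech-activity-driven update of the noise statistics described above is not modeled.

```python
import numpy as np


def soft_threshold(x, t):
    """Shrink coefficients toward zero by t; entries with |x| <= t become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def haar_wp_decompose(signal, levels):
    """Full Haar wavelet packet tree; returns the 2**levels leaf bands."""
    bands = [np.asarray(signal, dtype=float)]
    for _ in range(levels):
        next_bands = []
        for b in bands:
            even, odd = b[0::2], b[1::2]
            next_bands.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            next_bands.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        bands = next_bands
    return bands


def haar_wp_reconstruct(bands):
    """Invert haar_wp_decompose by merging sibling bands level by level."""
    bands = list(bands)
    while len(bands) > 1:
        merged = []
        for a, d in zip(bands[0::2], bands[1::2]):
            out = np.empty(2 * len(a))
            out[0::2] = (a + d) / np.sqrt(2.0)
            out[1::2] = (a - d) / np.sqrt(2.0)
            merged.append(out)
        bands = merged
    return bands[0]


def denoise(signal, noise_sigma, levels=3):
    """Soft-threshold every leaf band except the coarsest one (assumption)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    bands = haar_wp_decompose(signal, levels)
    # Universal threshold: an illustrative choice, not the paper's rule.
    t = noise_sigma * np.sqrt(2.0 * np.log(len(signal)))
    bands = [bands[0]] + [soft_threshold(b, t) for b in bands[1:]]
    return haar_wp_reconstruct(bands)
```

For a noisy frame `y` with an estimated noise standard deviation `s`, `denoise(y, s)` returns a frame of the same length. In a fuller system the value of `s` would be re-estimated during non-speech segments rather than passed in as a constant.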
Keywords
Speech Enhancement, Ensemble Empirical Mode Decomposition, Source Separation, Independent Subspace Analysis, Hilbert Spectrum, Wavelet Packet Decomposition
To cite this article
Md. Ekramul Hamid, Md. Khademul Islam Molla, Md. Iqbal Aziz Khan, Takayoshi Nakai, Speech Enhancement Using Hilbert Spectrum and Wavelet Packet Based Soft-Thresholding, Science Journal of Circuits, Systems and Signal Processing. Vol. 4, No. 1, 2015, pp. 1-8. doi: 10.11648/j.cssp.20150401.12