National University of Kaohsiung Library and Information Center

Concordia University (Canada).

New time-frequency domain pitch estimation methods for speech signals under low levels of SNR.

Record type:	Bibliographic - Electronic resource : Monograph/item
Title/Author:	New time-frequency domain pitch estimation methods for speech signals under low levels of SNR.
Author:	Shahnaz, Celia.
Extent:	197 p.
Notes:	Source: Dissertation Abstracts International, Volume: 71-07, Section: B, page: .
Contained By:	Dissertation Abstracts International 71-07B.
Subject:	Engineering, Computer.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR63361
ISBN:	9780494633618

Thesis (Ph.D.)--Concordia University (Canada), 2009.

The major objective of this research is to develop novel pitch estimation methods capable of handling speech signals in practical situations where only noise-corrupted speech observations are available. With this objective in mind, the estimation task is carried out in two different approaches. In the first approach, the noisy speech observations are directly employed to develop two new time-frequency domain pitch estimation methods. These methods are based on extracting a pitch-harmonic and finding the corresponding harmonic number required for pitch estimation. Considering that voiced speech is the output of a vocal tract system driven by a sequence of pulses separated by the pitch period, in the second approach, instead of using the noisy speech directly for pitch estimation, an excitation-like signal (ELS) is first generated from the noisy speech or its noise-reduced version.

ISBN: 9780494633618

Subjects--Topical Terms: Engineering, Computer.

New time-frequency domain pitch estimation methods for speech signals under low levels of SNR.
LDR:05396nmm 2200289 4500
001 280830
005 20110119095004.5
008 110301s2009 ||||||||||||||||| ||eng d
020 $a 9780494633618
035 $a (UMI)AAINR63361
035 $a AAINR63361
040 $a UMI $c UMI
100 1 $a Shahnaz, Celia. $3 492964
245 1 0 $a New time-frequency domain pitch estimation methods for speech signals under low levels of SNR.
300 $a 197 p.
500 $a Source: Dissertation Abstracts International, Volume: 71-07, Section: B, page: .
502 $a Thesis (Ph.D.)--Concordia University (Canada), 2009.
520 $a The major objective of this research is to develop novel pitch estimation methods capable of handling speech signals in practical situations where only noise-corrupted speech observations are available. With this objective in mind, the estimation task is carried out in two different approaches. In the first approach, the noisy speech observations are directly employed to develop two new time-frequency domain pitch estimation methods. These methods are based on extracting a pitch-harmonic and finding the corresponding harmonic number required for pitch estimation. Considering that voiced speech is the output of a vocal tract system driven by a sequence of pulses separated by the pitch period, in the second approach, instead of using the noisy speech directly for pitch estimation, an excitation-like signal (ELS) is first generated from the noisy speech or its noise-reduced version.
520 $a In the first approach, a harmonic cosine autocorrelation (HCAC) model of clean speech in terms of its pitch-harmonics is first introduced. In order to extract a pitch-harmonic, we propose an optimization technique based on least-squares fitting of the autocorrelation function (ACF) of the noisy speech to the HCAC model. By exploiting the extracted pitch-harmonic along with the fast Fourier transform (FFT) based power spectrum of noisy speech, we then deduce a harmonic measure and a harmonic-to-noise-power ratio (HNPR) to determine the desired harmonic number of the extracted pitch-harmonic. In the proposed optimization, an initial estimate of the pitch-harmonic is obtained from the maximum peak of the smoothed FFT power spectrum.
520 $a In addition to the HCAC model, where the cross-product terms of different harmonics are neglected, we derive a compact yet accurate harmonic sinusoidal autocorrelation (HSAC) model for the clean speech signal. The new HSAC model is then used in the least-squares model-fitting optimization technique to extract a pitch-harmonic.
520 $a In the second approach, we first develop a pitch estimation method by using an excitation-like signal (ELS) generated from the noisy speech. To this end, a technique based on the principle of homomorphic deconvolution is proposed for extracting the vocal-tract system (VTS) parameters from the noisy speech, which are utilized to perform an inverse-filtering of the noisy speech to produce a residual signal (RS). In order to reduce the effect of noise on the RS, a noise-compensation scheme is introduced in the autocorrelation domain. The noise-compensated ACF of the RS is then employed to generate a squared Hilbert envelope (SHE) as the ELS of the voiced speech. With a view to further overcoming the adverse effect of noise on the ELS, a new symmetric normalized magnitude difference function of the ELS is proposed for eventual pitch estimation.
520 $a The cepstrum has been widely used in speech signal processing but has limited capability for handling noise. One potential solution could be the introduction of a noise reduction block prior to pitch estimation based on the conventional cepstrum, a framework already available in many practical applications, such as mobile communication and hearing aids. Motivated by the advantages of the existing framework and considering the superiority of our ELS to the speech itself in providing clues for pitch information, we develop a cepstrum-based pitch estimation method by using the ELS obtained from the noise-reduced speech. For this purpose, we propose a noise subtraction scheme in the frequency domain, which takes into account the possible cross-correlation between speech and noise and has the advantages of the noise being updated with time and adjusted at each frame. The enhanced speech thus obtained is utilized to extract the vocal-tract system (VTS) parameters via the homomorphic deconvolution technique. A residual signal (RS) is then produced by inverse-filtering the enhanced speech with the extracted VTS parameters. It is found that, unlike in the previous ELS-based method, the squared Hilbert envelope (SHE) computed from the RS of the enhanced speech, without noise compensation, is sufficient to represent an ELS. Finally, in order to tackle the undesirable effect of noise on the ELS at a very low SNR and overcome the limitation of the conventional cepstrum in handling different types of noise, a time-frequency domain pseudo cepstrum of the ELS of the enhanced speech, incorporating information of both magnitude and phase spectra of the ELS, is proposed for pitch estimation. (Abstract shortened by UMI.)
590 $a School code: 0228.
650 4 $a Engineering, Computer. $3 384375
690 $a 0464
710 2 $a Concordia University (Canada). $3 492965
773 0 $t Dissertation Abstracts International $g 71-07B.
790 $a 0228
791 $a Ph.D.
792 $a 2009
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR63361
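
For readers unfamiliar with the "conventional cepstrum" that the abstract takes as its baseline, the following is a minimal sketch of textbook real-cepstrum pitch estimation for a single voiced frame. It is not the thesis's pseudo cepstrum or its ELS-based method, and it includes no noise reduction. The function name `cepstrum_pitch`, the search range `fmin`/`fmax`, the Hamming window, and the zero-padding length are all assumptions chosen for illustration.

```python
import numpy as np


def cepstrum_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch (Hz) of one voiced frame from its real cepstrum.

    Illustrative baseline only; parameter defaults are assumptions.
    """
    frame = np.asarray(frame, dtype=float)
    # Taper the frame to limit spectral leakage between harmonics.
    windowed = frame * np.hamming(len(frame))
    # Zero-pad to a power of two at least twice the frame length.
    nfft = 1 << (2 * len(frame) - 1).bit_length()
    # Log-magnitude spectrum; the small constant avoids log(0).
    log_mag = np.log(np.abs(np.fft.rfft(windowed, nfft)) + 1e-12)
    # Real cepstrum: inverse transform of the log-magnitude spectrum.
    cep = np.fft.irfft(log_mag, nfft)
    # Search quefrencies (in samples) that map to the allowed pitch range.
    qmin = int(fs / fmax)
    qmax = int(fs / fmin)
    peak = qmin + int(np.argmax(cep[qmin:qmax + 1]))
    # A cepstral peak at quefrency `peak` samples means a period of peak / fs.
    return fs / peak
```

The harmonics of voiced speech appear as a ripple in the log spectrum whose spacing equals the pitch, so the cepstrum shows a peak at the pitch period. Additive noise fills the valleys between harmonics and flattens that ripple, which is the limitation the abstract sets out to address.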