國立高雄大學圖資館 |

Language: English

Back

A soft computing based approach for ...

Ullah, Sameeh.

A soft computing based approach for multi-accent classification in IVR systems.

Record Type:	Electronic resources : Monograph/item
Title/Author:	A soft computing based approach for multi-accent classification in IVR systems.
Author:	Ullah, Sameeh.
Description:	178 p.
Notes:	Source: Dissertation Abstracts International, Volume: 71-01, Section: B, page: 0521.
Contained By:	Dissertation Abstracts International71-01B.
Subject:	Engineering, Computer.
Online resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR55577
ISBN:	9780494555774

A soft computing based approach for multi-accent classification in IVR systems.
Ullah, Sameeh.

A soft computing based approach for multi-accent classification in IVR systems. - 178 p.

Source: Dissertation Abstracts International, Volume: 71-01, Section: B, page: 0521.

Thesis (Ph.D.)--University of Waterloo (Canada), 2009.

A speaker's accent is the most important factor affecting the performance of Natural Language Call Routing (NLCR) systems because accents vary widely, even within the same country or community. This variation also occurs when nonnative speakers start to learn a second language, the substitution of native language phonology being a common process. Such substitution leads to fuzziness between the phoneme boundaries and phoneme classes, which reduces out-of-class variations, and increases the similarities between the different sets of phonemes. Thus, this fuzziness is the main cause of reduced NLCR system performance. The main requirement for commercial enterprises using an NLCR system is to have a robust NLCR system that provides call understanding and routing to appropriate destinations. The chief motivation for this present work is to develop an NLCR system that eliminates multilayered menus and employs a sophisticated speaker accent-based automated voice response system around the clock. Currently, NLCRs are not fully equipped with accent classification capability. Our main objective is to develop both speaker-independent and speaker-dependent accent classification systems that understand a caller's query, classify the caller's accent, and route the call to the acoustic model that has been thoroughly trained on a database of speech utterances recorded by such speakers. In the field of accent classification, the dominant approaches are the Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM). Of the two, GMM is the most widely implemented for accent classification. However, GMM performance depends on the initial partitions and number of Gaussian mixtures, both of which can reduce performance if poorly chosen. To overcome these shortcomings, we propose a speaker-independent accent classification system based on a distance metric learning approach and evolution strategy. This approach depends on side information from dissimilar pairs of accent groups to transfer data points to a new feature space where the Euclidean distances between similar and dissimilar points are at their minimum and maximum, respectively. Finally, a Non-dominated Sorting Evolution Strategy (NSES)-based k-means clustering algorithm is employed on the training data set processed by the distance metric learning approach. The main objectives of the NSES-based k-means approach are to find the cluster centroids as well as the optimal number of clusters for a GMM classifier. In the case of a speaker-dependent application, a new method is proposed based on the fuzzy canonical correlation analysis to find appropriate Gaussian mixtures for a GMM-based accent classification system. In our proposed method, we implement a fuzzy clustering approach to minimize the within-group sum-of-square-error and canonical correlation analysis to maximize the correlation between the speech feature vectors and cluster centroids. We conducted a number of experiments using the TIMIT database, the speech accent archive, and the foreign accent English databases for evaluating the performance of speaker-independent and speaker-dependent applications. Assessment of the applications and analysis shows that our proposed methodologies outperform the HMM, GMM, vector quantization GMM, and radial basis neural networks.

ISBN: 9780494555774Subjects--Topical Terms:

384375
Engineering, Computer.

A soft computing based approach for multi-accent classification in IVR systems.
LDR:04110nmm 2200241 4500 001 280757
005 20110119094946.5
008 110301s2009 ||||||||||||||||| ||eng d
020 $a 9780494555774
035 $a (UMI)AAINR55577
035 $a AAINR55577
040 $a UMI $c UMI
100 1 $a Ullah, Sameeh. $3 492850
245 1 2 $a A soft computing based approach for multi-accent classification in IVR systems.
300 $a 178 p.
500 $a Source: Dissertation Abstracts International, Volume: 71-01, Section: B, page: 0521.
502 $a Thesis (Ph.D.)--University of Waterloo (Canada), 2009.
520 $a A speaker's accent is the most important factor affecting the performance of Natural Language Call Routing (NLCR) systems because accents vary widely, even within the same country or community. This variation also occurs when nonnative speakers start to learn a second language, the substitution of native language phonology being a common process. Such substitution leads to fuzziness between the phoneme boundaries and phoneme classes, which reduces out-of-class variations, and increases the similarities between the different sets of phonemes. Thus, this fuzziness is the main cause of reduced NLCR system performance. The main requirement for commercial enterprises using an NLCR system is to have a robust NLCR system that provides call understanding and routing to appropriate destinations. The chief motivation for this present work is to develop an NLCR system that eliminates multilayered menus and employs a sophisticated speaker accent-based automated voice response system around the clock. Currently, NLCRs are not fully equipped with accent classification capability. Our main objective is to develop both speaker-independent and speaker-dependent accent classification systems that understand a caller's query, classify the caller's accent, and route the call to the acoustic model that has been thoroughly trained on a database of speech utterances recorded by such speakers. In the field of accent classification, the dominant approaches are the Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM). Of the two, GMM is the most widely implemented for accent classification. However, GMM performance depends on the initial partitions and number of Gaussian mixtures, both of which can reduce performance if poorly chosen. To overcome these shortcomings, we propose a speaker-independent accent classification system based on a distance metric learning approach and evolution strategy. This approach depends on side information from dissimilar pairs of accent groups to transfer data points to a new feature space where the Euclidean distances between similar and dissimilar points are at their minimum and maximum, respectively. Finally, a Non-dominated Sorting Evolution Strategy (NSES)-based k-means clustering algorithm is employed on the training data set processed by the distance metric learning approach. The main objectives of the NSES-based k-means approach are to find the cluster centroids as well as the optimal number of clusters for a GMM classifier. In the case of a speaker-dependent application, a new method is proposed based on the fuzzy canonical correlation analysis to find appropriate Gaussian mixtures for a GMM-based accent classification system. In our proposed method, we implement a fuzzy clustering approach to minimize the within-group sum-of-square-error and canonical correlation analysis to maximize the correlation between the speech feature vectors and cluster centroids. We conducted a number of experiments using the TIMIT database, the speech accent archive, and the foreign accent English databases for evaluating the performance of speaker-independent and speaker-dependent applications. Assessment of the applications and analysis shows that our proposed methodologies outperform the HMM, GMM, vector quantization GMM, and radial basis neural networks.
590 $a School code: 1141.
650 4 $a Engineering, Computer. $3 384375
690 $a 0464
710 2 $a University of Waterloo (Canada). $3 212557
773 0 $t Dissertation Abstracts International $g 71-01B.
790 $a 1141
791 $a Ph.D.
792 $a 2009
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=NR55577