National University of Kaohsiung Library and Information Center (國立高雄大學圖資館)

Master, Aaron Steven.

Stereo music source separation via Bayesian modeling.

Record Type:	Electronic resources : Monograph/item
Title/Author:	Stereo music source separation via Bayesian modeling.
Author:	Master, Aaron Steven.
Description:	184 p.
Notes:	Adviser: Julius O. Smith, III.
Notes:	Source: Dissertation Abstracts International, Volume: 67-05, Section: B, page: 2751.
Contained By:	Dissertation Abstracts International 67-05B.
Subject:	Music.
Online resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3219333
ISBN:	9780542707681

Thesis (Ph.D.)--Stanford University, 2006.

It is often useful to separate the individual musical sources in a stereo recording. Separation lets the end user remix and transcribe sources and perform karaoke. It also allows single-source techniques to be applied to previously mixed music: speech or voice recognition, higher-accuracy pitch detection, and source-specific efficient audio coding. We consider separating sources mixed in the stereo (two-channel) format common in commercial recordings. This may be viewed as a special case of blind source separation in which the mixtures are generally underdetermined (because there are in general more than two sources), yet information about the sources and their mixing is available. Specifically, we often know that sources (voices and instruments) are amplitude panned between the left and right channels, that they are active only at certain points in time, and that they have certain general loudness and spectral characteristics.

To address such cases, we propose a short-time Fourier transform (STFT) domain Bayesian system that considers, at each input point, the frequency-dependent observed panning, the phase offset between channels, and the combined loudness of the input. Based on these observations and training data, it computes "expected median value" estimates of the sources. The source combination modeling is significant because it considers frequency and loudness information, and because it allows the separation system to choose any number of active sources for a given set of input parameters, rather than just two.

We position this system in a newly proposed framework that describes existing and proposed demixing as possibly nonlinear beamforming. This unifying framework is helpful because it allows us to visualize and understand how various stereo source separation systems relate to each other. It also allows us to break the separation system into components that attenuate magnitude, control panning, and select phase. We use this decomposition to build a karaoke system that preserves stereo imaging. We also demonstrate demixing superior to that of other systems on synthetic examples, using newly proposed psychoacoustic metrics.
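
The three per-bin observations named in the abstract (panning, inter-channel phase offset, and combined loudness) can be illustrated with a small sketch. This is not the dissertation's implementation: the function names `hann_dft` and `bin_features`, the pan scale from 0 (hard left) to 1 (hard right), and the decibel loudness measure are all assumptions chosen for illustration, and the Bayesian estimation stage is not shown.

```python
import cmath
import math


def hann_dft(frame):
    """Hann-window one frame and return its non-negative-frequency DFT bins.

    A naive O(N^2) DFT stands in for one column of an STFT (illustrative only).
    """
    n = len(frame)
    windowed = [
        x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
        for i, x in enumerate(frame)
    ]
    return [
        sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        for k in range(n // 2 + 1)
    ]


def bin_features(left_bin, right_bin, eps=1e-12):
    """Return (pan, phase_offset, loudness_db) for one time-frequency bin.

    pan: assumed scale, 0 = hard left, 0.5 = center, 1 = hard right.
    phase_offset: phase of the right channel relative to the left, in radians.
    loudness_db: combined energy of both channels in dB (assumed measure).
    """
    mag_l, mag_r = abs(left_bin), abs(right_bin)
    pan = (2.0 / math.pi) * math.atan2(mag_r, mag_l)
    phase_offset = cmath.phase(right_bin * left_bin.conjugate())
    loudness_db = 10.0 * math.log10(mag_l ** 2 + mag_r ** 2 + eps)
    return pan, phase_offset, loudness_db


# Synthetic check: one tone centered on bin k, amplitude panned at angle theta.
n, k, theta = 64, 8, math.pi / 8
tone = [math.cos(2.0 * math.pi * k * i / n) for i in range(n)]
left = [math.cos(theta) * s for s in tone]   # left gain = cos(theta)
right = [math.sin(theta) * s for s in tone]  # right gain = sin(theta)

left_bins, right_bins = hann_dft(left), hann_dft(right)
pan, phase_offset, loudness_db = bin_features(left_bins[k], right_bins[k])
print(pan, phase_offset, loudness_db)
```

For a single amplitude-panned source, the observed pan recovers the mixing angle and the phase offset is near zero. When several sources overlap in one bin, these observations deviate from any single source's values, which is the kind of ambiguity a probabilistic model has to resolve.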

ISBN: 9780542707681

Subjects--Topical Terms:

227185
Music.

Stereo music source separation via Bayesian modeling.
LDR:03129nmm _2200265 _450
001 180576
005 20080111103743.5
008 090528s2006 eng d
020 $a 9780542707681
035 $a 00311601
040 $a UMI $c UMI
100 0 $a Master, Aaron Steven. $3 264152
245 1 0 $a Stereo music source separation via Bayesian modeling.
300 $a 184 p.
500 $a Adviser: Julius O. Smith, III.
500 $a Source: Dissertation Abstracts International, Volume: 67-05, Section: B, page: 2751.
502 $a Thesis (Ph.D.)--Stanford University, 2006.
520 # $a It is often useful to be able to separate out the musical sources on a stereo recording. It allows the end user to easily remix and transcribe sources and to perform karaoke. It also allows single-source technical solutions to be applied to previously mixed music: speech or voice recognition, higher-accuracy pitch detection, and source-specific efficient audio coding. We presently consider separating sources mixed in the stereo (two channel) format, common in commercial recordings. This may be viewed as a special case of blind source separation where the mixtures are generally underdetermined (because there are in general more than two sources), yet information about the sources and their mixing is available. Specifically, we often know that sources---voices and instruments---are amplitude panned between the left and right channels, that they are only active at certain points in time, and that they have certain general loudness and spectral characteristics. In addressing such cases, we propose a short-time Fourier transform (STFT) domain Bayesian system that considers at each input point the frequency-dependent observed panning, phase offset between channels, and combined loudness of input. Based on these observations and training data, it computes "expected median value" estimates of the sources. The source combination modeling is significant because it considers frequency and loudness information, and because it allows the separation system to choose any number of active sources for a given set of input parameters, rather than just two. We position this system in a newly proposed framework that describes existing and proposed demixing as possibly nonlinear beamforming. This unifying framework is helpful because it allows us to visualize and understand how various stereo source separation systems relate to each other. It also allows us to break apart the separation system into components that attenuate magnitude, control panning, and select phase. We use this fact to build a karaoke system that preserves stereo imaging. We also demonstrate demixing superior to that of other systems on synthetic examples, using newly proposed psychoacoustic metrics.
590 $a School code: 0212.
650 # 0 $a Music. $3 227185
650 # 0 $a Engineering, Electronics and Electrical. $3 226981
690 $a 0413
690 $a 0544
710 0 # $a Stanford University. $3 212607
773 0 # $g 67-05B. $t Dissertation Abstracts International
790 $a 0212
790 1 0 $a Smith, Julius O., III, $e advisor
791 $a Ph.D.
792 $a 2006
856 4 0 $u http://libsw.nuk.edu.tw:81/login?url=http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3219333 $z http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3219333