Statistical models for the analysis of heterogeneous biological data sets.
Record type:
Bibliographic record - electronic resource : Monograph/item
Title/Author:
Statistical models for the analysis of heterogeneous biological data sets.
Author:
Buehler, Eugen Christian.
Extent:
69 p.
Note:
Source: Dissertation Abstracts International, Volume: 64-10, Section: B, page: 5030.
Note:
Supervisor: Lyle Ungar.
Contained By:
Dissertation Abstracts International, 64-10B.
Subject:
Biology, Biostatistics.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3109158
ISBN:
0496567012
Buehler, Eugen Christian.
Statistical models for the analysis of heterogeneous biological data sets [electronic resource]. - 69 p.
Source: Dissertation Abstracts International, Volume: 64-10, Section: B, page: 5030.
Thesis (Ph.D.)--University of Pennsylvania, 2003.
The focus of this thesis is on developing methods of integrating heterogeneous biological feature sets into structured statistical models, so as to improve model predictions and further understanding of the complex systems that they emulate. Combining data from different sources is an important task in genomics because of the increasing variety of large-scale data being generated, all of which reflect different components of the same complicated network of biological interactions that make up an organism. We contend that traditional machine learning techniques are too general to accurately model heterogeneous biological data and provide insufficient feedback to researchers concerning the systems being modeled. In contrast, we will show that interpretable statistical models specifically designed for and inspired by the underlying structure of biological problems yield more accurate predictions and provide valuable insight into biological systems.
ISBN: 0496567012
Subjects--Topical Terms: Biology, Biostatistics.
LDR 03275nmm _2200265 _450
001 162268
005 20051017073428.5
008 230606s2003 eng d
020 $a 0496567012
035 $a 00148769
035 $a 162268
040 $a UnM $c UnM
100 0 $a Buehler, Eugen Christian. $3 227394
245 1 0 $a Statistical models for the analysis of heterogeneous biological data sets. $h [electronic resource]
300 $a 69 p.
500 $a Source: Dissertation Abstracts International, Volume: 64-10, Section: B, page: 5030.
500 $a Supervisor: Lyle Ungar.
502 $a Thesis (Ph.D.)--University of Pennsylvania, 2003.
520 # $a The focus of this thesis is on developing methods of integrating heterogeneous biological feature sets into structured statistical models, so as to improve model predictions and further understanding of the complex systems that they emulate. Combining data from different sources is an important task in genomics because of the increasing variety of large-scale data being generated, all of which reflect different components of the same complicated network of biological interactions that make up an organism. We contend that traditional machine learning techniques are too general to accurately model heterogeneous biological data and provide insufficient feedback to researchers concerning the systems being modeled. In contrast, we will show that interpretable statistical models specifically designed for and inspired by the underlying structure of biological problems yield more accurate predictions and provide valuable insight into biological systems.
520 # $a Toward proving this thesis, we introduce maximum entropy biological sequence models. Maximum entropy sequence models have been used previously to integrate arbitrary features in other (non-biological) domains, such as natural language modeling. Here, we apply the same model structure to amino acid and nucleotide sequences. We first propose a broad variety of biologically inspired features, define them mathematically, and test their ability to improve models of amino acid sequences. Of these features, particular attention is paid to long distance features such as triggers, which incorporate information unavailable to more conventional Markovian models and reflect the non-local nature of protein sequence constraints. The ability of these features to improve gene-finding models is demonstrated. We next extend maximum entropy models to nucleotide coding sequences and apply them to the detection of lateral gene transfer. This allows us to evaluate a diverse set of features in a statistically rigorous manner, improving understanding of the problem and eliminating the tendency to inaccurately label short genes. We also develop methods for integrating positional and gene expression data with our maximum entropy sequence model, allowing more accurate predictions of lateral gene transfer and resulting in significant biological insight.
590 $a School code: 0175.
650 # 0 $a Biology, Biostatistics. $3 227395
650 # 0 $a Computer Science. $3 212513
650 # 0 $a Biology, Molecular. $3 226919
710 0 # $a University of Pennsylvania. $3 212781
773 0 # $g 64-10B. $t Dissertation Abstracts International
790 $a 0175
790 1 0 $a Ungar, Lyle, $e advisor
791 $a Ph.D.
792 $a 2003
856 4 0 $u http://libsw.nuk.edu.tw/login?url=http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3109158 $z http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3109158
Holdings (1 item):
Barcode:
000000000761
Location:
Electronic collection
Circulation category:
1 Books
Material type:
Dissertation
Use type:
Normal
Loan status:
On shelf
Hold status:
0