National University of Kaohsiung Library and Information Center (國立高雄大學圖資館)


不明確資料之資料挖掘 = Data Mining with Uncertain Data

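Record type:	Bibliographic - language material, printed : monograph
Parallel title:	Data Mining with Uncertain Data
Author:	吳志偉
Other corporate author:	國立高雄大學 (National University of Kaohsiung)
Place of publication:	[Kaohsiung]
Publisher:	The author
Year of publication:	2009 [ROC year 98]
Extent:	68 pages : figures, tables ; 30 cm
Subject:	Incomplete data (不完全資料)
Subject:	association rule
Electronic resource:	http://handle.ncl.edu.tw/11296/ndltd/90386902002686296990
Note:	Bibliography: pages [not specified]
Note:	Advisor: 洪宗貝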
Abstract:	Machine learning and data mining are two important techniques for extracting valuable information from datasets. Although current mining and learning technologies can handle large amounts of data, the rapid growth of datasets may cause some attribute values to be lost during data gathering. Such incomplete data are usually preprocessed to improve the quality of the discovered information, so recovering missing values has become an important research issue in both data mining and machine learning. This thesis proposes an iterative missing-value completion mechanism combined with two different support measures for deriving reliable association rules. The first approach derives the rules using the support values of robust association rules (RAR) and then uses those rules to infer the missing values iteratively. The second approach is a slight modification of the first that uses partial support values instead. Both approaches can assign a reasonable value to every missing attribute value, yielding a higher-quality dataset. The iterative mechanism consists of three phases. The first phase uses association rules to roughly complete some of the missing values. The second phase iteratively lowers the minimum support threshold to obtain more association rules, which are then used to complete the remaining missing values. The third phase uses association rules mined from the completed dataset to correct the values that were filled in, giving more accurate guesses. The second approach performs slightly better than the first because it uses more information (incomplete tuples) when guessing. Experimental results show that both approaches achieve good data recovery rates and recovery accuracy even when the missing-value rate is high.
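
The three-phase mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names (`mine_rules`, `fill_missing`, `complete`), the default thresholds, and the toy dataset are all hypothetical, rules are limited to a single antecedent attribute, and support is counted only over rows where both attributes are known, which is a simplified stand-in for the RAR and partial-support measures the record mentions. Unlike the method in the abstract, this sketch may leave a cell empty if no rule ever matches it.

```python
from collections import Counter

MISSING = None  # marker for an unknown attribute value


def mine_rules(rows, min_support, min_confidence):
    """Mine one-to-one rules (attribute a = x) -> (attribute b = y).

    Assumption: support and confidence are counted only over rows in which
    both attributes are known. This is a simplified stand-in, not the RAR or
    partial-support measure named in the abstract.
    """
    rules = []
    if not rows:
        return rules
    n_attrs = len(rows[0])
    for a in range(n_attrs):
        for b in range(n_attrs):
            if a == b:
                continue
            valid = [r for r in rows
                     if r[a] is not MISSING and r[b] is not MISSING]
            if not valid:
                continue
            pair_counts = Counter((r[a], r[b]) for r in valid)
            antecedent_counts = Counter(r[a] for r in valid)
            for (x, y), count in pair_counts.items():
                support = count / len(valid)
                confidence = count / antecedent_counts[x]
                if support >= min_support and confidence >= min_confidence:
                    rules.append((confidence, support, a, x, b, y))
    # Strongest rules first; ties keep discovery order (the sort is stable).
    rules.sort(key=lambda rule: (rule[0], rule[1]), reverse=True)
    return rules


def fill_missing(rows, rules):
    """Fill each missing cell with the consequent of the strongest matching rule."""
    filled = 0
    for row in rows:
        for _conf, _sup, a, x, b, y in rules:
            if row[b] is MISSING and row[a] == x:
                row[b] = y
                filled += 1
    return filled


def complete(rows, min_support=0.55, min_confidence=0.8, step=0.1):
    """Return a completed copy of rows using three phases."""
    rows = [list(r) for r in rows]  # work on a copy
    originally_missing = [(i, j) for i, r in enumerate(rows)
                          for j, v in enumerate(r) if v is MISSING]

    # Phase 1: rough completion with rules mined at the initial threshold.
    fill_missing(rows, mine_rules(rows, min_support, min_confidence))

    # Phase 2: lower the support threshold step by step to obtain more rules.
    # Rules are re-mined from the partially completed data on every pass.
    support = min_support
    while support > 0 and any(v is MISSING for r in rows for v in r):
        support = max(0.0, support - step)
        fill_missing(rows, mine_rules(rows, support, min_confidence))

    # Phase 3: re-mine rules from the completed data and revisit the guesses.
    final_rules = mine_rules(rows, support, min_confidence)
    for i, b in originally_missing:
        for _conf, _sup, a, x, rule_b, y in final_rules:
            if rule_b == b and rows[i][a] == x:
                rows[i][b] = y
                break
    return rows


if __name__ == "__main__":
    data = [
        ("sunny", "hot", "no"),
        ("sunny", "hot", "no"),
        ("sunny", "hot", None),
        ("rainy", "cool", "yes"),
        ("rainy", "cool", "yes"),
        ("rainy", None, "yes"),
    ]
    for row in complete(data):
        print(row)
```

With this toy dataset, no rule mined at the initial threshold matches either gap, so both cells are filled only after the second phase lowers the support threshold. Phase 3 here reuses the last lowered threshold, which is a design choice of the sketch; the record does not say which threshold the thesis uses in that phase.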


不明確資料之資料挖掘 = Data Mining with Uncertain Data / 吳志偉. - [Kaohsiung] : The author, 2009 [ROC year 98]. - 68 pages : figures, tables ; 30 cm.
Bibliography: pages [not specified]. Advisor: 洪宗貝.
Incomplete data (不完全資料) ; association rule

LDR:04264nam0a2200277 450
001 220270
005 20170214090534.0
009 220270
010 0 $b 平裝
010 0 $b 精裝
100 $a 20170214y2009 k y0chiy09 ea
101 1 $a chi $d chi $d eng
102 $a tw
105 $a ak am 000yy
200 1 $a 不明確資料之資料挖掘 $d Data Mining with Uncertain Data $z eng $f 吳志偉撰
210 $a [高雄市] $c 撰者 $d 2009[民98]
215 0 $a 68面 $c 圖、表 $d 30公分
300 $a 參考書目：面
300 $a 指導教授：洪宗貝
328 $a 碩士論文--國立高雄大學電機工程學系碩士班
510 1 $a Data Mining with Uncertain Data $z eng
610 0 $a 不完全資料 $a 支持度 $a 資料挖掘 $a 資料遺失 $a 關聯法則
610 1 $a association rule $a data mining $a incomplete data $a missing value $a support
681 $a 008M/0019 $b 542201 2642 $v 2007年版
700 1 $a 吳 $b 志偉 $4 撰 $3 57500
712 0 2 $a 國立高雄大學 $b 電機工程學系碩士班 $3 166118
801 0 $a tw $b 國立高雄大學 $c 20091020 $g CCR
856 7 $2 http $u http://handle.ncl.edu.tw/11296/ndltd/90386902002686296990