隱藏敏感項目集及減少其副作用之研究 = A Study on Hidin...
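Master's Program, Department of Computer Science and Information Engineering, National University of Kaohsiung (國立高雄大學資訊工程學系碩士班)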

 

  • 隱藏敏感項目集及減少其副作用之研究 = A Study on Hiding Sensitive Itemsets and Reducing Side Effects
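  • Record type: Bibliographic - language material, printed : monograph
    Parallel title: A Study on Hiding Sensitive Itemsets and Reducing Side Effects
    Author: 許宏全
    Other corporate author: National University of Kaohsiung (國立高雄大學)
    Place of publication: [Kaohsiung City]
    Publisher: the author
    Year of publication: 2013 [ROC year 102]
    Extent: 108 pages : illustrations, tables ; 30 cm
    Subject: privacy-preserving data mining (隱私防護之資料探勘)
    Subject: privacy-preserving data mining
    Electronic resource: http://handle.ncl.edu.tw/11296/ndltd/92167419173185386934
    Note: Bibliography: pages 91-95
    Note: Made public on 31 October 2013 (ROC year 102)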
    Abstract: Data mining is traditionally used to extract and analyze useful knowledge from large amounts of data. Private or confidential data, however, must be sanitized or suppressed before it is shared or published, so privacy-preserving data mining (PPDM) has become an important research topic in recent years. The most common PPDM approach is to sanitize the database so that sensitive information cannot be discovered by mining. This thesis proposes two new algorithms that hide sensitive itemsets through transaction deletion and item deletion, respectively. The transaction-deletion algorithm selects the transaction with the maximal ratio of sensitive to non-sensitive itemsets and deletes it entirely. The item-deletion algorithm selects a transaction in the same way, then deletes from it the item that occurs most frequently among the sensitive itemsets. Three side effects (hiding failures, missing itemsets, and artificial itemsets) are taken into account when deciding which transactions or items to delete, and the weight attached to each side effect can be adjusted to the user's preference. A risky bound is also introduced to speed up the deletion process: only the frequent, sensitive, and non-frequent (small) itemsets that fall within the bound are evaluated. Experiments evaluate the proposed algorithms in terms of execution time, number of deleted transactions, and number of side effects.
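    The transaction-deletion idea in the abstract can be illustrated with a small Python sketch. This is not the thesis's algorithm: the scoring rule (sensitive count divided by non-sensitive count plus one), the brute-force itemset enumeration, the toy database, and every function name below are assumptions made for illustration, and the user-adjustable weights and the risky bound are not modeled.

```python
from itertools import combinations

def support_count(db, itemset):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)

def frequent_itemsets(db, min_ratio, max_len=3):
    """Brute-force enumeration of frequent itemsets; toy data only."""
    items = sorted(set().union(*db))
    found = set()
    for k in range(1, max_len + 1):
        for combo in combinations(items, k):
            s = frozenset(combo)
            if support_count(db, s) >= min_ratio * len(db):
                found.add(s)
    return found

def hide_by_transaction_deletion(db, sensitive, min_ratio, max_len=3):
    """Greedily delete whole transactions until no sensitive itemset is frequent."""
    db = list(db)
    nonsensitive = frequent_itemsets(db, min_ratio, max_len) - set(sensitive)
    while db:
        visible = [s for s in sensitive
                   if support_count(db, s) >= min_ratio * len(db)]
        if not visible:
            break

        def score(t):
            # Assumed scoring rule: sensitive itemsets covered by t, divided by
            # (non-sensitive frequent itemsets covered by t) + 1.
            n_sens = sum(1 for s in visible if s <= t)
            n_non = sum(1 for f in nonsensitive if f <= t)
            return n_sens / (n_non + 1)

        candidates = [t for t in db if any(s <= t for s in visible)]
        db.remove(max(candidates, key=score))
    return db

def side_effects(original, sanitized, sensitive, min_ratio, max_len=3):
    """The three side effects named in the abstract, as sets of itemsets."""
    before = frequent_itemsets(original, min_ratio, max_len)
    after = frequent_itemsets(sanitized, min_ratio, max_len)
    sens = set(sensitive)
    return {
        "hiding_failures": after & sens,         # sensitive, still frequent
        "missing": (before - sens) - after,      # non-sensitive, lost
        "artificial": after - before,            # newly frequent
    }

# Hypothetical toy database: six transactions over items a-d.
db = [frozenset(t) for t in ("abc", "ab", "abd", "cd", "ac", "bd")]
sensitive = [frozenset("ab")]
sanitized = hide_by_transaction_deletion(db, sensitive, min_ratio=0.5)
effects = side_effects(db, sanitized, sensitive, min_ratio=0.5)
print(len(db) - len(sanitized), "transaction(s) deleted;", effects)
```

    Because the minimum support is expressed as a ratio, deleting a transaction also lowers the absolute support threshold, which is how artificial itemsets can arise under transaction deletion.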
Holdings
310002393976 Theses and Dissertations Area (2F) Not for loan Thesis TH 008M/0019 464103 0838 2013 Normal use On shelf 0
310002393984 Theses and Dissertations Area (2F) Not for loan Thesis TH 008M/0019 464103 0838 2013 c.2 Normal use On shelf 0