Master's Program, Department of Computer Science and Information Engineering, National University of Kaohsiung

 

  • 數個應用於隱私保護資料探勘之啟發性方法 = Several Heuristic Approaches to Privacy-Preserving Data Mining
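  • Record type: Bibliographic record, language material, printed : monograph
    Parallel title: Several Heuristic Approaches to Privacy-Preserving Data Mining
    Author: 楊國棟
    Other corporate author: National University of Kaohsiung
    Place of publication: [Kaohsiung]
    Publisher: The author
    Year of publication: ROC 99 [2010]
    Physical description: 80 pages : illustrations, tables ; 30 cm
    Subject: Pre-large itemsets (準大項目集)
    Electronic resource: http://handle.ncl.edu.tw/11296/ndltd/68298347718388735104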
    Abstract: Data mining can help extract useful knowledge from large data sets. The process of data collection and dissemination, however, carries an inherent risk of privacy threats. Sensitive or private information about individuals, businesses, and organizations needs to be suppressed before it is shared or published. Privacy-preserving data mining (PPDM) has thus become an important research issue in recent years. In this thesis, we propose three approaches that modify the original database in order to hide sensitive itemsets. The first, called SIF-IDF, is a greedy approach based on a concept borrowed from term frequency and inverse document frequency (TF-IDF) in text mining. It uses this concept to evaluate the degree of similarity between the items in transactions and the sensitive itemsets, and then selects appropriate items in some transactions to hide. The second is a lattice-based approach, in which a lattice is built from the relations among the sensitive itemsets. A bottom-up deletion strategy is also used to gradually reduce the frequency of the sensitive itemsets during the hiding process. The third is an evolutionary privacy-preserving data mining method that finds appropriate transactions to hide from a database. It uses a flexible evaluation function with three factors, whose weights may be assigned according to the user's preference. In addition, the concept of pre-large itemsets is used to reduce the cost of rescanning the database, thus speeding up the evaluation of chromosomes. The three proposed approaches achieve a good trade-off between privacy preservation and execution time, and experimental results show that they perform well.
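    The general idea in the abstract, hiding a sensitive itemset by deleting items from transactions until its frequency drops below a threshold, can be illustrated with a minimal Python sketch. This is not the thesis's SIF-IDF, lattice-based, or evolutionary method: the function names, the `min_count` threshold, and both selection heuristics (shortest supporting transaction, item shared by the most sensitive itemsets) are assumptions made only for illustration.

```python
from typing import FrozenSet, List, Set


def support_count(db: List[Set[str]], itemset: FrozenSet[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in db if itemset <= t)


def hide_sensitive_itemsets(
    db: List[Set[str]],
    sensitive: List[FrozenSet[str]],
    min_count: int,
) -> List[Set[str]]:
    """Return a sanitized copy of `db` in which every sensitive itemset
    occurs in fewer than `min_count` transactions.

    Hypothetical greedy heuristics (assumptions, not from the thesis):
      * victim transaction: the shortest one still supporting the itemset,
        on the guess that it disturbs fewer other itemsets;
      * victim item: the item of the itemset that appears in the most
        sensitive itemsets, so one deletion may help hide several.
    """
    sanitized = [set(t) for t in db]  # work on a copy, keep the input intact
    for s in sensitive:
        while support_count(sanitized, s) >= min_count:
            victim = min((t for t in sanitized if s <= t), key=len)
            item = max(sorted(s), key=lambda i: sum(i in x for x in sensitive))
            victim.remove(item)  # the transaction no longer supports s
    return sanitized


# Hypothetical toy database: four transactions over items a-d.
demo_db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"b", "c"}]
demo_sensitive = [frozenset({"a", "b"})]
demo_result = hide_sensitive_itemsets(demo_db, demo_sensitive, min_count=2)
```

    Each deletion removes one supporting transaction for the current itemset, so the loop always terminates. The sketch ignores side effects on non-sensitive itemsets, which is the trade-off the three approaches in the abstract are designed to manage.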
Holdings
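310002031972 | Theses and Dissertations Area (2F) | Not for loan | Thesis | TH 008M/0019 464103 4664 2010 | Normal use | On shelf | 0
310002031980 | Theses and Dissertations Area (2F) | Not for loan | Thesis | TH 008M/0019 464103 4664 2010 c.2 | Normal use | On shelf | 0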