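Master's Program, Department of Computer Science and Information Engineering, National University of Kaohsiung (國立高雄大學資訊工程學系碩士班)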

多重巨量資料處理平台之整合與最佳化技術 = Integration and Optimization Technologies for Multiple Big Data Processing Platforms

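Record type:	Bibliographic - language material, printed : monograph
Parallel title:	Integration and Optimization Technologies for Multiple Big Data Processing Platforms
Author:	蔡允哲
Other corporate author:	國立高雄大學 (National University of Kaohsiung)
Place of publication:	[Kaohsiung]
Publisher:	The author
Year of publication:	2014 [ROC year 103]
Physical description:	63 leaves : some color illustrations, tables ; 30 cm
Subject:	分散式記憶體儲存
Subject:	distributed memory storage
Electronic resource:	https://hdl.handle.net/11296/5ydc5c
Note:	Made public on 31 October 2019 (ROC year 108)
Note:	Bibliography: leaves 53-54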
Abstract:	The objective of this study is to build, on a cloud computing architecture, a multiple big data processing platform with high performance, high availability, and high scalability. Integrating Apache Hive, Cloudera Impala, and BDAS Shark lets the platform support fast SQL queries in a big data environment. First, the optimizer designed in this study lets users work through a single access interface while the program automatically selects the best-performing big data warehouse platform to run each query. Second, the Memcached distributed memory storage system and the HDFS distributed file system in Apache Hadoop are used to cache the results of completed queries. If the same SQL query is submitted again, the result is retrieved directly from this high-performance cache, which avoids the long retrieval time caused by re-executing the same command on the big data warehouse platform. Together, these two mechanisms improve overall performance significantly; in multi-user environments that run highly repetitive SQL commands, they greatly shorten the time needed for retrieval.
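The two mechanisms named in the abstract (automatic platform selection and query-result caching) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the `QueryRouter` class, the `estimate` cost function, the lookup order of the two cache tiers, and the SHA-256 cache key are all assumptions made for illustration. Plain dictionaries stand in for Memcached and for result files on HDFS, and the engines are ordinary callables rather than real Hive, Impala, or Shark clients.

```python
import hashlib
from typing import Callable, Dict, Tuple

# Hypothetical engine type: takes SQL text, returns a result (a string here).
Engine = Callable[[str], str]


def cache_key(sql: str) -> str:
    """Derive a cache key from the SQL text (assumption: exact-text match)."""
    return hashlib.sha256(sql.strip().encode("utf-8")).hexdigest()


class QueryRouter:
    """Illustrative single access point in front of several SQL engines."""

    def __init__(self, engines: Dict[str, Engine],
                 estimate: Callable[[str, str], float]) -> None:
        self.engines = engines
        # Hypothetical cost model: (engine name, sql) -> estimated seconds.
        self.estimate = estimate
        self.memory_tier: Dict[str, str] = {}  # stands in for Memcached
        self.file_tier: Dict[str, str] = {}    # stands in for files on HDFS

    def run(self, sql: str) -> Tuple[str, str]:
        """Return (result, source), where source is a tier or engine name."""
        key = cache_key(sql)
        if key in self.memory_tier:
            return self.memory_tier[key], "memory"
        if key in self.file_tier:
            result = self.file_tier[key]
            self.memory_tier[key] = result  # promote to the faster tier
            return result, "file"
        # Cache miss: pick the engine with the lowest estimated run time.
        best = min(self.engines, key=lambda name: self.estimate(name, sql))
        result = self.engines[best](sql)
        self.memory_tier[key] = result
        self.file_tier[key] = result
        return result, best
```

In this sketch a repeated query never reaches an engine again: it is answered from the in-memory tier, or from the file tier if the in-memory entry has been lost. How the thesis actually divides work between Memcached and HDFS is not stated in this record.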

多重巨量資料處理平台之整合與最佳化技術 = Integration and Optimization Technologies for Multiple Big Data Processing Platforms
蔡, 允哲

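多重巨量資料處理平台之整合與最佳化技術 = Integration and Optimization Technologies for Multiple Big Data Processing Platforms / by 蔡允哲. - [Kaohsiung] : The author, 2014 [ROC year 103]. - 63 leaves : some color illustrations, tables ; 30 cm.
Made public on 31 October 2019 (ROC year 108). - Bibliography: leaves 53-54.
分散式記憶體儲存 ; distributed memory storage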

多重巨量資料處理平台之整合與最佳化技術 = Integration and Optimization Technologies for Multiple Big Data Processing Platforms
LDR 03271nam0a2200277 450
001 430145
005 20191119095134.0
010 0 $b hardcover
010 0 $b paperback
100 $a 20141027y2014 k y0chiy50 e
101 0 $a eng $d chi $d eng
102 $a tw
105 $a ak am 000yy
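200 1 $a 多重巨量資料處理平台之整合與最佳化技術 $d Integration and Optimization Technologies for Multiple Big Data Processing Platforms $z eng $f by 蔡允哲
210 $a [Kaohsiung] $c The author $d 2014 [ROC year 103]
215 0 $a 63 leaves $c some color illustrations, tables $d 30 cm
300 $a Made public on 31 October 2019 (ROC year 108)
300 $a Bibliography: leaves 53-54
314 $a Advisor: Professor 張保榮
328 $a Master's thesis--National University of Kaohsiung, Master's Program, Department of Computer Science and Information Engineering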
330 $a The objective of this study is to build, on a cloud computing architecture, a multiple big data processing platform with high performance, high availability, and high scalability. Integrating Apache Hive, Cloudera Impala, and BDAS Shark lets the platform support fast SQL queries in a big data environment. First, the optimizer designed in this study lets users work through a single access interface while the program automatically selects the best-performing big data warehouse platform to run each query. Second, the Memcached distributed memory storage system and the HDFS distributed file system in Apache Hadoop are used to cache the results of completed queries. If the same SQL query is submitted again, the result is retrieved directly from this high-performance cache, which avoids the long retrieval time caused by re-executing the same command on the big data warehouse platform. Together, these two mechanisms improve overall performance significantly; in multi-user environments that run highly repetitive SQL commands, they greatly shorten the time needed for retrieval.
510 1 $a Integration and Optimization Technologies for Multiple Big Data Processing Platforms $z eng
610 # 0 $a 分散式記憶體儲存 $a 雲端計算 $a 多重巨量資料處理平台 $a 分散式檔案系統 $a 資料倉儲
610 # 1 $a distributed memory storage $a multiple big data processing platform $a data warehouse $a cloud computing $a distributed file system
681 $a 008M/0019 $b 464103 4425 $v 2007 edition
700 1 $a 蔡 $b 允哲 $4 author $3 673368
712 0 2 $a 國立高雄大學 $b 資訊工程學系碩士班 $3 353878
801 0 $a tw $b 國立高雄大學 $c 20141021 $g CCR
856 7 # $u https://hdl.handle.net/11296/5ydc5c $2 http $z Electronic resource