National University of Kaohsiung Library


Distributed Partitioning and Processing of Large Spatial Datasets.

Record type:	Bibliographic - Electronic resource : Monograph/item
Title/Author:	Distributed Partitioning and Processing of Large Spatial Datasets.
Author:	Zeidan, Ayman.
Publisher:	Ann Arbor : ProQuest Dissertations & Theses, 2022
Extent:	201 p.
Note:	Source: Dissertations Abstracts International, Volume: 83-07, Section: B.
Note:	Advisor: Vo, Huy T.
Contained By:	Dissertations Abstracts International 83-07B.
Subject:	Computer science.
Electronic resource:	http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28870119
ISBN:	9798762110730


Thesis (Ph.D.)--City University of New York, 2022.

This item must not be sold to any third party vendors.

Data collection is one of the most common practices in today’s world. The data collection rate has increased rapidly over the past decade and shows no sign of decline. Data sources are many: Internet of Things devices, mobile gadgets, social media posts, connected cars, and web servers constantly report on their users’ interactions and habits. Much of the collected data is spatial data, which contains attributes that denote the physical origin of the data. As a result of this tremendous growth in data collection, demand has emerged for new techniques that efficiently process the data and extract valuable insights within an acceptable time frame. The current standard approach to large-scale data analysis uses distributed parallel processing systems like Apache Hadoop and Apache Spark. However, these systems are designed for general-purpose parallel processing and require an additional layer to recognize and efficiently process spatial datasets. Motivated by the many applications of spatial data, we examine several challenges facing spatial data partitioning and processing and propose solutions customized for each task. We detail our techniques for building spatial partitioners over large datasets for use with spatial queries like map-matching and kNN spatial join. Additionally, we present an accuracy benchmarking framework for comparing and classifying the results of two input files based on specific criteria. Our proposed work targets batch processing of large spatial datasets, including structured, unstructured, and semi-structured datasets.

ISBN: 9798762110730

Subjects--Topical Terms:

199325
Computer science.
Subjects--Index Terms:

Apache Spark

LDR:02826nmm a2200397 4500
001 636070
005 20230501063851.5
006 m o d
007 cr#unu||||||||
008 230724s2022 ||||||||||||||||| ||eng d
020 $a 9798762110730
035 $a (MiAaPQ)AAI28870119
035 $a AAI28870119
040 $a MiAaPQ $c MiAaPQ
100 1 $a Zeidan, Ayman. $0 (orcid)0000-0002-2881-5047 $3 942348
245 1 0 $a Distributed Partitioning and Processing of Large Spatial Datasets.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2022
300 $a 201 p.
500 $a Source: Dissertations Abstracts International, Volume: 83-07, Section: B.
500 $a Advisor: Vo, Huy T.
502 $a Thesis (Ph.D.)--City University of New York, 2022.
506 $a This item must not be sold to any third party vendors.
520 $a Data collection is one of the most common practices in today’s world. The data collection rate has increased rapidly over the past decade and shows no sign of decline. Data sources are many: Internet of Things devices, mobile gadgets, social media posts, connected cars, and web servers constantly report on their users’ interactions and habits. Much of the collected data is spatial data, which contains attributes that denote the physical origin of the data. As a result of this tremendous growth in data collection, demand has emerged for new techniques that efficiently process the data and extract valuable insights within an acceptable time frame. The current standard approach to large-scale data analysis uses distributed parallel processing systems like Apache Hadoop and Apache Spark. However, these systems are designed for general-purpose parallel processing and require an additional layer to recognize and efficiently process spatial datasets. Motivated by the many applications of spatial data, we examine several challenges facing spatial data partitioning and processing and propose solutions customized for each task. We detail our techniques for building spatial partitioners over large datasets for use with spatial queries like map-matching and kNN spatial join. Additionally, we present an accuracy benchmarking framework for comparing and classifying the results of two input files based on specific criteria. Our proposed work targets batch processing of large spatial datasets, including structured, unstructured, and semi-structured datasets.
590 $a School code: 0046.
650 4 $a Computer science. $3 199325
650 4 $a Artificial intelligence. $3 194058
653 $a Apache Spark
653 $a Big spatial data
653 $a Distributed computing
653 $a kNN distributed spatial query
653 $a Parallel processing and Load balancing
653 $a Spatial query
690 $a 0984
690 $a 0800
710 2 $a City University of New York. $b Computer Science. $3 492891
773 0 $t Dissertations Abstracts International $g 83-07B.
790 $a 0046
791 $a Ph.D.
792 $a 2022
793 $a English
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=28870119