He, Hua.
Architecture, Models, and Algorithms for Textual Similarity.
Record type:
Bibliographic record - Electronic resource : Monograph/item
Title/Author:
Architecture, Models, and Algorithms for Textual Similarity.
Author:
He, Hua.
Publisher:
Ann Arbor : ProQuest Dissertations & Theses, 2018
Pagination:
212 p.
Note:
Source: Dissertation Abstracts International, Volume: 79-11(E), Section: B.
Note:
Adviser: Jimmy Lin.
Contained By:
Dissertation Abstracts International, 79-11B(E).
Subject:
Computer science.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10751591
ISBN:
9780438149212
LDR  05952nmm a2200373 4500
001  547551
005  20190513114556.5
008  190715s2018 ||||||||||||||||| ||eng d
020  $a 9780438149212
035  $a (MiAaPQ)AAI10751591
035  $a (MiAaPQ)umd:18819
035  $a AAI10751591
040  $a MiAaPQ $c MiAaPQ
100  1  $a He, Hua. $3 761588
245  1 0  $a Architecture, Models, and Algorithms for Textual Similarity.
260  1  $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2018
300  $a 212 p.
500  $a Source: Dissertation Abstracts International, Volume: 79-11(E), Section: B.
500  $a Adviser: Jimmy Lin.
502  $a Thesis (Ph.D.)--University of Maryland, College Park, 2018.
520  $a Identifying similar pieces of text remains one of the fundamental problems in computational linguistics. This dissertation focuses on the textual similarity measurement and identification problem by studying a variety of major tasks that share common properties, and presents our efforts to address seven closely related similarity tasks across more than twenty public benchmarks, including paraphrase identification, answer selection for question answering, pairwise learning to rank, monolingual/cross-lingual semantic textual similarity measurement, insight extraction from biomedical literature, and high-performance cross-lingual pattern matching for machine translation on GPUs.
520  $a We investigate how to make textual similarity measurement more accurate with deep neural networks. Traditional approaches are either based on feature engineering, which leads to disconnected solutions, or on the Siamese architecture, which treats inputs independently, uses a single representation view, and performs only a straightforward similarity comparison. In contrast, we focus on modeling stronger interactions between inputs and develop interaction-based neural modeling that explicitly encodes alignments of input words or aggregated sentence representations into our models. As a result, our deep neural networks show highly competitive performance on the many public textual similarity benchmarks we evaluated.
520  $a Our multi-perspective convolutional neural network (MPCNN) uses a multiplicity of perspectives to process input sentences with multiple parallel convolutional neural networks, and automatically extracts salient sentence-level features at multiple granularities with different types of pooling. Our novel structured similarity layer encourages stronger input interactions by comparing local regions of the two sentence representations. This model is the first example of our interaction-based neural modeling.
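The note above describes the MPCNN only at a high level. The following is a minimal PyTorch sketch of that idea, not the dissertation's actual model: the filter widths, number of filters, pooling choices, layer sizes, and the names (MPCNNSketch, widths, num_filters) are illustrative assumptions.

```python
# Minimal sketch of a multi-perspective CNN for sentence-pair similarity,
# assuming parallel convolutions of several widths, max/min pooling, and a
# simple "structured similarity" comparison of corresponding pooled groups.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MPCNNSketch(nn.Module):
    def __init__(self, emb_dim=50, num_filters=32, widths=(1, 2, 3)):
        super().__init__()
        # One parallel convolution per filter width ("perspective").
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, num_filters, w, padding=w - 1) for w in widths]
        )
        # Each (width, pooling) group contributes one cosine score and one
        # L2 distance to the structured similarity features.
        num_groups = len(widths) * 2  # max- and min-pooling
        self.classifier = nn.Sequential(
            nn.Linear(num_groups * 2, 64), nn.Tanh(), nn.Linear(64, 1)
        )

    def encode(self, x):
        # x: (batch, seq_len, emb_dim) -> list of pooled vectors, one per group
        x = x.transpose(1, 2)  # (batch, emb_dim, seq_len) for Conv1d
        groups = []
        for conv in self.convs:
            h = torch.tanh(conv(x))             # (batch, num_filters, L)
            groups.append(h.max(dim=2).values)  # max-pooling over positions
            groups.append(h.min(dim=2).values)  # min-pooling over positions
        return groups

    def forward(self, sent_a, sent_b):
        feats = []
        for ga, gb in zip(self.encode(sent_a), self.encode(sent_b)):
            # Structured similarity: compare corresponding pooled groups
            # of the two sentence representations.
            feats.append(F.cosine_similarity(ga, gb, dim=1, eps=1e-8))
            feats.append(torch.norm(ga - gb, p=2, dim=1))
        return self.classifier(torch.stack(feats, dim=1))  # similarity score

# Example: a batch of 2 sentence pairs, 7 tokens each, 50-dim embeddings.
model = MPCNNSketch()
score = model(torch.randn(2, 7, 50), torch.randn(2, 7, 50))
```

The comparison features here are just per-group cosine and L2 scores; the dissertation's structured similarity layer is richer, so this is a sketch of the shape of the idea rather than a reimplementation.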
520  $a We also provide an attention-based input interaction layer on top of the MPCNN model. The input interaction layer models a closer relationship between input words by converting two separate sentences into an inter-related sentence pair. This layer uses the attention mechanism in a straightforward way and is another example of our interaction-based neural modeling.
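As a rough illustration of such an input interaction layer, the sketch below re-weights each sentence's word embeddings by their soft alignment to the other sentence before any convolutional encoder sees them. The cosine-based alignment scores and the concatenation scheme are assumptions, not the dissertation's exact formulation.

```python
# Minimal sketch of an attention-based input interaction layer that turns two
# independent sentences into an inter-related sentence pair.
import torch
import torch.nn.functional as F

def input_interaction(sent_a, sent_b):
    """sent_a: (batch, len_a, dim), sent_b: (batch, len_b, dim)."""
    # Soft alignment scores between every word pair across the two sentences.
    a_norm = F.normalize(sent_a, dim=2)
    b_norm = F.normalize(sent_b, dim=2)
    scores = torch.bmm(a_norm, b_norm.transpose(1, 2))  # (batch, len_a, len_b)

    # Attention-weighted view of each sentence from the other's perspective.
    attended_a = torch.bmm(F.softmax(scores, dim=2), sent_b)                  # (batch, len_a, dim)
    attended_b = torch.bmm(F.softmax(scores, dim=1).transpose(1, 2), sent_a)  # (batch, len_b, dim)

    # Concatenate original and attended embeddings so downstream convolutions
    # operate on an inter-related pair rather than two isolated inputs.
    return (torch.cat([sent_a, attended_a], dim=2),
            torch.cat([sent_b, attended_b], dim=2))

a_pair, b_pair = input_interaction(torch.randn(2, 7, 50), torch.randn(2, 9, 50))
```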
520  $a We then present our pairwise word interaction model with very deep neural networks (PWI). This model directly encodes input word interactions with novel pairwise word interaction modeling and a novel similarity focus layer. The use of a very deep architecture in this model is the first such example in the NLP domain for better textual similarity modeling. Our PWI model outperforms the Siamese architecture and feature-engineering approaches on multiple tasks, and is another example of our interaction-based neural modeling.
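A hedged sketch of the pairwise word interaction idea follows: it builds a similarity "cube" over all word pairs and applies a simple focus mask that up-weights each word's strongest alignment. The very deep CNN that consumes the cube is omitted, and the particular similarity measures and weights are assumptions.

```python
# Minimal sketch of pairwise word interaction modeling plus a similarity
# focus layer over contextualized word states of two sentences.
import torch
import torch.nn.functional as F

def interaction_cube(h_a, h_b):
    """h_a: (len_a, dim), h_b: (len_b, dim) contextualized word states."""
    cos = F.normalize(h_a, dim=1) @ F.normalize(h_b, dim=1).t()      # (len_a, len_b)
    l2 = torch.cdist(h_a.unsqueeze(0), h_b.unsqueeze(0)).squeeze(0)  # (len_a, len_b)
    return torch.stack([cos, -l2], dim=0)                            # (2, len_a, len_b)

def similarity_focus(cube, strong=1.0, weak=0.1):
    # Up-weight each word's single strongest interaction; down-weight the rest.
    cos = cube[0]
    mask = torch.full_like(cos, weak)
    mask[torch.arange(cos.size(0)), cos.argmax(dim=1)] = strong  # best match per row
    mask[cos.argmax(dim=0), torch.arange(cos.size(1))] = strong  # best match per column
    return cube * mask  # focused cube, fed to a deep CNN in the full model

cube = interaction_cube(torch.randn(7, 100), torch.randn(9, 100))
focused = similarity_focus(cube)
```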
520  $a We also address the question answering task with a pairwise ranking approach. Unlike the traditional pointwise approach to the task, our pairwise ranking approach with negative sampling models interactions between two question-answer pairs, then learns a relative order of the pairs to predict which answer is more relevant to the question. We demonstrate its high effectiveness against competitive previous pointwise baselines.
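The pairwise ranking setup can be illustrated with a margin ranking objective over a relevant answer and a sampled negative, as sketched below. The toy bag-of-embeddings scorer, margin value, and uniform negative sampling are illustrative assumptions rather than the dissertation's configuration.

```python
# Minimal sketch of pairwise ranking with negative sampling for answer
# selection: train the model so a relevant answer outscores a sampled
# irrelevant one by a margin.
import random
import torch
import torch.nn as nn

class BagScorer(nn.Module):
    # Toy scorer over pre-embedded token sequences (batch, seq_len, dim).
    def __init__(self, dim=50):
        super().__init__()
        self.out = nn.Linear(2 * dim, 1)

    def forward(self, q, a):
        return self.out(torch.cat([q.mean(dim=1), a.mean(dim=1)], dim=1))

def pairwise_ranking_step(scorer, question, pos_answer, answer_pool, margin=1.0):
    neg_answer = random.choice(answer_pool)  # negative sampling
    pos_score = scorer(question, pos_answer)
    neg_score = scorer(question, neg_answer)
    # Hinge on the relative order of the two question-answer pairs.
    return nn.functional.relu(margin - (pos_score - neg_score)).mean()

scorer = BagScorer()
pool = [torch.randn(2, 6, 50) for _ in range(5)]
loss = pairwise_ranking_step(scorer, torch.randn(2, 8, 50), torch.randn(2, 6, 50), pool)
loss.backward()
```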
520  $a For the insight extraction task on biomedical literature, we develop neural networks with similarity modeling for better causality/correlation relation extraction, converting the extraction task into a similarity measurement task. Our approach innovates in that it explicitly models the interactions among the trio of named entities, entity relations, and contexts, measures both relational and contextual similarity among them, and finally integrates both similarity evaluations into the decisions for insight extraction. We also build an end-to-end system to extract insights; with human evaluations we show that our system is able to extract insights with high human acceptance accuracy.
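To make the similarity-based formulation concrete, the sketch below scores a candidate relation by combining relational and contextual cosine similarity against a set of seed insights. The weighting, threshold, and embedding inputs are purely illustrative assumptions; the dissertation's system learns these components with neural networks.

```python
# Minimal sketch of casting relation extraction as similarity measurement:
# accept a candidate (entities, relation phrase, context) if it is close
# enough, both relationally and contextually, to known seed insights.
import torch
import torch.nn.functional as F

def insight_score(cand_rel, cand_ctx, seed_rels, seed_ctxs, w_rel=0.6, w_ctx=0.4):
    """cand_* are (dim,) embeddings; seed_* are (n_seeds, dim) matrices."""
    rel_sim = F.cosine_similarity(cand_rel.unsqueeze(0), seed_rels).max()  # relational similarity
    ctx_sim = F.cosine_similarity(cand_ctx.unsqueeze(0), seed_ctxs).max()  # contextual similarity
    return w_rel * rel_sim + w_ctx * ctx_sim  # combined evidence for the insight

score = insight_score(torch.randn(100), torch.randn(100),
                      torch.randn(8, 100), torch.randn(8, 100))
accept = score.item() > 0.5  # threshold chosen for illustration only
```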
520  $a Lastly, we explore how to exploit the massive parallelism offered by modern GPUs for high-efficiency pattern matching. We take advantage of GPU hardware advances and develop a massively parallel approach. We first work on phrase-based SMT, where we enable phrase lookup and extraction on suffix arrays to be massively parallelized so that vastly many queries can be carried out in parallel. We then work on the computationally expensive hierarchical SMT model, which requires matching grammar patterns that contain "gaps". To obtain high efficiency for the similarity identification task on GPUs, we show that developing massively parallel algorithms is the most important way to fully utilize the GPU's raw processing power, and that developing compact data structures on the GPU helps lower its memory latency. Compared to a highly optimized, state-of-the-art multi-threaded CPU implementation, our techniques achieve orders-of-magnitude improvements in throughput.
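The core operation being parallelized, phrase lookup on a suffix array, can be shown with a small CPU-side Python sketch: each query phrase is an independent pair of binary searches over the array, which is why vastly many queries can be issued concurrently, one per GPU thread. The toy corpus and helper names below are ours, and the real implementation works on compact GPU data structures rather than Python lists.

```python
# Minimal sketch of suffix-array phrase lookup; every call is independent,
# so queries parallelize trivially across threads.
import bisect

def build_suffix_array(tokens):
    # Indices of all suffixes, sorted lexicographically.
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])

def phrase_occurrences(tokens, sa, phrase):
    # Binary-search the suffix array for the range of suffixes starting with `phrase`.
    prefixes = [tokens[i:i + len(phrase)] for i in sa]  # sorted, since sa is sorted
    lo = bisect.bisect_left(prefixes, phrase)
    hi = bisect.bisect_right(prefixes, phrase)
    return [sa[k] for k in range(lo, hi)]  # corpus positions where the phrase starts

corpus = "the cat sat on the mat near the cat".split()
sa = build_suffix_array(corpus)
print(phrase_occurrences(corpus, sa, ["the", "cat"]))  # -> [7, 0]
```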
590  $a School code: 0117.
650  4  $a Computer science. $3 199325
690  $a 0984
710  2  $a University of Maryland, College Park. $b Computer Science. $3 660362
773  0  $t Dissertation Abstracts International $g 79-11B(E).
790  $a 0117
791  $a Ph.D.
792  $a 2018
793  $a English
856  4 0  $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10751591
Holdings
Barcode: 000000163730
Location: Electronic collection
Circulation category: 1 Book
Material type: Thesis/dissertation
Call number: TH 2018
Use type: Normal
Loan status: On shelf
Holds: 0
Multimedia file: http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=10751591