Cornell University.
Probabilistic hashing techniques for big data.
Record Type:
Electronic resources : Monograph/item
Title/Author:
Probabilistic hashing techniques for big data.
Author:
Shrivastava, Anshumali.
Description:
193 p.
Notes:
Source: Dissertation Abstracts International, Volume: 77-03(E), Section: B.
Notes:
Adviser: Ping Li.
Contained By:
Dissertation Abstracts International 77-03B(E).
Subject:
Computer science.
Online resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3730467
ISBN:
9781339167350
Probabilistic hashing techniques for big data.
Shrivastava, Anshumali.
Probabilistic hashing techniques for big data.
- 193 p.
Source: Dissertation Abstracts International, Volume: 77-03(E), Section: B.
Thesis (Ph.D.)--Cornell University, 2015.
We investigate probabilistic hashing techniques for addressing computational and memory challenges in large-scale machine learning and data mining systems. In this thesis, we show that the traditional idea of hashing goes far beyond near-neighbor search and that there are some striking new possibilities. We show that hashing can improve state-of-the-art large-scale learning algorithms, and that it goes beyond the conventional notions of pairwise similarities. Despite hashing being a very well-studied topic in the literature, we found several opportunities for fundamentally improving some of the well-known textbook hashing algorithms. In particular, we show that the traditional way of computing minwise hashes is unnecessarily expensive and that, without losing anything, we can achieve an order-of-magnitude speedup. We also found that for cosine similarity search there is a better scheme than SimHash.
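For readers unfamiliar with minwise hashing, the following is a minimal sketch of the traditional scheme the abstract refers to, not the faster method developed in the thesis. The function names (`hash_item`, `minhash_signature`, `estimate_jaccard`) and the use of seeded BLAKE2b as a stand-in for random permutations are assumptions made for illustration only. Note that every item is hashed once per signature position, so a signature of length k costs k hash evaluations per item, which is the expense the abstract describes.

```python
import hashlib


def hash_item(seed, item):
    """Deterministic 64-bit hash of (seed, item); stands in for a random permutation."""
    data = f"{seed}:{item}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, k):
    """Traditional minwise hashing: k independent hashes, keep the minimum of each."""
    return [min(hash_item(seed, item) for item in items) for seed in range(k)]


def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    matches = sum(1 for u, v in zip(sig_a, sig_b) if u == v)
    return matches / len(sig_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity |a & b| / |a | b| for comparison."""
    return len(a & b) / len(a | b)


a = set(range(0, 100))
b = set(range(50, 150))
sig_a = minhash_signature(a, k=256)
sig_b = minhash_signature(b, k=256)
print("exact:", exact_jaccard(a, b))
print("estimate:", estimate_jaccard(sig_a, sig_b))
```

The estimate is noisy but concentrates around the exact value as k grows; the per-item cost growing linearly in k is what motivates cheaper alternatives.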
ISBN: 9781339167350
Subjects--Topical Terms:
199325
Computer science.
Probabilistic hashing techniques for big data.
LDR
:02436nmm a2200313 4500
001
476136
005
20160418090200.5
008
160517s2015 ||||||||||||||||| ||eng d
020
$a
9781339167350
035
$a
(MiAaPQ)AAI3730467
035
$a
AAI3730467
040
$a
MiAaPQ
$c
MiAaPQ
100
1
$a
Shrivastava, Anshumali.
$3
730437
245
1 0
$a
Probabilistic hashing techniques for big data.
300
$a
193 p.
500
$a
Source: Dissertation Abstracts International, Volume: 77-03(E), Section: B.
500
$a
Adviser: Ping Li.
502
$a
Thesis (Ph.D.)--Cornell University, 2015.
520
$a
We investigate probabilistic hashing techniques for addressing computational and memory challenges in large-scale machine learning and data mining systems. In this thesis, we show that the traditional idea of hashing goes far beyond near-neighbor search and that there are some striking new possibilities. We show that hashing can improve state-of-the-art large-scale learning algorithms, and that it goes beyond the conventional notions of pairwise similarities. Despite hashing being a very well-studied topic in the literature, we found several opportunities for fundamentally improving some of the well-known textbook hashing algorithms. In particular, we show that the traditional way of computing minwise hashes is unnecessarily expensive and that, without losing anything, we can achieve an order-of-magnitude speedup. We also found that for cosine similarity search there is a better scheme than SimHash.
520
$a
In the end, we show that the existing locality sensitive hashing framework itself is very restrictive, and that it cannot yield efficient algorithms for some important measures, such as inner products, which are ubiquitous in machine learning. We propose asymmetric locality sensitive hashing (ALSH), an extended framework in which we show provably and practically efficient algorithms for Maximum Inner Product Search (MIPS). Having such efficient solutions to MIPS directly scales up many popular machine learning algorithms.
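To illustrate what "asymmetric" means here, the sketch below applies different transformations to stored items and to the query so that maximum inner product search becomes cosine similarity search. This is one simple reduction known from the MIPS literature and is an assumption for illustration; it is not claimed to be the ALSH construction proposed in the thesis. The names `transform_item` and `transform_query` are hypothetical.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def cosine(u, v):
    return dot(u, v) / (norm(u) * norm(v))


def transform_item(x, max_norm):
    """P(x): append sqrt(M^2 - ||x||^2) so every stored item has norm M."""
    slack = max(max_norm ** 2 - dot(x, x), 0.0)
    return list(x) + [math.sqrt(slack)]


def transform_query(q):
    """Q(q): append 0, so Q(q) . P(x) equals q . x."""
    return list(q) + [0.0]


rng = random.Random(0)
items = []
for scale in (0.5, 1.0, 2.0, 3.0):  # items with very different norms
    for _ in range(50):
        items.append([rng.gauss(0, 1) * scale for _ in range(8)])
query = [rng.gauss(0, 1) for _ in range(8)]

# Brute-force MIPS over the original vectors.
mips_index = max(range(len(items)), key=lambda i: dot(query, items[i]))

# Cosine search over the asymmetrically transformed vectors.
max_norm = max(norm(x) for x in items)
transformed_items = [transform_item(x, max_norm) for x in items]
transformed_query = transform_query(query)
cosine_index = max(
    range(len(items)),
    key=lambda i: cosine(transformed_query, transformed_items[i]),
)

print(mips_index, cosine_index)
```

Because every transformed item has the same norm, ranking by cosine similarity after the transformation is the same as ranking by the original inner product, so a cosine-similarity hashing scheme can then be applied to the transformed vectors.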
520
$a
We believe that this thesis provides significant improvements to some of the heavily used subroutines in big-data systems, which we hope will be adopted.
590
$a
School code: 0058.
650
4
$a
Computer science.
$3
199325
650
4
$a
Statistics.
$3
182057
650
4
$a
Information science.
$3
190425
690
$a
0984
690
$a
0463
690
$a
0723
710
2
$a
Cornell University.
$b
Computer Science.
$3
730438
773
0
$t
Dissertation Abstracts International
$g
77-03B(E).
790
$a
0058
791
$a
Ph.D.
792
$a
2015
793
$a
English
856
4 0
$u
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3730467
Items
Inventory Number:
000000119486
Location Name:
Electronic collection
Item Class:
1 Book
Material type:
Dissertation
Call number:
TH 2015
Usage Class:
General use (Normal)
Loan Status:
On shelf
No. of reservations:
0