Kunz, Robert C.
Performance bottlenecks on large-scale shared-memory multiprocessors.
Record type:
Bibliographic - electronic resource: Monograph/item
Title proper/Author:
Performance bottlenecks on large-scale shared-memory multiprocessors.
Author:
Kunz, Robert C.
Extent:
137 p.
Note:
Adviser: John Hennessy.
Note:
Source: Dissertation Abstracts International, Volume: 65-11, Section: B, page: 5934.
Contained By:
Dissertation Abstracts International 65-11B.
Subject:
Engineering, Electronics and Electrical.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3153509
ISBN:
0496138243
Thesis (Ph.D.)--Stanford University, 2005.
Even setting aside contention, the coherence protocol is a smaller bottleneck than other system aspects, including the operating system's scheduling policies and the application's effective or ineffective use of the cache-coherent memory system. Programmers still need to tune their programs to a specific architecture; such tuning limits portability. While coherence protocols might be able to reduce remote communication, the mismatch between an application and the architecture is often more significant and prevents major performance improvements.
LDR :03548nmm _2200289 _450
001 167294
005 20061005085843.5
008 090528s2005 eng d
020 $a 0496138243
035 $a 00197910
040 $a UnM $c UnM
100 0 $a Kunz, Robert C. $3 237440
245 1 0 $a Performance bottlenecks on large-scale shared-memory multiprocessors.
300 $a 137 p.
500 $a Adviser: John Hennessy.
500 $a Source: Dissertation Abstracts International, Volume: 65-11, Section: B, page: 5934.
502 $a Thesis (Ph.D.)--Stanford University, 2005.
520 # $a Even setting aside contention, the coherence protocol is a smaller bottleneck than other system aspects, including the operating system's scheduling policies and the application's effective or ineffective use of the cache-coherent memory system. Programmers still need to tune their programs to a specific architecture; such tuning limits portability. While coherence protocols might be able to reduce remote communication, the mismatch between an application and the architecture is often more significant and prevents major performance improvements.
520 # $a Large-scale multiprocessors remain difficult to program because the memory system alone cannot eliminate the need for programmers to remain aware of implicit communication. The software libraries, compiler, and operating system must apply complex machine-specific optimizations to reduce second- and third-order performance bottlenecks. Therefore, the memory system should provide meaningful visibility and feedback to programming monitoring tools and compilers. Without such tools to assist programmers, the programming advantages of a coherent shared-memory multiprocessor versus a message-passing multiprocessor are likely to be small for larger processor counts.
520 # $a Researchers working on multiprocessor memory systems have advocated easing the programming burden by adding enhancements to the memory system designed to reduce memory latency and coherence overhead. Analogous to the lessons learned during the RISC movement over 20 years ago, simpler memory system designs are faster than more complicated ones, primarily because the additional contention present in the memory system overwhelms the minor reductions in latency that more complicated protocols provide. Thus, architects should focus on minimizing memory controller occupancy on large-scale multiprocessors rather than just latency.
520 # $a While multiprocessors have existed for many years, most parallel architectures are difficult to program efficiently. The key challenge is how to simplify the programming model so that programmers can write portable, highly efficient parallel programs with minimal effort. For example, cache-coherent shared-memory architectures trade the memory system complexity of the coherence protocol for a simpler programming model that does not require communication to be programmed explicitly. Using the FLASH machine, a large-scale cc-NUMA multiprocessor, this dissertation explores the interaction between hardware and software design trade-offs and quantifies the performance gains of memory system enhancements.
590 $a School code: 0212.
650 # 0 $a Engineering, Electronics and Electrical. $3 226981
690 $a 0544
710 0 # $a Stanford University. $3 212607
773 0 # $g 65-11B. $t Dissertation Abstracts International
790 $a 0212
790 1 0 $a Hennessy, John, $e advisor
791 $a Ph.D.
792 $a 2005
856 4 0 $u http://libsw.nuk.edu.tw:81/login?url=http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3153509 $z http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3153509
Holdings
Barcode:
000000002232
Location:
Electronic holdings
Circulation category:
1 Book
Material type:
Dissertation
Use type:
Normal
Loan status:
On shelf
Holds:
0