Herault, Thomas.
Fault-tolerance techniques for high-performance computing
Record type:
Bibliographic - electronic resource : Monograph/item
Title/Author:
Fault-tolerance techniques for high-performance computing / edited by Thomas Herault, Yves Robert.
Other authors:
Herault, Thomas.
Publisher:
Cham : Springer International Publishing : Imprint: Springer, 2015.
Physical description:
ix, 320 p. : ill., digital ; 24 cm.
Contained By:
Springer eBooks
Subject:
Fault-tolerant computing.
Electronic resource:
http://dx.doi.org/10.1007/978-3-319-20943-2
ISBN:
9783319209432 (electronic bk.)
Fault-tolerance techniques for high-performance computing [electronic resource] / edited by Thomas Herault, Yves Robert. - Cham : Springer International Publishing : Imprint: Springer, 2015. - ix, 320 p. : ill., digital ; 24 cm. - (Computer communications and networks, 1617-7975).
Part I: General Overview -- Fault-Tolerance Techniques for High-Performance Computing -- Part II: Technical Contributions -- Errors and Faults -- Fault-Tolerant MPI -- Using Replication for Resilience on Exascale Systems -- Energy-Aware Checkpointing Strategies.
This timely text/reference presents a comprehensive overview of fault-tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance. Emphasis is placed on analytical performance models. This is then followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models.

Topics and features:
Includes self-contained contributions from an international selection of preeminent experts;
Provides a survey of resilience methods and performance models;
Examines the various sources for errors and faults in large-scale systems, detailing their characteristics, with a focus on modeling, detection and prediction;
Reviews the spectrum of techniques that can be applied to design a fault-tolerant message passing interface;
Investigates different approaches to replication, comparing these to the traditional checkpoint-recovery approach;
Discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems, proposing a methodology to estimate such energy consumption.

This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing.

Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Superieure de Lyon, France, and a Visiting Research Scholar in the ICL.
ISBN: 9783319209432 (electronic bk.)
Standard No.: 10.1007/978-3-319-20943-2 (doi)
Subjects--Topical Terms:
324957
Fault-tolerant computing.
LC Class. No.: QA76.9.F38
Dewey Class. No.: 004.2
LDR      03089nmm a2200325 a 4500
001      472557
003      DE-He213
005      20160223100031.0
006      m d
007      cr nn 008maaau
008      160316s2015 gw s 0 eng d
020      $a 9783319209432 (electronic bk.)
020      $a 9783319209425 (paper)
024 7    $a 10.1007/978-3-319-20943-2 $2 doi
035      $a 978-3-319-20943-2
040      $a GP $c GP
041 0    $a eng
050 4    $a QA76.9.F38
072 7    $a UYD $2 bicssc
072 7    $a COM074000 $2 bisacsh
082 0 4  $a 004.2 $2 23
090      $a QA76.9.F38 $b F263 2015
245 0 0  $a Fault-tolerance techniques for high-performance computing $h [electronic resource] / $c edited by Thomas Herault, Yves Robert.
260      $a Cham : $b Springer International Publishing : $b Imprint: Springer, $c 2015.
300      $a ix, 320 p. : $b ill., digital ; $c 24 cm.
490 1    $a Computer communications and networks, $x 1617-7975
505 0    $a Part I: General Overview -- Fault-Tolerance Techniques for High-Performance Computing -- Part II: Technical Contributions -- Errors and Faults -- Fault-Tolerant MPI -- Using Replication for Resilience on Exascale Systems -- Energy-Aware Checkpointing Strategies.
520      $a This timely text/reference presents a comprehensive overview of fault-tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance. Emphasis is placed on analytical performance models. This is then followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models. Topics and features: includes self-contained contributions from an international selection of preeminent experts; provides a survey of resilience methods and performance models; examines the various sources for errors and faults in large-scale systems, detailing their characteristics, with a focus on modeling, detection and prediction; reviews the spectrum of techniques that can be applied to design a fault-tolerant message passing interface; investigates different approaches to replication, comparing these to the traditional checkpoint-recovery approach; discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems, proposing a methodology to estimate such energy consumption. This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing. Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Superieure de Lyon, France, and a Visiting Research Scholar in the ICL.
650 0    $a Fault-tolerant computing. $3 324957
650 0    $a High performance computing. $3 211079
650 1 4  $a Computer Science. $3 212513
650 2 4  $a System Performance and Evaluation. $3 273898
650 2 4  $a Performance and Reliability. $3 277564
650 2 4  $a Numeric Computing. $3 275524
700 1    $a Herault, Thomas. $3 727766
700 1    $a Robert, Yves. $3 727767
710 2    $a SpringerLink (Online service) $3 273601
773 0    $t Springer eBooks
830 0    $a Computer communications and networks. $3 560387
856 4 0  $u http://dx.doi.org/10.1007/978-3-319-20943-2
950      $a Computer Science (Springer-11645)
Holdings:
Barcode: 000000118662
Location: Electronic collection
Circulation category: 1 Book
Material type: E-book
Call number: EB QA76.9.F38 F263 2015
Use type: Normal
Loan status: On shelf
Reservations: 0