
Contemporary High
Performance Computing
From Petascale toward Exascale
Volume 3
Chapman & Hall/CRC
Computational Science Series
Series Editor: Sartaj Sahni
Data-Intensive Science
Terence Critchlow, Kerstin Kleese van Dam

Grid Computing
Techniques and Applications
Barry Wilkinson
Scientific Computing with Multicore and Accelerators
Jakub Kurzak, David A. Bader, Jack Dongarra

Introduction to the Simulation of Dynamics Using Simulink


Michael A. Gray

Introduction to Scheduling
Yves Robert, Frederic Vivien

Introduction to Modeling and Simulation with MATLAB® and Python


Steven I. Gordon, Brian Guilfoos

Fundamentals of Multicore Software Development


Victor Pankratius, Ali-Reza Adl-Tabatabai, Walter Tichy

Programming for Hybrid Multi/Manycore MPP Systems


John Levesque, Aaron Vose

Exascale Scientific Applications


Scalability and Performance Portability
Tjerk P. Straatsma, Katerina B. Antypas, Timothy J. Williams

GPU Parallel Program Development Using CUDA


Tolga Soyata

Parallel Programming with Co-Arrays


Robert W. Numrich

Contemporary High Performance Computing


From Petascale toward Exascale, Volume 3
Jeffrey S. Vetter

For more information about this series please visit:


https://www.crcpress.com/Chapman--HallCRC-Computational-Science/book-series/CHCOMPUTSCI
Contemporary High
Performance Computing
From Petascale toward Exascale
Volume 3

Edited by
Jeffrey S. Vetter
CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2019 by Taylor & Francis Group, LLC


CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper


Version Date: 20190124

International Standard Book Number-13: 978-1-138-48707-9 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity
of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized
in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying,
microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com
(http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers,
MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of
users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been
arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.
Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
Dedication

To my family, Jana and Alex.


Contents

Preface xix

Editor xxiii

1 Resilient HPC for 24x7x365 Weather Forecast Operations at the Australian Government Bureau of Meteorology 1
Dr Lesley Seebeck, Tim F Pugh, Damian Aigus, Dr Joerg Henrichs,
Andrew Khaw, Tennessee Leeuwenburg, James Mandilas, Richard
Oxbrow, Naren Rajasingam, Wojtek Uliasz, John Vincent, Craig West,
and Dr Rob Bell
1.1 Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.1 Program Background . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.2 Sponsor Background . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.3 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3.1 Highlights of Main Applications . . . . . . . . . . . . . . . . . . . 8
1.3.2 2017 Case Study: From Nodes to News, TC Debbie . . . . . . . . . 10
1.3.3 Benchmark Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.3.4 SSP - Monitoring System Performance . . . . . . . . . . . . . . . . 11
1.4 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.1 System Design Decisions . . . . . . . . . . . . . . . . . . . . . . . . 13
1.5 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.1 Australis Processors . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.2 Australis Node Design . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.2.1 Australis Service Node . . . . . . . . . . . . . . . . . . . . 15
1.5.2.2 Australis Compute Node . . . . . . . . . . . . . . . . . . . 16
1.5.3 External Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.5.4 Australis Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.5 Australis Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.5.6 Australis Storage and Filesystem . . . . . . . . . . . . . . . . . . . 17
1.6 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.6.1 Operating System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.6.2 Operating System Upgrade Procedure . . . . . . . . . . . . . . . . 18
1.6.3 Schedulers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.6.3.1 SMS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.6.3.2 Cylc . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.6.3.3 PBS Professional . . . . . . . . . . . . . . . . . . . . . . . 20
1.7 Programming System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.1 Programming Models . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.2 Compiler Selection . . . . . . . . . . . . . . . . . . . . . . . . . . . 22


1.7.3 Optimisations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.8 Archiving . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.8.1 Oracle Hierarchical Storage Manager (SAM-QFS) . . . . . . . . . . 23
1.8.2 MARS/TSM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.9 Data Center/Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.10 System Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.10.1 Systems Usage Patterns . . . . . . . . . . . . . . . . . . . . . . . . 25
1.11 Reliability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.11.1 Failover Scenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.11.2 Compute Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.11.3 Data Mover Failover . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.11.4 Storage Failover . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.11.4.1 Normal Mode . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.4.2 Failover Mode . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.4.3 Recovery Mode . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.4.4 Isolated Mode . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.5 SSH File Transfer Failover . . . . . . . . . . . . . . . . . . . . . . . 27
1.12 Implementing a Product Generation Platform . . . . . . . . . . . . . . . . 28

2 Theta and Mira at Argonne National Laboratory 31


Mark R. Fahey, Yuri Alexeev, Bill Allcock, Benjamin S. Allen, Ramesh
Balakrishnan, Anouar Benali, Liza Booker, Ashley Boyle, Laural
Briggs, Edouard Brooks, Phil Carns, Beth Cerny, Andrew Cherry, Lisa
Childers, Sudheer Chunduri, Richard Coffey, James Collins, Paul
Coffman, Susan Coghlan, Kathy DiBennardi, Ginny Doyle, Hal Finkel,
Graham Fletcher, Marta Garcia, Ira Goldberg, Cheetah Goletz, Susan
Gregurich, Kevin Harms, Carissa Holohan, Joseph A. Insley, Tommie
Jackson, Janet Jaseckas, Elise Jennings, Derek Jensen, Wei Jiang,
Margaret Kaczmarski, Chris Knight, Janet Knowles, Kalyan Kumaran,
Ti Leggett, Ben Lenard, Anping Liu, Ray Loy, Preeti Malakar, Avanthi
Mantrala, David E. Martin, Guillermo Mayorga, Gordon McPheeters,
Paul Messina, Ryan Milner, Vitali Morozov, Zachary Nault, Denise
Nelson, Jack O’Connell, James Osborn, Michael E. Papka, Scott
Parker, Pragnesh Patel, Saumil Patel, Eric Pershey, Renée Plzak,
Adrian Pope, Jared Punzel, Sreeranjani Ramprakash, John ‘Skip’
Reddy, Paul Rich, Katherine Riley, Silvio Rizzi, George Rojas, Nichols
A. Romero, Robert Scott, Adam Scovel, William Scullin, Emily
Shemon, Haritha Siddabathuni Som, Joan Stover, Mirek Suliba, Brian
Toonen, Tom Uram, Alvaro Vazquez-Mayagoitia, Venkatram
Vishwanath, R. Douglas Waldron, Gabe West, Timothy J. Williams,
Darin Wills, Laura Wolf, Wanda Woods, and Michael Zhang
2.1 ALCF Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.1.1 Argonne Leadership Computing Facility . . . . . . . . . . . . . . . 32
2.1.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.1.3 Organization of This Chapter . . . . . . . . . . . . . . . . . . . . . 34
2.2 Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.2.1 Mira Facility Improvements . . . . . . . . . . . . . . . . . . . . . . 34
2.2.2 Theta Facility Improvements . . . . . . . . . . . . . . . . . . . . . . 35
2.3 Theta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.3.1 Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.3.1.1 Processor . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

2.3.1.2 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.1.3 Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.3.1.4 Storage System . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3.2 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.3.2.1 Systems Administration of the Cray Linux Environment . 42
2.3.2.2 Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.3.3 Programming System . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.3.3.1 Programming Models . . . . . . . . . . . . . . . . . . . . 42
2.3.3.2 Languages and Compilers . . . . . . . . . . . . . . . . . . 43
2.3.4 Deployment and Acceptance . . . . . . . . . . . . . . . . . . . . . . 44
2.3.4.1 Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.3.4.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.3.5 Early Science and Transition to Operations . . . . . . . . . . . . . 46
2.4 Mira . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.4.1 Architecture and Software Summary . . . . . . . . . . . . . . . . . 49
2.4.2 Evolution of Ecosystem . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.4.3 Notable Science Accomplishments . . . . . . . . . . . . . . . . . . . 52
2.4.4 System Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.5 Cobalt Job Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.6 Job Failure Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.7 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

3 Enabling HPC Applications on a Cray XC40 with Manycore CPUs at ZIB 63
Alexander Reinefeld, Thomas Steinke, Matthias Noack, and Florian Wende
3.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.1.1 Research Center for Many-Core HPC . . . . . . . . . . . . . . . . . 64
3.1.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.2 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.2.1 VASP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.2.2 GLAT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.2.3 HEOM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.3 System Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . 67
3.3.1 Cray TDS at ZIB with Intel Xeon Phi Processors . . . . . . . . . . 68
3.3.2 Intel Xeon Phi 71xx . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.3.3 Intel Xeon Phi 72xx . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.4 Many-Core in HPC: The Need for Code Modernization . . . . . . . . . . 70
3.4.1 High-level SIMD Vectorization . . . . . . . . . . . . . . . . . . . . . 71
3.4.2 Offloading over Fabric . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.4.3 Runtime Kernel Compilation with KART . . . . . . . . . . . . . . 84
3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4 The Mont-Blanc Prototype 93


Filippo Mantovani, Daniel Ruiz, Leonardo Bautista, Vishal Metha,
Fabio Banchelli, Nikola Rajovic, Eduard Ayguade, Jesus Labarta,
Mateo Valero, Alejandro Rico Carro, Alex Ramirez Bellido, Markus
Geimer, and Daniele Tafani
4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
4.1.1 Project Context and Challenges . . . . . . . . . . . . . . . . . . . . 94
4.1.2 Objectives and Timeline . . . . . . . . . . . . . . . . . . . . . . . . 96
4.2 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.2.1 Compute Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

4.2.2 Blade . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.2.3 The Overall System . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.2.4 Performance Summary . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.3 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
4.3.1 Development Tools Ecosystem . . . . . . . . . . . . . . . . . . . . . 101
4.3.2 OpenStack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.4 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.4.1 Core Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.4.2 Node Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.4.3 System Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.4.4 Node Power Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.5 Deployment and Operational Information . . . . . . . . . . . . . . . . . . 108
4.5.1 Thermal Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.6 Highlights of Mont-Blanc . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.6.1 Reliability Study of an Unprotected RAM System . . . . . . . . . . 111
4.6.2 Network Retransmission and OS Noise Study . . . . . . . . . . . . 114
4.6.3 The Power Monitoring Tool of the Mont-Blanc System . . . . . . . 117
4.7 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

5 Chameleon 123
Kate Keahey, Pierre Riteau, Dan Stanzione, Tim Cockerill, Joe
Mambretti, Paul Rad, and Paul Ruth
5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
5.1.1 A Case for a Production Testbed . . . . . . . . . . . . . . . . . . . 124
5.1.2 Program Background . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5.1.3 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.2 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
5.2.1 Projected Use Cases . . . . . . . . . . . . . . . . . . . . . . . . . . 127
5.2.2 Phase 1 Chameleon Deployment . . . . . . . . . . . . . . . . . . . . 127
5.2.3 Experience with Phase 1 Hardware and Future Plans . . . . . . . . 129
5.3 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.4 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
5.4.1 Core Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.4.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.5 Appliances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.5.1 System Appliances . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.5.2 Complex Appliances . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6 Data Center/Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6.1 University of Chicago Facility . . . . . . . . . . . . . . . . . . . . . 137
5.6.2 TACC Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.6.3 Wide-Area Connectivity . . . . . . . . . . . . . . . . . . . . . . . . 137
5.7 System Management and Policies . . . . . . . . . . . . . . . . . . . . . . . 138
5.8 Statistics and Lessons Learned . . . . . . . . . . . . . . . . . . . . . . . . 138
5.9 Research Projects Highlights . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.9.1 Chameleon Slices for Wide-Area Networking Research . . . . . . . 141
5.9.2 Machine Learning Experiments on Chameleon . . . . . . . . . . . . 142

6 CSCS and the Piz Daint System 149


Sadaf R. Alam, Ladina Gilly, Colin J. McMurtrie, and Thomas C.
Schulthess
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.1.1 Program and Sponsor . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.1.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
6.2 Co-designing Piz Daint . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.3 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.3.1 Overview of the Cray XC50 Architecture . . . . . . . . . . . . . . . 155
6.3.2 Cray XC50 Hybrid Compute Node and Blade . . . . . . . . . . . . 155
6.3.3 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.3.4 Scratch File System Configuration . . . . . . . . . . . . . . . . . . . 157
6.4 Innovative Features of Piz Daint . . . . . . . . . . . . . . . . . . . . . . . 159
6.4.1 New Cray Linux Environment (CLE 6.0) . . . . . . . . . . . . . . . 160
6.4.2 Public IP Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.4.3 GPU Monitoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.4.4 System Management and Monitoring . . . . . . . . . . . . . . . . . 162
6.5 Data Center/Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.5.1 Design Criteria for the Facility . . . . . . . . . . . . . . . . . . . . . 163
6.5.2 Lake Water Cooling . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.5.3 Cooling Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 165
6.5.4 Electrical Distribution . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5.5 Siting the Current Piz Daint System . . . . . . . . . . . . . . . . . 166
6.5.5.1 Cooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5.5.2 Power . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.5.5.3 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.6 Consolidation of Services . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.6.1 High Performance Computing Service . . . . . . . . . . . . . . . . . 167
6.6.2 Visualization and Data Analysis Service . . . . . . . . . . . . . . . 168
6.6.3 Data Mover Service . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
6.6.4 Container Service . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.6.5 Cray Urika-XC Analytics Software Suite Services . . . . . . . . . . 170
6.6.6 Worldwide Large Hadron Collider (LHC) Computing Grid (WLCG)
Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.7 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171

7 Facility Best Practices 175


Ladina Gilly
7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
7.2 Forums That Discuss Best Practices in HPC . . . . . . . . . . . . . . . . 176
7.3 Relevant Standards for Data Centres . . . . . . . . . . . . . . . . . . . . . 176
7.4 Most Frequently Encountered Infrastructure Challenges . . . . . . . . . . 177
7.5 Compilation of Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.5.1 Management Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.5.2 Tendering Processes . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
7.5.3 Building Envelope . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
7.5.4 Power Density and Capacity . . . . . . . . . . . . . . . . . . . . . . 180
7.5.5 Raised Floor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
7.5.6 Electrical Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . 182
7.5.7 Cooling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
7.5.8 Fire Protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184

7.5.9 Measuring and Monitoring . . . . . . . . . . . . . . . . . . . . . . . 184


7.5.10 Once in Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.6 Limitations and Implications . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185

8 Jetstream 189
Craig A. Stewart, David Y. Hancock, Therese Miller, Jeremy Fischer,
R. Lee Liming, George Turner, John Michael Lowe, Steven Gregory,
Edwin Skidmore, Matthew Vaughn, Dan Stanzione, Nirav Merchant,
Ian Foster, James Taylor, Paul Rad, Volker Brendel, Enis Afgan,
Michael Packard, Therese Miller, and Winona Snapp-Childs
8.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
8.1.1 Jetstream Motivation and Sponsor Background . . . . . . . . . . . 192
8.1.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
8.1.3 Hardware Acceptance . . . . . . . . . . . . . . . . . . . . . . . . . . 196
8.1.4 Benchmark Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
8.1.5 Cloud Functionality Tests . . . . . . . . . . . . . . . . . . . . . . . 198
8.1.6 Gateway Functionality Tests . . . . . . . . . . . . . . . . . . . . . . 199
8.1.7 Data Movement, Storage, and Dissemination . . . . . . . . . . . . . 199
8.1.8 Acceptance by NSF . . . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.2 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . 200
8.2.1 Highlights of Main Applications . . . . . . . . . . . . . . . . . . . 201
8.3 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.4 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
8.4.1 Node Design and Processor Elements . . . . . . . . . . . . . . . . . 203
8.4.2 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
8.4.3 Storage Subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
8.5 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
8.5.1 Operating System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
8.5.2 System Administration . . . . . . . . . . . . . . . . . . . . . . . . . 206
8.5.3 Schedulers and Virtualization . . . . . . . . . . . . . . . . . . . . . 206
8.5.4 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
8.5.5 Storage Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
8.5.6 User Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . 208
8.5.7 Allocation Software and Processes . . . . . . . . . . . . . . . . . . . 209
8.6 Programming System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.6.1 Atmosphere . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
8.6.2 Jetstream Plugins for the Atmosphere Platform . . . . . . . . . . . 211
8.6.2.1 Authorization . . . . . . . . . . . . . . . . . . . . . . . . . 211
8.6.2.2 Allocation Sources and Special Allocations . . . . . . . . . 211
8.6.3 Globus Authentication and Data Access . . . . . . . . . . . . . . . 212
8.6.4 The Jetstream OpenStack API . . . . . . . . . . . . . . . . . . . . 212
8.6.5 VM libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
8.7 Data Center Facilities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
8.8 System Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
8.9 Interesting Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
8.9.1 Jupyter and Kubernetes . . . . . . . . . . . . . . . . . . . . . . . . 216
8.10 Artificial Intelligence Technology Education . . . . . . . . . . . . . . . . . 217
8.11 Jetstream VM Image Use for Scientific Reproducibility - Bioinformatics as
an Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
8.12 Running a Virtual Cluster on Jetstream . . . . . . . . . . . . . . . . . . . 218

9 Modular Supercomputing Architecture: From Idea to Production 223


Estela Suarez, Norbert Eicker, and Thomas Lippert
9.1 The Jülich Supercomputing Centre (JSC) . . . . . . . . . . . . . . . . . . 224
9.2 Supercomputing Architectures at JSC . . . . . . . . . . . . . . . . . . . . 224
9.2.1 The Dual Supercomputer Strategy . . . . . . . . . . . . . . . . . . 225
9.2.2 The Cluster-Booster Concept . . . . . . . . . . . . . . . . . . . . . 227
9.2.3 The Modular Supercomputing Architecture . . . . . . . . . . . . . 228
9.3 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 229
9.3.1 Co-design Applications in the DEEP Projects . . . . . . . . . . . . 231
9.4 Systems Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
9.4.1 Sponsors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.4.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
9.5 Hardware Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
9.5.1 First Generation (DEEP) Prototype . . . . . . . . . . . . . . . . . 235
9.5.2 Second Generation (DEEP-ER) Prototype . . . . . . . . . . . . . . 238
9.5.3 JURECA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
9.6 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.6.1 System Administration . . . . . . . . . . . . . . . . . . . . . . . . . 241
9.6.2 Schedulers and Resource Management . . . . . . . . . . . . . . . . 242
9.6.3 Network-bridging Protocol . . . . . . . . . . . . . . . . . . . . . . . 244
9.6.4 I/O Software and File System . . . . . . . . . . . . . . . . . . . . . 244
9.7 Programming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
9.7.1 Inter-module MPI Offloading . . . . . . . . . . . . . . . . . . . . . 245
9.7.2 OmpSs Abstraction Layer . . . . . . . . . . . . . . . . . . . . . . . 246
9.7.3 Resiliency Software . . . . . . . . . . . . . . . . . . . . . . . . . . . 247
9.8 Cooling and Facility Infrastructure . . . . . . . . . . . . . . . . . . . . . . 249
9.9 Conclusions and Next steps . . . . . . . . . . . . . . . . . . . . . . . . . . 250
9.10 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251

10 SuperMUC at LRZ 257


Hayk Shoukourian, Arndt Bode, Herbert Huber, Michael Ott, and
Dieter Kranzlmüller
10.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257
10.1.1 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 258
10.2 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
10.3 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 261
10.4 System Stability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 265
10.5 Data Center/Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
10.6 R&D on Energy-Efficiency at LRZ . . . . . . . . . . . . . . . . . . . . . . 268

11 The NERSC Cori HPC System 275


Katie Antypas, Brian Austin, Deborah Bard, Wahid Bhimji, Brandon
Cook, Tina Declerck, Jack Deslippe, Richard Gerber, Rebecca
Hartman–Baker, Yun (Helen) He, Douglas Jacobsen, Thorsten Kurth,
Jay Srinivasan, and Nicholas J. Wright
11.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
11.1.1 Sponsor and Program Background . . . . . . . . . . . . . . . . . . 276
11.1.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
11.2 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 277
11.2.1 Benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
11.3 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

11.4 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 280


11.4.1 Node Types and Design . . . . . . . . . . . . . . . . . . . . . . . . 280
11.4.1.1 Xeon Phi "Knights Landing" Compute Nodes . . . . . . . 280
11.4.1.2 Xeon "Haswell" Compute Nodes . . . . . . . . . . . . . . 280
11.4.1.3 Service Nodes . . . . . . . . . . . . . . . . . . . . . . . . . 280
11.4.2 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
11.4.3 Storage - Burst Buffer and Lustre Filesystem . . . . . . . . . . . . 281
11.5 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
11.5.1 System Software Overview . . . . . . . . . . . . . . . . . . . . . . . 281
11.5.2 System Management Stack . . . . . . . . . . . . . . . . . . . . . . . 282
11.5.3 Resource Management . . . . . . . . . . . . . . . . . . . . . . . . . 282
11.5.4 Storage Resources and Software . . . . . . . . . . . . . . . . . . . . 283
11.5.5 Networking Resources and Software . . . . . . . . . . . . . . . . . . 284
11.5.6 Containers and User-Defined Images . . . . . . . . . . . . . . . . . 284
11.6 Programming Environment . . . . . . . . . . . . . . . . . . . . . . . . . . 285
11.6.1 Programming Models . . . . . . . . . . . . . . . . . . . . . . . . . . 285
11.6.2 Languages and Compilers . . . . . . . . . . . . . . . . . . . . . . . 285
11.6.3 Libraries and Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
11.6.4 Building Software for a Heterogeneous System . . . . . . . . . . . . 286
11.6.5 Default Mode Selection Considerations . . . . . . . . . . . . . . . . 287
11.6.6 Running Jobs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287
11.7 NESAP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
11.7.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288
11.7.2 Optimization Strategy and Tools . . . . . . . . . . . . . . . . . . . 288
11.7.3 Most Effective Optimizations . . . . . . . . . . . . . . . . . . . . . 290
11.7.4 NESAP Result Overview . . . . . . . . . . . . . . . . . . . . . . . . 291
11.7.5 Application Highlights . . . . . . . . . . . . . . . . . . . . . . . . . 291
11.7.5.1 Quantum ESPRESSO . . . . . . . . . . . . . . . . . . . . 291
11.7.5.2 MFDn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293
11.8 Data Science . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
11.8.1 IO Improvement: Burst Buffer . . . . . . . . . . . . . . . . . . . . . 295
11.8.2 Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
11.8.2.1 Network Connectivity to External Nodes . . . . . . . . . . 298
11.8.2.2 Burst Buffer Filesystem for In-situ Workflows . . . . . . . 298
11.8.2.3 Real-time and Interactive Queues for Time Sensitive
Analyses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
11.8.2.4 Scheduler and Queue Improvements to Support
Data-intensive Computing . . . . . . . . . . . . . . . . . . 299
11.9 System Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
11.9.1 System Utilizations . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
11.9.2 Job Completion Statistics . . . . . . . . . . . . . . . . . . . . . . . 299
11.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
11.11 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302

12 Lomonosov-2 305
Vladimir Voevodin, Alexander Antonov, Dmitry Nikitenko, Pavel
Shvets, Sergey Sobolev, Konstantin Stefanov, Vadim Voevodin, and
Sergey Zhumatiy and Andrey Brechalov, and Alexander Naumov
12.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305
12.1.1 HPC History of MSU . . . . . . . . . . . . . . . . . . . . . . . . . . 305
12.1.2 Lomonosov-2 Supercomputer: Timeline . . . . . . . . . . . . . . . . 308

12.2 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 309


12.2.1 Main Applications Highlights . . . . . . . . . . . . . . . . . . . . . 309
12.2.2 Benchmark Results and Rating Positions . . . . . . . . . . . . . . . 309
12.2.3 Users and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 310
12.3 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
12.4 System Software and Programming Systems . . . . . . . . . . . . . . . . . 313
12.5 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
12.5.1 Communication Network . . . . . . . . . . . . . . . . . . . . . . . . 315
12.5.2 Auxiliary InfiniBand Network . . . . . . . . . . . . . . . . . . . . . 315
12.5.3 Management and Service Network . . . . . . . . . . . . . . . . . . . 316
12.6 Storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
12.7 Engineering Infrastructure . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
12.7.1 Infrastructure Support . . . . . . . . . . . . . . . . . . . . . . . . . 318
12.7.2 Power Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . . 318
12.7.3 Engineering Equipment . . . . . . . . . . . . . . . . . . . . . . . . . 320
12.7.4 Overall Cooling System . . . . . . . . . . . . . . . . . . . . . . . . . 320
12.7.5 Cooling Auxiliary IT Equipment . . . . . . . . . . . . . . . . . . . 322
12.7.6 Emergency Cooling . . . . . . . . . . . . . . . . . . . . . . . . . . . 322
12.7.7 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323
12.8 Efficiency of the Supercomputer Center . . . . . . . . . . . . . . . . . . . 323

13 Electra 331
Rupak Biswas, Jeff Becker, Davin Chan, David Ellsworth, and Robert
Hood, Piyush Mehrotra, Michelle Moyer, Chris Tanner, and William
Thigpen
13.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332
13.2 NASA Requirements for Supercomputing . . . . . . . . . . . . . . . . . . 333
13.3 Supercomputing Capabilities: Conventional Facilities . . . . . . . . . . . . 333
13.3.1 Computer Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . 333
13.3.2 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
13.3.3 Network Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . 334
13.3.4 Storage Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
13.3.5 Visualization and Hyperwall . . . . . . . . . . . . . . . . . . . . . . 336
13.3.6 Primary NAS Facility . . . . . . . . . . . . . . . . . . . . . . . . . . 336
13.4 Modular Supercomputing Facility . . . . . . . . . . . . . . . . . . . . . . . 337
13.4.1 Limitations of the Primary NAS Facility . . . . . . . . . . . . . . . 337
13.4.2 Expansion and Integration Strategy . . . . . . . . . . . . . . . . . 337
13.4.3 Site Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
13.4.4 Module Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338
13.4.5 Power, Cooling, Network . . . . . . . . . . . . . . . . . . . . . . . . 339
13.4.6 Facility Operations and Maintenance . . . . . . . . . . . . . . . . . 340
13.4.7 Environmental Impact . . . . . . . . . . . . . . . . . . . . . . . . . 341
13.5 Electra Supercomputer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
13.5.1 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342
13.5.2 I/O Subsystem Architecture . . . . . . . . . . . . . . . . . . . . . . 343
13.6 User Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
13.6.1 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
13.6.2 Resource Allocation and Scheduling . . . . . . . . . . . . . . . . . . 344
13.6.3 User Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
13.7 Application Benchmarking and Performance . . . . . . . . . . . . . . . . 345
13.8 Utilization Statistics of HECC Resources . . . . . . . . . . . . . . . . . . 347

13.9 System Operations and Maintenance . . . . . . . . . . . . . . . . . . . . . 348


13.9.1 Administration Tools . . . . . . . . . . . . . . . . . . . . . . . . . . 348
13.9.2 Monitoring, Diagnosis, and Repair Tools . . . . . . . . . . . . . . . 349
13.9.3 System Enhancements and Maintenance . . . . . . . . . . . . . . . 350
13.10 Featured Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
13.11 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352

14 Bridges: Converging HPC, AI, and Big Data for Enabling Discovery 355
Nicholas A. Nystrom, Paola A. Buitrago, and Philip D. Blood
14.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356
14.1.1 Sponsor/Program Background . . . . . . . . . . . . . . . . . . . . . 357
14.1.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
14.2 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 359
14.2.1 Highlights of Main Applications and Data . . . . . . . . . . . . . . 360
14.2.2 Artificial Intelligence . . . . . . . . . . . . . . . . . . . . . . . . . . 361
14.2.3 Genomics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
14.2.4 Gateways . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
14.2.5 Allocations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
14.3 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
14.4 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
14.4.1 Processors and Accelerators . . . . . . . . . . . . . . . . . . . . . . 366
14.4.2 Node Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
14.4.3 Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
14.4.4 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
14.4.5 Storage System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
14.5 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
14.5.1 Operating System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
14.5.2 File Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
14.5.3 System Administration . . . . . . . . . . . . . . . . . . . . . . . . . 371
14.5.4 Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
14.6 Interactivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
14.6.1 Virtualization and Containers . . . . . . . . . . . . . . . . . . . . . 372
14.7 User Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
14.7.1 User Environment Customization . . . . . . . . . . . . . . . . . . . 373
14.7.2 Programming Models . . . . . . . . . . . . . . . . . . . . . . . . . . 374
14.7.3 Languages and Compilers . . . . . . . . . . . . . . . . . . . . . . . 374
14.7.4 Programming Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
14.7.5 Spark and Hadoop . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
14.7.6 Databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
14.7.7 Domain-Specific Frameworks and Libraries . . . . . . . . . . . . . . 375
14.7.8 Gateways, Workflows, and Distributed Applications . . . . . . . . . 375
14.8 Storage, Visualization, and Analytics . . . . . . . . . . . . . . . . . . . . . 376
14.8.1 Community Datasets and Big Data as a Service . . . . . . . . . . . 376
14.9 Datacenter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
14.10 System Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
14.10.1 Reliability and Uptime . . . . . . . . . . . . . . . . . . . . . . . . . 377
14.11 Science Highlights: Bridges-Enabled Breakthroughs . . . . . . . . . . . . . 377
14.11.1 Artificial Intelligence and Big Data . . . . . . . . . . . . . . . . . . 377
14.11.2 Genomics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
14.12 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379

15 Stampede at TACC 385


Dan Stanzione and John West
15.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
15.1.1 Program Background . . . . . . . . . . . . . . . . . . . . . . . . . 386
15.1.2 Lessons Learned on the Path to Stampede 2 . . . . . . . . . . . . . 386
15.2 Workload and the Design of Stampede 2 . . . . . . . . . . . . . . . . . . . 388
15.2.1 Science Highlights . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
15.3 System Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
15.3.1 Processors and Memory . . . . . . . . . . . . . . . . . . . . . . . . 390
15.3.2 Interconnect . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
15.3.3 Disk I/O Subsystem . . . . . . . . . . . . . . . . . . . . . . . . . . 391
15.3.4 Non-volatile Memory . . . . . . . . . . . . . . . . . . . . . . . . . . 392
15.4 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 392
15.4.1 System Performance Monitoring and Administration . . . . . . . . 392
15.4.2 Job Submission and System Health . . . . . . . . . . . . . . . . . . 393
15.4.3 Application Development Tools . . . . . . . . . . . . . . . . . . . . 393
15.5 Visualization and Analytics . . . . . . . . . . . . . . . . . . . . . . . . . . 394
15.5.1 Visualization on Stampede 2 . . . . . . . . . . . . . . . . . . . . . . 394
15.5.2 Data Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 395
15.6 Datacenter, Layout, and Cybersecurity . . . . . . . . . . . . . . . . . . . . 395
15.6.1 System Layout and Phased Deployment . . . . . . . . . . . . . . . 396
15.6.2 Cybersecurity and Identity Management . . . . . . . . . . . . . . . 396
15.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 396

16 Oakforest-PACS 401
Taisuke Boku, Osamu Tatebe, Daisuke Takahashi, Kazuhiro Yabana,
Yuta Hirokawa, and Masayuki Umemura, Toshihiro Hanawa, Kengo
Nakajima, and Hiroshi Nakamura, Tsuyoshi Ichimura and Kohei
Fujita, and Yutaka Ishikawa, Mitsuhisa Sato, Balazs Gerofi, and
Masamichi Takagi
16.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
16.2 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
16.3 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . 403
16.3.1 GAMERA/GHYDRA . . . . . . . . . . . . . . . . . . . . . . . . . 403
16.3.2 ARTED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404
16.3.3 Benchmark Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
16.3.3.1 HPL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
16.3.3.2 HPCG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
16.4 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
16.5 Hardware Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 408
16.6 System Software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.6.1 Basic System Software . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.6.2 IHK/McKernel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409
16.7 Programming System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 412
16.7.1 Basic Programming Environment . . . . . . . . . . . . . . . . . . . 412
16.7.2 XcalableMP: A PGAS Parallel Programming Language for Parallel
Many-core Processor System . . . . . . . . . . . . . . . . . . . . . . 413
16.7.2.1 Overview of XcalableMP . . . . . . . . . . . . . . . . . . . 413
16.7.2.2 OpenMP and XMP Tasklet Directive . . . . . . . . . . . . 414
16.7.2.3 Multi-tasking Execution Model in XcalableMP between
Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 415

16.7.2.4 Preliminary Performance Evaluation on Oakforest-PACS . 416


16.7.2.5 Communication Optimization for Many-Core Clusters . . 417
16.8 Storage System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 418
16.9 Data Center/Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419

17 CHPC in South Africa 423


Happy M Sithole, Werner Janse Van Rensburg, Dorah Thobye,
Krishna Govender, Charles Crosby, Kevin Colville, and Anita Loots
17.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
17.1.1 Sponsor/Program Background . . . . . . . . . . . . . . . . . . . . . 423
17.1.2 Business Case of the Installation of Lengau . . . . . . . . . . . . . 424
17.1.3 Timeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 425
17.2 Applications and Workloads . . . . . . . . . . . . . . . . . . . . . . . . . . 426
17.2.1 Highlights of Main Applications . . . . . . . . . . . . . . . . . . . . 426
17.2.2 Benchmark Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 427
17.2.2.1 Computational Mechanics . . . . . . . . . . . . . . . . . . 428
17.2.2.2 Earth Sciences . . . . . . . . . . . . . . . . . . . . . . . . 430
17.2.2.3 Computational Chemistry . . . . . . . . . . . . . . . . . . 430
17.2.2.4 Astronomy . . . . . . . . . . . . . . . . . . . . . . . . . . 433
17.3 System Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
17.4 Storage, Visualisation and Analytics . . . . . . . . . . . . . . . . . . . . . 438
17.5 Data Center/Facility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
17.6 System Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 439
17.7 Square Kilometer Array . . . . . . . . . . . . . . . . . . . . . . . . . . . . 447

Index 451
Preface

We are pleased to present you with this third volume of material that captures a snapshot
of the rich history of practice in Contemporary High Performance Computing. As evidenced
in the chapters of this book, High Performance Computing continues to flourish, both in
industry and research, both domestically and internationally. While much of the focus of
HPC is on the hardware architectures, a significant ecosystem is responsible for this success.
This book helps capture this broad ecosystem.
High Performance Computing (HPC) is used to solve a number of complex questions
in computational and data-intensive sciences. These questions include the simulation and
modeling of physical phenomena, such as climate change, energy production, drug design,
global security, and materials design; the analysis of large data sets, such as those in genome
sequencing, astronomical observation, and cybersecurity; and, the intricate design of engineered
products, such as airplanes and automobiles.
It is clear and well-documented that HPC is used to generate insight that would not otherwise
be possible. Simulations can augment or replace expensive, hazardous, or impossible
experiments. Furthermore, in the realm of simulation, HPC has the potential to suggest
new experiments that escape the parameters of observability.
Although much of the excitement about HPC focuses on the largest architectures and
on specific benchmarks, such as TOP500, there is a much deeper and broader commitment
from the international scientific and engineering community than is first apparent. In fact,
it is easy to lose track of history in terms of the broad uses of HPC and the communities
that design, deploy, and operate HPC systems and facilities. Many of these sponsors and
organizations have spent decades developing scientific simulation methods and software,
which serves as the foundation of HPC today. This community has worked closely with
countless vendors to foster the sustained development and deployment of HPC systems
internationally.
In this third volume of Contemporary High Performance Computing [1, 2], we continue
to document international HPC ecosystems, which includes the sponsors and sites that
host them. We have selected contributions from international HPC sites, which represent
a combination of sites, systems, vendors, applications, and sponsors. Rather than focus
on simply the architectures or applications, we focus on HPC ecosystems that have made
this dramatic progress possible. Though the very word ecosystem can be a broad, all-
encompassing term, it aptly describes high performance computing. That is, HPC is far more
than one sponsor, one site, one application, one software system, or one architecture. Indeed,
it is a community of interacting entities in this environment that sustains the community
over time. In this regard, we asked contributors to include the following topics in their
chapters:

1. Sponsor and site history


2. Highlights of applications, workloads, and benchmarks
3. Systems overview
4. Hardware architecture


5. System software

6. Programming systems
7. Storage, visualization, and analytics
8. Data center/facility

9. Site HPC statistics

Some of the authors followed this outline precisely while others found creative ways
to include this content in a different structure. Once you read the book, I think that you
will agree with me that most of the chapters have exceeded these expectations and have
provided a detailed snapshot of their HPC ecosystem, science, and organization.

Why I Edited This Book


My goal with this series of books has been to highlight and document significant systems
and facilities in high performance computing. With Volume 1, my main focus was intended
to be on the architectural design of important and successful HPC systems. However, as I
started to interact with authors, I realized that HPC is about more than just hardware: it is
an ecosystem that includes software, applications, facilities, educators, software developers,
scientists, administrators, sponsors, and many others. Broadly speaking, HPC is growing
internationally, so I invited contributions from a broad base of organizations including the
USA, Japan, Germany, Australia, Spain, and others. The second volume is a snapshot of
these contemporary HPC ecosystems. Each chapter is typically punctuated with a site’s
flagship system.
My excitement about volumes one and two of this book grew as I started inviting authors
to contribute: everyone said ’yes!’ In fact, due to the limitations on hardback publishing,
we continued the series with this volume.

Helping Improve This Book


HPC and computing, in general, is a rapidly changing, large, diverse field. If you have
comments, corrections, or questions, please send a note to me at [email protected].

Bibliography
[1] J. S. Vetter. Contemporary high performance computing: an introduction. In Jeffrey S.
Vetter, editor, Contemporary High Performance Computing: From Petascale Toward
Exascale, volume 1 of CRC Computational Science Series, page 730. Taylor and Francis,
Boca Raton, 1 edition, 2013.

[2] J. S. Vetter, editor. Contemporary High Performance Computing: From Petascale Toward
Exascale, volume 2 of CRC Computational Science Series. Taylor and Francis,
Boca Raton, 1 edition, 2015.
Editor

Jeffrey S. Vetter, Ph.D., is a Distinguished R&D Staff Member, and the founding group
leader of the Future Technologies Group in the Computer Science and Mathematics Division
of Oak Ridge National Laboratory. Vetter also holds a joint appointment at the Electrical
Engineering and Computer Science Department of the University of Tennessee-Knoxville.
From 2005 through 2015, Vetter held a joint position at Georgia Institute of Technology,
where, from 2009 to 2015, he was the Principal Investigator of the NSF Track 2D Experimental
Computing XSEDE Facility, named Keeneland, for large scale heterogeneous
computing using graphics processors, and the Director of the NVIDIA CUDA Center of
Excellence.
Vetter earned his Ph.D. in Computer Science from the Georgia Institute of Technology.
He joined ORNL in 2003, after stints as a computer scientist and project leader at Lawrence
Livermore National Laboratory, and postdoctoral researcher at the University of Illinois at
Urbana-Champaign. The coherent thread through his research is developing rich architectures
and software systems that solve important, real-world high performance computing
problems. He has been investigating the effectiveness of next-generation architectures, such
as non-volatile memory systems, massively multithreaded processors, and heterogeneous
processors such as graphics processors and field-programmable gate arrays (FPGAs), for
key applications. His recent books, entitled "Contemporary High Performance Computing:
From Petascale toward Exascale (Vols. 1 and 2)," survey the international landscape of
HPC.
Vetter is a Fellow of the IEEE, and a Distinguished Scientist Member of the ACM.
Vetter, as part of an interdisciplinary team from Georgia Tech, NYU, and ORNL, was
awarded the Gordon Bell Prize in 2010. Also, his work has won awards at major venues:
Best Paper Awards at the International Parallel and Distributed Processing Symposium
(IPDPS), EuroPar and the 2018 AsHES Workshop, Best Student Paper Finalist at SC14,
Best Presentation at EASC 2015, and Best Paper Finalist at the IEEE HPEC Conference.
In 2015, Vetter served as the Technical Program Chair of SC15 (SC15 Breaks Exhibits and
Attendance Records While in Austin). You can see more at https://ft.ornl.gov/~vetter.

Chapter 1
Resilient HPC for 24x7x365 Weather
Forecast Operations at the Australian
Government Bureau of Meteorology

Dr Lesley Seebeck
Former Group Executive of Data & Digital, CITO, Australian Bureau of Meteorology

Tim F Pugh
Director, Supercomputer Programme, Australian Bureau of Meteorology

Damian Aigus
Support Services, Data & Digital, Australian Bureau of Meteorology

Dr Joerg Henrichs
Computational Science Manager, Data & Digital, Australian Bureau of Meteorology

Andrew Khaw
Scientific Computing Service Manager, Data & Digital, Australian Bureau of Meteorology

Tennessee Leeuwenburg
Model Build Team Manager, Data & Digital, Australian Bureau of Meteorology

James Mandilas
Operations and Change Manager, Data & Digital, Australian Bureau of Meteorology

Richard Oxbrow
HPD Systems Manager, Data & Digital, Australian Bureau of Meteorology

Naren Rajasingam
HPD Analyst, Data & Digital, Australian Bureau of Meteorology

Wojtek Uliasz
Enterprise Architect, Data & Digital, Australian Bureau of Meteorology

John Vincent
Delivery Manager, Data & Digital, Australian Bureau of Meteorology

Craig West
HPC Systems Manager, Data & Digital, Australian Bureau of Meteorology

Dr Rob Bell
IMT Scientific Computing Services, National Partnerships, CSIRO


1.1  Foreword
1.2  Overview
     1.2.1  Program Background
     1.2.2  Sponsor Background
     1.2.3  Timeline
1.3  Applications and Workloads
     1.3.1  Highlights of Main Applications
     1.3.2  2017 Case Study: From Nodes to News, TC Debbie
     1.3.3  Benchmark Usage
     1.3.4  SSP - Monitoring System Performance
1.4  System Overview
     1.4.1  System Design Decisions
1.5  Hardware Architecture
     1.5.1  Australis Processors
     1.5.2  Australis Node Design
            1.5.2.1  Australis Service Node
            1.5.2.2  Australis Compute Node
     1.5.3  External Nodes
     1.5.4  Australis Memory
     1.5.5  Australis Interconnect
     1.5.6  Australis Storage and Filesystem
1.6  System Software
     1.6.1  Operating System
     1.6.2  Operating System Upgrade Procedure
     1.6.3  Schedulers
            1.6.3.1  SMS
            1.6.3.2  Cylc
            1.6.3.3  PBS Professional
1.7  Programming System
     1.7.1  Programming Models
     1.7.2  Compiler Selection
     1.7.3  Optimisations
1.8  Archiving
     1.8.1  Oracle Hierarchical Storage Manager (SAM-QFS)
     1.8.2  MARS/TSM
1.9  Data Center/Facility
1.10 System Statistics
     1.10.1  Systems Usage Patterns
1.11 Reliability
     1.11.1  Failover Scenarios
     1.11.2  Compute Failover
     1.11.3  Data Mover Failover
     1.11.4  Storage Failover
             1.11.4.1  Normal Mode
             1.11.4.2  Failover Mode
             1.11.4.3  Recovery Mode
             1.11.4.4  Isolated Mode
     1.11.5  SSH File Transfer Failover
1.12 Implementing a Product Generation Platform
Bibliography

1.1 Foreword
Supercomputing lies at the heart of modern weather forecasting. It coevolves with the
science, technology, means of the collection of observations, the needs of meteorologists,
and the expectations of the users of our forecasts and warnings. It nestles in a web of
other platforms and networks, applications and capabilities. It is driven by, consumes, and
generates vast and increasing amounts of data. And it is part of the global effort by the
world’s meteorological agencies to collect data, understand the weather, and look ahead to
generate forecasts and warnings on which human activity is based. Given the complexity
of the overall task and the web of supporting capability, to talk about the supercomputing
component alone seems reductionist. And yet it is a feat of human engineering and effort
that we do well to recognise. These are capabilities that drive the data and information
business that is the Bureau – the growing benefits available through more data, increasing
granularity and frequency of forecasts, and better information to the Bureau’s customers –
no more and no less than to the scientists or the meteorologists.

FIGURE 1.1: Australis compute racks as seen in the Data Centre

The Bureau’s current supercomputer, Australis, was delivered on time and within bud-
get, with the supercomputer itself, a Cray XC40, bought at a capital cost of $A80 million[8].
The programme extends from 2014-15 through 2020-21. Within that period, the Bureau
continues to keep pace with the relentless demands of the data, the models, and user needs,
and to explore new and improved ways to extract value from both data and capability. It also
has to contend with an increasingly challenging operating environment, with the effective use
of such systems placing growing demands on organisations in terms of skills, operating costs,
and security.
On a personal note, arriving at the start of the programme to replace the existing
supercomputer, I was fortunate to have a highly capable team led by Tim Pugh. To continue to
be an effective contributor to the field, both the Bureau and Australia need to nurture and
grow the technical skills, deep computational understanding, and insights that build and
shape the field of high performance computing, and to exploit that capability. This chapter
sets out the Australian Bureau of Meteorology's supercomputing capability, and in doing so
helps contribute to that effort.

Dr Lesley Seebeck
Former Group Executive of Data & Digital, CITO,
Australian Bureau of Meteorology

1.2 Overview
The Australian Government's Bureau of Meteorology has had the responsibility of providing
trusted, reliable, and responsive meteorological services for Australia - all day, every day -
since 1908. By bringing together the ever-expanding world-wide observation networks with
improving computational analysis and numerical modelling to deliver the Bureau's exceptional
predictive and analytical capability, we are able to undertake the grand challenge of weather
and climate prediction.
Australia is a country with a landmass marginally less than the continental United
States, but with a population 13 times smaller. Australia is not only vast, it is also harsh.
With just 9% of the landmass suitable for farming, and the main population living along
the cooler coastal regions, the climate of the continent plays a significant role in defining
the life of the country.
Around the country there are climate pockets similar to those found on every other
continent: Sydney shares a climate similar to South Africa's, Canberra is most like Rome,
Melbourne like the San Francisco Bay area, Perth like Los Angeles, Darwin like Mumbai, and
Hobart like southern Chile and the UK. Across the centre are deserts, which, though sparsely
populated, still contain major population centres like Alice Springs and the mining town of
Kalgoorlie.
Against this backdrop the Bureau and its forecasting team strive to provide timely
weather products covering the entire continent and its climate variations, as well as managing
weather responsibilities for Australia's Antarctic Territory (a 5.9 million square kilometre
area, 42% of the Antarctic continent), on a 24x7x365 basis. As if this were not a significant
enough daily endeavour, the Bureau also manages a suite of on-demand emergency forecasts to
cover the extreme weather events of the region: tsunami, cyclone, and bushfire (wildfire).
These regularly run in the extreme weather season (December - April) and are also ready to go
as and when they are required. For Australia as an island continent, the Bureau also provides
a full suite of oceanographic forecasting.
The Bureau of Meteorology has the unique numerical prediction capabilities required
to routinely forecast the weather and climate conditions across the Australian continent,
its territories, and the surrounding marine environment. When this capability is utilised
with modern data and digital information services, we are able to issue timely forecasts
and warnings to the Australian public, media, industries, and Government services well in
advance of an event. These services are essential to ensure the nation is prepared to act when
faced with an event, and to mitigate the loss of property and lives. As a prepared nation,
we have been very successful in reducing the loss of lives and improving the warnings over
the years, and will continue to improve with each advance in key areas of science, numerical
prediction, observing networks, and computational systems.
Modern weather and climate prediction requires a significant science investment to
achieve modelling advances that lead to enhanced forecast services. The science invest-
ment comes from the Bureau and its local partners in the Commonwealth Scientific and
Industrial Research Organization (CSIRO) and Australian universities and international
partners. The Bureau is a member of the Unified Model (UM) partnership [9], which is led
by the UK Met Office.
These partnerships bring together the required breadth of science, observations, and
modelling expertise to develop global data assimilation and forecast models, high-resolution
convection-permitting models, and the forthcoming multi-week and seasonal coupled climate
models. Today the Bureau assimilates data from more than 30 different satellite data streams,
surface observations, and aircraft and balloon observations, with radar observations coming
next. All these capabilities speak to the sophistication of numerical prediction modelling and
to why the Bureau needs such scientific partnerships, observation networks, and computing
capability to continuously deliver better products.
The Bureau strategy is to focus on customer needs to deliver more accurate and trusted
forecasts through its High Performance Computing (HPC) and numerical prediction capa-
bility. Australian businesses, agriculture, mining, aviation, shipping, defence, government
agencies, and citizens are all beneficiaries of more timely and accurate weather forecasts,
the multi-week climate outlooks, seasonal climate and water forecasting, and climate change
projections. The Bureau’s customers have an interest in decision-making across many time-
scales, and value an ability to change decisions (and derive value) well beyond the one-week
lead time typically associated with weather forecasts.[12]
The size of the HPC system depends on several factors: the number of modelling suites;
the cost and complexity of the numerical weather prediction models, arising from finer grid
resolutions; the need to consider a range of probable future atmospheric states (ensemble
modelling); and the need to couple physical modelling domains (i.e. atmosphere, ocean, sea
ice, land surfaces) to better capture physical interactions, leading to improved simulations
and forecast skill. Typically, the numerical prediction models are sized to the available
computing capacity, thus constraining the modelling grid resolution.

1.2.1 Program Background


In-house computing came to the Bureau of Meteorology in 1967 with the commissioning
of an IBM 360/65. Prior to this, weather computations were performed from 1956 using the
Barotropic model running on CSIRAC, Australia's first digital computer. From 1964 the Bureau
used the CSIRO CDC 3600 based in Canberra. A year later, in 1968, a second IBM 360/65 was
installed and run in parallel.
systems that they were displayed on the ground floor of the World Met Centre building in
central Melbourne with floor to ceiling windows for public viewing. The machines remained
on display until the Bureau moved offices in 1974. The systems were replaced in 1982 by a
Fujitsu M200 mainframe.
The Bureau's "supercomputer" era started in 1988 with the arrival of an ETA 10-P
"Piper". A second ETA 10-P en route to the Bureau was damaged by a forklift on the
loading dock; it was unfortunately irreplaceable. Following ETA's reincorporation into the
Control Data Corporation in 1990, CDC replaced the system with a Cray X-MP, and then in
1992 a Cray Y-MP was acquired. The Cray platform delivered the Bureau's Global
Atmospheric Spectral Prediction (GASP) model at 250km and a 75km regional atmospheric
model nested in the global model.
In 1997, the CSIRO and the Bureau collaborated again, forming the High Performance
Computing and Communications Centre (HPCCC), which jointly operated a succession of
NEC SX systems until 2010. The SX-5 further developed the global and regional models,
and the SX-6 finally delivered global and regional resolutions of 80km and 37km respectively.
The Bureau then replaced the NEC system with a Sun Constellation in 2009. This
was acquired through a joint procurement with the National Computational Infrastructure
(NCI) at the Australian National University – NCI is a collaborative research computing
facility supported by the Australian Government. An Oracle HPC 6000 “Ngamai” then
replaced the Bureau’s Sun HPC system in 2013. The computational power of the Sun and
Oracle machines facilitated continuous improvement of the real-time forecast/assimilation
models to deliver a seasonal climate model at 250km resolution, global model at 25km,
regional model at 12km, and city models at 4km.
In 2013, the Bureau entered the National Computational Infrastructure (NCI) part-
nership that forms part of the Bureau’s ongoing commitment to the scientific community,
working to advance weather and climate research and development within the region. It has
facilitated the adoption of a software life cycle process for numerical prediction products;
the Australian Community Climate Earth System Simulator (ACCESS) [4] and the Unified
Model/Variational Assimilation (UM/VAR) weather modelling suites as well as ocean and
marine modelling suites.
This continuing development of the Bureau's numerical modelling and prediction products
has delivered an operational service meeting the routine and real-time demands of 24x7x365
weather, climate, ocean, and hydrological forecast services.
In July 2015, to meet the growing demand, the Bureau entered into a contract with Cray
to acquire a Cray XC40 system called "Australis" to support its operational HPC requirements
for improved numerical prediction and forecast services. The Cray computational system
married Intel processors, the Lustre and Network File System (NFS) filesystems, and the PBS
Professional job scheduler to provide the backbone of the Bureau's HPC capability.
The computational power of Australis facilitates improvements of the forecast/assimilation
models to deliver a seasonal climate model at 60km resolution, global model at 12km, re-
gional model at 4km, and city models at 1.5km. Australis also provides ensemble modelling
capability to enable probabilistic forecasts to improve decision support systems.
The Bureau's current HPC platforms consist of several systems: an Exemplar, used by
system administrators to test system upgrades and patches; a small Development system for
scientists and software developers called "Terra"; and the Australis operational system, a
mission-critical system for severe weather forecasting and environmental emergency
response.

1.2.2 Sponsor Background


The Bureau is an Australian Government funded organisation. For the duration of its
109 years it has been mandated to provide the expertise and services needed to assist
Australians in preparing for and responding to the harsh realities of the country’s natural
environment.
Through regular forecasts, warnings, monitoring, and advice spanning the Australian
region and the Antarctic territory, the Bureau now provides one of the most widely used and
fundamental services of government. In recent years the Bureau’s position, both regionally
and globally, as a provider of weather products in the Asia Pacific region has seen its
profile rise in the governmental sphere. The importance of the HPC platform delivering
time-sensitive weather predictions is widely recognised.
The HPC system needs to support the Bureau's mission, with many aspects of the system
being highly available and resilient to meet the critical forecast service requirements. This
has resulted in a unique supercomputer configuration, dedicated to delivering high quality,
timely products on a 24x7x365 schedule.
The amount of observation data available to the Bureau has increased dramatically over
recent years, predominantly satellite imaging with its increasing resolution of 1km or less.
Local and international observations are also universally collected using automatic weather
stations and shared through global networks with the Bureau for our weather forecast
models.
A consequence of this is the ability for the models to run at a higher grid resolution
and produce greater accuracy. The previous Bureau HPC platforms were found to be consistently
over-committed with operational and research computing halfway through their life. In common
with other Met centres, we have observed over many cycles that the system starts its life
with 25% of capacity dedicated to operations and 75% to research and development, and ends
with 80% for operations and 20% for research. This results in a constrained capacity
for research and development projects that delays improvements until the next investment
cycle.
In response to this, the Bureau changed its strategy to separate research and operational
computing investments. Research computing moved to a collaborative national peak facility
at NCI in 2013. New Government funding was obtained in 2014 for the replacement of
the Bureau’s existing Oracle HPC system with one delivering the computing capability
to improve its numerical weather prediction applications and forecast services for severe
weather events through improved accuracy, more up-to-date forecasts, an increased ability
to quantify the probability of forecast outcomes, and the ability to respond on demand to
extreme weather and hazard events as they develop.
Within the Bureau, the HPC platform sits in the Data & Digital Group of the organ-
isation. The weather products are developed by the Research and Development branch in
a collaborative relationship with the HPC technical team and National Forecast Services.
This relationship of a scientific need meeting a technical service has been the internal driver
for the system’s upgrades.

1.2.3 Timeline
The timeline for the latest system design, development, procurement, installation, and
use is shown in Table 1.1 below.

TABLE 1.1: Australis Implementation Timeline.

Date        | Milestone Description                                                | Reference Name
15 Jul 2015 | Completion of Supercomputer Procurement and Issue of Official Order for the Australis System; Data Centre preparations for the water-cooled Cray | Contract Signing with Cray Inc.
25 Nov 2015 | Cray XC40 site preparations completed, system delivered, and installation commenced | Installation commencement
15 Mar 2016 | Supercomputer acceptance testing completed, including a 30 day availability test meeting 30 day service levels; Cray handover of the system to the Bureau, with Cray to meet service levels and provide 24x7 support; software porting from the existing HPC system to the new Cray XC40 began following acceptance | Australis System Readiness Completed
30 Jun 2016 | Commissioning of the replacement Supercomputer for routine 24x7 operational forecast services; operational readiness acceptance testing and trials by the Bureau completed | Australis Commissioning

1.3 Applications and Workloads


The dominant applications on the system are focused on numerical weather prediction.
Weather forecasts are produced using Numerical Weather Prediction (NWP) model outputs.
These models deliver critical guidance within the core forecast process and in the direct
generation of some products.
The Bureau runs an extensive suite of models, across multiple domains (atmosphere,
ocean, land, water), and across multiple spatial scales and forecast lead times (from hours
to days to weeks to months). Bureau HPC capability and infrastructure is built to ensure
both a very high level of reliability of forecast generation and timeliness of delivery - HPC
attributes that distinguish this workload from those in other fields, such as research.
Improvements in the forecast quality of NWP over the decades have been driven by
three key factors:
1. Improved understanding of atmospheric physics, and how that understanding can be
encapsulated in a numerical model;
2. Use of more observation data and observation types, together with increasingly so-
phisticated mathematical methods in the “Data Assimilation” process that generates
the initial atmospheric state from which a forecast simulation is produced;
3. Increasing HPC capacity, which has enabled models to run at higher-resolution to
better resolve physical features and processes within a given production time-window.
Our daily runs include: Global NWP, Global Wave, Global Ocean, Australian Regional
NWP, Regional Wave, and six regions of high-resolution NWP models for Victoria/Tasmania,
New South Wales, Queensland, South Australia, Western Australia, and the Northern Territory.
A single high-resolution convection-resolving NWP model of the Australian continent is
desirable but computationally unattainable due to resource costs. Antarctic forecasting
currently uses the Global NWP model for guidance. Our severe weather modelling consists of
tropical cyclone prediction, fire weather prediction, flood forecasting, and environmental
emergency response to chemical, volcanic, nuclear, and bio-hazard events. Additional modelling
runs for global climate forecasting use the Predictive Ocean Atmosphere Model for Australia
(POAMA) ensemble with a 250km grid resolution, and a new ACCESS coupled climate model with a
60km grid is being readied for multi-week and seasonal climate forecast services. Further
predictive modelling includes ocean tides, storm surge, tsunami, coastal ocean circulation,
space weather, hydrology, and stream flow.

1.3.1 Highlights of Main Applications


The Numerical Weather Prediction (NWP) models focus on predicting the future state
of the atmosphere on timescales from hours to days, and are one of the key domains sup-
ported by Bureau HPC capability. Many of the models are refinements of the Australian
Community Climate and Earth-System Simulator (ACCESS)[4].
In common with many other applications of Computational Fluid Dynamics, NWP mod-
els work by approximating the known (but largely unsolvable) partial-differential physics
equations that govern fluid flow and energy, by a very large set of (solvable) algebraic and
numeric equations, through a process of "discretisation": breaking the atmosphere up into
small volumes ("grid-cells"), in a similar way to how a digital camera approximates an image
as an array of pixels.

FIGURE 1.2: The Bureau's Numerical Prediction Value Chain

The smaller the grid-cells, the higher the fidelity and the more accurate the detail the
model can produce, but higher resolution comes at a greater computational cost. A rule of
thumb is that doubling the horizontal grid resolution results in roughly a 10-fold increase
in compute cost: halving the grid spacing quadruples the number of grid cells, and the
shorter timestep required for numerical stability roughly doubles the number of timesteps.
To best manage this detail/cost trade-off, the Bureau runs a number of NWP systems across
spatial resolutions and forecast lead times, including:
“ACCESS-G” – the Bureau’s global NWP system. This model covers the entire globe
at a resolution of approximately 25km today, 12km in development, and produces forecasts
out to ten days. It answers questions around large-scale meteorology, such as “where is that
cold-front likely to be five days from now?” A new 18-member global ensemble system is in
development at a resolution of 33km to derive probabilistic forecast information for feeding
decision support systems.
“ACCESS-R” – the Bureau’s regional NWP system covers mainland Australia and a
significant expanse of surrounding ocean at a resolution of approximately 12km today,
4km in development. ACCESS-R produces forecasts out to three days. Whilst ACCESS-R
does not simulate the atmosphere outside its region, it is influenced by it and references
the “boundary data” provided by ACCESS-G. Thus, ACCESS-R is a nested, downscaled
system of ACCESS-G, and its forecast quality is dependent on the quality of the ACCESS-
G forecast. ACCESS-R is also focused on large-scale meteorology, but its higher resolution
enables it to better address questions such as “Are the winds associated with the cold front
likely to intensify over the next 24 hours?” or “How is the motion/structure of that cold
front likely to change as it moves over a mountain range?”
“ACCESS-C” – the Bureau's city-scale NWP system covers the country's major population
centres at a resolution of approximately 1.5km. It produces forecasts out to 36 hours.
It depends on ACCESS-R for its boundary data, as well as specification of the initial state
of the atmosphere. At this high-resolution, the model begins to have an ability to mimic,
though not perfectly simulate, aspects of weather at local scales. Whereas ACCESS-G and
ACCESS-R provide a longer-range view as to the large-scale conditions that may favour
thunderstorm formation in a particular region, ACCESS-C better simulates the thunder-
storm itself, the weather near coastal regions and mountains and projected rainfall for
streamflow and flood forecasts. A new 18-member city ensemble system is in development
at a resolution of 2.2km to derive probabilistic forecast information for feeding decision
support systems.
“ACCESS-TC” – the model optimised for forecasting the behaviour of tropical cyclones.
A new deterministic model at 4km resolution will produce more accurate forecasts for
tropical cyclone path, intensity, and structure including when and where it will cross the
coast.

FIGURE 1.3: Numerical Weather Prediction Cascading Domains - 25km Global Model (2x daily,
10-day forecast); 12km Regional Model (4x daily, 3-day forecast); 1.5km grid City/State
Model, e.g. Sydney, NSW (4x daily, 36-hour forecast)

Cascading or “coupling” of individual NWP models, and now ensemble modelling, has
placed additional stress on HPC capacity in terms of forecast timeliness and peak demand
for the compute and data storage resources. This characteristic typically sets the compute
resource capacity limits or size of the system.

1.3.2 2017 Case Study: From Nodes to News, TC Debbie


On 22nd March 2017, a weak area of low pressure developed over the Coral Sea, in
the Milne Bay Province of Papua New Guinea. Using a consensus of computer models
from around the globe, including the ACCESS-G and ACCESS-R running on Australis,
meteorologists in the Bureau of Meteorology’s Tropical Cyclone Warning Centre in Brisbane
were able to assess the potential for the tropical low to develop into a tropical cyclone and
the possible directions that the system might move over the following days. The Bureau’s
embedded meteorologists at Queensland Fire and Emergency Services were also able to use
the computer model outputs to provide briefings on the range of scenarios that could occur
to assist state and local government planning.
Following an assessment of the computer model guidance available at the time,
meteorologists in the Brisbane Tropical Cyclone Warning Centre decided to issue the first
Tropical Cyclone Watch for the developing tropical low on 24th March. Moving into the
25th March, the Bureau of Meteorology declared that the system had formed into a tropical
cyclone and as a result gave it the designation of Debbie[2]. The declaration of the sys-
tem as a Tropical Cyclone led to the initiation of the ACCESS-TC model, also running on
Australis. The system was heading south.

FIGURE 1.4: Path of Tropical Cyclone Debbie, March 25-29, 2017 (map © Commonwealth of
Australia 2017, Bureau of Meteorology)

Working with a combination of automatic weather station observations, satellite in-


formation, and data from the various weather models, meteorologists continued to make
projections of the likely development and trajectory of the system. Further warnings were
issued and possible evacuations were considered by emergency authorities. Passing over the
Coral Sea the system had developed into a Category 2 cyclone by 26th March. On 27th
March Debbie strengthened quickly from a Category 2 to a Category 4 severe tropical
cyclone as it continued heading toward the Queensland coastline.
The storm then continued developing until it crossed the coastline at Airlie Beach at
midday on 28th March with sustained winds of 195 km/h. Bureau observing equipment on
Hamilton Island Airport was damaged by the storm at around 11am. Prior to this, a peak
wind gust of 263 km/h was recorded; this being the highest ever wind gust recorded in
Queensland.
All the while Bureau staff were using the predictions from the ACCESS-G, ACCESS-R,
ACCESS-C, and ACCESS-TC systems to update forecasts and warnings for communities
in the expected path of the storm, and for those likely to experience damage to property
and danger to life. Using the new Australis system, forecast model runs were produced every
6 hours, using guidance from the newest higher-resolution ACCESS-C2 model covering the highly
populated area of southeast Queensland. The model output provided guidance on potential
rainfall totals across southeast Queensland; this gave important input into decisions
surrounding Severe Weather Warnings and Flood Warnings.
Ex-tropical cyclone Debbie tracked south then southeast over the Sunshine Coast and
Brisbane during the afternoon and evening of Thursday, 30th March. The storm continued
to move south across Queensland and into New South Wales; the forecasting responsibilities
moved to the regional office in Sydney. Debbie finally left the Australian mainland on 31st
March. As a severe weather system it continued across the Tasman Sea, where it caused
further significant flooding in the Bay of Plenty region of New Zealand on 6th April[6].

1.3.3 Benchmark Usage


The Bureau uses benchmarks for the HPC platform in procurement, system balance and design,
application performance work, system acceptance and diagnostics, routine sustained system
performance reporting, contractual measurement of system performance, and assessment of new
technology for future investments. From an operational viewpoint, with a system focused on
delivering a high quality product 24x7x365, routine benchmarks of sustained system performance
are used to assess the overall behaviour of the operational environment: scheduler loads and
system resourcing, computational performance, network and storage performance, power
consumption, and operating system and application runtime performance.
With such a continuous and complex workflow, we found placing the focus on measuring
the ongoing performance of the system using SSP (Sustained System Performance) to be
most operationally beneficial.

1.3.4 SSP - Monitoring System Performance


An ongoing concern when running an operational supercomputer is detecting any issues in
the environment that might affect the performance of the system. While the Bureau runs a
fairly predictable job mix, those jobs are not well suited to detecting issues in the system.
Most jobs utilise iterative solvers or handle a varying amount of data, all of which makes
the runtimes of most jobs variable. A hidden issue in the system causing a slowdown might
not be picked up while monitoring the runtime of the operational workload on Australis. For
instance, a sensor or power supply issue is known to force the downclocking of a processor,
and hence to degrade application performance.
A common approach to monitoring system performance is to use standard benchmark metrics
such as floating point performance, memory bandwidth, network performance, and IO bandwidth.
However, none of these benchmarks gives a holistic view of the overall HPC system; each one
only captures certain aspects of it.
The Bureau uses a different approach for monitoring the performance of the system by
defining a set of five typical applications run as a standard set of benchmarks. This set
represents the mix of applications running on the system routinely, but uses a fixed set of
input data. The benchmarks in this set are three different UM (Unified Model) simulations
at different resolutions (from a low-resolution climate model to a global model), a data
assimilation, and an ocean simulation, as illustrated in Table 1.2.

TABLE 1.2: SSP Benchmarking.

Benchmark                            | Runtime (seconds) | Number of cores | Performance per core
1. UM N1024L85 (12-hour simulation)  | 828.754           | 2088            | 0.00208
2. UM N512L85 (3-day simulation)     | 578.409           | 1656            | 0.003758
3. UM N216L85 (5-day simulation)     | 296.314           | 1320            | 0.009204
4. 4DVAR N320L70                     | 1253.057          | 1536            | 0.00187
5. OFAM3 (1-day simulation)          | 409.443           | 480             | 0.018318
Overall SSP value (25,614 cores)     |                   |                 | 123.45

The runtime of each of those five benchmarks is used to compute a performance-per-core
value (column 4). These five performance values are then averaged using a geometric mean and
multiplied by the number of cores in the system, resulting in one overall SSP (Sustained
System Performance) number.
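As an illustration of the aggregation step, the figures in Table 1.2 can be combined as
follows. This is a sketch only: the per-core performance values themselves are derived from
benchmark-specific baselines that are not reproduced here, the core count of 25,614 is simply
taken from the table, and the 5% tolerance in the check is illustrative rather than the
Bureau's operational threshold.

    import math

    # Per-core performance values from Table 1.2 (column 4).
    per_core_perf = [0.00208, 0.003758, 0.009204, 0.00187, 0.018318]

    # Core count used for the overall figure (taken from the last row of Table 1.2).
    TOTAL_CORES = 25614

    def ssp(per_core_values, total_cores):
        """Geometric mean of the per-core values, scaled by the system core count."""
        geo_mean = math.exp(sum(math.log(v) for v in per_core_values) / len(per_core_values))
        return geo_mean * total_cores

    baseline = ssp(per_core_perf, TOTAL_CORES)
    print(f"Overall SSP ~ {baseline:.1f}")   # roughly 122, close to the tabulated 123.45
                                             # (difference due to rounding of published values)

    # A weekly SSP run can then be compared against the baseline; a drop beyond the
    # normal run-to-run variation flags a possible system-level slowdown.
    def degraded(current_ssp, baseline_ssp, tolerance=0.05):
        return current_ssp < baseline_ssp * (1.0 - tolerance)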
Because each simulation uses the same input data, the runtime of each benchmark should
show very little variation, and consequently the overall SSP figure should stay within a
narrow band. Any significant change in the runtime behaviour of any of those applications
would result in a significant change in the reported SSP figure, indicating that the system
has an issue causing runtime degradation.
The full SSP suite is run once a week at a quiet time on the system. The individual
runtimes, as well as the overall SSP figure, are reported monthly by Cray; these values are
monitored by the Bureau to look for any degradation of the overall system performance.
A similar SSP setup was used on the Bureau’s previous supercomputers. Some interesting
issues discovered included:
1. A 7% performance loss was detected over a six-month period. While an OS update
would have likely solved this issue, the associated risk of an OS update (loss of official
support since newer kernels might not yet be certified to work with other components
of the system, and the risk of introducing new problems) prevented an OS update
from happening. Instead it was decided to reboot all nodes regularly, which solved the
performance slowdown observed by the SSP.
2. A BIOS update contained an incorrect setting (hardware prefetch was disabled). The
BIOS update was rolled out as nodes were rebooted. The SSP value early on indicated
a system issue. Correlating the used nodes with lower SSP results soon indicated that
recently rebooted nodes showed the slowdown, and closer analysis resulted in detecting
the changed BIOS setting.

SSP tests are also used to evaluate new system software. The approach is somewhat
different from the weekly SSP system tests; the tests will only be run on demand, i.e. when
a new version of software is installed and needs to be evaluated (typically before it is made
available on the system). In this scenario it is rare for the same binary to be used more
than once (except to make sure we are getting statistically significant results). In contrast,
the system-testing SSP suite will keep the same binaries for repeated cycles.
Due to the difficulties involved in verifying an application, running suites tend not to
update to a newer compiler or system library until absolutely required. The SSP suite
mirrors this and keeps on running with the previously compiled binary, in order to accurately
reflect the mix running on the system. Once enough newly compiled software is running on
the system, the system SSP suite is recompiled, and a new baseline is established.

1.4 System Overview


The Bureau of Meteorology relies on multiple supercomputing environments supporting its
complete Software Development Lifecycle (SDLC): Research & Development, Performance Testing,
Trial and Verification, Pre-Production, and Production stages. The hardware supporting these
environments is located in either the Bureau's Data Centres or the National Computational
Infrastructure (NCI), and its parts are defined as either physical or logical partitions of
the system depending on their function.
The Research and Development teams in collaboration with our partners at the NCI
facility conduct scientific studies to understand, develop, and validate weather and climate
science using the national peak computing facility and their 1.7 Petaflop machine called
Raijin.
The Pre-Production and Production computing is performed on our Cray XC40 super-
computer called Australis. Performance model testing and validation, and trial and verifi-
cation during Pre-Production is performed on the Australis Staging partition. Production
modelling is performed on the Australis Production partition, with resilience provided by
the Australis Staging partition.
In the near future the Bureau will promote another machine, called Aurora, into Production
to deliver post-processing facilities for Australis. The work currently underway is discussed
in Section 1.12.

1.4.1 System Design Decisions


The Cray XC40 system has been sized to support a maximum forecast load that includes
enhanced 3-hour model runs, concurrent with model processing for exceptional weather
events, comprising 3 tropical cyclones and 4 severe weather events.
The HPC design assumed a single Supercomputer system segmented into two parts.
Each part is a Cray XC40 and is housed in its own room in the Bureau's data centre. One
part is designated for operational use and the other for Pre-Production (Staging).
Each part is designed to operate (power up or power down) independently of the other to
allow maintenance to be performed without interrupting operational jobs running on the
system.
The Bureau designates its HPC cluster members as Australis East and West, with either
member able to run the Staging or the Production function. The batch and suite schedulers run
on servers external to the HPC enclosures. Using high availability configurations, the
schedulers are able to survive single points of failure independently of Australis and can
maintain the operational workflow into the HPC systems. This element of the configuration is
seen as key to the HPC system meeting its uptime objective.

FIGURE 1.5: HPC Environment at the Bureau

The operational benefit of this arrangement is that if an unplanned fault occurs on the
part running the Bureau’s operational services, the system administrators are able to move
operations to the other part and restart the last computational jobs and thus minimise the
effect of unplanned outages on the Bureau’s operations. This design allows the Bureau’s
numerical forecast services to achieve a 99.86% uptime service level, a figure that equates
to less than 1 hour of downtime per month.

1.5 Hardware Architecture


The Australis platform was envisaged as a two-phase deployment. The first phase, delivered
in Q4 2015 and operational at the end of Q2 2016, boosted the Bureau's compute capacity more
than 16-fold over the system it replaced. The second phase, scheduled for deployment in 2018,
will double the capacity again. The final system, which follows a heterogeneous design, will
deliver a fit-for-purpose platform for the Bureau's operations.
In addition to Australis, there are two further XC40 systems: a development platform
currently with 7.5% of the capacity of Australis, and an exemplar system designed specifically
for testing settings and operating system updates before they are applied to Australis.

1.5.1 Australis Processors


All the nodes that run NWP jobs use Intel Xeon "Haswell" 12-core E5-2690 v3 2.60GHz
processors. Applications are compiled with the Haswell-specific feature set selected.

FIGURE 1.6: Australis Platform Logical Architecture

This processor was selected as it delivered the best performance when running the SSP
benchmark. The use of hyperthreading is enabled on a per-job basis; thus an application can
utilise either 24 physical cores or 48 virtual cores per node.
The service blades for the XC40 system use Intel Xeon Sandy Bridge processors; therefore
these are not used to run any NWP jobs.

1.5.2 Australis Node Design


There are three classes of nodes in Australis.

1.5.2.1 Australis Service Node


The first is the service blade, which is used to allow the machine to operate. Each service
blade can hold two servers with IO slots. Service blades fulfill a number of different roles:

1. Boot and SDB nodes - these nodes provide the boot function and the System Database.
2. Network Router nodes (Net) - these nodes route packets between the Cray Aries network
   and the corporate network.
3. DVS - Data Virtualisation Service nodes provide a method to mount external NFS file
   systems (our NetApp FAS) into the Aries network.
4. RSIP - Realm Specific IP-Address nodes provide a service similar to Network Address
   Translation. They allow nodes on the Aries network to send packets to, and receive return
   packets from, our HPC services. RSIP is typically used for lower-level communications
   like DNS, LDAP, software license management, workflow status, etc.
5. LNET - these nodes route Lustre file system traffic from the Aries network through to
   the Sonexion Lustre appliances via InfiniBand.

1.5.2.2 Australis Compute Node


The other type of blade is a compute blade. Each compute blade has 4 compute nodes.
These Extreme Scalability nodes are used for the following:
1. Parallel compute jobs – the OS runs a minimal operating stack image.
2. MOM nodes - Machine Oriented Miniserver nodes. These are the nodes that launch
the parallel compute jobs. They run PBS daemons to facilitate this and use ALPS to
launch the jobs across the parallel nodes. They utilise an API to communicate with
ALPS.
3. MAMU nodes - Multiple Application, Multiple User. These nodes provide the capabil-
ity for small jobs from 1 to 24 cores. Jobs cannot span across multiple MAMU nodes.
More than one job and more than one user can be utilising a MAMU node at a time.

TABLE 1.3: Australis Server Configuration.

Feature Australis (2015)

Node Architecture Cray XC40


CPU Intel Xeon E5-2690 v3
CPU microarchitecture Haswell
CPU Frequency (GHz) 2.60
CPU Count per Node 2
CPU cores per node 24
Node Memory Capacity (GB) 128
Node PCIe Gen 3
Interconnection Network Aries
Compute Racks 12 (liquid cooled)
Number of compute nodes 2176
Number of MAMU nodes 136
Peak FLOP Rate (TF) 1660
Number of service nodes 64
Number of external nodes 16

1.5.3 External Nodes


A number of nodes are implemented outside the Cray XC40 chassis, with each performing
a specific function.
1. Login nodes – these are the external facing servers for users to login to the system
(via jump boxes). They also provide access to PBS queues for automatic builds of
software in all but the Production environment.
2. DataMover nodes – these external facing servers have 40GbE connections to our cor-
porate data Staging network. This is where large file transfers into and out of the
supercomputer environments take place. It is also where the backup copy of data
from the Production disks to the Staging disks is performed. This leaves the compute
nodes free to do computation while the DataMovers handle the storage-based
operations.
3. PBS nodes – These nodes are located externally to the Australis cabinets and are
used to schedule the jobs in both the Production and Staging XC40 machines. More
information is given in section 1.6.3 ‘Schedulers’. The design factors leading to the
decision to locate the PBS nodes externally were discussed in Section 1.4.1
Australian Government Bureau of Meteorology 17

4. Suite Scheduler nodes – SMS and Cylc services provide routine operations workflow
management, event triggers, and scheduling for Australis.
All the nodes located externally to the XC40 cabinets are managed by Bright Cluster
Manager.

1.5.4 Australis Memory


The system memory architecture uses commodity 32GB DDR4 2133MHz DIMM modules with
chipkill advanced ECC[3]. Each processor uses 4 memory channels with one DIMM per channel,
giving a measured memory bandwidth of 59GB/s per processor, or 118GB/s per node. The RAM is
directly attached to the Intel Xeon processors. Jobs that use multiple nodes do so via the
Message Passing Interface (MPI); there is no global RAM space defined.
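For context, the measured figures quoted above can be compared against the theoretical peak
of this memory configuration. The short calculation below is illustrative and assumes the
standard 64-bit (8-byte) data path per DDR channel.

    # Theoretical peak memory bandwidth for the node configuration described above.
    CHANNELS_PER_SOCKET = 4          # memory channels per processor
    TRANSFER_RATE = 2133e6           # 2133 MT/s DIMMs
    BYTES_PER_TRANSFER = 8           # 64-bit data path per channel (assumed)
    SOCKETS_PER_NODE = 2

    peak_socket = CHANNELS_PER_SOCKET * TRANSFER_RATE * BYTES_PER_TRANSFER / 1e9
    peak_node = peak_socket * SOCKETS_PER_NODE
    print(f"Peak: {peak_socket:.1f} GB/s per socket, {peak_node:.1f} GB/s per node")
    # ~68.3 GB/s per socket and ~136.5 GB/s per node, so the measured 59/118 GB/s
    # corresponds to roughly 86% of peak, typical of STREAM-like measurements.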

1.5.5 Australis Interconnect


Australis uses a Cray custom interconnect, called Aries, inside the XC40 cabinets. It
utilises a Dragonfly topology that incorporates three different layers of connectivity[5]. The
first and lowest latency is the chassis interconnect and connects only within a chassis (each
chassis houses 16 blades). The next level is an electrical group that uses copper interconnects,
and for the XC40 liquid cooled system it connects 6 chassis together. The final level is based
on optical fibres and connects each electrical group to all the other electrical groups. All
ethernet traffic to and from each system is routed via physical firewall appliances. The Test
and Development systems do not use Aries to communicate directly with the Staging and
Production facilities.
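Taken together, the figures above imply the following rough per-level capacities for compute
blades. This is a back-of-the-envelope calculation only; service blades carry two nodes each,
so the actual populated counts differ.

    # Rough node capacity per level of the Aries Dragonfly topology, using the
    # figures given in the text: 16 blades per chassis, 4 nodes per compute blade,
    # and 6 chassis per electrical group on the liquid-cooled XC40.
    BLADES_PER_CHASSIS = 16
    NODES_PER_COMPUTE_BLADE = 4
    CHASSIS_PER_GROUP = 6

    nodes_per_chassis = BLADES_PER_CHASSIS * NODES_PER_COMPUTE_BLADE     # 64
    nodes_per_group = nodes_per_chassis * CHASSIS_PER_GROUP              # 384
    print(nodes_per_chassis, "nodes per chassis,", nodes_per_group, "nodes per electrical group")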

1.5.6 Australis Storage and Filesystem


Australis utilises four Lustre file systems. Each Lustre file system is contained within
a Cray Sonexion 2000 appliance, delivering Australis a total of 4PB. The Lustre storage
network is currently implemented using FDR (Fourteen Data Rate) InfiniBand, while future
upgrades will deploy EDR (Enhanced Data Rate) InfiniBand. This will double the Lustre
storage capacity and bandwidth/throughput.
In the current Australis system, two of the Sonexion storage systems are twice the size
of the other two. One large and one small Sonexion storage system
are configured as a pair; one pair is used for the Production system and the other for
the Staging system. The Staging storage pair also provides space for a limited copy of the
Production data. The storage systems are accessed via symbolic links so that the Production
data location can be redirected by simply reallocating the links. This design also allows for
maintenance of the storage systems as well as recovery from unplanned outages.
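The symbolic-link redirection described above can be sketched as follows. The paths and link
name are hypothetical rather than the Bureau's actual layout; the point of the pattern is
that the new link is built alongside the old one and renamed into place, so processes
resolving the link never see it missing.

    import os

    # Hypothetical mount points for the two Lustre storage pairs and the link the
    # operational suites resolve when reading or writing Production data.
    PRODUCTION_FS = "/lustre/pair_a"
    STAGING_FS = "/lustre/pair_b"
    ACTIVE_LINK = "/data/prod_current"

    def repoint(link_path, new_target):
        """Repoint a symlink without leaving a window in which it does not exist."""
        tmp_link = link_path + ".tmp"
        if os.path.lexists(tmp_link):
            os.remove(tmp_link)
        os.symlink(new_target, tmp_link)
        os.replace(tmp_link, link_path)    # atomic rename over the existing link

    # During a storage failover or maintenance window, Production data access could be
    # redirected to the other pair (and back again during recovery):
    # repoint(ACTIVE_LINK, STAGING_FS)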
Applications replicate data from the primary to the secondary Lustre file system at
an appropriate point in each run cycle. Typically this occurs at application restart, next
start, or when results are produced. The majority of the data on the Lustre file systems is
considered to be transient and as such is not backed up, apart from the previously mentioned
replicated data. The NWP Model Suite also archives data to one or more systems external
to the Supercomputer. The Models are also aware of the status of each Sonexion file system
and will either perform their data copy or abort that step when only one of the Production
or Staging storage services is available.
The Cray Sonexion product is basically an appliance, and updates to it are performed
mostly while the file system is offline. This means that storage failover methods are utilised
during the updates.

Two NFS servers support the Cray XC40s by providing a small amount of persistent
storage for critical software and data. The home directories on the XC40 computers are lo-
cated on the NFS file systems; data protection is implemented using file based snapshots, file
replication, and traditional backups. The home directories’ file systems hold the persistent
data required by the supercomputer to run the Production workload.

TABLE 1.4: Australis Storage Configuration.

Feature Australis (2015)

Global Parallel Storage Architecture Cray Sonexion 2000


Storage Filesystem Lustre 2.5
Interconnection FDR InfiniBand
Storage Racks 4
Total number of OSS 36
Storage Capacity 4320TB
Storage Bandwidth 135 GB/s
Storage Gateway to Aries method LustreNet Routing
Total number of LNET routers 40
Shared NFS Storage Architecture NetApp FAS 8040 Clustered
Pair
Storage Filesystem (other) NFS 4.0
Interconnection (NFS) Ethernet (10/40Gbit)
Storage Gateway to Aries method Cray DVS
Gateway nodes 4

1.6 System Software


1.6.1 Operating System
Australis currently uses a Cray customised version of SuSE Linux Enterprise Server
11 SP3 for extreme scalability applications and a full SuSE Linux for MOM and MAMU
workflows. This environment, called ‘CLE’, is currently running Version 5.2 UP04. This is
used on all the nodes including the management nodes. Future upgrades will include an
update to SLES 12 and CLE 6.0.

1.6.2 Operating System Upgrade Procedure


As previously noted, the Bureau HPC system must be highly available for its 24x7x365
mission. When needed, operating system updates are applied first to the Test system, then to
Staging, and then promoted into Production on Australis. The Australis upgrade process is
depicted in Figure 1.7. Staging is updated first; when that partition achieves stability,
Staging is suspended and the Production environment is made active on the just-updated
partition. Provided this partition remains stable, the remainder of the system is updated and
the Staging workload resumes on the last upgraded partition. This method provides a fail-back
option in the case of issues with Production: it is quicker and easier to fail back than to
attempt to roll back all the updates that were applied, although sometimes that course of
action may be the option chosen.
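The role swap at the heart of this procedure can be modelled with a small sketch. This is
illustrative only; the partition names and the upgrade/stability hooks are hypothetical
stand-ins for the real administrative steps and checks (which include the SSP runs described
in Section 1.3.4).

    # Illustrative model of the rolling-upgrade sequence described above.
    def rolling_upgrade(partitions, upgrade, is_stable, swap_roles):
        """partitions: e.g. {"staging": "West", "production": "East"}."""
        # 1. Upgrade the Staging partition while Production keeps running.
        upgrade(partitions["staging"])
        if not is_stable(partitions["staging"]):
            return "abort: Staging unstable, Production untouched"

        # 2. Move Production onto the just-upgraded partition.
        swap_roles(partitions)
        # 3. If the new Production partition misbehaves, fail back to the
        #    not-yet-upgraded partition instead of rolling the updates back.
        if not is_stable(partitions["production"]):
            swap_roles(partitions)
            return "failed back to the original Production partition"

        # 4. Upgrade the remaining partition and resume Staging work on it.
        upgrade(partitions["staging"])
        return "both partitions upgraded"

    # Example wiring with trivial stubs:
    parts = {"staging": "West", "production": "East"}
    swap = lambda p: p.update(staging=p["production"], production=p["staging"])
    print(rolling_upgrade(parts, upgrade=lambda x: None, is_stable=lambda x: True,
                          swap_roles=swap))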

TABLE 1.5: Software Configuration.

Feature Australis (2015)

Login Node OS SLES 11 SP3 / CLE 5.2


Compute Node OS SLES 11 SP3 / CLE 5.2
Parallel Filesystem Lustre 2.5.2
Compilers Intel 2016 / 2017
GNU 5.2
Cray Development Toolkit
Message Passing Interface (MPI) Cray MPI
Notable Libraries HDF5
netcdf
Intel Math Kernel Library
Job Scheduler PBS Pro 12.2
Resource Manager ALPS and PBS Pro
Debugging Tools Allinea DDT / Forge
Performance Tools TAU
HPCToolkit
NVIDIA Visual Profiler

FIGURE 1.7: Australis Operating System Upgrade Process

1.6.3 Schedulers
A key tool in the ongoing delivery of the operational HPC systems is our use of schedulers.
Managing the complex daily schedule requires management of both the jobs and the system
resources on the HPC platform.
In operation, the job scheduler has three goals: first the priority scheduling goal, then
the backfill scheduling goal, and finally the job pre-emption goal.
1. The priority scheduling goal is to run the most important time-critical jobs first; the
environmental emergency response, the on-demand severe weather prediction, and
finally the routine weather prediction jobs within the daily production time-window.
2. The backfill scheduling goal is to run the greatest aggregate sum of non-time critical
jobs when computing resources are available, such as climate, ocean prediction, and
reanalysis jobs. This results in the highest utilisation of the system.
3. The job pre-emption goal is to stop the minimum set of jobs required to allow priority
jobs to run immediately. The suspend-resume pre-emption scheduling is a key feature
of our system, which is used to effectively achieve both priority scheduling and backfill
scheduling goals.
The pre-emption scheduling will target backfill jobs that can be suspended in memory when
time-critical jobs are ready to run. When the priority job has completed, the backfill job is
resumed. This means that the elapsed time of a backfill job does not need to fit within an
available time slot in the operational schedule, although the resource requirements for the
backfill job do still need to be met. The large-memory compute nodes make the pre-emption
scheduling achievable. Overall, this allows us to achieve high utilisation of the system while
meeting the production schedule of our business.
The HPC platform currently uses two schedulers (SMS and Cylc) and a workload man-
ager (PBS Professional) to manage the daily work flow. The schedulers are used to feed the
workload manager. Of the 20-plus weather modelling suites the Bureau runs regularly, the
ACCESS suites are the most resource-hungry. Running up to 8 times in any 24-hour period,
they need to be managed alongside the Seasonal, Wave, Ocean, and Ocean forecast models,
as well as a fleet of smaller NWP suites. Currently, the Bureau PBS scheduler runs up to
60,000 jobs per day across Australis production and staging.

1.6.3.1 SMS
Developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), the
Supervisor Monitor Scheduler (SMS) has been the backbone of the HPC delivery platforms
for two decades. Written in C, it allows extensive customisation of task environments. SMS
allows submission to multiple execution hosts using one or more batch schedulers, with suites
scheduled according to time, cycle, suite, task family, and individual task triggering.
The key to the longevity of SMS is that it was always a product developed specifically
to cycle numerical prediction workflows. The 20 years of continual development of SMS
have contributed to the HPC platform's high rates of uptime and timeliness of delivery. Such a long
evolution and refinement has made it a very stable and reliable product. SMS, however, is
now a decade beyond its planned operational lifetime and support from ECMWF has ended; the
responsibility for support and development now falls in-house at the Bureau. The limitations
of its interface and its alert and monitoring connectivity eventually drove a decision to seek
a new workflow scheduler.

1.6.3.2 Cylc
The ACCESS suite of weather prediction jobs that delivers the main products from the
Bureau is based on the Unified Model (UM) modelling software from the UK Met Office.
Cylc, a workflow scheduler [10], is integrated with these models along with the suite
configuration package Rose; both products use Python as their programming language.
Adopting Cylc as the new workflow scheduler will deliver significant time-saving benefits,
with a simplified localisation process for every ACCESS model update. The Cylc service,
like SMS, provides our IT Operators with status and alerts relating to running modelling
suites and product generation. The deployment of Cylc continues and is expected to run
into 2018, with an expected retirement of SMS towards 2020.
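For readers unfamiliar with Cylc, the sketch below shows the general shape of a cycling suite definition in Cylc 7 suite.rc syntax; the task names, cycle points, and script are illustrative and do not correspond to an actual Bureau suite.

    [cylc]
        UTC mode = True
    [scheduling]
        initial cycle point = 20170101T00
        [[dependencies]]
            # Run the same graph of tasks every six hours
            [[[T00, T06, T12, T18]]]
                graph = "fetch_obs => run_model => make_products"
    [runtime]
        [[run_model]]
            # Each task is handed to the workload manager for execution
            script = run_model.sh
            [[[job]]]
                batch system = pbs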

1.6.3.3 PBS Professional


PBS Pro 12.2 is a Cray-specific release of the PBS Pro software that works with the Cray
Application Level Placement Scheduler (ALPS) to manage the system's compute, memory,
and storage resources.
Workflow jobs are passed from SMS or Cylc to PBS Pro for scheduling and execution
when system resources are available. PBS Pro runs virtual Staging and Production queues
to manage the upstream workloads, distributing the jobs across the Staging and Production
platform. PBS Pro also generates the data used to analyse the system utilisation, leading
to improved job management, baseline capacity planning, and future upgrade planning.
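A typical job handed to PBS Pro on the XC40 looks broadly like the following sketch; the job name, queue name, resource request, and executable are hypothetical. ALPS then places the job onto compute nodes via aprun.

    #!/bin/bash
    #PBS -N access_city            # job name (illustrative)
    #PBS -q prod                   # assumed Production queue name
    #PBS -l select=16:ncpus=24     # 16 XC40 nodes, 24 cores each
    #PBS -l walltime=01:00:00
    cd "$PBS_O_WORKDIR"
    # Launch 384 MPI ranks, 24 per node, through ALPS
    aprun -n 384 -N 24 ./model.exe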

1.7 Programming System


Software development for cloud or traditional server deployment has utilised a
"dev/stage/prod" or "dev/test/prod" lifecycle for some time. It has delivered significant
benefits to application resilience, at relatively low cost in those environments. HPC environ-
ments, however, are more complex and costly, and maintaining independent infrastructure
is a challenge for many organisations.
The Australis environment has provided some separation between each stage in the
development lifecycle, allowing the Bureau HPC engineering to benefit from many of the
efficiencies that are inherent to cloud-based engineering models. This has enabled the team
to pursue enhancements to automation and automated testing without risking operational
stability, in particular using a partitioned area of the environment for testing and evaluation
of systems.
Integrating with automated software validation tools such as Jenkins, Artifactory,
Ansible, and PyTest (among many others) has enabled a systematic record of provenance
(knowing exactly what code produced the running binaries and scripts), easier rollback (easy
availability of prior versions), and strong guarantees around the validation of the systems
in operation.
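As a minimal sketch of the provenance idea (not the Bureau's actual tooling), a deployment step can record the exact Git commit that produced a release, and a later validation step can check an installation against that record; the function names, paths, and manifest format here are hypothetical.

    import json
    import subprocess
    from pathlib import Path

    def record_provenance(repo_dir, manifest="provenance.json"):
        """Record the Git commit of the code being released."""
        commit = subprocess.check_output(
            ["git", "-C", repo_dir, "rev-parse", "HEAD"], text=True).strip()
        Path(manifest).write_text(json.dumps({"repo": repo_dir, "commit": commit}))

    def check_provenance(repo_dir, manifest="provenance.json"):
        """Return True if the deployed code still matches the recorded commit."""
        recorded = json.loads(Path(manifest).read_text())["commit"]
        current = subprocess.check_output(
            ["git", "-C", repo_dir, "rev-parse", "HEAD"], text=True).strip()
        return recorded == current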
Git was also adopted for version control for software development and configuration
release. There has been clear value in moving to Git for version control, due to its improved
ability for managing code and merging changes. Supporting multiple team structures is
more easily done with a more powerful version control system.
Adopting these approaches has not been straightforward. There have been challenges
in introducing new ways of working to long-standing teams, and architectural challenges
integrating the automation tools.
Software engineering processes in other industries have achieved a far greater agility
than in the world of NWP systems development. In an agile environment, the time from
the development of a change to its operational implementation can be as little as a day, or
even minutes in some cases. The challenges of managing complex numerical behavior mean
this agility has not yet been achieved in the Bureau’s HPC NWP domain. More work is
required to provide the system modularity required for these approaches to apply fully.

1.7.1 Programming Models


A weather forecast suite consists of more than just the big models; typically there are
several dozen scripts and programs related to pre- and post-processing. Porting a forecast
suite involves two complementary tasks: porting the scripts, and recompiling and tuning the
actual executables. When the new HPC first became available, the Model Build Team had
experience and expertise in current software engineering practice, but they lacked specialist
skills and experience in the porting and tuning of operational software for Cray XC40
computers.
Thus the task of the initial porting of the numerical modelling software and workflows
to Cray XC, and subsequent tuning, fell to the Scientific Computing Services and their
Computational Science team to configure the user and application environment. Its staff
then trained the Model Build team, Research & Development, and National Operations
Centre staff on the specifics of the Cray XC40 environment and compilers so they could
complete the application porting to Australis.

1.7.2 Compiler Selection


There are three different compiler environments available on Australis: the Intel Compiler
Suite, the Cray Compiling Environment, and the GNU Compiler Collection. Based
on past experience, the Intel compiler had been found to deliver better performance than
the GNU compilers. As all the Bureau applications on the previous supercomputer had
been compiled with the Intel compilers, a comparison project between the Intel and Cray
compilers was started. Early results indicated that the porting effort when using the Cray
compiler was significantly higher. Several applications had problems when compiled with
the Cray compiler. This was not typically caused by bugs in the Cray compiler, but by
various other factors:
1. Some algorithms are numerically sensitive to floating-point optimisations, which
differ between compilers.
2. Several applications assumed that variables were initialised to zero. This had been
working fine with the Intel compiler, but with the Cray compiler those applications
aborted or produced incorrect results.
3. It can also be assumed that, since all applications had been compiled for years with
the Intel compiler, any bugs in the Intel compiler had already been
worked around (e.g., by using special compiler flags to disable certain code transformations,
or by restructuring code).
Early measurements showed no clear performance advantage when using the Cray compiler,
so a decision was made to use the Intel compiler for all porting to the new Cray
systems; this approach also benefits from the implicit knowledge contained in the mature build scripts.
Porting with the newer Haswell-optimised Intel compilers revealed some issues with
older code. Whilst most instances were easy to remedy, some more complex examples
required compiling with the Sandy Bridge options to avoid incorrect results.
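In practice this meant selecting the Intel programming environment and, where necessary, restricting the instruction set; the sketch below shows the general pattern with the Cray compiler wrappers, with the module and flag choices given as assumptions rather than the Bureau's exact build settings.

    # Select the Intel programming environment on the XC40
    module swap PrgEnv-cray PrgEnv-intel
    # Default build targeting the Haswell compute nodes
    ftn -O2 -xCORE-AVX2 -c model_kernel.f90
    # Fallback for routines that misbehaved with Haswell optimisation:
    # restrict code generation to Sandy Bridge (AVX) instructions
    ftn -O2 -xAVX -c sensitive_kernel.f90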
In February 2016 the HPC platform began using the Intel V16 compilers, switching to
V17 at the end of 2016. These compilers have so far proven themselves to be stable and
suitable for compiling newer NWP models for Australis.

1.7.3 Optimisations
Since all the codes had previously been optimised with the Intel compilers, only a little
effort was required to tune the existing applications. To suit the Cray hardware
platform (processors, operating system, network, and storage), the runtime environment of
those jobs had to be adjusted. The Model Build team frequently used Cray's grid-reordering
tool to change the 'processes to nodes' mapping and minimise internode communication.
Results from these changes delivered up to a 5% performance increase.
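A sketch of that rank-reordering workflow, assuming the tool referred to is Cray's grid_order utility and using invented grid dimensions, is shown below; the generated mapping file is picked up by Cray MPI at launch.

    # Generate a rank placement for a 24 x 32 process grid, grouping
    # neighbouring ranks onto the same node (illustrative dimensions)
    grid_order -R -g 24,32 -c 4,6 > MPICH_RANK_ORDER
    # Tell Cray MPI to use the custom rank-order file
    export MPICH_RANK_REORDER_METHOD=3
    aprun -n 768 -N 24 ./model.exe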
Further testing investigated the number of nodes required to allow each job to finish
within the required time. In each case various domain decompositions were tested to find
the one that gave the best performance. For the NWP Unified Model, used for the global,
regional, and city forecasts, an additional IO server needed to be configured. The IO servers
have their own topology and configuration, and free up capacity on the computation nodes. In
some cases the Bureau utilised a simple in-house "experiment manager" to search
the multi-dimensional search space for the best combination of parameters.
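The sketch below gives the flavour of such an experiment manager (it is not the Bureau's code): it enumerates candidate node counts and decompositions and submits a timed trial for each through the workload manager. The qsub options, environment variables, and trial script name are assumptions.

    import itertools
    import subprocess

    node_counts = [8, 16, 32]                       # candidate node counts (illustrative)
    decompositions = [(8, 12), (12, 16), (16, 24)]  # candidate (ew, ns) splits

    def submit_trial(nodes, ew, ns):
        """Submit one timed trial run and return its PBS job identifier."""
        select = f"select={nodes}:ncpus=24"
        cmd = ["qsub", "-l", select, "-v", f"EWPROC={ew},NSPROC={ns}",
               "run_trial.sh"]                      # hypothetical trial script
        return subprocess.check_output(cmd, text=True).strip()

    job_ids = {(n, ew, ns): submit_trial(n, ew, ns)
               for n, (ew, ns) in itertools.product(node_counts, decompositions)}
    # Elapsed times are later harvested from the PBS accounting records and the
    # fastest (nodes, decomposition) combination is adopted for production use.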
Running applications on partial nodes was also tested. On the previous HPC platform,
it was found that some applications executed more quickly when using only part of the
cores on a node – in some cases on a 16-core node, only 12 cores would be used. Partial
node usage increases the memory bandwidth per process and can give more third-level
cache per process, resulting in a shorter execution time [1]. Our measurements so far have
not indicated any need to do this on the Cray XC40 platform.
Additional work was necessary for hybrid codes that utilise MPI and OpenMP for
parallel processing. Using the Intel compiler with its own thread-binding mechanism and
environment variables was found in some cases to interfere with the thread binding set up
by the Cray application scheduler.
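A typical workaround, sketched below with illustrative values, is to let aprun own the placement and disable the Intel OpenMP runtime's own affinity logic; the exact settings used for each suite were tuned case by case.

    # Four OpenMP threads per MPI rank, six ranks per 24-core node
    export OMP_NUM_THREADS=4
    # Leave thread placement to ALPS rather than the Intel OpenMP runtime
    export KMP_AFFINITY=disabled
    aprun -n 96 -N 6 -d 4 -cc depth ./hybrid_model.exe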
One unexpected problem encountered was a huge runtime variation in jobs that heavily
utilise scripts (jobs starting more than 50,000 processes). The problem was traced to caching
behaviour of the NFS file system. This drove the decision to move those centrally provided
scripts from NFS and onto Lustre, and resulted in much improved application performance.

1.8 Archiving
Two archival systems are in place to handle the input and output data
from Australis: Oracle's Hierarchical Storage Manager (OHSM, formerly known as SAM-QFS),
and the Meteorological Archival and Retrieval System (MARS) from the European
Centre for Medium-Range Weather Forecasts (ECMWF).

1.8.1 Oracle Hierarchical Storage Manager (SAM-QFS)


Oracle Hierarchical Storage Manager (OHSM), previously known as SAM-QFS, archives
unstructured data and is housed on an Oracle T5-2 clustered system. Details of the system
are shown in Table 1.6.
TABLE 1.6: OHSM (SAM-QFS) System Configuration.

Feature                  Australis (2015)
Host                     Oracle T5-2
Host Operating System    Solaris 11
CPU                      SPARC
RAM                      256 GB
Archival Software        OHSM v6.1
Disk                     DELL/EMC Unity 500, 100 TB
Tape                     StorageTek T10000C, 10 drives
Network                  10 Gb Ethernet, 4 connections

Data is moved onto and off SAM-QFS using network copy programs (such as scp) from
Data Mover nodes on the edge of the Crays.

1.8.2 MARS/TSM
MARS archives and retrieves structured data to and from tape, and is optimised for
expected retrieval patterns. MARS is a bespoke archive solution designed and developed by
the ECMWF [7].

Meteorological Archival and Retrieval System (MARS) features:



1. Numerical Prediction (NP) specific archival system for gridded and observation
datasets developed and maintained by the ECMWF
2. Metadata database and disk front-end for tape storage
3. 2.5PB of operational archive, growing 1+TB per day
4. Up to 60,000+ transactions per day
5. File formats: GRIB1, GRIB2, & BUFR [11].

TABLE 1.7: MARS System Configuration.

Feature                  Australis (2015)
Host                     DELL R730
Host Operating System    Linux 6
CPU                      Intel Xeon E5-2643
RAM                      128 GB
Network                  10 Gb Ethernet, 4 connections
Tape                     StorageTek T10000D, 10 drives
TSM version              7.1

MARS provides a layer to archive and move the data to tape; in our case IBM's Spectrum
Protect (previously Tivoli Storage Manager, TSM) is used to write the data to tape.
The system supporting the MARS service and TSM is described in Table 1.7.
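Access to MARS is through its request language; a retrieval has roughly the following shape. The class, stream, parameter, and date values below are placeholders, not a real Bureau request.

    retrieve,
        class  = od,
        stream = oper,
        type   = fc,
        date   = 20170829,
        time   = 0000,
        step   = 0/to/48/by/6,
        param  = msl,
        target = "mslp_fc.grib"

Here type=fc requests forecast fields, step lists the forecast lead times in hours, and target names the local output file; the class and stream values an operational Bureau request would use are assumptions on our part.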

1.9 Data Center/Facility


The Bureau’s HPC platform is housed in a commercial data centre with a tier 3+ up-
time rating, and is located in the state of Victoria, Australia. The facility has a PUE rating
of 1.3. The space the Bureau utilises for its data centre has been divided into 3 separate
rooms. The first two each house 50% of the Cray XC40 liquid cooled system. The rooms
have sufficient cooling capacity, up to 980kW per room, for the anticipated mid-life upgrade.
Each of those rooms has a separate power feed; a loss of a power feed will only affect one
of the two rooms. The data centre provides redundant and separate liquid cooling loops to
each Cray XC40.
The third room has redundant power feeds that incorporate dual feeds to all the racks;
this enables all the Cray infrastructure services devices to have dual power supplies; these
are fed from independent power sources. This room houses the Management, PBS Pro,
Login, and Data Mover nodes for Australis, as well as the Cray Sonexion and DDN GPFS (IBM
Spectrum Scale General Parallel File System) storage systems. It also houses other in-
frastructure, filesystems, and data storage. This room is air-cooled only and built to use
cold aisle containment.
The extra walls were required to isolate the rooms and include fire protection. This
means the layout of our data centre is not the same as that of other customers within
the same facility. The Bureau needed to work with the data centre operator to alter their standard
layouts and customise the airflows and other aspects of the facility to its needs.

1.10 System Statistics


1.10.1 Systems Usage Patterns
Australis has four principal Production run times each day (00:00 UTC, 06:00 UTC,
12:00 UTC & 18:00 UTC). The model runs include Global, Global Wave, Global Ocean,
Australian Regional, Regional Wave, and six high-resolution state/city runs for Victoria/
Tasmania, New South Wales, Queensland, South Australia, Western Australia, and North-
ern Territory. Antarctic forecasting uses data from the Global model. Additional Model runs
for climate forecasting use the Predictive Ocean Atmosphere Model for Australia (POAMA)
ensemble with a 250km grid resolution.
FIGURE 1.8: Australis NWP Time-critical Processing Load (Compute, MAMU, and DataMover node load over a 24-hour UTC period)

A sample of system usage patterns on the Production platform can be seen in Figure
1.8. The Time-critical Processing Load graph shows the regularity of the load created by
the 6-hourly Production jobs on the HPC system.
Notes: This load graph omits the impact of the many non-time-critical jobs that run on
Australis. These jobs include ocean and climate models, and they tend to use a lot of the
HPC system capacity that otherwise might seem to be unused. In addition, noticeably more
HPC system capacity will be consumed when the higher-resolution models currently being
planned are moved to production.
Figure 1.9 shows the rise in the number of production jobs since the Australis HPC
system inception. Staging jobs are not shown.

1.11 Reliability
In striving to achieve a robust 24x7x365 platform we identified four key areas to optimise
the workflow and balance the competing loads.
1. Compute capacity to run jobs from PBS Pro
2. Datamover capacity to stage files
3. Storage disks for short term files on Lustre
4. SSH access to pull files

Identifying and focusing on these areas allowed us to establish processes to monitor and
support each key area. In normal operation each node is allocated directly to particular
PBS Pro queues; a defined process allows specific groups of nodes to be allocated to
either the Production or the Staging queues.

FIGURE 1.9: Australis Production Jobs per Month (June 2016 to August 2017)

1.11.1 Failover Scenarios


Maintaining operations 24x7x365 is important to the Bureau's forecasting services. The
system has been designed to survive many different types of compute and storage failures.

1.11.2 Compute Failover


The changeover operation of compute nodes from Staging to Production can be per-
formed without interruption to Production queues. However our preferred process is to first
stop and drain all running jobs on the Staging systems. In the case of a system failure where
the status of the jobs that were running on the (failing) Production nodes is unknown, the
process is to ensure that all the Staging nodes have been drained (often forcibly) and then
power off all the nodes previously allocated to Production. Production is then started on
what was the Staging server. The failed nodes are rebooted; any issues are then fixed and
the system is brought back on-line, returning to service as the Staging server. In this way
the Bureau strives to minimise outages and ensure that the system is returned to full service
as quickly and efficiently as possible.
A fault occurring on the Staging server can be resolved without disrupting Production.

1.11.3 Data Mover Failover


Data movers are allocated to Production or Staging in groups. At present there is a
direct relationship of data movers to compute nodes, so they are usually swapped from
Staging to Production at the same time.

1.11.4 Storage Failover


As previously mentioned there are two distinct groups of Lustre filesystems. The first is
for Production data (Group P) and the second for Staging data (Group S). By design the
Group S also holds a copy of important Production data. When needed for recovery, NWP
applications can be reconfigured, relatively easily, to collect their Production data from the
Group S area. Before doing so, this data may need to be rolled back to the last known
good state.
The available modes of operation of the storage systems are shown in Figure 1.10. These
range from Normal mode through Failover, Recovery, and Isolated modes, and are discussed
further below.

FIGURE 1.10: File Storage Operations Modes (Normal, Failover, Recovery, and Isolated modes of Production compute against Group P and Group S storage)

1.11.4.1 Normal Mode


All filesystems are available. Production writes to Group P storage and stores a copy of
the important data on Group S storage.

1.11.4.2 Failover Mode


The Group P storage is unavailable. All running jobs need to be restarted from their last
known state and now use Group S for input/output. In this state no copy of
important data is taken. The next mode will be Recovery, which can be switched to without
requiring any jobs to be stopped.

1.11.4.3 Recovery Mode


All filesystems are available, but Production still uses Group S. This mode is used so that
the Group P file system can receive a copy of the recovery data before the return to Normal
mode, which follows Recovery mode. Normally 24 hours are allotted
for this to take place; however, the process can be sped up or run manually for certain
applications, depending on how long the Group P file system was unavailable. Changing
from Recovery to Normal mode requires stopping all running jobs.

1.11.4.4 Isolated Mode


The Group S storage is unavailable. Production continues to run without interruption;
however, the copy of important backup data is not created. Transitioning to this
mode usually only occurs from Normal mode and can be achieved while jobs are
still running in Production.

1.11.5 SSH File Transfer Failover


Some applications, such as the suite scheduler, require SSH access to retrieve files. To
facilitate this, an F5 load balancer is used to poll the data movers to check their availability
and to determine which PBS Pro queues they are allocated to. From this, a dynamic DNS
entry is generated for those clients that need to run SCP (secure copy). If a target node
becomes unresponsive, it will be removed from the list maintained by the F5, and when a
queue is switched, the list updates within a few minutes.

1.12 Implementing a Product Generation Platform


During the time in which we have been preparing this chapter, the HPC team at the
Bureau have been readying the environment to transition to a data-intensive product gener-
ation platform called “Aurora”. The Bureau’s aim is to decouple, or reduce, the modelling
dependencies, improve forecast product longevity, reduce disruptions to business, and to
improve:
1. Data processing performance; to provide a greater breadth of applications
2. Standard data processing levels; to improve data asset management
3. Standard methods; to improve software reuse and scientific integrity
4. Common data services; to provide ease and consistency of product data access
5. API management; to provide security and management of data requests
There are two major thrusts in the product generation platform: the splitting of pre- and
post-processing from model simulation, and the decoupling of model data outputs from the
forecast products and services that the Bureau’s customers are interested in.
This split of compute and post-processing will allow the Bureau’s HPC platform to
adopt a more agile approach to managing its numerical modelling and weather products,
so that modelling changes have fewer dependencies and thus lower development costs with
shorter delivery times.
Separating the numerically intensive compute from the I/O intensive requirement offers
many benefits to our operations. Currently the platform’s resources are constantly managed
to favour the intensive compute for the NWP modelling rather than the intensive data
processing that delivers the forecasting products.
Separating these two functions will free the XC40 to focus on improving compute inten-
sive functions. The CS400 post-processing platform will be allowed to specialise in delivering
better I/O performance, optimised applications and products, with the ability to support
innovations in areas like machine learning and visualisation.
Post-processing tasks are much more granular than NWP model suites. The existing
scheduling tools – Cylc and PBS – are not well suited to such fine-grained requirements.
Post-processing tasks change frequently in response to evolving customer requirements. To
support such flexibility and agility of the development and operations, the future post-
processing framework will be highly dynamic and configurable, with event-based declarative
flow like the Dask.distributed package. This package is a distributed dispatcher, with a pool
of workers that scales on demand through PBS.
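A minimal sketch of that style of flow, assuming the dask.distributed scheduler together with the separate dask-jobqueue package for launching workers through PBS (an assumption on our part; the resource values and processing function are illustrative):

    from dask.distributed import Client
    from dask_jobqueue import PBSCluster

    # Workers are themselves PBS jobs; they can be scaled up and down on demand
    cluster = PBSCluster(queue="post", cores=24, memory="120GB",
                         walltime="01:00:00")
    cluster.scale(4)               # request four workers (each a PBS job)
    client = Client(cluster)

    def make_product(field_file):
        # Placeholder for a real post-processing step (regrid, derive, format)
        return f"processed {field_file}"

    futures = client.map(make_product, ["fc_000.nc", "fc_006.nc", "fc_012.nc"])
    results = client.gather(futures)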
There are other benefits that are considered important: in the present arrangement, system
upgrades affect the whole environment in our 24x7x365 operational model, which is an
ongoing risk. In the split platforms, the scope of adverse impact is halved. Resource management
on the platform, currently a significant overhead for live operations, will also ease.
This decoupling concept and platform approach is not unique to the Bureau. It has been
used successfully at other National Meteorology centres; however, our choice of hardware
solution is different and worth a discussion.

FIGURE 1.11: HPC Product Generation (observation and model data flow through the Production and Post Production platforms, managed by workflow schedulers, to downstream users and to the MARS and SAM archive systems)

The post processing platform is being built around two Cray CS400 systems. Each scal-
able CS400 platform comprises 16 compute nodes and 4 GPU nodes, each with 1.6TB NVMe
flash storage, 3 service nodes, and 2 management nodes running Bright Cluster Manager.
The platform will also mount the Cray XC40 Lustre filesystem, and its own GPFS filesys-
tem, which is better equipped to manage the intensive I/O workloads. The configuration
is based on a simplified design philosophy where each CS400 cluster is interconnected over
InfiniBand and has its own dedicated storage based on DDN GRIDScaler GS14KX
hardware.
Currently the CS400 clusters together represent a compute count of 40 nodes, 1440 Intel
Broadwell cores, 8 NVIDIA Tesla K80 GPUs, 10.24TB of RAM, and 4PB of GPFS data
storage with 300TB of SSD flash storage.
The post-processing platform will use IBM's Spectrum Scale (GPFS) as its parallel file system.
All CS400 clients run GPFS client software, while the DDN storage runs embedded GPFS
Network Shared Disk (NSD) servers. In addition to this, each CS400 cluster has a dedicated
LNET node that provides access to the Australis global file systems. Each compute, GPU,
and service node also runs a Lustre client to enable it to access the Australis
global file systems.
We believe we are one of only a few HPC organisations utilising both Lustre and GPFS.
Our early testing using IOR (an I/O performance benchmark tool) measured 35 GB/s
across 16 nodes. Our calculations show that there is performance growth headroom within
the DDN storage stack; however, the more limiting performance bottleneck is the choice
of FDR in our InfiniBand network (this is consistent with Lustre networking today). To
improve storage performance, the Bureau will be moving to InfiniBand EDR in the mid-life
upgrade.
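For reference, IOR runs of the kind behind that figure follow the usual pattern of a file-per-process streaming test; the launcher, transfer size, block size, and target path below are illustrative rather than the exact benchmark configuration used.

    # 64 MPI ranks across 16 nodes, writing and reading one file per process
    mpirun -np 64 ior -a POSIX -w -r -F -t 1m -b 4g -o /gpfs/bench/ior.dat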
Once the Aurora platform is in place, the Bureau will continue to work towards estab-
lishing a content delivery network with data supplied by the post-processing platform.
Bibliography
J. S. Vetter. Contemporary high performance computing: an introduction. In Jeffrey S. Vetter, editor,
Contemporary High Performance Computing: From Petascale Toward Exascale, volume 1 of CRC
Computational Science Series, page 730. Taylor and Francis, Boca Raton, 1 edition, 2013.
J. S. Vetter, editor. Contemporary High Performance Computing: From Petascale Toward Exascale,
volume 2 of CRC Computational Science Series. Taylor and Francis, Boca Raton, 1 edition, 2015.
Ilia Bermous, Joerg Henrichs, and Michael Naughton. Application performance improvement by use
of partial nodes to reduce memory contention. CAWCR Research Letters, pages 19–22, 2013.
http://www.cawcr.gov.au/researchletters/CAWCR_Research_Letters_9.pdf#page=19, [accessed
31-August 2017].
Bureau of Meteorology, Queensland Regional Office. Severe Tropical Cyclone Debbie. Press Release,
29 March 2017. http://www.bom.gov.au/announcements/sevwx/qld/qldtc20170325.shtml,
[accessed 31-August-2017].
TJ Dell. A white paper on the benefits of chipkill – correct ECC for PC server main memory.
Technical report, IBM Microelectronics Division, November 1997.
http://www.ece.umd.edu/courses/enee759h.S2003/references/ibm_chipkill.pdf, [accessed 31-
August-2017].
Tom Keenan, Kamal Puri, Tony Hirst, Tim Pugh, Ben Evans, Martin Dix, Andy Pitman, Peter Craig,
Rachel Law, Oscar Alves, Gary Dietachmayer, Peter Steinle, and Helen Cleugh. Next Generation
Australian Community Climate and Earth-System Simulator (NG-ACCESS) - A Roadmap 2014-
2019. The Centre for Australian Weather and Climate Research, June 2014.
http://www.cawcr.gov.au/technical-reports/CTR_075.pdf, [accessed 31-August-2017].
J Kim, WJ Dally, and D Abts. Technology-Driven, Highly-Scalable Dragonfly Topology. In ACM
SIGARCH Computer Architecture News, volume 36, pages 77–88. IEEE Computer Society, 2008.
Anna Leask, Kurt Bayer, and Lynley Bilby. Tropical storm Debbie – a day of destruction, despair and
drama. New Zealand Herald, 7 April 2017.
http://www.nzherald.co.nz/nz/news/article.cfm?c_id=1&objectid=11833401, [accessed 10-
October-2017].
Carsten Maass. MARS User Documentation. October 2017.
https://software.ecmwf.int/wiki/display/UDOC/MARS+user+documentation, [accessed 10-
October-2017].
Bureau of Meteorology. New supercomputer to supercharge weather warnings and forecasts. Press
Release, July 2015. http://media.bom.gov.au/releases/188/new-supercomputer-to-supercharge-
weather-warnings-and-forecasts/, [accessed 31-August-2017].
UK Met Office. Unified Model Partnership, October 2016.
https://www.metoffice.gov.uk/research/collaboration/um-partnership, [accessed 31-August-2017].
Hilary J Oliver. Cylc (The Cylc Suite Engine), Version 7.5.0. Technical report, NIWA, 2016.
http://cylc.github.io/cylc/html/single/cug-html.html, [accessed 31-August-2017].
World Meteorological Organization. WMO International Codes, December 2012.
http://www.wmo.int/pages/prog/www/WMOCodes.html, [accessed 31-August-2017].
QJ Wang. Seasonal Water Forecasting and Prediction. Technical report, CSIRO, 2013.
http://www.bom.gov.au/water/about/waterResearch/document/wirada/wirada-long-term-
factsheet.pdf, [accessed 10-October-2017].
COBALT: Component-based lightweight toolkit. http://trac.mcs.anl.gov/projects/cobalt.
Parallel filesystem I/O benchmark. https://github.com/LLNL/ior.
Sandia MPI Micro-Benchmark Suite (SMB). http://www.cs.sandia.gov/smb/.
Sustained System Performance (SSP). http://www.nersc.gov/users/computational-systems/cori/nersc-
8-procurement/trinity-nersc-8-rfp/nersc-8-trinity-benchmarks/ssp/.
The Graph 500 – June 2011. http://www.graph500.org.
The Green 500 – June 2010. http://www.top500.org/green500.
The Top 500 – June 2008. http://www.top500.org.
mdtest, 2017. https://github.com/MDTEST-LANL/mdtest.
Bob Alverson, Edwin Froese, Larry Kaplan, and Duncan Roweth. Cray XC Series Network.
http://www.cray.com/sites/default/files/resources/CrayXCNetwork.pdf.
Anna Maria Bailey, Adam Bertsch, Barna Bihari, Brian Carnes, Kimberly Cupps, Erik W. Draeger,
Larry Fried, Mark Gary, James N. Glosli, John C. Gyllenhaal, Steven Langer, Rose McCallen,
Arthur A. Mirin, Fady Najjar, Albert Nichols, Terri Quinn, David Richards, Tome Spelce, Becky
Springmeyer, Fred Streitz, Bronis de Supinski, Pavlos Vranas, Dong Chen, George L.T. Chiu, Paul
W. Coteus, Thomas W. Fox, Thomas Gooding, John A. Gunnels, Ruud A. Haring, Philip
Heidelberger, Todd Inglett, Kyu Hyoun Kim, Amith R. Mamidala, Sam Miller, Mike Nelson,
Martin Ohmacht, Fabrizio Petrini, Kyung Dong Ryu, Andrew A. Schram, Robert Shearer, Robert
E. Walkup, Amy Wang, Robert W. Wisniewski, William E. Allcock, Charles Bacon, Raymond
Bair, Ramesh Balakrishnan, Richard Coffey, Susan Coghlan, Jeff Hammond, Mark Hereld, Kalyan
Kumaran, Paul Messina, Vitali Morozov, Michael E. Papka, Katherine M. Riley, Nichols A.
Romero, and Timothy J. Williams. Blue Gene/Q: Sequoia and Mira. In Jeffrey S. Vetter, editor,
Contemporary High Performance Computing: From Petascale toward Exascale, chapter 10, pages
225–281. Chapman & Hall/CRC, 2013.
Cray. CLE User Application Placement Guide, S-2496-5204 edition.
Cray. Cray C and C++ Reference Manual, (8.5) S-2179 edition.
Cray. Cray Fortran Reference Manual, (8.5) S-3901 edition.
Cray. XC Series GNI and DMAPP API User Guide, (CLE6.0.UP03) S-2446 edition.
Cray. XC Series Programming Environment User Guide, (17.05) S-2529 edition.
Argonne Leadership Computing Facility. Early science program, 2010. http://esp.alcf.anl.gov.
Sunny Gogar. Intel Xeon Phi x200 processor - memory modes and cluster modes: Configuration and
use cases. Intel Software Developer Zone, 2015. http://software.intel.com/en-us/articles/intel-xeon-
phi-x200-processor-memory-modes-and-cluster-modes-configuration-and-use-cases.
Kevin Harms, Ti Leggett, Ben Allen, Susan Coghlan, Mark Fahey, Ed Holohan, Gordon McPheeters,
and Paul Rich. Theta: Rapid installation and acceptance of an XC40 KNL system. In Proceedings
of the 2017 Cray User Group, Redmond, WA, May 2017.
Mark Holland and Garth A. Gibson. Parity declustering for continuous operation in redundant disk
arrays. In Richard L. Wexelblat, editor, Proceedings of the 5th International Conference on
Architectural Support for Programming Languages and Operating Systems, volume 27. ACM, New
York, NY, 1992.
Paul Peltz Jr., Adam DeConinck, and Daryl Grunau. How to automate and not manage under
Rhine/Redwood. In Proceedings of the 2016 Cray User Group, London, UK, May 2016.
Steven Martin, David Rush, and Matthew Kappel. Cray advanced platform monitoring and control
(CAPMC). In Proceedings of the 2015 Cray User Group, Chicago, IL, April 2015.
John D. McCalpin. Memory bandwidth and machine balance in current high performance computers.
IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, pages
19–25, December 1995. http://tab.computer.org/tcca/NEWS/DEC95/dec95_mccalpin.ps.
James Milano and Pamela Lembke. IBM System Blue Gene Solution: Blue Gene/Q Hardware
Overview and Installation Planning. Number SG24-7872-01 in An IBM Red-books publication.
May 2013. ibm.com/redbooks.
Department of Energy (DOE) Office of Science. Facilities for the Future of Science: A Twenty-Year
Outlook, 2003. https://science.energy.gov/~/media/bes/pdf/archives/plans/ffs_10nov03.pdf.
Scott Parker, Vitali Morozov, Sudheer Chunduri, Kevin Harms, Chris Knight, and Kalyan Kumaran.
Early evaluation of the Cray XC40 Xeon Phi System Theta at argonne. In Proceedings of the 2017
Cray User Group, Redmond, WA, May 2017.
Avinash Sodani. Knights Landing (KNL): 2nd Generation Intel Xeon Phi. In Hot Chips 27
Symposium (HCS), 2015 IEEE, Cupertino, CA, August 2015.
http://ieeexplore.ieee.org/document/7477467/.
Wolfgang Baumann, Guido Laubender, Matthias Luter, Alexander Reinefeld, Christian Schimmel,
Thomas Steinke, Christian Tuma, and Stefan Wollny. Contemporary High Performance
Computing: From Petascale toward Exascale, volume 2, chapter HLRN-III at Zuse Institute
Berlin, pages 81–114. Chapman & Hall/CRC, 2015.
Heinecke, A. and Klemm, M. and Bungartz, H. J. From GPGPU to Many-Core: Nvidia Fermi and
Intel Many Integrated Core Architecture. Computing in Science and Engineering, 14:78–83, 2012.
Tom Henderson, John Michalakes, Indraneil Gokhale, and Ashish Jha. Chapter 2 - Numerical Weather
Prediction Optimization. In James Reinders and Jim Jeffers, editors, High Performance Parallelism
Pearls, pages 7 – 23. Morgan Kaufmann, Boston, 2015.
Intel Corporation. Itanium ABI, v1.86.
Jeffers, James and Reinders, James. Intel Xeon Phi Coprocessor High Performance Programming.
Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1st edition, 2013.
Khronos OpenCL Working Group. The OpenCL Specification, Version 2.2, March 2016.
https://www.khronos.org/registry/cl/specs/opencl-2.2.pdf.
Michael Klemm, Alejandro Duran, Xinmin Tian, Hideki Saito, Diego Caballero, and Xavier Martorell.
Extending OpenMP* with Vector Constructs for Modern Multicore SIMD Architectures. In
Proceedings of the 8th International Conference on OpenMP in a Heterogeneous World,
IWOMP’12, pages 59–72, Berlin, Heidelberg, 2012. Springer-Verlag.
G. Kresse and J. Furthmüller. Efficiency of ab-initio total energy calculations for metals and
semiconductors using a plane-wave basis set. Comput. Mater. Sci., 6(1):15 – 50, 1996.
G. Kresse and J. Hafner. Phys. Rev. B, 47:558, 1993.
G. Kresse and D. Joubert. Phys. Rev., 59:1758, 1999.
G. Kresse, M. Marsman, and J. Furthmüller. VASP the Guide.
http://cms.mpi.univie.ac.at/vasp/vasp/vasp.html, April 2016.
B. Maronga, M. Gryschka, R. Heinze, F. Hoffmann, F. Kanani-Shring, M. Keck, K. Ketelsen, M. O.
Letzel, M. Shring, and S. Raasch. The Parallelized Large-Eddy Simulation Model (PALM) version
4.0 for atmospheric and oceanic flows: model formulation, recent developments, and future
perspectives. Geosci. Model Dev., 8:1539–1637, 2015.
Y. Nakamura and H. Stüben. BQCD - Berlin quantum chromodynamics program. In PoS (Lattice
2010), page 40, 2010.
Chris J. Newburn, Rajiv Deodhar, Serguei Dmitriev, Ravi Murty, Ravi Narayanaswamy, John
Wiegert, Francisco Chinchilla, and Russell McGuire. Offload Compiler Runtime for the Intel Xeon
Phi Coprocessor. In Supercomputing, pages 239–254. Springer Berlin Heidelberg, 2013.
John Nickolls, Ian Buck, Michael Garland, and Kevin Skadron. Scalable Parallel Programming with
CUDA. Queue, 6(2):40–53, March 2008.
Matthias Noack. HAM - Heterogenous Active Messages for Efficient Offloading on the Intel Xeon Phi.
Technical report, ZIB, Takustr.7, 14195 Berlin, 2014.
Matthias Noack, Florian Wende, and Klaus-Dieter Oertel. Chapter 19 - OpenCL: There and Back
Again. In James Reinders and Jim Jeffers, editors, High Performance Parallelism Pearls, pages
355 – 378. Morgan Kaufmann, Boston, 2015.
Matthias Noack, Florian Wende, Thomas Steinke, and Frank Cordes. A Unified Programming Model
for Intra- and Inter-Node Offloading on Xeon Phi Clusters. In International Conference for High
Performance Computing, Networking, Storage and Analysis, SC 2014, New Orleans, LA, USA,
November 16-21, 2014, pages 203–214, 2014.
Matthias Noack, Florian Wende, Georg Zitzlsberger, Michael Klemm, and Thomas Steinke. KART –
A Runtime Compilation Library for Improving HPC Application Performance. In IXPUG
Workshop "Experiences on Intel Knights Landing at the One Year Mark" at ISC High Performance
2017, Frankfurt, Germany, June 2017.
OpenMP Architecture Review Board. OpenMP Application Program Interface, Version 4.0, 2013.
http://www.openmp.org.
OpenMP Architecture Review Board. OpenMP Application Program Interface, Version 4.5, 2015.
http://www.openmp.org.
OpenMP Architecture Review Board. OpenMP Application Program Interface, Version 4.5, 2015.
http://www.openmp.org/.
Scott Pakin, M. Lang, and D.K. Kerbyson. The Reverse-Acceleration Model for Programming
Petascale Hybrid Systems. IBM Journal of Research and Development, 53(5), 2009.
Boris Schling. The Boost C++ Libraries. XML Press, 2011.
Sergi Siso. DL_MESO Code Modernization. Intel Xeon Phi Users Group (IXPUG), March 2016.
IXPUG Workshop, Ostrava.
Avinash Sodani, Roger Gramunt, Jesús Corbal, Ho-Seop Kim, Krishna Vinod, Sundaram
Chinthamani, Steven Hutsell, Rajat Agarwal, and Yen-Chen Liu. Knights Landing: Second-
Generation Intel Xeon Phi Product. IEEE Micro, 36(2):34–46, 2016.
Florian Wende, Martijn Marsman, Zhengji Zhao, and Jeongnim Kim. Porting VASP from MPI to MPI
+ OpenMP [SIMD]. In Proceedings of the 13th International Workshop on OpenMP, Scaling
OpenMP for Exascale Performance and Portability, IWOMP’17, 2017. Accepted for publication.
Florian Wende, Matthias Noack, Thomas Steinke, Michael Klemm, Chris J. Newburn, and Georg
Zitzlsberger. Portable SIMD Performance with OpenMP* 4.X Compiler Directives. In Proceedings
of the 22nd International Conference on Euro-Par 2016: Parallel Processing - Volume 9833, pages
264–277, New York, NY, USA, 2016. Springer-Verlag New York, Inc.
Rosa M. Badia, Jesus Labarta, Judit Gimenez, and Francesc Escale. DIMEMAS: Predicting MPI
applications behavior in Grid environments. In Workshop on Grid Applications and Programming
Tools (GGF8), volume 86, pages 52–62, 2003.
Barcelona Supercomputing Center. BSC Performance Analysis Tools. https://tools.bsc.es/.
Barcelona Supercomputing Center. MareNostrum III (2013) System Architecture.
https://www.bsc.es/marenostrum/marenostrum/mn3.
Leonardo Bautista-Gomez, Ferad Zyulkyarov, Osman Unsal, and Simon McIntosh-Smith. Unprotected
Computing: A Large-scale Study of DRAM Raw Error Rate on a Supercomputer. In Proceedings of
the International Conference for High Performance Computing, Networking, Storage and Analysis,
SC ’16, pages 55:1–55:11, Piscataway, NJ, USA, 2016. IEEE Press.
Kallia Chronaki, Alejandro Rico, Rosa M Badia, Eduard Ayguadé, Jesús Labarta, and Mateo Valero.
Criticality-aware dynamic task scheduling for heterogeneous architectures. In Proceedings of the
29th ACM on International Conference on Supercomputing, pages 329–338. ACM, 2015.
Lamia Djoudi, Denis Barthou, Patrick Carribault, Christophe Lemuet, Jean-Thomas Acquaviva,
William Jalby, et al. Maqao: Modular assembler quality analyzer and optimizer for Itanium 2. In
The 4th Workshop on EPIC architectures and compiler technology, San Jose, volume 200, 2005.
Alejandro Duran, Eduard Ayguadé, Rosa M Badia, Jesús Labarta, Luis Martinell, Xavier Martorell,
and Judit Planas. OmpSs: a proposal for programming heterogeneous multi-core architectures.
Parallel Processing Letters, 21(02):173–193, 2011.
Markus Geimer, Felix Wolf, Brian JN Wylie, Erika Ábrahám, Daniel Becker, and Bernd Mohr. The
Scalasca performance toolset architecture. Concurrency and Computation: Practice and
Experience, 22(6):702–719, 2010.
Jülich Supercomputing Centre. Jülich Supercomputing Centre – HPC technology. http://www.fz-
juelich.de/ias/jsc/EN/Research/HPCTechnology/PerformanceAnalyse/performanceAnalysis_node.h
tml.
Andreas Knüpfer, Christian Rössel, Dieter an Mey, Scott Biersdorff, Kai Diethelm, Dominic
Eschweiler, Markus Geimer, Michael Gerndt, Daniel Lorenz, Allen Malony, et al. Score-P: A joint
performance measurement run-time infrastructure for Periscope, Scalasca, TAU, and Vampir. In
Tools for High Performance Computing 2011, pages 79–91. Springer, 2012.
Krishna T. Malladi, Benjamin C. Lee, Frank A. Nothaft, Christos Kozyrakis, Karthika Periyathambi,
and Mark Horowitz. Towards Energy-proportional Datacenter Memory with Mobile DRAM. In
Proceedings of the 39th Annual International Symposium on Computer Architecture, ISCA ’12,
pages 37–48, 2012.
Mathias Nachtmann and José Gracia. Enabling model-centric debugging for task-based programming
models–a tasking control interface. In Tools for High Performance Computing 2015, pages 147–
160. Springer, 2016.
Vincent Pillet, Jesús Labarta, Toni Cortes, and Sergi Girona. Paraver: A tool to visualize and analyze
parallel code. In Proceedings of WoTUG-18: transputer and occam developments, volume 44,
pages 17–31. IOS Press, 1995.
Nikola Rajovic, Paul M. Carpenter, Isaac Gelado, Nikola Puzovic, Alex Ramirez, and Mateo Valero.
Supercomputing with commodity CPUs: Are mobile SoCs ready for HPC? In Proceedings of the
International Conference on High Performance Computing, Networking, Storage and Analysis, SC
’13, pages 40:1–40:12, New York, NY, USA, 2013. ACM.
Nikola Rajovic, Alejandro Rico, Filippo Mantovani, Daniel Ruiz, Josep Oriol Vilarrubi, Constantino
Gomez, Luna Backes, Diego Nieto, Harald Servat, Xavier Martorell, Jesus Labarta, Eduard
Ayguade, Chris Adeniyi-Jones, Said Derradji, Hervé Gloaguen, Piero Lanucara, Nico Sanna, Jean-
Francois Mehaut, Kevin Pouget, Brice Videau, Eric Boyer, Momme Allalen, Axel Auweter, David
Brayford, Daniele Tafani, Volker Weinberg, Dirk Brömmel, René Halver, Jan H. Meinke, Ramon
Beivide, Mariano Benito, Enrique Vallejo, Mateo Valero, and Alex Ramirez. The Mont-blanc
Prototype: An Alternative Approach for HPC Systems. In Proceedings of the International
Conference for High Performance Computing, Networking, Storage and Analysis, SC ’16, pages
38:1–38:12, Piscataway, NJ, USA, 2016. IEEE Press.
Nikola Rajovic, Alejandro Rico, Nikola Puzovic, Chris Adeniyi-Jones, and Alex Ramirez. Tibidabo1:
Making the case for an ARM-based HPC system. Future Generation Computer Systems, 36:322–
334, July 2014.
Nikola Rajovic, Alejandro Rico, James Vipond, Isaac Gelado, Nikola Puzovic, and Alex Ramirez.
Experiences with Mobile Processors for Energy Efficient HPC. In Proceedings of the Conference
on Design, Automation and Test in Europe, DATE ’13, pages 464–468, San Jose, CA, USA, 2013.
EDA Consortium.
Nikola Rajovic, Lluis Vilanova, Carlos Villavieja, Nikola Puzovic, and Alex Ramirez. The low power
architecture approach towards exascale computing. Journal of Computational Science, 4(6):439–
443, 2013.
Pavel Saviankou, Michael Knobloch, Anke Visser, and Bernd Mohr. Cube v4: From performance
report explorer to performance analysis tool. Procedia Computer Science, 51:1343–1352, 2015.
Brice Videau, Kevin Pouget, Luigi Genovese, Thierry Deutsch, Dimitri Komatitsch, Frédéric Desprez,
and Jean-François Méhaut. BOAST: A metaprogramming framework to produce portable and
efficient computing kernels for hpc applications. The International Journal of High Performance
Computing Applications, page 1094342017718068, 2017.
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S.
Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, et al. Tensorflow: Large-scale machine
learning on heterogeneous distributed systems. arXiv preprint arXiv:1603.04467, 2016.
Amazon Web Services. Elastic Compute Cloud (EC2). https://aws.amazon.com/ec2/, 2017. [Online;
accessed 28-July-2017].
Ansible HQ. Ansible. https://www.ansible.com, 2017. [Online; accessed 28-July-2017].
Arutyun I. Avetisyan, Roy Campbell, Indranil Gupta, Michael T. Heath, Steven Y. Ko, Gregory R.
Ganger, Michael A. Kozuch, David O’Hallaron, Marcel Kunze, Thomas T. Kwan, et al. Open
Cirrus: A Global Cloud Computing Testbed. Computer, 43(4):35–43, 2010.
Ilia Baldine, Yufeng Xin, Anirban Mandal, Paul Ruth, Chris Heerman, and Jeff Chase. ExoGENI: A
Multi-Domain Infrastructure-as-a-Service Testbed. Testbeds and Research Infrastructure.
Development of Networks and Communities, pages 97–113, 2012.
Daniel Balouek, Alexandra Carpen-Amarie, Ghislain Charrier, Frédéric Desprez, Emmanuel Jeannot,
Emmanuel Jeanvoine, Adrien Lébre, David Margery, Nicolas Niclausse, Lucas Nussbaum, Olivier
Richard, Christian Pérez, Flavien Quesnel, Cyril Rohr, and Luc Sarzyniec. Adding Virtualization
Capabilities to the Grid’5000 Testbed. In Ivan I. Ivanov, Marten van Sinderen, Frank Leymann,
and Tony Shan, editors, Cloud Computing and Services Science, volume 367 of Communications in
Computer and Information Science, pages 3–20. Springer International Publishing, 2013.
Mark Berman, Jeffrey S. Chase, Lawrence Landweber, Akihiro Nakao, Max Ott, Dipankar
Raychaudhuri, Robert Ricci, and Ivan Seskar. GENI: A federated testbed for innovative network
experiments. Computer Networks, 61:5–23, 2014. Special issue on Future Internet Testbeds – Part
I.
Blazar contributors. Welcome to Blazar! — Blazar. http://blazar.readthedocs.io/en/latest/, 2017.
[Online; accessed 28-July-2017].
A. Boles and P. Rad. Voice biometrics: Deep learning-based voiceprint authentication system. In 2017
12th System of Systems Engineering Conference (SoSE), pages 1–6, June 2017.
Bolze, Raphaël and Cappello, Franck and Caron, Eddy and Dayde, Michel and Desprez, Frédéric and
Jeannot, Emmanuel and Jégou, Yvon and Lanteri, Stephane and Leduc, Julien and Melab,
Nouredine and Mornet, Guillaume and Namyst, Raymond and Primet, Pascale and Quétier,
Benjamin and Richard, Olivier and El-Ghazali, Talbi and Touche, Iréa. Grid’5000: A Large Scale
And Highly Reconfigurable Experimental Grid Testbed. International Journal of High
Performance Computing Applications, 20(4):481–494, 2006.
T. Bray. The JavaScript Object Notation (JSON) Data Interchange Format. RFC 7159, RFC Editor,
March 2014.
Tomasz Buchert, Cristian Ruiz, Lucas Nussbaum, and Olivier Richard. A survey of general-purpose
experiment management tools for distributed systems. Future Generation Computer Systems, 45:1–
12, 2015.
Ceilometer contributors. Welcome to the Ceilometer developer documentation! — ceilometer
documentation. https://docs.openstack.org/developer/ceilometer/, 2017. [Online; accessed 28-July-
2017].
Chef contributors. About Ohai — Chef Docs. https://docs.chef.io/ohai.html, 2017. [Online; accessed
28-July-2017].
Brent Chun, David Culler, Timothy Roscoe, Andy Bavier, Larry Peterson, Mike Wawrzoniak, and
Mic Bowman. PlanetLab: An Overlay Testbed for Broad-Coverage Services. ACM SIGCOMM
Computer Communication Review, 33(3):3–12, 2003.
Cinder contributors. Attach a Single Volume to Multiple Hosts — Cinder Specs.
https://specs.openstack.org/openstack/cinder-specs/specs/kilo/multi-attach-volume.html, 2015.
[Online; accessed 28-July-2017].
Cinder contributors. Add Volume Connection Information for Ironic Nodes — Ironic Specs.
https://specs.openstack.org/openstack/ironic-specs/specs/approved/volume-connection-
information.html, 2016. [Online; accessed 28-July-2017].
Marcos Dias de Assuncão, Laurent Lefèvre, and Francois Rossigneux. On the impact of advance
reservations for energy-aware provisioning of bare-metal cloud resources. In 2016 12th
International Conference on Network and Service Management (CNSM), pages 238–242, Oct
2016.
Ewa Deelman, James Blythe, Yolanda Gil, Carl Kesselman, Gaurang Mehta, Sonal Patil, Mei-Hui Su,
Karan Vahi, and Miron Livny. Pegasus: Mapping Scientific Workflows onto the Grid. In Grid
Computing, pages 131–140. Springer, 2004.
Diskimage-builder contributors. Diskimage-builder Documentation.
https://docs.openstack.org/developer/diskimage-builder/, 2017. [Online; accessed 28-July-2017].
D. Duplyakin and R. Ricci. Introducing configuration management capabilities into CloudLab
experiments. In 2016 IEEE Conference on Computer Communications Workshops (INFOCOM
WKSHPS), pages 39–44, April 2016.
EM Fajardo, JM Dost, B Holzman, T Tannenbaum, J Letts, A Tiradani, B Bockelman, J Frey, and D
Mason. How much higher can HTCondor fly? In J. Phys. Conf. Ser., volume 664. Fermi National
Accelerator Laboratory (FNAL), Batavia, IL (United States), 2015.
Geoffrey C. Fox, Gregor von Laszewski, Javier Diaz, Kate Keahey, José Fortes, Renato Figueiredo,
Shava Smallen, Warren Smith, and Andrew Grimshaw. FutureGrid: A Reconfigurable Testbed for
Cloud, HPC, and Grid Computing. In Contemporary High Performance Computing: From
Petascale toward Exascale, Chapman & Hall/CRC Computational Science, pages 603–636.
Chapman & Hall/CRC, April 2013.
Garth Gibson, Gary Grider, Andree Jacobson, and Wyatt Lloyd. PRObE: A Thousand-Node
Experimental Cluster for Computer Systems Research. USENIX; login, 38(3), 2013.
Glance contributors. Welcome to Glance’s documentation! — glance documentation.
https://docs.openstack.org/developer/glance/, 2017. [Online; accessed 28-July-2017].
Gnocchi contributors. Gnocchi – Metric as a Service. http://gnocchi.xyz, 2017. [Online; accessed 28-
July-2017].
Brice Goglin. Exposing the Locality of Heterogeneous Memory Architectures to HPC Applications. In
Proceedings of the Second International Symposium on Memory Systems, MEMSYS ’16, pages
30–39. ACM, 2016.
Google Compute Platform. Compute Engine - IaaS. https://cloud.google.com/compute/, 2017. [Online;
accessed 28-July-2017].
Heat contributors. Welcome to the Heat documentation! — heat documentation.
https://docs.openstack.org/developer/heat/, 2017. [Online; accessed 28-July-2017].
Internet2. Advanced Layer 2 Service. https://www.internet2.edu/products-services/advanced-
networking/layer-2-services/, 2017. [Online; accessed 28-July-2017].
Ironic contributors. Ironic Release Notes: Newton Series (6.0.0 - 6.2.x).
https://docs.openstack.org/releasenotes/ironic/newton.html, 2016. [Online; accessed 28-July-2017].
Ironic contributors. Configuring Web or Serial Console — ironic documentation.
https://docs.openstack.org/developer/ironic/deploy/console.html, 2017. [Online; accessed 28-July-
2017].
Ironic contributors. OpenStack Docs: Multi-tenancy in the Bare Metal service.
https://docs.openstack.org/ironic/latest/admin/multitenancy.html, 2017. [Online; accessed 28-July-
2017].
Ironic contributors. Physical Network Awareness — Ironic Specs.
https://specs.openstack.org/openstack/ironic-specs/specs/not-implemented/physical-network-
awareness.html, 2017. [Online; accessed 28-July-2017].
Ironic contributors. Welcome to Ironic’s developer documentation! — ironic documentation.
https://docs.openstack.org/developer/ironic/, 2017. [Online; accessed 28-July-2017].
K. Keahey and T. Freeman. Contextualization: Providing One-Click Virtual Clusters. In 2008 IEEE
Fourth International Conference on eScience, pages 301–308, Dec 2008.
Yann LeCun, Yoshua Bengio, et al. Convolutional networks for images, speech, and time series. The
handbook of brain theory and neural networks, 3361(10):1995, 1995.
Xiaoyi Lu, Md. Wasi-ur Rahman, Nusrat Islam, Dipti Shankar, and Dhabaleswar K. (DK) Panda.
Accelerating Big Data Processing on Modern HPC Clusters, pages 81–107. Springer International
Publishing, 2016.
Xiaoyi Lu, Jie Zhang, and Dhabaleswar K. (DK) Panda. Building Efficient HPC Cloud with SR-IOV
Enabled InfiniBand: The MVAPICH2 Approach. Springer International Publishing, 2017.
J. Lwowski, P. Kolar, P. Benavidez, P. Rad, J. J. Prevost, and M. Jamshidi. Pedestrian detection
system for smart communities using deep convolutional neural networks. In 2017 12th System of
Systems Engineering Conference (SoSE), pages 1–6, June 2017.
Lyonel Vincent. Hardware Lister (lshw). http://www.ezix.org/project/wiki/HardwareLiSter, 2017.
[Online; accessed 28-July-2017].
David Margery, Emile Morel, Lucas Nussbaum, Olivier Richard, and Cyril Rohr. Resources
Description, Selection, Reservation and Verification on a Large-Scale Testbed. In Victor C.M.
Leung, Min Chen, Jiafu Wan, and Yin Zhang, editors, Testbeds and Research Infrastructure:
Development of Networks and Communities: 9th International ICST Conference, TridentCom
2014, Guangzhou, China, May 5-7, 2014, Revised Selected Papers, pages 239–247. Springer
International Publishing, 2014. DOI: 10.1007/978-3-319-13326-3_23.
Nick McKeown, Tom Anderson, Hari Balakrishnan, Guru Parulkar, Larry Peterson, Jennifer Rexford,
Scott Shenker, and Jonathan Turner. OpenFlow: Enabling Innovation in Campus Networks. ACM
SIGCOMM Computer Communication Review, 38(2):69–74, 2008.
Dirk Merkel. Docker: Lightweight Linux Containers for Consistent Development and Deployment.
Linux Journal, 2014(239):2, 2014.
Microsoft Azure. Virtual machines – Linux and Azure virtual machines.
https://azure.microsoft.com/services/virtual-machines/, 2017. [Online; accessed 28-July-2017].
National Science Foundation. CISE Research Infrastructure: Mid-Scale Infrastructure - NSFCloud
(CRI: NSFCloud). https://www.nsf.gov/pubs/2013/nsf13602/nsf13602.htm, 2013. [Online; accessed 28-July-2017].
Neutron contributors. Welcome to Neutron’s developer documentation! — neutron documentation.
https://docs.openstack.org/developer/neutron/, 2017. [Online; accessed 28-July-2017].
Nova contributors. Vendordata — nova documentation.
https://docs.openstack.org/developer/nova/vendordata.html, 2017. [Online; accessed 28-July-2017].
Nova contributors. Welcome to Nova’s developer documentation! — nova documentation.
https://docs.openstack.org/developer/nova/, 2017. [Online; accessed 28-July-2017].
OpenStack contributors. OpenStack Juno — OpenStack Open Source Cloud Computing Software.
https://www.openstack.org/software/juno/, 2014. [Online; accessed 28-July-2017].
OpenStack contributors. OpenStack Open Source Cloud Computing Software.
https://www.openstack.org, 2017. [Online; accessed 28-July-2017].
S. Panwar, A. Das, M. Roopaei, and P. Rad. A deep learning approach for mapping music genres. In
2017 12th System of Systems Engineering Conference (SoSE), pages 1–5, June 2017.
Pittsburgh Supercomputing Center. Bridges. https://www.psc.edu/bridges, 2017. [Online; accessed 28-July-2017].
Ruth Pordes, Don Petravick, Bill Kramer, Doug Olson, Miron Livny, Alain Roy, Paul Avery, Kent
Blackburn, Torre Wenaus, Frank Würthwein, et al. The Open Science Grid. In Journal of Physics:
Conference Series, volume 78. IOP Publishing, 2007.
Puppet. Puppet. https://puppet.com, 2017. [Online; accessed 28-July-2017].
QEMU contributors. QCOW2. http://bit.ly/qcow2, 2017. [Online; accessed 28-July-2017].
Renaissance Computing Institute. Scientific Data Analysis at Scale (SciDAS).
http://renci.org/research/scientific-data-analysis-at-scale-scidas/, 2017. [Online; accessed 28-July-2017].
Robert Ricci, Eric Eide, and The CloudLab Team. Introducing CloudLab: Scientific Infrastructure for
Advancing Cloud Architectures and Applications. USENIX ;login:, 39(6), December 2014.
Constantine Sapuntzakis, David Brumley, Ramesh Chandra, Nickolai Zeldovich, Jim Chow, Monica
S. Lam, and Mendel Rosenblum. Virtual Appliances for Deploying and Maintaining Software. In
Proceedings of the 17th USENIX Conference on System Administration, LISA ’03, pages 181–
194, 2003.
Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural networks, 61:85–117,
2015.
Craig A. Stewart, Timothy M. Cockerill, Ian Foster, David Hancock, Nirav Merchant, Edwin
Skidmore, Daniel Stanzione, James Taylor, Steven Tuecke, George Turner, et al. Jetstream: A self-
provisioned, scalable science and engineering cloud environment. In Proceedings of the 2015
XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure, page 29.
ACM, 2015.
Shawn M. Strande, Haisong Cai, Trevor Cooper, Karen Flammer, Christopher Irving, Gregor von
Laszewski, Amit Majumdar, Dmitry Mishin, Philip Papadopoulos, Wayne Pfeiffer, Robert S.
Sinkovits, Mahidhar Tatineni, Rick Wagner, Fugang Wang, Nancy Wilkins-Diehr, Nicole Wolter,
and Michael L. Norman. Comet: Tales from the long tail: Two years in and 10,000 users later. In
Proceedings of the Practice and Experience in Advanced Research Computing 2017 on
Sustainability, Success and Impact, PEARC17, pages 38:1–38:7. ACM, 2017.
The Chameleon project. Chameleon Cloud Homepage. https://www.chameleoncloud.org, 2017.
[Online; accessed 28-July-2017].
The FutureGrid project. FutureGrid. http://www.futuregrid.org, 2015. [Online; accessed 28-July-2017].
Leendert van Doorn. Hardware Virtualization Trends. In Proceedings of the 2nd International
Conference on Virtual Execution Environments, pages 45–45, 2006.
Brian White, Jay Lepreau, Leigh Stoller, Robert Ricci, Shashi Guruprasad, Mac Newbold, Mike
Hibler, Chad Barb, and Abhijeet Joglekar. An integrated experimental environment for distributed
systems and networks. In Proceedings of the Fifth Symposium on Operating Systems Design and
Implementation, pages 255–270. USENIX Association, December 2002.
S. Alam, J. Poznanovic, U. Varetto, N. Bianchi, A. Pena, and N. Suvanphim. Early experiences with
the Cray XK6 hybrid CPU and GPU MPP platform. In Proceedings of the Cray User Group
Conference, 2012.
N. Chaimov, A. Malony, C. Iancu, and K. Ibrahim. Scaling Spark on Lustre. ISC High Performance
2016. Lecture Notes in Computer Science, 9945, 2016.
G. Chrysos. Intel Xeon Phi coprocessor (codename Knights Corner). In Proceedings
of the 24th Hot Chips Symposium, 2012.
G. Doms and U. Schättler. The non-hydrostatic limited-area model LM (lokalmodell) of DWD Part I:
scientific documentation. German Weather Service, Offenbach/M., 1999.
G. Faanes, A. Bataineh, D. Roweth, T. Court, E. Froese, B. Alverson, T. Johnson, J. Kopnick, M.
Higgins, and J. Reinhard. Cray Cascade: a scalable HPC system based on a dragonfly network.
Proceedings of the International Conference on High Performance Computing, Networking,
Storage and Analysis (SC 12), 2012.
O. Fuhrer, T. Chadha, T. Hoefler, G. Kwasniewski, X. Lapillonne, D. Leutwyler, D. Lüthi, C. Osuna,
C. Schär, T. C. Schulthess, and H. Vogt. Near-global climate simulation at 1 km resolution:
establishing a performance baseline on 4888 GPUs with COSMO 5.0. Geoscientific Model
Development Discussions (under review), 2017.
O. Fuhrer, C. Osuna, X. Lapillonne, T. Gysi, B. Cumming, M. Bianco, A. Arteaga, and T. C.
Schulthess. Towards a performance portable, architecture agnostic implementation strategy for
weather and climate models. Supercomputing frontiers and innovations, 2014.
G. Johansen. Configuring and customizing the Cray Programming Environment on CLE 6.0 systems. In
Proceedings of the Cray User Group Conference, 2016.
D. J. Kerbyson, K. J. Barker, A. Vishnu, and A. Hoisie. A performance comparison of current HPC
systems: Blue Gene/Q, Cray XE6 and InfiniBand systems. Future Gener. Comput. Syst., 30:291–
304, January 2014.
S. Matsuoka. Power and energy aware computing with TSUBAME 2.0 and beyond. In Proceedings of the
2011 Workshop on Energy Efficiency: HPC System and Datacenters, EE-HPC-WG ’11, pages 1–
76, New York, NY, USA, 2011. ACM.
Y. Oyanagi. Lessons learned from the K computer project - from the K computer to Exascale. Journal
of Physics: Conference Series, 523:012001, 06 2014.
S. Ramos and T. Hoefler. Modeling communication in cache-coherent SMP systems: A case-study
with Xeon Phi. In Proceedings of the 22nd International Symposium on High-performance Parallel
and Distributed Computing, HPDC ’13, pages 97–108, New York, NY, USA, 2013. ACM.
O. Schütt, P. Messmer, J. Hutter, and J. VandeVondele. GPU-accelerated sparse matrix-matrix
multiplication for linear scaling density functional theory. Electronic Structure Calculations on
Graphics Processing Units: From Quantum Chemistry to Condensed Matter Physics, 2016.
G. Sciacca. ATLAS and LHC computing on Cray. 22nd International Conference on Computing in
High Energy and Nuclear Physics, 2016.
A. Sodani, R. Gramunt, J. Corbal, H. Kim, K. Vinod, S. Chinthamani, S. Hutsell, R. Agarwal, and Y.
Liu. Knights Landing: Second-generation Intel Xeon Phi product. IEEE Micro, 36(2):34–46, 2016.
M. Staveley. Adapting Microsoft’s CNTK and ResNet-18 to Enable Strong-Scaling on Cray Systems.
In Neural Information Processing Systems (NIPS), 2016.
Chao Yang, Wei Xue, Haohuan Fu, Hongtao You, Xinliang Wang, Yulong Ao, Fangfang Liu, Lin
Gan, Ping Xu, Lanning Wang, Guangwen Yang, and Weimin Zheng. 10M-core scalable fully-
implicit solver for nonhydrostatic atmospheric dynamics. In Proceedings of the International
Conference for High Performance Computing, Networking, Storage and Analysis, SC ’16, pages
6:1–6:12, Piscataway, NJ, USA, 2016. IEEE Press.
British Standards Institution (2012). EN 50600-1 Information Technology - Data centre facilities and
infrastructures - Part 1: General concepts. London, 2012.
Telecommunications Industry Association (2014). Telecommunications Infrastructure Standard
for Data Centers. 2014. Available from: http://www.tia.org.
British Standards Institution (2014a). Information Technology. Data centre facilities and
infrastructures. Building construction. London, 2014.
British Standards Institution (2014b). Information Technology. Data centre facilities and
infrastructures. Power distribution. London, 2014.
British Standards Institution (2014c). Information Technology. Data centre facilities and
infrastructures. Environmental control. London, 2014.
British Standards Institution (2014d). Information Technology. Data centre facilities and
infrastructures. Management and operational information. London, 2014.
British Standards Institution (2015). Information Technology. Data centre facilities and
infrastructures. Telecommunications cabling infrastructure. London, 2015.
British Standards Institution (2016). Information Technology. Data centre facilities and
infrastructures. Security Systems. London, 2016.
BICSI. ANSI/BICSI 002-2014, Data Center Design and Implementation Best Practices. Tampa, FL, 3rd
edition, 2014.
Ladina Gilly. Data centre design standards and best practices for public research high performance
computing centres. Master’s thesis, CUREM - Center for Urban & Real Estate Management,
University of Zurich, CH - 8002 Zurich, 8 2016. Available at:
http://www.cscs.ch/fileadmin/publications/Tech_Reports/Data_centre_design_Thesis_e.pdf.
The Uptime Institute. Data Centre Site Infrastructure Tier Standard: Topology. 2012. Available from:
http://www.uptimeinstitute.com.
The Uptime Institute. Data Center Site Infrastructure Tier Standard: Operational Sustainability. 2013.
Available from: http://www.uptimeinstitute.com.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2007). Structural and
vibration guidelines for datacom equipment centers. Atlanta, GA, 2007.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2008a). Best practices
for datacom facility energy efficiency. Atlanta, GA, 2008.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2008b). TC 9.9 Mission
Critical Facilities Technology Spaces and Electronic Equipment: High Density Data
Centers. Atlanta, GA, 2008.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2009a). Design
considerations for datacom equipment centers. Atlanta, GA, 2009.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2009b). Real-time energy
consumption measurements in data centers. Atlanta, GA, 2009.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2011). Green tips for
data centers. Atlanta, GA, 2011.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2012). Datacom
equipment power trends and cooling applications. Atlanta, GA, 2012.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2013). PUE: a
comprehensive examination of the metric. Atlanta, GA, 2013.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2014a). Liquid cooling
guidelines for datacom equipment centers. Atlanta, GA, 2014.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2014b). Particulate and
gaseous contamination in datacom environments. Atlanta, GA, 2014.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2015a). Thermal
guidelines for data processing environments. Atlanta, GA, 2015.
American Society of Heating Refrigerating and Air-Conditioning Engineers (2015b). Server Efficiency
- Metrics for Computer Servers and Storage. Atlanta, GA, 2015.
W. P. Turner, J. H. Seader, V. Renaud, and K. G. Brill. Tier Classifications Define Site Infrastructure
Performance. 2008. White-paper available from: http://www.uptimeinstitute.org.
The OAuth 2.0 Authorization Framework. Technical report, 10 2012.
JupyterHub, 2017.
Enis Afgan, Dannon Baker, Marius vandenBeek, Daniel Blankenberg, Dave Bouvier, Martin Cech,
John Chilton, Dave Clements, Nate Coraor, Carl Eberhard, Björn Grüning, Aysam Guerler, Jennifer
Hillman-Jackson, Greg VonKuster, Eric Rasche, Nicola Soranzo, Nitesh Turaga, James Taylor,
Anton Nekrutenko, and Jeremy Goecks. The Galaxy platform for accessible, reproducible and
collaborative biomedical analyses: 2016 update. Nucleic Acids Research, 44(W1):W3–W10, 7
2016.
Jim Basney, Terry Fleury, and Jeff Gaynor. CILogon: A federated X.509 certification authority for
cyberinfrastructure logon. Concurrency and Computation: Practice and Experience, 26(13):2225–
2239, 9 2014.
Volker Brendel. Brendel Group Handbook, 2015.
Volker P Brendel. BWASP, 2017.
C. Titus Brown. Next-Gen Sequence Analysis Workshop (2017) angus 6.0 documentation, 2017.
M. S. Campbell, M. Law, C. Holt, J. C. Stein, G. D. Moghe, D. E. Hufnagel, J. Lei, R.
Achawanantakun, D. Jiao, C. J. Lawrence, D. Ware, S.-H. Shiu, K. L. Childs, Y. Sun, N. Jiang, and
M. Yandell. MAKER-P: A Tool Kit for the Rapid Creation, Management, and Quality Control of
Plant Genome Annotations. PLANT PHYSIOLOGY, 164(2):513–524, 2 2014.
Kyle Chard, Ian Foster, and Steven Tuecke. Globus: Research Data Management as Service and
Platform, 2017.
CyVerse. Django.
CyVerse. Atmosphere, 2017.
CyVerse. Atmosphere-Ansible, 2017.
CyVerse. Troposphere, 2017.
Jack Dongarra and Piotr Luszczek. HPC Challenge: Design, History, and Implementation Highlights.
In Jeffrey Vetter, editor, Contemporary High Performance Computing: From Petascale toward
Exascale, chapter 2, pages 13–30. Taylor and Francis, CRC Computational Science Series, Boca
Raton, FL, 2013.
Jeremy Fischer, Enis Afgan, Thomas Doak, Carrie Ganote, David Y. Hancock, and Matthew Vaughn.
Using Galaxy with Jetstream. In Galaxy Community Conference, Bloomington, IN, 2016.
Jeremy Fischer, David Y Hancock, John Michael Lowe, George Turner, Winona Snapp-Childs, and
Craig A Stewart. Jetstream: A Cloud System Enabling Learning in Higher Education Communities.
In Proceedings of the 2017 ACM Annual Conference on SIGUCCS, SIGUCCS ’17, pages 67–72,
New York, NY, USA, 2017. ACM.
Ian Foster and Dennis B. Gannon. Cloud computing for science and engineering. Massachusetts
Institute of Technology Press, 2017.
National Science Foundation. High Performance Computing System Acquisition: Continuing the
Building of a More Inclusive Computing Environment for Science and Engineering, 2014.
Genomics and Bioinformatics Service at Texas A&M. PoreCamp USA.
Globus. Globus.
Stephen A. Goff, Matthew Vaughn, Sheldon McKay, Eric Lyons, Ann E. Stapleton, Damian Gessler,
Naim Matasci, Liya Wang, Matthew Hanlon, Andrew Lenards, Andy Muir, Nirav Merchant, Sonya
Lowry, Stephen Mock, Matthew Helmke, Adam Kubach, Martha Narro, Nicole Hopkins, David
Micklos, Uwe Hilgert, Michael Gonzales, Chris Jordan, Edwin Skidmore, Rion Dooley, John
Cazes, Robert McLay, Zhenyuan Lu, Shiran Pasternak, Lars Koesterke, William H. Piel, Ruth
Grene, Christos Noutsos, Karla Gendler, Xin Feng, Chunlao Tang, Monica Lent, Seung-Jin Kim,
Kristian Kvilekval, B. S. Manjunath, Val Tannen, Alexandros Stamatakis, Michael Sanderson,
Stephen M. Welch, Karen A. Cranston, Pamela Soltis, Doug Soltis, Brian O’Meara, Cecile Ane,
Tom Brutnell, Daniel J. Kleibenstein, Jeffery W. White, James Leebens-Mack, Michael J.
Donoghue, Edgar P. Spalding, Todd J. Vision, Christopher R. Myers, David Lowenthal, Brian J.
Enquist, Brad Boyle, Ali Akoglu, Greg Andrews, Sudha Ram, Doreen Ware, Lincoln Stein, and
Dan Stanzione. The iPlant Collaborative: Cyberinfrastructure for Plant Biology. Frontiers in Plant
Science, 2:34, 7 2011.
Chris Holdgraf, Aaron Culich, Ariel Rokem, Fatma Deniz, Maryana Alegro, and Dani Ushizima.
Portable Learning Environments for Hands-On Computational Instruction. Proceedings of the
Practice and Experience in Advanced Research Computing 2017 on Sustainability, Success and
Impact – PEARC17, pages 1–9, 2017.
Jetstream. Trial Access Allocation, 2017.
John Michael Lowe, Michael Packard, and C. Bret Hammond. Jetstream Salt States.
Ruth Malan and Dana Bredemeyer. Functional Requirements and Use Cases. Technical report, 2001.
Joe Mambretti, Jim Chen, and Fei Yeh. Next Generation Clouds, the Chameleon Cloud Testbed, and
Software Defined Networking (SDN). In Proceedings of the 2015 International Conference on
Cloud Computing Research and Innovation (ICCCRI), ICCCRI ’15, pages 73–79, Washington,
DC, USA, 2015. IEEE Computer Society.
Nirav Merchant, Eric Lyons, Stephen Goff, Matthew Vaughn, Doreen Ware, David Micklos, and
Parker Antin. The iPlant Collaborative: Cyberinfrastructure for Enabling Data to Discovery for the
Life Sciences. PLOS Biology, 14(1):e1002342, 1 2016.
National Science Foundation. Cyberinfrastructure: From Supercomputing to the TeraGrid, 2006.
National Science Foundation. CISE Research Infrastructure: Mid-Scale Infrastructure – NSFCloud
(CRI: NSFCloud), 2013.
National Science Foundation. TeraGrid Phase III: eXtreme Digital Resources for Science and
Engineering (XD), 2008.
Jp Navarro, Craig A Stewart, Richard Knepper, Lee Liming, David Lifka, and Maytal Dahan. The
Community Software Repository from XSEDE: A Resource for the National Research Community.
OpenStack Foundation. Getting started with the OpenStack SDK, 2017.
OpenStack Foundation. Heat, 2017.
OpenStack Foundation. Horizon Dashboard, 2017.
OpenStack Foundation. OpenStack Clients, 2017.
OpenStack Foundation. OpenStack Roadmap, 2017.
ORCID Inc. ORCID — Connecting Research and Researchers.
Yuvi Panda and Andrea Zonca. kubeadm-bootstrap, 2017.
Project Jupyter team. Zero to JupyterHub with Kubernetes, 2017.
Paul Rad, Mehdi Roopaei, Nicole Beebe, Mehdi Shadaram, and Yoris A. Au. AI Thinking for Cloud
Education Platform with Personalized Learning. In 51st Hawaii International Conference on
System Sciences, Waikoloa Village, HI, 2018.
Red Hat, Inc. Ceph Homepage – Ceph, 2017.
N Sakimura, J Bradley, M Jones, B de Medeiros, and C Mortimore. OpenID Connect Core 1.0
incorporating errata set 1, 2014.
C.A. Stewart. Preserving Scientific Software … in a Usable Form? EDUCAUSE Review, 2016.
C.A. Stewart, V. Welch, B. Plale, G. Fox, M. Pierce, and T. Sterling. Indiana University Pervasive
Technology Institute, 2017.
Craig A Stewart, David Y. Hancock, Matthew Vaughn, Jeremy Fischer, Lee Liming, Nirav Merchant,
Therese Miller, John Michael Lowe, Daniel Stanzione, Jaymes Taylor, and Edwin Skidmore.
Jetstream – Performance, Early Experiences, and Early Results. In Proceedings of the XSEDE16
Conference, St. Louis, MO, 2016.
Craig A. Stewart, David Y. Hancock, Matthew Vaughn, Nirav C. Merchant, John Michael Lowe,
Jeremy Fischer, Lee Liming, James Taylor, Enis Afgan, George Turner, C. Bret Hammond, Edwin
Skidmore, Michael Packard, and Ian Foster. System Acceptance Report for NSF award 1445604
High Performance Computing System Acquisition: Jetstream – A Self-Provisioned, Scalable
Science and Engineering Cloud Environment. Technical report, Indiana University, Bloomington,
IN, 2016.
Craig A Stewart, R Knepper, Andrew Grimshaw, Ian Foster, Felix Bachmann, D Lifka, Morris Riedel,
and Steven Tuecke. Campus Bridging Use Case Quality Attribute Scenarios. Technical report,
2012.
Craig A. Stewart, Richard Knepper, Andrew Grimshaw, Ian Foster, Felix Bachmann, David Lifka,
Morris Riedel, and Steven Tuecke. XSEDE Campus Bridging Use Cases. Technical report, 2012.
Craig A. Stewart, Richard Knepper, Matthew R Link, Marlon Pierce, Eric Wernert, and Nancy
Wilkins-Diehr. Cyberinfrastructure, Cloud Computing, Science Gateways, Visualization, and
Cyberinfrastructure Ease of Use. In Mehdi Khosrow-Pour, editor, Encyclopedia of Information
Science and Technology. IGI Global, Hershey, PA, fourth edition, 2018.
University of Texas at Austin, Texas Advanced Computing Center, 2017.
The OpenStack Foundation. OpenStack, 2017.
John Towns, Timothy Cockerill, Maytal Dahan, Ian Foster, Kelly Gaither, Andrew Grimshaw, Victor
Hazlewood, Scott Lathrop, Dave Lifka, Gregory D. Peterson, Ralph Roskies, J. Ray Scott, and
Nancy Wilkins-Diehr. XSEDE: Accelerating Scientific Discovery. Computing in Science &
Engineering, 16(5):62–74, 9 2014.
Steven Tuecke, Rachana Ananthakrishnan, Kyle Chard, Mattias Lidman, Brendan McCollam, Stephen
Rosen, and Ian Foster. Globus Auth: A Research Identity and Access Management Platform. In
IEEE 12th International Conference on eScience, Baltimore, Maryland, 2016.
Ubuntu. Cloud-Init, 2017.
Viswanath Venkatesh, Michael G Morris, Gordon B Davis, and Fred D Davis. User Acceptance of
Information Technology: Toward a Unified View. MIS Quarterly, 27(3):425–478, 2003.
Gregor von Laszewski, Geoffrey C. Fox, Fugang Wang, Andrew J. Younge, Archit Kulshrestha,
Gregory G. Pike, Warren Smith, Jens Vöckler, Renato J. Figueiredo, Jose Fortes, and Kate
Keahey. Design of the Futuregrid experiment management framework. In 2010 Gateway
Computing Environments Workshop, GCE 2010, 2010.
XSEDE. XSEDE Education Allocations, 2017.
XSEDE. XSEDE Research Allocations, 2017.
XSEDE. XSEDE Startup Allocations, 2017.
AVBP website at CERFACS. http://www.cerfacs.fr/avbp7x/. Accessed: 2017-07-28.
Chroma github repository. https://jeffersonlab.github.io/chroma/. Accessed: 2017-07-28.
CMS software github repository. https://github.com/cms-sw/cmssw. Accessed: 2017-07-28.
DEEP-ER project website. http://www.deep-er.eu. Accessed: 2017-07-16.
DEEP project website. http://www.deep-project.eu. Accessed: 2017-07-16.
DEEP prototype website. http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/DEEP/DEEP_node.html. Accessed: 2017-07-16.
EXTOLL GmbH website. http://www.extoll.de. Accessed: 2017-07-23.
EXTOLL Tourmalet. http://extoll.de/products/tourmalet. Accessed: 2017-07-23.
Forschungszentrum Jülich website. http://www.fz-juelich.de/en. Accessed: 2017-07-09.
Full Wave Inversion (FWI) code in DEEP-ER. http://www.deep-projects.eu/applications/project-applications/enhancing-oil-exploration.html. Accessed: 2017-07-28.
GROMACS application website. http://www.gromacs.org/. Accessed: 2017-07-28.
Helmholtz Association website. https://www.helmholtz.de/en/. Accessed: 2017-07-09.
High-Q club website. http://www.fz-juelich.de/ias/jsc/EN/Expertise/High-Q-Club/_node.html. Accessed: 2017-08-10.
Human Brain Project pilot systems website. http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/HBPPilots/_node.html. Accessed: 2017-07-16.
JSC Simlabs website. http://www.fz-juelich.de/ias/jsc/EN/Expertise/SimLab/simlab_node.html. Accessed: 2017-08-07.
Jülich Supercomputing Centre website. http://www.fz-juelich.de/ias/jsc/EN. Accessed: 2017-07-09.
JUST: Jülich Storage Cluster website. http://www.fz-juelich.de/ias/jsc/EN/Expertise/Datamanagement/OnlineStorage/JUST/JUST_node.html. Accessed: 2017-07-16.
NEST code website. www.nest-simulator.org. Accessed: 2017-07-28.
OpenHMC. http://www.uni-heidelberg.de/openhmc. Accessed: 2017-07-26.
ParaStation V5 website. http://www.par-tec.com/products/parastationv5.html. Accessed: 2017-07-26.
PRACE website. http://www.prace-ri.eu. Accessed: 2017-08-07.
QPACE3 website. http://www.fz-juelich.de/ias/jsc/EN/Expertise/Supercomputers/QPACE3/_node.html. Accessed: 2017-07-16.
Reverse time migration (RTM) code website. http://www.cgg.com/en/What-We-Do/Subsurface-Imaging/Migration/Reverse-Time-Migration. Accessed: 2017-07-28.
SeisSol application website. http://www.seissol.org/. Accessed: 2017-07-28.
SKA data analysis pipeline in DEEP-ER. http://www.deep-projects.eu/applications/project-applications/radio-astronomy.html. Accessed: 2017-07-28.
Top 500 list. https://www.top500.org/lists/. Accessed: 2017-06-26.
3M. Novec. http://multimedia.3m.com/mws/media/569865O/3mtm-novectm-649-engineered-fluid.pdf?&fn=Novec649_6003926.pdf.
Adaptive Computing. Maui. http://www.adaptivecomputing.com/products/open-source/maui/.
Adaptive Computing. TORQUE Resource Manager. http://www.adaptivecomputing.com/products/open-source/torque/.
Krste Asanovic, Ras Bodik, Bryan Christopher Catanzaro, Joseph James Gebis, Parry Husbands, Kurt
Keutzer, David A. Patterson, William Lester Plishker, John Shalf, Samuel Webb Williams, and
Katherine A. Yelick. The landscape of parallel computing research: A view from Berkeley.
Technical Report UCB/EECS-2006-183, EECS Department, University of California, Berkeley,
Dec 2006.
Norbert Attig, Florian Berberich, Ulrich Detert, Norbert Eicker, Thomas Eickermann, Paul Gibbon,
Wolfgang Gürich, Wilhelm Homberg, Antonia Illich, Sebastian Rinke, Michael Stephan, Klaus
Wolkersdorfer, and Thomas Lippert. Entering the Petaflop-Era – New Developments in
Supercomputing. In NIC Symposium 2010 / ed.: G. Münster, D. Wolf, M. Kremer, Jülich,
Forschungszentrum Jülich, IAS Series Vol. 3. – 978-3-89336-606-4. – S. 1 – 12, 2010. Record
converted from VDB: 12.11.2012.
Dirk Brömmel, Ulrich Detert, Stephan Graf, Thomas Lippert, Boris Orth, Dirk Pleiter, Michael
Stephan, and Estela Suarez. Paving the Road towards Pre-Exascale Supercomputing. In NIC
Symposium 2014 – Proceedings, volume 47 of NIC Series, pages 1–14, Jülich, Feb 2014. NIC
Symposium 2014, Jülich (Germany), 12 Feb 2014 – 13 Feb 2014, John von Neumann Institute for
Computing.
Ulrich Bruening, Mondrian Nuessle, Dirk Frey, and Hans-Christian Hoppe. An Immersive Cooled
Implementation of a DEEP Booster, pages 30–36. Intel Corporation, Munich, 2015.
Michalis Christou, Theodoros Christoudias, Julian Morillo, Damian Alvarez, and Hendrik Merx. Earth
system modelling on system-level heterogeneous architectures: EMAC (version 2.42) on the
Dynamical Exascale Entry Platform (DEEP). Geoscientific model development, 9(9):3483 – 3491,
2016.
Alejandro Duran, Eduard Ayguadé, Rosa M. Badia, Jesús Labarta, Luis Martinell, Xavier Martorell,
and Judit Planas. OmpSs: A proposal for programming heterogeneous multi-core architectures.
Parallel Processing Letters, 21(02):173–193, 2011.
Norbert Eicker, Andreas Galonska, Jens Hauke, and Mondrian Nüssle. Bridging the DEEP Gap –
Implementation of an Efficient Forwarding Protocol, pages 34–41. Intel Corporation, Munich,
2014.
Norbert Eicker and Thomas Lippert. An accelerated Cluster-Architecture for the Exascale. In PARS
’11, PARS-Mitteilungen, Mitteilungen – Gesellschaft für Informatik e.V., Parallel-Algorithmen
und Rechnerstrukturen, ISSN 0177-0454, Nr. 28, Oktober 2011 (Workshop 2011), 110 – 119,
2011. Record converted from VDB: 12.11.2012.
Norbert Eicker, Thomas Lippert, Thomas Moschny, and Estela Suarez. The DEEP Project – An
alternative approach to heterogeneous cluster-computing in the manycore era. Concurrency and
computation, 28(8):2394–2411, 2016.
Andrew Emerson and Fabio Affinito. Enabling a Quantum Monte Carlo application for the DEEP
architecture. In 2015 International Conference on High Performance Computing Simulation
(HPCS), pages 453–457, July 2015.
Fraunhofer Gesellschaft. BeeGFS website.
Fraunhofer Gesellschaft. BeeOND: BeeGFS On Demand website.
Jens Freche, Wolfgang Frings, and Godehard Sutmann. High Throughput Parallel-I/O using SIONlib
for Mesoscopic Particle Dynamics Simulations on Massively Parallel Computers. In Parallel
Computing: From Multicores and GPU’s to Petascale, / ed.: B. Chapman, F. Desprez, G.R.
Joubert, A. Lichnewsky, F. Peters and T. Priol, Amsterdam, IOS Press, 2010. Advances in Parallel
Computing Volume 19. – 978-1-60750-529-7. – S. 371 – 378, 2010. Record converted from VDB:
12.11.2012.
Wolfgang Frings, Felix Wolf, and Ventsislav Petkov. Scalable Massively Parallel I/O to Task-Local
Files. In Proceedings of the Conference on High Performance Computing Networking, Storage and
Analysis, Portland, Oregon, November 14 – 20, 2009, SC’09, SESSION: Technical papers, Article
No. 17, New York, ACM, 2009. ISBN 978-1-60558-744-8. – S. 1 – 11, 2009. Record converted
from VDB: 12.11.2012.
Markus Götz, Christian Bodenstein, and Morris Riedel. HPDBSCAN – Highly parallel DBSCAN. In
Proceedings of the Workshop on Machine Learning in High-Performance Computing
Environments – MLHPC ’15, page 2. Workshop on Machine Learning in High-
Performance Computing Environments, subworkshop to Supercomputing 2015, Austin (Texas), 15
Nov 2015 – 15 Nov 2015, ACM Press New York, New York, USA, Nov 2015.
Dorian Krause and Philipp Thörnig. JURECA: General-purpose supercomputer at Jülich
Supercomputing Centre. Journal of large-scale research facilities, 2:A62, 2016.
Anke Kreuzer, Jorge Amaya, Norbert Eicker, Raphaël Léger, and Estela Suarez. The DEEP-ER
project: I/O and resiliency extensions for the Cluster-Booster architecture. In Proceedings of the
20th International Conference on High Performance Computing and Communications (HPCC),
Exeter, UK, 2018. IEEE Computer Society Press. (accepted for publication).
Anke Kreuzer, Jorge Amaya, Norbert Eicker, and Estela Suarez. Application performance on a
Cluster-Booster system. In Proceedings of the 2018 IEEE International Parallel and Distributed
Processing Symposium (IPDPS) Workshops Proceedings (HCW), IPDPS Conference, Vancouver,
Canada, 2018. (accepted for publication).
Pramod Kumbhar, Michael Hines, Aleksandr Ovcharenko, Damian Alvarez, James King, Florentino
Sainz, Felix Schürmann, and Fabien Delalondre. Leveraging a Cluster-Booster Architecture for
Brain-Scale Simulations. In Proceedings of the 31st International Conference High Performance
Computing, volume 9697 of Lecture Notes in Computer Science, pages 363 – 380, Cham, Jun
2016. 31st International Conference High Performance Computing, Frankfurt (Germany), 19 Jun
2016 – 23 Jun 2016, Springer International Publishing.
Raphaël Léger, Damian Alvarez Mallon, Alejandro Duran, and Stephane Lanteri. Adapting a Finite-
Element Type Solver for Bioelectromagnetics to the DEEP-ER Platform. In Parallel Computing:
On the Road to Exascale, volume 27 of Advances in Parallel Computing, pages 349 – 359.
International Conference on Parallel Computing 2015, Edinburgh (UK), 1 Sep 2015 – 4 Sep 2015,
IOS Press Ebooks, Sep 2016.
Thomas Lippert. Recent Developments in Supercomputing, volume 39 of NIC series, pages 1–8. John
von Neumann Institute for Computing, Jülich, 2008. Record converted from VDB: 12.11.2012.
Programming Models @ BSC. The OmpSs Programming Model, 2013.
SchedMD. SLURM website. https://slurm.schedmd.com/.
Michael Stephan and Jutta Docter. JUQUEEN: IBM Blue Gene/Q Supercomputer System at the Jülich
Supercomputing Centre. Journal of large-scale research facilities, 1:A1, 2015.
Estela Suarez, Norbert Eicker, and Thomas Lippert. Supercomputing Evolution at JSC. volume 49 of
Publication Series of the John von Neumann Institute for Computing (NIC) NIC Series, pages 1 –
12, Jülich, Feb 2018. NIC Symposium 2018, Jülich (Germany), 22 Feb 2018 – 23 Feb 2018, John
von Neumann Institute for Computing.
Anna Wolf, Anke Zitz, Norbert Eicker, and Giovanni Lapenta. Particle-in-Cell algorithms on
DEEP: The iPiC3D case study. volume 32, pages 38–48, Erlangen, May 2015. PARS 15, Potsdam
(Germany), 7 May 2015 – 8 May 2015, PARS.
IBM: Tivoli workload scheduler LoadLeveler, 2015.
American Society of Heating, Refrigerating and Air-Conditioning Engineers, 2016.
Applications Software at LRZ, 2017.
Axel Auweter, Arndt Bode, Matthias Brehm, Luigi Brochard, Nicolay Hammer, Herbert Huber, Raj
Panda, Francois Thomas, and Torsten Wilde. A Case Study of Energy Aware Scheduling on
SuperMUC. In Proceedings of the 29th International Conference on Supercomputing – Volume
8488, ISC 2014, pages 394–409, New York, NY, USA, 2014. Springer-Verlag New York, Inc.
Natalie Bates, Girish Ghatikar, Ghaleb Abdulla, Gregory A Koenig, Sridutt Bhalachandra, Mehdi
Sheikhalishahi, Tapasya Patki, Barry Rountree, and Stephen Poole. Electrical grid and
supercomputing centers: An investigative analysis of emerging opportunities and challenges.
Informatik-Spektrum, 38(2):111–127, 2015.
Bavarian Academy of Sciences and Humanities, 2017.
Alexander Breuer, Alexander Heinecke, Sebastian Rettenberger, Michael Bader, Alice-Agnes Gabriel,
and Christian Pelties. Sustained petascale performance of seismic simulations with SeisSol on
SuperMUC. In International Supercomputing Conference, pages 1–18. Springer, 2014.
GCS: Delivering 10 Years of Integrated HPC Excellence for Germany, Spring 2017.
Gauss Centre for Supercomputing, 2016.
Green500, 2016.
Carla Guillen, Wolfram Hesse, and Matthias Brehm. The PerSyst monitoring tool. In European
Conference on Parallel Processing, pages 363–374. Springer, 2014.
Carla Guillen, Carmen Navarrete, David Brayford, Wolfram Hesse, and Matthias Brehm. DVFS
automatic tuning plugin for energy related tuning objectives. In Green High Performance
Computing (ICGHPC), 2016 2nd International Conference, pages 1–8. IEEE, 2016.
Nicolay Hammer, Ferdinand Jamitzky, Helmut Satzger, Momme Allalen, Alexander Block, Anupam
Karmakar, Matthias Brehm, Reinhold Bader, Luigi Iapichino, Antonio Ragagnin, et al. Extreme
scale-out SuperMUC Phase 2 - lessons learned. arXiv preprint arXiv:1609.01507, 2016.
John L Hennessy and David A Patterson. Computer architecture: a quantitative approach. Elsevier,
2012.
HLRS High Performance Computing Center Stuttgart, 2017.
IBM, 2016.
Intel. Intel Xeon Processor E5 v3 Product Family. Processor Specification Update, August 2015.
Intel, 2016.
Jülich Supercomputing Centre (JSC), 2017.
Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities, 2017.
Lenovo, 2016.
Magneticum: Simulating Large Scale Structure Formation in the Universe, 2014.
S. Wagner, A. Bode, H. Brüchle, and M. Brehm. Extreme Scale-out on SuperMUC Phase 2. 2016.
ISBN: 978-3-9816675-1-6.
Magnus Schwörer, Konstantin Lorenzen, Gerald Mathias, and Paul Tavan. Utilizing fast multipole
expansions for efficient and accurate quantum-classical molecular dynamics simulations. The
Journal of chemical physics, 142(10):03B608_1, 2015.
Hayk Shoukourian. Adviser for Energy Consumption Management: Green Energy Conservation. PhD
thesis, München, Technische Universität München (TUM), 2015.
Hayk Shoukourian, Torsten Wilde, Axel Auweter, and Arndt Bode. Monitoring Power Data: A first
step towards a unified energy efficiency evaluation toolset for HPC data centers. Elsevier, 2013.
Hayk Shoukourian, Torsten Wilde, Axel Auweter, and Arndt Bode. Predicting the Energy and Power
Consumption of Strong and Weak Scaling HPC Applications. Supercomputing Frontiers and
Innovations, 1(2):20–41, 2014.
Hayk Shoukourian, Torsten Wilde, Axel Auweter, Arndt Bode, and Daniele Tafani. Predicting Energy
Consumption Relevant Indicators of Strong Scaling HPC Applications for Different Compute
Resource Configurations. To appear in the proceedings of the 23rd High Performance Computing
Symposium, Society for Modeling and Simulation International (SCS), 2015.
Hayk Shoukourian, Torsten Wilde, Herbert Huber, and Arndt Bode. Analysis of the efficiency
characteristics of the first high-temperature direct liquid cooled petascale supercomputer and its
cooling infrastructure. Journal of Parallel and Distributed Computing, 107:87 – 100, 2017.
Hayk Shoukourian, Torsten Wilde, Detlef Labrenz, and Arndt Bode. Using machine learning for data
center cooling infrastructure efficiency prediction. In Parallel and Distributed Processing
Symposium Workshops (IPDPSW), 2017 IEEE International, pages 954–963. IEEE, 2017.
The SIMOPEK Project. http://simopek.de/, 2016.
Top500, 2017.
T. Wilde, M. Ott, A. Auweter, I. Meijer, P. Ruch, M. Hilger, S. Kühnert, and H. Huber. CooLMUC-2: A
supercomputing cluster with heat recovery for adsorption cooling. In 2017 33rd Thermal
Measurement, Modeling Management Symposium (SEMITHERM), pages 115–121, March 2017.
Torsten Wilde, Axel Auweter, and Hayk Shoukourian. The 4 pillar framework for energy efficient HPC
data centers. Computer Science – Research and Development, pages 1–11, 2013.
World’s Largest Simulation of Supersonic, Compressible Turbulence, 2013.
APEX Benchmarks. https://www.nersc.gov/research-and-development/apex/apex-benchmarks/.
Intel VTune Amplifier. https://software.intel.com/en-us/intel-vtune-amplifier-xe.
Intel® Advisor. https://software.intel.com/en-us/intel-advisor-xe.
Libfabric OpenFabrics. https://ofiwg.github.io/libfabric.
MPICH. http://www.mpich.org.
NERSC-8 Benchmarks. https://www.nersc.gov/users/computational-systems/cori/nersc-8-procurement/trinity-nersc-8-rfp/nersc-8-trinity-benchmarks/.
NERSC Cori System. https://www.nersc.gov/users/computational-systems/cori.
NERSC Edison System. https://www.nersc.gov/users/computational-systems/edison.
NESAP. http://www.nersc.gov/users/computational-systems/cori/nesap/nesap-projects.
NESAP Application Case Studies. http://www.nersc.gov/users/computational-systems/cori/application-porting-and-performance/application-case-studies/.
NESAP Projects. http://www.nersc.gov/users/computational-systems/cori/nesap.
NESAP Xeon Phi Application Performance. http://www.nersc.gov/users/application-performance/preparing-for-cori/.
Quantum ESPRESSO Case Study. http://www.nersc.gov/users/computational-systems/cori/application-porting-and-performance/application-case-studies/quantum-espresso-exact-exchange-case-study/.
Roofline Performance Model. http://crd.lbl.gov/departments/computerscience/PAR/research/roofline.
SDE: Intel Software Development Emulator. https://software.intel.com/en-us/articles/intel-software-development-emulator.
Tips for Using CMake and GNU Autotools on Cray Heterogeneous Systems. http://docs.cray.com/books/S-2801-1608//S-2801-1608.pdf.
Taylor Barnes, Brandon Cook, Douglas Doerfler, Brian Friesen, Yun He, Thorsten Kurth, Tuomas
Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, and et al. Evaluating and Optimizing the
NERSC Workload on Knights Landing. Jan 2016.
Taylor A. Barnes, Thorsten Kurth, Pierre Carrier, Nathan Wichmann, David Prendergast, Paul R.C.
Kent, and Jack Deslippe. Improved treatment of exact exchange in Quantum ESPRESSO.
Computer Physics Communications, 214:52 – 58, 2017.
W. Bhimji, D. Bard, K. Burleigh, C. Daley, S. Farrell, M. Fasel, B. Friesen, L. Gerhardt, J. Liu, P.
Nugent, D. Paul, J. Porter, and V. Tsulaia. Extreme I/O on HPC for HEP using the burst buffer at
NERSC. Computing in High-Energy Physics, 2016.
W. Bhimji, D. Bard, M. Romanus, D. Paul, A. Ovsyannikov, B. Friesen, M. Bryson, J. Correa, G.K.
Lockwood, V. Tsulaia, S. Byna, S. Farrell, D. Gursoy, C. Daley, V. Beckner, B. Van Straalen, D.
Trebotich, C. Tull, G.H. Weber, N.J. Wright, K. Antypas, and Prabhat. Accelerating science with
the NERSC burst buffer. Cray User Group, 2016.
S. Binder, A. Calci, E. Epelbaum, R. J. Furnstahl, J. Golak, K. Hebeler, H. Kamada, H. Krebs, J.
Langhammer, S. Liebig, P. Maris, U.-G. Meißner, D. Minossi, A. Nogga, H. Potter, R. Roth, R.
Skibiński, K. Topolnicki, J. P. Vary, and H. Witała. Few-nucleon systems with state-of-the-art chiral
nucleon-nucleon forces. Phys. Rev. C, 93(4):044002, 2016.
R.S. Canon, T. Declerck, B. Draney, J. Lee, D. Paul, and D. Skinner. Enabling a superfacility with
software defined networking. Cray User Group, 2017.
Brandon Cook, Pieter Maris, Meiyue Shao, Nathan Wichmann, Marcus Wagner, John O’Neill, Thanh
Phung, and Gaurav Bansal. High performance optimizations for nuclear physics code MFDn on KNL.
In International Conference on High Performance Computing, pages 366–377. Springer, 2016.
Jack Deslippe, Georgy Samsonidze, David A. Strubbe, Manish Jain, Marvin L. Cohen, and Steven G.
Louie. BerkeleyGW: A massively parallel computer package for the calculation of the quasiparticle
and optical properties of materials and nanostructures. Computer Physics Communications,
183(6):1269 – 1289, 2012.
Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth,
Mathieu Lobet, Tareq Malas, Jean-Luc Vay, and Henri Vincenti. Applying the Roofline
Performance Model to the Intel Xeon Phi Knights Landing Processor, pages 339–353. Springer
International Publishing, Cham, 2016.
P. Hill, C. Snyder, and J. Sygulla. KNL system software. Cray User Group, 2017.
D.M. Jacobsen. Extending CLE6 to a multicomputer OS. Cray User Group, 2017.
M. Jette, D.M. Jacobsen, and D. Paul. Scheduler optimization for current generation Cray systems.
Cray User Group, 2017.
William TC Kramer, John M Shalf, and Erich Strohmaier. The sustained system performance (SSP)
benchmark.
Thorsten Kurth, William Arndt, Taylor Barnes, Brandon Cook, Jack Deslippe, Doug Doerfler, Brian
Friesen, Yun He, Tuomas Koskela, Mathieu Lobet, Tareq Malas, Leonid Oliker, Andrey
Ovsyannikov, Samuel Williams, Woo-Sun Yang, and Zhengji Zhao. Analyzing Performance of
Selected Applications on the Cori HPC System. Jun 2017. Accepted for IXPUG Workshop
Experiences on Intel Knights Landing at the One Year Mark, ISC 2017, Frankfurt, Germany.
M. Melara, T. Gamblin, G. Becker, R. French, M. Belhorn, K. Thompson, P. Scheibel, and R.
Hartman-Baker. Using Spack to manage software on Cray supercomputers. In Proceedings of Cray User Group,
2017.
P. Maris, M. A. Caprio, and J. P. Vary. Emergence of rotational bands in ab initio no-core
configuration interaction calculations of the Be isotopes. Phys. Rev. C, 91(1):014310, 2015.
P. Maris, J. P. Vary, P. Navratil, W. E. Ormand, H. Nam, and D. J. Dean. Origin of the anomalous
long lifetime of 14C. Phys. Rev. Lett., 106(20):202502, 2011.
Pieter Maris, James P. Vary, S. Gandolfi, J. Carlson, and Steven C. Pieper. Properties of trapped
neutrons interacting with realistic nuclear Hamiltonians. Phys. Rev. C, 87(5):054318, 2013.
A. Ovsyannikov, M. Romanus, B. Van Straalen, G. Weber, and D. Trebotich. Scientific workflows at
DataWarp-speed: Accelerated data-intensive science using NERSC’s burst buffer. IEEE, 2016.
Meiyue Shao, Hasan Metin Aktulga, Chao Yang, Esmond G Ng, Pieter Maris, and James P Vary.
Accelerating nuclear configuration interaction calculations through a preconditioned block iterative
eigensolver. arXiv preprint arXiv:1609.01689, 2016.
Samuel Williams, Andrew Waterman, and David Patterson. Roofline: An insightful visual
performance model for multicore architectures. Commun. ACM, 52(4):65–76, April 2009.
Samuel Webb Williams. Auto-tuning Performance on Multicore Computers. PhD thesis, Berkeley,
CA, USA, 2008. AAI3353349.
A list of Top50 most powerful supercomputers in Russia and CIS. http://top50.supercomputers.ru.
Moscow University Supercomputing Center. http://hpc.msu.ru.
Octoshell source code. https://github.com/octoshell/octoshell-v2.
Octotron framework source code. https://github.com/srcc-msu/octotron.
Open Encyclopedia of Parallel Algorithmic Features. http://algowiki-project.org.
Slurm — cluster management and job scheduling system. https://slurm.schedmd.com.
xCAT. http://xcat.org/.
A. Antonov, D. Nikitenko, P. Shvets, S. Sobolev, K. Stefanov, Vad. Voevodin, Vl. Voevodin, and S.
Zhumatiy. An approach for ensuring reliable functioning of a supercomputer based on a formal
model. In Parallel Processing and Applied Mathematics. 11th International Conference, PPAM
2015, Krakow, Poland, September 6–9, 2015. Revised Selected Papers, Part I, volume 9573 of
Lecture Notes in Computer Science, pages 12–22. Springer International Publishing, 2016.
A. Antonov, V. Voevodin, and J. Dongarra. AlgoWiki: an Open encyclopedia of parallel algorithmic
features. Supercomputing Frontiers and Innovations, 2(1):4–18, 2015.
A. Brechalov. Moscow State University Provides a Facility That Meets HPC Demands. Uptime
Institute Journal, 6:50, 2016.
B. Mohr, E. Hagersten, J. Gimenez, A. Knupfer, D. Nikitenko, M. Nilsson, H. Servat, A. Shah, Vl.
Voevodin, F. Winkler, F. Wolf, and I. Zhukov. The HOPSA Workflow and Tools. In Proceedings
of the 6th International Parallel Tools Workshop, Stuttgart, 2012, volume 11, pages 127–146.
Springer, 2012.
D.A. Nikitenko, Vad.V. Voevodin, and S.A. Zhumatiy. Octoshell: Large supercomputer complex
administration system. In Proceedings of the 1st Russian Conference on Supercomputing —
Supercomputing Days 2015, volume 1482 of CEUR Workshop Proceedings, pages 69–83, 2015.
D.A. Nikitenko, S.A. Zhumatiy, and P.A. Shvets. Making Large-Scale Systems Observable —
Another Inescapable Step Towards Exascale. Supercomputing Frontiers and Innovations, 3(2):72–
79, 2016.
V. Sadovnichy, A. Tikhonravov, Vl Voevodin, and V. Opanasenko. Lomonosov: Supercomputing at
Moscow State University. In Contemporary High Performance Computing: From Petascale
toward Exascale, Chapman & Hall/CRC Computational Science, pages 283–307, Boca Raton,
United States, 2013.
K.S. Stefanov, Vl.V. Voevodin, S.A. Zhumatiy, and Vad.V. Voevodin. Dynamically Re-configurable
Distributed Modular Monitoring System for Supercomputers (DiMMon). volume 66 of Procedia
Computer Science, pages 625–634. Elsevier B.V., 2015.
Vl.V. Voevodin, Vad.V. Voevodin, D.I. Shaikhislamov, and D.A. Nikitenko. Data mining method for
anomaly detection in the supercomputer task flow. In Numerical Computations: Theory and
Algorithms, The 2nd International Conference and Summer School, Pizzo calabro, Italy, June 20–
24, 2016, volume 1776 of AIP Conference Proceedings, 2016.
Estimating the Circulation and Climate of the Ocean Consortium, Phase II (ECCO2). Website.
http://ecco.jpl.nasa.gov/.
D. Ellsworth, C. Henze, and B. Nelson. Interactive Visualization of High-Dimensional Petascale
Ocean Data. 2017 IEEE 7th Symposium on Large Data Analysis and Visualization (LDAV).
Phoenix, AZ, 2017.
The Enzo Project. Website. http://enzo-project.org/.
FUN3D: Fully Unstructured Navier-Stokes. Website. http://fun3d.larc.nasa.gov/.
The GEOS-5 System. Website. http://gmao.gsfc.nasa.gov/systems/geos5/.
HECC Storage Resources. Website. https://www.nas.nasa.gov/hecc/resources/storage_systems.html.
High Performance Conjugate Gradients, November 2016. Website. http://www.hpcg-benchmark.org/custom/index.html?lid=155&slid=289.
hyperwall Visualization System. Website. https://www.nas.nasa.gov/hecc/resources/viz_systems.html.
Massachusetts Institute of Technology General Circulation Model (mitgcm). Website.
http://mitgcm.org/.
Merope Supercomputer. Website. https://www.nas.nasa.gov/hecc/resources/merope.html.
nu-WRF: NASA-Unified Weather Research and Forecasting (nu-WRF). Website.
https://modelingguru.nasa.gov/community/atmospheric/nuwrf.
OVERFLOW Computational Fluid Dynamics (CFD) flow solver. Website.
https://overflow.larc.nasa.gov/.
Pleiades Supercomputer. Website. https://www.nas.nasa.gov/hecc/resources/pleiades.html.
TOP500 – November 2016. Website. https://www.top500.org/lists/2016/11/.
USM3D NASA Common Research Model (USM3D). Website.
https://commonresearchmodel.larc.nasa.gov/computational-approach/flow-solvers-used/usm3d/.
Causal web. https://ccd2.vm.bridges.psc.edu/ccd/login.
Causal web application quick start and user guide.
http://www.ccd.pitt.edu/wiki/index.php?title=Causal_Web_Application_Quick_Start_and_User_Guide.
Frederick Jelinek Memorial Summer Workshop. https://www.lti.cs.cmu.edu/frederick-jelinek-memorial-summer-workshop-closing-day-schedule.
Galaxy Main. https://usegalaxy.org.
Galaxy Project Stats. https://galaxyproject.org/galaxy-project/statistics/#usegalaxyorg-usage.
MIDAS MISSION Public Health Hackathon – Visualizing the future of public health. https://midas-publichealth-hack-3336.devpost.com.
Openstack bare metal provisioning program. https://wiki.openstack.org/wiki/Ironic.
Science gateways listing. https://www.xsede.org/gateways-listing.
The GDELT Project. https://www.gdeltproject.org.
Serafim Batzoglou. Algorithmic challenges in mammalian whole-genome assembly. In Encyclopedia
of Genetics, Genomics, Proteomics and Bioinformatics. American Cancer Society, 2005.
Noam Brown and Tuomas Sandholm. Safe and Nested Subgame Solving for Imperfect-Information
Games. In I Guyon, U V Luxburg, S Bengio, H Wallach, R Fergus, S Vishwanathan, and R
Garnett, editors, Advances in Neural Information Processing Systems 30, pages 689–699, Long Beach, California, 2017.
Curran Associates, Inc.
Noam Brown and Tuomas Sandholm. Superhuman AI for heads-up no-limit poker: Libratus beats top
professionals. Science, 2017.
Gregory A. Cary, R. Andrew Cameron, and Veronica F. Hinman. EchinoBase: Tools for Echinoderm
Genome Analyses. In Eukaryotic Genomic Databases, Methods in Molecular Biology, pages 349–
369. Humana Press, New York, NY, 2018.
Uma R. Chandran, Olga P. Medvedeva, M. Michael Barmada, Philip D. Blood, Anish Chakka,
Soumya Luthra, Antonio Ferreira, Kim F. Wong, Adrian V. Lee, Zhihui Zhang, Robert Budden, J.
Ray Scott, Annerose Berndt, Jeremy M. Berg, and Rebecca S. Jacobson. TCGA Expedition: A Data
Acquisition and Management System for TCGA Data. PLOS ONE, 11(10):e0165395, October
2016.
Chris Dyer and Phil Blunsom. On the State of the Art of Evaluation in Neural Language Models.
pages 1–10, 2018.
Andre Esteva, Brett Kuprel, Roberto A Novoa, Justin Ko, Susan M Swetter, Helen M Blau, and
Sebastian Thrun. Dermatologist-level classification of skin cancer with deep neural networks.
Nature Publishing Group, 2017.
O.S. Foundation. The Crossroads of Cloud and HPC: OpenStack for Scientific Research: Exploring
OpenStack Cloud Computing for Scientific Workloads. CreateSpace Independent Publishing
Platform, 2016.
Timothy Gushanas. NASA Twins Study Investigators to Release Integrated Paper in 2018. 2018.
Jo Handelsman. Metagenomics: Application of Genomics to Uncultured Microorganisms.
Microbiology and Molecular Biology Reviews, 68(4):669–685, December 2004.
David E Hudak, Douglas Johnson, Jeremy Nicklas, Eric Franz, Brian McMichael, and Basil Gohar.
Open OnDemand: Transforming Computational Science Through Omni-disciplinary Software
Cyberinfrastructure. In Proceedings of the XSEDE16 Conference on Diversity, Big Data, and
Science at Scale, pages 1–7, Miami, USA, 2016. ACM.
Morris A. Jette, Andy B. Yoo, and Mark Grondona. Slurm: Simple linux utility for resource
management. In In Lecture Notes in Computer Science: Proceedings of Job Scheduling Strategies
for Parallel Processing (JSSPP) 2003, pages 44–60. Springer-Verlag, 2002.
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet Classification with Deep
Convolutional Neural Networks, pages 1097–1105. Curran Associates, Inc., 2012.
Gregory M. Kurtzer, Vanessa Sochat, and Michael W. Bauer. Singularity: Scientific containers for
mobility of compute. PLOS ONE, 12(5):1–20, 05 2017.
Katherine A Lawrence, Michael Zentner, Nancy Wilkins-Diehr, Julie A Wernert, Marlon Pierce,
Suresh Marru, and Scott Michael. Science gateways today and tomorrow: positive perspectives of
nearly 5000 members of the research community. Concurrency and Computation: Practice and
Experience, 27(16):4252–4268, 2015.
Charng-Da Lu, James Browne, Robert L. DeLeon, John Hammond, William Barth, Thomas R.
Furlani, Steven M. Gallo, Matthew D. Jones, and Abani K. Patra. Comprehensive job level
resource usage measurement and analysis for XSEDE HPC systems. Proceedings of the
Conference on Extreme Science and Engineering Discovery Environment Gateway to Discovery –
XSEDE ’13, page 1, 2013.
Mario Lucic, Karol Kurach, Marcin Michalski, Sylvain Gelly, and Olivier Bousquet. Are GANs
Created Equal? A Large-Scale Study. arXiv:1711.10337 [cs, stat], November 2017. arXiv:
1711.10337.
Herman L. Mays, Chih-Ming Hung, Pei-Jen Shaner, James Denvir, Megan Justice, Shang-Fang Yang,
Terri L. Roth, David A. Oehler, Jun Fan, Swanthana Rekulapally, and Donald A. Primerano.
Genomic Analysis of Demographic History and Ecological Niche Modeling in the Endangered
Sumatran Rhinoceros Dicerorhinus sumatrensis. Current Biology, 28(1):70–76.e4, January 2018.
Dirk Merkel. Docker: lightweight Linux containers for consistent development and deployment. Linux
Journal, 2014(239), 2014.
Paul Nowoczynski, Jason Sommerfield, Jared Yanovich, J. Ray Scott, Zhihui Zhang, and Michael
Levine. The data supercell. In Proceedings of the 1st Conference of the Extreme Science and
Engineering Discovery Environment: Bridging from the eXtreme to the Campus and Beyond,
XSEDE ’12, pages 13:1–13:11, New York, NY, USA, 2012. ACM.
Nicholas A. Nystrom. Bridges virtual tour. https://psc.edu/bvt.
Nicholas A. Nystrom, Michael J. Levine, Ralph Z. Roskies, and J. Ray Scott. Bridges: A uniquely
flexible hpc resource for new communities and data analytics. In Proceedings of the 2015 XSEDE
Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure, XSEDE ’15,
pages 30:1–30:8, New York, NY, USA, 2015. ACM.
Nick Nystrom, Joel Welling, Phil Blood, and Eng Lim Goh. Blacklight: Coherent Shared Memory for
Enabling Science. In Contemporary High Performance Computing, Chapman & Hall/CRC
Computational Science, pages 421–440. Chapman and Hall/CRC, July 2013.
David Palesch, Steven E. Bosinger, Gregory K. Tharp, Thomas H. Vanderford, Mirko Paiardini, Ann
Chahroudi, Zachary P. Johnson, Frank Kirchhoff, Beatrice H. Hahn, Robert B. Norgren, Nirav B.
Patel, Donald L. Sodora, Reem A. Dawoud, Caro-Beth Stewart, Sara M. Seepo, R. Alan Harris,
Yue Liu, Muthuswamy Raveendran, Yi Han, Adam English, Gregg W. C. Thomas, Matthew W.
Hahn, Lenore Pipes, Christopher E. Mason, Donna M. Muzny, Richard A. Gibbs, Daniel Sauter,
Kim Worley, Jeffrey Rogers, and Guido Silvestri. Sooty mangabey genome sequence provides
insight into AIDS resistance in a natural SIV host. Nature, 553(7686):77–81, January 2018.
Pavel A. Pevzner, Haixu Tang, and Michael S. Waterman. An Eulerian path approach to DNA
fragment assembly. Proceedings of the National Academy of Sciences, 98(17):9748–9753, August
2001.
Lenore Pipes, Sheng Li, Marjan Bozinoski, Robert Palermo, Xinxia Peng, Phillip Blood, Sara Kelly,
Jeffrey M. Weiss, Jean Thierry-Mieg, Danielle Thierry-Mieg, Paul Zumbo, Ronghua Chen, Gary P.
Schroth, Christopher E. Mason, and Michael G. Katze. The non-human primate reference
transcriptome resource (NHPRTR) for comparative functional genomics. Nucleic Acids Research,
41(D1):D906–D914, January 2013.
Pranav Rajpurkar, Awni Y. Hannun, Masoumeh Haghpanahi, Codie Bourn, and Andrew Y. Ng.
Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks. arXiv:1707.01836
[cs], July 2017. arXiv: 1707.01836.
Jason A. Reuter, Damek V. Spacek, and Michael P. Snyder. High-throughput sequencing technologies.
Molecular Cell, 58(4):586–597, May 2015.
Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang,
Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C Berg, and Li Fei-Fei. ImageNet
Large Scale Visual Recognition Challenge. International Journal of Computer Vision, 115(3):211–
252, 2015.
Alexander Sczyrba, Peter Hofmann, Peter Belmann, David Koslicki, Stefan Janssen, Johannes Dröge,
Ivan Gregor, Stephan Majda, Jessika Fiedler, Eik Dahms, Andreas Bremges, Adrian Fritz, Ruben
Garrido-Oter, Tue Sparholt Jørgensen, Nicole Shapiro, Philip D Blood, Alexey Gurevich, Yang
Bai, Dmitrij Turaev, Matthew Z DeMaere, Rayan Chikhi, Niranjan Nagarajan, Christopher Quince,
Fernando Meyer, Monika Balvočiūtė, Lars Hestbjerg Hansen, Søren J Sørensen, Burton K H Chia,
Bertrand Denis, Jeff L Froula, Zhong Wang, Robert Egan, Dongwan Don Kang, Jeffrey J Cook,
Charles Deltel, Michael Beckstette, Claire Lemaitre, Pierre Peterlongo, Guillaume Rizk,
Dominique Lavenier, Yu-Wei Wu, Steven W Singer, Chirag Jain, Marc Strous, Heiner
Klingenberg, Peter Meinicke, Michael D Barton, Thomas Lingner, Hsin-Hung Lin, Yu-Chieh Liao,
Genivaldo Gueiros Z Silva, Daniel A Cuevas, Robert A Edwards, Surya Saha, Vitor C Piro,
Bernhard Y Renard, Mihai Pop, Hans-Peter Klenk, Markus Göker, Nikos C Kyrpides, Tanja
Woyke, Julia A Vorholt, Paul Schulze-Lefert, Edward M Rubin, Aaron E Darling, Thomas Rattei,
and Alice C McHardy. Critical Assessment of Metagenome Interpretation—a benchmark of
metagenomics software. Nature Methods, 14:1063, October 2017.
David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George van den Driessche,
Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman,
Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy Lillicrap, Madeleine
Leach, Koray Kavukcuoglu, Thore Graepel, and Demis Hassabis. Mastering the game of Go with
deep neural networks and tree search. Nature, 529(7587):484–489, 2016.
Nikolay A. Simakov, Joseph P. White, Robert L. DeLeon, Steven M. Gallo, Matthew D. Jones, Jeffrey
T. Palmer, Benjamin Plessinger, and Thomas R. Furlani. A Workload Analysis of NSF’s
Innovative HPC Resources Using XDMoD. arXiv:1801.04306 [cs], January 2018. arXiv:
1801.04306.
John Towns, Timothy Cockerill, Maytal Dahan, Ian Foster, Kelly Gaither, Andrew Grimshaw, Victor
Hazlewood, Scott Lathrop, Dave Lifka, Gregory D Peterson, Ralph Roskies, J Ray Scott, and
Nancy Wilkins-Diehr. XSEDE: Accelerating Scientific Discovery. Computing in Science &
Engineering, 16(5):62–74, 9 2014.
B Yang, L Ying, and J Tang. Artificial Neural Network Enhanced Bayesian PET Image
Reconstruction. IEEE Transactions on Medical Imaging, PP(99):1, 2018.
Jared Yanovich. SLASH2 file system. https://github.com/pscedu/slash2.
J Ye, P Wu, J Z Wang, and J Li. Fast Discrete Distribution Clustering Using Wasserstein Barycenter
With Sparse Support. IEEE Transactions on Signal Processing, 65(9):2317–2332, 2017.
Jonathan D. Young, Chunhui Cai, and Xinghua Lu. Unsupervised deep learning reveals prognostically
relevant subtypes of glioblastoma. BMC Bioinformatics, 18(Suppl 11):381, October 2017.
Daniel R. Zerbino and Ewan Birney. Velvet: Algorithms for de novo short read assembly using de
Bruijn graphs. Genome Research, 18(5):821–829, May 2008.
Xinye Zheng, Jianbo Ye, Yukun Chen, Stephen Wistar, Jia Li, Jose A. Piedra-Fernández, Michael A.
Steinberg, and James Z. Wang. Detecting Comma-shaped Clouds for Severe Weather Forecasting
using Shape and Motion. arXiv:1802.08937 [cs], February 2018. arXiv: 1802.08937.
The GraviT GitHub repository.
OpenSWR.
Presidential Executive Order No. 13702. 2015.
B. P. Abbott et al. Observation of gravitational waves from a binary black hole merger. Phys. Rev.
Lett., 116:061102, February 2016.
Kapil Agrawal, Mark R. Fahey, Robert McLay, and Doug James. User environment tracking and
problem detection with XALT. In Proceedings of the First International Workshop on HPC User
Support Tools, HUST ’14, pages 32–40, Piscataway, NJ, USA, 2014. IEEE Press.
Ronald Babich, Michael A. Clark, and Bálint Joó. Parallelizing the QUDA library for multi-GPU
calculations in lattice quantum chromodynamics. In Proceedings of the 2010 ACM/IEEE
International Conference for High Performance Computing, Networking, Storage and Analysis, SC
’10, pages 1–11, Washington, DC, USA, 2010. IEEE Computer Society.
Carson Brownlee, Thiago Ize, and Charles D. Hansen. Image-parallel ray tracing using openGL
interception. In Proceedings of the 13th Eurographics Symposium on Parallel Graphics and
Visualization, EGPGV ’13, pages 65–72, Aire-la-Ville, Switzerland, 2013. Eurographics
Association.
Martin Burtscher, Byoung-Do Kim, Jeff Diamond, John McCalpin, Lars Koesterke, and James
Browne. Perfexpert: An easy-to-use performance diagnosis tool for HPC applications. In
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing,
Networking, Storage and Analysis, SC ’10, pages 1–11, Washington, DC, USA, 2010. IEEE
Computer Society.
Todd Evans, William L. Barth, James C. Browne, Robert L. DeLeon, Thomas R. Furlani, Steven M.
Gallo, Matthew D. Jones, and Abani K. Patra. Comprehensive resource use monitoring for HPC
systems with TACC stats. In Proceedings of the First International Workshop on HPC User
Support Tools, HUST ’14, pages 13–21, Piscataway, NJ, USA, 2014. IEEE Press.
National Science Foundation. Advanced computing infrastructure strategic plan. Technical Report
NSF-12-051, 2012.
Niall Gaffney, Christopher Jordan, Tommy Minyard, and Dan Stanzione. Building Wrangler: A
transformational data intensive resource for the open science community. 2014 IEEE International
Conference on Big Data (Big Data), pages 20–22, 2014.
J. Hammond. The lltop GitHub repository.
J. Hammond. The xltop GitHub repository.
Alexander Heinecke, Alexander Breuer, Sebastian Rettenberger, Michael Bader, Alice-Agnes Gabriel,
Christian Pelties, Arndt Bode, William Barth, Xiang-Ke Liao, Karthikeyan Vaidyanathan, Mikhail
Smelyanskiy, and Pradeep Dubey. Petascale high order dynamic rupture earthquake simulations on
heterogeneous supercomputers. In Proceedings of the International Conference for High
Performance Computing, Networking, Storage and Analysis, SC ’14, pages 3–14, Piscataway, NJ,
USA, 2014. IEEE Press.
Jacob A. Hummel, Athena Stacy, and Volker Bromm. The First Stars: formation under cosmic ray
feedback. Mon. Not. Roy. Astron. Soc., 460(3):2432–2444, 2016.
Morris A. Jette, Andy B. Yoo, and Mark Grondona. SLURM: Simple Linux Utility for Resource
Management. In Lecture Notes in Computer Science: Proceedings of Job Scheduling Strategies
for Parallel Processing (JSSPP) 2003, pages 44–60. Springer-Verlag, 2003.
Jiuxing Liu, Jiesheng Wu, and Dhabaleswar K. Panda. High performance RDMA-based MPI
implementation over InfiniBand. Int. J. Parallel Program., 32(3):167–198, June 2004.
Christopher Maffeo, Binquan Luan, and Aleksei Aksimentiev. End-to-end attraction of duplex DNA.
Nucleic Acids Research, 40(9):3812–3821, 2012.
Robert McLay, Karl W. Schulz, William L. Barth, and Tommy Minyard. Best practices for the
deployment and management of production HPC clusters. In State of the Practice Reports, SC ’11,
pages 1–11, New York, NY, USA, 2011. ACM.
Nirav Merchant, Eric Lyons, Stephen Goff, Matthew Vaughn, Doreen Ware, David Micklos, and
Parker Antin. The iplant collaborative: Cyberinfrastructure for enabling data to discovery for the
life sciences. PLOS Biology, 14(1):1–9, January 2016.
James C. Phillips, Rosemary Braun, Wei Wang, James Gumbart, Emad Tajkhorshid, Elizabeth Villa,
Christophe Chipot, Robert D. Skeel, Laxmikant Kalé, and Klaus Schulten. Scalable molecular
dynamics with NAMD. Journal of Computational Chemistry, 26(16):1781–1802, 2005.
Abtin Rahimian, Ilya Lashuk, Shravan Veerapaneni, Aparna Chandramowlishwaran, Dhairya
Malhotra, Logan Moon, Rahul Sampath, Aashay Shringarpure, Jeffrey Vetter, Richard Vuduc,
Denis Zorin, and George Biros. Petascale direct numerical simulation of blood flow on 200K cores
and heterogeneous architectures. In Proceedings of the 2010 ACM/IEEE International Conference
for High Performance Computing, Networking, Storage and Analysis, SC ’10, pages 1–11,
Washington, DC, USA, 2010. IEEE Computer Society.
Ellen M. Rathje, Clint Dawson, Jamie E. Padgett, Jean-Paul Pinelli, Dan Stanzione, Ashley Adair,
Pedro Arduino, Scott J. Brandenberg, Tim Cockerill, Charlie Dey, Maria Esteva, Fred L. Haan,
Matthew Hanlon, Ahsan Kareem, Laura Lowes, Stephen Mock, and Gilberto Mosqueda.
DesignSafe: New cyberinfrastructure for natural hazards engineering. Natural Hazards Review,
18(3):06017001, 2017.
Johann Rudi, A. Cristiano I. Malossi, Tobin Isaac, Georg Stadler, Michael Gurnis, Peter W. J. Staar,
Yves Ineichen, Costas Bekas, Alessandro Curioni, and Omar Ghattas. An extreme-scale implicit
solver for complex PDEs: Highly heterogeneous flow in earth’s mantle. In Proceedings of the
International Conference for High Performance Computing, Networking, Storage and Analysis, SC
’15, pages 1–12, New York, NY, USA, 2015. ACM.
Dan Stanzione, Bill Barth, Niall Gaffney, Kelly Gaither, Chris Hempel, Tommy Minyard, S.
Mehringer, Eric Wernert, H. Tufo, D. Panda, and P. Teller. Stampede 2: The evolution of an
XSEDE supercomputer. In Proceedings of the Practice and Experience in Advanced Research
Computing 2017 on Sustainability, Success and Impact, PEARC17, pages 1–8, New York, NY,
USA, 2017. ACM.
Craig A. Stewart, Timothy M. Cockerill, Ian Foster, David Hancock, Nirav Merchant, Edwin
Skidmore, Daniel Stanzione, James Taylor, Steven Tuecke, George Turner, Matthew Vaughn, and
Niall I. Gaffney. Jetstream: A self-provisioned, scalable science and engineering cloud
environment. In Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by
Enhanced Cyberinfrastructure, XSEDE ’15, pages 1–8, New York, NY, USA, 2015. ACM.
John Towns, Timothy Cockerill, Maytal Dahan, Ian Foster, Kelly Gaither, Andrew Grimshaw, Victor
Hazlewood, Scott Lathrop, Dave Lifka, Gregory D. Peterson, Ralph Roskies, J. Ray Scott, and
Nancy Wilkins-Diehr. XSEDE: Accelerating scientific discovery. Computing in Science &
Engineering, 16(5):62–74, 2014.
I. Wald, G. P. Johnson, J. Amstutz, et al. OSPRay – A CPU ray tracing framework for scientific
visualization. IEEE Transactions on Visualization & Computer Graphics, 23(1):931–940, 2017.
Fuqing Zhang and Yonghui Weng. Predicting hurricane intensity and associated hazards: A five-year
real-time forecast experiment with assimilation of airborne Doppler radar observations. Bulletin of
the American Meteorological Society, 96(1):25–33, 2015.
GitHub: ARTED. https://github.com/ARTED/ARTED.
Green500 | TOP500 Supercomputer Sites.
HPCG.
KNC cluster COMA. https://www.ccs.tsukuba.ac.jp/eng/supercomputers/.
TOP500 Supercomputer Sites.
Jack Dongarra, Michael A. Heroux, and Piotr Luszczek. High-performance conjugate-gradient
benchmark: A new metric for ranking high-performance computing systems. The International
Journal of High Performance Computing Applications, 30(1):3–10, 2016.
Jack J. Dongarra, Piotr Luszczek, and Antoine Petitet. The LINPACK benchmark: past, present and
future. Concurrency and Computation: Practice and Experience, 15(9):803–820, 2003.
K. Fujita, T. Ichimura, K. Koyama, M. Horikoshi, H. Inoue, L. Meadows, S. Tanaka, M. Hori, M.
Lalith, and T. Hori. A fast implicit solver with low memory footprint and high scalability for
comprehensive earthquake simulation system. In Research Poster for SC16, International
Conference for High Performance Computing, Networking, Storage and Analysis, November 2016.
Balazs Gerofi, Akio Shimada, Atsushi Hori, and Yutaka Ishikawa. Partially Separated Page Tables for
Efficient Operating System Assisted Hierarchical Memory Management on Heterogeneous
Architectures. In 2013 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid
Computing (CCGrid), May 2013.
Balazs Gerofi, Akio Shimada, Atsushi Hori, Masamichi Takagi, and Yutaka Ishikawa. CMCP: A
Novel Page Replacement Policy for System Level Hierarchical Memory Management on Many-
cores. In Proceedings of the 23rd International Symposium on High-performance Parallel and
Distributed Computing, HPDC ’14, pages 73–84, New York, NY, USA, 2014. ACM.
Balazs Gerofi, Masamichi Takagi, Yutaka Ishikawa, Rolf Riesen, Evan Powers, and Robert W.
Wisniewski. Exploring the Design Space of Combining Linux with Lightweight Kernels for
Extreme Scale Computing. In Proceedings of ROSS’15, pages 1–8. ACM, 2015.
Y. Hirokawa, T. Boku, S. A. Sato, and K. Yabana. Performance evaluation of large scale electron
dynamics simulation under many-core cluster based on Knights Landing. In HPC Asia 2018,
January 2018.
T. Ichimura, K. Fujita, P. E. B. Quinay, L. Maddegedara, M. Hori, S. Tanaka, Y. Shizawa, H.
Kobayashi, and K. Minami. Implicit nonlinear wave simulation with 1.08T DOF and 0.270T
unstructured finite elements to enhance comprehensive earthquake simulation. In ACM
Proceedings of the International Conference on High Performance Computing, Networking,
Storage and Analysis (SC’15), November 2015.
T. Ichimura, K. Fujita, S. Tanaka, M. Hori, M. Lalith, Y. Shizawa, and H. Kobayashi. Physics-based
urban earthquake simulation enhanced by 10.7 BlnDOF × 30K time-step unstructured FE non-
linear seismic wave simulation. In IEEE Proceedings of the International Conference on High
Performance Computing, Networking, Storage and Analysis (SC’14), November 2014.
K. Nakajima, M. Satoh, T. Furumura, H. Okuda, T. Iwashita, H. Sakaguchi, T. Katagiri, M.
Matsumoto, S. Ohshima, H. Jitsumoto, T. Arakawa, F. Mori, T. Kitayama, A. Ida, and M. Y.
Matsuo. ppOpen-HPC: Open source infrastructure for development and execution of large-scale
scientific applications on post-peta-scale supercomputers with automatic tuning (AT). In
Optimization in the Real World — Towards Solving Real-World Optimization Problems, volume
13 of Mathematics for Industry, pages 15–35, 2015.
A. Petitet, R. C. Whaley, J. Dongarra, and A. Cleary. HPL – A Portable Implementation of the High-
Performance Linpack Benchmark for Distributed-Memory Computers.
S. A. Sato and K. Yabana. Maxwell + TDDFT multi-scale simulation for laser-matter interactions. J.
Adv. Simulat. Sci. Eng., 1(1), 2014.
Taku Shimosawa. Operating System Organization for Manycore Systems. PhD dissertation, The University
of Tokyo, 2012.
Taku Shimosawa, Balazs Gerofi, Masamichi Takagi, Gou Nakamura, Tomoki Shirasawa, Yuji Saeki,
Masaaki Shimizu, Atsushi Hori, and Yutaka Ishikawa. Interface for Heterogeneous Kernels: A
Framework to Enable Hybrid OS Designs targeting High Performance Computing on Manycore
Architectures. In 2014 21st International Conference on High Performance Computing (HiPC),
December 2014.