Chapter 2: Computing Trends
Yijie Zhang
Outline
• Introduction
• Challenges & Objectives
• Computing Technologies
− High-performance computing
• Supercomputing and Cluster Computing
− Grid computing
− Computing continuum: from edge to fog to cloud
− Mobile computing
• Ongoing Research in Our Group
− Data-intensive computing
− High-performance networking
Office
Supercomputer
Big-data Applications
• Big data: volumes are growing rapidly from terabytes (T) to petabytes (P), exabytes (E), zettabytes (Z), yottabytes (Y), and beyond…
− Science
• Simulation
− Astrophysics, climate modeling, combustion research, etc.
• Experimental
− Spallation Neutron Source, Large Hadron Collider, microarray, genome sequencing,
protein folding, etc.
• Observational
− Large-scale sensor networks, astronomical imaging/radio devices (Dark Energy
Camera, Five-hundred-meter Aperture Spherical Telescope – FAST), etc.
− Business
• Financial transactions
− Wal-Mart, NY stock trading center, Amazon, Alibaba
− Social media
• YouTube, Facebook, Twitter, Weblogs, TikTok, WeChat
No matter which type of data is considered, we need
a high-performance end-to-end computing solution
to support data generation, storage, transfer, processing, and analysis!
Big-data Workflows
• Require massively distributed resources
− Hardware
• Computing facilities, storage systems, special rendering engines, display devices
(tiled display, powerwall, etc.), network infrastructures, etc.
− Software
• Domain-specific data analytics/processing tools, programs, etc.
− Data
• Real-time, archival
Computing Paradigms: an Overview
• Client–Server Model
− Client–server computing refers broadly to any distributed application that distinguishes between service providers (servers)
and service requesters (clients)
• Grid Computing
− A form of distributed computing and parallel computing, whereby a “super and virtual computer” is composed of a cluster of
networked, loosely coupled computers acting in concert to perform very large tasks
• Service-Oriented Computing
− Cloud computing provides services related to computing while, in a reciprocal manner, service-oriented computing consists of
the computing techniques that operate on software-as-a-service
• Edge Computing
− A distributed computing paradigm that brings computation and data storage closer to the sources of data to improve response
times and save bandwidth
− Internet of things (IoT) is an example of edge computing
• Mobile Computing
− Computing on the go!
High-performance Computing
(Supercomputing, Cluster Computing)
Supercomputing for Scientific Applications
Astrophysics
Computational biology
Nanoscience
Climate research
Neutron sciences
Flow dynamics
Fusion simulation
Computational materials
Why do we care about computing power?
Computer Security: Exhaustive Key Search
• Two types of security
− Computational security
− Unconditional security
• Two types of encryption methods
− Conventional (a.k.a. single-key, secret-key, symmetric): DES/DEA
− Public key-based (a.k.a. asymmetric): RSA, D-H
• Either key could be used for encryption, but only the other key can be used for decryption
• Attack on computational security
− It is always possible to simply try every key
− Most basic attack; the effort is proportional to the number of keys, so it doubles with every added key bit
− Assumes the attacker either knows or can recognize the plaintext
Key size (bits)               Number of alternative keys   Time required at 1 decryption/µs   Time required at 10^6 decryptions/µs
32                            2^32 = 4.3 × 10^9            2^31 µs = 35.8 minutes             2.15 milliseconds
56                            2^56 = 7.2 × 10^16           2^55 µs = 1142 years               10.01 hours
128                           2^128 = 3.4 × 10^38          2^127 µs = 5.4 × 10^24 years       5.4 × 10^18 years
168                           2^168 = 3.7 × 10^50          2^167 µs = 5.9 × 10^36 years       5.9 × 10^30 years
26 letters (permutation)      26! = 4 × 10^26              2 × 10^26 µs = 6.4 × 10^12 years   6.4 × 10^6 years
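The figures in the table can be reproduced with a few lines of arithmetic. The Python sketch below assumes, as the table does, that on average half of the key space is searched before the key is found, and it uses a 365-day year. The function names are illustrative and do not come from the slides.

```python
from math import factorial

# Microseconds in a 365-day year (assumption that matches the table's figures).
MICROSECONDS_PER_YEAR = 365 * 24 * 3600 * 1_000_000


def average_search_time_us(num_keys, decryptions_per_us=1.0):
    """Expected exhaustive-search time in microseconds.

    On average, half of all possible keys are tried before the right one is found.
    """
    return (num_keys / 2) / decryptions_per_us


def to_years(microseconds):
    """Convert a duration in microseconds to years."""
    return microseconds / MICROSECONDS_PER_YEAR


if __name__ == "__main__":
    key_spaces = {
        "32 bits": 2**32,
        "56 bits": 2**56,
        "128 bits": 2**128,
        "168 bits": 2**168,
        "26 letters": factorial(26),
    }
    for label, num_keys in key_spaces.items():
        slow = to_years(average_search_time_us(num_keys))
        fast = to_years(average_search_time_us(num_keys, 1e6))
        print(f"{label:>10}: {slow:.3g} years at 1/us, {fast:.3g} years at 1e6/us")
```

Each extra key bit doubles the search time, which is why raising raw computing power by a factor of a million only removes about 20 bits of security.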
Exascale Race/Technologies
Projected Exascale Dates and Suppliers
• U.S.
− Sustained ES*: 2022-2023
− Peak ES: 2021
− Vendors: U.S.
− Processors: U.S. (some ARM?)
− Initiatives: NSCI/ECP
− Cost: $600M per system, plus heavy R&D investments
• EU
− Peak ES: 2023-2024
− Pre-ES: 2021-2022
− Vendors: U.S., Europe
− Processors: Likely ARM
− Initiatives: EuroHPC
− Cost: $300-$350M per system, plus heavy R&D investments
• China
− Sustained ES*: 2021-2022
− Peak ES: 2020
− Vendors: Chinese (multiple sites)
− Processors: Chinese (plus U.S.?)
− 13th 5-Year Plan
− Cost: $350-$500M per system, plus heavy R&D
• Japan
− Sustained ES*: ~2022
− Peak ES: Likely as an AI/ML/DL system
− Vendors: Japanese
− Processors: Japanese
− Cost: $800M-$1B, which includes both one system and the R&D costs; many smaller systems will also be built
* 1 exaflops on a 64-bit real application
Top 500 June 2024 Release
Top 10 Challenges to Exascale
3 Hardware, 4 Software, 3 Algorithms/Math Related
• Energy efficiency:
− Creating more energy efficient circuit, power, and cooling technologies.
• Interconnect technology:
− Increasing the performance and energy efficiency of data movement.
• Memory technology:
− Integrating advanced memory technologies to improve both capacity and bandwidth.
• Scalable system software:
− Developing scalable system software that is power and resilience aware.
• Programming systems:
− Inventing new programming environments that express massive parallelism, data locality, and resilience.
• Data management:
− Creating data management software that can handle the volume, velocity and diversity of data that is anticipated.
• Scientific productivity:
− Increasing the productivity of computational scientists with new software engineering tools and environments.
• Exascale algorithms:
− Reformulating science problems and refactoring their solution algorithms for exascale systems.
• Algorithms for discovery, design, and decision:
− Facilitating mathematical optimization and uncertainty quantification for exascale discovery, design, and decision making.
• Resilience and correctness:
− Ensuring correct scientific computation in face of faults, reproducibility, and algorithm verification challenges.
Computing power continues to increase over time
Terascale Supernova Initiative (TSI)
• Collaborative project
− Supernova explosion
• TSI simulation
− 1 terabyte a day with a small portion of parameters
− From TSI to PSI to ESI (terascale to petascale to exascale)
• Transfer to remote sites
− Interactive distributed visualization
− Collaborative data analysis
− Computation monitoring
− Computation steering
Visualization channel
High Performance Computing
• Supercomputing (MPI, OpenMP)
− HPC: numerically intensive (data in, data out)
• Cluster Computing (MapReduce, Spark)
− HPDA: data intensive (data in, data out)
Comparison of Data Analytics and Computing Ecosystems
HPC Architectures
Type of Parallelization
• Trivial:
− Each CPU does a part of the work independently.
• Nontrivial:
− Each CPU relies on its neighbor CPUs to complete the assigned work.
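The difference between the two cases can be sketched in Python. This is a minimal illustration under stated assumptions: threads stand in for CPUs (in CPython they show the structure of the work rather than a real speedup), and the function names, the sum-of-squares task, and the three-point smoothing task are examples chosen here, not taken from the slides.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Split a list into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(data, workers=4):
    """Trivial case: every worker processes its own chunk independently."""
    chunks = split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_sums = pool.map(lambda chunk: sum(x * x for x in chunk), chunks)
        return sum(partial_sums)


def smooth_chunk(chunk, left_halo, right_halo):
    """Replace each value by the mean of itself and its two neighbors."""
    padded = [left_halo] + chunk + [right_halo]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]


def smooth(data, workers=4):
    """Nontrivial case: each worker needs boundary values from its neighbors."""
    chunks = [c for c in split(data, workers) if c]
    # "Halo exchange": collect the edge value of each neighboring chunk.
    # At the two global ends the chunk's own edge value is repeated.
    lefts = [chunks[i - 1][-1] if i > 0 else c[0]
             for i, c in enumerate(chunks)]
    rights = [chunks[i + 1][0] if i < len(chunks) - 1 else c[-1]
              for i, c in enumerate(chunks)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(smooth_chunk, chunks, lefts, rights)
        return [value for part in parts for value in part]
```

In `sum_of_squares` the workers never exchange data, whereas `smooth` cannot start a chunk until the edge values of the neighboring chunks are known. On a real cluster that exchange becomes network communication.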
Performance Metrics
• Execution time
− Time when the last processor finishes its work
• Speedup
− (time on 1 CPU) / (time on P CPUs)
• Efficiency
− Speedup / P
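The three metrics translate directly into code. The sketch below uses hypothetical timings (a 120-second serial run and eight CPUs) purely for illustration; the function names are not from the slides.

```python
def execution_time(per_cpu_finish_times):
    """Parallel execution time: the moment the last processor finishes."""
    return max(per_cpu_finish_times)


def speedup(time_on_one_cpu, time_on_p_cpus):
    """Speedup = (time on 1 CPU) / (time on P CPUs)."""
    return time_on_one_cpu / time_on_p_cpus


def efficiency(time_on_one_cpu, time_on_p_cpus, num_cpus):
    """Efficiency = speedup / P; 1.0 means perfect scaling."""
    return speedup(time_on_one_cpu, time_on_p_cpus) / num_cpus


if __name__ == "__main__":
    serial_seconds = 120.0                              # hypothetical measurement
    finish_times = [18, 20, 19, 20, 17, 20, 18, 19]     # hypothetical, one per CPU
    parallel_seconds = execution_time(finish_times)
    print("execution time:", parallel_seconds)
    print("speedup:", speedup(serial_seconds, parallel_seconds))
    print("efficiency:", efficiency(serial_seconds, parallel_seconds, len(finish_times)))
```

Because the execution time is set by the slowest processor, uneven load across CPUs lowers both speedup and efficiency even when the average CPU finishes early.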
Grid Computing
• Who needs grid computing?
− Particular software capabilities
• Modelling, simulation, etc.
− High hardware/computing demands
• Processing, storage, etc.
− Large network bandwidth
• Circuit provisioning to support large data transfer
• Problems, which are hard (or impossible) to solve at a
single site, can be solved with the right kind of
parallelization and distribution of the tasks involved.
• There are two primary types of grids
− Computational grids
• Open Science Grid (OSG)
• Worldwide LHC Computing Grid (WLCG)
− Data grids
• Earth System Grid (ESG)
Requirements
• Computational grids
− Manage a variety of computing resources
− Select computing resources capable of running a user’s job
− Predict loads on grid resources
− Decide about resource availability
− Dynamic resource configuration and provisioning
• Data grids
− Provide data virtualization service
− Support flexible data access, filtering, and transfer mechanisms
− Provide security and privacy mechanisms
• Grid computing environments are built on three foundations
− Coordinated resources
− Open standard protocols and frameworks
− Non-trivial quality of service (QoS)
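The computational-grid requirement of selecting a resource that is capable of running a user's job, informed by predicted load, can be sketched as a simple matchmaking step. Everything in this Python example is hypothetical: the `Resource` and `Job` fields, the site names, and the rule of picking the least-loaded capable site are illustrative assumptions, not the behavior of any particular grid middleware.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    name: str
    free_cores: int
    free_memory_gb: float
    predicted_load: float  # 0.0 = idle, 1.0 = saturated (hypothetical scale)


@dataclass
class Job:
    cores: int
    memory_gb: float


def select_resource(job: Job, resources: List[Resource]) -> Optional[Resource]:
    """Return the least-loaded resource that can run the job, or None."""
    capable = [
        r for r in resources
        if r.free_cores >= job.cores and r.free_memory_gb >= job.memory_gb
    ]
    if not capable:
        return None  # no site can currently run this job
    return min(capable, key=lambda r: r.predicted_load)


if __name__ == "__main__":
    sites = [
        Resource("site-a", free_cores=64, free_memory_gb=256, predicted_load=0.9),
        Resource("site-b", free_cores=32, free_memory_gb=128, predicted_load=0.3),
        Resource("site-c", free_cores=8, free_memory_gb=16, predicted_load=0.1),
    ]
    chosen = select_resource(Job(cores=16, memory_gb=64), sites)
    print(chosen.name if chosen else "no capable resource")
```

A production scheduler would also account for data locality, queue policies, and authorization, which this sketch leaves out.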
Computing Continuum:
from Edge to Fog to Cloud
Cloud Computing
• What is cloud computing?
− The phrase originated from the cloud symbol used to represent the Internet
− A model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction
− Provides computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the system that delivers the services
• Cloud architecture
− Involves multiple cloud components communicating with each other over application programming interfaces, usually web services and a 3-tier architecture
• Front end: seen by the user, such as a web browser
• Back end: the cloud itself, comprising computers, servers, data storage devices, etc.
Five Layers in Cloud Computing
Front and back ends:
• Client
− A cloud client consists of computer hardware and/or computer software that relies on cloud computing for application delivery, or that is specifically designed for delivery of cloud services
• Server
− The servers layer consists of computer hardware and/or computer software products that are specifically designed for the delivery of cloud services, including multi-core processors, cloud-specific operating systems and combined offerings
Real-life Cloud Computing Environments
• Microsoft Azure (formerly Windows Azure)
• Google Gov Cloud
• Amazon EC2
• Alibaba Cloud
• Baidu Cloud
• Eucalyptus (first open-source platform for private clouds, 2008)
− A software platform for the implementation of private cloud computing on computer clusters
• Many others
Managing Amazon EC2 instances
Log in to the AWS Management Console
Select AMI to create instance(s)
Launch instance
Instance IP addresses
Regions and availability zones
Connect to Linux instance
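A Linux instance is normally reached over SSH with the private key of the key pair chosen at launch. The Python sketch below only assembles the command line. The key file name and host name are placeholders, and the default user `ec2-user` applies to Amazon Linux images (other images use other names, for example `ubuntu` on Ubuntu AMIs).

```python
import shlex


def build_ssh_command(key_path, host, user="ec2-user"):
    """Assemble the ssh command for connecting to a Linux instance.

    key_path: private key file of the instance's key pair (placeholder name below)
    host:     public DNS name or IP address shown in the console
    user:     login name of the image; "ec2-user" is the Amazon Linux default
    """
    return shlex.join(["ssh", "-i", key_path, f"{user}@{host}"])


if __name__ == "__main__":
    # Both arguments are placeholders, not real resources.
    print(build_ssh_command("my-key.pem", "ec2-203-0-113-25.compute-1.amazonaws.com"))
```

The instance's security group must allow inbound traffic on port 22, and SSH refuses private key files that are readable by other users.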
Connect to Windows instance: get administrator password
Connect to Windows instance
IAM – Identity and Access Management
Mobile and Ubiquitous Computing
Internet of Things (IoT)
Ongoing Research in Our Big Data Group
• Data-intensive computing
− Big data ecosystem
− AI and ML
− Scientific workflow optimization
• Mapping, scheduling, modeling
• High-performance networking
− Bandwidth scheduling
− Transport control
− Control plane design
• Distributed sensor networks
− Deployment, routing, fusion
• Cyber security
− Monitoring, game theory
• Visualization and image processing
Thanks!☺
Questions?