Group5 (Grid Computing and Domain Specific Computer Architecture)

GRID COMPUTING AND DOMAIN SPECIFIC COMPUTER ARCHITECTURE
Presented by Group - 5
GROUP MEMBERS
1 Aung Myin Myint Myat (3CS-1697)
2 Aung Myint Myat (3CS-1641)
3 Aye Tharaphu Aung (3CT-1407)
4 Khing Sa Pae Thein (3CS-1382)
5 May Phyo Thu (3CS-1581)
6 Ngu Wah Aung (3CS-1751)
7 Theint Thiri Thwe (3CS-1656)
What is Grid Computing?
Grid computing is a computing infrastructure that combines computer resources spread over different
geographical locations to achieve a common goal. All unused resources on multiple
computers are pooled together and made available for a single task.

It's like having a virtual supercomputer made up of many smaller, networked computers that
can work together to perform complex calculations or process large amounts of data.

The goal of the grid is to provide a technical infrastructure that allows a group of
organizations that share a common goal to form a virtual organization.

How Does Grid Computing Work?

Multilateral Nature of Grid Computing
• The multilateral nature of grid computing refers to its collaborative structure, where multiple, often
independent, participants contribute resources to form a
collective computational environment.
• Decentralization
- No Central Authority
- Peer-to-Peer Interaction
• Diverse Participants
- Heterogeneous Resources
- Resource Owners and Users
• Autonomy of Participants
- Independent Decision-Making
- Dynamic Participation
Security in Grid Computing
Resource owners (like companies or organizations) want to maintain tight control over who
can access their resources, how long they can use them, and under what conditions.
• Single Sign-On (SSO) - To avoid the cumbersome process of logging into each
individual computer in the grid, a single sign-on system is used. This means a user only
needs to authenticate once to gain access to the grid. Upon authentication, the user
receives a credential, which is a digitally signed document that identifies the user and
specifies their permissions within the grid.
• Credential Delegation: Credentials can be delegated to other processes. For
example, if a user’s computation needs to generate sub-tasks that run on different
machines, the original credential can be passed along, allowing these sub-tasks to
authenticate themselves and access resources without requiring the user to log in multiple
times.

• Mapping to Local Security: When a user's credential is presented on a remote
machine, it must be translated to the local security system of that machine. For example,
on UNIX systems, this might involve mapping the grid user's credentials to a local user
ID.
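
The sign-on, delegation, and local-mapping steps described above can be pictured with a minimal sketch. It assumes an HMAC-signed token stands in for a real certificate-based credential; GRID_KEY, GRID_MAP, and all function names are hypothetical and do not come from any grid toolkit.

```python
import hashlib
import hmac
import json

# Hypothetical signing key; a real grid would use certificates instead.
GRID_KEY = b"demo-signing-key"

# Hypothetical table that maps grid identities to local UNIX accounts.
GRID_MAP = {"alice@vo.example": "grid_user01"}


def _sign(payload):
    """Return a hex signature over a canonical JSON encoding of the payload."""
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(GRID_KEY, message, hashlib.sha256).hexdigest()


def issue_credential(user, permissions):
    """Single sign-on: authenticate once and receive a signed credential."""
    payload = {"user": user, "permissions": sorted(permissions)}
    return {"payload": payload, "signature": _sign(payload)}


def verify(credential):
    """Check that the credential has not been altered since it was signed."""
    expected = _sign(credential["payload"])
    return hmac.compare_digest(expected, credential["signature"])


def delegate(credential, subtask):
    """Credential delegation: derive a credential for a sub-task."""
    if not verify(credential):
        raise PermissionError("invalid credential")
    payload = dict(credential["payload"], delegated_to=subtask)
    return {"payload": payload, "signature": _sign(payload)}


def map_to_local_user(credential):
    """Mapping to local security: translate the grid identity to a local account."""
    if not verify(credential):
        raise PermissionError("invalid credential")
    return GRID_MAP[credential["payload"]["user"]]


cred = issue_credential("alice@vo.example", ["submit_job"])
sub_cred = delegate(cred, "subtask-1")
print(map_to_local_user(sub_cred))
```

The sub-task never asks the user to log in again: it presents the delegated credential, and the remote machine resolves it to a local account.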
Open Grid Services Architecture (OGSA)
• A framework designed to standardize and facilitate interoperability between different
grid systems.

• The OGSA initiative was led by the Global Grid Forum, a community-driven
organization that aimed to develop and promote open standards for grid computing.

• It uses WSDL (Web Services Description Language) to describe OGSA services.

• The services being standardized fall into eight broad categories:

OGSA SERVICES
1. Infrastructure Services
Function: Enable seamless communication and data exchange between resources, which is
fundamental for the smooth operation of complex grid applications. (Source: Celaya &
Arronategui, 2011)

2. Resource Management Services
Function: Manage resource reservation, deployment, and scheduling to optimize performance
and ensure resources are available when needed. (Source: Abramson, 2011)

3. Data Services
Function: Oversee data movement and replication, ensuring data consistency and availability
across the grid, which is crucial for data-intensive applications. (Source:
Balasangameshwara & Raju, 2012)

4. Context Services
Function: Match resources to application requirements and usage policies, facilitating
effective resource utilization and compliance. (Source: Foster & Kesselman,
2003)

5. Information Services
Function: Provide real-time information on resource availability and status, aiding in
decision-making and resource management. (Source: Lee et al., 2011)

6. Self-Management Services
Function: Maintain specified quality of service by adjusting resource allocation and
performance to meet operational requirements. (Source: Abramson, 2011)

7. Security Services
Function: Implement security policies to safeguard grid resources and data, ensuring
operational integrity and confidentiality. (Source: Foster & Kesselman, 2003)

8. Execution Management Services
Function: Coordinate the execution of tasks and workflows, managing dependencies to
ensure smooth operation of grid applications. (Source: Celaya & Arronategui,
2011)

Introduction to Grid Layered Architecture
• In grid computing, a structured framework of layers is crucial for efficiently
managing and utilizing distributed resources.
• There are four key layers that comprise the grid computing architecture: the Fabric,
Resource, Collective, and Application layers.

The Layered Hierarchy In Grid Model

Fabric Layer

• The Fabric Layer is the foundational infrastructure of grid computing,
including both physical and software resources.
• It forms the base upon which all other grid computing operations are built.
• It encompasses
⚬ actual hardware components (e.g., servers, storage devices, and
network components) and
⚬ fundamental software (e.g., operating systems and drivers) that
supports their operations.
• It provides the essential capabilities needed for grid computing to function
effectively.
Physical Resources In Fabric Layer
• Components
1. CPUs : Process data and execute tasks.
2. Storage Systems : Store data and applications.
3. Network Devices : Facilitate data transfer and communication.
4. Sensors : Collect data from physical environments.
• Key Points
⚬ Scalability : Essential for adapting to the growing demands of grid
applications.
⚬ Performance : High-performance hardware like multi-core processors
and high-speed networks are crucial for efficient grid operations.
(Foster & Kesselman, 2003; Balasangameshwara & Raju, 2012)
Software Resources In Fabric Layer
• Components
1. Operating Systems : Manage hardware resources and provide a platform for
applications.
2. Middleware : Bridges the gap between hardware and applications, providing a
uniform interface.
3. Application Software : The actual programs that perform specific tasks or
computations.
• Key Points
⚬ Middleware : Simplifies resource management by abstracting hardware
complexities, which helps applications utilize diverse resources seamlessly.
(Celaya & Arronategui, 2011)
Resource Layer
• The Resource Layer manages and controls access to individual resources within
the grid. Key aspects of this layer include:
1. Efficient Allocation : Ensures that resources are allocated effectively and
securely.
2. Structured Management : Provides an organized approach to handling
the diverse resources available.
3. Intermediary Role : Acts as a bridge between the physical hardware
(Fabric Layer) and higher-level components (Collective and Application Layers).
• This layer plays an important role in optimizing resource use and maintaining a
well-organized grid environment.
Core Functions In Resource Layer

A. Resource Management
⚬ Local Processes : Monitoring, Allocation, and Policy Enforcement for
resource usage.
⚬ Uniform Interface : Ensures consistent resource interaction.
(Abramson, 2011; Lee et al., 2011)
B. Secure Access
⚬ Security Measures : Include authentication, authorization, and
encryption to protect data and maintain security.
(Foster & Kesselman, 2003)
Collective Layer
• The Collective Layer oversees groups of resources, emphasizing coordination,
discovery, and brokering.
• Key functions include:
⚬ Coordination : Ensures efficient use of resources.
⚬ Discovery : Helps users find and access the necessary resources.
⚬ Brokering : Facilitates the allocation and optimization of resources.
• This layer is vital for optimizing resource use and maintaining overall system
performance.
Core Functions In Collective Layer

A. Resource Discovery
⚬ Resource Catalogs : Maintain directories for efficient search and
retrieval.
⚬ Discovery Protocols : Facilitate access to resources based on needs.
(Lee et al., 2011)
B. Brokering and Allocation
⚬ Brokering Services : Match providers with users and optimize
allocation.
⚬ Load Balancing : Ensures even distribution of workloads.
(Abramson, 2011)
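
The discovery, brokering, and load-balancing functions above can be sketched in a few lines. This is only an illustration under simple assumptions: the catalog is an in-memory list, load is the fraction of CPUs in use, and the Resource, discover, and broker names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    cpus: int        # total CPUs offered by the provider
    used: int = 0    # CPUs currently allocated

    @property
    def load(self):
        return self.used / self.cpus


def discover(catalog, cpus_needed):
    """Resource discovery: list catalog entries with enough free CPUs."""
    return [r for r in catalog if r.cpus - r.used >= cpus_needed]


def broker(catalog, cpus_needed):
    """Brokering with load balancing: pick the least-loaded matching resource."""
    candidates = discover(catalog, cpus_needed)
    if not candidates:
        return None
    chosen = min(candidates, key=lambda r: r.load)
    chosen.used += cpus_needed
    return chosen


catalog = [Resource("site-a", 8), Resource("site-b", 16), Resource("site-c", 4)]
placements = [broker(catalog, 4).name for _ in range(4)]
print(placements)  # ['site-a', 'site-b', 'site-c', 'site-b']
```

Requests spread across the sites instead of filling the first one, and a site with no free capacity drops out of the candidate list.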
Application Layer
• The Application Layer is where user applications interact with the grid to utilize
resources.
• Key responsibilities include:
⚬ Handling User Requests : Manages and processes user demands for
resources.
⚬ Managing Resource Usage : Oversees how resources are utilized by
applications.
⚬ Integration : Works with lower layers to ensure effective and efficient use of
resources.
• This layer ensures that applications can effectively access and use the grid's resources.
Core Functions In Application Layer

A. User Applications
⚬ Application Types : Include simulations, data analysis tools, and
collaborative software.
⚬ Application Integration : Ensures effective use of grid resources.
(Foster & Kesselman, 2003)
B. Resource Management Functions
⚬ Credential Management : Handles access credentials.
⚬ Request Handling : Manages resource requests and tracking.
(Celaya & Arronategui, 2011)
Conclusion to Grid Layered Architecture
• Grid computing functions through a structured framework of layers as follows :
⚬ Fabric Layer : Provides the foundational physical and software resources.
⚬ Resource Layer : Manages and secures access to individual resources.
⚬ Collective Layer : Coordinates resource discovery, brokering, and
allocation.
⚬ Application Layer : Facilitates user interaction with grid resources.

• Each layer plays a crucial role in ensuring efficient and effective grid computing.
• This layered approach not only enhances resource management but also improves
overall system performance and scalability.

Domain Specific Computer Architectures

• Architecting a computer system designed to meet unique user requirements.
• Understanding the user-level requirements and performance capabilities.

Architecting computer systems to meet requirements

• Every device containing a digital processor is designed to perform a particular function or
collection of functions.
• This includes general-purpose devices, such as personal computers.
• A comprehensive list of the required and desired features and capabilities for a device provides
the raw information needed to design its architecture.

Factors to consider when designing the components
The types of processing required:
• Does the device need to process audio, video, or other analog information?
• Is a high-resolution graphics display required?
• Will extensive floating-point or decimal mathematics be required?
• Will the system support multiple, simultaneously running applications?
• Are special algorithms, such as neural network processing, required?

Memory and storage requirements:
• How much RAM will the operating system and user applications require?
• How much non-volatile storage will be required?

Hard or soft real-time processing:
• Is a real-time response to inputs required?
• Are there desired response times that must be met most of the time?

Connectivity requirements:
• What kinds of wired connections, such as Ethernet and USB, are needed?
• How many physical ports are needed for each type of connection?
• What types of wireless connections (cellular network, Wi-Fi, Bluetooth, NFC) are needed?

Environmental limits:
• Will the device operate in very hot or cold environments?
• What level of shock and vibration must the device be able to withstand?
• Does the device need to operate in extremely humid or dry atmospheric conditions?

Power consumption:
• Is the device battery-powered? If so, what is the level of power consumption for components
during periods of high usage and idle periods?
• If power is provided externally, is it more important to have high processing performance or
low power consumption?
• For both battery-powered and externally powered systems, what are the limits of power
dissipation before overheating becomes an issue?

Physical constraints:
• What are the limits on the size of the digital processing components?

SMARTPHONE ARCHITECTURE

Three key features a smartphone must provide:
• Small size (except for the display)
• Long battery life
• Very high processing performance

However, the requirements for long battery life and high processing power are in conflict
and must be balanced.

Requirements for small size

Start with a screen size (height and width) that is:
• large enough to render high-quality video
• large enough to function as a user-input device (keyboard)
• small enough to be carried easily in a pocket or purse
The device should then be made as thin as possible.

Questions for thinness
• Does it provide sufficient structural strength to support the screen?
• Can it resist damage from routine handling and drops on the floor?
• Does it provide adequate space for batteries, digital components, and subsystems?

IPHONE X
• Also called the iPhone 10
• Released in 2017 and discontinued in 2018
• Apple's flagship smartphone at the time
• Contains some of the most advanced technologies
• Apple releases only limited information
• Information comes from teardowns and iPhone X reviewers

The computational architecture of the iPhone X
• Centered on the Apple A11 Bionic SoC
• ARMv8-A six-core processor (4.3 billion complementary metal-oxide-semiconductor (CMOS)
transistors)
• Two of the cores (Monsoon) : optimized for high performance, with a maximum clock speed
of 2.39 GHz.
• Four cores (Mistral) : energy-efficient operation at up to 1.42 GHz.
• Both core types are out-of-order superscalar designs.
• The Monsoon cores : decode up to seven instructions simultaneously.
• The Mistral cores : decode up to three instructions at a time.
• When executing multiple processes, or multiple threads within a single process, concurrently,
it is possible for all six cores to run in parallel.

To Maximize battery life

Running all six cores simultaneously creates a significant drain on the batteries.

Most of the time, especially when the user is not interacting with the device, several of the
cores are placed in low-power modes to maximize battery life.

Battery Architecture of iPhone X

• Contains fourth-generation low power double data rate RAM (LP-DDR4x).

• The X in LP-DDR4x indicates a reduction of the I/O signal voltage from the 1.112 V of the
previous generation (LP-DDR4) to 0.61 V, which lowers the power drawn by the memory.

GPU of iPhone X

• The A11 SoC contains a three-core GPU designed by Apple.
• The 3D rendering process implements tile-based deferred rendering (TBDR).
• TBDR is suited to resource-constrained systems such as smartphones.
• TBDR attempts to identify objects within the field of view that are not visible as early in the
rendering process as possible.
• The rendering process divides the image into sections (the tiles).
• TBDR is performed on multiple tiles in parallel to achieve maximum performance.
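
The idea behind tile-based deferred rendering can be shown with a toy sketch. It is not Apple's implementation: it assumes the scene is made of axis-aligned rectangles, processes tiles one after another rather than in parallel, and uses a hypothetical 4-pixel tile size. The point is that hidden fragments are rejected before any shading is done.

```python
TILE = 4  # tile edge in pixels (hypothetical value for this sketch)


def render_tbdr(width, height, rects):
    """Render rectangles given as (x0, y0, x1, y1, depth, color) tuples.

    Smaller depth means closer to the viewer. Returns the framebuffer and
    the number of fragments that were actually shaded.
    """
    framebuffer = [[None] * width for _ in range(height)]
    shaded = 0
    for ty in range(0, height, TILE):
        for tx in range(0, width, TILE):
            # Binning: keep only the rectangles that overlap this tile.
            tile_rects = [r for r in rects
                          if r[0] < tx + TILE and r[2] > tx
                          and r[1] < ty + TILE and r[3] > ty]
            for y in range(ty, min(ty + TILE, height)):
                for x in range(tx, min(tx + TILE, width)):
                    covering = [r for r in tile_rects
                                if r[0] <= x < r[2] and r[1] <= y < r[3]]
                    if not covering:
                        continue
                    # Hidden-surface removal happens before shading, so
                    # only the nearest fragment is shaded.
                    nearest = min(covering, key=lambda r: r[4])
                    framebuffer[y][x] = nearest[5]
                    shaded += 1
    return framebuffer, shaded


def naive_fragment_count(rects):
    """Fragments shaded if every rectangle were drawn in full."""
    return sum((r[2] - r[0]) * (r[3] - r[1]) for r in rects)


scene = [(0, 0, 8, 8, 1.0, "blue"), (2, 2, 6, 6, 0.5, "red")]
image, shaded = render_tbdr(8, 8, scene)
print(shaded, naive_fragment_count(scene))  # 64 80
```

In this scene the red square hides 16 background pixels, so the sketch shades 64 fragments where drawing both rectangles in full would shade 80.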

Neural network processor of iPhone X

• The Apple Neural Engine consists of two cores.
• It is capable of a total of 600 billion operations per second.
• It is used for tasks such as identifying and tracking objects in the live video feed from the
phone's cameras.

Controller and motion Coprocessors of iPhone X
• Gyroscope: Measures rotation and orientation.
• Accelerometer: Measures movement and acceleration.
• Barometric Sensor: Measures atmospheric pressure.
• The sensor data is used to determine the user's current activity.
• Sensor data collection and processing continues at a low power level even while the
remainder of the phone is in sleep mode.
• The SoC includes a high-performance solid-state drive (SSD) controller.
• The iPhone X contains 64 GB or, optionally, 256 GB of internal drive storage.
• The controller manages the interface to this storage, including the use of error-correcting
code (ECC) to improve the reliability of data storage.

Major components of iPhone X

Batteries
• Contains two battery cells
• Rechargeable lithium-ion batteries (2716 mAh)

Display
• 5.85 inches with 2436 x 1125 pixel resolution
• Active-matrix organic light-emitting diode (AMOLED)

Touch sensing
• Capacitive sensors are integrated

Dual cameras, IR projector, and IR camera
• Rear camera has 12 MP, an LED flash, and a zoom lens
• Can record 4K (3840 x 2160) video at 60 frames per second
• Infrared camera for face recognition

Wireless charging
• Qi wireless charging (electrical induction to transfer power)

Cellular radio
• Fourth-generation (4G) cellular network

Wi-Fi and Bluetooth
• Integrated circuit containing a Wi-Fi interface (2.4 GHz and 5 GHz)
• Bluetooth interface supporting version 5

Audio amplifier and vibration motor
• Designed for extremely low power consumption when idle
• Provides high-quality sound with high efficiency
• Vibration is produced by the Taptic Engine

PERSONAL COMPUTER ARCHITECTURE

Alienware Aurora Ryzen Edition
It is a high-end gaming desktop designed for serious gamers who want top performance.
It comes with advanced hardware components that make it extremely powerful, but also
very expensive.

AMD Ryzen 9 3950X Processor
• It has 16 cores and can run 32 threads at the same time, making it great for multitasking.
• It operates at high clock speeds (3.5 GHz base, up to 4.7 GHz boost).
• This processor is built on the Zen 2 architecture, which allows it to perform tasks
about 15% faster than previous models.

The Ryzen 9 3950X has some clever tricks to make it faster and more efficient.

Branch Prediction : The branch predictor uses a type of machine learning (a form of
artificial intelligence). It learns from past branch decisions to predict future ones more
accurately. The result is fewer mistakes, less wasted time, and smoother and faster
performance.
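
To make "learning from past decisions" concrete, here is a minimal sketch of a perceptron-style branch predictor. It is a textbook-style illustration, not AMD's actual design: the history length, the training threshold, and all names are assumptions chosen for the example.

```python
class PerceptronPredictor:
    """Predict a branch from the outcomes of the most recent branches."""

    def __init__(self, history_length=8):
        self.history = [-1] * history_length   # +1 = taken, -1 = not taken
        self.weights = [0] * history_length
        self.bias = 0
        # Hypothetical tuning value: keep training until the output is
        # confidently far from zero.
        self.threshold = int(1.93 * history_length + 14)

    def _output(self):
        return self.bias + sum(w * h for w, h in zip(self.weights, self.history))

    def predict(self):
        """Return True if the branch is predicted taken."""
        return self._output() >= 0

    def update(self, taken):
        """Learn from the real outcome, then record it in the history."""
        y = self._output()
        t = 1 if taken else -1
        if (y >= 0) != taken or abs(y) <= self.threshold:
            self.bias += t
            self.weights = [w + t * h for w, h in zip(self.weights, self.history)]
        self.history = [t] + self.history[:-1]


def accuracy(predictor, outcomes):
    """Fraction of outcomes predicted correctly while learning online."""
    correct = 0
    for taken in outcomes:
        correct += predictor.predict() == taken
        predictor.update(taken)
    return correct / len(outcomes)


predictor = PerceptronPredictor()
accuracy(predictor, [True, False] * 150)          # warm-up on an alternating branch
print(accuracy(predictor, [True, False] * 50))    # accuracy after learning
```

After the warm-up the weights encode the rule "do the opposite of the last outcome", so later predictions on the alternating pattern stop missing.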

Nvidia GeForce RTX 2080 Ti GPU
A high-performance graphics card found in the Aurora Ryzen Edition,
designed for top-tier gaming and advanced computational tasks.

Key Features of GPU

• Ray Tracing - a technique that makes images look more realistic by simulating
how light behaves in real life.
• Machine Learning - the GPU can recognize patterns and make decisions based on data,
for example to improve image resolution.
• Deep Learning Support - the GPU can accelerate machine learning models built with
TensorFlow, a popular platform for deep neural network research.

Figure: Visual effect using ray tracing feature

Core Components of GPU

• Compute Unified Device Architecture (CUDA) Cores : the general-purpose cores of
the GPU -> rendering images, complex calculations in scientific
simulations or data processing tasks
• Tensor Cores : These cores are specifically designed for deep
learning tasks -> handle matrix operations, image recognition or
speech processing
• Ray-Tracing Cores : Calculate ray tracing, a technique that
produces more realistic lighting, reflections, and shadows in 3D
scenes -> enhanced visual effects

• The GPU has 11 GB of GDDR6 memory, which is a type of fast video memory.
• The GPU supports multiple high-quality display outputs.
• The VirtualLink port is specifically designed for virtual reality (VR) headsets. It combines
video, data, and power into a single connection, making it easier to connect VR headsets
without needing multiple cables.

Major Subsystems of the Alienware Aurora Ryzen Edition
Motherboard : Backbone of the computer; supports PCIe 4.0, which doubles the data transfer
speed between the processor and graphics card compared with PCIe 3.0
Chipset : Manages communication between the processor and other components; supports
overclocking
Cooling : Uses an Alienware liquid cooling system
Memory : 32 GB of dual-channel HyperX FURY DDR4 memory; supports Extreme Memory Profiles (XMP)
Storage : 1 TB NVMe M.2 solid-state drive (SSD)

WAREHOUSE SCALE COMPUTING

WAREHOUSE-SCALE COMPUTING (WSC) ARCHITECTURE
Overview
• WSCs aggregate computing in large buildings with thousands of
computers.
• Seen as a single massively parallel computing system rather than
individual servers.
• Evolved from early large-scale systems to today's data centers serving
millions of users.

PROVIDERS OF WSC
• Search engines (e.g., Google Search)
• Cloud computing platforms (e.g., AWS, Azure, Google Cloud)
• Streaming services (e.g., Netflix)

WSC HARDWARE

High-Reliability Hardware
• Utilizes expensive, high-quality components (e.g., enterprise-grade
servers) with low failure rates.
• Reduces the chances of downtime but is costly to scale for thousands of
machines.
• Ideal for critical systems where failure can have significant financial or
operational impacts.

Low-Cost, Average-Reliability Hardware
• Uses commodity hardware with average reliability but significantly lower costs.
• Requires redundant systems and sophisticated failover mechanisms to
ensure service continuity.
• Popular in most large-scale cloud services and Internet platforms, as failure in one
system can be mitigated by quickly shifting workloads to redundant systems.

Index Server Processing in WSC
Lookup Requests
• Each index server processes a stream of filtered lookup requests, focusing only on its portion of the index.
• Generates search results based on matches to the search term.
• For complex queries, multiple index servers collaborate, processing terms in parallel.

Result Processing
• Results from index servers are filtered and merged for higher-level processing.
• The final output is transmitted to the user, providing a ranked list of relevant pages.
• Search results include the URL and contextual text surrounding the search term for better relevance.

Performance Focus
• Response times depend more on database lookup speeds and page content extraction than on raw server processing
power.
• WSCs utilize inexpensive hardware (e.g., commodity motherboards, processors) to handle high-volume requests
efficiently.
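
The lookup and merge steps above can be sketched as a fan-out over index shards followed by a ranked merge. This is a toy illustration under stated assumptions: SHARDS, its URLs, and the scores are made-up data, and threads stand in for separate index servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical index shards: each maps a term to (url, score) pairs and
# represents the portion of the index held by one index server.
SHARDS = [
    {"grid": [("a.example/grid", 0.9)], "cache": [("a.example/cache", 0.4)]},
    {"grid": [("b.example/grid", 0.7)]},
    {"cache": [("c.example/cache", 0.8)], "grid": [("c.example/grid", 0.2)]},
]


def lookup(shard, term):
    """One index server searches only its own portion of the index."""
    return shard.get(term, [])


def search(term, top_k=3):
    """Query all shards in parallel, then merge and rank the partial results."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partial_results = list(pool.map(lambda shard: lookup(shard, term), SHARDS))
    merged = [hit for hits in partial_results for hit in hits]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return [url for url, _ in merged[:top_k]]


print(search("grid"))
print(search("cache", top_k=1))
```

Each shard does a cheap dictionary lookup, which mirrors the point that response time depends more on lookup speed than on raw processing power.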

Rack-Based Servers

Rack Assembly
• WSC servers are typically assembled in rack units (1U slots), with each server occupying
1.75" in height and 19" in width.
• A single rack can hold up to 40 servers, stacked vertically, occupying 70" of vertical space.

Server Components And Scalability
Each server is a fully functional system, with components such as:
• Processor
• RAM
• Local Disk Drive
• 1 Gbit/sec Ethernet Interface
• Servers operate in headless mode with no display, keyboard, or mouse. Configuration and control
occur over the network.
• Consumer-grade hardware components like processors, DRAM, and disks continue to grow in
capacity, enhancing rack server performance over time.

Network Configuration
• Each server within a rack connects to the rack
network switch via 1 Gbit/s Ethernet cables.
• Racks connect to the WSC network through
multiple external Ethernet cables.
• Oversubscription: Rack connectivity is
oversubscribed by a factor of 4, meaning external
connections are only 25% of internal peak capacity.

Figure: A rack containing 16 servers
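
The rack figures quoted above follow from simple arithmetic, sketched here. The function names are hypothetical, and the defaults (1 Gbit/s links, oversubscription factor of 4) are the values given on the slides.

```python
RACK_UNIT_INCHES = 1.75  # height of one 1U slot


def rack_height_inches(servers):
    """Vertical space taken by a stack of 1U servers."""
    return servers * RACK_UNIT_INCHES


def external_bandwidth_gbps(servers, link_gbps=1.0, oversubscription=4.0):
    """External rack bandwidth implied by the oversubscription factor."""
    internal_peak = servers * link_gbps   # all servers sending at once
    return internal_peak / oversubscription


print(rack_height_inches(40))        # 70.0 inches for a full rack
print(external_bandwidth_gbps(16))   # 4.0 Gbit/s for the 16-server rack
print(external_bandwidth_gbps(40))   # 10.0 Gbit/s for a full rack
```

An oversubscription factor of 4 means that only a quarter of the servers' combined peak traffic can leave the rack at the same moment.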

Cluster Organization
• Racks are organized into clusters, each sharing a cluster-level switch that
connects to the broader WSC network.
• Redundant web servers and index servers handle search requests and database
lookups, ensuring load balancing and failure protection.

Request Handling
• User requests are routed to available web servers, which direct queries to
appropriate index servers in the rack clusters.
• Index lookups and search results are processed and passed back through the
WSC network to the user.

Figure: WSC Internal Network

Hardware Fault Management

• WSCs, with thousands of computers, will inevitably experience hardware failures despite using
high-quality components.
• WSCs use a multilevel dispatch, processing, and response model.
• A request typically passes through multiple server levels, each forwarding it to a lower-level
server until completed.

• Each server has a monitoring system that checks two main aspects when it sends a request to
another server:
- Responsiveness: How quickly the lower-level server responds.
- Correctness: Whether the response is valid or meets expected criteria.
• If a server receives an incorrect result or no response in time, it flags the lower-level server
as "unresponsive" or "misbehaving", ensuring quick identification of faults.
• For example, when a user sends a search request, the system delegates it to smaller systems.
If one fails, the main system detects it and reassigns the task to ensure minimal delay.
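
The two checks (responsiveness and correctness) and the reassignment step can be sketched as follows. This is a simplified illustration: the server callables, the timeout value, and the validity rule are all hypothetical, and the sketch measures elapsed time after a call returns rather than cancelling a slow call.

```python
import time


def check_response(call, request, timeout_s, is_valid):
    """Return (status, result) for one call to a lower-level server."""
    start = time.monotonic()
    try:
        result = call(request)
    except Exception:
        return "unresponsive", None
    if time.monotonic() - start > timeout_s:
        return "unresponsive", None      # answered, but too late
    if not is_valid(result):
        return "misbehaving", None       # answered in time, but wrongly
    return "ok", result


def dispatch(servers, request, timeout_s=0.1,
             is_valid=lambda r: isinstance(r, list)):
    """Try redundant servers in order; flag the ones that fail a check."""
    flagged = {}
    for name, call in servers:
        status, result = check_response(call, request, timeout_s, is_valid)
        if status == "ok":
            return result, flagged
        flagged[name] = status
    raise RuntimeError("no healthy server available")


def slow_server(request):
    time.sleep(0.3)                      # exceeds the timeout
    return ["late-page"]


def bad_server(request):
    return "garbage"                     # wrong type, fails validation


def good_server(request):
    return ["page-1"]


servers = [("slow", slow_server), ("bad", bad_server), ("good", good_server)]
result, flagged = dispatch(servers, "grid computing")
print(result, flagged)
```

The request still completes because the work is reassigned to a healthy server, while the flagged servers can be repaired in the background.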

• Automated processes ensure that failures are detected and handled without human intervention.
• WSCs are built with multiple layers of redundancy and backup systems.
• When a failure occurs, rerouting is done quickly enough to prevent disruption, so users typically
won’t notice any service interruption.
• Failed systems are repaired or replaced in the background while the overall system keeps
functioning.

Electrical power consumption

• One of the primary operational expenses of running a Warehouse-Scale Computer (WSC) is the
cost of electricity.
• WSCs are vast data centers that house thousands of servers and networking devices, all of which
require a significant amount of power to function.
• In addition to this, cooling systems are essential to prevent these devices from overheating, which
further adds to the overall power consumption.
• Power management is critical to minimizing costs and ensuring efficient operation. Servers
should only be active when they are processing data for users, and powering down unused
servers saves a significant amount of energy.

THE WSC AS A MULTILEVEL INFORMATION CACHE

• An index server should store a large portion of its data in an in-memory database for faster
response times.
• DRAM supports in-memory databases by storing more index data, requiring a balance between
cost and performance when deciding between fewer servers with more memory or more servers
with less memory.
• First level of caching: DRAM (in-memory database).
• Second level: Local disk on the server.
• Third level: Communication between servers in the same rack.
• Fourth level: Across racks in the same cluster.
• Fifth level: Across clusters, which is the slowest due to bandwidth limitations.
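
The five cache levels can be pictured as a lookup that walks from the fastest level to the slowest. The sketch below is only illustrative: the cost numbers are invented placeholders (only their ordering follows the list above), and LEVELS and find are hypothetical names.

```python
# Hypothetical relative access costs; only the ordering follows the slides.
LEVELS = [
    ("DRAM (in-memory database)", 1),
    ("local disk", 100),
    ("same rack", 300),
    ("same cluster", 1_000),
    ("other cluster", 10_000),
]


def find(key, stores):
    """Search the levels from fastest to slowest.

    `stores` holds one dictionary per level. Returns the value, the level
    where it was found, and the accumulated cost of every level consulted.
    """
    total_cost = 0
    for (level, cost), store in zip(LEVELS, stores):
        total_cost += cost
        if key in store:
            return store[key], level, total_cost
    return None, None, total_cost


stores = [{"grid": 1}, {"cache": 2}, {}, {"wsc": 3}, {}]
print(find("grid", stores))   # (1, 'DRAM (in-memory database)', 1)
print(find("wsc", stores))    # (3, 'same cluster', 1401)
```

The further down the hierarchy a lookup has to go, the more it costs, which is why index servers keep as much data as possible in DRAM.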

Machine Learning Architecture

• Trains on and processes data to generate predictions

• Neural Networks : one type of machine learning architecture

Intel Nervana neural network processor

• Training a neural network requires frequent weight updates

• Tensors : serve as containers for data (multidimensional arrays)

• Intel recommends bfloat16 for deep learning

• The data types in the Nervana architecture are bfloat16 and IEEE-754 32-bit floating point



Why Choose bfloat16?

• IEEE-754 16-bit (half-precision) : 5-bit exponent, 10-bit mantissa

• IEEE-754 32-bit : 8-bit exponent, 23-bit mantissa

• bfloat16 : 8-bit exponent, 7-bit mantissa

• With the same 8-bit exponent as the 32-bit format, bfloat16 stays stable when operating on
very large and very small values

• It is more compatible with IEEE-754 32-bit, because the sign and exponent fields match
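
The bit layouts above explain why converting between bfloat16 and IEEE-754 32-bit is cheap: a bfloat16 value is the upper 16 bits of the 32-bit encoding. The sketch below uses only the standard struct module and truncates rather than rounds, which is a simplification; the function names are hypothetical.

```python
import struct


def float32_bits(x):
    """Return the 32-bit IEEE-754 encoding of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def to_bfloat16_bits(x):
    """Keep the sign, the 8-bit exponent, and the top 7 mantissa bits."""
    return float32_bits(x) >> 16          # truncation, no rounding


def bfloat16_to_float(bits):
    """Expand bfloat16 bits back to a float by zero-filling the low 16 bits."""
    return struct.unpack(">f", struct.pack(">I", bits << 16))[0]


def fits_in_half_precision(x):
    """Check whether x is within the range of IEEE-754 16-bit floats."""
    try:
        struct.pack(">e", x)
        return True
    except (OverflowError, struct.error):
        return False


big = 3.0e38
print(hex(to_bfloat16_bits(1.0)))                  # 0x3f80
print(bfloat16_to_float(to_bfloat16_bits(big)))    # close to 3.0e38
print(fits_in_half_precision(big))                 # False
```

The large value survives the bfloat16 round trip with reduced precision, while the 5-bit exponent of half precision cannot represent it at all.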



Usage of Neural Network Processor

• Natural language processing (NLP) : understanding and generating human language
(e.g., Google Translate, question-answering AI)

• Machine vision : interpreting and analyzing visual information from the world (e.g.,
autonomous vehicles, object detection, facial recognition)



NNP-T AND NNP-I

Intel Nervana NNP-T (Training): developing and refining AI models, such as
deep learning training tasks
• NNP-T1300 : fits in a standard PC
• NNP-T1400 : for supercomputer-level performance

Intel Nervana NNP-I (Inference): running trained models
• PCIe card : contains two NNP I-1000 devices
• Use Case: more processing power but consumes more energy
• M.2 card : contains one NNP I-1000 device
• Use Case: more compact and energy-efficient but offers less processing power

THANK YOU!
