The document discusses parallel computer memory architectures. It describes three main types:
1. Shared memory systems, which allow all processors access to the same global memory space. These include Uniform Memory Access (UMA) and Non-Uniform Memory Access (NUMA) systems.
2. Distributed memory systems, where each processor has its own local memory and communication is needed to access data on other processors.
3. Hybrid distributed shared memory systems, which combine shared and distributed memory by networking multiple shared memory machines together. This allows scalability while maintaining shared memory within individual nodes.
Memory Models
• Data and instructions in a parallel program are stored in main memory, where they are accessible to the processors during execution.
• The way in which main memory is used by the processors of a multiprocessor system divides parallel systems into:
  1. Shared memory systems
  2. Distributed memory systems
STRUCTURAL CLASSIFICATION
1. Shared Memory
General Characteristics
• Shared memory parallel computers vary widely, but they all share the ability for every processor to access all memory as a global address space (see the sketch below).
• Multiple processors can operate independently but share the same memory resources.
• Changes in a memory location effected by one processor are visible to all other processors.
• Shared memory machines are classified as UMA and NUMA, based upon memory access times.
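To make the global address space concrete, here is a minimal sketch (not from the slides) in C with OpenMP: every thread reads and writes the same array, so updates made by one thread are visible to all others, as described above. Compile with e.g. `gcc -fopenmp`.

```c
/* Minimal shared-memory sketch: all threads operate on one array in a
 * single global address space. (Hypothetical example.) */
#include <stdio.h>
#include <omp.h>

#define N 8

int main(void) {
    int data[N];                         /* single array visible to all threads */

    #pragma omp parallel for             /* threads share 'data'; iterations are split */
    for (int i = 0; i < N; i++)
        data[i] = omp_get_thread_num();  /* each thread writes into the shared array */

    for (int i = 0; i < N; i++)          /* every thread's update is visible here */
        printf("data[%d] written by thread %d\n", i, data[i]);

    return 0;
}
```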
• Uniform Memory Access (UMA)
  – Most commonly represented today by Symmetric Multiprocessor (SMP) machines
  – Identical processors
  – Equal access and access times to memory
  – Sometimes called CC-UMA (Cache Coherent UMA)
• Coherency is an issue whenever there are multiple cores or processors, each with its own cache. An update made on one core may not be seen by another core if the second core's local cache holds an old value of the affected memory location.
• Thus, whenever an update occurs to a memory location, copies of that memory location held in other caches must be invalidated.
• In many processor architectures such invalidation is done lazily, which can lead to temporary inconsistency.
• Cache coherency is accomplished at the hardware level.
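Cache coherency itself is enforced in hardware, but the visibility problem it addresses can be illustrated from software. The hypothetical C11/pthreads sketch below has one thread publish a value and another wait for it; the release/acquire atomics ensure the reader does not observe a stale copy of `payload`. Compile with `-pthread`.

```c
/* Hypothetical sketch of update visibility between cores.  Thread 1 writes
 * a value and sets a flag; thread 2 waits for the flag and then reads the
 * value.  Release/acquire ordering guarantees the reader sees the update;
 * the underlying propagation between caches is done by the hardware. */
#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static int payload = 0;           /* data written by the producer        */
static atomic_int ready = 0;      /* flag that publishes the data        */

static void *producer(void *arg) {
    payload = 42;                                            /* update the memory location */
    atomic_store_explicit(&ready, 1, memory_order_release);  /* make it visible to others  */
    return NULL;
}

static void *consumer(void *arg) {
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;                                     /* spin until the update becomes visible */
    printf("consumer sees payload = %d\n", payload);   /* guaranteed to print 42 */
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&c, NULL, consumer, NULL);
    pthread_create(&p, NULL, producer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}
```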
Non-Uniform Memory Access (NUMA)
  – Often made by physically linking two or more SMPs
  – One SMP can directly access memory of another SMP
  – Not all processors have equal access time to all memories
  – Memory access across the link is slower
  – If cache coherency is maintained, may also be called CC-NUMA (Cache Coherent NUMA)
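As an illustration of non-uniform access, the sketch below uses the Linux libnuma library (an assumption, not mentioned in the slides) to place one buffer on node 0 and one on the highest-numbered node; accesses to the remote buffer must cross the link and are therefore slower. Link with `-lnuma`.

```c
/* Hedged NUMA sketch using libnuma: memory is explicitly placed on a chosen
 * node, so CPUs on that node access it locally (fast) while CPUs on other
 * nodes go across the link (slower).  Assumes Linux with libnuma installed. */
#include <numa.h>
#include <stdio.h>

int main(void) {
    if (numa_available() < 0) {            /* NUMA not supported on this system */
        fprintf(stderr, "no NUMA support\n");
        return 1;
    }

    int last_node = numa_max_node();       /* highest node number in the machine */
    size_t size = 1024 * 1024;

    /* Allocate 1 MiB backed by memory on node 0 and on the last node. */
    char *near = numa_alloc_onnode(size, 0);
    char *far  = numa_alloc_onnode(size, last_node);
    if (!near || !far) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    near[0] = 1;   /* local access for CPUs on node 0              */
    far[0]  = 1;   /* remote (slower) access when run from node 0  */

    numa_free(near, size);
    numa_free(far, size);
    return 0;
}
```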
Advantages
  – Global address space provides a user-friendly programming perspective to memory.
  – Data sharing between tasks is both fast and uniform due to the proximity of memory to CPUs.
Disadvantages
  – The primary disadvantage is the lack of scalability between memory and CPUs. Adding more CPUs can geometrically increase traffic on the shared memory–CPU path and, for cache coherent systems, geometrically increase the traffic associated with cache/memory management.
  – The programmer is responsible for synchronization constructs that ensure "correct" access to global memory (see the sketch below).
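The synchronization responsibility mentioned in the last disadvantage can be illustrated with a small OpenMP sketch (hypothetical, assuming a shared counter): without the `atomic` directive, the concurrent increments would race and the final total would be wrong.

```c
/* Sketch of the programmer's synchronization responsibility: the shared
 * counter is updated by many threads, so each increment must be protected. */
#include <stdio.h>
#include <omp.h>

int main(void) {
    long counter = 0;                  /* shared global variable */

    #pragma omp parallel for
    for (int i = 0; i < 1000000; i++) {
        #pragma omp atomic             /* serialize each increment to keep it correct */
        counter++;
    }

    printf("counter = %ld\n", counter);   /* prints 1000000 */
    return 0;
}
```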
2. Distributed Memory
General Characteristics
  – Distributed memory systems require a communication network to connect inter-processor memory.
  – Each processor has its own local memory.
  – Memory addresses in one processor do not map to another processor, so there is no concept of a global address space.
  – Because each processor has its own local memory, it operates independently; changes it makes to its local memory have no effect on the memory of other processors. Hence there is no concept of cache coherency across processors.
General Characteristics
  – When a processor needs access to data in another processor, it is usually the task of the programmer to explicitly define how and when data is communicated (as sketched below).
  – Synchronization between tasks is likewise the programmer's responsibility.
  – The network "fabric" used for data transfer varies widely, though it can be as simple as Ethernet.
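A minimal MPI sketch of this explicit communication (assuming an MPI installation such as MPICH or Open MPI): each rank owns a private `value`, and the only way rank 1 can see rank 0's data is through an explicit send/receive pair. Run with e.g. `mpirun -np 2 ./a.out`.

```c
/* Hedged distributed-memory sketch: each rank has its own local buffer,
 * and data moves between ranks only through explicit MPI communication. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    int value = -1;                          /* local memory, private to this rank */

    if (rank == 0) {
        value = 99;
        /* Explicit communication: the programmer decides how and when to send. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", value);
    }

    MPI_Finalize();
    return 0;
}
```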
Advantages
  – Memory and processors are highly scalable: increasing the number of processors increases the total memory proportionately.
  – Each processor can rapidly access its own memory without interference and without the overhead of maintaining cache coherency.
  – Cost effectiveness: can use commodity, off-the-shelf processors and networking.
Disadvantages
  – The programmer is responsible for data communication between processors.
  – It may be difficult to map existing data structures, based on global memory, to this memory organization.
  – Non-uniform memory access times: data residing on a remote node takes longer to access than node-local data.

3. Hybrid Distributed Shared Memory
General Characteristics
• The largest and fastest computers in the world today employ both shared and distributed memory architectures.
• The shared memory component is usually a cache coherent SMP machine. Processors on a given SMP can address that machine's memory as global.
• The distributed memory component is the networking of multiple shared memory machines, which know only about their own memory, not the memory on another machine.
• Therefore, network communications are required to move data from one machine to another (see the sketch below).
• Current trends indicate that this type of memory architecture will continue to prevail and increase at the high end of computing for the foreseeable future.
Advantages and Disadvantages
• Increased scalability is an important advantage.
• Increased programmer complexity is an important disadvantage.
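A hedged sketch of the hybrid model, assuming MPI plus OpenMP (compile with e.g. `mpicc -fopenmp`): OpenMP threads share memory inside each node to compute a node-local sum, and MPI moves data between nodes to combine the per-node results.

```c
/* Hybrid sketch: OpenMP for shared memory within a node, MPI for
 * communication across nodes.  Assumes an MPI library with thread support. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int provided;
    /* Request thread support so OpenMP threads can coexist with MPI. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    long local_sum = 0;

    /* Shared memory inside the node: threads cooperate on node-local data. */
    #pragma omp parallel for reduction(+:local_sum)
    for (int i = 0; i < 1000; i++)
        local_sum += i;

    /* Distributed memory across nodes: combine per-node results via MPI. */
    long global_sum = 0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %ld\n", global_sum);

    MPI_Finalize();
    return 0;
}
```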