0% found this document useful (0 votes)

23 views22 pages

2.2 DD2356 Threads

The document discusses threads and multicore processors. It describes how multiple processing elements or cores can be placed on the same chip to create multicore processors. It discusses different approaches to adding processing elements and sharing resources between cores. It also covers thread programming models, synchronization issues, variable naming, and scheduling of threads across cores.

Uploaded by

Daniel Araújo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

23 views22 pages

2.2 DD2356 Threads

Uploaded by

Daniel Araújo

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 22

DD2356 – Threads

Stefano Markidis
Add to the (Model) Architecture

• What do you do with a billion transistors?

– For a long time, try to make an individual
processor (what we now call a core) faster
– Increasingly complicated hardware
yielded less and less benefit
(speculation, out of order execution,
prefetch, ...)
• An alternative is to simply put multiple processing
elements (cores) on the same chip
https://superuser.com/questions/584900/how-distinguish-between-multicore-and-multiprocessor-systems

– Thus, the “multicore processor” or “multicore

chip”

2
Adding Processing Elements I

Core
• Here’s our model so far, with the vector and
pipelining part of the “core”
– Most systems today have an L3 cache as
L1 Cache
well
• We can (try to) replicate everything...
L2 Cache

Memory

3
Adding Processing Elements II

Core Core Core Core

• Something like this
would be simple
L1 Cache L1 Cache L1 Cache L1 Cache
• But in practice,
some resources are
L2 Cache L2 Cache L2 Cache L2 Cache shared, giving us...

Memory Memory Memory Memory

4
Adding Processing Elements III

Core Core Core Core

L1 Cache L1 Cache L1 Cache L1 Cache

L2 Cache L2 Cache

Memory

5
Notes on Multicore

• Some resources are shared

– Typically the larger (slower) caches, path to memory
– May share functional units within the core (variously called
simultaneous multithreading (SMT) or hyperthreading)
– Rarely enough bandwidth for shared resources (cache, memory)
to supply all cores at the same time

6
Process VS Thread - I
Multi-Threaded
• A thread is a basic unit of Process Process
processor utilization, consisting
of a program counter, a stack
and registers.
• Processes have a single
thread of control: there is one
program counter, and one
sequence of instructions that
can be carried out at any given
time.

7
• Executing program (process) is
Process VS
• Executing Thread
defined II
by
program (process) is
defined by ♦ Address space Process

♦ Address space
♦ Program Counter
• A process is defined by address
♦ Program Counter
• Threads are multiple program
space and counter

• Threads are multiple program

• Threads are multiple program
counters
counters (+ stack and registers)
Multi-threaded process
counters

8
8 8
Programming Models For Multicore processors
• Parallelism within a process
• Compiler-managed parallelism
– Transparent to programmer
– Rarely successful
• Threads
– Within a process, all memory shared
– Each “thread” executes “normal” code
– Many subtle issues (more later)
• Parallelism between processes within a node covered later in the third
module

9
Why Use Threads?

• Manage multiple points of interaction

• Low overhead steering/probing
– Background checkpoint save
• Alternate method for nonblocking operations
• Hiding memory latency
• Fine-grain parallelism
• Compiler parallelism

10
Common Thread Programming Models

• Library-based (invoke a routine in a separate thread)

• pthreads (POSIX threads)
– See “Threads cannot be implemented as a library,” H. Boehm
http://www.hpl.hp.com/techreports/2004/ HPL-2004-209.pdf
• Separate enhancements to existing languages
• OpenMP, OpenACC, OpenCL, CUDA, ...
• Within the language itself
• Java, C11, others

11
Thread Issues

1. Synchronization
• Avoiding conflicting operations (memory references) between threads
2. Variable Name Space
• Interaction between threads and the Language
3. Scheduling
• Will the OS do what you want?

12
Synchronization of Access

Read/write model

What does thread 2 print?

Take a few minutes and think about the possibilities

13
Synchronization of Access

Many possibilities:
• 2 (what the programmer expected)
• 1 (thread 1 reorders stores so a=2 executed before b=2 (valid in language)
• Nothing: a never changes in thread 2
• Some other value from thread 1 (value of b before this code starts)

14
How Can We Fix This?
• Need to impose an order on the memory updates
– OpenMP has FLUSH
– Memory barriers (more on this later)
• Need to ensure that data updated by another thread is reloaded
– Copies of memory in cache may update eventually
– In this example, a may be (is likely to be) in register, never updated
– volatile in C

15
Synchronization of Access

• Often need to ensure that updates happen atomically (all or nothing)

– Critical sections, lock/unlock, and similar methods
– Java has synchronized methods
– C11 provides atomic memory operations

16
Variable Names
• Each thread can access all of a processes memory (except for other
thread’s stack)
– Named variables refer to the address space - thus visible to all
threads
– Compiler doesn’t distinguish A in one thread from A in another
– No modularity
– Like using Fortran blank COMMON for all variables

17
Scheduling Threads
• If threads used for latency hiding
– Schedule on the same core
– Provides better data locality, cache usage
• If threads used for parallel execution
– Schedule on different cores using different memory pathways
– Appropriate for data parallelism

18
Node Execution Models
• Where do threads run on a node?
– Typical user expectation: User’s applications uses all cores and has
complete access to them
• Reality is complex.
• Common cases include:
– OS pre-empts core 0; Or cores 0,2
– OS pre-empts user threads, distributes across cores
– Hidden core (BG/Q)

19
Performance Models: Memory
• Assume the time to move a unit of memory is tm
– Due to latency in hardware; clock rate of data paths
– Rate is 1/tm = rm
• Also assume that there is a maximum rate rmax
– E.g., width of data path * clock rate
• Then the rate at which k threads can move data is
– min(k/tm,rmax) = min(krm,rmax)

20
Limits on Thread Performance

• Threads share memory resources

• Performance is roughly linear with
additional threads until the maximum
bandwidth is reached
• At that point, each thread receives a
decreasing fraction of available bandwidth

21
Questions

• How do you expect a multithreaded STREAM to perform as you add threads?

• What happens if there are more threads that cores?
– Can programs run faster in that case?

HW1
No ratings yet
HW1
2 pages
20 Distributed Reliability Protocols PDF
0% (2)
20 Distributed Reliability Protocols PDF
31 pages
EE5900 Advanced Embedded System For Smart Infrastructure: RMS and EDF Scheduling
No ratings yet
EE5900 Advanced Embedded System For Smart Infrastructure: RMS and EDF Scheduling
35 pages
OS Lecture3 - Inter Process Communication
No ratings yet
OS Lecture3 - Inter Process Communication
43 pages
Ll4-l5 - Process Synchronization1
No ratings yet
Ll4-l5 - Process Synchronization1
93 pages
CPU Scheduling: Exercises
No ratings yet
CPU Scheduling: Exercises
8 pages
Unix Operating System Lab Lab Practical Exercise-1
No ratings yet
Unix Operating System Lab Lab Practical Exercise-1
17 pages
Unit Ii Process Management
No ratings yet
Unit Ii Process Management
44 pages
Chapter-5 Threads and Concurrancy
No ratings yet
Chapter-5 Threads and Concurrancy
47 pages
CH 17
No ratings yet
CH 17
40 pages
Threads & Concurrency: Lecture 23 - CS2110 - Fall 2018
No ratings yet
Threads & Concurrency: Lecture 23 - CS2110 - Fall 2018
34 pages
Contents of Lecture Threads Multithreads Threads Models Multithreads in Java Advantages Disadvantages
No ratings yet
Contents of Lecture Threads Multithreads Threads Models Multithreads in Java Advantages Disadvantages
48 pages
Unit-5 Part1
No ratings yet
Unit-5 Part1
85 pages
Concurrency On Operating System: Platform Technology
No ratings yet
Concurrency On Operating System: Platform Technology
21 pages
Unit 5 Transaction Processing
No ratings yet
Unit 5 Transaction Processing
12 pages
Suzuki Kasami
No ratings yet
Suzuki Kasami
32 pages
Distributed Systems
No ratings yet
Distributed Systems
238 pages
Lecture 8 Deadlock in Details
No ratings yet
Lecture 8 Deadlock in Details
43 pages
Semaphore Unix
No ratings yet
Semaphore Unix
5 pages
CSE 260 - Introduction To Parallel Computation: Larry Carter Carter@cs - Ucsd.edu
No ratings yet
CSE 260 - Introduction To Parallel Computation: Larry Carter Carter@cs - Ucsd.edu
22 pages
MIS403 Lec15 Nov14
No ratings yet
MIS403 Lec15 Nov14
24 pages
OS Chapter 4 Final
No ratings yet
OS Chapter 4 Final
27 pages
Lec17 Threads Introduction
No ratings yet
Lec17 Threads Introduction
20 pages
Operating System 4
No ratings yet
Operating System 4
33 pages
Chapter 2 Process Management Part 2 Threads and Multithreading
No ratings yet
Chapter 2 Process Management Part 2 Threads and Multithreading
42 pages
Processes and Threads: - An Operating System Executes A Variety of Programs
No ratings yet
Processes and Threads: - An Operating System Executes A Variety of Programs
24 pages
Deadlocks in OS
No ratings yet
Deadlocks in OS
2 pages
Lec 4 Superscalarprocessor PDF
No ratings yet
Lec 4 Superscalarprocessor PDF
23 pages
Roll No. ...................... Total Pages: 3: BCA/D-21
No ratings yet
Roll No. ...................... Total Pages: 3: BCA/D-21
3 pages
Operating Systems: Threads
No ratings yet
Operating Systems: Threads
32 pages
Hreads: Program Counter: Registers
No ratings yet
Hreads: Program Counter: Registers
21 pages
Questions Answered in This Lecture:: - Why Are Threads Useful? - How Does One Use POSIX Pthreads?
No ratings yet
Questions Answered in This Lecture:: - Why Are Threads Useful? - How Does One Use POSIX Pthreads?
6 pages
4 OS Threads
No ratings yet
4 OS Threads
25 pages
Biruk Tewoderos 1790
No ratings yet
Biruk Tewoderos 1790
21 pages
Software Requirements Specification Srs
No ratings yet
Software Requirements Specification Srs
34 pages
Threads in Operating System
No ratings yet
Threads in Operating System
103 pages
Advanced Performance Optimization in CUDA (S62192)
No ratings yet
Advanced Performance Optimization in CUDA (S62192)
127 pages
CH 4
No ratings yet
CH 4
21 pages
Green Black Geometric How To Find The Right University Presentation
No ratings yet
Green Black Geometric How To Find The Right University Presentation
15 pages
4 Threads and Concurrency
No ratings yet
4 Threads and Concurrency
62 pages
Lecture 04 ThreadAndMultithreading
No ratings yet
Lecture 04 ThreadAndMultithreading
35 pages
Threads: Tevfik Koşar
100% (1)
Threads: Tevfik Koşar
40 pages
(OS) - Unit-2.2-2.5 Process Management
No ratings yet
(OS) - Unit-2.2-2.5 Process Management
72 pages
System Programming - II Threads
No ratings yet
System Programming - II Threads
46 pages
Operating Systems: Suad Alaofi
No ratings yet
Operating Systems: Suad Alaofi
13 pages
NV Operating Systems UNIT II
No ratings yet
NV Operating Systems UNIT II
91 pages
Operating System Compiled Notes
No ratings yet
Operating System Compiled Notes
20 pages
Java Concurrency
No ratings yet
Java Concurrency
70 pages
Csi 3131 Mod 3 Threads
No ratings yet
Csi 3131 Mod 3 Threads
54 pages
Asynchronous Concurrent Execution
No ratings yet
Asynchronous Concurrent Execution
68 pages
Syllabus
No ratings yet
Syllabus
1 page
4 Threads
No ratings yet
4 Threads
61 pages
Chapter 04
No ratings yet
Chapter 04
37 pages
Lecture 13 - Programming Models
No ratings yet
Lecture 13 - Programming Models
15 pages
Threads
No ratings yet
Threads
16 pages
4 Threads
No ratings yet
4 Threads
41 pages
Java Concurrency 101 1709730229
No ratings yet
Java Concurrency 101 1709730229
26 pages
OS Module 1 Slides-2
No ratings yet
OS Module 1 Slides-2
47 pages
Threads: Multicore Programming Multithreading Models Thread Libraries Threading Issues Operating System Examples
No ratings yet
Threads: Multicore Programming Multithreading Models Thread Libraries Threading Issues Operating System Examples
22 pages
OS-PROCESS MANAGEMENT Module - 2.2
No ratings yet
OS-PROCESS MANAGEMENT Module - 2.2
89 pages
Threads
No ratings yet
Threads
8 pages
Threads
No ratings yet
Threads
38 pages
Deadlock
No ratings yet
Deadlock
15 pages
Multithreading PPT
No ratings yet
Multithreading PPT
44 pages
5 Threads
No ratings yet
5 Threads
33 pages
Chapter 4
No ratings yet
Chapter 4
18 pages
Lecture 25
No ratings yet
Lecture 25
41 pages
OS Module2 Unit2
No ratings yet
OS Module2 Unit2
43 pages
DSCC Unit 1 PDF
No ratings yet
DSCC Unit 1 PDF
14 pages
Threads OS
No ratings yet
Threads OS
21 pages
Lec6 - TLP Data Dependence Solutions
No ratings yet
Lec6 - TLP Data Dependence Solutions
20 pages
Lec04 SOFE3950 Threads
No ratings yet
Lec04 SOFE3950 Threads
53 pages
Lecture 16
No ratings yet
Lecture 16
30 pages
Unit2-Hardware Solution To Process Synchronization
No ratings yet
Unit2-Hardware Solution To Process Synchronization
19 pages
02 Parallel Processing - CPP and OMP
No ratings yet
02 Parallel Processing - CPP and OMP
64 pages
Lecture 03
No ratings yet
Lecture 03
49 pages
TranQuocVietAnh HW5
No ratings yet
TranQuocVietAnh HW5
5 pages
Parallel Programming Unit 2
No ratings yet
Parallel Programming Unit 2
71 pages
Operating Systems-3-Threads 3
No ratings yet
Operating Systems-3-Threads 3
17 pages
OpenMP SPM
No ratings yet
OpenMP SPM
9 pages
05 Thread
No ratings yet
05 Thread
33 pages
Os4 p2c4 Threads
No ratings yet
Os4 p2c4 Threads
20 pages
4.OS Threads Dr. Punit
No ratings yet
4.OS Threads Dr. Punit
48 pages
Os4 p2c4 Threads
No ratings yet
Os4 p2c4 Threads
20 pages
Chapter - 4 Threads (Full)
No ratings yet
Chapter - 4 Threads (Full)
72 pages
Os2 &3module
No ratings yet
Os2 &3module
69 pages
MULTITHREADING
No ratings yet
MULTITHREADING
33 pages
Zig Programming: From Zero to Systems Master
From Everand
Zig Programming: From Zero to Systems Master
Niklas Hoffmann
No ratings yet
All My IT Tech Posts
From Everand
All My IT Tech Posts
Stephen Edwards
No ratings yet
Computer Science: Learn about Algorithms, Cybersecurity, Databases, Operating Systems, and Web Design
From Everand
Computer Science: Learn about Algorithms, Cybersecurity, Databases, Operating Systems, and Web Design
Jonathan Rigdon
No ratings yet

2.2 DD2356 Threads

Uploaded by

2.2 DD2356 Threads

Uploaded by

DD2356 – Threads

• What do you do with a billion transistors?

– Thus, the “multicore processor” or “multicore

Core Core Core Core

Memory Memory Memory Memory

Core Core Core Core

L1 Cache L1 Cache L1 Cache L1 Cache

• Some resources are shared

• Threads are multiple program

• Manage multiple points of interaction

• Library-based (invoke a routine in a separate thread)

What does thread 2 print?

• Often need to ensure that updates happen atomically (all or nothing)

• Threads share memory resources

• How do you expect a multithreaded STREAM to perform as you add threads?

You might also like