2.4.8 Examples of Simultaneous Multithreading (English, auto-generated transcript)

welcome back to this module on exploiting TLP to improve unicore throughput. in this final lesson of the module I will give some examples of simultaneously multithreaded processors. let's now have a look at some examples of SMT processors.

the first commercial processor that implemented simultaneous multithreading was the Intel Pentium 4, which entered the market in 2002. it was the first commercial SMT design, and Intel called simultaneous multithreading Hyper-Threading Technology. this table summarizes the sharing possibilities between the different thread contexts. replicated per thread are the program counter, the return address predictors, the instruction fetch queue, and so on. fully shared are the caches; the caches are not replicated, because if we replicated the caches we would have a multicore, which is the topic of a different module. the physical registers are shared between the different hardware thread contexts, as are the functional units (that, obviously, is the whole idea of simultaneous multithreading), the branch prediction hardware, the control logic, and the buses. the branch prediction hardware is shared, but not the return address predictors, because those are thread-dependent.
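
To make that sharing table concrete, here is a minimal Python sketch of the idea (my own illustration, not Intel's actual design; all names and sizes are hypothetical): some state exists once per hardware thread context, while the big structures exist once per core and are shared.

```python
from dataclasses import dataclass, field

@dataclass
class ThreadContext:
    """State replicated per hardware thread context (one copy per thread)."""
    program_counter: int = 0
    return_address_predictor: list = field(default_factory=list)  # per-thread return stack
    instruction_fetch_queue: list = field(default_factory=list)

@dataclass
class SMTCore:
    """Resources shared by all hardware thread contexts on one SMT core."""
    num_threads: int = 2
    caches: dict = field(default_factory=dict)            # shared; replicating them would give a multicore
    physical_registers: list = field(default_factory=lambda: [0] * 128)
    functional_units: list = field(default_factory=lambda: ["ALU0", "ALU1", "FPU", "LSU"])
    branch_predictor: dict = field(default_factory=dict)  # shared, unlike the return address predictors

    def __post_init__(self):
        # One replicated context per hardware thread.
        self.contexts = [ThreadContext() for _ in range(self.num_threads)]
```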

furthermore, there is a threshold for the processor scheduler: it does not allow one thread to use all of a resource, even if that resource is shared. the problem otherwise is that it may lead to livelock, where some threads do not get serviced at all. therefore, a threshold policy is applied to the processor scheduler.

here's another example of a simultaneously multithreaded processor: the IBM POWER5. it implements simultaneous multithreading on top of the POWER4 microarchitecture, and it employs 2-way multithreading, so it executes two threads at the same time. in fact, it supports a simultaneous multithreading (SMT) mode as well as a single-threaded (ST) mode. in SMT mode, the two threads dynamically share everything there is to share on the chip, whereas in ST mode all resources are devoted to executing a single thread as fast as possible. every thread has its own instruction fetch queue, and in every cycle the dispatch hardware selects one of the two queues to decode from, based on some priority algorithm. so the instructions fetched, decoded, and dispatched in one cycle all come from the same thread; only the back end, the execution, is really simultaneously multithreaded.
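
A minimal sketch of that front-end behaviour (my own approximation of the idea; the weighted random pick and the dispatch-group size of five are illustrative, not IBM's exact algorithm):

```python
import random

fetch_queues = {0: ["ld r1, 0(r2)", "add r3, r1, r4"],   # thread 0's fetch queue
                1: ["mul r5, r6, r7"]}                    # thread 1's fetch queue
priority = {0: 3, 1: 1}  # hypothetical per-thread priorities

def dispatch_one_cycle():
    """Each cycle, pick ONE thread's queue and decode/dispatch only from it;
    only the out-of-order back end mixes instructions from both threads."""
    candidates = [t for t, q in fetch_queues.items() if q]
    if not candidates:
        return []
    # Higher-priority threads are chosen more often (illustrative policy).
    weights = [priority[t] for t in candidates]
    thread = random.choices(candidates, weights=weights, k=1)[0]
    group = fetch_queues[thread][:5]          # one dispatch group, single thread
    del fetch_queues[thread][:len(group)]
    return [(thread, insn) for insn in group]
```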

furthermore, a thread bit is added to the architectural register number to access the renaming table. in this way, the renaming table is shared between the two threads.
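
For example, the shared rename table can simply be indexed by the thread bit concatenated with the architectural register number; a minimal sketch (the register count and helper names are illustrative):

```python
NUM_ARCH_REGS = 32  # architectural registers per thread (illustrative)

# One shared renaming table with an entry per (thread, architectural register).
rename_table = [None] * (2 * NUM_ARCH_REGS)

def rename_index(thread_bit: int, arch_reg: int) -> int:
    """Prepend the thread bit to the architectural register number,
    so both threads share one table without interfering."""
    return (thread_bit << 5) | arch_reg   # 5 bits index 32 architectural registers

def map_register(thread_bit: int, arch_reg: int, phys_reg: int) -> None:
    rename_table[rename_index(thread_bit, arch_reg)] = phys_reg
```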

to avoid a single thread monopolizing all the resources, so that the other thread is livelocked, different threads have different priorities. when a thread uses more than a certain threshold number of reorder buffer entries, issuing from that thread is stalled so that the other thread also gets a chance; otherwise it would not be fair. so we stop thread selection when the thread has exceeded this threshold. likewise, when a thread has exceeded a threshold number of level-two cache misses, there are so many misses in flight that we must stall this thread, because the hardware is not able to handle more misses, and the other thread should still be able to continue execution when it incurs a miss. finally, a thread's instructions in the instruction fetch queue are flushed when the thread executes a very long latency instruction, for example when it fails to acquire a lock, which is a synchronization primitive that I will discuss in the multi-core course.
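
Putting those fairness mechanisms together, here is a rough sketch of the kind of checks the thread-selection logic performs; the threshold values and function names are my own illustration, not the POWER5's actual parameters:

```python
ROB_ENTRY_THRESHOLD = 24   # hypothetical cap on reorder-buffer entries per thread
L2_MISS_THRESHOLD = 4      # hypothetical cap on outstanding L2 misses per thread

def should_stall_issue(rob_entries_used: int, l2_misses_in_flight: int) -> bool:
    """Stall issue from a thread that hogs the reorder buffer or has too many
    L2 misses in flight, so the other thread can keep making progress."""
    return (rob_entries_used > ROB_ENTRY_THRESHOLD or
            l2_misses_in_flight > L2_MISS_THRESHOLD)

def should_flush_fetch_queue(executing_long_latency_op: bool) -> bool:
    """Flush the thread's instruction fetch queue when it executes a very
    long-latency instruction, e.g. after failing to acquire a lock."""
    return executing_long_latency_op
```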

to summarize everything we learned about multithreading: I started with software multithreading, which is a very old form of multithreading where process context switches occur each time a process encounters a very long latency event, in particular every time a process does an I/O operation. the main focus of this module was hardware multithreading, and I talked about three forms of hardware multithreading. the first is block multithreading, also called coarse-grain multithreading, where we switch threads when the running thread incurs a long latency event, for example a level-one or level-two cache miss. the second is fine-grained multithreading, where instructions from a different thread are fetched and issued every cycle; for that reason I also call it cycle-by-cycle interleaving. the most advanced form of hardware multithreading is simultaneous multithreading, where we issue (though not necessarily fetch) from different threads in every cycle, to get rid of the vertical waste as well as the horizontal waste. finally, here's a paper that I recommend you read, "A Survey of Processors with Explicit Multithreading" by my colleague Theo Ungerer et al. it is a very nice summary paper, and I strongly recommend that you look at it. finally, there is a test-yourself and some acknowledgements. this concludes this lesson, this module, and this course. thank you for watching. in the next course we will look at advanced memory hierarchy design. hope to see you back.
