A VISUAL SIMULATION FRAMEWORK FOR SIMULTANEOUS MULTITHREADING ARCHITECTURES
Adrian Florea1, Alexandru Ratiu1, Arpad Gellert1 and Lucian N. Vinţan1,2
1 Computer Engineering Department, “Lucian Blaga” University of Sibiu, Emil Cioran Street, No. 4, 550025 Sibiu, Romania
2 Academy of Technical Sciences from Romania

E-mail:{adrian.florea, arpad.gellert, lucian.vintan}@ulbsibiu.ro, [email protected]

KEYWORDS

Simulation, Education, Computer Architecture, Simultaneous Multithreading, Benchmarking.

ABSTRACT

Computing systems, and microarchitectures in particular, are continuously expanding and have reached a complexity that the human mind can hardly manage. In order to understand and control this expansion, researchers need to design and implement larger and more complex system simulators. In the current paradigm, simulators play the key role in going further, by translating complex processing mechanisms into relevant and easy-to-understand information. This paper aims to give a suggestive description of the concepts and principles implemented in a Simultaneous Multithreading (SMT) architecture. We introduce the SMTAHSim framework, an educational tool that interactively simulates the important aspects of this particular microarchitecture. The graphical simulation and the results reporting techniques provide a lot of easy-to-understand information that outlines an expressive image of SMT processing mechanisms. The tool facilitates the understanding of theoretical questions, thus allowing students to feel more confident when studying SMT-related issues.

1. INTRODUCTION

The computer science (CS) domain is a very complex one, the result of one of the largest and fastest scientific developments known to mankind. During the last six decades this gradual evolution has engaged hundreds of bright minds from different fields (mathematics, physics, electronics, automation, and informatics), giving birth to a new science (CS), which has revolutionized people’s everyday lives. However, the main drivers of computer progress are microprocessors. The continuous expansion of microarchitectures has led to a complexity that is hard to control and understand, and that is explored with the help of larger and more sophisticated software simulators.

Also, in today’s world, there is an ever-increasing need for intelligent systems, especially in the educational domain. Unless we modernize our teaching tools in computer architecture, drawing on the latest research achievements as well as on industry practice, we risk losing contact with the development of computer engineering. Therefore, it is a stringent necessity to develop teaching resources (software simulators) related to the hard kernel of the fundamental disciplines in computer engineering, such as computer architecture, compilers, operating systems and computer networks. Developing effective learning tools targeting these disciplines is a continuous challenge.

In this paper we try to give a better understanding of SMT microprocessor architectures by developing a visual simulation framework. Given the level of complexity, we make the learning steps easier by means of expressive simulations which lead from the general picture of the system to a detailed one (top-down approach). But why do SMT architectures deserve interest? Current microarchitectures have three major limiters (the so-called “brick wall” concept):
• Memory wall: the increasing gap between processor clock cycle time and main memory access time;
• Instruction Level Parallelism (ILP) wall: generated by the present-day impossibility of issuing a continuously higher number of instructions in parallel;
• Power wall: aggravated by frequency scaling as the number of transistors on chip increases.

SMT architectures come as a solution to the first two limitations by combining superscalar instruction issue with the multithreading approach. Thus, instructions from multiple threads can be issued simultaneously in a single clock cycle. Latencies that occur in the execution of a single thread are bridged by issuing operations from the remaining threads. A further argument is that, although single-core SMT architectures have been on the market since 2002 (Intel Pentium 4 Northwood with Hyper-Threading) and in 2010 Intel released the Core™ i3, i5 and i7 with Hyper-Threading technology on each core (Intel 2010), in the authors’ opinion there are no efficient pedagogical tools dedicated to teaching SMT concepts more easily and intuitively through interactive animation.

Proceedings 25th European Conference on Modelling and Simulation ©ECMS Tadeusz Burczynski, Joanna Kolodziej, Aleksander Byrski, Marco Carvalho (Editors)
ISBN: 978-0-9564944-2-9 / ISBN: 978-0-9564944-3-6 (CD)
The fast development of computer science, and of computer architecture in particular, has meant that many software tools used not long ago in research are now enhanced with an interactive graphical interface and taught in Computer Architecture courses. The lack of simulators dedicated to simultaneous multithreading architectures for didactical purposes, even though such simulators are heavily used in research, is the starting point of this paper. To address this, we develop a compact hybrid simulator which integrates microprocessor instruction stream, branch prediction and cache memory simulation. From an educational standpoint, this work proposes a few new ideas:
• Hybrid simulation (trace- and execution-driven) of an SMT architecture using interactive animation.
• Real branch predictors dedicated to each simulated thread (branch prediction was only statistically generated in other similar simulators (Smullen and Taha 2006)). For example, we implemented gshare (a two-level adaptive branch predictor (Yeh and Patt 1992)) and two state-of-the-art dynamic predictors: FPBNP (a fast path-based neural branch predictor (Jiménez 2003)) and OGEHL (Optimized GEometric History Length branch predictor (Seznec 2005)). The last one ranked 2nd at the Championship Branch Prediction (CBP 2004) and received the best practice award for “the predictor the closest to a possible hardware implementation”. The branch predictors can also be used as a third-party lesson / application.
• A parameterized instruction cache shared between threads (both instruction and data caches were only statistically generated in other similar simulators (Smullen and Taha 2006)).

From a didactical point of view, the developed tool (SMTAHSim) benefits the learning process because it helps students observe the influence of each parameter on the simulation model. SMTAHSim provides a wide variety of configuration options, so students can determine how branch prediction accuracy or resource usage varies with the input parameters (number of entries in prediction tables, history length, number of bits for weight representation, etc.). The execution-driven simulation allows SMTAHSim to give fine-grained results for every microarchitectural unit, both during and at the end of a benchmark’s simulation. All final simulation results are stored in a database and can be used later to generate a wide range of reports on unit performance in correlation with almost every parameter. SMTAHSim offers three of the features specific to almost all high-performance academic standard simulators: free availability for use, extensibility and portability. Inheritance and polymorphism are used throughout the simulator’s source code, which makes it easier to extend with new functionality in the future.

We developed the SMTAHSim simulator using the Microsoft .NET Framework 3.5, writing over 7K lines of code. The simulator runs on Windows 2k/XP/Vista/7 and is currently used in undergraduate and graduate courses / laboratories in (Advanced) Computer Architecture at “Lucian Blaga” University of Sibiu. The simulator can be found at http://webspace.ulbsibiu.ro/adrian.florea/html/simulatoare/SMTAHSim.html

The rest of this paper is organized as follows. Section 2 reviews related work on software simulators dedicated to microarchitectures. Section 3 describes the theoretical background related to SMT, whereas Section 4 presents the benchmarks used and the simulation methodology. Section 5 illustrates the simulator software architecture, the simulator kernel from a hardware viewpoint and the SMTAHSim user interface; based on a short interactive animated example, we explain the SMT functionality. Finally, Section 6 suggests directions for future work and concludes the paper.

2. RELATED WORK

After almost four decades of work on microprocessor design, implementation and exploitation, researchers in computer science have reached the conclusion that simulators have become an integral part of the computer architecture research and design process (Yi and Lilja 2006) and that simulation technology and methodology represent the crux of computer architecture research and development (De Bosschere et al. 2007).

Besides their proven importance in computer architecture research, simulators have lately been extensively employed as a valuable pedagogical tool, as they enable students to better understand theoretical concepts and to visualize how microarchitectural components work and interact with each other (Yi and Lilja 2006).

In the microprocessor systems domain, as microarchitectural complexity increases (moving from instruction-level parallelism to thread-level parallelism and toward multi- and many-core architectures), it becomes more difficult to explain concepts like caches, out-of-order and speculative execution, power consumption, and the interactions among architectural components without visual aids. Graphical simulations of these architectures allow students to easily grasp the architectural concepts by observing the flow of instructions in time and by exploring the impact of different processor configurations on performance, dissipated energy and temperature. Static visual office tools (such as graphical charts, diagrams, slides, etc.) are limited in efficiency: they cannot simultaneously exhibit both the structural relationships between microarchitectural components and the temporal dependences between executed instructions that are in flight in the pipeline structures and cannot
explain the functionality of coherence mechanisms in multicore architectures, etc. Some of the most used present-day didactical simulators are:
• WinDLX was developed for the Windows operating system by Herbert Grünbacher (Grünbacher 1998) and simulates Hennessy and Patterson’s DLX (DeLuXe) architecture (Hennessy and Patterson 2007). The DLX is a didactic microprocessor designed in accordance with the most popular RISC microprocessors (SPARC, MIPS, etc.). The simulation expressively exposes the principle of in-order pipelined execution (execution steps, data hazards, forwarding) and the performance penalty incurred by high-latency instructions (delay slots) but, because it is modeled at the architecture level, little information is given about the processor.
• VLIW-DLX extends the WinDLX simulator to a VLIW model, using the same DLX ISA. It is implemented in Java and allows modifications of the architecture, including the ISA (Bečvář and Kahánek 2007).
• PCSpim-Cache is an execution-driven simulator intended for undergraduate courses teaching cache memories within the MIPS architecture. The tool allows selected code to be run step by step on a proposed cache organization while observing dynamic changes in its structure (Petit et al. 2006).
• PSATSim is a powerful graphical simulator which helps students better understand the tradeoff between processor performance and power consumption. The simulated microarchitecture is a configurable superscalar architecture with speculative out-of-order execution. The GUI makes it easy to simulate different microarchitectural configurations interactively and provides quick feedback (Smullen and Taha 2006).

However, unlike SMTAHSim, some of the existing simulators (Hostetler and Mirtich 1996; Burger and Austin 1997; Skadron et al. 2003; Sharkey et al. 2005; August et al. 2007) were designed primarily for research, with the emphasis on modeling the effects of architectural mechanisms. Most of these simulators do not try to visually express the behavior of architectural mechanisms and the interaction between them. They are often designed to model a specific architecture and are also too complex to be studied by students who are beginners in concepts such as SMT. On the other hand, most of the didactic simulators used in Computer Architecture simulate only some simplistic toy benchmarks. As will be presented further, our simulator can also process complex benchmarks that are intensively used in research activities. The interactivity of SMTAHSim allows the user both to inspect, in every machine cycle, the content of the CPU resources (reservation stations, functional units, reorder buffer, rename buffer, pipeline structure) and to experiment with unforeseen circumstances such as forcing a miss in the D-Cache (this cache module is modeled statistically based on benchmark characteristics).

3. THEORETICAL BACKGROUND

It is well known that superscalar architectures exploit Instruction Level Parallelism (ILP) by fetching and executing more than one independent instruction per cycle. Despite that, the instructions-per-cycle (IPC) rate is limited to relatively low values by many factors (Hennessy and Patterson 2007).

The SMT architecture comes as a solution to the above-mentioned limitation by combining the superscalar mechanism with the multithreading approach, which allows exploitation of both thread-level parallelism (TLP) and ILP. To achieve this, the processor keeps separate context information (program counter, stack pointer, etc.) for each active thread. Latencies which normally occur in single-thread execution are, in this case, (partially) hidden by switching to another thread. This architecture maps the explicit and implicit concurrency of high-level languages (threads and/or micro-threads) onto a processor that implements multiple contexts. At the hardware level, a thread can be a task or a software thread within a task, but it can also be made of software entities of smaller granularity, such as loops, routines or code blocks (micro-threads), which may be executed in parallel (Eggers et al. 1997; Vintan and Florea 2000).

SMT architectures inherit the superscalar processing mechanism and extend it with multithreading-specific components. Mechanisms such as out-of-order speculative execution, register renaming and in-order completion are also found in SMT architectures. To provide separate contexts, some hardware resources are private to each thread (branch predictors, renaming tables, logical register files, ROBs, Load/Store Queues, commit units) and others are shared among threads (fetch unit, decode unit, issue queue, physical register files, execution units and cache memory), with tag information in the instruction encoding distinguishing the threads.

To ensure high throughput, SMTs need a scheduling policy that arbitrates between threads to optimize the utilization of shared resources. The most common scheme is the very simple Round-Robin policy, which switches between threads in a circular way, regardless of their behavior. A better strategy is implemented in the ICOUNT policy, which gives higher priority to threads with the fewest instructions in the decode, rename and instruction queues. The motivation is to give higher priority to fast-moving threads and, at the same time, to prevent starvation. ICOUNT tries to balance the number of instructions in the pipeline among the various threads so that all threads have an approximately equal number of instructions in the front-end pipeline and instruction queues (Manadhata and Sekar 2003; Eyerman and Eeckhout 2009).

SMTAHSim supports both of these fetch policies and lets the user understand, by means of the simulation monitoring tool, how they influence the IPC rate and other parameters.
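As a minimal sketch of the Round-Robin and ICOUNT fetch policies, the Python fragment below picks the next thread to fetch from. The function names `round_robin` and `icount`, and the representation of each thread by a single count of instructions currently held in its decode, rename and instruction queues, are illustrative assumptions and are not taken from SMTAHSim.

```python
from typing import Sequence


def round_robin(last_thread: int, num_threads: int) -> int:
    """Round-Robin: pick the next thread in circular order, ignoring behaviour."""
    return (last_thread + 1) % num_threads


def icount(in_flight: Sequence[int]) -> int:
    """ICOUNT: pick the thread with the fewest instructions in the front end.

    in_flight[t] is the (assumed) number of instructions of thread t that sit
    in the decode, rename and instruction queues. Ties go to the lowest id.
    """
    return min(range(len(in_flight)), key=lambda t: in_flight[t])


# Hypothetical snapshot: thread 1 is stalled and has piled up instructions.
counts = [3, 12, 5, 7]
rr_choice = round_robin(last_thread=0, num_threads=4)  # picks thread 1 anyway
ic_choice = icount(counts)                             # picks thread 0
```

In this snapshot Round-Robin keeps feeding the stalled thread simply because it is next in line, whereas ICOUNT favours the thread that is draining its queues fastest.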
4. SIMULATION METHODOLOGY

The SMTAHSim tool is intended to help students learn superscalar and SMT architectures by simulating a wide range of hardware configurations in step-by-step or full trace simulation mode. In order to obtain fine-grained results, a hybrid simulation is performed. The results are collected at the end of each processing cycle by the Monitoring Tool and reported according to user preferences (see Figure 1).

SMTAHSim’s execution-driven simulation is supported by a GUI which interactively exposes the SMT architectural structure and execution-time information. The step-by-step simulation gives a better view of the instruction stream through the processing architecture and enables the user to visualize how basic superscalar and SMT mechanisms work.

For result validation, a set of benchmarks is used as simulator input, and the user chooses which file is used as input for each hardware thread. The benchmarks represent a selection from the SPEC ’95 (applu, compress, fpppp, ijpeg, perl (SPEC 1995)) and MediaBench 1.0 (epic, mpeg2d, mpeg2e, pegwitd, toast (Lee et al. 1997)) benchmark suites compiled for the SimpleScalar Portable ISA (PISA). These benchmarks cover a wide range of applications, from compression to word processing, from compilers and architectures to games enhanced with artificial intelligence, etc. We chose different benchmarks in order to discover how these different test programs influence processing performance.

5. THE SMTAHSim FRAMEWORK

The developed simulator must support students in learning SMT microarchitecture and in searching for possible changes (architectural or optimization techniques) to improve it. Because the model is highly parameterized for every microarchitectural instance, the performance obtained by simulation provides a quick feedback mechanism for the proposed changes, thus permitting an efficient design space exploration process. The simulator’s execution consists of the following sequential steps:
1) Initialization phase (configuring the microarchitecture with the input parameters, including the benchmarks)
2) Simulation and monitoring phase
3) Results reporting

For the initialization phase, SMTAHSim provides a quick and easy-to-use Configuration Manager. This internal tool lets users load preconfigured or saved configurations from the Configuration Repository or guides them through the configuration process. The last simulated configuration is loaded as default.

Some important architectural modules (suggestively called ISA, Branch Predictor, I-Cache, Fetch Policy) are implemented as interfaces and can be loaded by the Add-Ins Manager as precompiled libraries. The framework is easily extendable with independent modules which inherit the provided interface. The Add-Ins can also come with their own configuration and simulation GUIs.

Figure 1: SMTAHSim Architecture

The SMTAHSim framework provides two simulation modes: a step-by-step simulation or a full unanimated simulation. The user can easily switch between these two modes by interacting with the Simulation Control module. Depending on the running simulation mode, the Monitoring Tool filters the results stored in the Simulator Kernel’s Results Buffer. The simulation process is carried out by the Simulation Machine, which runs independently of the user interfacing tools. The Results Buffer is updated at the end of every processing cycle with relevant performance information and with a copy of the current context, which are later processed by the Monitoring Tool. This mechanism speeds up the simulation because the Simulation Machine is not interrupted by the graphical tools’ operations, only by buffer overflow. The producer-consumer design pattern is implemented: as the Simulation Machine produces data, the Monitoring Tool uses it to update the Presentation layer (GUI). When the buffer is full, the simulation is suspended until the data are consumed. All final results are stored in the Results Repository and can be used to generate fine-grained reports with the Results Reporting tool. The user can obtain relevant graphs of SMT performance indices in correlation with almost every architectural parameter.

5.1. The SMTAHSim Software Architecture

As Figure 1 shows, the framework is structured in four main software packages:
• GUI (Graphical User Interface) plays an important role as the highest level (Presentation Layer) of the framework, which manages all user interactions. This package is developed around two basic principles: ACTION and REACTION. All user actions get quick feedback from the system, and all these reactions are managed carefully by the GUI, which presents the results in an interactive and easily understandable manner. Overall, this package makes the framework a friendly and easy-to-use application.
• Input/Output package is the low-level management of all the simulation inputs and outputs, giving the framework its extensibility and accessibility dimensions. The aim of this approach is to let the user easily access the final results and architecture configurations and, eventually, develop his/her own configurations and extensions to the basic architecture. The framework comes with some basic configurations which allow a proper evaluation of the SMT architecture’s performance. For other configurations, a wizard guides the user step by step through the process of defining a new configuration. All new simulated configurations are stored in the Configuration Repository at the user’s decision. The simulation results of these configurations are also stored, at the user’s decision, in the Results Repository and linked to the simulated configuration. Thanks to this software architecture, results can be used to generate fine-grained reports regarding performance indices in correlation with almost every parameter, directly from the Results Repository. The Results Reporting tool supports users through this process and allows generating a large diversity of figures. The Add-Ins Repository plays a very important role because it stores all third-party modules added by developers. The management of this collection is carried out by the Add-Ins Manager.
• Application Kernel is the middle level (middleware) which manages all user communications with the application. A GUI is provided for each middle-level manager module in order to give the user access to the low-level packages. The simulation is initialized via the Configuration Manager and is run via the Simulation Control module (step by step or full trace simulation). The Monitoring Tool manages the feedback information and supplies the user with interactive animation by updating the GUI. Another important tool is the Add-Ins Manager, which is responsible for managing all third-party components added by developers. This module gives SMTAHSim its “framework” dimension by allowing developers to extend the basic SMT architecture with other modules (ISA, branch predictor, data cache, etc.). The Add-Ins can provide their own configuration panel, which is loaded by the Configuration Manager at the configuration phase, and their parameter set is then stored in the Configuration Repository together with the basic one. The developer only has to implement the interfaces provided by the Add-Ins Manager, compile them into a library and then load it into the SMTAHSim Add-Ins Repository.
• Simulation Machine is the most important package, situated at the low application level, which performs the actual simulation.

5.2. SMTAHSim Framework: Simulation Machine

SMTAHSim models a configurable SMT architecture (Figure 2) designed in accordance with the M-SIM architecture (Sharkey et al. 2005), which is based on a superscalar architecture with speculative and out-of-order execution. The pipeline structure of SMTAHSim is based on that of the PowerPC 5+ commercial processor (Sinharoy et al. 2005). M-SIM, in turn, extends the SimpleScalar toolset (Burger and Austin 1997) with accurate models of the pipeline structures, including explicit register renaming, and support for the concurrent execution of multiple threads. Basic superscalar units are shared among micro-threads (Cache, Fetch Unit, Decode Unit, Dispatch Queue, Execution Units, Physical Registers), but in order to provide separate contexts some resources are private to each micro-thread (Branch Predictors, Rename Tables, Reorder Buffers, Commit Units, Logic Registers).

Figure 2: Simulated architecture

Simulation involves getting instructions from the benchmarks and passing them step by step through the pipeline stages (Figure 3). There are three sections in the pipeline: the in-order front end (fetching instructions from memory, branch prediction, decoding, register renaming and dispatching), out-of-order execution (the number of execution cycles is distinct for each instruction type) and the in-order back end (which takes finished instructions and updates the branch predictor). All essential architectural parameters (superscalar factor, number of micro-threads, number of execution units and their execution cycles, etc.) are configurable through the Configuration Manager.

Figure 3: Simulated pipeline
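To make the three pipeline sections more concrete, here is a toy sketch in Python of out-of-order completion followed by in-order commit through a reorder buffer. The instruction mix, the latencies in `LATENCY` and the function name `simulate` are illustrative assumptions, not SMTAHSim’s actual model; the sketch also assumes that every instruction is independent and is issued in cycle 0.

```python
from collections import deque

# Assumed execution latencies (in cycles) per instruction type.
LATENCY = {"alu": 1, "load": 3, "mul": 4}


def simulate(program):
    """Return (completion_order, commit_order) as lists of instruction indices.

    Every instruction is assumed independent and issued in cycle 0, so it
    finishes executing at cycle LATENCY[type]. The reorder buffer (ROB)
    holds instructions in program order and retires only from its head.
    """
    finish_cycle = [LATENCY[kind] for kind in program]
    rob = deque(range(len(program)))  # program order
    completion_order, commit_order = [], []
    cycle = 0
    while rob:
        cycle += 1
        # Out-of-order execution: whatever finishes this cycle completes now.
        completion_order += [i for i, f in enumerate(finish_cycle) if f == cycle]
        # In-order back end: retire finished instructions from the ROB head only.
        while rob and finish_cycle[rob[0]] <= cycle:
            commit_order.append(rob.popleft())
    return completion_order, commit_order


completed, committed = simulate(["mul", "alu", "load", "alu"])
```

With this input the instructions complete in the order 1, 3, 2, 0 but commit in program order 0, 1, 2, 3: the long multiply at the head of the ROB holds back the younger instructions that finished earlier.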
Due to the benchmarks’ characteristics, the actual execution can’t be accurately simulated, because the register values are not known at all times. As a result of this limitation, the only feasible D-Cache implementation is based on an analytical model. Besides this, another degree of abstraction is that branch prediction is made in a single pipeline stage (Instruction Fetch) even if in reality it could take more cycles.

5.3. SMTAHSim Framework: GUI

Projects supported by the SMTAHSim simulator are dedicated to teaching students concepts related to superscalar and SMT architectures (processing mechanisms, constraints, limitation of the ILP rate, etc.), and are well supported by the GUI. Being the closest to the user, this level of the application has received most of our attention, in order to give easy and interactive access to all its features. Therefore, the user can easily configure, simulate and track the step-by-step results. To give a big picture of SMT architecture performance, the GUI also provides a reporting tool.

Figure 4: Configuration Manager Interface

The Configuration Manager Interface (Figure 4) makes it possible to configure the simulated architecture from a classic superscalar one to a 4-threaded SMT one. Each micro-thread input can be set independently. After configuring the architecture, the user can control the simulation through the Simulation Control Interface, either step by step, one simulated CPU cycle per step (“Next” button), or by simulating the input traces entirely (“Go To End” button). In both cases the IPC rate is updated in every CPU cycle (Figure 5).

Figure 5: Part of Simulation Control Interface

When fine-step simulation is chosen, the Monitoring Tool helps the user track the instruction flow from fetch to commit through animated visualization of each architectural unit. Each instruction has a thread identification number and a unique per-thread identifier, both distinctively colored, which makes it easy to follow the pipelined execution process (Figure 6). After the prediction of each branch instruction, the subsequent instructions are marked as speculative and shown struck through until the branch finishes executing and the prediction turns out to be correct. In the case of a mispredicted branch, after its execution, all speculative instructions from the corresponding thread are squashed and the correct fetch path is taken.

Figure 6: Monitoring Simulation

After each full trace simulation a summary of simulation results is shown.

Figure 7: Results

Figure 8: Average branch prediction accuracies (chart of Prediction Accuracy [%] for the Gshare, FPBNP and OGEHL predictors, with HardwareBudget = 8KB and HardwareBudget = 16KB)

As a concrete example, Figure 8 compares the simulation results obtained with SMTAHSim using three prediction structures: gshare (Yeh and Patt 1992), FPBNP (Jiménez 2003) and OGEHL (Seznec 2005). The statistics are collected after running the benchmarks described in Section 4 on two configurations (one of them imposed by the hardware constraints of the Championship Branch Prediction (CBP 2004)) and represent the average branch prediction accuracies.
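As a rough sketch of how an average prediction accuracy of the kind reported in Figure 8 can be measured, the Python fragment below implements a textbook gshare predictor (the global history is XORed with the branch address to index a table of 2-bit saturating counters) and replays a branch trace through it. The class name `Gshare`, the helper `accuracy`, the table size and the synthetic trace are all illustrative assumptions; this is not SMTAHSim’s code, and the value it produces is unrelated to the results in Figure 8.

```python
class Gshare:
    """Textbook gshare: PC XOR global history indexes 2-bit saturating counters."""

    def __init__(self, index_bits: int = 10):
        self.mask = (1 << index_bits) - 1
        self.table = [1] * (1 << index_bits)  # counters start at weakly not-taken
        self.history = 0                      # global branch history register

    def _index(self, pc: int) -> int:
        return (pc ^ self.history) & self.mask

    def predict(self, pc: int) -> bool:
        return self.table[self._index(pc)] >= 2  # 2 or 3 means predict taken

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)
        # Shift the real outcome into the global history.
        self.history = ((self.history << 1) | int(taken)) & self.mask


def accuracy(predictor: Gshare, trace) -> float:
    """Fraction of (pc, taken) pairs in the trace that are predicted correctly."""
    correct = 0
    for pc, taken in trace:
        correct += predictor.predict(pc) == taken
        predictor.update(pc, taken)
    return correct / len(trace)


# Synthetic trace (an assumption): one loop branch, taken 7 times then not taken.
trace = [(0x40, i % 8 != 7) for i in range(8000)]
acc = accuracy(Gshare(index_bits=10), trace)
```

Varying `index_bits` in such a sketch mirrors, in a very simplified way, the kind of experiment described above, where accuracy is observed as a function of the number of table entries and the history length.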
6. CONCLUSIONS AND FURTHER WORK

The classical approach to teaching SMT concepts is based largely on oral communication by professors. They spend a lot of time in computer architecture research or use paper and pencil to follow the execution of the instruction flow. Although their efforts aim to emphasize the processor kernel activities, they often ignore branch prediction and cache memory simulation. Our approach answers a formative necessity, since computer architectures are mainly approached in a descriptive manner. Through our approach, students have the opportunity to be creative and innovative in computer architecture or in other research and didactical domains of computer science, even in countries that are less developed from economic and technological points of view. Using highly parameterized simulation tools, students can understand in more depth, and in an integrated way, the theoretical concepts related to SMT, branch prediction constraints, limits of instruction-level parallelism, TLP benefits, cache memories, etc.

Although SMT architectures outperform their predecessors, the evolution trend continues vertically, through growing technological complexity. Therefore, a more aggressive approach (many micro-threads) is heavily limited by the growing complexity of the management logic. It is clear that a new, horizontal evolution trend is needed, through decentralization of processing power (multi-core). In further work we are mainly concerned with the following issues:
• Simulating on benchmark sets which allow a real implementation of the data cache.
• Implementing a module for power consumption calculation; this can help to evaluate SMT architectures against this objective, too. It is well known that SMTs are energy-intensive due to their complex and concentrated control logic. This module is also necessary for evaluating a hardware branch predictor within a given chip area budget, from both power consumption and performance points of view.
• Adding modules to improve the processing rate, such as value prediction, dynamic instruction reuse and an execution trace cache.

REFERENCES

De Bosschere K. et al., 2007, “High-Performance Embedded Architecture and Compilation Roadmap”, Transactions on HiPEAC I, Lecture Notes in Computer Science 4050, Springer-Verlag, pp 5-29.
Eggers S., Emer J., Levy H., Lo J., Stamm R., Tullsen D., 1997, “Simultaneous Multithreading: A Platform for Next-Generation Processors”, IEEE Micro, Vol 17, Issue 5, 12-19.
Eyerman S., Eeckhout L., March 2009, “Memory-Level Parallelism Aware Fetch Policies for Simultaneous Multithreading Processors”, ACM Transactions on Architecture and Code Optimization, Vol. 6, No. 1.
Grünbacher H., 1998, “Teaching Computer Architecture / Organisation using simulators”, Proceedings of the 28th Frontiers in Education, IEEE Computer Society, Vol. 03.
Hennessy J., Patterson D., 2007, “Computer Architecture: A Quantitative Approach”, Morgan Kaufmann, 4th Edition.
Hostetler L.B., Mirtich B., 1996, “DLXsim - A Simulator for DLX”.
©Intel Corporation, 2010, http://www.intel.com/
Jiménez D., 2003, “Fast Path-Based Neural Branch Prediction”, Proceedings of the 36th International Symposium on Microarchitecture.
Lee C., Potkonjak M. and Mangione-Smith W., 1997, “MediaBench: A Tool for Evaluating and Synthesizing Multimedia and Communications Systems”.
Manadhata P., Sekar V., 2003, “Evaluating Throughput and Fairness of Thread Fetch Policies for SMT Processors”, www.cs.cmu.edu/~vyass/Fall03/15740/class-project/project_report.pdf
Petit S., Tomás N., Sahuquillo J., Pont A., 2006, “An Execution-Driven Simulation Tool for Teaching Cache Memories in Introductory Computer Organization Courses”, Proceedings of the 2006 Workshop on Computer Architecture Education.
Seznec A., 2005, “Analysis of the OGEHL predictor”, Proceedings of the 32nd International Symposium on Computer Architecture (IEEE-ACM), Madison.
Sharkey J., Ponomarev D., Ghose K., 2005, “M-SIM: A Flexible, Multithreaded Architectural Simulation Environment”, Technical Report CSTR-05-DP01, Department of Computer Science, State University of New York at Binghamton.
Sinharoy B., Kalla R.N., Tendler J.M., Eickemeyer R.J. and Joyner J.B., 2005, “POWER5 System Microarchitecture”, IBM Journal of Research and Development, Vol. 49, Num. 4/5, pp. 505-521.
Skadron K., Stan M.R., Huang W., Velusamy S., Sankaranarayanan K., Tarjan D., 2003, “Temperature-Aware Microarchitecture”, Proceedings of the 30th International Symposium on Computer Architecture.
August D., Chang J., Girbal S., Gracia Perez D., Mouchard G., Smullen W., Taha T., 2006, ”PSATSim: An Interactive
Penry D., Temam O., Vachharajani N., 2007 “UNISIM: Graphical Superscalar Architecture Simulator for Power
An Open Simulation Environment and Library for and Performance Analysis”, Proceedings of the 2006
Complex Architecture Design and Collaborative Workshop on Computer Architecture Education.
Development”, IEEE Computer Architecture Letters, 20. SPEC 1995, The SPEC benchmark programs,
Bečvář M., Kahánek S., 2007, “VLIW-DLX Simulator for http://www.spec.org/cpu95/
Educational Purposes”, Proceedings of the 2007 Vintan L., Florea A., 2000, ”Microarhitecturi de procesare a
Workshop on Computer Architecture Education. informaţiei” (in Romanian), Editura Tehnică, Bucureşti.
Burger D., Austin T., June 1997, “The SimpleScalar Tool Set, Yeh T., Patt Y., 1992, “Alternative Implementations of Two-
Version 2.0”, University of Wisconsin Madison, USA, Level Adaptive Branch Prediction”. Proceedings of the
CSD TR #1342. 19th International Symposium on Computer Architecture.
CBP: The 1st Journal of Instruction Level Parallelism Yi J.J., Lilja D.J., 2006, “Simulation of Computer
Championship Branch Prediction Competition (CBP-1), Architectures: Simulators, Benchmarks, Methodologies,
Oregon, USA, 2004. and Recommendations”, IEEE Transactions on
Computers, vol. 55, No. 3.

Proceedings 25th European Conference on Modelling and
