
CMPT 305: Computer Simulation/Modelling

Spring 2022

Course Introduction
Modelling Pitfalls & Performance Metrics
Alaa Alameldeen (alaa@sfu.ca)
© Copyright 2022 Alaa Alameldeen and Joseph Peters
When and Where?
• When:
➢ Tuesdays 10:30 AM-12:20 PM (10-minute break ~11:20, or finish at 12:10)
➢ Fridays 10:30-11:20 AM
• Where: Tuesday C9002, Friday EDB7618, Also online (zoom)
• Office hours: Wednesdays 10-11AM (online on zoom)
• TAs: Mona Shahsavari and Taher Ahmadi
➢ Office hours: Check webpage
• Webpage: https://canvas.sfu.ca/courses/67328/pages/home-page
• Webpage updated frequently:
➢ Class slides
➢ Online lecture zoom links
➢ Lecture recordings
➢ Homework and project assignments
➢ Changes in class schedule
2
Course Logistics
• Delivery:
➢ Live lectures in person (C9002, EDB7618), will be recorded and posted online
❑ Slides will be posted online, but could change (don’t print before class)
➢ We’ll also try synchronous online meetings at the same lecture time via zoom (links on webpage)
➢ Office hours on zoom (links on webpage)
➢ Exams will be online. You have to join via zoom with camera turned on for proctoring
• Communications:
➢ Feel free to ask questions during class
❑ Lectures are recorded. See privacy notice on course webpage
➢ Announcements sent on class mailing list ([email protected])
➢ Participate in office hours (First-come, first-served, not recorded)
➢ Discussions on Piazza
❑ Ask course-related questions there since many people might have similar questions
❑ Other students can answer. Instructor and TAs will monitor discussions and answer questions.
➢ Use email only for urgent questions or concerns. Start email subject line with [CMPT305-Spring2022]
❑ Email me at [email protected] and/or TAs ([email protected], [email protected])

3
About the Course
• Computer Simulation and Modelling
➢ Introduction to the modelling and computer simulation of complex systems
➢ Focus on the modelling and performance evaluation of computer systems and networks

• Topics (tentative):
• Basic concepts of simulation, modelling, and performance evaluation
• Analytic Modelling: Queueing theory, fundamental laws, single/multiple server queues
• Discrete Event Simulation: Event scheduling, random number and random variate
generation
• Simulation model and output analysis
• Experimental Design: Factorial designs, linear regression
• Queueing Network Models
• Computer System Simulation: System components, performance metrics, simulator design

4
Safety Rules
• I personally prefer in-person lectures, but I understand students may have
different levels of comfort with in-person classes.
• You can attend the class completely online (including exams) if you’re not
comfortable coming in-person to the lecture room.
• Attend in person when possible.
• Don’t come to class if:
➢ You’re sick or have any symptoms (not just COVID symptoms)
➢ If you’ve tested positive for COVID
➢ If you were in contact with anyone who has been sick and/or tested positive for COVID
➢ If you have been traveling by airplane recently
• Wear a mask in class. Mask needs to cover your nose and mouth
• Keep a safe distance from other students
• If the instructor is sick or exposed to COVID, lectures will move temporarily online (zoom). Check email before coming to class.
5
Course Goals
• Subject: Theory and practice of model design, analysis, and
simulation of complex systems (including computer systems)
• Goal: Learn practical techniques that can be used to analyze
many real-world problems
• Learning Outcomes:
➢Evaluation techniques, performance metrics, workload analysis
➢Choosing correct performance metrics and proper statistical techniques
➢Use analytic (queueing) models to analyze performance
➢Design appropriate experiments that provide needed information
➢Design simulators for complex systems
6
Expected Background
• Data Structures
➢CMPT 225 OR equivalent
• Discrete Mathematics
➢MACM 101 OR (ENSC 251 AND ENSC 252) OR equivalent
• Probability and Statistics
➢STAT 270 OR equivalent
• Programming Experience in C
• Preferred: Computer Systems
➢CMPT 295 OR equivalent

7
Grading
• Grade Breakdown (tentative)
➢Exams: 36%
❑ Four exams on Fridays every third week, starting Jan 28. No midterm or final exams
❑ Grade assigned based on best 3 out of 4 scores (lowest score discarded)
➢Homework assignments: 34%
❑ Mostly programming assignments
❑ Done individually (no collaboration)
➢Two projects: 15% each
❑ Group projects
❑ 2-3 members per group
❑ Preferably same group members for both projects
❑ Use course discussion boards on Piazza to form groups
➢Class participation encouraged (lectures or online discussions)
❑Potential bonus points for top participants

8
Grading (Cont.)
• Grade Scale
➢A+: 95% and higher
➢A: 90-94.9%
➢A-: 85-89.9%
➢B+: 80-84.9%
➢B: 75-79.9%
➢B-: 70-74.9%
➢C+: 65-69.9%
➢C: 60-64.9%
➢C-: 55-59.9%
➢D: 50-54.9%
➢F: Below 50%
9
Important Dates
• Exam dates:
➢28-Jan, 18-Feb, 18-Mar, 08-Apr
➢Need to attend the class zoom meeting and turn on camera.
• Homework due dates (tentative)
➢20-Jan, 31-Jan, 18-Feb, 04-Mar
➢Please start early on homework assignments
• Project due dates (tentative)
➢25-Mar, 11-Apr
• No class on:
➢22-Feb, 26-Feb
• Last lecture: 08-Apr
• Check course webpage for any schedule changes
10
Textbook and Readings
• Reading material (if needed) will be linked from the course webpage
• Textbook:
➢ Quantitative System Performance: Computer System Analysis Using Queueing Network
Models, Edward D. Lazowska, John Zahorjan, G. Scott Graham, Kenneth C. Sevcik,
Prentice Hall, 1984, Available online at https://homes.cs.washington.edu/~lazowska/qsp/
➢ We’ll only use a few chapters
➢ Old textbook so examples are outdated.
• Other References
➢ The Art of Computer Systems Performance Analysis: Techniques for Experimental Design,
Measurement, Simulation, and Modeling, R. Jain, Wiley, 1991
➢ Discrete-Event Simulation: A First Course, L.H. Leemis and S.K. Park, Prentice Hall, 2005
➢ Experimental Design and Analysis, Howard J. Seltman, 2018, Available online at
https://www.stat.cmu.edu/~hseltman/309/Book/Book.pdf

11
Academic Integrity
• Do not cheat!
➢ No sharing of code or solutions
➢ Penalties may include getting 0 points on the assignment and/or more severe consequences (suspension or expulsion)
• Do not post your code on a public code repository
➢ Use GitHub Education Pack to get a private repository
➢ Use Bitbucket's private repo feature, or set up a private repo on SFU CSIL GitLab
❑ https://csil-git1.cs.surrey.sfu.ca
❑ Guide: https://coursys.sfu.ca/2018su-cmpt-470-e1/pages/GitLab
➢ Don’t post to public repository even after course is over
• OK to post project code on public repository if all group members agree.
• Homework assignments must be your own work.
➢ Be sure to provide proper citations
➢ Discussion ok on course discussion boards, but no sharing of solutions
➢ Only project group members can collaborate on a project.
• See SFU policies: https://www.sfu.ca/policies/gazette/student.html
12
Introduction to Simulation and
Modelling

13
Why Study Modelling?
• What is a model?
➢Abstraction based on assumptions, capturing essential characteristics of a
system
❑System: Computer system, computer application, or other physical system

• How to use a model?


➢Existing system:
❑Estimate performance (or other relevant metrics)
❑Compare two systems
❑Predict performance as a function of inputs and system conditions
➢Designing a system:
❑Predict impact of system optimizations
❑Evaluate alternative design strategies

14
Which Restaurant…?

Photos: Kolk, Melinda, eat_here.jpg, July 2011, Pics4Learning.com; Sherman, Gay, gs130.jpg, 7/20/1997, Pics4Learning.com

• Need to eat quickly. Which restaurant should I go to?


• I have one hour before my next meeting. I like the food in the restaurant on the right more, but can I still make the meeting if I go there?
15
Which Sorting Algorithm…?
• Three algorithms with various complexities
➢1: O(N²)
➢2: O(N log N)
➢3: Just need the minimum: O(N)
• Choice is input-dependent
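
For instance, if only the minimum is needed, a single linear pass beats sorting first; a minimal C sketch (with hypothetical data, not course code):

/* Minimal sketch: if only the minimum is needed, one O(N) pass
 * suffices; sorting first (O(N log N) or O(N^2)) wastes work. */
#include <stdio.h>

static int find_min(const int *a, int n) {
    int min = a[0];
    for (int i = 1; i < n; i++)   /* one pass, N-1 comparisons */
        if (a[i] < min)
            min = a[i];
    return min;
}

int main(void) {
    int data[] = {42, 7, 19, 3, 88, 21};
    printf("min = %d\n", find_min(data, 6));
    return 0;
}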

16
Other Examples to Model
• What is the average time a customer will be waiting in line at a grocery store?
• How many checkout clerks does a store need so customers stay in line less than
5 minutes during peak hours?
• What is the maximum time I need to drive from home to school?
• How many CPUs does the New York Stock Exchange need to process all stock
transactions during peak hours with a maximum of 50 microseconds per
transaction?
• What is the maximum wait time for a sales transaction at Amazon?
• When designing a CPU system that would run a deep learning inference model,
should we choose a 20-CPU 100MB cache system OR a 100-CPU 20MB cache
system?
• To support a minimum 100MB/sec download speed per customer 90% of the time,
how much bandwidth should an internet service provider allocate to a 1000-
house neighborhood?
17
Why Simulation?
• What is a simulator?
➢ A program that attempts to reproduce the behavior (e.g., functionality, timing) of a system.
• Why use a simulator?
➢ Speed: Faster to write simulator code than build real system
➢ Cost:
❑ Cheaper to write simulator than build real system (e.g., modern CPUs)
❑ Cheaper to train on simulator vs. real system (e.g., airplane pilots)
❑ Avoid disruption of service of real system
➢ Flexibility: Explore larger design space
➢ Intuition: Analyzing a simulated system enables gaining insight about what works and why
➢ Validation: Enables validating a system design before building real system
➢ Performance optimization: When designing a system, we can explore impact of different system
modifications to choose the best alternative for the system.
➢ Feasibility: Building a real system might currently be infeasible due to technology being
unavailable or not ready. Building a simulator can help us decide whether to invest in a
technology
18
Performance Studies
• Why conduct a performance study?
➢Predict system performance for certain inputs and system conditions
➢Evaluate different design options
• How can we conduct a performance study?
➢Input specification and characterization (Workload characterization)
➢Depending on stage of system development:
❑ Planning or early development: Build high-level model of a system to gain intuition or
understanding of how the system works. Also used to evaluate alternatives
❑ Late development: Write simulation code to evaluate design optimizations
❑ Working system: Measure performance directly
• Relevant performance metrics?
➢Mean/maximum response time, mean/maximum throughput, mean/maximum
bandwidth

19
Modelling vs. Simulation vs. Measurement
• When a real system is available, we can directly measure
performance or other metrics
➢Accuracy is much higher than modelling or simulation

(each criterion ordered from lower to higher)
Speed:       Simulation  <  Measurement  <  Modelling
Accuracy:    Modelling   <  Simulation   <  Measurement
Flexibility: Measurement <  Modelling    <  Simulation
Cost:        Modelling   <  Simulation   <  Measurement

20
Modelling Basics
It's a good idea to spend more time modelling the parameter that affects the results the most

21
Modelling Options
• Analytic Modelling
➢Mathematical methods used to obtain analytic solutions
➢Numerical results are computed using the analytic solutions
➢Gives rough estimates
➢Can be difficult or impossible to solve for complex systems
• Simulation Models
➢Model structure of system with simulation program; also model input
➢Run simulation and analyze data
➢Can study different levels of detail
➢Takes longer to develop and run
• Stochastic Models
➢Contain some random input components characterized by probability distributions
➢Output is also random - probability distribution of performance measures
22
Queueing Models
• Popular Analytic Modelling Method
• Single-server queue: models one system component
➢Examples: Single CPU, single disk, single communication channel, one teller at a
bank, one checkout counter at grocery store, one check-in counter at airport
• Multi-server queues: models multiple copies of the same resource
➢Examples: Multi-core CPU system; multiple tellers at a bank; several checkout
counters at a grocery store
• Queueing Network: A network of queueing elements
➢Models several system components and their interactions
➢Example: Computer System with multiple CPUs, memory chips, disks

23
Example: Single-Server Queue – Infinite Population
• Customers arrive, receive
service, leave system
• Input parameters:
➢Inter-arrival times (distribution)
➢Service times (distribution)
• Performance measures:
➢Mean response time
➢Mean number of customers in system
➢System utilization
➢…and other metrics to be discussed later
24
Single-Server Queue – Infinite Population
• Estimate number of customers in system over time
• Estimate maximum number of customers in system
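
These quantities can be estimated by simulation. Below is a minimal C sketch of a single-server FIFO queue, assuming exponentially distributed inter-arrival and service times (an M/M/1 queue); the rates, seed, and customer count are illustrative choices, not course-provided code:

/* Minimal sketch of a single-server FIFO queue simulation, assuming
 * exponentially distributed inter-arrival and service times. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define NUM_CUSTOMERS 100000
#define ARRIVAL_RATE  0.8   /* lambda: arrivals per unit time (assumed) */
#define SERVICE_RATE  1.0   /* mu: services per unit time (assumed) */

/* Exponential random variate via the inverse transform method */
static double exp_rand(double rate) {
    double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0); /* u in (0,1) */
    return -log(u) / rate;
}

int main(void) {
    double arrival = 0.0, depart = 0.0;
    double total_response = 0.0, total_service = 0.0;

    srand(12345);
    for (int i = 0; i < NUM_CUSTOMERS; i++) {
        arrival += exp_rand(ARRIVAL_RATE);      /* next arrival time */
        /* FIFO: service begins when the server frees up or the
         * customer arrives, whichever is later */
        double begin = arrival > depart ? arrival : depart;
        double service = exp_rand(SERVICE_RATE);
        depart = begin + service;               /* this customer's departure */
        total_response += depart - arrival;     /* waiting + service time */
        total_service += service;
    }
    double mean_response = total_response / NUM_CUSTOMERS;
    double throughput = NUM_CUSTOMERS / depart;
    printf("Mean response time:    %.3f\n", mean_response);
    printf("Server utilization:    %.3f\n", total_service / depart);
    /* Little's law: mean number in system = throughput * mean response */
    printf("Mean number in system: %.3f\n", throughput * mean_response);
    return 0;
}

For these illustrative rates, the analytic M/M/1 mean response time is 1/(μ − λ) = 5, so the simulated estimate should come out close; the maximum number in system could additionally be tracked by comparing arrival and departure times.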

25
Example: Single-Server Queue – Finite Population
• Closed system with a fixed number
(N) of circulating customers
• Input parameters:
➢Think times (distribution)
➢Service times (distribution)
• Performance measures:
➢Mean response time
➢System utilization
➢Average number of customers in queue
➢…and other metrics
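
A similar minimal sketch for this closed (finite-population) queue, assuming exponential think and service times; N, the rates, and the seed are illustrative assumptions:

/* Minimal sketch of a closed single-server queue with N circulating
 * customers: think, join queue, receive service, think again. */
#include <stdio.h>
#include <stdlib.h>
#include <math.h>

#define N            10       /* circulating customers (assumed) */
#define COMPLETIONS  100000   /* service completions to simulate */
#define THINK_RATE   0.1      /* 1 / mean think time (assumed) */
#define SERVICE_RATE 1.0      /* 1 / mean service time (assumed) */

static double exp_rand(double rate) {
    double u = (rand() + 1.0) / ((double)RAND_MAX + 2.0);
    return -log(u) / rate;
}

int main(void) {
    double ready[N];   /* time each customer next joins the queue */
    double server_free = 0.0, busy = 0.0, total_response = 0.0;

    srand(54321);
    for (int i = 0; i < N; i++)
        ready[i] = exp_rand(THINK_RATE);   /* initial think times */

    for (long c = 0; c < COMPLETIONS; c++) {
        int next = 0;  /* FIFO: serve whoever joined the queue earliest */
        for (int i = 1; i < N; i++)
            if (ready[i] < ready[next]) next = i;

        double begin = ready[next] > server_free ? ready[next] : server_free;
        double service = exp_rand(SERVICE_RATE);
        double depart = begin + service;

        total_response += depart - ready[next];      /* queueing + service */
        busy += service;
        server_free = depart;
        ready[next] = depart + exp_rand(THINK_RATE); /* think, then return */
    }
    printf("Mean response time: %.3f\n", total_response / COMPLETIONS);
    printf("Utilization:        %.3f\n", busy / server_free);
    return 0;
}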

26
Queueing Network Models
• Models with multiple queues and multiple servers.
• Open System
➢Infinite population model: Customers arrive, receive service from one or
more servers then leave system
• Closed System
➢System with fixed number (N) of total customers.
➢Customers receive service from one or more servers then wait (think time)
before receiving more service
• Mixed System
➢Some customers circulate; others enter then leave
27
Example: Internet Content Provider
[Diagram: closed queueing network with N circulating customers (subscribers); time with no content requested (think time); a "check user credentials" queue; multiple content servers]
28
Example: Grocery Store with Membership

[Diagram: arriving customers pass through a "check membership" queue, then shopping time, then one of several checkout counters before departing]

29
Modelling Pitfalls
• Failure to understand system before modelling/analysis
➢ Need to understand inputs, workloads, system conditions
➢ Examples:
❑ Arbitrary parameters, system conditions, metrics and workloads
❑ Unrepresentative workloads
❑ Choosing an incorrect evaluation technique
❑ Ignoring important parameters and/or significant system conditions

• Choosing incorrect performance metrics (details later)


• Biased model
➢ Model built with the goal of showing a certain outcome
(e.g., Ratio Games)
➢ Model that assumes a certain result, and then looks for supporting evidence
➢ Model that shows more or less uncertainty than a real system

30
Biased Model Example
• An online retailer is considering the purchase of a server system (either A or B) to process online sales transactions. System A outperforms B for large transactions but does worse on small transactions. The retailer has two workloads with different mixes of large and small transactions, and their throughputs are shown in the table below. Ratio games can show either system outperforming the other.
Transactions/sec for online retailer:
             Workload 1   Workload 2   Average
System A        500,000    1,000,000   750,000
System B      1,000,000      500,000   750,000

Normalized to System A:
             Workload 1   Workload 2   Average
System A         1            1          1
System B         2           0.5         1.25

Normalized to System B:
             Workload 1   Workload 2   Average
System A        0.5           2          1.25
System B         1            1          1

For this particular example to be unbiased, we should use the geometric mean.
31
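
A small C sketch of the ratio game above, using the table's throughput numbers: the arithmetic mean of normalized ratios favors whichever system is not the normalization base, while the geometric mean gives the same answer under either base:

/* Sketch of the ratio game: arithmetic vs. geometric mean of
 * normalized throughput ratios from the table above. */
#include <stdio.h>
#include <math.h>

int main(void) {
    double a[2] = {500000.0, 1000000.0};  /* System A: workloads 1, 2 */
    double b[2] = {1000000.0, 500000.0};  /* System B: workloads 1, 2 */

    /* Arithmetic means of normalized ratios: each base makes the
     * other system look 25% better */
    double am_b_vs_a = (b[0] / a[0] + b[1] / a[1]) / 2.0;   /* 1.25 */
    double am_a_vs_b = (a[0] / b[0] + a[1] / b[1]) / 2.0;   /* 1.25 */

    /* Geometric mean of the same ratios is base-independent */
    double gm_b_vs_a = sqrt((b[0] / a[0]) * (b[1] / a[1])); /* 1.00 */

    printf("Arithmetic mean of B/A ratios: %.2f\n", am_b_vs_a);
    printf("Arithmetic mean of A/B ratios: %.2f\n", am_a_vs_b);
    printf("Geometric mean of B/A ratios:  %.2f\n", gm_b_vs_a);
    return 0;
}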
Experimental Design Pitfalls
• When designing an experiment to model/measure performance,
several pitfalls:
➢Naïve design where different parameters are varied separately (assumes parameters are independent)
➢Incorrect level of detail
❑Using high-level model for mostly similar systems
❑Using detailed model/simulation for very different systems
➢Insufficient data analysis (generating data without enough analysis/insight)
➢Incorrect analysis
❑Ratio games to show incorrect outcome
❑Insufficient length of simulation (too short, simulating unrepresentative intervals)
➢Lack of sensitivity analysis
❑Sensitivity analysis means figuring out which parameters are important in a model and which has the highest impact on the results
❑It's a good idea to spend more time modelling the parameter that affects the results the most (example: cache size vs. memory size)
32
Performance Evaluation Pitfalls
• When using measurements/modelling to evaluate performance,
several issues to consider:
➢Input error: Confidence in results should be based on confidence that the inputs are correct
➢Variability
❑Mean results are misleading if variance is high
❑Confidence intervals more useful
➢Reusing model in a different context
❑Model had certain assumptions that might not be applicable in new context
❑Context could be completely different and requires different inputs or performance metrics
➢Predicting future performance based on past or current measurements/model
results
❑Past can help predict future
❑But future inputs/workloads/system conditions can change

33
Systematic Approach to Performance Analysis
(Jain Chapter 2)

1. State goals and define the system (including identifying system boundaries)
   Example: goal — # of checkouts required; system — store capacity, amount of merchandise, location of store, etc.
2. Identify services and outcomes
   Example: the checkout process; average time spent at checkout
3. Select appropriate performance metrics
   Example: queuing latency (time spent in queue) and service time (time spent at checkout clerk)
4. Identify parameters: System (HW/SW parameters), workload parameters
   Example: time of day, store location, amount of merchandise bought by customers
5. Select factors (i.e., parameters that change) to study. Identify values (range or levels) of these factors
   Example: values of the factors listed in the previous point
6. Select appropriate evaluation technique: Analytical modelling, simulation, measurement
7. Select workload(s) depending on system and evaluation technique
   Example: number of customers at different times of day
8. Design experiments (experimental design to be covered later in the course)
9. Analyze and interpret data
   Example: cost of hiring # of checkout clerks vs. customer satisfaction, ...
10. Present results

34
How to Select an Evaluation Technique?
1. Stage in development process
➢ Analytical modelling or simulation if no prototype
➢ Better if based on previous data (e.g., trace-based simulation)
2. Time required
➢ Analytic modelling is fast, simulations are slow
3. Available tools
➢ Modelling skills, simulation languages, measurement instruments
4. Accuracy
➢ Analytical modelling requires more assumptions and simplifications than simulation.
➢ Measurement affected most by environment
➢ Note: accuracy is not the same as correctness (if the model represents an incorrect system, we don't care how accurate its results are)
5. Insight (insights we want to gain)
➢ Analytical modelling provides most insight, measurement least (analytic models involve fewer parameters, so their effects are easier to understand)
6. Cost
7. Help make an argument
➢ Measurements are easiest to sell, analytical results hardest (with measurements, we don't have to justify the parameters we chose and why we chose them)
➢ Use a second technique to validate the results of the first
➢ Use the first technique to select parameters for the second

35
How to Select Performance Metrics?
• Pick metrics that are relevant to customers and system designers
➢Mean (average) may not be sufficient if variance is high. Variability in response time
is important for customers/users
➢Useful to have metrics with low variability to reduce the need for repeated
measurements or simulations
➢Use non-redundant metrics: avoid multiple metrics that measure the same thing
➢Metrics should account for all possible outcomes
• Global Metrics:
➢Reliability, availability
• Global and Individual server metrics
➢Response time, throughput, utilization, service time, queuing delay
• When designing a system, we need to consider the trade-offs
between optimizing global vs. individual metrics
36
Common Performance Metrics

37
Response Time
• Most popular definition: Time between when customer starts request and when system completes
response
➢ Another definition: Time between when customer starts request and when system starts response (includes queuing delay
but not service time)
➢ For open networks, also known as residence time.
• Response time components:
➢ Waiting time: Also called queuing delay.
❑ Refers to time that a customer spends waiting in queue
➢ Service time
❑ Time when a customer is receiving service from the server

[Diagram: arriving customers wait in the queue (waiting time), then receive service at the server (service time), then depart. Timeline: customer arrives → customer starts service → customer completes service]
38
Throughput
• Throughput: Rate that requests are serviced
➢ Depends on load (demand)
➢ Initially increases with demand
• Nominal Capacity: Maximum achievable throughput
• Usable Capacity: Maximum achievable throughput without exceeding response
time limits (or maximum practical throughput)
• Efficiency: Ratio of Usable Capacity to Nominal Capacity
[Graph: throughput vs. demand (load). Throughput initially rises with demand and levels off past the knee. Marked levels: nominal capacity, usable capacity, knee capacity]
39
Example: Multiprocessor Efficiency
• The following graph shows the throughput (in transactions per second) for a
multiprocessor system running an online transaction processing workload:
[Graph: transactions/sec vs. #CPUs, with nominal capacity (ideal linear scaling) and usable capacity. Efficiency at increasing CPU counts: 1.00, 0.90, 0.87, 0.85, 0.78, 0.72, 0.66, 0.60]
40
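
A minimal sketch of this efficiency calculation, assuming efficiency is measured throughput divided by ideal linear scaling (#CPUs × single-CPU throughput); the throughput numbers below are hypothetical, chosen only to reproduce the first few efficiency values above:

/* Sketch: multiprocessor efficiency relative to ideal linear scaling. */
#include <stdio.h>

int main(void) {
    /* hypothetical measured transactions/sec at 1, 2, 4, 8 CPUs */
    double tput[] = {1000.0, 1800.0, 3480.0, 6800.0};
    int cpus[]    = {1, 2, 4, 8};
    int n = sizeof(cpus) / sizeof(cpus[0]);

    for (int i = 0; i < n; i++) {
        double nominal = cpus[i] * tput[0];  /* ideal linear scaling */
        printf("%d CPUs: throughput %.0f, efficiency %.2f\n",
               cpus[i], tput[i], tput[i] / nominal);
    }
    return 0;
}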
Other Individual (Server/Queue) Metrics
• Server Utilization: Ratio of busy time to total time
➢Objectives depend on context:
1. High utilization means efficient use of resources
2. But high utilization also means response time and queuing delay will increase significantly
• Waiting time
➢Time spent in queue
• Queue Length: Number of customers in queue
➢Objective depends on context:
1. Long queue length relates to more customers
2. But long queue length also relates to more delay
• Service time
➢Time spent being served at the server
• Important to know for each metric whether higher/lower is better
41
Other System Metrics
• Utilization: Ratio of busy time to total time
• Reliability metrics
➢ Mean time between errors
➢ Error rate, i.e., probability of error. Example: a memory chip has a bit error rate of 10⁻⁶, so on average 1 bit per million is faulty.
• Availability
➢ Fraction of time that a system is up (i.e., uptime). Example: 99.9999%
➢ Mean Time To Failure (MTTF)
• Performance/Cost Ratio
➢ Used to compare two or more systems
➢ Example: Performance/Watt used to measure energy efficiency – how much performance
does the system get for each 1W of power?
➢ Example: Performance/$ used to measure cost effectiveness – how much performance
does the system get for each $1 of system cost?
42
Class Announcements
• HW 1 posted. Due: Thursday Jan 20 at 11:59 PM
➢Intended to get you to assess whether you have the right background for this
course
• Office hours Wednesday at 10 AM
➢First-come, first-served
➢Online on zoom

43
