COMM 291 Tutorial 1

The document outlines the importance of statistics in decision-making across various business sectors, emphasizing its role in analyzing data under uncertainty. It provides definitions of key terms related to data, such as variables, observations, and types of data (categorical and quantitative). Additionally, it includes tutorial information, examples, and problems to enhance understanding of statistical concepts.

COMM 291 Tutorial

Week 2
TAs: Ophir Greif & Hima Kattumuri
Emails: [email protected]
[email protected]
What is statistics?
➔ Data and their computations
➔ Methods to analyze data
Why is statistics important?
➔ Helps make decisions under uncertainty, with partial knowledge
Why should I learn statistics?
➔ Applicable to all segments of business
Why should we care?
➔ There is so much data today, and it needs to be analyzed and made useful
Definitions
➔ Variable: a characteristic being recorded
➔ Data: a collection of values
➔ Observations: synonym for data
➔ Data table: a table containing data, also known as a spreadsheet
➔ Record: a row in a spreadsheet
➔ Database: multiple spreadsheets combined
Types of data
Data can be categorical or quantitative.
➔ Categorical: data can be classified into distinct bins
◆ Few options, much repetition
◆ a. Binary: two options
◆ b. Nominal: more than two options, unordered
◆ c. Ordinal: more than two options, ordered
➔ Quantitative: numerical, have units
◆ Usually many options, little repetition
Identifier variables
➔ Unique to you
➔ E.g. UBC number, Social Insurance Number
String variables
➔ Dates and times
➔ Can be made quantitative (e.g. age)
Ex. 1
➔ Identifier: label
➔ Categorical: brand name (nominal), rating (ordinal)
➔ Quantitative: price, calories (Cal), protein (grams), fat (grams)
Ex. 2
➔ Quantitative: age (years), height (inches), weight (pounds), GPA (count)
➔ Categorical: sex (nominal), only child (binary), major (nominal)
➔ Time series: taken over a period of time (e.g. sales records)
➔ Cross-sectional: taken at one point in time (e.g. surveys, elections)
Land Acknowledgement

This event is taking place on the traditional, ancestral, unceded territory of the xʷməθkʷəyəm (Musqueam) Nation.
COMM 291 (Optional) Tutorials - Admin Info
What will be covered in tutorials:
➔ Review of the material and practice questions

When:
➔ Tue. 12:30–2:00 pm (PST) - Hima
➔ Wed. 4:00–5:30 pm (PST) - Ophir
➔ Thu. 12:30–2:00 pm (PST) - Alternating
Note: The three sessions are identical. Recordings will be posted at the end of every week.
Office Hours: after tutorials + before exams. Email me if you need any help!
Lecture 1: Introduction
What is statistics?
➔ Data and their computations
➔ Methods to analyze data

Why is statistics important?


➔ Helps us make decisions under uncertainty and with partial knowledge

Why should I learn statistics?


➔ Marketing: Effect of advertising on sales
➔ Finance: Pricing financial products
➔ Accounting: Reliability of reported data
➔ Logistics: Estimating delivery times
➔ OB/HR: Predicting employment retention rates
➔ Strategy/Econ: Should you enter a market?
➔ MIS: Availability of data + big data
Lecture 1: Why should we care?
Lecture 1: Definitions
➔ Variable – a characteristic recorded about an individual
➔ Data – specific values of a variable
➔ Observations – another word for data
➔ Data table – an arrangement of data in rows and columns; also
called a spreadsheet
➔ Record – a row in a spreadsheet
➔ Database – a complex data structure possibly involving multiple
spreadsheets all linked so that information across them can be
combined
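
As a rough illustration of how these terms map onto a small data table, the sketch below stores records as Python dictionaries. The student IDs, majors, ages, and the helper names `variable_names` and `values_of` are invented for illustration and do not come from the course materials.

```python
# Hypothetical data table: each dict is one record (a row),
# and each key is a variable (a column). All values are made up.
data_table = [
    {"student_id": "S001", "major": "Finance", "age": 19},
    {"student_id": "S002", "major": "Marketing", "age": 21},
    {"student_id": "S003", "major": "Finance", "age": 20},
]


def variable_names(table):
    """Return the variables (column names) recorded in the table."""
    return list(table[0].keys())


def values_of(table, variable):
    """Return the data (observed values) for one variable."""
    return [record[variable] for record in table]


first_record = data_table[0]          # a record is one row
ages = values_of(data_table, "age")   # data for the variable "age"
```

A database, in this picture, would be several such tables linked by a shared variable such as `student_id`.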
Which of the following are present here?
➔ Variable, Data, Observations, Data table, Record, Database
Lecture 2: Data
Types of Variables/Data:
➔ Categorical
◆ Can be put into ‘distinct bins’
◆ Take a small number of possible values
◆ Subtypes: binary (only 2 options),
nominal (unordered), ordinal (ordered)

➔ Quantitative
◆ Have units
◆ Allow arithmetic operations
◆ Can be discrete or continuous
➔ Strings
◆ Identifier Variable
● E.g. Student number, Social Insurance number, UPS tracking number.
Example: Sauder BCom International Students

Which sort of data are present? Are they categorical or quantitative?


Problem #1
In May 2014, a consumer report published an article on energy bars they
tested recently. They reported some basic information about each energy bar
collected from the label and the results of some tests conducted by their staff.
The article gave the brand name, price, as well as nutritional information:
number of calories, and grams of protein and fat. Blind taste tests were
conducted by staff and the taste of each energy bar was rated on a scale from
1 to 5 (1 - worst, 5 - best).

List the variables. Indicate whether each variable is categorical or
quantitative, or neither (strings). If the variable is quantitative, state the
units. If the variable is categorical, state the type.
Problem #1 - Solution
Identifier Variable/Strings
➔ Energy Bar Label
Categorical
➔ Taste (Ordinal)
➔ Brand Name (Nominal)
Quantitative
➔ Price ($)
➔ Calories (calories)
➔ Protein (grams)
➔ Fat (grams)
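
One way to see why the classification matters is that it decides which summary makes sense for each variable. The sketch below encodes the Problem #1 classification and picks a mean for quantitative variables and frequency counts for categorical ones. The three energy bars, their values, and the names `VARIABLE_TYPES` and `summarize` are hypothetical; none of the numbers come from the consumer report.

```python
from collections import Counter
from statistics import mean

# Classification of the Problem #1 variables (key names are our own).
VARIABLE_TYPES = {
    "label": "identifier",
    "brand": "nominal",
    "taste": "ordinal",
    "price": "quantitative",
    "calories": "quantitative",
    "protein_g": "quantitative",
    "fat_g": "quantitative",
}

# Hypothetical records, invented purely for illustration.
bars = [
    {"label": "Bar A", "brand": "BrandX", "taste": 4, "price": 2.0,
     "calories": 200, "protein_g": 10, "fat_g": 7},
    {"label": "Bar B", "brand": "BrandY", "taste": 2, "price": 3.0,
     "calories": 250, "protein_g": 12, "fat_g": 9},
    {"label": "Bar C", "brand": "BrandX", "taste": 4, "price": 2.5,
     "calories": 220, "protein_g": 8, "fat_g": 6},
]


def summarize(records, variable):
    """Summarize one variable according to its type."""
    values = [record[variable] for record in records]
    kind = VARIABLE_TYPES[variable]
    if kind == "quantitative":
        return mean(values)            # arithmetic is meaningful
    if kind in ("nominal", "ordinal", "binary"):
        return dict(Counter(values))   # count how often each bin occurs
    return None                        # identifiers are not summarized
```

Note that taste is stored as a number from 1 to 5 but is treated as ordinal here, so it is counted rather than averaged.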
Problem #2
Consider the following data set:

List the variables in the data set. Indicate whether each variable is
treated as categorical or quantitative, or neither. If the variable is
quantitative, state the units. If the variable is categorical, state the type.
Problem #2 - Solution
Identifier Variable/Strings
➔ None
Categorical
➔ Sex (nominal)
➔ Major (nominal)
➔ Only child (binary)
Quantitative
➔ Age (years)
➔ Height (inches)
➔ Weight (pounds)
➔ GPA (4.33 scale)
Lecture 2: Context and Units Matter
➔ Which country is best to live in?
◆ Answer depends on what you are looking for (e.g. average income,
greenhouse emissions, quality of education, etc.)
➔ Pay attention to units
◆ Different measurement systems
◆ E.g. $/pounds, km/miles, kg/pounds

➔ Do not compare the incomparables


Problem 3: Which Country Had the
Highest Population Density in 2020?
A. United States
B. China
C. Singapore
D. India
E. Monaco
Problem 3: Which Country Had the
Highest Population Density in 2020?
A. United States (34 per sq. km)
B. China (146 per sq. km)
C. Singapore (7,894 per sq. km)
D. India (411 per sq. km)
E. Monaco (18,960 per sq. km)
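
The densities above are all stated per square kilometre, so they can be compared directly. As a small sketch of the "pay attention to units" point, the code below converts them to people per square mile. The only figure not taken from the slide is the conversion factor (1 mile = 1.609344 km), and the function and variable names are our own.

```python
KM_PER_MILE = 1.609344  # exact, by definition of the international mile

# Densities as listed above, in people per square kilometre.
density_per_sq_km = {
    "United States": 34,
    "China": 146,
    "Singapore": 7894,
    "India": 411,
    "Monaco": 18960,
}


def per_sq_km_to_per_sq_mile(density):
    """Convert people per square kilometre to people per square mile."""
    # One square mile covers KM_PER_MILE ** 2 square kilometres,
    # so the same land holds that many times more people per unit area.
    return density * KM_PER_MILE ** 2


density_per_sq_mile = {
    country: per_sq_km_to_per_sq_mile(d)
    for country, d in density_per_sq_km.items()
}

# The ranking does not change with the unit, as long as every
# country is expressed in the same unit.
densest = max(density_per_sq_km, key=density_per_sq_km.get)
```

Mixing the two unit systems in one comparison would overstate any country reported per square mile by a factor of roughly 2.59.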
Problem 4: Which Country had the
highest GDP per person in 2020?

A. United States
B. Qatar
C. Luxembourg
D. Singapore
E. Canada
Problem 4: Which Country had the
highest GDP per person in 2020?

A. United States ($63,051)
B. Qatar ($52,751)
C. Luxembourg ($109,602)
D. Singapore ($58,484)
E. Canada ($42,080)
Lecture 2: Time and Data
➔ Cross-sectional: data are collected at one point in time (e.g. surveys)
➔ Time Series: data are collected longitudinally at various time points (e.g. sales records)
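
The distinction can be sketched as a simple check on when and about whom the observations were recorded. The function below is a minimal illustration, assuming each observation is an (entity, time point) pair. The example data and the "mixed" fallback label are our own additions and are not part of the lecture definitions.

```python
def classify_dataset(records):
    """Classify a list of (entity, time_point) observations."""
    records = list(records)
    if not records:
        raise ValueError("need at least one observation")
    entities = {entity for entity, _ in records}
    time_points = {time for _, time in records}
    if len(time_points) == 1:
        return "cross-sectional"   # everything observed at one point in time
    if len(entities) == 1:
        return "time series"       # one entity followed over time
    return "mixed"                 # assumption: neither pure case applies


# Hypothetical survey: several respondents, one point in time.
survey = [("resp_1", "2020-05"), ("resp_2", "2020-05"), ("resp_3", "2020-05")]

# Hypothetical sales records: one store, several months.
sales = [("store_A", "2020-01"), ("store_A", "2020-02"), ("store_A", "2020-03")]
```

Calling `classify_dataset(survey)` gives the cross-sectional label, and `classify_dataset(sales)` gives the time series label.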
Problem #5: Cross-Sectional or Time Series?
Problem #6: Cross-Sectional or Time Series?

Source: http://www.270towin.com/
Lecture 2: Data Quality
➔ Where did the data come from? Is the source reliable?
➔ How well are variables defined? Be specific!
➔ Level of measurement and accuracy (use rounding)
➔ How were the data collected (e.g. objective vs. subjective)?
➔ Missing data (what is missing and does it matter?)
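
Two of these checks, missing data and rounding, are easy to sketch in code. The example below counts missing values per variable and reports a rounded mean that skips them. The records are hypothetical, missing values are assumed to be stored as `None`, and the function names are our own.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
students = [
    {"name": "A", "height_in": 66.0, "gpa": 3.5},
    {"name": "B", "height_in": None, "gpa": 3.9},
    {"name": "C", "height_in": 71.0, "gpa": None},
]


def missing_counts(records):
    """Count how many values are missing for each variable."""
    counts = {}
    for record in records:
        for variable, value in record.items():
            counts.setdefault(variable, 0)
            if value is None:
                counts[variable] += 1
    return counts


def mean_ignoring_missing(records, variable, decimals=1):
    """Mean of the available values, rounded to a sensible accuracy."""
    values = [r[variable] for r in records if r[variable] is not None]
    return round(mean(values), decimals)
```

Skipping missing values is only one possible choice. Whether it is appropriate depends on what is missing and why, which is the question the slide raises.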
Problem #7: Is Climate Change Real?
➔ What can you infer from the graph?
➔ What is the source of data? Is it reliable?
➔ What is missing from the picture?
Source: Daily Mail
Global Temperatures over 120 years

Source: NASA
Take-Away Message
➔ Data classification determines how we can phrase our
questions and conduct data analysis. Units matter!
➔ The same variable may be viewed as categorical or
quantitative, depending on the situation.
➔ Understanding the context of the data is the first step!
➔ Pay attention to the quality of collected data (e.g. sources,
level of accuracy, missing data)
Thank you and see you next week!
