2035 CH1 Notes

Statistics is the science of collecting, organizing, summarizing, and interpreting numerical data. It begins with asking questions that can be answered with data. Data is collected and organized, often in tables or graphs, and then summarized using numerical values like averages or measures of variability. The data is then analyzed using statistical methods and interpreted. Probability concepts are also important for modeling real-world scenarios. Statistics distinguishes between parameters that describe populations and statistics that describe samples. It also categorizes variables as nominal, ordinal, interval, or ratio based on their level of measurement. An example is provided on studying business opportunities in rural India using survey data on product consumption, demographics, and preferences. Big data is also introduced as large, complex datasets

Uploaded by

kejacob629

What is Statistics?

Statistics is the science of collecting,
organizing, summarizing and interpreting
numerical facts, which we call data

Statistics starts with a question

1. What percent of students at UWO
smoke?
2. Are there differences in the number of
calories in different types of hot dogs?
3. Is there a relationship between size of an
engine and the gas mileage of an
automobile?
4. Is there any difference in the percentage
of students who work while going to
school compared to 10 years ago?
Collecting Data
• we need to keep in mind where the data
comes from and how it was collected

Organizing Data
• raw data does not tell us anything
• the data may first need to be organized in a
table

Summarizing Data
• data organized in a table may not tell us
anything
• best to display the data in a graph
(chapter 2)
• data can then be further summarized with
specific numerical values such as the
average or a number representing how
spread out the data are (chapter 3)
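
The organize-then-summarize steps above can be sketched in a few lines of Python. The calorie values are hypothetical, invented purely for illustration; they are not data from the course.

```python
from collections import Counter
import statistics

# Hypothetical raw data: calories in ten hot dogs (invented for illustration)
calories = [150, 180, 150, 190, 170, 180, 150, 140, 190, 180]

# Organize: a frequency table (value -> count), sorted by value
frequency_table = sorted(Counter(calories).items())
for value, count in frequency_table:
    print(f"{value:>4} calories: {count}")

# Summarize: the average, and a number describing how spread out the data are
average = statistics.mean(calories)
spread = statistics.stdev(calories)  # sample standard deviation
print(f"average = {average}, spread = {spread:.1f}")
```

The raw list is hard to read at a glance; the frequency table and the two summary numbers are what make it interpretable.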
Analysis of the Data and Interpreting Your
Results
• Analyze the data using the appropriate
statistical method(s)
• How confident are you in the numerical
values you have calculated?
• What do they mean?
• Chapters 8, 9, 10, 11, 12, 13, 14 and 16
discuss these questions in more detail

However, prior to getting to the statistical
analysis part of the course, we must first
discuss probability

Probability rules will be discussed in
chapter 4
We then move on to asking questions such as:

1. How do you mathematically model how
many electronic circuits in a large
shipment are defective?
2. What is the expected number of potholes
in a stretch of highway?
3. What percentage of soft drink cans have
a volume less than 95% of the volume
stated on the can?
4. What percentage of 10-kilogram bags of
cement have a weight greater than 10.2
kilograms?

To answer these questions, we need to know
something about probability distributions
(chapters 5, 6 and 7)
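
As a rough sketch of what such models can look like: one common choice for a count of defective items is the binomial distribution, and for a continuous measurement such as bag weight, the normal distribution. Both model choices and every number below (shipment size, defect rate, mean and standard deviation of bag weight) are assumptions made for illustration; the notes do not specify them.

```python
from math import comb
from statistics import NormalDist

# Assumed model: each of n circuits is defective independently with probability p
n, p = 20, 0.05  # hypothetical shipment size and defect rate

def binomial_pmf(k: int) -> float:
    """P(exactly k defective circuits) under the assumed binomial model."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

expected_defectives = n * p  # mean of a binomial distribution

# Assumed model: bag weights are normal with mean 10 kg and sd 0.1 kg
bag_weight = NormalDist(mu=10.0, sigma=0.1)
share_over_10_2 = 1 - bag_weight.cdf(10.2)  # P(weight > 10.2 kg)

print(f"P(no defectives) = {binomial_pmf(0):.4f}")
print(f"expected defectives = {expected_defectives}")
print(f"share of bags over 10.2 kg = {share_over_10_2:.4f}")
```

With different assumed parameters the answers change, which is why the distribution and its parameters have to be justified before the percentages mean anything.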
Basic Statistical Concepts/Variables and
Data (section 1.1)

What is a population?

• It is the set of all units that you are interested
in studying
• Units = people OR objects OR items of
interest

Examples
You are usually interested in studying a
certain characteristic of the population

• Any particular characteristic is called a
variable

Examples
Two Types of Variables

1. Quantitative (Measurement) Variable

• This is a variable that is assigned a
meaningful numerical value

2. Qualitative (Categorical) Variable

• This is a variable where the
characteristic can be assigned to
different categories
Example
Suppose we are interested in obtaining the
following data from a population of students.
Which of the variables are quantitative (Q) and
which are categorical (C)?

• Age
• Gender
• Weight
• Do you smoke
• Height
• In what year at Western are you
• What cell phone brand do you have
• How far is your parents’ home from Western
• How many text messages do you send in a week
(approximately)
• Do you currently have a job where you work 10 or
more hours/week
If we examine every unit of the population
(for the variable of interest), we say we are
conducting a census of the population

• As you can imagine, many populations are
too large to study
• It would be too time consuming OR too
costly to conduct a census

Thus, it makes more sense to select and
analyze a subset (or portion) of the
population

This subset is called a sample

Once you have selected a sample from the
population you wish to study, you will want
to begin your analysis of the data by first
describing the sample data:

• Graph the sample data (chapter 2)

• Calculate some numbers that summarize
the data – these numbers are called
descriptive statistics (chapter 3)

Using the sample data to draw conclusions
about the population from which the sample
was drawn is a type of statistics called
inferential statistics (chapters 8 to 11)
Parameters vs. Statistics
Parameters
A descriptive measure of a population is
called a parameter
• Parameters are usually denoted by a
Greek letter

Examples:
μ = mean or average value of the population
σ² = measure of the spread of a population

Statistics
A descriptive measure of a sample is called
a statistic
• Statistics are usually denoted by
Roman letters

Examples:
x̄ = mean or average value of the sample
s² = measure of the spread of a sample
In Statistical Inference
As mentioned earlier, we are often interested
in using a sample to draw conclusions (make
inferences) about the population from which
it was drawn

• Since the value of μ is often unknown,
we will end up using x̄ as a point
estimate of μ

• Since the value of σ² is often unknown,
we will end up using s² as a point
estimate of σ²
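
A minimal sketch of the parameter/statistic distinction, assuming σ² and s² denote the population and sample variance (the usual reading of these symbols). The population is simulated and entirely hypothetical; in practice μ and σ² would be unknown, which is the reason for sampling.

```python
import random
import statistics

# Hypothetical population: weights (kg) of 1,000 units, generated for illustration
rng = random.Random(42)
population = [rng.gauss(70, 10) for _ in range(1000)]

# Parameters describe the whole population (a census would be needed to know them)
mu = statistics.mean(population)
sigma_sq = statistics.pvariance(population)  # population variance, divides by N

# Statistics describe a sample drawn from that population
sample = rng.sample(population, 50)
x_bar = statistics.mean(sample)
s_sq = statistics.variance(sample)  # sample variance, divides by n - 1

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")
print(f"sigma^2 = {sigma_sq:.2f}, s^2 = {s_sq:.2f}")
```

The sample values will generally be close to, but not equal to, the population values; a different sample would give different point estimates.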
Data Measurement: Nominal, Ordinal,
Interval and Ratio – Section 1.2

As stated before, a variable may be
qualitative (categorical) or quantitative

1. Qualitative Variables
There are two levels of measurement:

o Nominal (nominative) level

o Ordinal (ranked) level

a. Nominal Level

A nominal variable only sorts observations
into categories
• There is no meaningful order
b. Ordinal Level

When categorical variables are ranked in
order, the numbers assigned have meaning
• this is called the ordinal level of
measurement

For example, if a person is asked to rank
their favourite cities that they have visited in
Canada, it might be something like:

1. Vancouver (least favourite city)
2. Montreal
3. Halifax
4. Toronto
5. Calgary (favourite city)

This ranking does not have to have equal
distances between points: the step from 1 to 2
is not necessarily the same as the step from 2 to 3
• Only the order is meaningful
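
A short sketch of what "only the order is meaningful" implies in practice, using the city ranking from the example. Storing the ranks in a dictionary is just one convenient representation, chosen here for illustration.

```python
# The ranking from the example above: 1 = least favourite, 5 = favourite
ranking = {"Vancouver": 1, "Montreal": 2, "Halifax": 3, "Toronto": 4, "Calgary": 5}

# Order comparisons are meaningful at the ordinal level
favourite = max(ranking, key=ranking.get)
most_to_least = sorted(ranking, key=ranking.get, reverse=True)
prefers_toronto_to_halifax = ranking["Toronto"] > ranking["Halifax"]

print(favourite)
print(most_to_least)

# Differences between ranks are not meaningful: the arithmetic below runs,
# but nothing says the step from rank 1 to 2 is as large as the step from 2 to 3
rank_gap = ranking["Montreal"] - ranking["Vancouver"]
```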
Example
Which of the following situations are
nominal data and which are ordinal data?

• Stats course letter mark: A, B, C, D, F

• Tossing a coin: head/tail

• TV show rating: C, C8, G, PG, 14+, 18+

• Personal computer ownership: yes/no

• Restaurant rating: *****, ****, ***, **, *

• Income tax filing status: married, divorced,
common-law, separated, widowed, single

• Top ten realtors in a district

• The grocery store aisle where the soup is

2. Quantitative Variables

There are two levels of measurement:

o Interval level
o Ratio level

a. Interval Level

If you are rating quantitative variables, the
numbers have meaning AND the distances
between the points are fixed and meaningful

For example, professor ratings

• You are asked a series of questions about
your course/professor and are asked to
give a rating on the scale:

1 2 3 4 5 6 7

• The distance between 3 and 4 is the same
as the distance from 5 to 6
• However, the above scale could have been
written as

−1 0 1 2 3 4 5

• Here, the value of 0 is arbitrary; it does not
mean that the course is rated as nothing
(zero); instead it represents a rating of “not
very good”
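
The effect of an arbitrary zero can be checked directly. This sketch writes the same seven ratings on both scales from the notes and compares distances and ratios; the particular pair of ratings used for the ratio is an arbitrary choice.

```python
# The same seven ratings written on the two scales from the notes
scale_a = [1, 2, 3, 4, 5, 6, 7]
scale_b = [x - 2 for x in scale_a]  # -1, 0, 1, 2, 3, 4, 5

# Distances between points survive the relabelling...
gaps_a = [later - earlier for earlier, later in zip(scale_a, scale_a[1:])]
gaps_b = [later - earlier for earlier, later in zip(scale_b, scale_b[1:])]
print(gaps_a == gaps_b)

# ...but ratios do not, because the zero point is arbitrary
ratio_a = scale_a[5] / scale_a[2]  # 6 / 3
ratio_b = scale_b[5] / scale_b[2]  # 4 / 1
print(ratio_a, ratio_b)
```

The same two ratings appear "twice as good" on one scale and "four times as good" on the other, so statements about ratios carry no meaning at the interval level.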
b. Ratio Level

When a quantitative variable has a
meaningful zero value AND equal distance
between points, then the variable is at the
ratio level

An example would be grades on a test

• A grade of 0 has meaning
• The distance between a grade of 65 and 70
is the same as the distance between a
grade of 89 and 94

Another example would be possible annual
rates of return on an investment portfolio
• You could earn 9.6% in a year or 2.7% or
−3.4%, or even 0%
Case Study – The State of Business in Rural
India

This was a study done in the mid-1990s to
determine if there were business
opportunities in rural India
• In particular, in the personal
care/household commodities market

Some background:
• India is the 2nd most populous country in the
world, with a population > 1 billion
• 75% of the population live in rural areas
• Yet the rural market accounts for only 1/3
of the total national sales
• Since the 1990s, India’s rural market has
become more open for trade in consumer
goods
• This was an untapped market at the time,
offering potential for large companies to
enter the Indian market
Some Data – Rural India

Income Level

65% earn less than $574 annually

23% earn between $574 and $1146 annually

Literacy Rates

66% of women are illiterate

38% of men are illiterate

Where do these numbers come from?

Personal Care Product Consumption in
Rural India (1990-1994)

Product          1990                   1994
Toothpaste       8,825 metric tons      17,023 metric tons
Laundry Soap     272,540 metric tons    422,741 metric tons
Bathroom Soap    158,919 metric tons    231,084 metric tons
Shampoo          497,000 litres         2,116,000 litres

These values come from a survey of 2500
households in rural India
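
One simple way to summarize the consumption table is the percentage change from 1990 to 1994 for each product. The sketch below uses the figures from the table; the helper name `percent_change` is introduced here for illustration only.

```python
# Consumption figures from the table above (metric tons, except shampoo in litres)
consumption = {
    "Toothpaste": (8_825, 17_023),
    "Laundry Soap": (272_540, 422_741),
    "Bathroom Soap": (158_919, 231_084),
    "Shampoo": (497_000, 2_116_000),
}

def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100

growth = {product: percent_change(y1990, y1994)
          for product, (y1990, y1994) in consumption.items()}

for product, pct in growth.items():
    print(f"{product:<14} {pct:6.1f}% increase from 1990 to 1994")
```

Because percentage change is unit-free, shampoo (litres) can be compared with the other products (metric tons); on this measure shampoo grew the most over the period.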
Other Survey Data

1. They collected information on the income
level and age of the head of the household

2. Households were also asked to rate the
likelihood of purchasing toothpaste on a
scale of 1 to 5

3. Households were also asked to rank a
variety of products in terms of which they
were likely to purchase

4. Households also provided geographic
information, such as what area of India
they were from
Big Data (section 1.3)

Big Data is defined as a collection of
large and complex datasets from
different sources that are difficult to
process using traditional data
management and processing
applications

• Not all data are created in the same
way, nor do they represent the same
things

• As a result, there are at least four
characteristics or dimensions
associated with big data:
Variety
This refers to the many different
forms and sources of data

Velocity
This refers to the speed with which
the data are available and can be
processed
Veracity
This has to do with data quality, correctness,
and accuracy

Volume
This has to do with the ever-increasing size
of data and databases

Value
This is sometimes considered a fifth
characteristic; data that does not generate
value makes no contribution to an
organization
Business Analytics (section 1.4)

Business analytics refers to the application
of processes and techniques that transform
raw data into meaningful information to
improve decision-making

FIGURE 1.6 -- Business Analytics Add Value to Data
Categories of Business Analytics

1. Descriptive Analytics
• takes traditional data and describes what
has happened or is happening in a business
o Used to discover hidden
relationships and patterns
o Simplest and most commonly used
category
o Data visualization is key
o Also called reporting analytics

Topics include descriptive statistics,
frequency distributions, statistical inference,
correlation, clustering techniques, data
mining, and data visualization
2. Predictive Analytics
• finds relationships in the data that are
not readily apparent with descriptive
analytics
o Patterns or relationships are
extrapolated forward in time and the
past is used to make predictions
about the future

Topics include regression, time-series,
forecasting, simulation, data mining,
statistical modeling, machine learning
techniques, decision tree models, and neural
networks
3. Prescriptive Analytics
• examines current trends and likely
forecasts to make better decisions
o Takes uncertainty into account,
recommends ways to mitigate risks,
and tries to foresee the effects of
future decisions
o Uses a set of mathematical
techniques that determine optimal
decisions given a complex set of
objectives, requirements, and
constraints

Topics include management science or
operations research aimed at optimizing
performance of a system such as
mathematical programming, simulation, and
network analysis
Data Mining and Data Visualization (1.5)

Data Mining
This is the collecting, exploring, and
analyzing of large volumes of data to
uncover hidden patterns to enhance
decision-making
• Used by companies to turn raw data into
useful information

FIGURE 1.7 -- Process of Data Mining

Data Visualization
This is the study of the visual representation
of data and is employed to convey data or
information by imparting it as visual objects
displayed in graphics

Example
Here are some recent data on the top five
manufacturing firms to receive Canadian
Government funding

Company Name                  Dollars Funded
FCA Canada Inc. (Chrysler)    $85,800,000
Bombardier Inc.               $54,150,000
Produits Kruger               $39,500,000
Sonaca Montréal Inc.          $23,250,000
Hanwha L&C Canada Inc.        $15,000,000

Visualization of the Above Data

[Figures: bar chart and bubble chart of the funding data]
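
As a stand-in for a bar chart, the funding figures can be drawn as a simple text chart. This is only a sketch using the Python standard library; the scale of one mark per $5 million is an arbitrary choice, and a plotting library would normally be used for a real chart.

```python
# Funding figures from the table above
funding = {
    "FCA Canada Inc. (Chrysler)": 85_800_000,
    "Bombardier Inc.": 54_150_000,
    "Produits Kruger": 39_500_000,
    "Sonaca Montréal Inc.": 23_250_000,
    "Hanwha L&C Canada Inc.": 15_000_000,
}

DOLLARS_PER_MARK = 5_000_000  # arbitrary scale chosen for this sketch

def bar(dollars: int) -> str:
    """One '#' per DOLLARS_PER_MARK, rounded to the nearest whole mark."""
    return "#" * round(dollars / DOLLARS_PER_MARK)

for company, dollars in funding.items():
    print(f"{company:<28} {bar(dollars)} ${dollars / 1e6:.2f}M")
```

Even in this crude form, the relative sizes of the five amounts are easier to compare from the bar lengths than from the column of dollar figures.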
