Lecture 2

Statistical and Mathematical Methods for Data Analysis

Dr. Raja Noshad Jamil
Department of Artificial Intelligence
School of Systems and Technology
University of Management and Technology
Basic concepts [1]
Statistics is defined as
“The mathematics of the collection, organization, and
interpretation of numerical data, especially the analysis
of population characteristics by inference from sampling”
OR
Statistics is a science which deals with collection, classification,
distribution and interpretation of data.
OR
Statistics is a science of uncertainty.
OR
Statistics is the science of collecting, organizing, analyzing, and
interpreting data in order to make decisions.
Data sets
Data consist of information coming from observations, counts,
measurements, or responses.

Statistics is the science of collecting, organizing, analyzing, and
interpreting data in order to make decisions.

There are two types of data sets you will use when studying
statistics. These data sets are called populations and samples.

A population is the collection of all outcomes, responses,
measurements, or counts that are of interest.

A sample is a subset, or part, of a population.
Identifying Data Sets
In a recent survey, 614 small business owners in the United
States were asked whether they thought their company’s
Facebook presence was valuable. Two hundred fifty-eight (258)
of the 614 respondents said yes. Identify the population and
the sample. Describe the sample data set.
Solution:
The population consists of the responses of all small
business owners in the United States, and the sample
consists of the responses of the 614 small business owners
in the survey.

Notice that the sample is a subset of the responses of all small
business owners in the United States. The sample data set consists
of 258 owners who said yes and 356 owners who said no.
Descriptive Statistics vs. Inferential Statistics

The study of statistics has two major branches: descriptive
statistics and inferential statistics.

Descriptive statistics is the branch of statistics that involves
the organization, summarization, and display of data.

Inferential statistics is the branch of statistics that involves
using a sample to draw conclusions about a population. A basic tool
in the study of inferential statistics is probability.
Descriptive and Inferential Statistics
Example :Determine which part of the study represents the
descriptive branch of statistics. What conclusions might be
drawn from the study using inferential statistics?
1. A large sample of men, aged 48, was studied for 18 years.
For unmarried men, approximately 70% were alive at age 65.
For married men, 90% were alive at age 65. (Source: The
Journal of Family Issues)
2. In a sample of Wall Street analysts, the percentage who
incorrectly forecasted high-tech earnings in a recent year was
44%. (Source: Bloomberg News)
Solution:
1.Descriptive statistics involves statements such as “For
unmarried men, approximately 70% were alive at age 65” and
“For married men, 90% were alive at age 65.” Also, the figure
represents the descriptive branch of statistics. A possible
inference drawn from the study is that being married is
associated with a longer life for men.

2. The part of this study that represents the descriptive branch of
statistics involves the statement “the percentage [of Wall Street
analysts] who incorrectly forecasted high-tech earnings in a recent
year was 44%.” A possible inference drawn from the study is that
the stock market is difficult to forecast, even for professionals.
Parameter vs. Statistic

A parameter is a numerical description of a population
characteristic.

A statistic is a numerical description of a sample characteristic.
Distinguishing Between a Parameter and a
Statistic
Example: Determine whether the numerical value describes a
population parameter or a sample statistic. Explain your
reasoning.
1. A recent survey of approximately 400,000 employers
reported that the average starting salary for marketing majors
is $53,400. (Source: National Association of Colleges and
Employers)
2. The freshman class at a university has an average SAT math
score of 514.
3. In a random check of 400 retail stores, the Food and Drug
Administration found that 34% of the stores were not storing
fish at the proper temperature.
Solution
1. Because the average of $53,400 is based on a subset of the
population, it is a sample statistic.
2. Because the average SAT math score of 514 is based on the
entire freshman class, it is a population parameter.
3. Because the percent, 34%, is based on a subset of the
population, it is a sample statistic.
Types of Data

Data sets can consist of two types of data: qualitative data and
quantitative data.

Qualitative data consist of attributes, labels, or nonnumerical
entries.

Quantitative data consist of numerical measurements or counts.
Classifying Data by Type
Example: The suggested retail prices of several Honda vehicles
are shown in the table. Which data are qualitative data and
which are quantitative data? Explain your reasoning. (Source:
American Honda Motor Company, Inc.)
Solution

The information shown in the table can be separated into two data
sets. One data set contains the names of vehicle models, and the
other contains the suggested retail prices of vehicle models.

The names are nonnumerical entries, so these are qualitative data.

The suggested retail prices are numerical entries, so these are
quantitative data.
Levels of Measurement

Another characteristic of data is its level of measurement. The
level of measurement determines which statistical calculations are
meaningful.

The four levels of measurement, in order from lowest to highest,
are nominal, ordinal, interval, and ratio.
Nominal vs Ordinal

Data at the nominal level of measurement are qualitative only. Data
at this level are categorized using names, labels, or qualities. No
mathematical computations can be made at this level.

Data at the ordinal level of measurement are qualitative or
quantitative. Data at this level can be arranged in order, or
ranked, but differences between data entries are not meaningful.
Example

Two data sets are shown. Which data set consists of data at
the nominal level? Which data set consists of data at the
ordinal level? Explain your reasoning. (Source: The Numbers)
Solution
The first data set lists the ranks of five movies. The data set
consists of the ranks 1, 2, 3, 4, and 5. Because the ranks can
be listed in order, these data are at the ordinal level. Note
that the difference between a rank of 1 and 5 has no
mathematical meaning.

The second data set consists of the names of movie genres. No
mathematical computations can be made with the names and the names
cannot be ranked, so these data are at the nominal level.
Interval vs. Ratio

Data at the interval level of measurement can be ordered, and
meaningful differences between data entries can be calculated. At
the interval level, a zero entry simply represents a position on a
scale; the entry is not an inherent zero.

Data at the ratio level of measurement are similar to data at the
interval level, with the added property that a zero entry is an
inherent zero. A ratio of two data entries can be formed so that
one data entry can be meaningfully expressed as a multiple of
another.
An inherent zero is a zero that implies “none.” For instance,
the amount of money you have in a savings account could be
zero dollars. In this case, the zero represents no money; it is
an inherent zero. On the other hand, a temperature of 0°C
does not represent a condition in which no heat is present.
The 0°C temperature is simply a position on the Celsius scale;
it is not an inherent zero.
To distinguish between data at the interval level and at the
ratio level, determine whether the expression “twice as
much” has any meaning in the context of the data.

For instance, $2 is twice as much as $1, so these data are at the
ratio level. On the other hand, 2°C is not twice as warm as 1°C, so
these data are at the interval level.
Classifying Data by Level
Example: Two data sets are shown at below. Which data set
consists of data at the interval level? Which data set consists
of data at the ratio level? Explain your reasoning. (Source:
Major League Baseball)
Solution
Both of these data sets contain quantitative data. Consider
the dates of the Yankees’ World Series victories. It makes
sense to find differences between specific dates. For instance,
the time between the Yankees’ first and last World Series
victories is 2009 - 1923 = 86 years. But it does not make sense
to say that one year is a multiple of another. So, these data
are at the interval level.

However, using the home run totals, you can find differences and
write ratios. From the data, you can see that Baltimore hit 39 more
home runs than Tampa Bay hit and that New York hit about 1.5 times
as many home runs as Detroit hit. So, these data are at the ratio
level.
The tables below summarize which operations are
meaningful at each of the four levels of
measurement.
When identifying a data set’s level of measurement, use the highest
level that applies.
Summary of Four Levels of Measurement
Key Terms for Data Types
Continuous
• Data that can take on any value in an interval.
• Synonyms: interval, float, numeric

Discrete
• Data that can only take on integer values, such as counts.
• Synonyms: integer, count
Key Terms for Data Types
Categorical
•Data that can only take on a specific set of
values.
•Example: Sex, type of chocolate, color
•Synonyms: enums, enumerated, factors, nominal,
polychotomous
Binary
•A special case of categorical with just two
categories (0/1, True, False).
•Synonyms: dichotomous, logical, indicator
Ordinal
•Categorical data that has an explicit ordering.
•Synonyms: ordered factor
Data Types
Binary data is an important special case of
categorical data that takes on only one of two
values, such as 0/1, yes/no or true/false.
Synonyms: dichotomous, logical, indicator
Ordinal
•Categorical data that has an explicit ordering.
•Synonyms: ordered factor
An example of this is a numerical rating (1, 2, 3, 4,
or 5)
Data Types
There are two basic types of structured data:
numeric and categorical.

Numeric data comes in two forms: continuous, such as wind speed or
time duration, and discrete, such as the count of the occurrence of
an event.

Categorical data takes only a fixed set of values, such as a type
of TV screen (plasma, LCD, LED, …) or a state name (Alabama,
Alaska, …).
Nominal scales
Nominal scales are used for labeling variables, without any
quantitative value.
“Nominal” scales could simply be called “labels.”
Here are some examples, below. Notice that all of these
scales are mutually exclusive (no overlap) and none of them
have any numerical significance.
A good way to remember all of this is that “nominal”
sounds a lot like “name” and nominal scales are kind of like
“names” or labels.
Nominal scale example
Type of chocolate

•Dark(1)
• Milk(2)
•White (3)
Sex
•Male(0)
• Female(1)
Color
• Red(1)
• Green(2)
• Blue(3)
• Yellow(4)
Ordinal scale
With ordinal scales, it is the order of the values that is
important and significant, but the differences between the values
are not really known.
Take a look at the example below. In each case, we know that option
4 is better than option 3 or option 2, but we don’t know, and
cannot quantify, how much better it is.
For example, is the difference between “OK” and “Unhappy”
the same as the difference between “Very Happy” and
“Happy” ? We can’t say.
Ordinal scales are typically measures of non-numeric
concepts like satisfaction, happiness, discomfort, etc.
Ordinal scale example

“Ordinal” is easy to remember because it sounds like “order”, and
that’s the key to remember with ordinal scales: it is the order
that matters, but that’s all you really get from these.

Advanced note: The best way to determine central tendency on a set
of ordinal data is to use the mode or median; the mean cannot be
defined from an ordinal set.
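As a small illustration added here (not part of the original
slide), the mode and the median of ordinal data can be computed in
Python with pandas ordered categoricals; the rating labels and the
responses below are made up:

import pandas as pd

# Hypothetical satisfaction ratings on an ordered (ordinal) scale
levels = ["Very Unhappy", "Unhappy", "OK", "Happy", "Very Happy"]
responses = pd.Series(
    ["OK", "Happy", "Happy", "Very Happy", "Unhappy", "Happy"],
    dtype=pd.CategoricalDtype(categories=levels, ordered=True),
)

# The mode is always defined for ordinal data
print(responses.mode())          # Happy

# The median can be read off the middle rank of the ordered codes,
# but a mean of the labels themselves is not defined
median_code = int(responses.cat.codes.median())
print(levels[median_code])       # Happy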
Key Ideas

Data are typically classified in software by their type.

Data types include continuous, discrete, categorical (which
includes binary), and ordinal.

Data-typing in software acts as a signal to the software on how to
process the data (see the sketch below).
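A minimal sketch, using pandas, of how a declared data type signals
which operations are meaningful; the column names and values are
invented for illustration:

import pandas as pd

# Hypothetical data set illustrating the software data types above
df = pd.DataFrame({
    "wind_speed": [3.2, 7.5, 0.0, 12.1],                 # continuous
    "accident_count": [1, 0, 2, 5],                       # discrete
    "screen_type": ["plasma", "LCD", "LED", "LCD"],       # categorical
    "satisfaction": ["OK", "Happy", "Unhappy", "Happy"],  # ordinal
})

df["screen_type"] = df["screen_type"].astype("category")
df["satisfaction"] = df["satisfaction"].astype(
    pd.CategoricalDtype(["Unhappy", "OK", "Happy"], ordered=True)
)

# The dtype tells the library how to process each column:
print(df["wind_speed"].mean())           # means make sense for numeric data
print(df["screen_type"].value_counts())  # counts make sense for categorical data
print(df["satisfaction"].min())          # ordering makes sense for ordinal data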

PAKISTAN: ROAD TRAFFIC ACCIDENTS
• Deaths = 30,046
• % = 2.42 (of total deaths in Pakistan)
• Rate = 17.12
• World Rank = 95
• According to the latest WHO data published in 2018, road traffic
accident deaths in Pakistan reached 30,046, or 2.42% of total
deaths. The age-adjusted death rate of 17.12 per 100,000 of
population ranks Pakistan #95 in the world.
Reference: https://www.worldlifeexpectancy.com/pakistan-road-traffic-accidents

• Road injuries killed 1.4 million people in 2016, about
three-quarters (74%) of whom were men and boys.
Basic concepts

Probability can be defined as the mathematics of chance.

Statisticians use the word experiment to describe any process that
generates a set of data.

A probability experiment is a chance process that leads to
well-defined outcomes or results. For example, tossing a coin can
be considered a probability experiment since there are two
well-defined outcomes: heads and tails.
Basic concepts

In probability theory, an experiment or trial is any procedure that
can be infinitely repeated and has a well-defined set of possible
outcomes, known as the sample space.

An outcome of a probability experiment is the result of a single
trial of a probability experiment.
Basic concepts
The set of all possible outcomes of a statistical
experiment is called the sample space and is
represented by the symbol S.
OR
The set of all outcomes of a probability experiment is
called a sample space. Some sample spaces for various
probability experiments are shown here.
Basic concepts

Each outcome in a sample space is called an element or a member of
the sample space, or simply a sample point.

Each outcome of a probability experiment occurs at random.

Each outcome of the experiment is equally likely unless otherwise
stated.
Basic concepts
An event usually consists of one or more outcomes of the sample
space.
OR
An event is a subset of a sample space.

An event with one outcome is called a simple event. If an event
consists of two or more outcomes, it is called a compound event.
Example
A single die is rolled. List the outcomes in each event:

a. Getting an odd number

b. Getting a number greater than four

c. Getting a number less than one


Example cont.
Solution:
S = {1, 2, 3, 4, 5, 6}
a. Getting an odd number: {1, 3, 5}
b. Getting a number greater than four: {5, 6}
c. Getting a number less than one: { } (the empty set; this event
cannot occur)
Basic concepts
Classical Probability: The formula for determining the probability
of an event E is

P(E) = n(E) / n(S)

OR

P(E) = (Number of outcomes contained in the event E) / (Total
number of outcomes in the sample space)
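As a minimal sketch added here (not from the slides), the classical
formula can be evaluated in Python by counting outcomes; the die
sample space and events below are the running examples from this
lecture:

from fractions import Fraction

# Sample space for rolling a single die
S = {1, 2, 3, 4, 5, 6}

def classical_probability(event, sample_space):
    # P(E) = n(E) / n(S), assuming equally likely outcomes
    return Fraction(len(event & sample_space), len(sample_space))

odd = {1, 3, 5}
greater_than_four = {5, 6}

print(classical_probability(odd, S))                # 1/2
print(classical_probability(greater_than_four, S))  # 1/3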
Example:
Two coins are tossed; find the probability that both coins land
heads up.
Solution:
S = {HH, HT, TH, TT}, n(S) = 4
Let E be the event that both coins land heads up: E = {HH}, n(E) = 1
P(E) = 1/4 = 0.25 (or 25%)
Example:
A die is tossed; find the probability of each event:

a. Getting a two

b. Getting an even number

c. Getting a number less than 5


Example cont.
Solution:
S = {1, 2, 3, 4, 5, 6}
n(S) = 6

P(E) = (Number of outcomes contained in the event E) / (Total
number of outcomes in the sample space)

a. Let A be the event of getting a “two”


A = {2}
n(A) = 1
P (A) = 1/6 = 0.1667 (or 16.67%)
Example cont.

b. Let B be the event of getting an "even number"
B = {2, 4, 6}
n(B) = 3
P(B) = 3/6 = 0.5 (or 50%)

c. Let C be the event of getting a number "less than 5"
C = {1, 2, 3, 4}
n(C) = 4
P(C) = 4/6 = 0.6667 (or 66.67%)
Basic concepts
Rule 1: The probability of any event will always be a
number from zero to one. Probabilities cannot be
negative nor can they be greater than one.

Rule 2: When an event cannot occur, the probability will be zero.

Example: A die is rolled; find the probability of getting a 7.
(Here P(7) = 0, since 7 is not in the sample space.)
Basic concepts
Rule 3: When an event is certain to occur, the probability is 1.

Example: A die is rolled; find the probability of getting a number
less than 7. (Here the probability is 1, since every outcome is
less than 7.)

Rule 4: The sum of the probabilities of all of the outcomes in the
sample space is 1.

Example: P(H) = P(T) = 1/2; P(H) + P(T) = 1


Basic concepts
Complement: The complement of an event A with
respect to S is the subset of all elements of S that are
not in A. We denote the complement of A by the
symbol A'.

Rule 5: The probability that an event will not occur is equal to 1
minus the probability that the event will occur.

Example: P(H) = 1/2, then P(T) = 1 - P(H) = 1 - 1/2 = 1/2
Basic concepts
The probability of an event A is the sum of the weights of all sample
points in A.
Therefore,
I. 0 ≤ P(A) ≤ 1

II. P(φ) = 0

III. P(S) = 1.
Basic concepts
When the probability of an event is close to zero, the occurrence
of the event is relatively unlikely. For example, if the chances
that you will win a certain lottery are 0.001, or one in one
thousand, you probably won’t win, unless of course, you are very
‘‘lucky.’’

When the probability of an event is 0.5 or 1/2, there is a 50-50
chance that the event will happen; the chances of it happening and
not happening are the same.
Basic concepts
When the probability of an event is close to one,
the event is almost sure to occur. For example, if
the chance of it snowing tomorrow is 90%, more
than likely, you’ll see some snow.
Empirical Probability [1]
Probabilities can be computed for situations that do
not use sample spaces. In such cases, frequency
distributions are used and the probability is called
empirical probability.
Rank         Frequency
Freshmen     4
Sophomores   6
Juniors      8
Seniors      7
Total        25
Empirical Probability [2]

P(E) = (Frequency of E) / (Sum of the frequencies)

For example, if one student is selected at random from the group
above, the probability that the student is a freshman is
P(E) = 4/25.

Empirical probability is sometimes called relative frequency
probability.
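A minimal sketch (added here for illustration) of computing
empirical probabilities from the frequency table above:

# Frequencies from the class-rank table above
frequencies = {"Freshmen": 4, "Sophomores": 6, "Juniors": 8, "Seniors": 7}
total = sum(frequencies.values())  # 25

def empirical_probability(event):
    # P(E) = frequency of E / sum of all frequencies
    return frequencies[event] / total

print(empirical_probability("Freshmen"))  # 0.16  (= 4/25)
print(empirical_probability("Juniors"))   # 0.32  (= 8/25)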
Law of large numbers

In probability theory, the law of large numbers (LLN) is a theorem
that describes the result of performing the same experiment a large
number of times.

According to the law, the average of the results obtained from a
large number of trials should be close to the expected value, and
will tend to become closer as more trials are performed.
Law of large numbers

The LLN is important because it "guarantees" stable long-term
results for the averages of some random events.

For example, while a casino may lose money in a single spin of the
roulette wheel, its earnings will tend towards a predictable
percentage over a large number of spins.
Outcomes of a die: 1, 2, 3, 4, 5, 6
Sum of die outcomes = 21
Mean = 21/6 = 3.5

An illustration of the law of large numbers using a particular run
of rolls of a single die: as the number of rolls in this run
increases, the average of the values of all the results approaches
3.5 (see the simulation sketch below).
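A short simulation sketch (added here, not from the original
slides) that reproduces this behaviour; the roll counts at which
the running average is printed are arbitrary:

import random

random.seed(1)  # make this particular run reproducible

running_sum = 0
checkpoints = {10, 100, 1_000, 10_000, 100_000}

for n in range(1, 100_001):
    running_sum += random.randint(1, 6)  # one roll of a fair die
    if n in checkpoints:
        print(f"after {n:>6} rolls, running average = {running_sum / n:.3f}")

# The printed averages drift toward the expected value 3.5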
Law of Large Numbers

Questions:
What happens if we toss a coin 100 times? Will we get 50 heads?

What will happen if we toss a coin 1000 times? Will we get exactly
500 heads?
Law of Large Numbers

Solution: Probably not.

However, as the number of tosses increases, the ratio of the number
of heads to the total number of tosses will get closer to 1/2.

This phenomenon is known as the law of large numbers, as the
coin-toss sketch below illustrates.
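A minimal coin-toss sketch (again added for illustration) showing
the proportion of heads settling near 1/2:

import random

random.seed(7)  # reproducible illustration

heads = 0
for n in range(1, 1_000_001):
    heads += random.random() < 0.5  # one fair coin toss; True counts as 1
    if n in (100, 1_000, 10_000, 100_000, 1_000_000):
        print(f"{n:>9} tosses: proportion of heads = {heads / n:.4f}")

# We rarely get exactly n/2 heads, but the proportion settles near 0.5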
