Ics 2328 Statistical Modeling Notes 1

Uploaded by

austinbodi

ICS 2328 STATISTICAL MODELING

CHAPTER 1

Statistics: Introduction

Definitions

Statistics
Collection of methods for planning experiments, obtaining data, and then organizing, summarizing,
presenting, analyzing, interpreting, and drawing conclusions.
Variable
Characteristic or attribute that can assume different values
Random Variable
A variable whose values are determined by chance.
Population
All subjects possessing a common characteristic that is being studied.
Sample
A subgroup or subset of the population.
Parameter
Characteristic or measure obtained from a population.
Statistic (not to be confused with Statistics)
Characteristic or measure obtained from a sample.
Descriptive Statistics
Collection, organization, summarization, and presentation of data.
Inferential Statistics
Generalizing from samples to populations using probabilities. Performing hypothesis testing,
determining relationships between variables, and making predictions.
Qualitative Variables
Variables which assume non-numerical values.
Quantitative Variables
Variables which assume numerical values.
Discrete Variables
Variables which assume a finite or countable number of possible values. Usually obtained by
counting.
Continuous Variables
Variables which can assume any value within an interval, so the possible values cannot be counted. Usually obtained by measurement.
Nominal Level
Level of measurement which classifies data into mutually exclusive, all inclusive categories in
which no order or ranking can be imposed on the data.
Ordinal Level
Level of measurement which classifies data into categories that can be ranked. Precise differences between
the ranks do not exist.
Interval Level
Level of measurement which classifies data that can be ranked and differences are meaningful.
However, there is no meaningful zero, so ratios are meaningless.
Ratio Level
Level of measurement which classifies data that can be ranked, differences are meaningful, and
there is a true zero. True ratios exist between the different units of measure.
Random Sampling
Sampling in which the data is collected using chance methods or random numbers.
Systematic Sampling
Sampling in which data is obtained by selecting every kth object.
Convenience Sampling
Sampling in which data that is readily available is used.
Stratified Sampling
Sampling in which the population is divided into groups (called strata) according to some
characteristic. Each of these strata is then sampled using one of the other sampling techniques.
Cluster Sampling
Sampling in which the population is divided into groups (usually geographically). Some of these
groups are randomly selected, and then all of the elements in those groups are selected.

Notes

Population vs Sample

The population includes all objects of interest whereas the sample is only a portion of the population.
Parameters are associated with populations and statistics with samples. Parameters are usually denoted
using Greek letters (mu, sigma) while statistics are usually denoted using Roman letters (x, s).

There are several reasons why we don't work with populations. They are usually large, and it is often
impossible to get data for every object we're studying. Sampling does not usually occur without cost, and
the more items surveyed, the larger the cost.

We compute statistics, and use them to estimate parameters. The computation is the first part of the
statistics course (Descriptive Statistics) and the estimation is the second part (Inferential Statistics).

Discrete vs Continuous

Discrete variables are usually obtained by counting. There are a finite or countable number of choices
available with discrete data. You can't have 2.63 people in the room.

Continuous variables are usually obtained by measuring. Length, weight, and time are all examples of
continuous variables. Since continuous variables are real numbers, we usually round them. This implies a
boundary depending on the number of decimal places. For example: 64 is really anything 63.5 <= x <
64.5. Likewise, if there are two decimal places, then 64.03 is really anything 64.025 <= x < 64.035.
Boundaries always have one more decimal place than the data and end in a 5.
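
The boundary rule can be checked mechanically. The following is a minimal Python sketch, not part of the original notes; the function name implied_boundaries is made up for illustration, and it assumes the recorded value is supplied as text so that the number of decimal places is known.

```python
from decimal import Decimal

def implied_boundaries(recorded: str):
    """Return the (lower, upper) boundaries implied by a rounded value.

    `recorded` is text such as "64" or "64.03" (an assumption of this
    sketch), because the number of decimal places matters.
    """
    value = Decimal(recorded)
    places = -value.as_tuple().exponent          # 0 for "64", 2 for "64.03"
    half_unit = Decimal(5).scaleb(-(places + 1))  # 0.5, 0.005, ...
    return value - half_unit, value + half_unit

print(implied_boundaries("64"))     # lower 63.5, upper 64.5
print(implied_boundaries("64.03"))  # lower 64.025, upper 64.035
```

Each boundary has one more decimal place than the recorded value and ends in a 5, as the rule states.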

Levels of Measurement

There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These go from lowest level
to highest level. Data is classified according to the highest level which it fits. Each additional level adds
something the previous level didn't have.

• Nominal is the lowest level. Only names are meaningful here.

• Ordinal adds an order to the names.

• Interval adds meaningful differences
• Ratio adds a zero so that ratios are meaningful.

Types of Sampling

There are five types of sampling: Random, Systematic, Convenience, Cluster, and Stratified.

• Random sampling is analogous to putting everyone's name into a hat and drawing out several
names. Each element in the population has an equal chance of being selected. While this is the
preferred way of sampling, it is often difficult to do. It requires that a complete list of every
element in the population be obtained. Computer generated lists are often used with random
sampling. You can generate random numbers using the TI-82 calculator.
• Systematic sampling is easier to do than random sampling. In systematic sampling, the list of
elements is "counted off". That is, every kth element is taken. This is similar to lining everyone
up and numbering off "1,2,3,4; 1,2,3,4; etc". When done numbering, all people numbered 4
would be used.
• Convenience sampling is very easy to do, but it's probably the worst technique to use. In
convenience sampling, readily available data is used. That is, the first people the surveyor runs
into.
• Cluster sampling is accomplished by dividing the population into groups -- usually
geographically. These groups are called clusters or blocks. The clusters are randomly selected,
and each element in the selected clusters is used.
• Stratified sampling also divides the population into groups called strata. However, this time it is
by some characteristic, not geographically. For instance, the population might be separated into
males and females. A sample is taken from each of these strata using either random, systematic,
or convenience sampling.
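
As a rough illustration of four of these techniques, here is a minimal Python sketch. The function names, the toy populations, and the fixed seed are all assumptions made for this example, not part of the notes. Convenience sampling is left out because it involves no procedure beyond taking whatever is at hand.

```python
import random

def simple_random_sample(population, n, rng):
    # Names in a hat: every element has the same chance of selection.
    return rng.sample(population, n)

def systematic_sample(population, k):
    # Count off 1..k repeatedly and keep everyone numbered k.
    return population[k - 1::k]

def cluster_sample(clusters, num_clusters, rng):
    # Randomly pick whole clusters, then use every element in them.
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [member for name in chosen for member in clusters[name]]

def stratified_sample(strata, n_per_stratum, rng):
    # Sample inside every stratum (here with simple random sampling).
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
people = list(range(1, 21))
clusters = {"north": [1, 2, 3], "south": [4, 5], "east": [6, 7, 8, 9]}
strata = {"group_a": [1, 2, 3, 4, 5], "group_b": [6, 7, 8, 9, 10]}

print(simple_random_sample(people, 5, rng))
print(systematic_sample(people, 4))          # [4, 8, 12, 16, 20]
print(cluster_sample(clusters, 2, rng))
print(stratified_sample(strata, 2, rng))
```

Note the difference between the last two: cluster sampling uses everyone in a few randomly chosen groups, while stratified sampling takes some elements from every group.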

CHAPTER 2

Definitions

Raw Data
Data collected in original form.
Frequency
The number of times a certain value or class of values occurs.
Frequency Distribution
The organization of raw data in table form with classes and frequencies.
Categorical Frequency Distribution
A frequency distribution in which the data is only nominal or ordinal.
Ungrouped Frequency Distribution
A frequency distribution of numerical data. The raw data is not grouped.
Grouped Frequency Distribution
A frequency distribution where several numbers are grouped into one class.
Class Limits
Separate one class in a grouped frequency distribution from another. The limits could actually
appear in the data and have gaps between the upper limit of one class and the lower limit of the
next.

Class Boundaries
Separate one class in a grouped frequency distribution from another. The boundaries have one more
decimal place than the raw data and therefore do not appear in the data. There is no gap between
the upper boundary of one class and the lower boundary of the next class. The lower class boundary
is found by subtracting 0.5 units from the lower class limit and the upper class boundary is found
by adding 0.5 units to the upper class limit.
Class Width
The difference between the upper and lower boundaries of any class. The class width is also the
difference between the lower limits of two consecutive classes or the upper limits of two
consecutive classes. It is not the difference between the upper and lower limits of the same class.
Class Mark (Midpoint)
The number in the middle of the class. It is found by adding the upper and lower limits and dividing
by two. It can also be found by adding the upper and lower boundaries and dividing by two.
Cumulative Frequency
The number of values less than the upper class boundary for the current class. This is a running
total of the frequencies.
Relative Frequency
The frequency divided by the total frequency. This gives the percent of values falling in that class.
Cumulative Relative Frequency (Relative Cumulative Frequency)
The running total of the relative frequencies or the cumulative frequency divided by the total
frequency. Gives the percent of the values which are less than the upper class boundary.
Histogram
A graph which displays the data by using vertical bars of various heights to represent frequencies.
The horizontal axis can be either the class boundaries, the class marks, or the class limits.
Frequency Polygon
A line graph. The frequency is placed along the vertical axis and the class midpoints are placed
along the horizontal axis. These points are connected with lines.
Ogive
A frequency polygon of the cumulative frequency or the relative cumulative frequency. The vertical
axis is the cumulative frequency or relative cumulative frequency. The horizontal axis is the class
boundaries. The graph always starts at zero at the lowest class boundary and will end up at the total
frequency (for a cumulative frequency) or 1.00 (for a relative cumulative frequency).
Pareto Chart
A bar graph for qualitative data with the bars arranged according to frequency.
Pie Chart
Graphical depiction of data as slices of a pie. The frequency determines the size of the slice. The
number of degrees in any slice is the relative frequency times 360 degrees.
Pictograph
A graph that uses pictures to represent data.
Stem and Leaf Plot
A data plot which uses part of the data value as the stem and the rest of the data value (the leaf) to
form groups or classes. This is very useful for sorting data quickly.
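
To show how a stem and leaf plot groups and sorts data, here is a small Python sketch. It assumes non-negative whole numbers, with the tens as the stem and the units digit as the leaf; the function name stem_and_leaf is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group non-negative integers by tens (stem) and units (leaf)."""
    plot = defaultdict(list)
    for value in sorted(values):       # sorting puts the leaves in order
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

for stem, leaves in stem_and_leaf([22, 15, 30, 12, 21, 22]).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
# 1 | 2 5
# 2 | 1 2 2
# 3 | 0
```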

Statistics: Grouped Frequency Distributions

Guidelines for classes

1. There should be between 5 and 20 classes.

2. The class width should be an odd number. This will guarantee that the class midpoints are integers
instead of decimals.
3. The classes must be mutually exclusive. This means that no data value can fall into two different
classes.
4. The classes must be all inclusive or exhaustive. This means that all data values must be included.
5. The classes must be continuous. There are no gaps in a frequency distribution. Classes that have
no values in them must be included (unless it's the first or last class which are dropped).
6. The classes must be equal in width. The exception here is the first or last class. It is possible to have
a "below ..." or "... and above" class. This is often used with ages.
Creating a Grouped Frequency Distribution
1. Find the largest and smallest values
2. Compute the Range = Maximum - Minimum
3. Select the number of classes desired. This is usually between 5 and 20.
4. Find the class width by dividing the range by the number of classes and rounding up. There are two
things to be careful of here. You must round up, not off. Normally 3.2 would round to be 3, but in
rounding up, it becomes 4. If the range divided by the number of classes gives an integer value (no
remainder), then you can either add one to the number of classes or add one to the class width.
Sometimes you're locked into a certain number of classes because of the instructions. The Bluman
text fails to mention the case when there is no remainder.
5. Pick a suitable starting point less than or equal to the minimum value. You will be able to cover:
"the class width times the number of classes" values. You need to cover one more value than the
range. Follow this rule and you'll be okay: The starting point plus the number of classes times the
class width must be greater than the maximum value. Your starting point is the lower limit of the
first class. Continue to add the class width to this lower limit to get the rest of the lower limits.
6. To find the upper limit of the first class, subtract one from the lower limit of the second class. Then
continue to add the class width to this upper limit to find the rest of the upper limits.
7. Find the boundaries by subtracting 0.5 units from the lower limits and adding 0.5 units to the
upper limits. The boundaries are also half-way between the upper limit of one class and the lower
limit of the next class. Depending on what you're trying to accomplish, it may not be necessary to
find the boundaries.
8. Tally the data.
9. Find the frequencies.
10. Find the cumulative frequencies. Depending on what you're trying to accomplish, it may not be
necessary to find the cumulative frequencies.
11. If necessary, find the relative frequencies and/or relative cumulative frequencies.
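
The steps above can be sketched in Python. This is only an illustration under stated assumptions: the data are whole numbers, the starting point is taken to be the minimum value, and the function name and the dictionary keys are invented for this example.

```python
def grouped_frequency_distribution(data, num_classes):
    """Build a grouped frequency distribution for integer data."""
    low, high = min(data), max(data)
    data_range = high - low
    # Round up, not off. Integer division plus one also adds one to the
    # class width when the division leaves no remainder.
    width = data_range // num_classes + 1
    total = len(data)
    rows, cumulative = [], 0
    for i in range(num_classes):
        lower = low + i * width      # lower limits step by the class width
        upper = lower + width - 1    # one less than the next lower limit
        frequency = sum(lower <= x <= upper for x in data)
        cumulative += frequency
        rows.append({
            "limits": (lower, upper),
            "boundaries": (lower - 0.5, upper + 0.5),
            "frequency": frequency,
            "cumulative": cumulative,
            "relative": frequency / total,
        })
    return rows

scores = [12, 15, 17, 21, 22, 22, 25, 28, 30, 31]
for row in grouped_frequency_distribution(scores, 5):
    print(row)
```

Here the range is 19, so with 5 classes the width is 19 / 5 = 3.8 rounded up to 4, and the check from step 5 holds: 12 + 5 * 4 = 32 is greater than the maximum of 31.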

CHAPTER 3
Statistics: Data Description

Definitions
Statistic
Characteristic or measure obtained from a sample
Parameter
Characteristic or measure obtained from a population
Mean
Sum of all the values divided by the number of values. This can either be a population mean (denoted by
mu) or a sample mean (denoted by x bar)
Median
The midpoint of the data after being ranked (sorted in ascending order). There are as many numbers below
the median as above the median.
Mode
The most frequent number
Skewed Distribution
The majority of the values lie together on one side with a very few values (the tail) to the other side. In a
positively skewed distribution, the tail is to the right and the mean is larger than the median. In a negatively
skewed distribution, the tail is to the left and the mean is smaller than the median.
Symmetric Distribution
The data values are evenly distributed on both sides of the mean. In a symmetric distribution, the mean is
the median.
Weighted Mean
The mean when each value is multiplied by its weight and summed. This sum is divided by the total of the
weights.
Midrange
The mean of the highest and lowest values. (Max + Min) / 2
Range
The difference between the highest and lowest values. Max - Min

Population Variance
The average of the squares of the distances from the population mean. It is the sum of the squares of the
deviations from the mean divided by the population size. The units on the variance are the units of the
population squared.
Sample Variance
Unbiased estimator of a population variance. Instead of dividing by the population size, the sum of the
squares of the deviations from the sample mean is divided by one less than the sample size. The units on
the variance are the units of the population squared.
Standard Deviation
The square root of the variance. The population standard deviation is the square root of the population
variance and the sample standard deviation is the square root of the sample variance. The sample standard
deviation is not an unbiased estimator of the population standard deviation. The units of the standard
deviation are the same as the units of the population/sample.
Coefficient of Variation
Standard deviation divided by the mean, expressed as a percentage. We won't work with the Coefficient of
Variation in this course.
Chebyshev's Theorem

The proportion of the values that fall within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$, where
k > 1. Chebyshev's theorem can be applied to any distribution regardless of its shape.
Empirical or Normal Rule
Only valid when a distribution is bell-shaped (normal). Approximately 68% lies within 1 standard deviation
of the mean; 95% within 2 standard deviations; and 99.7% within 3 standard deviations of the mean.
Standard Score or Z-Score
The value obtained by subtracting the mean and dividing by the standard deviation. When all values are
transformed to their standard scores, the new mean (for Z) will be zero and the standard deviation will be
one.
Percentile
The percent of the population which lies below that value. The data must be ranked to find percentiles.
Quartile
Either the 25th, 50th, or 75th percentiles. The 50th percentile is also called the median.
Decile
Either the 10th, 20th, 30th, 40th, 50th, 60th, 70th, 80th, or 90th percentiles.

Lower Hinge
The median of the lower half of the numbers (up to and including the median). The lower hinge is the first
Quartile unless the remainder when dividing the sample size by four is 3.
Upper Hinge
The median of the upper half of the numbers (including the median). The upper hinge is the 3rd Quartile
unless the remainder when dividing the sample size by four is 3.
Box and Whiskers Plot (Box Plot)
A graphical representation of the minimum value, lower hinge, median, upper hinge, and maximum. Some
textbooks, and the TI-82 calculator, define the five values as the minimum, first Quartile, median, third
Quartile, and maximum.
Five Number Summary
Minimum value, lower hinge, median, upper hinge, and maximum.
InterQuartile Range (IQR)
The difference between the 3rd and 1st Quartiles.
Outlier
An extremely high or low value when compared to the rest of the values.
Mild Outliers
Values which lie between 1.5 and 3.0 times the InterQuartile Range below the 1st Quartile or above the 3rd
Quartile. Note, some texts use hinges instead of Quartiles.
Extreme Outliers
Values which lie more than 3.0 times the InterQuartile Range below the 1st Quartile or above the 3rd
Quartile. Note, some texts use hinges instead of Quartiles.
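
The quartile, IQR, and outlier definitions above can be tried out with a short Python sketch. Quartile conventions differ between textbooks and calculators, so the "inclusive" method of the standard library's statistics.quantiles used here is just one choice and may not match the hinges described above. The function name classify_outlier is hypothetical, and counting a value exactly 1.5 or 3.0 IQRs out on the lower side of each cut is an assumption.

```python
import statistics

def classify_outlier(value, q1, q3):
    """Label a value as 'none', 'mild', or 'extreme' using the IQR."""
    iqr = q3 - q1
    # Distance outside the box; zero or negative means inside it.
    distance = max(q1 - value, value - q3)
    if distance > 3.0 * iqr:
        return "extreme"
    if distance > 1.5 * iqr:
        return "mild"
    return "none"

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
print(q1, median, q3, q3 - q1)

for candidate in (5, 17, 24):
    print(candidate, classify_outlier(candidate, q1, q3))
```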

Stats: Measures of Central Tendency

The term "Average" is vague
Average could mean one of four things: the arithmetic mean, the median, the midrange, or the mode. For this
reason, it is better to specify which average you're talking about.
Mean
This is what people usually intend when they say "average"
Population Mean: $\mu = \frac{\sum x}{N}$
Sample Mean: $\bar{x} = \frac{\sum x}{n}$
Frequency Distribution: $\bar{x} = \frac{\sum x f}{\sum f}$

The mean of a frequency distribution is also the weighted mean.
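
A minimal Python sketch of that weighted mean follows, where each value is weighted by its frequency. The function name frequency_mean and the small data set are assumptions for illustration.

```python
def frequency_mean(values, frequencies):
    """Mean of an ungrouped frequency distribution: sum(x * f) / sum(f)."""
    total_frequency = sum(frequencies)
    weighted_sum = sum(x * f for x, f in zip(values, frequencies))
    return weighted_sum / total_frequency

# The values 3, 4, 5, 5, 6 written as a frequency distribution.
print(frequency_mean([3, 4, 5, 6], [1, 1, 2, 1]))  # 23 / 5 = 4.6
```

For a grouped frequency distribution the same function could be applied to the class marks, which gives an approximation of the mean since the original values are no longer available.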
Median
The data must be ranked (sorted in ascending order) first. The median is the number in the middle.
To find the depth of the median, there are several formulas that could be used, the one that we will use is:
Depth of median = 0.5 * (n + 1)

Raw Data
The median is the number in the "depth of the median" position. If the sample size is even, the depth of the
median will be a decimal -- you need to find the midpoint between the numbers on either side of the depth
of the median.
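
For raw data, the depth rule translates directly into code. The sketch below is illustrative only and the function name median_by_depth is made up.

```python
def median_by_depth(data):
    """Median of raw data using depth = 0.5 * (n + 1)."""
    ranked = sorted(data)
    depth = 0.5 * (len(ranked) + 1)   # 1-based position in the ranked list
    if depth.is_integer():
        return ranked[int(depth) - 1]
    below = ranked[int(depth) - 1]    # value just before the depth
    above = ranked[int(depth)]        # value just after the depth
    return (below + above) / 2

print(median_by_depth([4, 5, 3, 6, 5]))  # depth 3 -> 5
print(median_by_depth([1, 2, 3, 4]))     # depth 2.5 -> 2.5
```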
Ungrouped Frequency Distribution
Find the cumulative frequencies for the data. The first value with a cumulative frequency greater than or equal to the depth
of the median is the median. If the depth of the median is exactly 0.5 more than the cumulative frequency
of the previous class, then the median is the midpoint between the two classes.
Grouped Frequency Distribution
This is the tough one.
Since the data is grouped, you have lost all original information. Some textbooks have you simply take the
midpoint of the class. This is an over-simplification which isn't the true value (but much easier to do). The
correct process is to interpolate.
Find out what proportion of the distance into the median class the median lies by dividing the sample size by 2,
subtracting the cumulative frequency of the previous class, and then dividing all that by the frequency of
the median class.
Multiply this proportion by the class width and add it to the lower boundary of the median class.
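
The interpolation just described can be sketched as follows. This assumes equal class widths and that the lower class boundaries are listed in ascending order; the function name grouped_median is hypothetical.

```python
def grouped_median(lower_boundaries, class_width, frequencies):
    """Interpolated median of a grouped frequency distribution."""
    half = sum(frequencies) / 2          # sample size divided by 2
    cumulative_before = 0
    for lower, frequency in zip(lower_boundaries, frequencies):
        if frequency > 0 and cumulative_before + frequency >= half:
            # Proportion of the way into the median class.
            proportion = (half - cumulative_before) / frequency
            return lower + proportion * class_width
        cumulative_before += frequency
    raise ValueError("frequencies must contain at least one value")

# Classes 0-9, 10-19, 20-29 with frequencies 2, 6, 2.
print(grouped_median([-0.5, 9.5, 19.5], 10, [2, 6, 2]))  # 14.5
```

In the example, n / 2 = 5, the previous cumulative frequency is 2, and the median class holds 6 values, so the median lies (5 - 2) / 6 = 0.5 of the way into the class: 9.5 + 0.5 * 10 = 14.5.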
Mode
The mode is the most frequent data value. There may be no mode if no one value appears more than any
other. There may also be two modes (bimodal), three modes (trimodal), or more than three modes (multi-
modal).
For grouped frequency distributions, the modal class is the class with the largest frequency.
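
A short Python sketch of finding the mode or modes of raw data follows. Returning an empty list when every value occurs equally often follows the "no mode" convention described above; the function name modes is an assumption.

```python
from collections import Counter

def modes(data):
    """Return the most frequent value(s), or [] when there is no mode."""
    counts = Counter(data)
    highest = max(counts.values())
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []  # no value appears more often than any other
    return sorted(value for value, c in counts.items() if c == highest)

print(modes([4, 5, 3, 6, 5]))     # [5]
print(modes([1, 1, 2, 2, 3]))     # [1, 2]  (bimodal)
print(modes([1, 2, 3]))           # []      (no mode)
```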
Midrange
The midrange is simply the midpoint between the highest and lowest values.
Summary
The Mean is used in computing other statistics (such as the variance) and does not exist for open ended
grouped frequency distributions (1). It is often not appropriate for skewed distributions such as salary
information.

The Median is the center number and is good for skewed distributions because it is resistant to change.
The Mode is used to describe the most typical case. The mode can be used with nominal data whereas the
others can't. The mode may or may not exist and there may be more than one value for the mode (2).
The Midrange is not used very often. It is a very rough estimate of the average and is greatly affected by
extreme values (even more so than the mean).

Stats: Measures of Variation

Range
The range is the simplest measure of variation to find. It is simply the highest value minus the lowest value.
RANGE = MAXIMUM - MINIMUM
Since the range only uses the largest and smallest values, it is greatly affected by extreme values, that is -
it is not resistant to change.
Variance
"Average Deviation"
The range only involves the smallest and largest numbers, and it would be desirable to have a statistic which
involved all of the data values.
The first attempt one might make at this is something they might call the average deviation from the mean
and define it as:

$\frac{\sum (x - \mu)}{N}$

The problem is that this summation is always zero. So, the average deviation will always be zero. That is
why the average deviation is never used.
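
The reason the summation vanishes can be written out in one line. This short derivation uses only the definition of the population mean, $\mu = \frac{\sum x}{N}$, so that $\sum x = N\mu$:

```latex
\[
\sum (x - \mu) \;=\; \sum x - N\mu \;=\; N\mu - N\mu \;=\; 0
\]
```

The positive and negative deviations always cancel exactly, whatever the data are.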
Population Variance
So, to keep it from being zero, the deviation from the mean is squared and called the "squared deviation
from the mean". This "average squared deviation from the mean" is called the variance.

$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$

Unbiased Estimate of the Population Variance

One would expect the sample variance to simply be the population variance with the population mean
replaced by the sample mean. However, one of the major uses of statistics is to estimate the corresponding
parameter. That formula has the problem that, on average, the estimated value isn't the same as the
parameter (it tends to underestimate the population variance). To
counteract this, the sum of the squares of the deviations is divided by one less than the sample size.

$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Standard Deviation
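There is a problem with variances. Recall that the deviations were squared. That means that the units were
also squared. To get the units back the same as the original data values, the square root must be taken.

Population: $\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$
Sample: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$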

The sample standard deviation is not the unbiased estimator for the population standard deviation.
The calculator does not have a variance key on it. It does have a standard deviation key. You will have to
square the standard deviation to find the variance.
Sum of Squares (shortcuts)
The sum of the squares of the deviations from the mean is given a shortcut notation and several alternative
formulas.

$SS(x) = \sum (x - \bar{x})^2$

A little algebraic simplification returns:

$SS(x) = \sum x^2 - \frac{(\sum x)^2}{n}$
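
For readers who want to see the algebra, the simplification can be written out step by step. The only facts used are $\bar{x} = \frac{\sum x}{n}$ and that $\bar{x}$ is a constant inside the sums; the notation $SS(x)$ for the sum of squares is an assumed label.

```latex
\begin{align*}
SS(x) = \sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - \frac{2\left(\sum x\right)^2}{n} + \frac{\left(\sum x\right)^2}{n} \\
  &= \sum x^2 - \frac{\left(\sum x\right)^2}{n}
\end{align*}
```

The sample variance is then this sum of squares divided by $n - 1$.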

What's wrong with the first formula, you ask? Consider the following example - the last row contains the totals
for the columns.
1. Total the data values: 23
2. Divide by the number of values to get the mean: 23/5 = 4.6
3. Subtract the mean from each value to get the numbers in the second column.
4. Square each number in the second column to get the values in the third column.
5. Total the numbers in the third column: 5.2
6. Divide this total by one less than the sample size to get the variance: 5.2 / 4 = 1.3

x      x - x bar            (x - x bar)^2
4      4 - 4.6 = -0.6       ( -0.6 )^2 = 0.36
5      5 - 4.6 = 0.4        ( 0.4 )^2 = 0.16
3      3 - 4.6 = -1.6       ( -1.6 )^2 = 2.56
6      6 - 4.6 = 1.4        ( 1.4 )^2 = 1.96
5      5 - 4.6 = 0.4        ( 0.4 )^2 = 0.16
23     0.00 (Always)        5.2

Not too bad, you think. But this can get pretty bad if the sample mean doesn't happen to be a "nice" rational
number. Think about having a mean of 19/7 = 2.714285714285... Those subtractions get nasty, and when
you square them, they're really bad. Another problem with the first formula is that it requires you to know
the mean ahead of time. For a calculator, this would mean that you have to save all of the numbers that
were entered. The TI-82 does this, but most scientific calculators don't.
Now, let's consider the shortcut formula. The only things that you need to find are the sum of the values
and the sum of the values squared. There is no subtraction and no decimals or fractions until the end. The
last row contains the sums of the columns, just like before.
1. Record each number in the first column and the square of each number in the second column.
2. Total the first column: 23
3. Total the second column: 111
4. Compute the sum of squares: 111 - 23*23/5 = 111 - 105.8 = 5.2
5. Divide the sum of squares by one less than the sample size to get the variance = 5.2 / 4 = 1.3

x      x^2
4      16
5      25
3      9
6      36
5      25
23     111
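
Both routes to the sum of squares can be compared in a few lines of Python. This is a sketch using the same five values as the worked example; the function names are invented for illustration.

```python
import math

def ss_definition(data):
    """Sum of squared deviations from the mean (needs the mean first)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)

def ss_shortcut(data):
    """Shortcut: sum of x^2 minus (sum of x)^2 / n."""
    n = len(data)
    return sum(x * x for x in data) - sum(data) ** 2 / n

def sample_variance(data):
    # Divide by one less than the sample size.
    return ss_shortcut(data) / (len(data) - 1)

data = [4, 5, 3, 6, 5]
print(ss_definition(data), ss_shortcut(data))  # both approximately 5.2
print(sample_variance(data))                   # approximately 1.3
assert math.isclose(ss_definition(data), ss_shortcut(data))
```

One caveat when moving from hand calculation to floating-point code: the shortcut subtracts two large, nearly equal numbers when the values are large relative to their spread, which can lose precision, so the two functions may disagree slightly on such data.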

Chebyshev's Theorem
The proportion of the values that fall within k standard deviations of the mean will be at least $1 - \frac{1}{k^2}$,
where k is a number greater than 1.

"Within k standard deviations" interprets as the interval: $\bar{x} - ks$ to $\bar{x} + ks$.

Chebyshev's Theorem is true for any sample set, no matter what the distribution.
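
To make the bound concrete, the sketch below computes the Chebyshev lower bound, 1 - 1/k^2, for a few values of k and compares it with the actual proportion in a small sample. The function names are hypothetical and the sample is the five-value example used earlier.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def proportion_within(data, k):
    """Observed proportion of values in mean - k*s to mean + k*s."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)   # sample standard deviation (n - 1)
    inside = sum(mean - k * s <= x <= mean + k * s for x in data)
    return inside / len(data)

for k in (1.5, 2, 3):
    print(k, chebyshev_lower_bound(k))

sample = [4, 5, 3, 6, 5]
print(proportion_within(sample, 2), ">=", chebyshev_lower_bound(2))
```

For k = 2 the bound is 0.75 and for k = 3 it is about 0.889. The theorem only promises "at least" this much, so observed proportions are often well above the bound.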
Empirical Rule
The empirical rule is only valid for bell-shaped (normal) distributions. The following statements are true.
• Approximately 68% of the data values fall within one standard deviation of the mean.
• Approximately 95% of the data values fall within two standard deviations of the mean.
• Approximately 99.7% of the data values fall within three standard deviations of the mean.
The empirical rule will be revisited later in the chapter on normal probabilities.
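
The three percentages can be checked against the normal distribution itself. This sketch uses Python's standard-library statistics.NormalDist; the function name normal_proportion_within is made up for illustration.

```python
from statistics import NormalDist

def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(normal_proportion_within(k), 4))
```

The results are roughly 0.68, 0.95, and 0.997, which is where the rounded figures in the empirical rule come from.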
