
Data Managementmmw

Statistics is a branch of applied mathematics focused on collecting, organizing, and interpreting data to make inferences about populations based on samples. It includes descriptive statistics, which summarizes data, and inferential statistics, which generalizes findings from samples to larger populations. Key concepts include population vs. sample, types of variables, data collection methods, and measures of central tendency and variability.

Uploaded by

asassin831
MATHEMATICS AS A TOOL
DATA MANAGEMENT
What is STATISTICS?
■ Statistics is a branch of applied mathematics
concerned with collecting, organizing, and
interpreting data. It attempts to infer the
properties of a large collection of data from
inspection of a sample of the collection, thereby
allowing educated guesses to be made with a
minimum of expense.
MAIN BRANCHES OF STATISTICS
■ Descriptive Statistics refers to the collection, presentation,
and summary of data (either using charts and graphs or using a
numerical summary).

■ Inferential Statistics refers to generalizing from a sample to a
population, estimating unknown population parameters,
drawing conclusions, and making decisions.
Population and Sample
Population refers to all the items (finite or infinite) that we
are interested in. It consists of the totality of the
observations, individuals, or objects in which the
investigator/researcher is interested.

Sample is a subset or portion of the population. It involves
looking only at some items selected from a population.
Parameter – is a value calculated using all the data from a
population.
Statistic – is a value calculated using the data from the sample.
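
The difference between a parameter and a statistic can be sketched in a few lines of Python. The ages below are a hypothetical population invented for illustration (they are not from the slides), and the sample size of 5 and the fixed seed are likewise assumptions.

```python
import random
from statistics import mean

# Hypothetical population: ages of 20 club members (illustrative only).
population = [18, 19, 19, 20, 21, 21, 22, 22, 23, 24,
              25, 25, 26, 27, 28, 30, 31, 33, 35, 40]

random.seed(1)                         # fixed seed so the run is repeatable
sample = random.sample(population, 5)  # simple random sample of 5 items

parameter = mean(population)  # computed from every item in the population
statistic = mean(sample)      # computed from the sampled items only

print("Population mean (parameter):", parameter)
print("Sample mean (statistic):", statistic)
```

The sample mean will usually be close to, but not equal to, the population mean; inferential statistics deals with how far apart the two are likely to be.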

What is a Variable?
A VARIABLE is a characteristic of interest about an object under
investigation that can take on different possible outcomes, such as age,
hair color, height, weight, and religious preference.
Two Kinds of Variables
■ QUALITATIVE VARIABLES – These are variables that can be placed into
distinct categories, according to some characteristics or attributes.
■ QUANTITATIVE VARIABLES – These are numerical and
can be ordered or ranked.
■ Quantitative variables are of two types: Discrete and Continuous.
■ Discrete variables take values obtained by counting (frequencies).
■ Continuous variables take values obtained by measurement.
DATA
■ Data is the set of values of a variable collected from each of
the subjects that belong to the sample. It refers to a collection
of natural phenomena descriptors such as results from
experience, observations or experiments, or a set of premises.
It may consist of numbers, words, or images.
■ Data can be classified according to the type of variable for
which it was drawn. There are two general types of data
according to how the data vary across cases:
Types of Statistical Data
■ 1. Numerical data: These data have meaning as a measurement,
such as a person’s height, weight, IQ, or blood pressure, or
the number of shares of stock a person owns.
■ 2. Categorical data: Categorical data represent characteristics
such as a person’s gender, marital status, hometown, or the
types of movies they like. Categorical data can take on
numerical values (such as “1” indicating male and “2”
indicating female) but those numbers don’t have mathematical
meaning.
Four levels of Measurement
■ 1. Nominal – the lowest of the four ways to characterize data. It
deals with names, categories, or labels (e.g., colors of eyes, yes
or no responses to a survey, favorite breakfast cereal, and
the number on the back of a football jersey).

■ 2. Ordinal – the data at this level can be ordered, but
differences between the data are not meaningful (e.g., ten cities
are ranked from one to ten, but differences between the ranks
don’t make much sense; letter grades can be ordered so that A is
higher than B, but without any other information).
■ 3. Interval – deals with data that can be ordered, and
in which differences between the data do make
sense. But data at this level have no true zero as a
starting point (e.g., Fahrenheit and Celsius scales of temperature).

■ 4. Ratio – the highest level of measurement. Data
possess all of the features of the interval level, in
addition to an absolute zero. Due to the presence of a
zero, it now makes sense to compare the ratios of
measurements.
DATA COLLECTION METHOD
■ Methods of Collecting Data
1. In-Person Interviews
Pros: In-depth and a high degree of confidence in the data
Cons: Time-consuming, expensive, and can be dismissed as
anecdotal.
2. Mail Surveys
Pros: Can reach anyone and everyone – no barrier
Cons: Expensive, data collection errors, lag time
3. Phone Surveys
Pros: High degree of confidence in the data collected, can reach almost
anyone
Cons: Expensive, cannot self-administer, need to hire an agency
4. Web/Online Surveys
Pros: Cheap, can self-administer, very low probability of data errors.
Cons: Not all your customers might have an email address or be on the
internet; customers may be wary of divulging information online.
Three Ways of Presenting Data

1. Textual presentation uses words, statements, or paragraphs with
numerals to describe data.
Example:
There are 42,036 barangays in the Philippines. The largest
barangay in terms of population size is Barangay 176 in Caloocan
City with 247 thousand persons. It is followed by Commonwealth
in Quezon City (198,295) and Batasan Hills in Quezon City
(161,409). Twelve other
barangays posted a population size of more than a hundred
thousand.
■ Tabular Presentation of Data
Tables present clear and organized data. A table must be clear and
simple but complete.
A good table should include the following parts:
Table number and title – these are placed above the table. The
title is usually written right after the table number.
Caption/subhead – this refers to the headings of the columns and rows.
Body – it contains all the data under each subhead.
Source – it indicates whether the data are secondary, in which case
the source should be acknowledged.
Graphical Method of Presenting the Data
A graph or chart portrays the visual presentation of data using symbols such
as lines, dots, bars or slices. It depicts the trend of a certain set of
measurements or shows comparison between two or more sets of data or
quantities.
Frequency Distribution
■ Frequency measures how often something occurs.
■ Example 1
Jack joins football practice every Wednesday morning and every Sunday
morning and afternoon. The frequency of Jack’s football practice
every week is 3 (2 on Sunday and 1 on Wednesday). By counting
frequencies we can make a Frequency Distribution Table.
■ Example 2
Jack’s team has scored the following numbers of goals in
their games:
3, 1, 2, 1, 3, 2, 4, 2, 3, 2, 5, 4, 3, 2.
Jack put the numbers in order, then counted:
How often 1 occurs (2 times),
how often 2 occurs (5 times),
how often 3 occurs (4 times),
how often 4 occurs (2 times),
how often 5 occurs (1 time).
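
The tally in Example 2 can be reproduced with a short Python sketch. The variable names are illustrative, and `collections.Counter` is just one convenient way to count; it is not part of the original slides.

```python
from collections import Counter

# Goals scored by Jack's team, as listed in Example 2.
goals = [3, 1, 2, 1, 3, 2, 4, 2, 3, 2, 5, 4, 3, 2]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(goals)

# Print the frequency distribution table in order of goals scored.
print("Goals  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```

The frequencies add up to 14, the number of games, which is a quick check that no observation was missed.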
Graphical Representation of Frequency Distribution
A. Bar Graph is a pictorial representation of statistical data in which the length of each
rectangle is proportional to the value of the variable. Bar graphs are
generally used to compare the values of several variables at a time. The length
of the bars (horizontal or vertical) represents the frequency of the variable, and the graph is
applicable to discrete categories only.
B. Line Graph or Line Chart is a graphical display of information that changes continuously
over time. Within a line graph, points are connected by lines to show the continuous
change. The lines in a line graph can descend and ascend based on the data. We can also
use it to compare different events, situations, and information.
C. Pie Chart is a type of graph that displays data in a circular graph. The pieces of the graph are
proportional to the fraction of the whole in each category. Each slice of the pie is relative to the
size of that category in the group as a whole. The entire “pie” represents 100 percent of a
whole, while the pie “slices” represent portions of the whole.
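
As a rough illustration of how pie slices relate to frequencies, the sketch below converts the goal counts from Example 2 into percentages and slice angles. Applying a pie chart to this data set is an assumption made for illustration only; the slides do not draw one.

```python
from collections import Counter

# Goals scored by Jack's team, as listed in Example 2.
goals = [3, 1, 2, 1, 3, 2, 4, 2, 3, 2, 5, 4, 3, 2]

counts = Counter(goals)
total = sum(counts.values())  # number of games

# Each slice's share of the whole, and its angle out of 360 degrees.
for value in sorted(counts):
    share = counts[value] / total
    print(f"{value} goals: {share:.1%} of games, {360 * share:.1f} degrees")
```

The shares add up to 100 percent and the angles to 360 degrees, matching the idea that the entire pie represents the whole.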
MEASURES OF CENTRAL
TENDENCY
Types of Measures for Center
■ Once the data are collected, it is useful to summarize the data set by
identifying a value around which the data are centered.
Mean – is the numerical balancing point of the data set.
Example:
Add all the numbers, then divide by how many numbers there are.
9, 3, 1, 8, 3, 6
9 + 3 + 1 + 8 + 3 + 6 = 30
30 ÷ 6 = 5
The mean is 5.
Median – is the middle number or the mean of the two middle
numbers in an ordered set of data.
Example:
Order the set of numbers; the median is the middle number.
1, 3, 3, 6, 8, 9
There are two middle numbers, 3 and 6, so the median is (3 + 6) ÷ 2 = 4.5.

Mode – is the most frequently occurring number in a data set.
Example:
The most common number
9, 3, 1, 8, 3, 6
The mode is 3.
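
The three worked examples can be checked with Python's standard `statistics` module. This is only a sketch for verifying the arithmetic; the slides themselves compute everything by hand.

```python
from statistics import mean, median, mode

# The data set used in the mean, median, and mode examples.
data = [9, 3, 1, 8, 3, 6]

print("Mean:", mean(data))      # sum of the values divided by their count
print("Median:", median(data))  # average of the two middle values, 3 and 6
print("Mode:", mode(data))      # the most frequently occurring value
```

Note that `median` sorts the data internally, so the list does not need to be ordered first.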
Types of Measures of Dispersion
or Variability
■ Another important feature that can help us understand more
about a data set is the manner in which the data are distributed.
■ Range is the difference between the largest value
(maximum) and the smallest value (minimum) in the data.
■ Standard deviation is an extremely important measure of
spread that is based on the mean. It measures the
typical deviation of the data points from the mean.
■ Variance is the square of the standard deviation of the data.
It does not use the same unit of measure as the original data;
it is expressed in squared units.
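
A minimal sketch of these three measures, reusing the data set 9, 3, 1, 8, 3, 6 from the central tendency examples. The slides do not say whether the data should be treated as a whole population or as a sample, so both conventions are shown; that choice is an assumption.

```python
from statistics import pstdev, pvariance, stdev, variance

# Same data set as in the mean/median/mode examples (mean = 5).
data = [9, 3, 1, 8, 3, 6]

data_range = max(data) - min(data)  # largest value minus smallest value

# Population convention: divide the sum of squared deviations by n.
pop_variance = pvariance(data)
pop_std = pstdev(data)

# Sample convention: divide by n - 1 instead.
sample_variance = variance(data)
sample_std = stdev(data)

print("Range:", data_range)
print("Population variance and standard deviation:", pop_variance, pop_std)
print("Sample variance and standard deviation:", sample_variance, sample_std)
```

In both conventions the variance is the square of the corresponding standard deviation, which is why it is expressed in squared units.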
Measures of Relative Position

■ Used to describe the position of a data value in
relation to the rest of the data.
■ Types:
■ 1. Quartiles 2. Percentiles 3. Deciles
■ Quartiles
■ Quartiles divide an ordered data set into four equal parts
(quarters). We use subscript notation to label the quartiles: Q1,
Q2 and Q3. The first quartile, Q1, is 1/4 (or 25%) of the way
through the data – the lower quartile. The second quartile, Q2,
is 1/2 (or 50%) of the way through the data – the median. The
third quartile, Q3, is 3/4 (or 75%) of the way through the data – the
upper quartile.
E.g.:
3 4 4 5 6 8 10
4 is the lower quartile
5 is the median
8 is the upper quartile
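
The quartile example can be reproduced with the sketch below. It assumes the "median of halves" convention (the median itself is left out of both halves when the count is odd), which matches the values given in the example. Other quartile conventions exist and can give slightly different results; the function name `quartiles` is illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    lower = ordered[: n // 2]        # values before the median position
    upper = ordered[(n + 1) // 2 :]  # values after the median position
    return median(lower), median(ordered), median(upper)

q1, q2, q3 = quartiles([3, 4, 4, 5, 6, 8, 10])
print(q1, q2, q3)  # 4 5 8
```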
■ Percentiles
■ Values of the variable that divide a ranked
set into 100 subsets.
For example, P30 would be at 30%.
Percentile Example
The 78th percentile means 78% of the values are
smaller than the given value.
Does making the 80th percentile mean that
you made an 80% on the test?
No. It means that you scored higher than 80% of the
test takers, whatever your own score was.
■ Deciles
A decile is a quantitative method of splitting up a set of ranked data into 10 equally
large subsections.
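
The sketch below shows one way to compute a percentile rank and decile cut points. The test scores are hypothetical, and `percentile_rank` is an illustrative helper that follows the "percent of values smaller than the given value" reading used in the percentile example.

```python
from statistics import quantiles

def percentile_rank(values, x):
    """Percent of the values that are strictly smaller than x."""
    below = sum(1 for v in values if v < x)
    return 100 * below / len(values)

# Hypothetical test scores (illustrative only, not from the slides).
scores = [55, 60, 62, 68, 70, 74, 79, 83, 88, 95]

# A score of 83 is above 7 of the 10 scores.
print(percentile_rank(scores, 83))  # 70.0

# Nine cut points split the ranked data into 10 equally large parts.
decile_cuts = quantiles(scores, n=10)
print(decile_cuts)
```

A score of 83 lands at the 70th percentile here, which illustrates that a percentile describes position relative to others and is not the same thing as the score itself.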

Z-scores
A z-score represents the number of standard deviations a data value falls
above or below the mean. It is used as a way to measure relative position.
Z-score formula
z = (x − mean) / standard deviation, where x is the data value.
■ Example
■ A student scored 65 on a math test that
had a mean of 50 and a standard deviation of 10. She
scored 30 on a history test with a mean of 25 and a
standard deviation of 5. Compare her relative position on
the two tests.
Answer
■ Math: z = (65 − 50)/10 = 15/10 = 1.5
■ History: z = (30 − 25)/5 = 5/5 = 1
The student did better in math because
the z-score was higher.
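
The comparison above can be written as a short Python sketch. The function name `z_score` and its parameter names are illustrative choices, not something defined in the slides.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that value lies above or below the mean."""
    return (value - mean) / std_dev

math_z = z_score(65, 50, 10)    # math test: mean 50, standard deviation 10
history_z = z_score(30, 25, 5)  # history test: mean 25, standard deviation 5

print("Math z-score:", math_z)        # 1.5
print("History z-score:", history_z)  # 1.0
print("Better relative position:", "math" if math_z > history_z else "history")
```

Because z-scores are expressed in standard deviations, they allow scores from tests with different means and spreads to be compared directly.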
