
Statistics Analysis With Software Application

This document provides an introduction to statistical concepts. It defines statistics as the science of collecting, organizing, summarizing, and analyzing information to draw conclusions. Statistics is important because it allows people to make evidence-based decisions, but it also has limitations. The document outlines the process of statistics, which involves identifying a research question, collecting data, organizing and summarizing the data using descriptive statistics, and drawing conclusions about a population based on a sample using inferential statistics. It also distinguishes between different types of variables, such as qualitative vs. quantitative, discrete vs. continuous, and different levels of measurement for variables.

Uploaded by

LEZIL ECLIPSE
Copyright
© All Rights Reserved

Statistical Analysis with Software Application

MODULE 1:
INTRODUCTION TO THE STATISTICAL CONCEPTS

Objectives:
After successful completion of this module, you should be able to:
• Define statistics.
• Enumerate the importance and limitations of statistics.
• Explain the process of statistics.
• Know the difference between descriptive and inferential statistics.
• Distinguish between qualitative and quantitative variables.
• Distinguish between discrete and continuous variables.
• Determine the level of measurement of a variable.

DEFINITION OF STATISTICS
Statistics plays a major role in many aspects of our lives. It is used in sports, for example, to help a general
manager decide which player might be the best fit for a team. It is used in politics to help candidates understand
how the public feels about various policies. And statistics is used in medicine to help determine the effectiveness
of new drugs. Used appropriately, statistics can enhance our understanding of the world around us. Used
inappropriately, it can lend support to inaccurate beliefs. Understanding statistical methods will provide you with
the ability to analyze and critique studies and the opportunity to become an informed consumer of information.
Understanding statistical methods will also enable you to distinguish solid analysis from bogus “facts.”
Statistics is the science of collecting, organizing, summarizing, and analyzing information to draw conclusions or
answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
What information is referred to in the definition?
The information referred to in the definition is the data. According to the Merriam-Webster dictionary, data are
“factual information used as a basis for reasoning, discussion, or calculation”.
Data can be numerical, as in height, or nonnumerical, as in gender. In either case, data describe characteristics
of an individual.
Field of Statistics
A. Mathematical Statistics - The study and development of statistical theory and methods in the abstract.
B. Applied Statistics - The application of statistical methods to solve real problems involving randomly generated
data and the development of new statistical methodology motivated by real problems. Example branches of Applied
Statistics: psychometrics, econometrics, and biostatistics.
IMPORTANCE & LIMITATIONS OF STATISTICS:
Importance of Statistics . . .
• Statistics is important because it enables people to make decisions based on empirical evidence.
• Statistics provides us with the tools needed to convert massive data into pertinent information that can be
used in decision making.
• Statistics can provide us with information that we can use to make sensible decisions.
Limitations of Statistics . . .
• Statistics is not suitable for the study of qualitative phenomena.
• Statistics does not study individuals.
• Statistical laws are not exact.
• Statistical tables may be misused.
• Statistics is only one of the methods of studying a problem.
Definitions:
• The universe is the set of all entities under study.
• A population is the total or entire group of individuals or observations from which information is desired
by a researcher. Apart from persons, a population may consist of mosquitoes, villages, institutions, etc.
• An individual is a person or object that is a member of the population being studied.
• A sample is a subset of the population.
• A statistic is a numerical summary of a sample.
• A parameter is a numerical summary of a population.
• Descriptive statistics consist of organizing and summarizing data. Descriptive statistics describe data
through numerical summaries, tables, and graphs.
• Inferential statistics uses methods that take a result from a sample, extend it to the population, and
measure the reliability of the result.
PROCESS OF STATISTICS
1. Identify the research objective. A researcher must determine the question(s) he or she wants
answered. The question(s) must clearly identify the population that is to be studied.
2. Collect the information needed to answer the questions. Conducting research on an entire population
is often difficult and expensive, so we typically look at a sample. This step is vital to the statistical
process, because if the data are not collected correctly, the conclusions drawn are meaningless. Do not
overlook the importance of appropriate data collection.
3. Organize and summarize the information. Descriptive statistics allow the researcher to obtain an
overview of the data and can help determine the type of statistical methods the researcher should use.
4. Draw conclusions from the information. In this step the information collected from the sample is
generalized to the population. Inferential statistics uses methods that take results obtained from a
sample, extend them to the population, and measure the reliability of the result.
Take Note!
To distinguish descriptive from inferential . . .
If the entire population is studied, then inferential statistics is not necessary, because descriptive
statistics will provide all the information that we need regarding the population.
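The difference between a parameter and a statistic can be illustrated with a short Python sketch. The data below are simulated for illustration only (a hypothetical population of 200 weights); nothing here comes from the module's own data. When the whole population is available, its mean is a parameter; the mean of a sample drawn from it is a statistic that only estimates the parameter.

```python
import random
import statistics

# Hypothetical population: 200 simulated weights in grams (illustration only).
rng = random.Random(1)
population = [round(rng.gauss(120, 15), 1) for _ in range(200)]

# Parameter: a numerical summary of the whole population.
population_mean = statistics.mean(population)

# Statistic: a numerical summary of a sample drawn from that population.
sample = rng.sample(population, 30)
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

The two printed values will usually be close but not equal, which is why inferential statistics also reports how reliable the sample result is.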
DISTINCTION BETWEEN QUALITATIVE AND QUANTITATIVE VARIABLES
Variables are the characteristics of the individuals within the population.
For example, recently my mother and I planted a tomato plant in our backyard. We collected information
about the tomatoes harvested from the plant. The individuals we studied were the tomatoes. The variable that
interested us was the weight of a tomato. My mom noted that the tomatoes had different weights even though
they came from the same plant. She discovered that variables such as weight may vary.
If variables did not vary, they would be constants, and statistical inference would not be necessary.
Think about it this way: If each tomato had the same weight, then knowing the weight of one tomato would
allow us to determine the weights of all tomatoes. However, the weights of the tomatoes vary. One goal of
research is to learn the causes of the variability so that we can learn to grow plants that yield the best tomatoes.
It is helpful to divide variables into different types, as different statistical methods are applicable to each.
The main division is into qualitative (or categorical) and quantitative (or numerical) variables.
Variables can be classified into two groups:
1. A qualitative (categorical) variable yields categorical responses. Its value is a word or a code
that represents a class or category.
2. A quantitative (numeric) variable takes on numerical values representing an amount or quantity.
DISTINCTION BETWEEN DISCRETE AND CONTINUOUS
Quantitative variables may be further classified into:
1. A discrete variable is a quantitative variable that has either a finite number of possible values or a
countable number of possible values. If you count to get the value of a quantitative variable, it is discrete.
2. A continuous variable is a quantitative variable that has an infinite number of possible values that are
not countable. If you measure to get the value of a quantitative variable, it is continuous.
LEVELS OF MEASUREMENT
It is important to know which type of scale is represented by your data since different statistics are
appropriate for different scales of measurement. A characteristic may be measured using nominal, ordinal,
interval and ratio scales.
1. Nominal Level - They are sometimes called categorical
scales or categorical data. Such a scale classifies persons or
objects into two or more categories. Whatever the basis
for classification, a person can only be in one category, and
members of a given category have a common set of
characteristics.
2. Ordinal Level - This involves data that may be arranged in
some order, but differences between data values either cannot be determined or are meaningless. An
ordinal scale not only classifies subjects but also ranks them in terms of the degree to which they
possess a characteristic of interest. In other words, an ordinal scale puts the subjects in order from
highest to lowest, from most to least. Although ordinal scales indicate that some subjects are higher or
lower than others, they do not indicate how much higher or how much better.
3. Interval Level - This measurement level not only classifies and orders the measurements, but it also
specifies that the distances between each interval on the scale are equivalent along the scale from low
interval to high interval. A value of zero does not mean the absence of the quantity. Arithmetic
operations such as addition and subtraction can be performed on values of the variable.
4. Ratio Level - A ratio scale represents the highest, most precise, level of measurement. It has the
properties of the interval level of measurement and the ratios of the values of the variable have
meaning. A value of zero means the absence of the quantity. Arithmetic operations such as
multiplication and division can be performed on the values of the variable.
Operations that make sense for variables of different scales.
Both interval and ratio data involve measurement. Most data analysis techniques that apply to ratio data also
apply to interval data. Therefore, in most practical aspects, these types of data (interval and ratio) are
grouped under metric data. In some other instances, these types of data are also known as numerical discrete
and numerical continuous.
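The cumulative nature of the four levels can be sketched in Python. The operation lists below are a commonly taught summary and an assumption of this sketch, not a reproduction of the module's own table; the names `LEVELS`, `NEW_OPERATIONS`, and `meaningful_operations` are hypothetical.

```python
# Levels ordered from lowest to highest. Each level keeps every operation
# that is meaningful at the levels before it and adds its own.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Operations that first become meaningful at each level (assumed summary).
NEW_OPERATIONS = {
    "nominal": ["count", "mode"],
    "ordinal": ["rank", "median"],
    "interval": ["add", "subtract", "mean"],
    "ratio": ["multiply", "divide"],
}


def meaningful_operations(level):
    """Return all operations that make sense at the given level."""
    position = LEVELS.index(level)
    operations = []
    for name in LEVELS[: position + 1]:
        operations.extend(NEW_OPERATIONS[name])
    return operations


for level in LEVELS:
    print(f"{level:<9}", ", ".join(meaningful_operations(level)))
```

For example, the sketch lists division only at the ratio level, which matches the statement that ratios of values have meaning only when zero means the absence of the quantity.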
MODULE 2:
DATA COLLECTION AND BASIC CONCEPTS IN SAMPLING DESIGN

Objectives:
After successful completion of this module, you should be able to:
• Determine the sources of data (primary and secondary data).
• Distinguish the different methods of data collection under primary and secondary data.
• Determine the appropriate sample size.
• Differentiate various sampling techniques.
• Know the sources of errors in sampling.

DATA COLLECTION
Everybody collects, interprets and uses information, much of it in numerical or statistical forms in day-to-day
life. In everyday life, in business and industry, certain statistical information is necessary and it is
important to know where to find it and how to collect it.
Analysis of data can lead to powerful results. Data can be used to offset anecdotal claims, such as the suggestion
that cellular telephones cause brain cancer. Anecdotal means that the information being conveyed is based on casual
observation, not scientific research. Because data are powerful, they can be dangerous when misused. The misuse of
data usually occurs when data are incorrectly obtained or analyzed. Whenever we look at data, we should be mindful
of where the data come from. Even when data tell us that a relation exists, we need to investigate.
Data collection is the process of gathering and measuring information on variables of interest, in an established
systematic fashion that enables one to answer stated research questions, test hypotheses, and evaluate outcomes.
Consequences from Improperly Collected Data
• Inability to answer research questions accurately.
• Inability to repeat and validate the study.
• Distorted findings resulting in wasted resources.
• Misleading other researchers to pursue fruitless avenues of investigation.
• Compromising decisions for public policy.
• Causing harm to human participants and animal subjects.

Steps in Data Gathering:
1. Set the objectives for collecting data.
2. Determine the data needed based on the set objectives.
3. Determine the method to be used in data gathering and define the comprehensive data collection
points.
4. Design data gathering forms to be used.
5. Collect data.
Choosing a Method of Data Collection
Decision-makers need information that is relevant, timely, accurate and usable. The cost of obtaining,
processing and analyzing these data is high. The challenge is to find ways that lead to information that is
cost-effective, relevant, timely and important for immediate use. Some methods pay attention to timeliness and
reduction in cost. Others pay attention to accuracy and the strength of the method in using scientific approaches.
SOURCES OF DATA
Whether conducting research in the social sciences, humanities, arts, or natural sciences, the ability to
distinguish between primary and secondary sources is essential.
Statistical data may be classified under two categories, depending upon the source: Primary Data and
Secondary Data.
• Primary Sources - Provide a first-hand account of an event or time period and are considered to be
authoritative. They represent original thinking, reports on discoveries or events, or they can share new
information. Often these sources are created at the time the events occurred but they can also include
sources that are created later. They are usually the first formal appearance of original research.
Primary Data - are data documented by the primary source. The data collectors documented the
data themselves. The first-hand information obtained by the investigator is more reliable and accurate
since the investigator can extract the correct information by removing doubts, if any, in the minds of the
respondents regarding certain questions. High response rates might be obtained since the answers to
various questions are obtained on the spot. It permits explanation of questions concerning difficult
subject matter.
• Secondary Sources - offer an analysis, interpretation or a restatement of primary sources and are
considered to be persuasive. They often involve generalization, synthesis, interpretation, commentary or
evaluation in an attempt to convince the reader of the creator's argument. They often attempt to
describe or explain primary sources.
Secondary Data - are data documented by a secondary source. The data collectors had the data
documented by other sources.
Data are primary for the agency that collected them, and become
secondary for someone else who uses these data for his or her own purposes. Secondary data are less
expensive to collect both in money and time. These data can also be better utilized and sometimes the
quality of such data may be better because these might have been collected by persons who were
specially trained for that purpose.
On the other hand, such data must be used with great care, because such data may also be full
of errors due to the fact that the purpose of the collection of the data by the primary agency may have
been different from the purpose of the user of these secondary data. Secondly, there may have been
bias introduced, the size of the sample may have been inadequate, or there may have been arithmetic
or definition errors, hence, it is necessary to critically investigate the validity of the secondary data.
METHODS IN DATA COLLECTION
The primary data can be collected by the following five methods:
1. Direct personal interviews - The researcher has direct contact with the interviewee. The researcher
gathers information by asking questions to the interviewee.
2. Indirect/Questionnaire Method - In this method, respondents give written answers to a prepared set of
questions (a questionnaire), so the researcher does not need direct contact with them.
Designing good “questioning tools” forms an important and time-consuming phase in the development
of most research proposals. Once the decision has been made to use these techniques, the following questions
should be considered before designing our tools:
• What exactly do we want to know, according to the objectives and variables we identified earlier? Is
questioning the right technique to obtain all answers, or do we need additional techniques, such as observations
or analysis of records?
• Of whom will we ask questions and what techniques will we use? Do we understand the topic
sufficiently to design a questionnaire, or do we need some loosely structured interviews with key informants or
a focus group discussion first to orient ourselves?
• Are our informants mainly literate or illiterate? If illiterate, the use of self-administered questionnaires
is not an option.
• How large is the sample that will be interviewed? Studies with many respondents often use shorter,
highly structured questionnaires, whereas smaller studies allow more flexibility and may use questionnaires with
a number of open-ended questions.
3. A focus group is a group interview of approximately six to twelve people who share similar
characteristics or common interests. A facilitator guides the group based on a predetermined set of
topics.
4. Experiment is a method of collecting data where there is direct human intervention on the conditions
that may affect the values of the variable of interest.
Bear in mind that the experimental method has several limitations that you should be aware of.
- Ethical, moral, and legal Concerns
- Unrealistic Controlled Environments
- Inability to Control for All Variables
5. Observation is a technique that involves systematically selecting, watching and recording behaviors of
people or other phenomena and aspects of the setting in which they occur, for the purpose of gaining
specified information. It includes all methods from simple visual observations to the use of high-level
machines and measurements, sophisticated equipment or facilities such as:
- Radiographic
- Biochemical
- X-ray machines
- Microscope
- Clinical examinations
- Microbiological examinations
It gives relatively more accurate data on behavior and activities, but the investigator's or observer's own
biases, prejudices, and desires may affect the results, and it needs more resources and skilled manpower
when high-level machines are used.
The secondary data can be collected from the following five sources:
1. Published reports in newspapers and periodicals.
2. Financial data reported in annual reports.
3. Records maintained by the institution.
4. Internal reports of the government departments.
5. Information from official publications.
CONCEPTS IN SAMPLING DESIGN
Sample Size
“How many participants should be chosen for a survey?”
One of the most frequent problems in statistical analysis is the determination of the appropriate sample
size. One may ask why sample size is so important. The answer is that an appropriate sample size is
required for validity. If the sample size is too small, it will not yield valid results, and any results
obtained will be questionable. An appropriate sample size can produce accurate results. A sample
size that is too large will waste money and time, because a smaller but adequate sample will normally give an
accurate result.
The sample size is typically denoted by n and it is always a positive integer. No exact sample size can be
prescribed here because it varies across research settings. However, all else being equal, a larger sample
leads to increased precision in estimates of various properties of the population.
Take Note!
- Representativeness, not size, is the more important consideration.
- Use no less than 30 subjects if possible.
- If you use complex statistics, you may need a minimum of 100 or more in your sample (varies with
method)

The choice of sample size depends on non-statistical considerations and statistical considerations.

• Non-statistical considerations – These may include availability of resources, manpower, budget, ethics and
sampling frame.

• Statistical considerations – It will include the desired precision of the estimate.

Three criteria need to be specified to determine the appropriate sample size:

1. Level of Precision
Also called sampling error, the level of precision is the range in which the true value of the population is
estimated to be.
2. Confidence Level
It is a statistical measure of the number of times out of 100 that results can be expected to be within a
specified range. For example, a confidence level of 90% means that about 90 out of 100 samples drawn in the
same way would be expected to give results within the specified range.
3. Degree of Variability
Depending upon the target population and attributes under consideration, the degree of variability
varies considerably. The more heterogeneous a population is, the larger the sample size required to
get an optimum level of precision.

METHODS IN DETERMINING THE SAMPLE SIZE

Here are the commonly used methods in determining the sample size.

• Slovin’s Formula

- Slovin’s formula is used to calculate the sample size n given the population size and error. It is computed
as
$$ n \ge \frac{N}{1 + N e^{2}} $$

Where:
N is the total population.
e is the level of precision.
Example:

A researcher plans to conduct a survey about food preference of BS Stat students. If the population of
students is 1000, find the sample size if the error is 5%.

Solution:

$$ n \ge \frac{N}{1 + N e^{2}} = \frac{1000}{1 + 1000(0.05)^{2}} = \frac{1000}{3.5} \approx 285.71 $$

The researcher needs to survey 286 BS Stat students.
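The worked example can be checked with a minimal Python sketch of Slovin's formula. The function name `slovin_sample_size` and its input checks are assumptions made for illustration; the result is rounded up because the sample size must be a whole number that satisfies the inequality.

```python
import math


def slovin_sample_size(population_size, margin_of_error):
    """Smallest whole number n with n >= N / (1 + N * e**2)."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion between 0 and 1")
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    # Round up: a fractional respondent is not possible.
    return math.ceil(raw)


# The example above: N = 1000 students, e = 5%.
print(slovin_sample_size(1000, 0.05))  # 286
```

Note that the error is entered as a proportion (0.05), not as a percentage (5).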

• These are links to online sample size calculators:

- https://www.calculator.net/sample-size-calculator.html

- https://ph.search.yahoo.com/search?fr=mcafee&type=E211PH885G0&p=raosoft+sample+size+calculator
BASIC SAMPLING DESIGN
The goal in sampling is to obtain individuals for a study in such a way that accurate information about
the population can be obtained.
Reason for Sampling
- It is important that the individuals included in a sample represent a cross section of individuals in the
population.
- If a sample is not representative, it is biased, and you cannot generalize to the population from your
statistical data.
DEFINITIONS:
• Observation unit - An object on which a measurement is taken. This is the basic unit of observation,
sometimes called an element. In studying human populations, observation units are often individuals.
• Target population - The complete collection of observations we want to study.
• Sampled population - The collection of all possible observation units that might have been chosen in a sample;
the population from which the sample was taken.
• Sample - A subset of a population.
• Sampling unit - A unit that can be selected for a sample. We may want to study individuals, but do not have a
list of all individuals in the target population. Instead, households serve as the sampling units, and the
observation units are the individuals living in the households.
• Sampling frame - A list, map, or other specification of sampling units in the population from which a sample
may be selected. For a survey using in-person interviews, the sampling frame might be a list of all street
addresses.
• Sampling technique/Sampling Strategies - It is a plan you set forth to be sure that the sample you use in your
research study represents the population from which you drew your sample.
• Sampling Bias - This involves problems in your sampling, which reveals that your sample is not representative
of your population.
ADVANTAGES OF SAMPLING OVER COMPLETE ENUMERATION
- Less Labor
- Reduced Cost
- Greater Speed
- Greater Scope
- Greater Efficiency and Accuracy
- Convenience
- Ethical Considerations
TWO TYPES OF SAMPLES
1. Probability Sample
• Samples are obtained using some objective chance mechanism, thus involving randomization.
• They require the use of a complete listing of the elements of the universe called the sampling frame.
• The probabilities of selection are known.
• They are generally referred to as random samples.
• They allow drawing of valid generalizations about the universe/population.

2. Non-probability Sample
• Samples are obtained haphazardly, selected purposively or are taken as volunteers.
• The probabilities of selection are unknown.
• They should not be used for statistical inference.
Sampling Procedure
1. Identify the population.
2. Determine if population is accessible.
3. Select a sampling method.
4. Choose a sample that is representative of the population.
5. Ask the question, can I generalize to the general population from the accessible population?
Basic Sampling Techniques of Probability Sampling

• Simple Random Sampling - The most basic method of drawing a probability sample. It assigns equal probabilities
of selection to each possible sample. It is a technique that uses random numbers or codes to represent a
population. These numbers or codes represent every member of the population. This can be done by drawing a name
from a box, using the random function of a calculator, or using random generator programs.

Advantage: It is very simple and easy to use.

Disadvantage: The sample chosen may be distributed over a wide geographic area.

When to use: This is preferable if the population is not widely spread geographically. Also, this is more
appropriate if the population is more or less homogeneous with respect to the characteristic under
study.
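A random generator program for simple random sampling can be sketched in Python with the standard `random` module. The list of students is hypothetical, and the `seed` parameter is an assumption added only so that a draw can be repeated; `random.Random.sample` selects without replacement, so no member is chosen twice.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct members so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# Hypothetical sampling frame of 20 students (illustration only).
students = [f"Student {i}" for i in range(1, 21)]
chosen = simple_random_sample(students, 5, seed=42)
print(chosen)
```

Leaving `seed` as `None` gives a different sample on each run, which is the usual choice in an actual survey.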

• Systematic random sampling is a technique in which every kth member of the population is selected
until the required sample size is achieved. A careful organization of the members is required to ensure that the
selection process yields a sample that best represents the population. You may arrange them (in a list)
alphabetically, by sex, by age, socio-economic status, marital status, etc.

Obtaining a Systematic Random Sample

1. Decide on a method of assigning a unique serial number, from 1 to N, to each one of the elements in the
population.

2. Compute the sampling interval, k = N/n, where n is the required sample size.

3. Select a number r, from 1 to k, using a randomization mechanism. The element in the population assigned to
this number is the first element of the sample. The other elements of the sample are those assigned to the
numbers r + k, r + 2k, r + 3k, and so on until you get a sample of size n.

Advantage: Drawing of the sample is easy. It is easy to administer in the field, and the sample is spread evenly
over the population.

Disadvantage: May give poor precision when unsuspected periodicity is present in the population.

When to use: This is advisable to use if the ordering of the population is essentially random and when
stratification with numerous data is used.
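The procedure for a systematic random sample can be sketched in Python. This sketch assumes the usual convention that the sampling interval is the whole-number part of N/n and that the random start is a serial number between 1 and k; the function name `systematic_sample` and the `seed` parameter are hypothetical.

```python
import random


def systematic_sample(population, n, seed=None):
    """Select every k-th element after a random start between 1 and k."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("n must be between 1 and the population size")
    k = N // n                      # sampling interval (assumed: whole part of N/n)
    rng = random.Random(seed)
    start = rng.randint(1, k)       # random serial number from 1 to k, inclusive
    # Serial numbers start, start + k, start + 2k, ... (lists are 0-based, so subtract 1).
    return [population[start - 1 + i * k] for i in range(n)]


# Hypothetical population with serial numbers 1 to 100.
serial_numbers = list(range(1, 101))
print(systematic_sample(serial_numbers, 10, seed=5))
```

With N = 100 and n = 10 the interval is k = 10, so the selected serial numbers are always 10 apart.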

• Stratified Random Sampling - It is obtained by separating the population into non-overlapping groups called
strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be
homogeneous (or similar) in some way.

Advantage: Stratification of respondents is advantageous in terms of precision of the estimates of the
characteristics of the population. Sampling designs may vary by stratum to adjust for the differences in the
conditions across strata. It is easy to use as a random sampling design.
Disadvantage: Values of the stratification variable may not be easily available for all units in the population,
especially if the characteristic of interest is homogeneous. It is possible that one or two strata will have no
representatives. Also, transportation costs can be high if the population covers a wide geographic area.

When to use: Use this if the distribution of the characteristic under consideration is concentrated in small and
scattered segments of the population. It is also preferred if precise estimates are desired for stratified parts
of the population and if sampling problems differ in the various strata of the population.
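Stratified random sampling can be sketched in Python as a simple random sample taken separately inside each stratum. The population, the stratum labels, and the per-stratum sample sizes below are hypothetical; how many units to take from each stratum is left to the caller because the text does not prescribe an allocation rule.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, sizes, seed=None):
    """Take a simple random sample of sizes[name] units from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # non-overlapping groups
    sample = []
    for name, members in strata.items():
        sample.extend(rng.sample(members, sizes[name]))
    return sample


# Hypothetical population of (student_id, year_level) pairs: 60 and 40 students.
students = [(i, "Grade 9" if i <= 60 else "Grade 10") for i in range(1, 101)]
sizes = {"Grade 9": 6, "Grade 10": 4}            # assumed allocation
chosen = stratified_sample(students, lambda s: s[1], sizes, seed=7)
print(chosen)
```

Here the assumed sizes mirror each stratum's share of the population, so both year levels are guaranteed to appear in the sample.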

• Cluster Sampling - You take the sample from naturally occurring groups in your population. The clusters are
constructed such that the sampling units are heterogeneous within the cluster and homogeneous among the clusters.
Cluster sampling is commonly used for a large population whose members are grouped based on their geographic
locations.

Advantage: There is no need to come up with a list of units in the population; all that is needed is simply a
list of the clusters. It is also less costly since the elements are physically closer together.

Disadvantage: In actual field applications, adjacent households tend to have more similar characteristics than
households distantly apart.

When to use: If the population can be grouped into clusters where individual population elements are known to
be different with respect to the characteristics under study, this is preferable to use.
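Cluster sampling can be sketched in Python as a random selection of whole clusters. This is a one-stage sketch under the assumption that every unit in a chosen cluster is included; the village names and household codes are hypothetical.

```python
import random


def cluster_sample(clusters, number_of_clusters, seed=None):
    """Randomly choose whole clusters and include every unit inside them."""
    rng = random.Random(seed)
    # Only the list of cluster names is needed to make the selection.
    chosen_names = rng.sample(sorted(clusters), number_of_clusters)
    sample = [unit for name in chosen_names for unit in clusters[name]]
    return chosen_names, sample


# Hypothetical clusters: households grouped by village.
villages = {
    "Village A": ["A1", "A2", "A3"],
    "Village B": ["B1", "B2"],
    "Village C": ["C1", "C2", "C3", "C4"],
    "Village D": ["D1", "D2"],
}
chosen_names, sample = cluster_sample(villages, 2, seed=3)
print(chosen_names)
print(sample)
```

The selection step uses only the cluster names, which reflects the advantage that a full list of population units is not required.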

Take Note!

Use probability sampling if the main objective of the sample survey is making inferences about the
characteristics of the population under study.

Basic Sampling Techniques of Non-Probability Sampling

• Accidental Sampling - There is no system of selection but only those whom the researcher or interviewer
meets by chance.

• Quota Sampling - A specified number of persons of certain types is included in the sample. The
researcher is aware of categories within the population and draws samples from each category. The size of each
categorical sample is proportional to the proportion of the population that belongs in that category.

• Convenience Sampling - It is a process of picking out people in the most convenient and fastest way to get
reactions immediately. This method can be done by telephone interview to get the immediate reactions of a
certain group to a certain issue.

• Purposive Sampling - It is based on certain criteria laid down by the researcher. People who satisfy the criteria
are interviewed. It is used to determine the target population of those who will be taken for the study.

• Judgement Sampling - It selects a sample in accordance with an expert’s judgment.

Cases wherein Non-Probability Sampling is Useful

- Only few are willing to be interviewed.
- Extreme difficulties in locating or identifying subjects.
- Probability sampling is more expensive to implement.
- Cannot enumerate the population elements.
Sources of Errors in Sampling
1. Non-sampling Error
- Errors that result from the survey process.
- Any errors that cannot be attributed to the sample-to-sample variability.
Sources of Non-Sampling Error
1. Non-responses
2. Interviewer Error
3. Misrepresented Answers
4. Data entry errors
5. Questionnaire Design
6. Wording of Questions
7. Selection Bias
2. Sampling Error
- Error that results from taking one sample instead of examining the whole population.
- Error that results from using sampling to estimate information regarding a population.
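Sampling error can be illustrated with a small Python simulation. The population of 500 scores is simulated and purely hypothetical. Each sample of 30 gives its own mean, and the gap between a sample mean and the population mean is the sampling error for that sample; examining the whole population leaves no such gap.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical population: 500 simulated scores between 60 and 100.
population = [rng.randint(60, 100) for _ in range(500)]
true_mean = statistics.mean(population)

# Draw several samples of the same size; each one gives a different estimate.
sample_means = []
for _ in range(5):
    sample = rng.sample(population, 30)
    sample_means.append(statistics.mean(sample))

sampling_errors = [mean - true_mean for mean in sample_means]

# A complete enumeration (all 500 units) reproduces the population mean exactly.
census_mean = statistics.mean(rng.sample(population, len(population)))

for mean, error in zip(sample_means, sampling_errors):
    print(f"sample mean = {mean:.2f}, sampling error = {error:+.2f}")
print(f"census error = {census_mean - true_mean:.2f}")
```

The simulation covers sampling error only; non-sampling errors such as non-response or data entry mistakes would remain even in a complete enumeration.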
MODULE 3: DESCRIPTIVE STATISTICS

Objectives:
After successful completion of this module, you should be able to:
✦ Organize qualitative and quantitative data in tables.
✦ Present data appropriately and accurately so that it can be easily interpreted.
✦ Review the procedures to calculate the descriptive statistics using Excel.
✦ Conduct statistical analysis to calculate descriptive statistics using Excel.

DATA PRESENTATION
Data are usually collected in a raw format, and thus the inherent information is difficult to understand.
Therefore, raw data need to be summarized, processed, and analyzed to usefully derive information from them.
However, no matter how well manipulated, the information derived from the raw data should be presented in an
effective format; otherwise it would be a great loss for both authors and readers. Planning how the data will be
presented is essential before appropriately processing raw data.

CONSTRUCTING FREQUENCY, PERCENTAGE & DESCRIPTIVE STATISTICS IN EXCEL
The first part of this session is to review the procedures to calculate the descriptive statistics using Excel.
The following setup step only needs to be done once.
• Go to TOOLS-ADD INS, select the Analysis ToolPak, and click OK. This will add the analysis tools to your Excel.
• If for some reason Data Analysis is not there when you use it in the future, repeat this step to load it again.
How to Construct a Frequency Distribution Table?
A frequency distribution lists each category of data and
the number of occurrences for each category of data.

Example: Use the “Data File”


Statement of the Problem
The study aims to determine the Intrinsic and Extrinsic Motivation to the academic performance of the
Grade 10 Students of Maximo L. Gatlabayan Memorial National High School.
1. What is the profile of the respondents in terms of sex?
2. What is the level of the Academic Performance of the students?
Solution:
To answer question 1, we need to construct a frequency distribution to determine how many female
and male respondents participated in the study.
Procedure in Constructing Frequency Table
If the data are qualitative
To construct the frequency distribution in Excel, use the command:
=FREQUENCY(data_array,bins_array)
Then press Ctrl + Shift + Enter so that the formula is entered as an array formula:
{=FREQUENCY(data_array,bins_array)}
Follow this process, using formulas, for Frequency and Percentages (using Excel)
Frequency

• Based on the data, code sex as numerical data (e.g., Male – 1, Female – 2).

• Construct the frequency table as shown:

• Using the formula or command in Excel, calculate the frequency distribution to determine how
many female and male respondents participated in the study.
1. Highlight the cells of the “Frequency” column.
2. Enter the formula =FREQUENCY(data_array,bins_array).

3. For the data array, highlight the data in the “Sex” column → comma → highlight the data in the
“Bin Array” column.
4. Press Ctrl + Shift + Enter.

5. Sum up the total using the AutoSum command or =SUM(number1, number2), then press ENTER.

Percentages
1. Construct the percentage table by adding a column to the frequency table.

2. Calculate each percentage using the command “=(number1/Totalnumber)*100”, then press ENTER.

3. Sum up the total using the AutoSum command or =SUM(number1, number2), then press ENTER.


Now present the final output:
FINAL OUTPUT

Table 1 shows the frequency and percentage distribution of the respondents in terms of sex. It can be
gleaned from the table that, out of 20 respondents considered in the study, 11 or 55% are male and 9 or 45%
are female.
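
For readers who want to check the Excel result outside a spreadsheet, the same frequency and percentage table can be sketched in Python. The list of coded responses below is hypothetical: it is arranged only to match the counts reported in Table 1 and is not the actual “Data File”.

```python
from collections import Counter

# Hypothetical coded responses (1 = Male, 2 = Female), arranged only to
# match the counts reported in Table 1; this is not the actual data file.
sex_codes = [1] * 11 + [2] * 9
labels = {1: "Male", 2: "Female"}

counts = Counter(sex_codes)      # plays the role of =FREQUENCY(...)
total = sum(counts.values())     # plays the role of =SUM(...)

table = []
for code in sorted(labels):
    frequency = counts[code]
    percentage = frequency * 100 / total   # =(number1/Totalnumber)*100
    table.append((labels[code], frequency, percentage))
    print(f"{labels[code]:<8}{frequency:>4}{percentage:>8.1f}%")
print(f"{'Total':<8}{total:>4}{100.0:>8.1f}%")
```

Coding the categories as numbers mirrors the Excel procedure, where FREQUENCY needs numeric values and a numeric bin array.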

Constructing Frequency Table using “Data Analysis” in Excel

Frequency
If the data are quantitative

Using the same “Statement of the Problem”

The study aims to determine the Intrinsic and Extrinsic Motivation to the academic performance of the
Grade 10 Students of Maximo L. Gatlabayan Memorial National High School.
1. What is the profile of the respondents in terms of sex?
2. What is the level of the Academic Performance of the students?
STEPS:
1. Set an interval or range for your data. It is needed for the “BIN RANGE”.
2. Click “DATA” on the menu bar and Click “DATA ANALYSIS” on the tool bar
3. The dialog box “DATA ANALYSIS” will appear and choose “HISTOGRAM” on the dialog box then click OK.
4. Highlight your data for the “INPUT RANGE”.
5. Highlight your data for the “BIN RANGE”.
6. Click the box of “LABELS IN FIRST ROW” then click “OK”.
7. The result will appear in a new worksheet of the Excel file. Compute the percentage and the total.

(Figures: screenshots illustrating Steps 1 to 6 and the final output.)
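
The counting that the Histogram tool performs for a given bin range can be sketched as follows. The grades and bin limits are made up for illustration, and the rule used here (a value is counted in the first bin whose upper limit is greater than or equal to it, with anything above the last limit going to a “More” row) reflects my understanding of how Excel treats the bin range; treat it as an assumption and compare against your own worksheet.

```python
from bisect import bisect_left

def histogram_counts(values, bins):
    """Count values per bin; each value goes to the first bin limit that is >= it."""
    bins = sorted(bins)
    counts = [0] * (len(bins) + 1)  # the last slot is the "More" row
    for v in values:
        counts[bisect_left(bins, v)] += 1
    return counts

# Hypothetical final grades of 20 students (made up for illustration).
grades = [78, 82, 85, 88, 90, 91, 75, 79, 84, 86,
          89, 92, 94, 80, 83, 87, 77, 93, 81, 85]
bin_limits = [79, 84, 89, 94]  # upper limits of the classes 75-79, 80-84, 85-89, 90-94

counts = histogram_counts(grades, bin_limits)
total = sum(counts)
for limit, count in zip(bin_limits + ["More"], counts):
    print(f"{str(limit):<6}{count:>4}{count * 100 / total:>8.1f}%")
```

The percentage column is computed the same way as in the qualitative case: each class frequency divided by the total, times 100.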


GRAPHICAL PRESENTATION

A graph is a very effective visual tool: it displays data at a glance, facilitates comparison, and can
reveal trends and relationships within the data, such as changes over time, correlation, or relative
share of a whole.
It is considered an important medium of communication because it creates a pictorial
representation of the numerical figures.
It is suited to showing the results of a study to nonprofessionals and to people who dislike
numbers and lengthy text.

BAR GRAPH

- It is constructed by labeling each category of data on either the horizontal or vertical axis and the
frequency or relative frequency of the category on the other axis. Rectangles of equal width are drawn
for each category. The height of each rectangle represents the category’s frequency or relative
frequency.
- It is used to organize discrete data.

Types of Bar Graphs

• Simple Bar Graph. The simple bar chart is used for the case of one variable only.

• Multiple Bar Graph / Grouped Column Chart. The multiple bar chart is an extension of the simple bar
chart, used when quantities of several variables are to be displayed. The bars representing the
quantities for the different variables are placed next to one another for each attribute. The figure
becomes very cumbersome when there are too many variables and components.

• Component Bar Graph / Subdivided Column Chart. In this type of bar chart, the components (quantities)
of each variable are piled on top of one another. It saves space compared with a multiple bar chart. One
disadvantage of this graph is that it is not always easy to compare the sizes of the components, or
parts. It is used to represent data in which the total magnitude is divided into different parts or components.

Remember!
• Bar graphs may also be drawn with horizontal bars. Horizontal bars are preferable when category names
are lengthy.
• In bar graphs, the order of the categories does not usually matter. However, bar graphs that have
categories arranged in decreasing order of frequency help prioritize categories for decision-making
purposes in areas such as quality control, human resources, and marketing.

HISTOGRAM
- It is constructed by drawing rectangles for each class of data. The height of each rectangle is the
frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles
touch each other.
- It is a graph used to present quantitative data and is similar to the bar graph.
- It is used to organize continuous data.

PIE CHART
- It is a circle divided into sectors. Each sector represents a category of data. The area of each sector is
proportional to the frequency of the category.
- Pie charts are typically used to present the relative frequency of qualitative data. In most cases the data
are nominal, but ordinal data can also be displayed in a pie chart.
When should a bar graph or a pie chart be used?

• Pie charts are useful for showing the division of all possible values of a qualitative variable into its parts.
• Bar graphs are useful when we want to compare the different parts, not necessarily the parts to the
whole.
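
Because each sector is proportional to its category's frequency, the central angle of a sector is the relative frequency times 360 degrees. A minimal sketch, using the sex distribution reported earlier (11 male and 9 female respondents):

```python
# Frequencies from the sex distribution reported earlier (11 male, 9 female).
frequencies = {"Male": 11, "Female": 9}
total = sum(frequencies.values())

# Each sector's central angle is its relative frequency times 360 degrees.
angles = {category: f * 360 / total for category, f in frequencies.items()}

for category, angle in angles.items():
    share = frequencies[category] * 100 / total
    print(f"{category:<8}{share:>6.1f}%{angle:>8.1f} degrees")
```

The angles always add up to 360 degrees, which is why a pie chart shows parts of a whole rather than a direct comparison between categories.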

LINE GRAPH

- A graph that shows information that is connected in some way (such as change over time).
- Data points are plotted, and line segments are then drawn connecting the points. It is used to organize
continuous data.
- It is very useful in identifying trends in the data over time.

Simple Line Graph


The simplest of line graphs is the single line graph, so called because it displays information concerning
one variable only, in terms of its frequencies.

Multiple Line Graph


Multiple line graphs illustrate information on several variables so that comparison is possible between
them.

Guidelines for Constructing Good Graphics


• Title and label the graphic axes clearly, providing explanations if needed. Include units of measurement
and a data source when appropriate.
• Avoid distortion.
• Minimize the amount of white space in the graph. Use the available space to let the data stand out. If
you truncate the scales, clearly indicate this to the reader.
• Avoid clutter, such as excessive gridlines and unnecessary backgrounds or pictures.
• Don’t distract the reader.
• Avoid three dimensions.
• Do not use more than one design in the same graphic. Let the data speak for themselves.

DESCRIPTIVE STATISTICS
How to Calculate Measures of Central Tendency, Measures of Variation, Skewness and Kurtosis for
Ungrouped and Sample Data Using Excel?

Example:
The data given below are the scores of randomly selected applied statistics undergraduate students in
Section A and Section B. Compare the scores of Section A and Section B based on measures of central tendency,
and measures of variation and determine which section performed better in their final examination. Also,
describe the shape of the distribution of these two data sets using skewness and kurtosis.

1. Click “DATA” on the menu bar and Click “DATA ANALYSIS” on the tool bar. The Dialog box will appear.
2. Select “Descriptive Statistics” then click “OK”.

3. Highlight your data for the “INPUT RANGE” and click the box of “LABELS IN FIRST ROW”.
4. Click the box of “Summary statistics” and then click “OK”. Repeat the process for Data Set B.
When comparing distributions, it is better to use a measure of variation/dispersion in addition to
a measure of central tendency. Because in this example Data Set A and Data Set B have the same
values for the measures of central tendency, we will use only the measures of variation/dispersion to
compare these two data sets.
Based on the result, Data Set B has larger variability, since it has larger values for the different
measures of variation. This means that Data Set B is much more spread out than Data Set A.
In this example, the section that performed better is the one with a larger mean and a smaller standard
deviation. Section A and Section B have the same mean, but Section A has a smaller standard deviation than
Section B; therefore, Section A performed better in their final examination. In terms of the shape of the
distribution, these two data sets have the same shape in terms of skewness and kurtosis: both Data Set A and
Data Set B are platykurtic and skewed to the right.
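
The same kind of comparison can be sketched in Python. The scores below are hypothetical (the actual Section A and Section B data are not reproduced here), chosen so that both sections share the same mean. The skewness and kurtosis functions follow the sample-adjusted formulas that, to my understanding, Excel's SKEW and KURT functions use; treat that correspondence as an assumption and verify it against your own output.

```python
import statistics

def sample_skewness(data):
    """Sample-adjusted skewness (positive = skewed to the right)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)

def sample_excess_kurtosis(data):
    """Sample-adjusted excess kurtosis (negative = platykurtic)."""
    n = len(data)
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    fourth = sum(((x - mean) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))

# Hypothetical scores with equal means (not the data from the example).
section_a = [78, 79, 80, 80, 81, 82, 87]
section_b = [70, 74, 78, 80, 84, 88, 93]

for name, scores in (("Section A", section_a), ("Section B", section_b)):
    print(name,
          "mean =", statistics.mean(scores),
          "sd =", round(statistics.stdev(scores), 2),
          "skewness =", round(sample_skewness(scores), 2),
          "kurtosis =", round(sample_excess_kurtosis(scores), 2))
```

When the means are equal, as here, the section with the smaller standard deviation has scores that cluster more tightly around that mean, which is the basis for the comparison described above.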
