Unit - 8 Data Analysis

The document discusses the process of data processing and analysis. It involves editing raw data by checking for errors and omissions. Next is coding, which assigns numerical codes to responses to categorize them. Classification groups the coded data into common characteristics. Tabulation summarizes the coded and classified data into statistical tables for further analysis. Descriptive statistics are used to describe samples, while inferential statistics make inferences about populations from samples. Various measures of central tendency and dispersion are also discussed, along with techniques for data transformation, tabulation, and statistical analysis using appropriate methods based on variables and data scale.

Uploaded by

Eng. Tadele Dandena


Unit-7

DATA PROCESSING AND ANALYSIS

Coding, entering and classifying the data:

Once the field data begin to flow in, attention turns to data processing and
analysis. Data processing implies editing, coding, classifying and tabulating the
collected data so that they are amenable to analysis.

EDITING:

The first step in data processing is to edit the raw data. Editing is the process of
examining the collected raw data to detect errors and omissions and to correct these when
possible. It involves a careful scrutiny of the completed questionnaires to
ensure that the data are:
• Accurate
• Consistent with other information / facts gathered
• Uniformly entered
• As complete as possible
• Arranged to facilitate coding and tabulation.

Editing can be done at two levels: in the field where the data are collected, or in the
office.

CODING:

Coding refers to the process of assigning numerals or other symbols to answers so that
responses can be put into a limited number of categories or classes. In this way, many
different answers are reduced to a few categories.
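The coding step can be sketched in a few lines of Python. The codebook below (1 for yes, 2 for no, 9 for a missing answer) and the raw answers are hypothetical and chosen only for illustration; a real study would define its own codes.

```python
# Hypothetical codebook: the labels and codes are illustrative assumptions.
CODEBOOK = {"yes": 1, "no": 2}
MISSING = 9  # assumed code for blank or unrecognised answers


def code_response(answer: str) -> int:
    """Return the numeric code for a raw answer, or MISSING if it is not in the codebook."""
    return CODEBOOK.get(answer.strip().lower(), MISSING)


# Hypothetical raw answers as they might appear on questionnaires.
raw_answers = ["Yes", "no", " YES ", "", "No"]
coded = [code_response(a) for a in raw_answers]
print(coded)  # [1, 2, 1, 9, 2]
```

Normalising case and surrounding spaces before the lookup is one way to keep the data uniformly entered, as the editing step requires.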

CLASSIFICATION:

Classification is the process of arranging data in groups or classes on the basis of
common characteristics. Data having common characteristics are placed together, and in
this way the entire data set is divided into a number of groups or classes. Classification
can be according to attributes (literacy, sex, honesty etc.) or according to class intervals
(income, population, age, weight etc.).
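Classification by class intervals can be illustrated with a short Python sketch. The ages and the class limits are hypothetical; the convention that each class includes its lower limit and excludes its upper limit is also an assumption made here.

```python
from collections import Counter

# Hypothetical ages and class limits, chosen only to illustrate the idea.
ages = [19, 23, 31, 45, 52, 38, 27, 60, 34, 41]
# Each class is (lower limit inclusive, upper limit exclusive).
classes = [(15, 30), (30, 45), (45, 60), (60, 75)]


def classify(value, intervals):
    """Return the label of the class interval that contains value."""
    for low, high in intervals:
        if low <= value < high:
            return f"{low}-{high - 1}"
    raise ValueError(f"{value} falls outside every class interval")


class_counts = Counter(classify(a, classes) for a in ages)
for low, high in classes:
    label = f"{low}-{high - 1}"
    print(label, class_counts[label])
```

Every observation falls into exactly one class, so the class frequencies add up to the number of observations.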

TABULATION:

Tabulation is the process of summarizing raw data and displaying them in compact form (in
the form of statistical tables) for further analysis. It is an orderly arrangement of data in
columns and rows.

Descriptive Vs Inferential statistics:
Descriptive statistics: Statistics used to describe or summarize information
about a population or a sample.
Inferential statistics: Statistics used to make inferences or judgments about a population
on the basis of sample information.

POPULATION PARAMETERS V/S SAMPLE STATISTICS:

Population parameters: These designate variables in a population. A parameter is a measure of
a population characteristic.
μ - to represent the population mean
σ - to represent the standard deviation of the population
N - to represent the population size
P - to represent the population proportion

Sample statistics: These designate variables in a sample. A statistic measures a
characteristic of a sample.
x̄ - to represent the sample mean
s - to represent the sample standard deviation
n - to represent the sample size
p - to represent the sample proportion

MEASURES OF CENTRAL TENDENCY:

The Mean: It is the arithmetic average; it involves all observations and is affected by
extreme cases.
The Median: It is the mid-point / middle value of the ordered observations. It does not
involve all observations, so it is not affected by extreme cases.
The Mode: The value that occurs most often is referred to as the modal value.
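A small Python sketch makes the contrast concrete. The income figures are hypothetical, and the last value is a deliberate extreme case; the functions come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical monthly incomes; the last value is a deliberate extreme case.
incomes = [200, 250, 250, 300, 2000]

mean_income = statistics.mean(incomes)      # uses every observation
median_income = statistics.median(incomes)  # middle value of the sorted data
mode_income = statistics.mode(incomes)      # most frequent value

print("mean:", mean_income)
print("median:", median_income)
print("mode:", mode_income)
```

Here the single extreme income pulls the mean up to 600, while the median and the mode both stay at 250.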
MEASURES OF DISPERSION:
Range: The difference between the smallest and the largest value of a frequency
distribution is known as the range of the distribution.
Deviation scores: A method of calculating how far away any observation is from the
mean is to calculate the individual deviation
d = x - x̄

Average deviation: taking the average of individual deviations.

AD = Σ (x - x̄) / n
Since Σ (x - x̄) is always zero, AD is also always zero, so it cannot measure spread.
The measures below avoid this by using absolute or squared deviations.

Mean Absolute Deviation: MAD = Σ |x - x̄| / n

Mean Square Deviation: MSD = ν = Σ (x - x̄)² / n, where ν denotes the variance

Standard Deviation = √MSD = √( Σ (x - x̄)² / n )
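The measures above can be checked numerically with a short Python sketch. The five observations are hypothetical; the variable names simply mirror the abbreviations used in the text.

```python
import math

# Hypothetical observations used only to illustrate the formulas.
x = [2, 4, 4, 6, 9]
n = len(x)
x_bar = sum(x) / n                          # mean = 5.0

value_range = max(x) - min(x)               # range = largest - smallest
deviations = [xi - x_bar for xi in x]       # d = x - x_bar
ad = sum(deviations) / n                    # average deviation
mad = sum(abs(d) for d in deviations) / n   # mean absolute deviation
msd = sum(d ** 2 for d in deviations) / n   # mean square deviation
sd = math.sqrt(msd)                         # standard deviation

print(value_range, ad, mad, msd, round(sd, 3))
```

For these data the average deviation comes out as zero, while MAD is 2.0, MSD is 5.6 and the standard deviation is about 2.366.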

Data analysis:
Tabulation: It refers to the orderly arrangement of data in a table or other summary
format. At the very beginning, we have to develop a master sheet and transfer all
information from each questionnaire onto this master sheet. Counting the number of
responses to a question and putting them in a frequency distribution is tabulation.

Ex: Do you have a TV? a. Yes b. No

Possession of TV:

RESPONSES   FREQUENCY
YES         27
NO          93
TOTAL       120

Percentage: For ease of interpretation, a frequency distribution should be accompanied by percentages.
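As a minimal sketch, the percentages for the TV example can be computed from the two response frequencies given in the text (27 Yes and 93 No). Rounding to one decimal place is an assumption.

```python
# Frequencies from the "Possession of TV" example in the text.
frequencies = {"Yes": 27, "No": 93}
total = sum(frequencies.values())  # total number of respondents

percentages = {
    response: round(100 * count / total, 1)
    for response, count in frequencies.items()
}
print(total, percentages)  # 120 {'Yes': 22.5, 'No': 77.5}
```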

Cross-tabulation: A technique of organizing data by groups, categories or classes.
Ex: Question 1: What is your occupation? a. Business b. Civil Servant
Question 2: Do you have a TV? a. Yes b. No
Television possession by occupation:

Occupation       Response        Total
                 Yes     No
Business           8     20        28
Civil Servant     19     73        92
Total             27     93       120

Percentage cross-tabulation: Changing the figures in the above table into
percentages.
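A percentage cross-tabulation can be sketched as follows, using the counts from the occupation table. The text does not say which base to use, so taking each occupation's row total as the base is an assumption; column totals or the grand total are equally possible bases.

```python
# Counts from the "Television possession by occupation" table in the text.
table = {
    "Business": {"Yes": 8, "No": 20},
    "Civil Servant": {"Yes": 19, "No": 73},
}

# Assumption: percentages are taken within each occupation (row percentages).
row_percentages = {}
for occupation, counts in table.items():
    row_total = sum(counts.values())
    row_percentages[occupation] = {
        response: round(100 * count / row_total, 1)
        for response, count in counts.items()
    }

for occupation, pct in row_percentages.items():
    print(occupation, pct)
```

With row percentages, about 28.6% of business respondents and about 20.7% of civil servants report having a TV.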
Data Transformation: It is the process of changing data from their original form to a format that is
more suitable for performing a data analysis that will achieve the research objective.
Ex: Students of a college are asked whether they agree that college life is beautiful:
a. Strongly agree b. Agree c. Neither agree nor disagree d. Disagree
e. Strongly disagree

On a 5-point scale, the responses can be converted as:

Strongly agree as +2 or 5
Agree as +1 or 4
Neither agree nor disagree as 0 or 3
Disagree as -1 or 2
Strongly disagree as -2 or 1
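The conversion can be sketched in Python as a simple lookup. The score mapping follows the 1-to-5 variant given in the text; the five sample responses are hypothetical.

```python
# Score mapping from the text (the 1-to-5 variant of the 5-point scale).
SCORES = {
    "strongly agree": 5,
    "agree": 4,
    "neither agree nor disagree": 3,
    "disagree": 2,
    "strongly disagree": 1,
}

# Hypothetical responses from five students.
responses = ["Agree", "Strongly agree", "Disagree", "Agree", "Neither agree nor disagree"]

scores = [SCORES[r.lower()] for r in responses]
centered = [s - 3 for s in scores]      # the -2 to +2 variant of the same scale
mean_score = sum(scores) / len(scores)

print(scores, centered, mean_score)
```

Subtracting 3 from each score gives the -2 to +2 version, so the two codings carry the same information.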

Using index numbers:

Index numbers are data summary values calculated relative to the figures for some base period
to facilitate comparisons over time.
An index number shows the percentage change from a base number (if the data are time
related, a base year is chosen). The index numbers are computed by dividing each year's
activity by the base year activity and multiplying by one hundred.
Ex: The following is a hypothetical data of a given region:

Year   Land (in hectares)   Index   Population size (in millions)   Index
1990   20000                100     3                                100
1991   21000                105     3.1                              103
1992   21500                107.5   3.2                              107
1993   22000                110     3.3                              110
1994   23000                115     3.3                              110
1995   22000                110     3.4                              113
1996   22500                112.5   3.5                              117
1997   21500                107.5   3.6                              120
1998   23000                115     3.7                              123

Note: We took the beginning year value as base year.
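The index calculation can be sketched in Python using the land series from the example, with 1990 as the base year. Rounding to one decimal place is an assumption.

```python
# Land figures (in hectares) from the example in the text.
years = [1990, 1991, 1992, 1993, 1994, 1995, 1996, 1997, 1998]
land = [20000, 21000, 21500, 22000, 23000, 22000, 22500, 21500, 23000]

base = land[0]  # the beginning year (1990) is the base year
land_index = [round(100 * value / base, 1) for value in land]

for year, index in zip(years, land_index):
    print(year, index)
```

An index of 107.5 for 1992, for example, means that the land figure is 7.5% above its 1990 level.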

Calculating rank order: Respondents often indicate a rank ordering of preference. To
summarize these data for all respondents, researchers perform a data transformation by
multiplying the frequencies by the rank (score) to develop a new scale.
Ex: Individual ranking of places of work by selected second-year students of a particular
university gave the following results:

Respondents   Mekalle   Bahirdar   Nazareth   Awassa   Jimma
1             5         4          1          3        2
2             1         2          4          5        3
3             5         1          2          3        4
4             3         1          5          2        4
5             5         4          2          1        3

                 Preference Rank
Place of Work    1st   2nd   3rd   4th   5th
Mekalle          1     -     1     -     3
Bahirdar         2     1     -     2     -
Nazareth         1     2     -     1     1
Awassa           1     1     2     -     1
Jimma            -     1     2     2     -

In this case we have to multiply the number of respondents by the rank score and then
sum up the scores. The lowest total score shows the first preference ranking.
Accordingly:
Mekalle: (1x1) + (1x3) + (3x5) = 19 Ranked 5th
Bahirdar: (2x1) + (1x2) + (2x4) = 12 Ranked 1st
Nazareth: (1x1) + (2x2) + (1x4) + (1x5) = 14 Ranked 2nd (tied)
Awassa: (1x1) + (1x2) + (2x3) + (1x5) = 14 Ranked 2nd (tied)
Jimma: (1x2) + (2x3) + (2x4) = 16 Ranked 4th
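The same totals can be reproduced with a short Python sketch that starts from the individual rankings in the example. Summing each place's ranks over all respondents is equivalent to multiplying each rank by its frequency and adding up. Giving tied places the same position is an assumption that matches the ranking shown.

```python
# Individual rankings from the example: one row per respondent,
# one column per place of work (1 = most preferred).
places = ["Mekalle", "Bahirdar", "Nazareth", "Awassa", "Jimma"]
rankings = [
    [5, 4, 1, 3, 2],
    [1, 2, 4, 5, 3],
    [5, 1, 2, 3, 4],
    [3, 1, 5, 2, 4],
    [5, 4, 2, 1, 3],
]

# Total score per place: sum of the ranks given by all respondents.
totals = {place: sum(row[i] for row in rankings) for i, place in enumerate(places)}

# Position = 1 + number of places with a strictly lower total (ties share a position).
positions = {
    place: 1 + sum(other < score for other in totals.values())
    for place, score in totals.items()
}

for place in sorted(places, key=totals.get):
    print(place, totals[place], positions[place])
```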

STATISTICAL ANALYSIS:
This requires choosing the appropriate statistical techniques. The choice of the method
of statistical analysis depends on:
1. The type of question to be answered: i.e., is it to measure central tendency,
relationships between variables, or differences between categories?
2. The number of variables:
• Univariate data analysis: When a researcher generalizes from a sample
about one variable.
• Bivariate data analysis: When the desire is to explain the relationship
between two variables at a time.
• Multivariate data analysis: The simultaneous investigation of more
than two variables.
3. Scale of measurement of the data: There are four scales of measurement, as
follows:
a) Nominal Data: (scale) Data that fall into different categories, where one cannot
arrange the categories in any order of magnitude. No arithmetic operations other than
counting can be conducted on these data. Ex: sex, religion, the chest numbers worn by
athletes or the numbers on the t-shirts of football players, which are meant only for
identification.
b) Ordinal Data: (scale) Data that permit a ranking by order of magnitude, but where it is
not possible to determine how much greater one item is than another. Few
mathematical operations can be conducted on data of this type either. Ex:
respondents might rank from 1 to 5 five towns in which they might prefer to work; the
position of an athlete at the end of a race.
c) Interval Data: (scale) Data that show the distance between values in equal units but
have no true zero point. Addition and subtraction are meaningful, but multiplication
and division (ratios) are not. Ex: temperature in degrees Celsius, calendar years.
d) Ratio Data: (scale) Provides the most detailed information. The scale has a true zero
point, so all the mathematical operations can be conducted on this type of data.
Ex: income, age, weight, height, price, output, the time taken by athletes to finish
a 100 metres race etc.

PARAMETRIC Vs NON-PARAMETRIC:
Note:
Parametric analysis is based on interval or ratio data, and non-parametric analysis is based on
nominal or ordinal data.
Parametric statistics: Statistical procedures that use interval-scale or ratio-scale data and
assume that the population or sampling distribution is normal.
Non-parametric statistics: Statistical procedures that use nominal or ordinal data and
make no assumptions about the distribution of the population or sampling distribution.
Examples of selecting the appropriate statistical methods:

Scale of             Problems                  Statistical questions to        Possible tests of
measurement                                    be answered                     statistical significance

Interval or ratio    Compare actual and        Is the sample mean              Z-test if sample size is
scale                hypothetical values of    significantly different from    large (n > 30);
                     average                   the hypothesized population     t-test if sample size is
                                               mean?                           small (n < 30)

Ordinal scale        Compare actual and        Does the distribution           The Chi-square test (χ²)
                     expected values           differ from the expected?

Nominal scale        Identify sex of key       Is the number of female         The Chi-square test (χ²)
                     executives                executives equal to the
                                               number of male executives?

Nominal scale        For comparing more        Is there a significant          One-way and two-way
                     than two variables        difference between more than    ANOVA (analysis of
                                               two variables in terms of       variance test)
                                               mean?
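The executives question in the table can be illustrated with a chi-square calculation in Python. The observed counts (30 female, 50 male) are hypothetical; only the question comes from the text. The critical value 3.841 is the standard tabulated chi-square value for one degree of freedom at the 5% level.

```python
# Hypothetical counts of key executives; only the question comes from the text.
observed = {"female": 30, "male": 50}

# Under the hypothesis of equal numbers, each group is expected to hold half the total.
expected_each = sum(observed.values()) / len(observed)

# Chi-square statistic: sum of (observed - expected)^2 / expected over the categories.
chi_square = sum(
    (count - expected_each) ** 2 / expected_each for count in observed.values()
)

CRITICAL_5PCT_DF1 = 3.841  # tabulated critical value, df = 1, alpha = 0.05
reject_equal_numbers = chi_square > CRITICAL_5PCT_DF1

print(chi_square, reject_equal_numbers)  # 5.0 True
```

With these illustrative counts the statistic exceeds the critical value, so the hypothesis of equal numbers would be rejected at the 5% level.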

Hypothesis testing procedure:

1. Setting of the hypothesis:
a. The null hypothesis: a statement about the status quo; it asserts that any change
from what has been thought to be true is entirely due to random error.
b. The alternative hypothesis: a statement indicating the opposite of the null
hypothesis.
The purpose of hypothesis testing is to determine which of these two hypotheses
the sample evidence supports.
2. Level of significance and critical values: The level of significance determines the
chance of committing a type I error, i.e., the chance that the null hypothesis is true
but is rejected; it is generally represented by α. (A type II error occurs when the null
hypothesis is false but is accepted.) Generally α = 5% is taken when nothing is
mentioned. The relevant critical values, which depend on the tail of the test and
the test statistic adopted, are represented diagrammatically.
3. Test statistic: an appropriate statistical tool in the form of a Z-test, t-test, χ² test or
ANOVA is chosen.
4. Computation: the values obtained after editing are introduced into the formula of
the relevant test statistic.
5. Decision: based on the results obtained after computation, the null hypothesis is
accepted or rejected.
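The five steps can be traced in a minimal Python sketch of a two-tailed one-sample Z-test. All the figures (hypothesized mean 50, sample mean 52, known population standard deviation 6, sample size 36) are hypothetical, and treating the population standard deviation as known is an assumption. `NormalDist` is part of Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

# Step 1: hypotheses. H0: population mean = 50; H1: population mean != 50.
mu_0 = 50.0      # hypothesized population mean (hypothetical)

# Step 2: level of significance and two-tailed critical value.
alpha = 0.05
critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

# Step 3: test statistic. The sample is large (n > 30), so a Z-test is used.
x_bar = 52.0     # sample mean (hypothetical)
sigma = 6.0      # population standard deviation, assumed known
n = 36           # sample size (hypothetical)

# Step 4: computation.
standard_error = sigma / math.sqrt(n)
z = (x_bar - mu_0) / standard_error
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: decision.
reject_null = abs(z) > critical
print(round(z, 2), round(critical, 2), round(p_value, 4), reject_null)
```

With these illustrative figures z is 2.0, which lies beyond the critical value of about 1.96, so the null hypothesis would be rejected at the 5% level.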
