
Research Methodology

Data Analysis

April 2023
Data Analysis
• The data, after collection, have to be processed and analysed in
accordance with the outline laid down for the purpose at the time of
developing the research plan.
• This is essential for ensuring that we have all relevant data for
making the contemplated comparisons and analyses.
• Processing implies editing, coding, classification and tabulation of
collected data so that they are amenable to analysis.
• Data analysis is the process of inspecting, transforming, and
modeling data with the goal of discovering useful information,
suggesting conclusions, and supporting decision making.
Data Processing and Presentation
Editing
–Editing of data is a process of examining the collected raw
data (especially in surveys) to detect errors and omissions and
to correct them where possible.
–Editing is done to ensure that the data are accurate,
consistent with other facts gathered, uniformly entered, as
complete as possible, and well arranged to facilitate coding
and tabulation.
Coding
– Coding refers to the process of assigning numerals or
other symbols to answers so that responses can be put
into a limited number of categories or classes.

– Such classes/categories must possess the characteristic


of exhaustiveness and also that of mutual exclusively
(which means that a specific answer can be placed in
one and only one cell in a given category set).
Coding
– Another rule to be observed is that of uni-
dimensionality, which means that every class is
defined in terms of only one concept (it measures a
single attribute).
– Coding decisions should usually be taken at the
design stage of data collection.
Classification
– Classification is the process of arranging data in groups or
classes on the basis of common characteristics.
– Classification can be one of the following two types,
depending upon the nature of the phenomenon involved:
—) Classification According to Attributes
—) Classification According to Class-intervals
- Classification According to Attributes: data are classified on the
basis of common characteristics which can be descriptive (such
as literacy/educational level, sex, etc.).
- Descriptive characteristics refer to qualitative phenomena
which cannot be measured quantitatively.
- Such data are known as statistics of attributes, and their
classification is said to be classification according to attributes.
Classification
– Classification According to Class-intervals: Unlike descriptive
characteristics, numerical characteristics refer to
quantitative phenomena which can be measured in
statistical units.
– Numerical data relating to income, production, age,
weight, etc., come under this category.
– Such data are known as statistics of variables and are
classified on the basis of class intervals.
Tabulation
– When a mass of data (‘big data’) has been assembled, it
becomes necessary for the researcher to arrange it in
some kind of concise and logical order. This procedure is
referred to as tabulation.
– Thus, tabulation is the process of summarizing raw
data and displaying it in compact form (i.e.,
in the form of statistical tables) for further analysis.
– In a broader sense, tabulation is an orderly
arrangement of data in columns and rows.
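
To make this concrete, here is a minimal Python sketch that tabulates
raw categorical responses into a simple frequency table (the variable
name and response values are hypothetical examples, not from the
slides):

```python
# Minimal sketch: tabulating raw responses into a frequency table.
# The variable name and response values are hypothetical examples.
from collections import Counter

education_level = ["primary", "secondary", "tertiary", "secondary",
                   "primary", "secondary", "tertiary", "secondary"]

table = Counter(education_level)        # counts per category

print(f"{'Education level':<16}{'Frequency':>10}")
for category, count in table.most_common():
    print(f"{category:<16}{count:>10}")
```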
Data Analysis

• Two types of data analysis:
—) Quantitative
—) Qualitative
Data Analysis

• Quantitative Data Analysis
–The two most commonly used quantitative data analysis
methods are:
i. Descriptive Statistics
ii. Inferential Statistics
–Descriptive statistics are used to describe, summarize, or
explain a given set of data and to find patterns in it.
–Inferential statistics are used to infer characteristics of the
population from the sample.
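
As an illustration of the difference, the sketch below (standard
library only) first summarizes a sample and then makes an inference
about the population mean. It reuses the sample scores that appear
later in these slides and assumes a simple normal-approximation
interval rather than a t-based one, purely for illustration:

```python
# Minimal sketch contrasting descriptive and inferential statistics.
# Uses a normal-approximation 95% interval as a simplifying assumption.
import math
import statistics

scores = [32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45]

# Descriptive: summarize the sample itself
mean = statistics.mean(scores)      # 38.0
stdev = statistics.stdev(scores)    # sample standard deviation

# Inferential: estimate the population mean from the sample
margin = 1.96 * stdev / math.sqrt(len(scores))
print(f"sample mean = {mean:.1f}, sample stdev = {stdev:.2f}")
print(f"approx. 95% CI for the population mean: "
      f"({mean - margin:.1f}, {mean + margin:.1f})")
```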
Descriptive analysis
• Descriptive analysis describes the sample data through several
characteristics of a particular array of measurements.

Measures of Central Tendency
• Mean = ∑Xi / n
• Median = middle value in an array of sequentially ordered values
(with repeated values kept as they occur)
• Mode = most frequently occurring value

Measures of Dispersion
• Distributions describe the composition of the data set under
consideration and define the nature of the data at hand. Typical
measures include:
• Mean Deviation = ∑|Xi – Mean| / n (average absolute deviation of
individual scores in the distribution from the mean)
• Variance = ∑(Xi – Mean)² / (n – 1) for sample data, or
∑(Xi – Mean)² / N for a population (mean of squared deviations)
• Standard Deviation = √Variance (square root of the variance)
Note: A measure of central tendency (particularly the mean) should be
reported together with a measure of dispersion (particularly the
standard deviation) to describe data.

Measures of Association
• Chi-square tests – measure goodness of fit (not causality) using
frequency distributions
• Correlation coefficients – show the type and strength of the
relationship between items
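
A minimal Python sketch (standard library only, using the sample
scores from the examples that follow) computes the measures listed
above:

```python
# Minimal sketch of the measures listed above, standard library only.
import statistics

scores = [32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45]

# Central tendency
mean = statistics.mean(scores)      # 38.0
median = statistics.median(scores)  # 38.5
mode = statistics.mode(scores)      # 39

# Dispersion
mean_dev = sum(abs(x - mean) for x in scores) / len(scores)
variance = statistics.variance(scores)   # divides by n - 1 (sample data)
stdev = statistics.stdev(scores)         # square root of the variance

print(mean, median, mode)
print(round(mean_dev, 2), round(variance, 2), round(stdev, 2))
```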
Descriptive Statistics

Measures of Central Tendency
– Measures of central tendency (or statistical averages) tell
the point about which the items have a tendency to cluster.
– The three most frequently used measures of central
tendency are:
—) Mean
—) Median
—) Mode
Mean
– Also known as the arithmetic average, the mean is the most
commonly used and accepted measure of central tendency.
– It should be used in the case of interval or ratio data.
– If the scores for a given sample distribution are:
32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45
– the mean of the distribution will be:
[32+32+35+36+37+38+38+39+39+39+40+40+42+45]/14 = 532/14 = 38
Median
– The median is defined as the middle value in an ordered
arrangement of observations.
– The median is often used to summarize the location of a
distribution.
– Further, the median can be used with ordinal, interval,
or ratio measurements.
– If the scores for a given sample distribution are:
32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45
the median will be the average of the two middle values:
(38 + 39)/2 = 38.5
Mode
– The mode can be defined as the most frequently occurring
value in a group of observations.
– If the scores for a given sample distribution are:
32, 32, 35, 36, 37, 38, 38, 39, 39, 39, 40, 40, 42, 45
then the mode would be 39, because a score of 39 occurs
three times, more often than any other score.
– The mode is a very good measure for ascertaining the
location of a distribution in the case of nominal data.
Measures of Dispersion
• An average fails to give any idea about the scatter of the
values of the items of a variable in the series around the
true value of the average.
• A measure of dispersion is therefore needed to specify the
spread of the distribution.
– The most frequently used statistics measuring
dispersion are:
—) Range
—) Mean deviation
—) Standard deviation
—) Variance
Range
– The range is the difference between the highest and lowest
values.
– It is based solely on the extreme values and thus cannot
truly reveal the body of the measurements.
Variance
– The variance is the average squared deviation of a random
variable from its mean.
– Squaring makes the deviations appear much larger than they
actually are; to remove this effect, they are ‘un-squared’
again.
– Taking the square root of the variance in this way gives
the standard deviation.
Standard Deviation
– The standard deviation provides the best measure of
dispersion for interval/ratio measurements and is the most
widely used statistical measure after the mean.
– The standard deviation for a sample is calculated by the
following formula:
s = √[ ∑(Xi – x̄)² / (n – 1) ]
Example:
– The owner of a café is interested in how much people
spend at her café.
– She examined 10 randomly selected customers and noted
the following amounts:
44, 50, 38, 96, 42, 47, 40, 39, 46, 50
– She calculated the mean by adding the observations and
dividing by 10 to get:
x̄ = 49.2
Example:
– Below is the table for computing the standard deviation:

X        (X – 49.2)    (X – 49.2)²
44          -5.2           27.04
50           0.8            0.64
38         -11.2          125.44
96          46.8         2190.24
42          -7.2           51.84
47          -2.2            4.84
40          -9.2           84.64
39         -10.2          104.04
46          -3.2           10.24
50           0.8            0.64
Total                    2599.60
Example:
– Hence, the sample variance is 2599.6 / 9 ≈ 289 and the
standard deviation is √289 ≈ 17.
– The mean for this example was 49.2 and the standard
deviation was about 17.
– We have:
49.2 - 17 = 32.2
49.2 + 17 = 66.2
– What this means is that most of the customers probably
spend between 32.20 and 66.20.
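
As a quick check, a minimal Python sketch (standard library only)
reproduces the numbers in this example:

```python
# Minimal sketch reproducing the café example with the standard library.
import statistics

spend = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

mean = statistics.mean(spend)                      # 49.2
sum_sq_dev = sum((x - mean) ** 2 for x in spend)   # 2599.6 (column total)
variance = statistics.variance(spend)              # sum_sq_dev / (n - 1) ≈ 289
stdev = statistics.stdev(spend)                    # ≈ 17.0

print(mean, round(sum_sq_dev, 1), round(variance, 1), round(stdev, 1))
print(mean - stdev, mean + stdev)                  # roughly 32.2 to 66.2
```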
Skewness

[Figure: three distribution curves illustrating skewness. In a
negatively skewed distribution the mean lies below the median, which
lies below the mode; in a symmetric (not skewed, normal) distribution
the mean, median and mode coincide; in a positively skewed
distribution the mode lies below the median, which lies below the
mean.]
Research Software
– Statistical software packages are specialized computer
programs for statistical and econometric analysis.
– The most commonly used statistical packages in research
include:
—) SPSS
—) STATA
—) SAS
—) Minitab
—) NVivo
Reading Assignment
Inferential statistics
Qualitative Analysis
Quality Assurance
• Research quality is the measure of how facts, problems
and objectives are established, how the researcher
explicates the facts, and how conclusions are drawn from
the facts.
• The research design should ensure the quality of all
these processes.
Research quality by epistemological viewpoint

Validity (construct)
– Positivist: Do the measures correspond closely to reality?
– Relativist: Have a sufficient number of perspectives been included?
– Constructionist: Does the study clearly gain access to the
experiences of those in the research setting?

Reliability
– Positivist: Will the measures yield the same results on other
occasions?
– Relativist: Will similar observations be reached by other observers?
– Constructionist: Is there transparency in how sense was made from
the raw data?

Generalizability
– Positivist: To what extent does the study confirm or contradict
existing findings in the same field?
– Relativist: What is the probability that patterns observed in the
sample will be repeated in the general population?
– Constructionist: Do the concepts and constructs derived from this
study have any relevance to other settings?
Research rigor (quality): how to ensure it (e.g., in case study research)
Measures of quality (rigor) in case study research:

Construct validity
• Data triangulation (both source and collection strategy), for example:
– Documents and archival data (internal reports, minutes or
archives, annual reports, press or other secondary articles)
– Interview data (original interviews carried out by the researchers)
• Review of transcripts and drafts by peers and key informants
• Presenting cases systematically (from research question to
conclusion and vice versa, including citation of specific evidence
sources)
• Indication of data collection circumstances (explanation of how
access to the data was achieved), as well as checking the planned
circumstances of data collection against the actual procedure
(reflection on how the actual course of the research affected the
data collection process)
• Explanation of data analysis (clarification of the data analysis
procedure)
• Reflexivity (account of how the researcher's stance and the
research process have shaped the fact finding and the outcome of
the research)

Internal validity
• Theoretical framework (preferably explicitly derived) to be used
as the basis of the research process
• Pattern matching (matching the patterns identified to those
reported by other authors)
• Theory triangulation (different theoretical lenses and bodies of
literature used, either as the research framework or as a means to
interpret the findings)
• Use of different analysis techniques, such as logic models,
explanation building and rival explanations, in addition to
pattern matching

External validity
• Theory as a basis of generalization (particularly for single case
studies)
• Multiple case studies for replication
• Cross-case analysis (as applicable)
• Rationale for case study selection (explanation of why this case
study was appropriate in view of the research question)
• Details on the case study context (explanation of, e.g., the
industry context)
• Comparison with other literature/studies

Reliability
• Case study protocol (report of how the entire case study was
conducted)
• Case study database (with all available documents, interview
transcripts, archival data, etc.)
• Maintaining a chain of evidence
