Lesson 09 Data Analysis I Descriptive Statistics

The document discusses data preparation which involves collecting data from various sources, converting it to a numeric format, and handling missing values. It describes 3 key steps: 1) Coding data using a coding sheet, 2) Entering coded data into a spreadsheet or statistical software, and 3) Handling missing values through listwise deletion or imputation. Descriptive analysis is used to statistically describe and summarize data through frequency distributions, measures of central tendency (mean, median, mode), and measures of dispersion like standard deviation.

Uploaded by

lasith

Data Preparation

• In research projects, data are collected from a variety of sources:
• Questionnaire surveys
• Interviews
• Observational data
• These data must be converted into a machine-readable, numeric
format.
• Converting the data into numeric format is an essential requirement
before analysing them.
• Data preparation usually follows three steps:
1. Data coding
2. Data entry
3. Handling missing values
1. Data coding

• Coding is the initial process of converting data into numeric format.

• A coding sheet should be created to guide the coding process.
• It contains a detailed description of each variable in a research study, the items or
measures for that variable, the response scale for each item (i.e., whether it is
measured on a nominal, ordinal, interval, or ratio scale; whether such scale is a
five-point, seven-point, or some other type of scale), and how to code each value
into a numeric format.
• For instance, if you have a measurement item on a seven-point Likert scale
ranging from “strongly disagree” to “strongly agree”, you can code that item as
1 for strongly disagree, 4 for neutral, and 7 for strongly agree, with the
intermediate anchors in between.
• Nominal data such as industry type can be coded in numeric form using a coding
scheme such as: 1 for manufacturing, 2 for retailing, 3 for financial, 4 for
healthcare, and so forth. These numbers are only labels: nominal data can be
counted and summarised with frequencies or the mode, but arithmetic such as a
mean is not meaningful for them.
Coding

Q.5.2. ‘The most attractive place (AP) in the University is the hostel.’ To what
extent do you agree? (Select the most appropriate answer.)

Response                      Code
Strongly agree                1
Somewhat agree                2
Agree                         3
Neither agree nor disagree    4
Somewhat disagree             5
Disagree                      6
Strongly disagree             7

Code:
Q.5.2. AP – 6 (the respondent selected “Disagree”)
You need to prepare a codebook (coding sheet) before coding the responses.
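
A codebook for a single item can be sketched as a simple lookup table. The example below is only an illustration in Python: the codes come from the Q.5.2 scale above, while the helper name `code_response` and the three raw answers are assumptions made for the example.

```python
# Codebook for item Q.5.2 (variable AP), using the response codes listed above.
AP_CODES = {
    "Strongly agree": 1,
    "Somewhat agree": 2,
    "Agree": 3,
    "Neither agree nor disagree": 4,
    "Somewhat disagree": 5,
    "Disagree": 6,
    "Strongly disagree": 7,
}


def code_response(answer, codebook):
    """Return the numeric code for an answer, or None if it is blank or unknown."""
    if answer is None:
        return None
    # Match case-insensitively and ignore surrounding spaces.
    lookup = {label.lower(): code for label, code in codebook.items()}
    return lookup.get(answer.strip().lower())


# Hypothetical raw answers from three questionnaires (the last one is blank).
raw_answers = ["Disagree", "strongly agree", ""]
coded = [code_response(a, AP_CODES) for a in raw_answers]
print(coded)  # [6, 1, None]
```

Returning None for a blank answer keeps missing responses visible instead of silently assigning them a code.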
2. Data entry
• Coded data can be entered into a spreadsheet, database, or directly into a statistical program
like SPSS/Minitab.
• Most statistical programs provide a data editor for entering data. However, these programs
store data in their own format (e.g., SPSS stores data as .sav files), which makes it difficult
to share that data with other statistical programs.
• Hence, it is often better to enter data into a spreadsheet such as Microsoft Excel, where they
can be reorganized as needed, shared across programs, and subsets of data can be extracted
for analysis.
• Each observation can be entered as one row in the spreadsheet and each measurement item
can be represented as one column. The entered data should be frequently checked for
accuracy, via occasional spot checks on a set of items or observations, during and after entry.
• Furthermore, while entering data, the coder should watch out for obvious evidence of bad
data, such as the respondent selecting the “strongly agree” response to all items irrespective
of content, including reverse-coded items. If so, such data can be entered but should be
excluded from subsequent analysis.
• Excel and Minitab data sheets.
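
The check for obvious bad data described above (the same response to every item, including reverse-coded ones) can be partly automated. The sketch below is a hypothetical Python helper, not an SPSS or Excel feature; the respondent IDs and answers are invented for illustration, and flagged rows should still be reviewed by hand before they are excluded.

```python
def is_straight_lined(responses):
    """True if every answered item has the same code (blank items are ignored)."""
    answered = [r for r in responses if r is not None]
    return len(answered) > 1 and len(set(answered)) == 1


# Hypothetical data: one row per respondent, one column per item (codes 1-7).
rows = {
    "R01": [7, 7, 7, 7, 7],
    "R02": [6, 5, 2, 6, 7],
    "R03": [4, None, 4, 4, 4],
}

flagged = [rid for rid, answers in rows.items() if is_straight_lined(answers)]
print(flagged)  # ['R01', 'R03']
```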
Application of SPSS
• SPSS stands for Statistical Package for the Social Sciences.
• SPSS is a software package used for statistical analysis, including data processing
(data entry and data editing), data manipulation (data merging), and data display in
tables and graphs.
• When the SPSS program starts, it provides three views:
• Data view
• Variable view
• Output viewer
• Data view: each column contains the data for one variable, and each row contains
the data for one case.
• Variable view: there are 11 columns, one for each characteristic of a variable:
name, type, width, decimals, label, values, missing, columns, align, measure and
role. Each row describes one variable.
• Output Viewer: output (tables, graphs, etc.) is stored in the Output Viewer
automatically after a statistical command is run on a dataset.
3. Missing values

• Respondents may not answer certain questions.

• During data entry, some statistical programs automatically treat blank entries as
missing values, while others require a specific numeric value such as -1 or 999 to
be entered to denote a missing value.
• Missing values can be handled in two ways:
• 1. During data analysis, the default approach in most software programs is to drop the
entire observation containing even a single missing value, a technique called listwise
deletion.
• 2. Some software programs allow missing values to be replaced with an estimated
value, a process called imputation (assigning a value).
• For instance, if the missing value is one item in a multi-item scale, the assigned
value may be the average of the respondent’s responses to the remaining items on
that scale.
• If the missing value belongs to a single-item scale, many researchers use the
average of other respondents’ responses to that item as the imputed value.
If the missing value belongs to a multi-item scale

Academic Performance                                            Rank
During this time, my subject knowledge has been improved        6
My English language grew during the university period           5
My computer skills grew during the university period            -
During this time, my presentation skills have developed         5
If the missing value belongs to a single-item measure

Q. No.      RACE   GEN   AGE
Uva/B/01    2      2     38
Uva/B/02    2      1     37
Uva/B/03    1      2     58
Uva/B/04    1      2     -
Uva/B/05    1      2     40
Uva/B/06    1      1     54
Uva/B/07    1      1     48
Uva/B/08    1      2     35
Uva/B/09    1      2     47
Sab/R/01    1      2     -
Sab/R/02    1      2     53
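
Both ways of handling the two missing AGE values in the table above can be sketched in a few lines of Python. This is an illustration under simple assumptions (plain mean imputation, missing values stored as None), not a description of what any particular statistical package does internally.

```python
from statistics import mean

# (Q. No., RACE, GEN, AGE) rows from the table above; None marks a missing AGE.
rows = [
    ("Uva/B/01", 2, 2, 38), ("Uva/B/02", 2, 1, 37), ("Uva/B/03", 1, 2, 58),
    ("Uva/B/04", 1, 2, None), ("Uva/B/05", 1, 2, 40), ("Uva/B/06", 1, 1, 54),
    ("Uva/B/07", 1, 1, 48), ("Uva/B/08", 1, 2, 35), ("Uva/B/09", 1, 2, 47),
    ("Sab/R/01", 1, 2, None), ("Sab/R/02", 1, 2, 53),
]

# Option 1: listwise deletion drops every row that has any missing value.
complete_rows = [r for r in rows if None not in r]

# Option 2: mean imputation replaces a missing AGE with the mean of observed ages.
mean_age = mean(r[3] for r in rows if r[3] is not None)
imputed_rows = [r[:3] + (mean_age if r[3] is None else r[3],) for r in rows]

print(len(complete_rows))  # 9
print(round(mean_age, 2))  # 45.56
```

Listwise deletion keeps 9 of the 11 observations, while mean imputation keeps all 11 and fills each gap with about 45.56.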
Data Analysis
• Numeric data collected in a research project can be analysed
quantitatively using statistical tools, which are broadly classified as:
1. Descriptive analysis
2. Inferential analysis
• Descriptive analysis refers to statistically describing, aggregating,
averaging and presenting the constructs of interest or associations
between these constructs.
• Inferential analysis refers to the statistical testing of hypotheses
(theory testing) and is used to reach statistical conclusions.
Descriptive Analysis
• Descriptive analysis is based on univariate analysis, the analysis of a
single variable, which refers to a set of statistical techniques that
describe the general properties of one variable.
• Univariate statistics include:
1. Frequency distribution
2. Central tendency
3. Dispersion
Frequency distribution
• Frequency distribution of a variable is a summary of the frequency (or
percentages) of individual values or ranges of values for that variable.

Central tendency
• Central tendency is an estimate of the centre of a distribution of values.
• There are three major estimates of central tendency: mean, median, and
mode.
• The mean is the simple average of all values in a given distribution.
• The median is the middle value when the values in a distribution are sorted in
order; with an even number of values, it is the average of the two middle values.
• The mode is the most frequently occurring value in a distribution of values.
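
The three estimates can be computed with Python's standard `statistics` module. The scores below are hypothetical coded responses to a seven-point item, chosen only to illustrate the definitions above; they are not taken from the lesson's dataset.

```python
from statistics import mean, median, multimode

# Hypothetical coded responses to one seven-point item.
scores = [5, 6, 5, 4, 7, 5, 6]

print(round(mean(scores), 2))  # 5.43  (38 / 7)
print(median(scores))          # 5     (middle of 4, 5, 5, 5, 6, 6, 7)
print(multimode(scores))       # [5]   (5 occurs three times)
```

`multimode` returns a list because a distribution can have more than one mode.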
Frequency Distribution
1. Calculate Frequency and Central Tendency using SPSS
• To obtain a frequency table, follow these steps:

• From the menu bar -> Analyze -> Descriptive Statistics -> Frequencies -> Variables (e.g., EDUCA) -> OK.
Education level
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Grade 1-5             4         1.1         1.1               1.1
        Grade 6-9            15         4.1         4.1               5.2
        Grade 10 to O/L     128        35.0        35.0              40.2
        Up to A/L           195        53.3        53.3              93.4
        First degree         22         6.0         6.0              99.5
        Higher degree         2          .5          .5             100.0
        Total               366       100.0       100.0
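
The percent and cumulative percent columns can be reproduced from the frequencies alone. The Python sketch below uses the counts from the table above; it is a hand calculation for illustration, not SPSS output. Since there are no missing cases, the valid percent equals the percent.

```python
# Frequencies for the education-level variable, as reported in the table above.
counts = {
    "Grade 1-5": 4, "Grade 6-9": 15, "Grade 10 to O/L": 128,
    "Up to A/L": 195, "First degree": 22, "Higher degree": 2,
}

total = sum(counts.values())
running = 0
table = []
for level, n in counts.items():
    running += n
    # Cumulative percent uses the running count, not the rounded percents.
    percent = round(100 * n / total, 1)
    cumulative = round(100 * running / total, 1)
    table.append((level, n, percent, cumulative))

for row in table:
    print(row)
```

Working from the running count explains why the cumulative figure for "Up to A/L" is 93.4 rather than 40.2 + 53.3 = 93.5.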
Central Tendency using SPSS

• To obtain the mean, variance, minimum and maximum values, follow these steps:
• From the menu bar -> Analyze -> Descriptive Statistics -> Descriptives ->
Variables (e.g., EDUCA) -> Options -> Mean, Variance, Minimum
and Maximum -> Continue -> OK.
Descriptive Statistics
                      N     Minimum   Maximum   Mean   Variance
1.c.                  366   2         7         4.61   .541
Valid N (listwise)    366
Dispersion
• Dispersion refers to the way values are spread around the central tendency, that is,
how tightly or widely the values are clustered around the mean.
• A common measure of dispersion is the standard deviation (SD), which is the
square root of the variance.
• To obtain the mean, SD, variance, minimum and maximum values, follow these
steps:
• From the menu bar -> Analyze -> Descriptive Statistics -> Descriptives -> Variables
(e.g., EDUCA) -> Options -> Mean, SD, Variance, Minimum and Maximum ->
Continue -> OK.
Descriptive Statistics
                      N     Mean   Std. Deviation
1.c.                  366   4.61   .735
Valid N (listwise)    366
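
The standard deviation is the square root of the variance, so the two outputs above should agree: the square root of .541 is roughly .7355, which matches the reported .735 up to rounding of the variance. The Python sketch below checks this and shows the same relationship on a small hypothetical set of scores, assuming the sample (n - 1) form of the variance.

```python
import math
from statistics import stdev, variance

# Hypothetical coded responses (not the lesson's dataset).
scores = [5, 6, 5, 4, 7, 5, 6]

sample_var = variance(scores)  # sum of squared deviations divided by n - 1
sample_sd = stdev(scores)      # square root of the sample variance
print(math.isclose(sample_sd, math.sqrt(sample_var)))  # True

# The reported variance (.541) and SD (.735) agree up to rounding.
sd_from_reported_variance = math.sqrt(0.541)
print(abs(sd_from_reported_variance - 0.735) < 0.001)  # True
```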
