Lesson 09 Data Analysis I Descriptive Statistics
Data Preparation
• In research projects, data are collected from a variety of sources:
• Questionnaire surveys
• Interviews
• Observational data
• These data must be converted into a machine-readable, numeric
format.
• Preparing the data by converting them into numeric format is an essential
requirement before moving on to the analysis.
• Data preparation usually involves the following three steps:
1. Data coding
2. Data entry
3. Handling missing values
1. Data coding
Q.5.2. ‘The most attractive place (AP) in the University is the hostel’. To what
extent do you agree? (Select the most appropriate answer.)
Strongly agree 1
Somewhat agree 2
Agree 3
Neither agree nor disagree 4
Somewhat disagree 5
Disagree 6
Strongly disagree 7
Code (for a respondent who selects ‘Disagree’):
Q.5.2. AP – 6
You need to prepare a codebook (coding sheet) that records the code assigned to each response.
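A codebook like the one described above can be sketched as a simple lookup table. This is a minimal illustration, not a prescribed format: the names `CODEBOOK` and `encode` are hypothetical, and the labels and codes are copied from item Q.5.2.

```python
# Hypothetical codebook for item Q.5.2 (variable name AP).
# Labels and codes follow the response scale shown above.
CODEBOOK = {
    "AP": {
        "Strongly agree": 1,
        "Somewhat agree": 2,
        "Agree": 3,
        "Neither agree nor disagree": 4,
        "Somewhat disagree": 5,
        "Disagree": 6,
        "Strongly disagree": 7,
    }
}


def encode(variable, response):
    """Return the numeric code for a verbal response.

    Matching ignores letter case and surrounding spaces.
    """
    lookup = {label.lower(): code for label, code in CODEBOOK[variable].items()}
    return lookup[response.strip().lower()]


print(encode("AP", "Disagree"))  # 6
```

Keeping the mapping in one place means every coder applies the same code to the same response.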
2. Data entry
• Coded data can be entered into a spreadsheet, database, or directly into a statistical program
like SPSS/Minitab.
• Most statistical programs provide a data editor for entering data. However, these programs
store data in their own format (e.g., SPSS stores data as .sav files), which makes it difficult
to share that data with other statistical programs.
• Hence, it is often better to enter data into a spreadsheet such as Microsoft Excel, where they
can be reorganized as needed, shared across programs, and subsets of data can be extracted
for analysis.
• Each observation can be entered as one row in the spreadsheet and each measurement item
can be represented as one column. The entered data should be frequently checked for
accuracy, via occasional spot checks on a set of items or observations, during and after entry.
• Furthermore, while entering data, the coder should watch out for obvious evidence of bad
data, such as the respondent selecting the “strongly agree” response to all items irrespective
of content, including reverse-coded items. If so, such data can be entered but should be
excluded from subsequent analysis.
• Excel and Minitab data sheets.
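The bad-data check mentioned above (a respondent giving the same answer to every item, including reverse-coded ones) could be automated along these lines. This is only a sketch under assumed data: the respondent IDs, the item codes, and the function name `is_straight_lined` are made up for illustration.

```python
def is_straight_lined(responses):
    """Return True when every answered item carries the same code.

    Unanswered items are represented by None and are ignored.
    """
    answered = [r for r in responses if r is not None]
    return len(answered) > 1 and len(set(answered)) == 1


# One row per respondent, one entry per questionnaire item (made-up data).
rows = {
    "R001": [1, 1, 1, 1, 1],   # same code for all items: suspicious
    "R002": [2, 5, 3, 6, 4],
    "R003": [7, 7, None, 6, 7],
}

# Respondents to enter but exclude from subsequent analysis.
flagged = [rid for rid, resp in rows.items() if is_straight_lined(resp)]
print(flagged)  # ['R001']
```

A flag like this only marks rows for review; the decision to exclude them remains with the researcher.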
Application of SPSS
• SPSS stands for Statistical Package for the Social Sciences.
• SPSS is a software package used for statistical analyses, including data processing
(data entry and data editing), data manipulation (data merging), and data display in
the form of tables and graphs.
• When the SPSS program starts, it presents three views:
• Data view
• Variable view
• Output viewer
• Data view: each column contains the data for one variable, while each row contains
the data for one case.
• Variable view: there are 11 columns, each describing one characteristic of a variable:
name, type, width, decimals, label, values, missing, columns, align, measure and
role. Each row contains the characteristics of one variable.
• Output Viewer: output (tables, graphs, etc.) is stored in the Output Viewer
automatically after statistical commands are run on a dataset.
3. Missing values
Central Tendency
• Central tendency is an estimate of the centre of a distribution of values.
• There are three major estimates of central tendency: mean, median, and
mode.
• The mean is the simple average of all values in a given distribution (the sum of the
values divided by the number of values).
• The median is the middle value of a distribution when its values are arranged in order.
• The mode is the most frequently occurring value in a distribution of values.
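The three estimates defined above can be computed with Python's standard `statistics` module. The five coded responses below are made up for illustration.

```python
import statistics

# Made-up coded responses from five respondents (1 = strongly agree ... 7 = strongly disagree).
codes = [1, 7, 2, 1, 4]

mean = statistics.mean(codes)      # (1 + 7 + 2 + 1 + 4) / 5
median = statistics.median(codes)  # middle value of the sorted list 1, 1, 2, 4, 7
mode = statistics.mode(codes)      # most frequently occurring value

print(mean, median, mode)  # 3 2 1
```

The three estimates differ here because the single high value (7) pulls the mean upwards while leaving the median and mode unchanged.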
Frequency Distribution
1. Calculate Frequency and Central Tendency using SPSS
• To obtain the frequency distribution of a variable, follow these steps.
• From the menu bar: Analyze -> Descriptive Statistics -> Frequencies -> Variables (e.g., EDUCA) -> OK.
Education level
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Grade 1-5                 4       1.1             1.1                  1.1
        Grade 6-9                15       4.1             4.1                  5.2
        Grade 10 to O/L         128      35.0            35.0                 40.2
        Up to A/L               195      53.3            53.3                 93.4
        First degree             22       6.0             6.0                 99.5
        Higher degree             2       0.5             0.5                100.0
        Total                   366     100.0           100.0
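The Percent and Cumulative Percent columns of the table above can be reproduced from the frequencies alone. The sketch below is not SPSS output: it is a plain Python illustration that takes the counts from the table and assumes there are no missing cases, so Valid Percent equals Percent.

```python
# Frequencies copied from the Education level table above.
counts = {
    "Grade 1-5": 4,
    "Grade 6-9": 15,
    "Grade 10 to O/L": 128,
    "Up to A/L": 195,
    "First degree": 22,
    "Higher degree": 2,
}

total = sum(counts.values())  # number of valid cases

table = []
running = 0  # cumulative frequency so far
for level, n in counts.items():
    running += n
    percent = round(100 * n / total, 1)
    cumulative = round(100 * running / total, 1)
    table.append((level, n, percent, cumulative))

for level, n, percent, cumulative in table:
    print(f"{level:<16}{n:>6}{percent:>8.1f}{cumulative:>8.1f}")
```

The cumulative percentage is computed from the running frequency rather than by adding the rounded percentages, which avoids accumulating rounding error.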
Central Tendency using SPSS