SPSS Exercise 1
INTRODUCTION
SPSS Statistics, commonly referred to as SPSS, is software for advanced statistical
analysis. The name originally stood for Statistical Package for the Social Sciences and
was later expanded to Statistical Product and Service Solutions. It is a suite of software
programs for analyzing data, originally data from the social sciences. It was first
released in 1968 and was acquired by IBM in 2009. It is widely used in various fields,
including social sciences, business, healthcare, and research.
This introduction to SPSS covers some of the key aspects and functions
of the software:
1. Data Input and Management:
SPSS allows you to input data from various sources, including
spreadsheets, databases, and text files.
You can create, edit, and organize your dataset using the SPSS Data
Editor.
2. Descriptive Statistics:
SPSS provides a range of tools for summarizing and describing
your data. You can calculate measures like mean, median,
standard deviation, and more.
3. Data Transformation:
You can manipulate your data by creating new variables,
recoding existing ones, and performing various data
transformations.
4. Data Visualization:
SPSS offers several graphical tools to create charts, histograms,
scatterplots, and other visualizations to help you understand
your data.
5. Hypothesis Testing:
SPSS supports a wide range of statistical tests, including t-tests,
chi-square tests, ANOVA, regression analysis, and non-parametric
tests.
6. Regression Analysis:
You can perform linear and logistic regression analyses to model
relationships between variables.
BY PAMHOR
EXERCISE 1
Demographic details:
(sample study)
Convert the collected data from words to numeric codes with the help of the filter in Excel.
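The same word-to-number coding can be sketched outside Excel. The Python example below is only an illustration, not part of the exercise: the function names `build_codebook` and `encode` and the sample answers are hypothetical, and the rule of numbering categories in order of first appearance is an assumption.

```python
def build_codebook(responses):
    """Assign 1, 2, 3, ... to each distinct answer in order of first appearance."""
    codebook = {}
    for answer in responses:
        key = answer.strip().lower()  # treat "Male " and "male" as the same answer
        if key not in codebook:
            codebook[key] = len(codebook) + 1
    return codebook


def encode(responses, codebook):
    """Replace each text answer with its numeric code."""
    return [codebook[answer.strip().lower()] for answer in responses]


# Hypothetical survey answers for a gender question
gender = ["Male", "Female", "female", "Male ", "Female"]
codebook = build_codebook(gender)
codes = encode(gender, codebook)
print(codebook)  # {'male': 1, 'female': 2}
print(codes)     # [1, 2, 2, 1, 2]
```

Keeping the codebook alongside the coded data makes it easy to enter the same codes later as value labels in SPSS.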
EXERCISE 2:
VARIABLE VIEW in SPSS
In SPSS, the Variable View is where you define and provide information about the variables in
your dataset. This view is essential for data management and analysis as it allows you to specify
the characteristics of each variable, such as its name, type, measurement level, and other
attributes. Here's an explanation of the key elements in the Variable View of an SPSS data sheet:
1. Name: This column contains the names of the variables in your dataset. Variable names
must begin with a letter and can include letters, numbers, and underscores. They should
be descriptive and meaningful to help you identify the variables easily.
2. Type: This column specifies the data type of each variable. Common types include:
Numeric: Used for continuous data, such as age, income, or test scores.
String: Used for text or categorical data, such as names, categories, or labels.
Date: Used for date and time variables.
3. Width: The width represents the maximum number of characters that a variable can
hold. For numeric variables, it specifies the total number of digits, including decimal
places. For string variables, it sets the maximum length of the string.
4. Decimals: This column determines the number of decimal places for numeric variables.
It's only applicable to numeric types and is typically set to zero for whole numbers.
5. Label: The label column allows you to provide a more detailed and descriptive name or
label for each variable. This can be especially helpful for documentation and clarity in
your analysis.
6. Values: This column is used for defining value labels for categorical or ordinal
variables. It allows you to assign meaningful labels to numerical codes, making the data
more understandable. For example, you can use a value label to replace "1" with "Male"
and "2" with "Female" for a gender variable.
7. Missing: You can specify how missing
values are coded for each variable. For example, you can indicate that missing values are
represented by a specific number or a dot (.) in your dataset.
8. Columns: This column sets the display width of the variable's column in the Data
View. It affects only how the data are shown on screen, not how the values are stored
(storage is controlled by Width).
9. Align: This column allows you to specify the alignment of the variable's
values in the data sheet. Options include left, center, and right alignment.
10. Measure: The measure column indicates the measurement level of the
variable. There are three measurement levels in SPSS:
Nominal: Used for categorical data without a specific order (e.g., colors, gender).
Ordinal: Used for ordered categorical data (e.g., education levels).
Scale: Used for continuous data measured on an interval or ratio scale (e.g., temperature,
income).
Date and time variables are defined in the Type column and are normally treated as Scale.
11. Role: This column specifies the role of the variable, such as "Input," "Target," or "Both."
It is relevant when you are performing certain analyses like regression or machine
learning.
Once you've defined your variables in the Variable View, you can switch to the Data View to
input or import your data. Properly defining variables in the Variable View is crucial for accurate
data analysis and ensures that SPSS understands the data's characteristics.
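To show how these attributes fit together, here is a minimal Python sketch of one row of the Variable View. It is only an illustrative model, not an SPSS API: the class `VariableDef`, its field names, the code 99 for missing answers, and the `display` method are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class VariableDef:
    """Hypothetical model of one row in the Variable View."""
    name: str
    var_type: str = "Numeric"
    width: int = 8
    decimals: int = 0
    label: str = ""
    values: dict = field(default_factory=dict)  # numeric code -> value label
    missing: tuple = ()                         # codes treated as missing
    measure: str = "Scale"

    def display(self, raw):
        """Return the value label for a code, or None if the code is missing."""
        if raw in self.missing:
            return None
        return self.values.get(raw, raw)


# Assumed coding: 1 = Male, 2 = Female, 99 = no answer
gender = VariableDef(
    name="gender",
    label="Gender of respondent",
    values={1: "Male", 2: "Female"},
    missing=(99,),
    measure="Nominal",
)
print(gender.display(1))   # Male
print(gender.display(99))  # None
```

The point of the sketch is that the stored data stay numeric while the labels and missing-value codes live in the variable definition.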
Variable view: -
EXERCISE 3:
DATA VIEW
In SPSS, the Data View is one of the two main views used for working with your
dataset. It provides a spreadsheet-like interface where you can view, enter,
edit, and manage the actual data in your dataset.
Computing variable
The "Compute Variable" function allows you to create new variables based on calculations or
transformations of existing variables in your dataset. You can use this feature to perform a wide
range of data transformations, such as mathematical operations, conditional statements, and
combining variables.
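The idea of computing a variable can be sketched in Python as follows. This is an illustration under assumptions, not SPSS itself: the helper `compute_variable`, the variable names `test1`, `test2`, `total`, and `passed`, and the pass mark of 50 are all hypothetical.

```python
def compute_variable(rows, new_name, formula):
    """Return copies of the rows with a new variable computed from existing ones."""
    return [{**row, new_name: formula(row)} for row in rows]


# Hypothetical dataset: two test scores per respondent
rows = [
    {"test1": 40, "test2": 35},
    {"test1": 22, "test2": 18},
]

# Mathematical operation: total = test1 + test2
rows = compute_variable(rows, "total", lambda r: r["test1"] + r["test2"])

# Conditional statement: passed = 1 if total is at least 50, else 0
rows = compute_variable(rows, "passed", lambda r: 1 if r["total"] >= 50 else 0)

for row in rows:
    print(row)
```

Each call adds one new column and leaves the original variables unchanged, which mirrors how a computed variable appears as an extra column in the Data View.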
Descriptive analysis:
Descriptive analysis in SPSS involves summarizing and exploring your dataset to understand
its basic characteristics.
Steps:-
1. Analyze
2. Descriptive Statistics
3. Frequencies
4. Select the Variable
5. Options: You can click the "Statistics" button in the "Frequencies" dialog to access
additional options. This allows you to request various statistics in addition to the basic
frequencies. Common statistics include mean, median, mode, minimum, and maximum for
the selected variable.
6. Chart (Optional): If you want to create a chart (e.g., bar chart), click the "Charts"
button and select the chart type.
7. Click OK.
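What the Frequencies procedure reports can be sketched in Python. This is only an approximation of the idea (counts and percentages per value), not the SPSS output format. The function name `frequency_table` and the coded education data are hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent) rows sorted by value."""
    counts = Counter(values)
    total = len(values)
    return [(value, count, round(100 * count / total, 1))
            for value, count in sorted(counts.items())]


# Hypothetical coded variable: 1 = school, 2 = graduate, 3 = postgraduate
education = [1, 2, 2, 3, 1, 2, 3, 3, 2, 1]
table = frequency_table(education)
for value, count, percent in table:
    print(f"{value}: n={count}, {percent}%")
```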
Mean
The term "mean" refers to the statistical measure of central tendency known as the arithmetic
mean or average. It is a way to summarize a set of numerical values by calculating the sum of all
the values and then dividing by the total number of values.
Median
The median is the middle value of a data set when the values are arranged in ascending or
descending order. If the number of values is even, it is the average of the two middle
values.
Mode
The most frequent number occurring in the data set is known as the mode.
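Standard deviation
The standard deviation measures how widely the values are spread around the mean. It is
the square root of the variance.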
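The four measures named above (mean, median, mode, and standard deviation) can be checked by hand with Python's standard `statistics` module. This is a sketch with hypothetical ages. It uses the sample standard deviation (dividing by n - 1), which is the form SPSS reports.

```python
import statistics

# Hypothetical ages of five respondents
ages = [21, 23, 23, 25, 28]

mean_age = statistics.mean(ages)      # sum of values / number of values
median_age = statistics.median(ages)  # middle value of the sorted data
mode_age = statistics.mode(ages)      # most frequent value
sd_age = statistics.stdev(ages)       # sample standard deviation (n - 1)

print(f"mean={mean_age}, median={median_age}, mode={mode_age}, sd={sd_age:.2f}")
```

Here the sum is 120, so the mean is 24. The middle of the sorted values is 23, which is also the most frequent value.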
Steps:-
1. Analyze
2. Descriptive Statistics
3. Frequencies
4. Select the Variable
5. Options: You can click the "Statistics" button in the "Frequencies" dialog to access
additional options. This allows you to request various statistics in addition to the basic
frequencies. Common statistics include mean, median, mode, minimum, and maximum for the
selected variable. For the standard deviation, tick "Std. deviation".
6. Chart (Optional): If you want to create a chart (e.g., bar chart), click the "Charts"
button and select the chart type.
7. Click OK.
Output:-
• For example, mean, median, mode and sum are selected here.
• Click ‘Continue’ and the window below appears.
• This is the end result of finding the mean, median and mode.
Exercise: 4
CHECKING NORMALITY OF DATA
Checking the normality of data means determining whether a data set is well modelled by a
normal distribution, that is, how plausible it is that the random variable underlying the
data set is normally distributed. The Shapiro-Wilk test is a statistical test for checking
the normality of data. If the p-value of the test is less than the significance level
(usually 0.05), the null hypothesis of normality is rejected, indicating that the data are
unlikely to be normally distributed. The assumption of normality means that you should make
sure your data roughly fit a bell curve shape before running certain statistical tests or
regression analyses.
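The decision rule for a normality test can be written as a short Python sketch. The function name `rejects_normality` and the example p-values are hypothetical. The p-value itself would come from the test output (for example, the Shapiro-Wilk test) and is not computed here.

```python
def rejects_normality(p_value, alpha=0.05):
    """Return True if the null hypothesis of normality is rejected."""
    return p_value < alpha


# Hypothetical p-values read from a test output
print(rejects_normality(0.012))  # True: data unlikely to be normal
print(rejects_normality(0.340))  # False: no evidence against normality
```

A result of False does not prove that the data are normal. It only means the test found no significant departure from normality.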
Steps:-
1. Analyze
2. Descriptive Statistics
3. Frequencies
4. Select the Variable
5. Click Charts
6. Select the chart type (e.g., Histograms)
7. Tick "Show normal curve on histogram" if needed
8. Click Continue
9. Click OK
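The comparison that a histogram with a normal curve makes visually can be sketched numerically in Python. This is an illustration under assumptions, not what SPSS computes internally: the function `histogram_vs_normal`, the sample scores, and the bin edges are hypothetical. The normal curve is fitted using the sample mean and sample standard deviation.

```python
from statistics import NormalDist, mean, stdev


def histogram_vs_normal(values, bin_edges):
    """Return observed and expected counts per bin under a fitted normal curve."""
    fitted = NormalDist(mean(values), stdev(values))
    n = len(values)
    observed, expected = [], []
    for low, high in zip(bin_edges, bin_edges[1:]):
        # Count values falling in the half-open bin [low, high)
        observed.append(sum(1 for v in values if low <= v < high))
        # Expected count = n * probability of the bin under the normal curve
        expected.append(n * (fitted.cdf(high) - fitted.cdf(low)))
    return observed, expected


# Hypothetical test scores and bin edges
scores = [52, 55, 58, 60, 61, 63, 64, 65, 66, 68, 70, 72, 75, 78]
edges = [50, 55, 60, 65, 70, 75, 80]
observed, expected = histogram_vs_normal(scores, edges)
for (low, high), obs, exp in zip(zip(edges, edges[1:]), observed, expected):
    print(f"{low}-{high}: observed={obs}, expected={exp:.1f}")
```

If the observed counts follow the expected counts closely, the histogram bars will sit close to the normal curve.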
Exercise: 5
Correlations
Correlation means connection. Correlation analysis studies the relationship or connection
between two or more variables. Two variables are said to be correlated if they vary in such
a way that changes in one variable accompany changes in the other.
Examples: the relationship between the height and weight of students in a class, and the
relationship between a family’s earnings and the amount it spends each month.
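To make the idea concrete, Pearson's correlation coefficient (one common measure of linear relationship) can be computed by hand in Python. This is a sketch: the function name `pearson_r` and the height and weight values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical heights (cm) and weights (kg) of five students
heights = [150, 155, 160, 165, 170]
weights = [45, 50, 54, 61, 65]
r = pearson_r(heights, weights)
print(f"r = {r:.3f}")
```

The coefficient lies between -1 and +1. Values near +1 mean the two variables rise together, values near -1 mean one falls as the other rises, and values near 0 mean there is little linear relationship.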
STEPS:-
1) Analyze
2) Correlate
3) Bivariate
4) Select the variables
5) Select the correlation coefficient
(e.g., Pearson)
6) Select the test of significance (e.g., two-tailed)
7) Click Options
8) Under Statistics, select "Means and standard deviations"
9) Click Continue
10) Now click OK
Output
STEPS:-
1) After getting the correlation, go to Graphs
2) Legacy Dialogs
3) Scatter/Dot
4) Select Simple Scatter (or another type, as
convenient)
5) Click Define