Final SPSS Record
3 12/7/2024 CONSTRUCTING BAR DIAGRAM, HISTOGRAM, PIE DIAGRAM
AIM:
● To open SPSS and study its menus and tools.
PROCEDURE:
● To start SPSS, choose IBM SPSS Statistics 20 from the Start menu. The Data Editor window will be
shown as below:
● To type in data that you already have, select the Type in data option and then click OK.
● To work on an existing data set in SPSS, click OK and select C drive → Program Files → IBM → SPSS →
Statistics → 20 → Samples → English, and then choose the data set you want.
● In the Data Editor window, the menu bar consists of the usual Windows menus plus some specific to
SPSS, such as Data, Transform, Analyze and Graphs. Details of the menus are given below:
● The menu bar provides easy access to most SPSS features. It consists of ten drop-down menus:
● To enter data in the SPSS Data Editor window, follow the steps given for the example below:
● Example: The Framingham Heart Study followed a cohort of 5209 men and women for over 25 years.
The study has been important in identifying risk factors associated with cardiovascular disease. The
following is a description of the variables we have selected from the study for our purpose:
Column Description of Variable
● Clicking the cell Numeric and then the button in the cell opens the Variable Type dialog box.
● As sex is a categorical variable, click the radio button for String.
● In our example there are three variables: sex (categorical), age (numeric), and systolic blood pressure
(numeric). You may provide a description of the variable listed in each row of the Variable View window
in the Label column. For example, to assign the label gender to the variable sex, enter gender in the
Label column corresponding to the variable sex.
● To define the possible values of the variable sex (M for male and F for female), click the
Values cell in the row for the variable, and then click the button in the cell.
● After assigning each value (male and female), click Add, and then click OK.
● In the same way, enter the remaining variables Age and Systolic. Age should be defined as a numeric
variable with two digits (to minimize the chances of transcription error) and Systolic as a numeric
variable with 3 digits.
● To delete a variable (row), select the row number that you wish to delete, click Edit, and then click
Clear. The selected variable will be deleted and all variables to the right of the deleted variable will
shift to the left. Alternatively, you can select the row and press the Delete key on your keyboard.
● To insert a new variable (row) between existing variables, click the row below the position where
you wish to enter the new variable, click Data on the menu bar, and then click Insert Variable from the
pull-down menu.
● To enter the values for the assigned variables, switch from the Variable View window to the Data
View window. The three variables gender, age, and systolic are represented as columns. Each row
represents a case or an observation.
● Enter the values for all cases on one variable (column) and then repeat the procedure for the
remaining columns.
● The Missing column specifies which data values are to be treated as missing for that particular
variable. In SPSS, there are two forms of missing values: system-missing and user-defined missing.
● System-missing values are those that SPSS automatically treats as missing. The most common form of
this type of value is a blank in the data file.
● User-defined missing values are those that the user specifically tells SPSS to treat as missing.
Rather than leaving a blank in the data file, numbers are often entered that are meant to represent
missing data.
● You need to tell SPSS explicitly that those numbers are to be treated as missing values; otherwise
it will treat them as valid.
● Click the cell under the Missing column for the variable (e.g. Systolic) to open the Missing
Values dialog box. Select the Discrete missing values radio button and enter the value 9999.
● With this definition of missing values for the variable systolic, SPSS will treat 9999 as a missing
value of the variable and will not include it in any computations involving systolic blood pressure.
● For string variables, Width refers to how many characters a value can hold.
● For numeric variables, Width refers to how many digits should be displayed. However, SPSS will often
override the specified width if it is insufficient. Columns refers to the width of the variable's column.
● You can use the Align column for setting the alignment for the values of the variables you are entering
in the data view.
● The most important setting for a variable is Measure. In the Measure column, choose the appropriate
level of measurement for that variable. There are three levels: Nominal, Ordinal and Scale.
● Nominal variables are qualitative variables without any order. Their categories may be coded as
numbers or as strings.
● Ordinal variables are qualitative variables with some order. These are categorical variables with
an intrinsic ranking.
● Scale variables are quantitative variables.
● After entering all the values for the variables in the Data View window, save the data file you have
created: go to File in the menu bar, choose Save As, give the name in the File name box and click the
Save button.
● To insert a new case (row) between cases that already exist in your data file, click the row below
the position where you wish to enter the new case, click Data on the menu bar, and then click Insert
Cases from the pull-down menu.
● To delete a case, click the case number that you wish to delete, click Edit from the menu, and then
click Clear. The selected case will be deleted and the rows below will shift upward.
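The effect of declaring 9999 as a user-defined missing value can be sketched outside SPSS. The following is a minimal Python illustration, not SPSS's own implementation; the readings and the helper name valid_values are hypothetical, with None standing in for a blank (system-missing) cell.

```python
from statistics import mean

USER_MISSING = 9999  # the discrete missing value declared for systolic


def valid_values(values, missing_code=USER_MISSING):
    """Keep only values that are neither blank (None) nor the missing code."""
    return [v for v in values if v is not None and v != missing_code]


# Hypothetical systolic readings: None is a blank cell, 9999 is the missing code.
systolic = [120, 135, 9999, 128, None, 142]

valid = valid_values(systolic)
print("Valid N:", len(valid))
print("Mean of valid values:", mean(valid))

# If 9999 were not declared as missing, it would distort the mean.
not_blank = [v for v in systolic if v is not None]
print("Mean with 9999 treated as valid:", mean(not_blank))
```

The last line shows why the declaration matters: a single undeclared 9999 pulls the mean far away from any plausible blood pressure.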
RESULT:
Entered data into the SPSS Data Editor window and learnt the various menus and tools available in SPSS.
EX.NO: 02
DATE: 01/07/2024
AIM :
To construct a univariate frequency table and a cross tabulation.
PROCEDURE:
Step 1: Go to Start menu -> All Programs -> IBM SPSS Statistics and select IBM SPSS Statistics 20.
Step 2: When the software opens, out of the two views (Data View and Variable View), choose Variable View.
Step 3: In Variable View, enter the fields you will have in your frequency table and define their type
and size.
Step 4: After entering the fields in Variable View, switch to Data View, where the fields are ready to
take inputs, and enter the values for each field.
Step 5: Go to Analyze in the menu bar -> Descriptive Statistics -> Frequencies. The Frequencies dialog
box will open. Select the fields you want in your output and click OK. The output opens.
Step 6: Go to Analyze -> Descriptive Statistics -> Crosstabs, select the fields that you want as rows
and the ones that you want as columns, and click OK. The output opens.
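What the Frequencies and Crosstabs procedures count can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how SPSS computes it; the data and the function names frequency_table and crosstab are hypothetical.

```python
from collections import Counter


def frequency_table(values):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = sum(counts.values())
    rows, cumulative = [], 0.0
    for category in sorted(counts):
        percent = 100.0 * counts[category] / total
        cumulative += percent
        rows.append((category, counts[category], round(percent, 1), round(cumulative, 1)))
    return rows


def crosstab(row_values, col_values):
    """Count how often each (row category, column category) pair occurs."""
    return Counter(zip(row_values, col_values))


# Hypothetical data for five cases.
gender = ["F", "M", "F", "F", "M"]
smoker = ["no", "yes", "no", "yes", "no"]

for row in frequency_table(gender):
    print(row)

table = crosstab(gender, smoker)
print("F and no:", table[("F", "no")])
print("M and yes:", table[("M", "yes")])
```

Each cell of the cross tabulation is simply the number of cases that fall into that combination of row and column categories.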
SAMPLE INPUT:
Sample Output:
RESULT:
Entered data in SPSS Data Editor Window.
EX.NO: 03
DATE: 08/07/2024
AIM :
To represent the data graphically using a bar chart, pie chart, histogram, multiple pie chart and
subdivided bar diagram.
PROCEDURE:
Step 1: Go to Start menu -> All Programs -> IBM SPSS Statistics and select IBM SPSS Statistics 20.
Step 2: When the software opens, out of the two views (Data View and Variable View), choose Variable View.
Step 3: In Variable View, enter the fields you will have in your frequency table and define their type
and size.
Step 4: After entering the fields in Variable View, switch to Data View, where the fields are ready to
take inputs, and enter the values for each field.
Step 5: Go to Analyze in the menu bar -> Descriptive Statistics -> Frequencies. The Frequencies dialog
box will open. Select the fields you want in your output and click OK. The output opens.
Step 6: In the Frequencies dialog box, click Charts on the right side. The Charts dialog box opens.
Select the type of chart you want (bar, pie or histogram), click Continue and then click OK.
Step 7: Go to Graphs -> Legacy Dialogs and choose Bar. The Bar Charts dialog box opens. Select Clustered
and click Define, or select Stacked and click Define.
Step 8: The Clustered or Stacked dialog box opens. Select one variable each for Category Axis, Define
Clusters by (or Define Stacks by), Rows and Columns, and click OK.
Sample Output:
Gender
                Frequency   Percent   Valid Percent   Cumulative Percent
Valid   F       3           60.0      60.0            60.0
        M       2           40.0      40.0            100.0
        Total   5           100.0     100.0
RESULT:
Pie chart, histogram and bar chart are constructed successfully using SPSS.
EX.NO: 04
DATE: 15/07/2024
PROCEDURE:
● Suppose that we would like to sort the data in the data file according to the age of the subjects
enrolled in the study. Select a data set that is already available in the SPSS sample files (e.g.
accidents.sav), choose Data from the menu bar and then select Sort Cases.
● To sort the subjects according to age, select agecat and move it to the Sort by box. You can
sort cases in ascending or descending order. If you select multiple sort variables, cases are sorted
by each variable within categories of the prior variable on the Sort by list.
● For example, if you select gender as the first sorting variable and age as the second sorting
variable, cases will be sorted by age classification within each gender category.
● The below window shows the sorted data for the variable agecat.
● The Split File feature can be accessed from the Data menu. This feature splits the data file into
separate groups for analysis based on the values of one or more grouping variables.
● If you select multiple grouping variables, cases are grouped by each variable within categories of the prior
variable.
● For example, if you select sex as the first grouping variable and agecat as the second grouping variable,
cases will be grouped by age category classification within each gender group.
● If you select the Organize output by groups radio button, all results from each statistical procedure
will be displayed separately for each split-file group (in this case gender).
● In the window below we can see that the variable gender is split into two different groups, male and
female.
● To merge files in SPSS, both files need to have a common indexing key (preferably numeric). This
would be a unique identifier for each case in your data set.
● The common indexing keys have to be sorted in ascending order for SPSS to be able to merge files, so
make sure both files are sorted in ascending order before trying to merge.
● Both data files should provide different data for the same set of variables. For example, you might record
the same information for customers in two different sales regions and maintain the data for each region in
separate files.
● Open one of the data files that you want to merge and that will be your active data set. The cases from this
file will appear first in the merged data file. Here we are going to open the file MERGE1 that is created
as shown below:
● As shown above, the key variable we have chosen is id, and we have sorted that variable in
ascending order, either by using the Sort Cases option or by right-clicking the variable name id and
selecting the Sort Ascending option.
● Choose Data from the menu bar, select Merge Files and then click Add Cases.
● In the Add Cases dialog box shown below, choose the An open dataset radio button if the second file
to be merged is already open; otherwise select the An external SPSS Statistics data file radio button
and browse for the second file, i.e. MERGE2.
● Then click Continue. We can see that three variables, gender, sex and sex1, are shown under the
Unpaired Variables list.
● Variables from the active dataset are identified with an asterisk (*). Variables from the other
dataset are identified with a plus sign (+). Variables that are identical in both files, namely id,
age and dob, are grouped under the Variables in New Active Dataset box.
● Variables from either data file that do not match a variable name in the other file are given in the
Unpaired Variables list. You can create pairs from unpaired variables and include them in the new
merged file.
● To pair variables that hold the same data under different names, select the variables (here, gender
and sex) and click the Pair button; the pair will be included in the Variables in New Active Dataset
box. The variable name from the active dataset (i.e. gender) is used as the variable name in the new
merged file.
● There is another variable, sex1, in the Unpaired Variables box. We created this variable as a copy
of the variable sex in the second data file, MERGE2, to check that the data from MERGE2 has been merged
into the new merged data file, MERGE1, in the same order. So, click the variable sex1, add it to the
Variables in New Active Dataset box by clicking the arrow button, and then click OK.
● Now we can see the new merged file being opened as shown below:
● To merge two data files that have the same cases but different variables, choose Data → Merge Files
→ Add Variables.
● The two data files we are going to merge are Bankloan1.sav and Bankloan2.sav. The active data
file is Bankloan1.sav, and we are going to merge the additional variables in the file Bankloan2.sav,
for the same persons, into the active data file.
● In our example, we take the variable id as the key variable and sort it in ascending order in both
files, as shown below:
● Now go to the Data menu, select Merge Files and then select Add Variables.
● In the Add Variables dialog box shown above, the second file Bankloan2.sav is already open, so it is
listed in the An open dataset box; select it and click Continue. If it is not already open, click
Browse and open it.
● A dialog box opens, as shown below, in which the key variable that holds the same data in both files
is listed under the Excluded Variables box. By default, this list contains any variable names from the
other dataset that duplicate variable names in the active dataset. Here the only such variable is id.
● The New Active Dataset box lists the variables to be included in the new, merged dataset.
By default, all unique variable names in both datasets are included in this list.
● Select the variable id, select the Match cases on key variables in sorted files option, select the
Both files provide cases radio button, and then click the arrow button. The selected variable will move
from the Excluded Variables box to the Key Variables box. Now click OK.
● SPSS will give a warning regarding sorted key variables, as shown below. Make sure both files were
sorted in ascending order on the key variable before merging.
● Click OK. The variables in the file Bankloan2.sav have now been merged into the file
Bankloan1.sav for the same cases.
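The idea behind Add Variables with a key variable can be sketched in Python as a match on id. This is a rough illustration under stated assumptions, not SPSS's merge routine: the variable names age and income and the function add_variables are hypothetical, and the dictionary lookup used here does not need pre-sorted input the way SPSS does.

```python
def add_variables(active, other, key="id"):
    """Match cases from two files on a key and combine their variables.

    Mirrors the "Both files provide cases" choice: a case found in only
    one file is kept, without the variables of the other file.
    """
    other_by_key = {case[key]: case for case in other}
    active_keys = {case[key] for case in active}
    merged = []
    for case in active:
        combined = dict(other_by_key.get(case[key], {}))
        combined.update(case)  # values from the active file take precedence
        merged.append(combined)
    # Append cases that exist only in the second file.
    merged.extend(dict(case) for case in other if case[key] not in active_keys)
    return sorted(merged, key=lambda case: case[key])


# Hypothetical contents of the two files.
bankloan1 = [{"id": 2, "age": 41}, {"id": 1, "age": 27}]
bankloan2 = [{"id": 1, "income": 31}, {"id": 2, "income": 100}, {"id": 3, "income": 55}]

merged = add_variables(bankloan1, bankloan2)
for case in merged:
    print(case)
```

Case 3 appears only in the second file, so it is kept with its income value and no age, which corresponds to a missing value in the merged SPSS file.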
RESULT: Sorting, splitting and merging data files using SPSS is done successfully
EX.NO: 05
DATE: 22/07/2024
AIM :
To calculate the measures of central tendency: mean, median and mode.
PROCEDURE:
Step 1: Go to Start menu -> All Programs -> IBM SPSS Statistics 20.
Step 2: Click Variable View and enter the variables as Group 1 and Group 2.
Step 3: Click Data View and enter the data (about 6 to 7 values for each variable).
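The three measures can be checked by hand or with a short Python sketch using the standard statistics module. The seven values below are hypothetical and are only there to illustrate the calculation.

```python
from statistics import mean, median, multimode

# Hypothetical values for Group 1.
group1 = [10, 12, 12, 14, 16, 18, 16]

print("Mean:", mean(group1))        # sum of the values divided by their count
print("Median:", median(group1))    # middle value of the sorted data
print("Modes:", multimode(group1))  # every value with the highest frequency

# When several modes exist, SPSS reports the smallest one.
print("Smallest mode:", min(multimode(group1)))
```

Here 12 and 16 both occur twice, so the data have two modes and the smallest one is 12.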
Sample output :
RESULT: The measures of central tendency (mean, median and mode) are calculated successfully using SPSS.
EX.NO: 06
DATE: 29/07/2024
SKEWNESS, KURTOSIS
AIM:
To calculate the measures of dispersion: standard deviation, quartiles, skewness and kurtosis.
PROCEDURE:
Step 3: Go to data view and give input for the fields you want to define
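The statistics named in the aim can be sketched in Python as follows. The skewness and kurtosis functions use the usual bias-adjusted sample formulas, which are assumed here to correspond to what SPSS reports; the data values and function names are hypothetical.

```python
from statistics import mean, quantiles, stdev


def skewness(data):
    """Bias-adjusted sample skewness."""
    n = len(data)
    m, s = mean(data), stdev(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)


def kurtosis(data):
    """Bias-adjusted sample excess kurtosis (0 for a normal distribution)."""
    n = len(data)
    m, s = mean(data), stdev(data)
    fourth = sum(((x - m) / s) ** 4 for x in data)
    return (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * fourth
            - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))


# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print("Standard deviation:", round(stdev(scores), 3))
# quantiles() defaults to the "exclusive" method, which interpolates at (n + 1) * p.
print("Quartiles:", quantiles(scores, n=4))
print("Skewness:", round(skewness(scores), 3))
print("Kurtosis:", round(kurtosis(scores), 3))
```

A positive skewness indicates a longer right tail, and a negative kurtosis indicates a flatter distribution than the normal curve.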
Sample output:
REGRESSION TREND
AIM:
Calculation of Regression Trend Line.
PROCEDURE:
Step 1: Go to Variable View and enter the fields you want.
Step 2: Go to Data View and give inputs for the fields.
Step 3: Go to Analyze -> Regression -> Linear. Move the field names to the right. Click Plots, move
*ZPRED to the Y axis and *ZRESID to the X axis, and then click Continue.
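The trend line fitted by linear regression is the least-squares line y = a + b·x. A minimal Python sketch of that calculation is given below; the data and the function name fit_line are hypothetical, and the sketch covers only the fitted line and its residuals, not the full SPSS output.

```python
def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: x is the period, y is the observed value.
period = [1, 2, 3, 4, 5]
value = [2, 4, 5, 4, 5]

intercept, slope = fit_line(period, value)
predicted = [intercept + slope * x for x in period]
residuals = [obs - pred for obs, pred in zip(value, predicted)]

print("Intercept:", round(intercept, 3))
print("Slope:", round(slope, 3))
print("Residuals:", [round(r, 3) for r in residuals])
```

The predicted values and residuals computed here are the unstandardized versions of the quantities plotted as *ZPRED and *ZRESID.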
Sample input :
Sample output :
EX.NO: 08
DATE: 12/08/2024
CORRELATION ANALYSIS
AIM:
To calculate how strongly the variables are related to each other using SPSS.
INTRODUCTION:
● Correlation is a statistical technique that can show whether and how strongly pairs of variables are
related. For example, height and weight are related; correlation can tell you just how much of the
variation in people's weights is related to their heights.
● Like all statistical techniques, correlation is only appropriate for certain kinds of data. Correlation
works for quantifiable data in which numbers are meaningful, usually quantities of some sort. It cannot
be used for purely categorical data, such as gender, brands purchased, or favorite color.
● The main result of a correlation is called the correlation coefficient (or "r"). It ranges from -1.0 to
+1.0. The closer r is to +1 or -1, the more closely the two variables are related.
● If 'r' is close to 0, it means there is little or no linear relationship between the variables. If 'r'
is positive, it means that as one variable gets larger the other gets larger. If 'r' is negative, it
means that as one gets larger, the other gets smaller (often called an "inverse" correlation).
● While correlation coefficients are normally reported as 'r', squaring them makes them easier to
understand. The square of the coefficient (or r square) is equal to the proportion of the variation in
one variable that is related to the variation in the other.
● A correlation coefficient of +1 indicates a perfect positive correlation. As variable X increases, variable
Y increases. As variable X decreases, variable Y decreases.
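The coefficient described above can be computed directly from its definition. The following Python sketch shows the calculation of r and r square; the height and weight values and the function name pearson_r are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical heights (cm) and weights (kg).
height = [150, 160, 170, 180, 190]
weight = [52, 58, 67, 70, 83]

r = pearson_r(height, weight)
print("r:", round(r, 3))
print("r square:", round(r ** 2, 3))
```

A value of r close to +1 for these data reflects that the taller subjects are also the heavier ones.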
PROCEDURE:
● Select the Height Correlation data set.
● Go to Analyze→Correlate→Bivariate…
● In the Bivariate Correlations dialog box, select the two variables Height and Length, move them to the
Variables box by clicking the arrow button, and click OK.
● The correlation coefficient we obtained is 0.651, which indicates a positive correlation (between 0
and +1). This shows that height increases as femur length increases, and vice versa.
● If the significance value is < 0.05, the correlation is statistically significant and we reject the
null hypothesis that there is no correlation between the variables.
● If the significance value is > 0.05, we fail to reject the null hypothesis. Here we obtained 0.042,
so the correlation is significant.
● Now open the data set Vehicle Correlation.sav.
● Again go to Analyze→Correlate→Bivariate…
● In the Bivariate Correlations dialog box, select the variables, move them to the Variables box and
click OK.
OUTPUT:
Correlations
                                                  Weight in    Fuel Efficiency in
                                                  1000lbs      Miles/gallon
Weight in 1000lbs         Pearson Correlation     1            -.839**
                          Sig. (2-tailed)                      .002
                          N                       10           10
Fuel Efficiency in        Pearson Correlation     -.839**      1
Miles/gallon              Sig. (2-tailed)         .002
                          N                       10           10
**. Correlation is significant at the 0.01 level (2-tailed).
RESULT:
Calculated how strongly the variables are related to each other for the given data using SPSS.
EX.NO: 09
DATE: 19/08/2024
AIM:
To perform tests of significance for single and two samples (1: test for mean and
standard deviation; 2: test for proportion).
PROCEDURE:
* In Variable View, enter a variable named machine. In Values, define 1 = machine a and 2 = machine b.
Enter two more fields named quality and count.
* Go to Analyze -> Descriptive Statistics -> Crosstabs.
* Move machine to Rows and quality to Columns.
* Click Statistics, enable Chi-square, click Continue, and then click OK.
Sample input
Sample Output:
Test for mean and standard deviation
RESULT:
Test of significance for single and two samples is learnt successfully.
EX.NO: 10
DATE: 02/09/2024
INTRODUCTION:
The independent-samples t-test (or independent t-test, for short) compares the means between two unrelated
groups on the same continuous, dependent variable.
For example, you could use an independent t-test to understand whether first year graduate salaries differed
based on gender (i.e., your dependent variable would be "first year graduate salaries" and your independent
variable would be "gender", which has two groups: "male" and "female").
When you choose to analyze your data using an independent t-test, part of the process involves checking to
make sure that the data you want to analyze can actually be analyzed using an independent t-test. You need to
do this because it is only appropriate to use an independent t-test if your data "passes" six assumptions that
are required for an independent t-test to give you a valid result.
Assumption #1: Your dependent variable should be measured on a continuous scale (i.e., it is measured at the
interval or ratio level).
Assumption #2: Your independent variable should consist of two categorical, independent groups.
Assumption #3: You should have independence of observations, which means that there is no relationship
between the observations in each group or between the groups themselves.
Assumption #4: There should be no significant outliers. Outliers are simply single data points within your
data that do not follow the usual pattern.
Assumption #5: Your dependent variable should be approximately normally distributed for each group of the
independent variable.
Assumption #6: There needs to be homogeneity of variances. You can test this assumption in SPSS Statistics
using Levene’s test for homogeneity of variances.
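The two rows that SPSS prints for this test (equal variances assumed and not assumed) correspond to the pooled and Welch forms of the t statistic. A minimal Python sketch of both is given below; the two samples and the function names are hypothetical, and Levene's test and the p-values are not computed here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(a, b):
    """t statistic and degrees of freedom when equal variances are assumed."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / std_error, n1 + n2 - 2


def welch_t(a, b):
    """t statistic and approximate degrees of freedom without that assumption."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a) / n1, variance(b) / n2
    t = (mean(a) - mean(b)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical measurements for two independent groups.
group1 = [1, 2, 3, 4, 5]
group2 = [2, 4, 6, 8, 10]

t_pooled, df_pooled = pooled_t(group1, group2)
t_welch, df_welch = welch_t(group1, group2)
print("Equal variances assumed:     t = %.3f, df = %d" % (t_pooled, df_pooled))
print("Equal variances not assumed: t = %.3f, df = %.2f" % (t_welch, df_welch))
```

In both forms the t statistic is the mean difference divided by its standard error; only the standard error and the degrees of freedom change between the two rows.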
SCENARIO:
To find whether there is any significant difference between the average sales based on the vehicle type
(Automobile or Truck).
PROCEDURE:
● Open the Car sales.sav dataset in SPSS using File -> Open -> Data.
● Go to Analyze -> Compare Means -> Independent-Samples T Test.
● In the Independent-Samples T Test dialog box shown above, put the variable Sales in thousands in the
Test Variable box.
● In the Grouping Variable box, put the variable Vehicle Type, then click the Define Groups button,
assign 0 in the Group 1 box and 1 in the Group 2 box, and then click Continue.
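Independent Samples Test: Sales in thousands
Levene's Test for Equality of Variances: F = 9.668, Sig. = .002
                               t        df      Sig.         Mean         Std. Error   95% CI      95% CI
                                                (2-tailed)   Difference   Difference   Lower       Upper
Equal variances assumed        -3.108   155     .002         -37.38794    12.03087     -61.15357   -13.62231
Equal variances not assumed    -2.331   47.70   .024         -37.38794    16.03748     -69.63868   -5.137208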
RESULT:
Independent Sample T-test in SPSS is learnt successfully.
EX.NO: 11
DATE: 09/09/2024
PROCEDURE:
● Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among
group means and their associated procedures (such as "variation" among and between groups), developed
by statistician and evolutionary biologist Ronald Fisher.
● The one-way analysis of variance (ANOVA) is used to determine whether there are any statistically
significant differences between the means of three or more independent (unrelated) groups.
Step 1: Create the ANOVA test using the Employee Data.sav dataset. Go to Analyze -> General Linear
Model -> Univariate.
Step 2: Put the variable salary (Current Salary) in the Dependent Variable box and the variable jobcat
(Employment Category) in the Fixed Factor box. Now click the Options button to customize the output.
Step 3: In the Univariate: Options dialog box, move the variable jobcat from the Factor(s) and
Factor Interactions box into the Display Means for box, select the check boxes Descriptive statistics,
Homogeneity tests and Estimates of effect size, click Continue, and then click OK in the
Univariate dialog box.
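The F ratio that the one-way ANOVA reports compares the variation between group means with the variation within groups. A minimal Python sketch of that calculation follows; the three groups and the function name one_way_anova are hypothetical, and the significance value is not computed.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_ratio, df_between, df_within, eta_squared


# Hypothetical values for three independent groups.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]

f_ratio, df_between, df_within, eta_squared = one_way_anova(groups)
print("F(%d, %d) = %.2f" % (df_between, df_within, f_ratio))
print("Eta squared = %.4f" % eta_squared)
```

For a design with a single factor, this eta squared equals the partial eta squared that the effect size option displays.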
OUTPUT:
Between-Subjects Factors
                              Value Label   N
Employment Category   1       Clerical      363
                      2       Custodial     27
                      3       Manager       84
Descriptive Statistics
Dependent Variable: Current Salary
Employment Category Mean Std. Deviation N
Tests the null hypothesis that the error variance of the dependent variable
is equal across groups.
a. Design: Intercept + jobcat
RESULT: Implemented ANOVA test in SPSS using the given dataset successfully.
EX.NO: 12
DATE: 23/09/2024
INTRODUCTION:
● The two-way ANOVA compares the mean differences between groups that have been split on two
independent variables (called factors).
● The primary purpose of a two-way ANOVA is to understand if there is an interaction between the two
independent variables on the dependent variable.
● For example, you could use a two-way ANOVA to understand whether there is an interaction between
gender and educational level on test anxiety amongst university students, where gender (males/females)
and education level (undergraduate/postgraduate) are your independent variables, and test anxiety is your
dependent variable.
Step 1: Open the Employee Data.sav data set. Go to Analyze -> General Linear Model -> Univariate.
Step 2: In the Univariate dialog box, select one dependent variable, e.g. salary (Current Salary), in
the Dependent Variable box and two independent variables, say educ and jobcat (Educational Level and
Employment Category), in the Fixed Factor box. Then click the Options button to customize the output.
Step 3: In the Univariate: Options dialog box, move the (Overall) option from the Factor(s) and Factor
Interactions box into the Display Means for box, select the check boxes Descriptive statistics,
Homogeneity tests and Estimates of effect size, and then click Continue.
Step 4: Click OK in the Univariate dialog box and the output will be displayed.
OUTPUT:
AIM:
To implement linear curve estimation in SPSS.
PROCEDURE:
i) Go to Analyze -> Click Regression ->Click Curve Estimation.
OUTPUT:
RESULT:
AIM:
To implement Multiple Response Analysis in SPSS using a dataset
PROCEDURE:
i) Go to Analyze -> Multiple Response -> Define Variable Sets.
ii) After the dialog box opens, select all the variables to be analyzed and
move them to the Variables in Set box. Then enter the counted value as 1 in
the box.
iii) Now enter the name of the set in the Name box and a label
in the Label box. Then click Add; the variables are now grouped under the
name socialmedia.
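What a multiple response set with counted value 1 tabulates can be sketched in Python as below. This is only an illustration of the counting; the variable names facebook, instagram and twitter, the data and the function name are hypothetical.

```python
def multiple_response_frequencies(cases, variables, counted_value=1):
    """Return {variable: (count, percent of responses, percent of cases)}."""
    counts = {
        name: sum(1 for case in cases if case.get(name) == counted_value)
        for name in variables
    }
    total_responses = sum(counts.values())
    # Only cases with at least one counted response enter the case base.
    respondents = sum(
        1 for case in cases
        if any(case.get(name) == counted_value for name in variables)
    )
    return {
        name: (n, 100.0 * n / total_responses, 100.0 * n / respondents)
        for name, n in counts.items()
    }


# Hypothetical answers: 1 means the respondent uses the platform, 0 means not.
cases = [
    {"facebook": 1, "instagram": 1, "twitter": 0},
    {"facebook": 1, "instagram": 0, "twitter": 0},
    {"facebook": 0, "instagram": 1, "twitter": 1},
    {"facebook": 1, "instagram": 1, "twitter": 0},
]
socialmedia = ["facebook", "instagram", "twitter"]

result = multiple_response_frequencies(cases, socialmedia)
for name, (n, pct_responses, pct_cases) in result.items():
    print(name, n, round(pct_responses, 1), round(pct_cases, 1))
```

Because one respondent can tick several options, the percent of cases can add up to more than 100, while the percent of responses always adds up to 100.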
SAMPLE INPUT:
SAMPLE OUTPUT:
RESULT:
The Multiple Response Analysis is implemented successfully.
EX.NO: 15
DATE: 07/10/2024
AIM:
To implement Chi Square Test in SPSS.
PROCEDURE:
i) Go to Analyze -> Descriptive Statistics -> Crosstabs.
ii) In the Crosstabs dialog box, move the variables into Rows and Columns.
iii) Then click Statistics, check that Chi-square is ticked, click Continue, and then click OK.
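The Pearson chi-square value in the Crosstabs output compares the observed cell counts with the counts expected if rows and columns were independent. A minimal Python sketch of that statistic follows; the 2 x 2 table and the function name chi_square are hypothetical, and neither the continuity correction nor the significance value is computed.

```python
def chi_square(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Hypothetical crosstab: rows are two groups, columns are two outcomes.
observed = [[10, 20],
            [20, 10]]

statistic, df = chi_square(observed)
print("Chi-square = %.3f, df = %d" % (statistic, df))
```

A large statistic relative to its degrees of freedom indicates that the row and column variables are unlikely to be independent.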
SAMPLE INPUT:
SAMPLE OUTPUT:
RESULT:
The Chi Square Test in SPSS is implemented successfully.