0% found this document useful (0 votes)

3 views22 pages

Unit 4

Bivariate analysis is a statistical method that examines the relationship between two variables, utilizing techniques like scatterplots and regression analysis to uncover patterns and correlations. Percentage tables and contingency tables are tools used to display joint distributions of categorical variables, aiding in the understanding of customer satisfaction and other relationships. Additionally, box plots and scatter plots are employed to visualize data distributions and linear relationships, respectively, facilitating informed decision-making in various fields.

Uploaded by

sobiyaparasuraman

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

3 views22 pages

Unit 4

Uploaded by

sobiyaparasuraman

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

You are on page 1/ 22

UNIT-4

Bivariate Analysis
4.1. Relationship between Two variables:
• Bivariate analysis is a statistical method used to explore and understand the
relationship between two different variables in a dataset.
• It involves examining how changes in one variable are associated with
changes in another, providing insights into potential connections,
correlations, or dependencies between them.
• Bivariate analysis commonly employs techniques such as scatterplots,
correlation coefficients, and regression analysis to assess the strength,
direction, and significance of the relationship between the two variables.
• This type of analysis is invaluable for uncovering patterns, making predictions,
and informing decision-making across various fields, including economics, social
sciences, and scientific research.

4.2. Percentage Tables:

Percentage Tables:
In bivariate analysis, a percentage table is used to display the joint distribution of two
categorical variables. It shows the relative frequencies or percentages of observations
that fall into each combination of categories from the two variables. Here's an example
of how a percentage table in bivariate analysis might look:

Customer Satisfaction
Product Category Satisfied Dissatisfied Total

Electronics 45% 15% 60%

Clothing 25% 10% 35%
Books 20% 30% 50%
Appliances 10% 20% 30%
Total 100% 75% 175%
In this example:

• The rows represent the product categories (Electronics, Clothing, Books, and
Appliances).
• The columns re present customer satisfaction levels (Satisfied and Dissatis fied).
• The numbers in the table represent the percentage of customers falling into each category
combination.
• For instance, 45% of customers who bought Electronics were satisfied, while 15%
were dissatisfied. This bivariate percentage table provides insights into how
customer satisfaction varies across different product categories.
• It helps you understand which categories have the highest or lowest levels of
customer satisfaction, enabling you to make data-driven decisions for improving
customer experience and product offerings.

Key Observations:

Product Categories: The rows of the table represent different product categories
(Electronics, Clothing, Books, and Appliances).

Customer Satisfaction: The columns represent customer satisfaction levels, categorized as

Satisfied and Dissatisfied.

Individual Category Percentages:

• In the Electronics category, 45% of customers were satisfied, while 15% were
dissatisfied.
• In the Clothing category, 25% of customers were satisfied, while 10% were
dissatisfied.
• In the Books category, 20% of customers were satisfied, while 30% were
dissatisfied.
• In the Appliances category, 10% of customers were satisfied, while 20% were
dissatisfied.

Analysis:

High Satisfaction, Low Dissatisfaction: Electronics has the highest satisfaction rate
(45%) among customers, with a relatively low dissatisfaction rate (15%). This suggests
that customers who purchased Electronics products tend to be more satisfied.

Mixed Satisfaction: Clothing has a moderate satisfaction rate (25%) and a low
dissatisfaction rate (10%). It indicates that customers buying Clothing products have a
reasonably positive experience, but there is room for improvement.

Low Satisfaction, High Dissatisfaction: Books have a lower satisfaction rate (20%) and
a higher dissatisfaction rate (30%). This category requires attention to address the
higher dissatisfaction level.
Low Satisfaction, Moderate Dissatisfaction: Appliances have the lowest satisfaction rate
(10%) among all categories, with a moderate dissatisfaction rate (20%). This category
needs significant improvement to enhance customer satisfaction.

Overall Insights:

Electronics appears to be the most popular category with high customer satisfaction.

Books and Appliances need improvement in customer satisfaction, with Appliances being the
most challenging category.

Clothing has a reasonably positive satisfaction level but could benefit from some
enhancements.

This analysis of the bivariate percentage table helps you identify areas where customer
satisfaction is strong and where improvements are needed, allowing you to make informed
decisions to enhance the customer experience and product offerings in your e-commerce
platform.

Contingency table:
A contingency table displays frequencies for combinations of two categorical variables.
Analysts also refer to contingency tables as cross tabulation and two- way tables.

Contingency tables classify outcomes for one variable in rows and the other in columns. The
values at the row and column intersections are frequencies for each unique combination of the
two variables.
Use contingency tables to understand the relationship between categorical variables. For
example, is there a relationship between gender (male/female) and type of computer
(Mac/PC)
Example Contingency Table:
The contingency table example below displays computer sales at our fictional store.
Specifically, it describes sales frequencies by the customer’s gender and the type of
computer purchased. It is a two-way table (2 X 2). I cover the naming conventions at the
end.
In this contingency table, columns represent computer types and rows represent genders. Cell
values are frequencies for each combination of gender and computer type. Totals are in the
margins. Notice the grand total in the bottom-right margin.
t a glance, it’s easy to see how two-way tables both organize your data and paint a picture of
the results. You can easily see the frequencies for all possible subset combinations along with
totals for males, females, PCs, and Macs.

For example, 66 males bought PCs while females bought 87 Macs. Furthermore, there are
117 females, 106 males, 96 PC sales, 127 Mac sales, and a grand total of 223 observations
in the study.

Creating Contingency table:

We showed 3 ways for creating APA style contingency tables in SPSS:

• CROSSTABS is easiest. You can create several tables in one go but they
require quite some (manual) editing.
• CTABLES runs the desired table straight away and could be run from the menu.
However, it creates one table at the time and requires an additional license.
• TABLES also comes up with the right table straight away. However, the
syntax is difficult and there's no menu.

Running Simple Contingency Tables in SPSS

• The fastest way to create the table we just saw is running one line of SPSS
syntax:
crosstabs educ by marit.
• The categories of the first variable -educ or education- become rows in the table. The values
of the second variable -marit or marital status- become the columns. As a rule of thumb,
the columns hold the groups you want to compare on whatever goes into the
rows.
In this case, we're comparing marital status
groups on education level.
• If we had wanted to do the reverse -compare education level groups on
marital status- we'd swap the rows and columns and run
crosstabs marit by educ.

CROSSTABS with Column Percentages:

Right. So how do our groups compare on education level? It's hard to see from our first
table because each marital status group has a different n or sample size. We may see more
of a pattern if we add column percentages. The syntax below does just that.

APA Contingency Tables from CTABLES

• The table we just created can be run in one go with CTABLES. However, this only
works if you've a license for the custom tables module. You can check this by
running
• show license.
• If the resulting table includes “Custom Tables”, try the syntax below.
Now drag and drop statistics right underneath Marital status and just close the window.

Let's now make 2 text replacements:

• use n instead of “Count”

• use % instead of “% within Marital status”
• Using the Ctrl + H short key in the output viewer should work. Or -much nicer- use
the OUTPUT MODIFY syntax below if you're on SPSS version 22 or higher.
4.3. Analyzing Contingency Table:
Marginal Distribution

These distributions represent the frequency distribution of one categorical variable

without regard for other variables. Unsurprisingly, you can find these distributions in the
margins of a contingency table.

The following marginal distribution examples correspond to the blue highlights.

For example, the marginal distribution of gender without considering computer type is
the following:

Males: 106

Females: 117

Alternatively, the marginal distribution of computer types is the following:

PC: 96

Mac: 127

Conditional Distribution

For these distributions, you specify the value for one of the variables in the contingency
table and then assess the distribution of frequencies for the other variable. In other
words, you condition the frequency distribution for one variable by setting a value of the
other variable. That might sound complicated, but it’s easy using a contingency table.
Just look across one row or down one column.

The following conditional distribution examples correspond to the green highlights.

For example, the conditional distribution of computer type for females is the

following:
PC: 30

Mac: 87

Alternatively, the conditional distribution of gender for Macs is the following: Males: 40

Females: 87

Finding Relationships in a Contingency Table:

• If there is a relationship between ice cream preference and gender, we’d expect
the conditional distribution of flavors in the two gender rows to differ. From the
contingency table, females are more likely to prefer chocolate (37 vs. 21), while males
prefer vanilla (32 vs. 12).

• Both genders have an equal preference for strawberry. Overall, the two- way table
suggests that males and females have different ice cream preferences.

Row and column percentages help you draw conclusions when you have unequal
numbers in the margins. In the contingency table example above, more women than men
prefer chocolate, but how do we know that’s not due to the sample having more women?
Use percentages to adjust for unequal group sizes. Percentages are relative frequencies.
Learn more about Relative Frequencies and their Distributions.

Here’s how to calculate row and column percentages in a two-way table.

Row Percentage: Take a cell value and divide by the cell’s row total. Column

Percentage: Take a cell value and divide by the cell’s column total.

For example, the row percentage of females who prefer chocolate is simply the number of
observations in the Female/Chocolate cell divided by the row total for women: 37 / 66 =
56%.

The column percentage for the same cell is the frequency of the Female/Chocolate cell
divided by the column total for chocolate: 37 / 58 = 63.8%.

4.4. Handling Several Batches:

Box plot:
A box plot, also known as a box plot, box plots or box-and-whisker plot, is a
standardized way of displaying the distribution of a data set based on its five- number
summary of data points: the “minimum,” first quartile [Q1], median, third quartile [Q3]
and “maximum.”

Boxplots in SPSS:

Our data file contains a sample of N = 238 people who were examined in a driving
simulator. Participants were presented with 5 dangerous situations to which they had to
respond as fast as possible. The data hold their reaction times and some other variables.
Our boxplot shows some potential outliers as well as extreme values.

Outliers:
Outliers are data points in a dataset that significantly deviate from the majority of other
data points. They are observations that fall well outside the typical range or distribution
of values and may be unusually high or low in comparison to the rest of the data.
Outliers can potentially distort statistical analyses and should be
carefully examined to determine whether they represent genuine extreme values or are
the result of errors or anomalies in the data collection process.

Removing outliers using spss in histogram:

Let's first try to identify outliers by running some quick histograms over our 5
reaction time variables. Doing so from SPSS’ menu is discussed in Creating
Histograms in SPSS. A faster option, though, is running the syntax below.

Let's take a good look at the first of our 5 histograms shown below. The “normal
range” for this variable seems to run from 500 through 1500 ms. It seems that 3 scores
lie outside this range. So are these outliers? Honestly,

different analysts will make different decisions here.

Personally, I'd settle for only excluding the score ≥ 2000 ms. So what's the right way to
do so? And what about the other variables?

The right way to exclude outliers from data analysis is to specify them as user missing
values. So for reaction time 1 (reac01), running

excludes reaction times of 2000 ms and higher from all data analyses and editing. So
what about the other 4 variables?

The histograms for reac02 and reac03 don't show any outliers.

For reac04, we see some low outliers as well as a high outlier. We can find which values
these are in the bottom and top of its frequency distribution as shown below. We can
exclude all of these outliers in one go by running

missing values reac04 (lo thru 400,2085).

By the way: “lo thru 400” means the lowest value in this variable (its minimum) through 400
ms.

For reac05, we see several low and high outliers. The obvious thing to do seems to run
something like

missing values reac05 (lo thru 400,2000 thru hi).

But sadly, this only triggers the following error: The problem here is that you

can't specify a low and a high

range of missing values in SPSS.

Since this is what you typically need to do, this is one of the biggest stupidities still
found in SPSS today. A workaround for this problem is to

RECODE the entire low range into some huge value such as 999999999; add the

original values to a value label for this value;

specify only a high range of missing values that includes 999999999.

The syntax below does just that and reruns our histograms to check if all outliers have
indeed been correctly excluded.

Result

First off, note that none of our 5 histograms show any outliers anymore; they're now
excluded from all data analysis and editing. Also note the bottom of the frequency table
for reac05 shown below.

Even though we had to recode some values, we can still report precisely which outliers we
excluded for this variable due to our value label.

Before proceeding to boxplots, I'd like to mention 2 worst practices for excluding outliers:

removing outliers by changing them into system missing values. After doing so, we no
longer know which outliers we excluded. Also, we're clueless why values are system
missing as they don't have any value labels.

removing entire cases -often respondents- because they have 1(+) outliers. Such cases
typically have mostly “normal” data values that we can use just fine for analyzing other (sets
of) variables.

Sadly, supervisors sometimes force their students to take this road anyway. If so, SELECT
IF permanently removes entire cases from your data.

4.5. Scatterplots and Resistant

Lines: Linear Relationship:
In bivariate analysis, which involves examining two variables together, one of the key
aspects to explore is linear relationships. A linear relationship between two variables
means that as one variable changes, the other tends to change in a straight-line fashion.

In a scatter plot, a linear relationship would appear as a pattern where the data points
roughly follow a straight line. If, as one variable increases, the other also tends to
increase, it's a positive linear relationship. Conversely, if one variable increases as the
other decreases, it's a negative linear relationship.

The strength of a linear relationship can be quantified using correlation coefficients like
Pearson's correlation coefficient. A value close to 1 indicates a strong positive linear
relationship, close to -1 indicates a strong negative linear relationship, and close to 0
suggests little to no linear relationship.

Identifying and understanding linear relationships in bivariate analysis is essential as it

helps in making predictions, drawing insights, and can serve as the basis for building
linear regression models to make quantitative predictions based on these relationships.

A scatter plot is a graphical representation used to visualize the relationship between two
sets of data points. It consists of points on a two-dimensional plane, where each point
represents the values of two variables. By plotting these points, you can quickly identify
patterns, correlations, or trends in the data. Scatter plots are commonly used in data
analysis to determine whether there is a connection between the variables and to
visualize how changes in one variable relate to changes in another, making them a
valuable tool in understanding and interpreting data.
Scatter plot in SPSS:
The starting assumption is that you have already imported your data into SPSS, and that
you’re looking at something like the data set below.

This hypothetical data set contains the mid-term and final exam scores of 40 students in
a Statistics course (the first 20 records are displayed above). We want to create a scatter
plot to visualize the relationship between the two sets of scores.
Create a Scatter Plot
Click Graphs -> Legacy Dialogs -> Scatter/Dot as illustrated below. Note, however that
in newer versions of SPSS, you will need to click Graphs > Scatter/Dot.

We recommend that you click the Reset button to clear any previous settings.

The next step is to move your variables into the X Axis and Y Axis boxes. If your data
is from a regression study, select your predictor/independent variable, and use the arrow
button to move it to the X Axis. Then select the criterion/dependent variable, and use
the arrow button to move it to the Y Axis box. If your data is from a simple correlation
study, as is the case with our example, there may not be obvious predictor/independent
and criterion/dependent variables. In these cases, it doesn’t matter which variable you
move to the X Axis box and which variable you move to the Y Axis box.

It is a good idea to give your scatter plot a title. To do this, click the Titles button, add
your title, and click Continue to return to the “Simple Scatterplot” dialog box.
Select Simple Scatter and then click Define.This brings up the “Simple Scatterplot”
dialog box below.
Select OK to generate your scatter plot.

Each student in our hypothetical study is represented by one dot on our scatter plot.
Each dot’s position on the X (horizontal) axis represents a student’s mid- term exam
score, and its position on the Y (vertical) axis represents their final exam score.

After we create a scatter plot, we need to review it to assess the nature of the
relationship – if any – that exists between our variables. The scatter plot above indicates
that there is a positive linear relationship between mid-term and final exam scores in
this Statistics course. In other words, lower mid-term exam scores tend to be associated
with lower final exam scores, and higher mid-term exam scores tend to be associated
with higher final exam scores. It is important to note that a scatter plot cannot prove a
causal relationship between variables. Therefore, we cannot conclude that high mid-
term exam scores cause high final exam scores on the basis of the scatter plot above.
Some of the other relationships between variables that your scatter plot may indicate are
illustrated below.

Fitting resistant lines:

The eda_rline function fits a robust line through a bivariate dataset. It does so by first
breaking the data into three roughly equal sized batches following the x-axis variable. It
then uses the batches’ median values to compute the slope and intercept.

However, the function doesn’t stop there. After fitting the inital line, the function fits
another line (following the aforementioned methodology) to the model’s residuals. If
the slope is not close to zero, the residual slope is added to the original fitted model
creating an updated model. This iteration is repeated until the residual slope is close to
zero or until the residual slope changes in sign (at which point the average of the last
two iterated slopes is used in the final fit).
An example of the iteration follows using data from Velleman et. al’s book. The dataset,
neoplasms, consists of breast cancer mortality rates for regions with varying mean
annual temperatures.

Note that the 16 record dataset is not divisible by three thus forcing an extra point in the
middle batch (had the remainder of the division by three been two, then each extra point
would have been added to the tail-end batches).

Next, we compute the medians for each batch

The two end medians are used to compute the slope as:

where the subscripts r and l reference the median values for the right-most and left-
most batches.

Once the slope is computed, the intercept can be computed as follows:

where (x,y),l,m,r are the median x and y values for each batch. This line is then used to
compute the first set of residuals. A line is then fitted to the residuals following the same
procedure outlined above.

he initial model slope and intercept are 3.412 and -69.877 respectively and the
residual’s slope and intercept are -0.873 and 41.451 respectively. The residual slope is
then added to the first computed slope and the process is again repeated thus
generating the following tweaked slope and updated residuals:
The updated slope is now 3.412 + (-0.873) = 2.539. The iteration continues until the
slope residuals stabilize. The final line for this working example is,

where the final slope and intercept are 2.89 and -45.91, respectively.

Fitting resistant line in SPSS:

Data Preparation: Start by opening your dataset in SPSS. Make sure it contains the
variables you want to analyze.

Analyze Menu: Go to the "Analyze" menu at the top of the SPSS window.
Regression: Under the "Analyze" menu, select "Regression."
Linear: In the "Regression" submenu, choose "Linear."
Dependent and Independent Variables: In the "Linear Regression" dialog box, specify
your dependent variable (the one you want to predict) and your independent variable(s)
(the ones you want to use to predict the dependent variable).
Options: Click the "Options" button in the "Linear Regression" dialog box.
Resistant Line: In the "Options" dialog, you might find an option related to "Resistant
Line." This option is typically used for resistant line regression techniques, such as
robust regression. You may need to check a box or select a specific method depending
on your analysis requirements.
OK: After setting your options, click "OK" to close the "Options" dialog.
Run: Back in the "Linear Regression" dialog box, click "OK" to run the analysis.

Agriculture Research Proposal
100% (7)
Agriculture Research Proposal
10 pages
Ccs341 Data Warehousing
75% (8)
Ccs341 Data Warehousing
2 pages
Analyzing Survey Data With Minitab
No ratings yet
Analyzing Survey Data With Minitab
3 pages
SMDM Business-Report Arvind Soni-2
0% (1)
SMDM Business-Report Arvind Soni-2
15 pages
Multivariate Analysis – The Simplest Guide in the Universe: Bite-Size Stats, #6
From Everand
Multivariate Analysis – The Simplest Guide in the Universe: Bite-Size Stats, #6
Lee Baker
No ratings yet
EDA Unit-4
No ratings yet
EDA Unit-4
24 pages
EDA Unit IV
No ratings yet
EDA Unit IV
17 pages
EDA Unit IV
No ratings yet
EDA Unit IV
17 pages
Module 5 Bivariate Analysis
No ratings yet
Module 5 Bivariate Analysis
81 pages
Data Analysis and Reporting
No ratings yet
Data Analysis and Reporting
13 pages
Cross Tabs Are Frequently Used Because
No ratings yet
Cross Tabs Are Frequently Used Because
7 pages
Unit IV 2
No ratings yet
Unit IV 2
24 pages
Data Analysis: DR - Ajay Pandit March 16, 2010
No ratings yet
Data Analysis: DR - Ajay Pandit March 16, 2010
35 pages
BBA 4009 Competitive Analysis
No ratings yet
BBA 4009 Competitive Analysis
20 pages
Unit Iv
No ratings yet
Unit Iv
24 pages
MPP 813 - Session 7 - NBK - Categorical Data Analysis
No ratings yet
MPP 813 - Session 7 - NBK - Categorical Data Analysis
52 pages
Cross Tabs
No ratings yet
Cross Tabs
2 pages
Business Research Methods Zikmund CHP 20
No ratings yet
Business Research Methods Zikmund CHP 20
32 pages
Case Study Questions Unit 3
No ratings yet
Case Study Questions Unit 3
9 pages
Malhotra Data Analysis 1
No ratings yet
Malhotra Data Analysis 1
17 pages
Lecture 5 Stats
No ratings yet
Lecture 5 Stats
11 pages
Contingency Tables
No ratings yet
Contingency Tables
24 pages
Business Report Project - Sheetal - SMDM
100% (1)
Business Report Project - Sheetal - SMDM
20 pages
Example Probability II
No ratings yet
Example Probability II
1 page
Bivariate Analysis
No ratings yet
Bivariate Analysis
40 pages
Unit - 2 BRM PDF
No ratings yet
Unit - 2 BRM PDF
9 pages
Data Analysis
No ratings yet
Data Analysis
16 pages
9 Data Analysis
No ratings yet
9 Data Analysis
43 pages
Session 1, 2
No ratings yet
Session 1, 2
100 pages
Data Analysis
No ratings yet
Data Analysis
13 pages
Understanding, Organising and Presentation of Data (Ppt1)
No ratings yet
Understanding, Organising and Presentation of Data (Ppt1)
81 pages
MGS2150 Lecture2 Notes II
No ratings yet
MGS2150 Lecture2 Notes II
36 pages
MSPD-MQ-STAT-S6 Cross Table - 2 Categorical Variables KHI
No ratings yet
MSPD-MQ-STAT-S6 Cross Table - 2 Categorical Variables KHI
33 pages
BRM - 9e - PPT - CH - 20 Student
No ratings yet
BRM - 9e - PPT - CH - 20 Student
14 pages
02-Organizing and Visualizing Data
No ratings yet
02-Organizing and Visualizing Data
44 pages
Exponential
No ratings yet
Exponential
54 pages
Quantitative Methods
No ratings yet
Quantitative Methods
28 pages
Homework 7
No ratings yet
Homework 7
5 pages
Notes - EDA-Unit4
No ratings yet
Notes - EDA-Unit4
29 pages
Presenting Tables and Charts
No ratings yet
Presenting Tables and Charts
31 pages
6.3 Teacher Edition
No ratings yet
6.3 Teacher Edition
12 pages
MCA Mathematical Foundation For Computer Application 11
No ratings yet
MCA Mathematical Foundation For Computer Application 11
17 pages
Data Exploration and Visualization Unit 2
100% (1)
Data Exploration and Visualization Unit 2
19 pages
Bohrnstedt y Knoke - Statistics For Social Data Analysis - 4,8,9
No ratings yet
Bohrnstedt y Knoke - Statistics For Social Data Analysis - 4,8,9
67 pages
Course Hero Reference 4
No ratings yet
Course Hero Reference 4
11 pages
Topic 7 Frequency Distribution, CrossTabulation & T-Test
No ratings yet
Topic 7 Frequency Distribution, CrossTabulation & T-Test
20 pages
Chapter 1: Exploring Data: Section 1.1
No ratings yet
Chapter 1: Exploring Data: Section 1.1
15 pages
Presentation of Data
No ratings yet
Presentation of Data
49 pages
SMDM Project - Business Report - R
100% (2)
SMDM Project - Business Report - R
21 pages
Unit 4
No ratings yet
Unit 4
63 pages
Harish Kumar Tsaini SMDM
No ratings yet
Harish Kumar Tsaini SMDM
16 pages
Age Why - Do - You - Like - The - Shop Crosstabulation
No ratings yet
Age Why - Do - You - Like - The - Shop Crosstabulation
11 pages
Marketing Research
No ratings yet
Marketing Research
36 pages
Basic Data Analysis
No ratings yet
Basic Data Analysis
33 pages
Unit 4
No ratings yet
Unit 4
19 pages
Exploratory Data Analysis
No ratings yet
Exploratory Data Analysis
45 pages
Crosstabulation A Tutorial by Russell K. Schutt To Accompany
No ratings yet
Crosstabulation A Tutorial by Russell K. Schutt To Accompany
8 pages
Quantitative Methods in Management
No ratings yet
Quantitative Methods in Management
131 pages
BRM Short Terms
No ratings yet
BRM Short Terms
10 pages
W02-Measurment Scales and Data Understanding
No ratings yet
W02-Measurment Scales and Data Understanding
28 pages
Unit 4 Dev 2023 Final
No ratings yet
Unit 4 Dev 2023 Final
23 pages
Pivot Tables: Easy Excel Essentials, #1
From Everand
Pivot Tables: Easy Excel Essentials, #1
M.L. Humphrey
No ratings yet
Unit 2 - Descriptive Analytics
No ratings yet
Unit 2 - Descriptive Analytics
85 pages
Ex 3
No ratings yet
Ex 3
2 pages
Ex 2
No ratings yet
Ex 2
3 pages
SDN Lab Manual
No ratings yet
SDN Lab Manual
11 pages
Unit 2
No ratings yet
Unit 2
40 pages
Machine Learning - AL3451 - Notes - Unit 1 - Introduction To Machine Learning
No ratings yet
Machine Learning - AL3451 - Notes - Unit 1 - Introduction To Machine Learning
29 pages
Summary ML
No ratings yet
Summary ML
1 page
IV AI-DS AD3491 FDSA Unit3
No ratings yet
IV AI-DS AD3491 FDSA Unit3
35 pages
IV AI-DS AD3491 FDSA Unit4
No ratings yet
IV AI-DS AD3491 FDSA Unit4
30 pages
Dev Unit 5
No ratings yet
Dev Unit 5
22 pages
Dev Unit 4
No ratings yet
Dev Unit 4
40 pages
CCS341 Data Warehousing
No ratings yet
CCS341 Data Warehousing
7 pages
Dev Unit 3
No ratings yet
Dev Unit 3
36 pages
Machine Learning - AL3451 - Notes - Unit 1 - Introduction To Machine Learning
No ratings yet
Machine Learning - AL3451 - Notes - Unit 1 - Introduction To Machine Learning
29 pages
Machine Learning - AL3451 - Notes - Unit 5 - Design and Analysis of Machine Learning Experiments
No ratings yet
Machine Learning - AL3451 - Notes - Unit 5 - Design and Analysis of Machine Learning Experiments
33 pages
Security Lab Manual1
No ratings yet
Security Lab Manual1
64 pages
DAA Assignment 01
No ratings yet
DAA Assignment 01
19 pages
DS Unit1b
No ratings yet
DS Unit1b
22 pages
BC 304 PI Past Papers
No ratings yet
BC 304 PI Past Papers
29 pages
3.1 Fluid Mosaic Model
No ratings yet
3.1 Fluid Mosaic Model
35 pages
Education 101 PPT, Jun 2023
No ratings yet
Education 101 PPT, Jun 2023
19 pages
24/07/08 TP-Link W8920G 108M ADSL and ADSL2+ Set Up Guide
No ratings yet
24/07/08 TP-Link W8920G 108M ADSL and ADSL2+ Set Up Guide
7 pages
KNR 182 Unit 2 (1) (Mariah Wilson)
No ratings yet
KNR 182 Unit 2 (1) (Mariah Wilson)
5 pages
Lesson 13
No ratings yet
Lesson 13
8 pages
Mineral Resource Conflict Jharkhand
No ratings yet
Mineral Resource Conflict Jharkhand
20 pages
IPE 4715 Material Handling and Maintenance
No ratings yet
IPE 4715 Material Handling and Maintenance
2 pages
Invisisil Op2131sd Uv Cure Optical Bonding Silicone Tds
No ratings yet
Invisisil Op2131sd Uv Cure Optical Bonding Silicone Tds
5 pages
Elp Group1 Lesson 4 7
No ratings yet
Elp Group1 Lesson 4 7
17 pages
Units 15 16 - Exercises
No ratings yet
Units 15 16 - Exercises
4 pages
Global Marketing
No ratings yet
Global Marketing
9 pages
COM 20250108 Aria - Pub.provider@
No ratings yet
COM 20250108 Aria - Pub.provider@
40 pages
Brochure Antech Type C
No ratings yet
Brochure Antech Type C
2 pages
PRATAP DOME Holding
No ratings yet
PRATAP DOME Holding
1 page
Oral Cancer Essay
No ratings yet
Oral Cancer Essay
3 pages
The Necklace
No ratings yet
The Necklace
3 pages
Session 2: Personal Professional Development: Pre-Test
No ratings yet
Session 2: Personal Professional Development: Pre-Test
9 pages
Homework: Level 3 BTEC Applied Science Unit 1 Past Paper Exam Questions
No ratings yet
Homework: Level 3 BTEC Applied Science Unit 1 Past Paper Exam Questions
3 pages
School-Forms-1-7 1
No ratings yet
School-Forms-1-7 1
18 pages
All Ges101 Past Questions-1-1
No ratings yet
All Ges101 Past Questions-1-1
55 pages
Module 5 Welding
No ratings yet
Module 5 Welding
119 pages
Occlusion and Periodontal Health
No ratings yet
Occlusion and Periodontal Health
8 pages
Asian Terminals vs. Reyes, JR
No ratings yet
Asian Terminals vs. Reyes, JR
2 pages
Lista de Libros 2024
No ratings yet
Lista de Libros 2024
2 pages
Special Web 1 PDF
No ratings yet
Special Web 1 PDF
12 pages
Kenya - Going Nuts Macadamia Farming As A Livelihood Strategy For Kibugus Farmers
No ratings yet
Kenya - Going Nuts Macadamia Farming As A Livelihood Strategy For Kibugus Farmers
63 pages
DSCP & Vlan Priority
No ratings yet
DSCP & Vlan Priority
13 pages
World Football Champions
No ratings yet
World Football Champions
98 pages

Unit 4

Uploaded by

Unit 4

Uploaded by

UNIT-4

4.2. Percentage Tables:

Electronics 45% 15% 60%

Customer Satisfaction: The columns represent customer satisfaction levels, categorized as

Individual Category Percentages:

Creating Contingency table:

We showed 3 ways for creating APA style contingency tables in SPSS:

Running Simple Contingency Tables in SPSS

CROSSTABS with Column Percentages:

APA Contingency Tables from CTABLES

Let's now make 2 text replacements:

• use n instead of “Count”

These distributions represent the frequency distribution of one categorical variable

The following marginal distribution examples correspond to the blue highlights.

Alternatively, the marginal distribution of computer types is the following:

The following conditional distribution examples correspond to the green highlights.

Finding Relationships in a Contingency Table:

Here’s how to calculate row and column percentages in a two-way table.

4.4. Handling Several Batches:

Removing outliers using spss in histogram:

different analysts will make different decisions here.

missing values reac04 (lo thru 400,2085).

missing values reac05 (lo thru 400,2000 thru hi).

can't specify a low and a high

original values to a value label for this value;

specify only a high range of missing values that includes 999999999.

4.5. Scatterplots and Resistant

Identifying and understanding linear relationships in bivariate analysis is essential as it

Fitting resistant lines:

Next, we compute the medians for each batch

Once the slope is computed, the intercept can be computed as follows:

Fitting resistant line in SPSS:

You might also like