Basic Methods of Comparing Data in Minitab Express

Uploaded by

eddielau1022
Copyright © All Rights Reserved
Basic Methods of Comparing Data in Minitab Express

The following are some basic ways to compare data in Minitab Express, based on whether the
data is categorical or numeric. Specific directions on how to perform each comparison can be found
in the Simple Data Comparisons videos available online in our class.

Numeric and Categorical Data (Descriptive Statistics & Boxplot)


Example Hypothesis: People with pets exercise more than people without pets.
Variables used to analyze this data:
• Numeric: Number of hours a person spends exercising each week
• Categorical: Have a Pet?

Directions

Minitab Express: Statistics → Describe → Descriptive Statistics
• Data Tab: Variable = Hours Exercise, Group variables = Pet
• Statistics Tab: Mean, SD, Min, Q1, Median, Q3, Max, N
• Display Tab: Boxplot
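For anyone who wants to check these numbers outside Minitab Express, here is a minimal Python sketch of the same statistics, calculated separately for each group. The data are made up for illustration (they are not the class data set), and the names `hours_exercise` and `describe` are hypothetical. Quartile rules differ a little between tools, so Q1 and Q3 may not match Minitab Express exactly.

```python
import statistics

# Hypothetical survey responses (NOT the class data set):
# hours exercised per week, split by the categorical variable "Pet".
hours_exercise = {
    "Pet": [2, 5, 6, 7, 8, 9, 12],
    "No pet": [1, 3, 3, 4, 6, 8, 10],
}

def describe(values):
    """Return the statistics chosen on the Statistics tab for one group."""
    # statistics.quantiles uses its default "exclusive" method here;
    # other tools may use a slightly different quartile convention.
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "N": len(values),
        "Mean": statistics.mean(values),
        "SD": statistics.stdev(values),  # sample standard deviation
        "Min": min(values),
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Max": max(values),
    }

summary = {group: describe(values) for group, values in hours_exercise.items()}
for group, stats in summary.items():
    print(group, {name: round(value, 2) for name, value in stats.items()})
```

The five values Min, Q1, Median, Q3 and Max are the same ones a boxplot draws, which is why the boxplot and the table tell the same story.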

Results

Example of the descriptive statistics that this type of comparison produces.

Example of the Boxplot that this comparison will create.

Interpreting the Data & Boxplot

Start with the mean for both groups. Then look at the overall distribution of the data on the
boxplot. What is the minimum value for each group? What is the median? (Remember, the median is
the point with half the data falling on either side of it.)

Written Analysis of the Data


On average, pet owners exercised 7 hours per week, while non-pet owners exercised 5.56 hours. The boxplot
shows that even though the lowest number of hours exercised belonged to a pet owner, pet owners as a group
exercised about twice as much (median = 7 hours) as non-pet owners (median = 3.5 hours). These results
support the original hypothesis that people with pets exercise more than people without pets.

Numeric and Categorical Data (Bar Chart)
Example Hypothesis: People with pets get less sleep than people without pets.
Variables used to analyze this data:
• Numeric: Hours sleep
• Categorical: Pet

Directions

Minitab Express: Graphs → Bar Chart → Function of a variable → Simple (single Y variable)
• Function: Mean
• Continuous variable: Hours Sleep
• Categorical Variable: Pet

Add Mean to Chart
• Click on Chart, then click on the + icon with a circle around it (upper right corner)
• Check the “Data Labels” box (this displays the mean value on each bar)
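The bar chart above is a picture of one number per category: the mean of the numeric variable within that category. A minimal Python sketch of that calculation follows, using made-up sleep data (not the class data set). The name `hours_sleep` and the text "bars" are only for illustration.

```python
import statistics

# Hypothetical responses (NOT the class data set): hours of sleep per night.
hours_sleep = {
    "Pet": [6, 7, 7, 8, 6.5],
    "No pet": [7, 7.5, 6, 8, 7],
}

# "Function: Mean" = the mean of the continuous variable within each category.
group_means = {group: statistics.mean(values) for group, values in hours_sleep.items()}

for group, mean in group_means.items():
    bar = "#" * round(mean * 4)            # 4 characters per hour, display only
    print(f"{group:<7} {bar} {mean:.2f}")  # the number acts as the data label
```

When the bars are nearly the same height, the printed means make it much easier to say which group is higher and by how much.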

Results

Example of Bar Chart graph with means.

Interpreting this Chart

Look at the bars for each category to see if there is a difference between them. The mean value
will also help you when interpreting the results.

Written Analysis of the Data


The bar chart of the data with means shows very little difference between pet owners and non-pet
owners in the mean number of hours they sleep. Non-pet owners reported slightly more sleep, with an
average of 7.17 hours, compared to 6.92 hours for pet owners. Because the difference is so small,
these results do not support the original hypothesis that people with pets get less sleep than
people without pets.

Categorical and Categorical Data (Cross Tabulation)
Example Hypothesis: More women are on social media for 4-6 hours than men.
Variables used to analyze this data:
• Categorical: Gender?
• Categorical: How often do you use social media during a 24-hour period?

Directions

Minitab Express: Statistics → Cross Tabulation and Chi-Square
• Data Tab: Row = Gender; Column = Social Media
  (Note, for this comparison it doesn’t matter which variable gets used for the row and which
  is used for the column)
• Display Tab: Percent of row total, Percent of column total
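To see what the two percentages on the Display tab mean, here is a minimal Python sketch that builds a small cross tabulation by hand. The responses and the answer labels ("1-3 hours", "7+ hours") are made up for illustration and are not the class data set. The helper names `percent_of_row` and `percent_of_column` are hypothetical.

```python
from collections import Counter

# Hypothetical survey rows (NOT the class data set): (gender, social media use).
responses = [
    ("Female", "1-3 hours"), ("Female", "4-6 hours"), ("Female", "4-6 hours"),
    ("Female", "7+ hours"),
    ("Male", "1-3 hours"), ("Male", "1-3 hours"), ("Male", "1-3 hours"),
    ("Male", "4-6 hours"), ("Male", "7+ hours"),
]

counts = Counter(responses)                              # one count per table cell
row_totals = Counter(gender for gender, _ in responses)  # Row = Gender
col_totals = Counter(usage for _, usage in responses)    # Column = Social Media

def percent_of_row(gender, usage):
    """Of the people with this gender, what percent gave this answer?"""
    return 100 * counts[(gender, usage)] / row_totals[gender]

def percent_of_column(gender, usage):
    """Of the people who gave this answer, what percent have this gender?"""
    return 100 * counts[(gender, usage)] / col_totals[usage]

for gender in row_totals:
    print(gender,
          f"row %: {percent_of_row(gender, '4-6 hours'):.2f}",
          f"column %: {percent_of_column(gender, '4-6 hours'):.2f}")
```

The same cell count is divided by two different totals, so the row percent and the column percent answer two different questions even though they come from the same table.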

Results

Example of a cross tabulation and Chi-square from this type of comparison is shown below.

Interpreting this table


Which results should you look at when trying to determine if your hypothesis was supported or
not? In this case, you would look at the % of row for each gender for the 4-6 hours category.

Written Analysis of the Data


The cross tabulation results showed that 37.5% of females and 16.67% of males were on social media
4-6 hours. These results support our hypothesis that more women are on social media for 4-6 hours
than men.

Why wouldn’t you use the % of Column for this question?


For this question, the column data (50% female, 33.33% male, and 16.67% other) looks only at
the people who were on social media 4-6 hours and then at what gender they were. This would be
like separating people by their social media use first and then looking at the percent gender
breakdown. A hypothesis that would use this data would look like this: People who are on social
media for 4-6 hours are more likely to be female than male. Remember, though, that our initial
hypothesis started with women (overall) being on social media more than men in the 4-6 hour
range. This means we separated people by gender first and then looked at the number of hours they
used social media.

Numeric and Numeric Data (Simple Scatterplot)
Example Hypothesis: People who exercise more tend to drink more water.
Variables used to analyze this data:
• Numeric: Hours spent exercising each week
• Numeric: Number of 12oz glasses of water drunk each day

Directions

Minitab Express: Graphs → Scatterplot → Simple
• Y variable = Exercise
• X variable = Water
(Note, for this comparison it doesn’t matter which variable gets used as the Y variable and which
gets used for the X variable.)

Add Regression Fit line to Scatterplot
• Click on Scatterplot, then click on the + icon with a circle around it (upper right corner)
• Check the “Regression Fit” box (this defaults to a linear – straight – regression line)
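A linear regression fit line is the straight line that comes closest to the dots in the least-squares sense. As a rough sketch of the idea (not necessarily the exact routine Minitab Express runs), the slope and intercept can be worked out in Python as follows. The data are made up for illustration and `linear_fit` is a hypothetical helper name.

```python
# Hypothetical paired responses (NOT the class data set).
water = [2, 3, 4, 5, 6, 8]     # X variable: 12oz glasses of water per day
exercise = [1, 4, 3, 6, 5, 9]  # Y variable: hours of exercise per week

def linear_fit(x, y):
    """Ordinary least-squares line: y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = linear_fit(water, exercise)
print(f"exercise = {intercept:.2f} + {slope:.2f} * water")
```

A slope above zero means the line rises from left to right, and a slope below zero means it falls.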

Results

Example of Scatterplot with Regression Fit Line

Interpreting this Graph


Look at the Regression Fit line to see if there is a positive, negative, or no relationship
between the variables. Positive relationship means when one variable goes up, the other variable
goes up. Negative relationship means when one variable goes up, the other variable goes down. No
relationship means the variables don’t seem to be related. The closer the dots (data points) are
to the line, the better the fit!
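One common way to put a number on the direction and closeness described above is the Pearson correlation coefficient r, which runs from -1 to +1. This is offered as a companion calculation, not as something the Minitab Express steps in this handout produce. In the sketch below the data are made up, and the 0.1 cutoff for calling r "no relationship" is an arbitrary choice for illustration only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def direction(r, cutoff=0.1):
    """Label the relationship; the cutoff is an arbitrary illustrative value."""
    if r > cutoff:
        return "positive"
    if r < -cutoff:
        return "negative"
    return "no relationship"

# Hypothetical data (NOT the class data set).
x = [1, 2, 3, 4, 5]
examples = {
    "going up": [1, 3, 2, 5, 4],
    "going down": [5, 3, 4, 1, 2],
    "unrelated": [3, 1, 4, 1, 3],
}
for name, y in examples.items():
    r = pearson_r(x, y)
    print(f"{name}: r = {r:.2f} ({direction(r)})")
```

The sign of r gives the direction, and the closer r is to -1 or +1, the closer the dots sit to a straight line.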

Example of each type of Relationship

Written Analysis of the Data


The scatterplot of the data with the regression fit line shows a positive relationship between
number of hours a person exercises and the number of glasses of water they drink each day. The
dots aren’t that close to the line, so it’s not a strong relationship. Yet, these results do seem to
mostly support the original hypothesis that people who exercise more tend to drink more water.
Numeric & Numeric Data compared by a Categorical Variable (Scatterplot with Groups)
Example Hypothesis: There is a difference in the relationship between time spent exercising each
week and glasses of water drunk each day for pet owners versus non-pet owners.
Variables used to analyze this data:
• Numeric: Hours spent exercising each week
• Numeric: Number of 12oz glasses of water drunk each day
• Categorical: Pet Owner

Directions

Minitab Express: Graphs → Scatterplot → With Groups
• Y variable = Exercise
• X variable = Water
• Group Variable = Pet
(Note, for this comparison it doesn’t matter which variable gets used as the Y variable and which
gets used for the X variable.)

Add Regression Fit line to Scatterplot
• Click on Scatterplot, then click on the + icon with a circle around it (upper right corner)
• Check the “Regression Fit” box (this defaults to a linear – straight – regression line)
• Click the arrow next to “Regression Fit” and in the box that comes up, uncheck the “Fit
  Intercept” box
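A line fitted without an intercept is forced to pass through the point (0, 0), so each group's line is described by its slope alone. Assuming that is what unchecking “Fit Intercept” does here, a minimal Python sketch of one such line per group looks like this. The data are made up for illustration (not the class data set) and `slope_through_origin` is a hypothetical helper name.

```python
# Hypothetical responses (NOT the class data set): (water, exercise, pet owner).
rows = [
    (2, 3, "Yes"), (4, 7, "Yes"), (6, 8, "Yes"), (8, 13, "Yes"),
    (1, 2, "No"), (3, 3, "No"), (5, 6, "No"), (7, 6, "No"),
]

def slope_through_origin(points):
    """Least-squares slope for y = slope * x (no intercept term)."""
    sxy = sum(x * y for x, y in points)
    sxx = sum(x * x for x, _ in points)
    return sxy / sxx

# Split the (X, Y) pairs by the group variable, then fit one line per group.
groups = {}
for water, exercise, pet in rows:
    groups.setdefault(pet, []).append((water, exercise))

slopes = {pet: slope_through_origin(points) for pet, points in groups.items()}
for pet, slope in slopes.items():
    print(f"Pet = {pet}: exercise = {slope:.2f} * water")
```

Because both lines start at the same point, comparing the two slopes is the same as comparing the angles of the lines on the scatterplot.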

Results
Example of Scatterplot with Groups and Regression Line Fit

Interpreting this Graph

Look at the Regression Fit line for each group to see if there is a positive, negative, or no
relationship between the variables (see the previous section for more on this). Then compare the
lines to see if there is any difference between them.

Example Summary Statistics that are also generated with the Scatterplot with Groups

Written Analysis of the Data


The scatterplot of the data with the regression fit lines shows a positive relationship between
the variables (exercise and water) for both pet owners and non-pet owners. The dots for each group
aren’t close to their line, so it’s not a strong relationship. The scatterplot indicates a similar
trend for both groups, but their lines are at different angles, indicating different relationships
between their data. The summary statistics back up that there are differences in the data, with
non-pet owners having a lower mean for exercise (5.56 hours) and water (3.28 glasses) compared to
pet owners (exercise = 7.0 hours; water = 6.73 glasses). These results support our original
hypothesis that there is a difference in the relationship between time spent exercising each week
and glasses of water drunk each day for pet owners versus non-pet owners.
