
EXCEL Data Analysis

These are guidelines I made for some Excel data analysis tasks: linear regression (both simple and multiple), descriptive statistics, and others.

Uploaded by

N M
Copyright
© © All Rights Reserved

EXCEL

(a) Using Excel, obtain a linear regression output.


Before starting regression analysis in Excel, first enable the Analysis ToolPak add-in as follows:
• In Excel, click File > Options.
• In the Excel Options dialog box, select Add-ins on the left sidebar, make sure Excel Add-ins is selected in the Manage box, and click Go.
• In the Add-ins dialog box, tick Analysis ToolPak, and click OK.
Please note: this adds the Data Analysis tools to the Data tab of the Excel ribbon.

The regression output is generated as follows:

• On the Data tab, in the Analysis group, click the Data Analysis button (at the top right of the Excel window).
• Select Regression and click OK.
• In the Regression dialog box, click the Input Y Range box and select the dependent variable (Amount Spent).
• Click the Input X Range box and select the independent variable (Rating).
• Under Output options, choose your preferred option, for example New Worksheet Ply.
• Select the Residuals checkbox to get the differences between the actual and predicted values.
Please check the results in the screenshot attached in the explanation part.
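
To cross-check the Data Analysis output outside Excel, the same least-squares calculation can be sketched in Python. This is only an illustration under stated assumptions: the function name fit_line and the small data set are hypothetical and are not the ratings data behind the attached Excel output.

```python
from statistics import mean


def fit_line(x, y):
    """Simple linear regression by ordinary least squares.

    Returns (slope, intercept, r_squared, residuals), where each
    residual is actual y minus predicted y.
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals


# Hypothetical data (NOT the data set used in this document):
# the points lie exactly on the line y = 2x + 1.
ratings = [1, 2, 3, 4, 5]
amounts = [3, 5, 7, 9, 11]

slope, intercept, r_squared, residuals = fit_line(ratings, amounts)
print(slope, intercept, r_squared)
```

With real data, the slope, intercept and R Square from this sketch should agree with the Coefficients and R Square cells in the Excel regression output.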

(b) A linear regression equation that can be used to predict the amount spent based on ratings of the services and products purchased.
Take note of the coefficients in the output, where a = intercept and b = rating coefficient, with y = amount spent and x = rating.
A linear equation has the form y = bx + a, so the required linear equation is:
• y = 14.02529x + 0.694253
Where:
• a = intercept = 0.694253
• b = rating coefficient (slope/gradient) = 14.02529
• y = amount spent
• x = rating
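
As a quick illustration of how the equation is used for prediction, here is a minimal Python sketch. The coefficients are the ones quoted above; the function name predict_amount_spent and the rating values passed to it are hypothetical.

```python
INTERCEPT = 0.694253  # a, intercept quoted in the equation above
SLOPE = 14.02529      # b, rating coefficient quoted in the equation above


def predict_amount_spent(rating):
    """Predicted amount spent (in dollars) for a given rating: y = bx + a."""
    return SLOPE * rating + INTERCEPT


# Hypothetical rating values, chosen only to show the calculation.
for rating in (1, 4, 5):
    print(rating, round(predict_amount_spent(rating), 2))

# Raising the rating by one unit changes the prediction by exactly the slope.
difference = predict_amount_spent(5) - predict_amount_spent(4)
```

The difference between two predictions one rating unit apart equals the slope, which is what the slope measures.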

(c) Interpret the slope of the linear regression.

b = rating coefficient (slope/gradient) = 14.02529
The slope indicates that for every one-unit increase in the rating, the predicted amount spent increases by 14.02529 dollars; equivalently, for every one-unit decrease in the rating, it decreases by 14.02529 dollars.
Because the slope is positive, a higher rating is associated with a higher amount spent by customers.
(d) State and interpret the coefficient of determination.
The coefficient of determination is the R Square value, which is 0.665899 in our case.
Multiplying 0.665899 by 100 gives a percentage of about 66.59%.
This means that the x-variable (rating) accounts for about 66.59% of the variation in the y-variable (amount spent by customers, in dollars).
The remaining 100% - 66.59% = 33.41% of the variation in the amount spent is not explained by the ratings data.
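
The percentage arithmetic can be checked with a short Python sketch. The R Square value is the one stated above; the variable names are only illustrative.

```python
r_square = 0.665899  # R Square as stated in the regression output above

explained_pct = r_square * 100         # share of variation explained by rating
unexplained_pct = 100 - explained_pct  # share left unexplained

print(f"Explained:   {explained_pct:.2f}%")
print(f"Unexplained: {unexplained_pct:.2f}%")
```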

Please find the excel output attached for all the information used in the explanations.
A. Construct a scatter plot and add a trendline to display this data. Sketch a copy of the graph.
The following steps are used to create a scatter plot in Excel:
• Enter the data in a new worksheet, or open the worksheet that contains it.
• Select (highlight) all the data to plot in the scatter chart.
• Click the Insert tab, and then click "Insert Scatter (X, Y) or Bubble Chart".
• Click Scatter.
• To change the chart style, click the chart area to display the Design and Format tabs. To change the title, click the chart title and edit the text.
• Click Chart Elements (the plus sign) at the top-right corner of the chart, select Trendline, then click the arrow and select Linear. (A linear trendline is used because the dots show a linear association between Hours of Study and Exam Grade.)
• You can also add axis titles and other relevant information from Chart Elements.
Note: Please find an Excel copy of the chart attached in the attachment part of the Explanation part.

B. What conclusion can be drawn?


According to the scatter plot, there is a clear positive linear relationship between Hours of Study and Exam Grade. This is evident because the dots on the scatter plot lie almost on a straight line with a positive gradient.
A linear trendline fits the data closely, confirming the linear relationship between the two variables under study. This means that more Hours of Study are associated with higher Exam Grades, and fewer hours with lower grades.

C. Does Hours of Study or Exam Grade have more variation? (Hint: You must calculate a measure of dispersion that allows for comparisons)
Before starting any analysis in Excel, first enable the Analysis ToolPak add-in as follows:
• In Excel, click File > Options.
• In the Excel Options dialog box, select Add-ins on the left sidebar, make sure Excel Add-ins is selected in the Manage box, and click Go.
• In the Add-ins dialog box, tick Analysis ToolPak, and click OK.
Please note: this adds the Data Analysis tools to the Data tab of the Excel ribbon.
Produce the data summary in Excel as follows:
• On the Data tab, in the Analysis group, click Data Analysis (top right).
• Select Descriptive Statistics and click OK.
• Select the Input Range.
• Under Output options, select New Worksheet Ply.
• Make sure Summary statistics is checked.
• Click OK.
Note: Please find the summary statistics copy attached below.
Variation Explanation: Hours of Study shows more variation than Exam Grades. Because the two variables are measured in different units, they are compared using the coefficient of variation (standard deviation divided by mean). For Hours of Study this is 1.366260102 / 3.666666667 ≈ 37.26%, while for Exam Grades it is 6.892024376 / 85.5 ≈ 8.06%. The skewness and kurtosis of Hours of Study are also much higher than those of Exam Grades, but these describe the shape of a distribution rather than its spread.
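
The comparison of relative variation can be reproduced with a minimal Python sketch of the coefficient of variation (standard deviation divided by mean). The standard deviations and means are the summary statistics quoted above; the function name is only illustrative.

```python
def coefficient_of_variation(std_dev, mean_value):
    """Relative dispersion as a percentage: (standard deviation / mean) * 100."""
    return std_dev / mean_value * 100


# Summary statistics quoted above from the Descriptive Statistics output.
hours_cv = coefficient_of_variation(1.366260102, 3.666666667)
grades_cv = coefficient_of_variation(6.892024376, 85.5)

print(f"Hours of Study CV: {hours_cv:.2f}%")
print(f"Exam Grades CV:    {grades_cv:.2f}%")
print("More variation:", "Hours of Study" if hours_cv > grades_cv else "Exam Grades")
```

Because the coefficient of variation is unit-free, it allows hours and grade points to be compared directly.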

Before starting any analysis in Excel, first enable the Analysis ToolPak add-in as follows:
• In Excel, click File > Options.
• In the Excel Options dialog box, select Add-ins on the left sidebar, make sure Excel Add-ins is selected in the Manage box, and click Go.
• In the Add-ins dialog box, tick Analysis ToolPak, and click OK.
Please note: this adds the Data Analysis tools to the Data tab of the Excel ribbon.

Produce the data summary in Excel as follows:
• Enter your data in a new Excel worksheet.
• On the Data tab, in the Analysis group, click Data Analysis (top right of the Excel window).
• Select Descriptive Statistics and click OK.
• Select the Input Range.
• Under Output options, select New Worksheet Ply.
• Make sure Summary statistics is checked.
• Click OK.
Please note: find the summary statistics copy attached below.

Produce box plots in Excel as follows:
• Step 1: Create a table with the following rows for each series: Minimum, Quartile 1, Quartile 2, Quartile 3, Maximum.
• Step 2: Calculate these values using the formulas MIN(cell range), QUARTILE.INC(cell range, 1), QUARTILE.INC(cell range, 2), QUARTILE.INC(cell range, 3) and MAX(cell range) respectively.
• Step 3: Calculate the quartile differences, that is, the differences between consecutive values: first quartile minus minimum, median minus first quartile, third quartile minus median, and maximum minus third quartile. To begin, create a third table and copy the minimum values from the last table into it directly. Then calculate the differences with the Excel subtraction formula (cell1 - cell2) and populate the third table with them.
• Step 4: Create a stacked column chart. Select all the data from the third table, and click Insert > Insert Column Chart > Stacked Column. The chart doesn't yet resemble a box plot, as Excel draws stacked columns by default from horizontal and not vertical data sets. To reverse the chart axes, right-click on the chart, and click Select Data. Click Switch Row/Column. Click OK.
• Step 5: Convert the stacked column chart to the box plot style as follows: Select the bottom part of the columns. Click Format > Current Selection > Format Selection. The Format panel opens on the right. On the Fill tab, in the Format panel, select No Fill.
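
The quartile table and the table of differences from the steps above can be cross-checked with a short Python sketch. It assumes that statistics.quantiles with method="inclusive" is used as the counterpart of QUARTILE.INC; the function name box_plot_segments and the sample values are hypothetical and are not the data from the attached workbook.

```python
from statistics import quantiles


def box_plot_segments(values):
    """Return the five-number summary and the stacked-column segments.

    The segments are the minimum followed by the four differences
    (Q1 - min, median - Q1, Q3 - median, max - Q3).
    """
    # method="inclusive" interpolates over the full data range,
    # assumed here to match Excel's QUARTILE.INC.
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    summary = [min(values), q1, q2, q3, max(values)]
    segments = [summary[0]] + [b - a for a, b in zip(summary, summary[1:])]
    return summary, segments


# Hypothetical sample values, used only to illustrate the calculation.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
summary, segments = box_plot_segments(sample)
print(summary)
print(segments)
```

The segments add up to the maximum, which is why stacking them in a column chart and hiding the bottom segment leaves the box in the right position.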

Create the whiskers for the box plot above as follows:

• Select the topmost data series.
• On the Fill tab, in the Format panel, select No Fill.
• From the ribbon, click Design > Add Chart Element > Error Bars > Standard Deviation.
• Click one of the drawn error bars. Open the Error Bar Options tab in the Format panel and set the following: set Direction to Minus, set End Style to No Cap, and for Error Amount, set Percentage to 140.
Note: Please find the Excel copies attached. This box-and-whisker plot is a manual workaround rather than a formal procedure, for Excel versions that lack a built-in box and whisker chart type.
