Excel Sheet Short Note Book

Excel sheet short notes

Uploaded by

manish sahu
Copyright: © All Rights Reserved

A

REPORT ON

BUSINESS STATISTICS LAB – I

COURSE CODE: ABS5001P

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE

AWARD OF THE DEGREE OF

BACHELOR OF COMMERCE

DEPARTMENT OF BUSINESS ADMINISTRATION

SHRINATHJI INSTITUTE OF BIOTECHNOLOGY AND MANAGEMENT

UPALI ODEN, NATHDWARA

2023-2024

SUBMITTED BY

TABLE OF CONTENTS

Sr. No. | Name of Contents
1 | UNIT- I: INTRODUCTION TO MS EXCEL
2 | UNIT- II: STATISTICAL CHARTS IN MS EXCEL
3 | UNIT- III: STATISTICAL MEASURES
4 | UNIT- IV: STATISTICAL MEASURES
5 | UNIT- V: CORRELATION AND REGRESSION

UNIT- I

INTRODUCTION TO MS EXCEL

There are many spreadsheet programs, but Excel is the most widely used of them. People have been using it for the last 30 years, and throughout these years it has been upgraded with more and more features.

The best part about Excel is that it can be applied to many business tasks, including statistics, finance, data management, forecasting, analysis, inventory, billing, and business intelligence.

The following are a few of the things it can do for you:

• Number Crunching

• Charts and Graphs

• Store and Import Data

• Manipulating Text

• Templates/Dashboards

• Automation of Tasks

• And Much More...

The three most important components of Excel, which you need to understand first, are:

1. Cell: A cell is the smallest but most powerful part of a spreadsheet. You can enter data into a cell either by typing or by copy-paste. Data can be text, a number, or a date. You can also customize a cell by changing its size, font color, background color, borders, etc. Every cell is identified by its cell address, which consists of its column letter followed by its row number (if a cell is in column AB and on the 11th row, its address is AB11).
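
The way column letters and row numbers combine into a cell address can be sketched in Python. This is only an illustration: the function names column_letters, column_index, and cell_address are made up for this example and are not part of Excel.

```python
def column_letters(index):
    """Convert a 1-based column number into Excel-style letters (1 -> A, 28 -> AB)."""
    letters = ""
    while index > 0:
        # Columns work like base 26 without a zero digit, hence the "- 1".
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def column_index(letters):
    """Convert Excel-style column letters back into a 1-based column number."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


def cell_address(column, row):
    """Build a cell address from a 1-based column number and a row number."""
    return f"{column_letters(column)}{row}"


print(cell_address(column_index("AB"), 11))  # column AB, row 11
print(column_letters(16384))                 # letters of the 16384th column
```

The first call rebuilds the address AB11 from the example above; the second shows the letters of column number 16384.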

2. Worksheet: A worksheet is made up of individual cells, each of which can contain a value, a formula, or text. It also has an invisible draw layer, which holds charts, images, and diagrams. Each worksheet in a workbook is accessible by clicking its tab at the bottom of the workbook window. In addition, a workbook can store chart sheets; a chart sheet displays a single chart and is also accessible by clicking a tab.

3. Workbook: A workbook is a separate file, just as in every other application. Each workbook contains one or more worksheets, so a workbook can be a collection of multiple worksheets or a single worksheet. You can add or delete worksheets, hide them within the workbook without deleting them, and change the order of your worksheets within the workbook.

Microsoft Excel Window Components

Before you start using Excel, it is important to understand what is where in its window. The major components you need to know before entering the world of Microsoft Excel are described below.

Active Cell: The cell that is currently selected. It is highlighted by a rectangular box, and its address is shown in the address bar. You can activate a cell by clicking on it or by using the arrow keys. To edit a cell, double-click on it or press F2.

Columns: A column is a vertical set of cells. A single worksheet contains 16384 columns in total. Every column is identified by its own letter label, from A to XFD. You can select a column by clicking on its header.

Rows: A row is a horizontal set of cells. A single worksheet contains 1048576 rows in total. Every row is identified by its own number, from 1 to 1048576. You can select a row by clicking on the row number marked on the left side of the window.

Fill Handle: It is a small dot at the lower right corner of the active cell. It helps you to fill numeric values, text series, ranges, serial numbers, etc.

Address Bar: It shows the address of the active cell. If you have selected more than one cell,
then it will show the address of the first cell in the range.

Formula Bar: The formula bar is an input bar, below the ribbon. It shows the content of the
active cell and you can also use it to enter a formula in a cell.

Title Bar: It is located at the very top of the screen. On the Title bar, Microsoft Excel displays the name of the workbook you are currently using. At the top of your screen, you should see "Microsoft Excel - Book1" or a similar name.

Menu Bar: In Excel 2003 and earlier, the Menu bar is directly below the Title bar and displays the menu. The menu begins with the word File and continues with the following: Edit, View, Insert, Format, Tools, Data, Window, and Help. You use the menu to give instructions to the software. Point with your mouse to a menu option and click the left mouse button. A drop-down menu will appear. You can now use the left and right arrow keys on your keyboard to move left and right across the Menu bar options. You can use the up and down arrow keys to move up and down the drop-down menu. To select an option, highlight the item on the drop-down menu and press Enter. An ellipsis after a menu item signifies additional options; if you select that option, a dialog box will appear.

Toolbars: Toolbars provide shortcuts to menu commands and are generally located just below the Menu bar. The basic toolbars, Standard and Formatting, are normally displayed when Microsoft Excel opens. If they are not, follow the steps outlined below:

1. Point to View, which is located on the Menu bar.
2. Click the left mouse button.
3. Press the down arrow key until Toolbars is highlighted.

4. Press Enter. Both Standard and Formatting should have a checkmark next to them. If both
have a checkmark next to them, press Esc three times to close the menu. If either does not
have a checkmark, press the down arrow key until Customize is highlighted.

5. Press Enter.

6. Point to the box or boxes next to the unchecked word or words, Standard and/or
Formatting, and click the left mouse button. A checkmark should appear.

7. Note: You turn the checkmark on and off by clicking the left mouse button.

8. Point to Close and click the left mouse button to close the dialog box.

Worksheets: A Microsoft Excel workbook consists of worksheets. Each worksheet contains columns and rows. In Excel 2003 and earlier, the columns are lettered A to IV and the rows are numbered 1 to 65536; from Excel 2007 onward, the columns run from A to XFD and the rows from 1 to 1048576. The combination of column and row coordinates makes up a cell address. For example, the cell located in the upper left corner of the worksheet is cell A1, meaning column A, row 1. Cell E10 is located under column E on row 10. You enter your data into the cells on the worksheet.

Formula Bar: If the Formula bar is turned on, the cell address is displayed on its left side and the cell entry on its right side.

Status Bar: If the Status bar is turned on, it appears at the very bottom of the screen. Before proceeding, make sure the Status bar is turned on.

File Menu: The File menu is a simple menu, like that of other applications. It contains options such as Save, Save As, Open, New, Print, Excel Options, and Share.

Ribbon Tab: Starting with Microsoft Excel 2007, the option menus were replaced by the ribbon. Each ribbon tab holds groups of related options, and each group in turn contains the individual options.

Worksheet Tab: This area shows all the worksheets present in the workbook. By default, a new workbook contains three worksheets named Sheet1, Sheet2, and Sheet3 (Excel 2013 and later start with a single worksheet, Sheet1).

USING EXCEL SHORTCUT KEYS

No. | Description | Excel Shortcut
1 | To create a new workbook | Ctrl + N
2 | To open an existing workbook | Ctrl + O
3 | To save a workbook/spreadsheet | Ctrl + S
4 | To close the current workbook | Ctrl + W
5 | To close Excel | Alt + F4
6 | To move to the next sheet | Ctrl + PageDown
7 | To move to the previous sheet | Ctrl + PageUp
8 | To go to the Data tab | Alt + A
9 | To go to the View tab | Alt + W
10 | To go to the Formulas tab | Alt + M
11 | To edit a cell | F2
12 | To copy and paste cells | Ctrl + C, Ctrl + V
13 | To italicize and make the font bold | Ctrl + I, Ctrl + B
14 | To center align cell contents | Alt + H + A + C
15 | To fill color | Alt + H + H
16 | To add a border | Alt + H + B
17 | To remove the outline border | Ctrl + Shift + _
18 | To add an outline to the selected cells | Ctrl + Shift + &
19 | To move to the next cell | Tab
20 | To move to the previous cell | Shift + Tab
21 | To select all the cells on the right | Ctrl + Shift + Right Arrow
22 | To select all the cells on the left | Ctrl + Shift + Left Arrow
23 | To select the column from the selected cell to the end of the table | Ctrl + Shift + Down Arrow
24 | To select all the cells above the selected cell | Ctrl + Shift + Up Arrow
25 | To select all the cells below the selected cell | Ctrl + Shift + Down Arrow
26 | To add a comment to a cell | Shift + F2
27 | To delete a cell comment | Shift + F10 + D
28 | To display find and replace | Ctrl + H
29 | To activate the filter | Ctrl + Shift + L (Alt + Down Arrow opens the filter drop-down)
30 | To insert the current date | Ctrl + ;
31 | To insert the current time | Ctrl + Shift + :
32 | To insert a hyperlink | Ctrl + K
33 | To apply the currency format | Ctrl + Shift + $
34 | To apply the percent format | Ctrl + Shift + %
35 | To go to the “Tell me what you want to do” box | Alt + Q
36 | To select the entire row | Shift + Space
37 | To select the entire column | Ctrl + Space
38 | To delete a column | Alt + H + D + C
39 | To delete a row | Shift + Space, Ctrl + -
40 | To hide the selected row | Ctrl + 9
41 | To unhide the selected row | Ctrl + Shift + 9
42 | To hide the selected column | Ctrl + 0
43 | To unhide the selected column | Ctrl + Shift + 0
44 | To group rows or columns | Alt + Shift + Right Arrow
45 | To ungroup rows or columns | Alt + Shift + Left Arrow

UNIT- II

STATISTICAL CHARTS IN MS EXCEL

SIMPLE BAR CHART

A simple bar graph is a graphical representation of a data set in the form of bars. The length of each bar is proportional to the magnitude of the category it represents. The main purpose of a bar graph is to compare quantities or items based on statistical figures. In a simple bar graph, the comparison can be made on only one parameter.

Simple bar charts can be drawn in two ways: as a vertical bar graph or as a horizontal bar graph. In a vertical bar graph the bars are plotted vertically, while in a horizontal bar graph they are plotted horizontally.

Simple bar charts are made with the help of two axes, the x-axis and the y-axis. In a vertical simple bar graph, the category being compared (the fixed variable) is represented on the x-axis, while its magnitude is represented on the y-axis. Each bar runs vertically, parallel to the y-axis, up to the value of the category it represents.

[Figure: example of a vertical simple bar chart]

For a horizontal simple bar chart it is just the opposite: the y-axis represents the category and the x-axis represents its magnitude. The bars run horizontally, parallel to the x-axis.

[Figure: example of a horizontal simple bar chart]

How to draw a Simple Bar Chart?

As the name suggests, a simple bar chart is simple to draw. You just need to follow a few steps and keep a few things in mind. The steps below draw a vertical simple bar graph, as vertical bar graphs are generally preferred to horizontal ones.
Step 1: Draw the two axes (x-axis and y-axis).
Step 2: Represent the variant (fixed variable), i.e. the category being compared, on the x-axis.
Step 3: Represent the frequency or magnitude on the y-axis at equal intervals. You can scale the y-axis according to your needs.
Step 4: Plot bars of equal width starting from the x-axis, separated from each other by equal gaps, such that the height of each bar represents the frequency or magnitude of its variant or category.

Example: Draw a Simple Bar Chart using the data given below.

Class Interval | 10-20 | 20-30 | 30-40 | 40-50 | 50-60
Frequency | 45 | 60 | 48 | 35 | 40
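
As a rough check before drawing the chart in Excel, the same data can be sketched as a text bar chart in Python. This is only an illustration: the function name text_bar_chart and the scale of one "#" per 5 units are assumptions made for this example.

```python
# Data from the example above: class intervals and their frequencies.
classes = ["10-20", "20-30", "30-40", "40-50", "50-60"]
frequencies = [45, 60, 48, 35, 40]


def text_bar_chart(labels, values, unit=5):
    """Return one text line per category; each '#' stands for `unit` units."""
    lines = []
    for label, value in zip(labels, values):
        bar = "#" * round(value / unit)  # bar length proportional to the value
        lines.append(f"{label:>6} | {bar} {value}")
    return lines


chart = text_bar_chart(classes, frequencies)
print("\n".join(chart))
```

Each bar is drawn horizontally here only because text runs left to right; the lengths are still proportional to the frequencies, as in step 4.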

MULTIPLE BAR CHART

A multiple bar diagram is identical to a regular bar graph, except that there are two or more bars in each category, one for each subdivision. It is created using the same method as a simple bar chart, except that different tones, hues, and/or dot patterns are used to distinguish between the various phenomena. Multiple bar charts are usually drawn when the sum of the various phenomena is not meaningful.

How to Draw Multiple Bar Diagram

Steps to draw a multiple bar diagram are as follows:

Step 1: On the x-axis, plot the categories.

Step 2: Draw a series of parallel bars for each category. Each bar represents a specific subcategory (like girl/boy).

Step 3: On the y-axis, the corresponding numerical values are plotted.

Step 4: The height (length) of each rectangle (or bar) is proportional to the magnitude of the observation.

Step 5: Different colors should be used to differentiate between the various bars in a set. This improves the diagram’s readability and aesthetic appeal.

Example 2: Draw a multiple bar diagram to show the production data for agriculture in the
following table:

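Production of agriculture (in tons)

Year | Paddy | Wheat | Maize
2001 | 60 | 10 | 12
2002 | 30 | 20 | 10
2003 | 70 | 25 | 30

Solution: The multiple bar graph of the given data is as follows:

[Figure: multiple bar graph of paddy, wheat, and maize production, 2001 to 2003]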

Sub-Divided Bar Graph/Diagram

In these diagrams, the bar corresponding to each phenomenon is divided into several components. Each component occupies a part of the bar proportional to its share in the total. For example, the bar corresponding to the number of students enrolled in a course can be further sub-divided into boys and girls.

• When preparing a sub-divided bar diagram, the various components in each bar should be kept in the same sequence.

• It is important to use different colours or shades to differentiate between different components.

• A suitable index should explain these various colours or shades.

• These diagrams are quite useful for comparing the sizes of various parts and throwing light on the relationship between these integral parts. For instance, such diagrams are used to present data such as sales profits from various products, a family’s expenditure pattern, the budget outlay for receipts and expenditures, and so on.

Example of Sub-Divided Bar Graph

Represent the following information using a sub-divided bar diagram, showing the quarterly
sales of three varieties of soap manufactured by a company.

SOLUTION:

PIE CHART
The “pie chart”, also known as a “circle chart”, is a circular statistical graphic divided into sectors to illustrate numerical proportions. Each sector denotes a proportionate part of the whole. A pie chart works best when you want to show the composition of something. In many cases, a pie chart can be used in place of other graphs such as bar graphs, line plots, and histograms.

Formula

The pie chart is an important type of data representation. It is divided into sectors, each of which represents a specific portion (percentage) of the total. Together, the sectors make up the full circle of 360°.

The total value of the pie is always 100%.

To work out the percentages for a pie chart, follow the steps given below:

• Categorize the data

• Calculate the total

• Divide each category's value by the total

• Convert into percentages

• Finally, calculate the degrees

Therefore, the pie chart formula is given as

(Given Data/Total value of Data) × 360°

Note: It is not mandatory to convert the given data into percentages unless it is specified. We can directly calculate the degrees for the given data values and draw the pie chart accordingly.

How to Create a Pie Chart

Imagine a teacher surveys her class about the students' favourite sports:

Football | Hockey | Cricket | Basketball | Badminton
10 | 5 | 5 | 10 | 10

The data above can be represented by a pie chart using the pie chart (circle graph) formula. The chart makes the size of each portion easy to understand.

Step 1: First, enter the data into the table.

Step 2: Add all the values in the table to get the total (here, 10 + 5 + 5 + 10 + 10 = 40).

Step 3: Next, divide each value by the total and multiply by 100 to get a percentage:

Football | Hockey | Cricket | Basketball | Badminton
(10/40) × 100 = 25% | (5/40) × 100 = 12.5% | (5/40) × 100 = 12.5% | (10/40) × 100 = 25% | (10/40) × 100 = 25%

Step 4: Next, to find how many degrees each “pie sector” needs, take the full circle of 360° and follow the calculations below:

Football | Hockey | Cricket | Basketball | Badminton
(10/40) × 360° = 90° | (5/40) × 360° = 45° | (5/40) × 360° = 45° | (10/40) × 360° = 90° | (10/40) × 360° = 90°

Step 5: Draw a circle and use a protractor to measure the angle of each sector.
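
Steps 2 to 4 can be checked with a short Python sketch. The variable names votes and shares are chosen only for this illustration; the data is the survey table above.

```python
# Favourite-sport counts from the survey table.
votes = {"Football": 10, "Hockey": 5, "Cricket": 5, "Basketball": 10, "Badminton": 10}

total = sum(votes.values())  # Step 2: total of all values

# Steps 3 and 4: share of each sport as a percentage and as an angle in degrees.
shares = {
    sport: (count / total * 100, count / total * 360)
    for sport, count in votes.items()
}

for sport, (percent, degrees) in shares.items():
    print(f"{sport:<10} {percent:5.1f}%  {degrees:5.1f} degrees")
```

The percentages add up to 100 and the angles to 360, which is a quick way to confirm that no category was missed.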

LINE CHART

A line graph, also called a line chart or line plot, is a graph that uses points and lines to represent change over time. It shows a line joining several points, or a line that shows the relation between the points. The graph represents quantitative data between two changing variables with a line or curve that joins a series of successive data points. Line graphs compare these two variables along a vertical axis and a horizontal axis.

Types of Line Graphs

The types of line graph are as follows:

1. Simple Line Graph: Only one line is plotted on the graph.

2. Multiple Line Graph: More than one line is plotted on the same set of axes. A multiple line graph can effectively compare similar items over the same period of time.

3. Compound Line Graph: If the information can be subdivided into two or more types of data, the line graph is called a compound line graph. Lines are drawn to show the component parts of a total. The top line shows the total, and each line below it shows part of the total. The distance between two adjacent lines shows the size of each part.

Vertical Line Graph

Vertical line graphs are graphs in which a vertical line extends from each data point down to the horizontal axis. A vertical line graph is sometimes also called a column graph. A line parallel to the y-axis is called a vertical line.

Horizontal Line Graph

Horizontal line graphs are graphs in which a horizontal line extends from each data point across to the vertical axis. A horizontal line graph is sometimes also called a row graph. A line parallel to the x-axis is called a horizontal line.

HISTOGRAM
A histogram is a graphical representation of a grouped frequency distribution with continuous classes. It is an area diagram and can be defined as a set of rectangles whose bases span the intervals between class boundaries and whose areas are proportional to the frequencies of the corresponding classes. In such representations, all the rectangles are adjacent, since each base covers the interval between class boundaries. When the classes have equal widths, the heights of the rectangles are proportional to the class frequencies; when the widths differ, the heights are proportional to the frequency densities.

In other words, a histogram is a diagram of rectangles whose areas are proportional to the frequencies of a variable and whose widths are equal to the class intervals.

How to Plot Histogram?

You need to follow the steps below to construct a histogram.

1. Begin by marking the class intervals on the X-axis and the frequencies on the Y-axis.

2. The scale along each axis has to be uniform.

3. Class intervals need to be exclusive.

4. Draw rectangles with the class intervals as bases and the corresponding frequencies as heights.

5. A rectangle is built on each class interval, since the class limits are marked on the horizontal axis and the frequencies are indicated on the vertical axis.

6. The height of each rectangle is proportional to the corresponding class frequency if the intervals are equal.

7. The area of every individual rectangle is proportional to the corresponding class frequency if the intervals are unequal.

Although histograms seem similar to bar graphs, there is a slight difference between them: a histogram does not have any gaps between two successive bars.

When to Use Histogram?

The histogram is used under certain conditions. They are:

• The data should be numerical.

• A histogram is used to check the shape of the data distribution.

• Used to check whether the process changes from one period to another.

• Used to determine whether the output is different when it involves two or more processes.

• Used to analyse whether the given process meets the customer requirements.

Question: The following table gives the lifetime of 400 neon lamps. Draw the histogram for
the below data.

Lifetime (in hours) | Number of lamps
300 – 400 | 14
400 – 500 | 56
500 – 600 | 60
600 – 700 | 86
700 – 800 | 74
800 – 900 | 62
900 – 1000 | 48
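
Before plotting, the table can be checked with a small Python sketch that totals the frequencies and computes the frequency density (frequency divided by class width) of each class. The variable names are chosen only for this illustration.

```python
# (lower limit, upper limit, number of lamps) for each class in the table.
lamps = [
    (300, 400, 14),
    (400, 500, 56),
    (500, 600, 60),
    (600, 700, 86),
    (700, 800, 74),
    (800, 900, 62),
    (900, 1000, 48),
]

total_lamps = sum(freq for _, _, freq in lamps)

# Frequency density = frequency / class width; this is the bar height
# to use when the class widths are unequal.
densities = [(lo, hi, freq / (hi - lo)) for lo, hi, freq in lamps]

# The modal class is the class with the highest frequency (tallest bar here).
modal_class = max(lamps, key=lambda c: c[2])

print(total_lamps)
print(modal_class)
```

Since every class here has the same width of 100 hours, the densities are just the frequencies divided by 100, so the heights can be drawn directly from the frequencies.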

Solution:

The histogram for the given data is:

SCATTER CHART

A scatter plot is also called a scatter chart, scattergram, or XY graph. The scatter diagram graphs pairs of numerical data, with one variable on each axis, to show their relationship.

Scatter Plot Uses and Examples

A scatter plot can present a large volume of data at a glance. It is beneficial in the following situations:

• A large set of data points is given

• Each data point comprises a pair of values

• The given data is in numeric form

Question:

Draw a scatter plot for the given data that shows the number of games played and scores
obtained in each instance.

No. of games | 3 | 5 | 2 | 6 | 7 | 1 | 2 | 7 | 1 | 7
Scores | 80 | 90 | 75 | 80 | 90 | 50 | 65 | 85 | 40 | 100

Solution:

X-axis or horizontal axis: Number of games

Y-axis or vertical axis: Scores

Now, the scatter graph will be:

UNIT- III

STATISTICAL MEASURES

MEAN
The mean, or “average”, is found by adding all the data points and dividing by the number of data points.

Example: The mean of 9, 5, and 1 is (9+5+1)/3 = 15/3 = 5

How to find Mean using Excel?

The mean of a data set in Excel can be found by applying the AVERAGE function to the data set. If you want to see the mean quickly, you can also just select the range: the average of the selected cells is shown in the status bar at the bottom right corner of the window.

In Microsoft Excel, the mean can be calculated by using one of the following functions:

• AVERAGE - returns the average of numbers.
• AVERAGEA - returns the average of cells with any data (numbers, Boolean and text values).
• AVERAGEIF - finds the average of numbers based on a single criterion.
• AVERAGEIFS - finds the average of numbers based on multiple criteria.

To get a conceptual idea of how these functions work, consider the following example.

Suppose that, in a sales report with item names in column A, delivery status in column B, and sale amounts in column C, you want to get the average of the values in cells C2:C8. For this, use this simple formula:

=AVERAGE(C2:C8)

To get the average of only "Banana" sales, use an AVERAGEIF formula:

=AVERAGEIF(A2:A8, "Banana", C2:C8)

To calculate the mean based on 2 conditions, say, the average of "Banana" sales with the
status "Delivered", use AVERAGEIFS:

=AVERAGEIFS(C2:C8,A2:A8, "Banana", B2:B8, "Delivered")
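
The logic of the three formulas can be mirrored in Python. The sales rows below are hypothetical, since the report's own data is not reproduced here; they only stand in for columns A, B, and C.

```python
from statistics import mean

# Hypothetical sales rows: (item, status, amount), standing in for columns A, B, C.
sales = [
    ("Banana", "Delivered", 250),
    ("Apple", "Delivered", 120),
    ("Banana", "In transit", 150),
    ("Orange", "Delivered", 90),
    ("Banana", "Delivered", 350),
]

# Like =AVERAGE(C2:C8): average of every amount.
overall = mean([amount for _, _, amount in sales])

# Like =AVERAGEIF(A2:A8, "Banana", C2:C8): one condition.
banana = mean([amount for item, _, amount in sales if item == "Banana"])

# Like =AVERAGEIFS(...): two conditions that must both hold.
banana_delivered = mean(
    [amount for item, status, amount in sales
     if item == "Banana" and status == "Delivered"]
)

print(overall, banana, banana_delivered)
```

In each case the condition only decides which amounts are included; the averaging step itself is the same.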

You can also enter your conditions in separate cells, and reference those cells in your
formulas, like this:

MEDIAN
Median is the middle value in a group of numbers, which are arranged in ascending or
descending order, i.e. half the numbers are greater than the median and half the numbers are
less than the median. For example, the median of the data set {1, 2, 2, 3, 4, 6, 9} is 3.

This works fine when there are an odd number of values in the group. But what if you have
an even number of values? In this case, the median is the arithmetic mean (average) of the
two middle values. For example, the median of {1, 2, 2, 3, 4, 6} is 2.5. To calculate it, you
take the 3rd and 4th values in the data set and average them to get a median of 2.5.
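
Both cases can be verified with Python's standard statistics module; this is only a cross-check of the two examples above.

```python
from statistics import median

odd_count = median([1, 2, 2, 3, 4, 6, 9])  # seven values: the 4th value is the median
even_count = median([1, 2, 2, 3, 4, 6])    # six values: average of the 3rd and 4th values

print(odd_count, even_count)
```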

In Microsoft Excel, a median is calculated by using the MEDIAN function. For example, to
get the median of all amounts in our sales report, use this formula:

=MEDIAN(C2:C8)

To make the example more illustrative, I've sorted the numbers in column C in ascending
order (though it is not actually required for the Excel Median formula to work):

MODE
Mode is the most frequently occurring value in the dataset. While the mean and median
require some calculations, a mode value can be found simply by counting the number of
times each value occurs.

For example, the mode of the set of values {1, 2, 2, 3, 4, 6} is 2. In Microsoft Excel, you can
calculate a mode by using the function of the same name, the MODE function. For our
sample data set, the formula goes as follows:

=MODE(C2:C8)
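
The counting idea can be sketched with Python's statistics module. The second list is a made-up example with a tie, added only to show that a data set can have more than one mode.

```python
from statistics import mode, multimode

values = [1, 2, 2, 3, 4, 6]
most_common = mode(values)         # the value that occurs most often

# Hypothetical data with a tie: both 1 and 2 occur twice.
tied = multimode([1, 1, 2, 2, 3])  # list of all values sharing the top count

print(most_common, tied)
```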

GEOMETRIC MEAN
The Geometric Mean (GM) is the average value or mean that signifies the central tendency of a set of numbers by using the product of their values. Basically, we multiply the numbers together and take the nth root of the product, where n is the total number of data values. For example, for a given set of two numbers such as 3 and 1, the geometric mean is equal to √(3×1) = √3 = 1.732.

The Geometric Mean (G.M) of a series containing n observations is the nth root of the product of the values.

If x1, x2, …, xn are the observations, then the G.M is defined as:

G.M = (x1 × x2 × … × xn)^(1/n)

or, equivalently, using logarithms:

G.M = Antilog [(log x1 + log x2 + … + log xn) / n]

Question 1: Find the G.M of the values 10, 25, 5, and 30

Solution: Given 10, 25, 5, 30

We know that,

G.M = (10 × 25 × 5 × 30)^(1/4) = (37500)^(1/4) ≈ 13.916

Therefore, the geometric mean = 13.916
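
The logarithm form of the definition can be sketched in Python as a cross-check of Question 1. The function name geometric_mean is chosen for this illustration, and natural logarithms are used in place of log tables.

```python
import math


def geometric_mean(values):
    """nth root of the product, computed as exp(mean of the logs)."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


gm = geometric_mean([10, 25, 5, 30])
print(round(gm, 3))
```

Working with logarithms avoids forming a very large product when there are many values, which is the same reason the antilog method is used by hand.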

Question 2 : Find the geometric mean of the following data.

Solution: Here n = 5

G.M = Antilog (Σ log x / n)

= Antilog (8.925/5)

= Antilog 1.785

= 60.95

Therefore the G.M of the given data is 60.95

HARMONIC MEAN
The Harmonic Mean (HM) is defined as the reciprocal of the average of the reciprocals of the data values. It is based on all the observations, and it is rigidly defined. The harmonic mean gives less weight to large values and more weight to small values. In general, the harmonic mean is used when there is a need to give greater weight to the smaller items. It is applied in the case of times and average rates.

Since the harmonic mean is the reciprocal of the average of reciprocals, the formula to define
the harmonic mean “HM” is given as follows:

If x1, x2, x3,…, xn are the individual items up to n terms, then,

Harmonic Mean, HM = n / [(1/x1)+(1/x2)+(1/x3)+…+(1/xn)]

Calculate the harmonic mean for the following data:

x | 1 | 3 | 5 | 7 | 9 | 11
f | 2 | 4 | 6 | 8 | 10 | 12

Solution:

The calculation for the harmonic mean is shown in the table below:

x | f | 1/x | f/x
1 | 2 | 1 | 2
3 | 4 | 0.333 | 1.332
5 | 6 | 0.2 | 1.2
7 | 8 | 0.143 | 1.144
9 | 10 | 0.111 | 1.111
11 | 12 | 0.091 | 1.092
Total | N = 42 | | Σ f/x = 7.879

The formula for weighted harmonic mean is

HMw = N / [ (f1/x1) + (f2/x2) + (f3/x3)+ ….(fn/xn) ]

HMw = 42 / 7.879

HMw = 5.331

Therefore, the harmonic mean, HMw is 5.331.
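
The same weighted calculation can be reproduced in a few lines of Python; the variable names are chosen only for this illustration.

```python
# x values and their frequencies f from the table above.
x = [1, 3, 5, 7, 9, 11]
f = [2, 4, 6, 8, 10, 12]

n_total = sum(f)                                        # N, the total frequency
sum_f_over_x = sum(fi / xi for fi, xi in zip(f, x))     # sum of f/x
hm_weighted = n_total / sum_f_over_x                    # HMw = N / sum(f/x)

print(n_total, round(sum_f_over_x, 3), round(hm_weighted, 3))
```

The unrounded sum of f/x is slightly smaller than the hand-rounded 7.879, but the harmonic mean agrees to three decimal places.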

UNIT- IV

STATISTICAL MEASURES

Measures of Dispersion and skewness
Calculating Range in Excel

Excel does not offer a dedicated function to compute the range. However, we can easily compute it by subtracting the minimum value from the maximum value. The formula is =MAX(range)-MIN(range), where the same data range is referenced in both parentheses. The MAX() and MIN() functions find the maximum and the minimum values in the data, and the difference between the two is the range. The higher the value of the range, the greater the spread of the data.

Variance

The calculation of variance differs slightly depending on whether the data set describes a
sample or the entire population. We have already seen that variance is nothing but the
average of the squared deviations. When we are computing the variance for a population, we
divide the sum of squared deviations by n. However, when we compute the variance for a
sample, we divide the sum of squared deviations by (n-1).

This change is taken care of by Excel with two different functions: =VAR.P() for population
variance, and =VAR.S() for sample variance.
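
The difference between the two divisors, along with the range, can be seen in a short Python sketch. The data list is hypothetical and chosen so that the numbers come out evenly.

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set with mean 5

data_range = max(data) - min(data)  # like =MAX(range)-MIN(range)
population_var = pvariance(data)    # divides by n, like =VAR.P()
sample_var = variance(data)         # divides by n-1, like =VAR.S()

print(data_range, population_var, round(sample_var, 4))
```

The sum of squared deviations is 32 in both cases; dividing by n = 8 gives 4, while dividing by n-1 = 7 gives a slightly larger value, so the sample variance is never smaller than the population variance of the same data.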

If we treat our data set as the population, then the variance for Arun is 1275, and the variance
for John is 162.5. If we treat our data as a sample, the variance for Arun is 1189.58, and the
variance for John is 50.

Older versions of Excel used =VARP() and =VAR() to calculate population variance and sample variance, respectively.

Microsoft Excel also supports two other functions that calculate variance, =VARA() for sample variance and =VARPA() for population variance. These differ from the other variance functions in how they treat logical values and text within the data.

=VARA() and =VARPA() count the following values, which =VAR.S() and =VAR.P() ignore:

1. Logical values such as TRUE and FALSE are counted, and treated as 1 and 0, respectively.

2. Any text value is counted, and is treated as 0.

Standard Deviation
We already know that the standard deviation is nothing but the square root of variance.
Naturally, if the variance computation is different for a sample and for a population, the
standard deviation would be different as well. Similar to variance, Excel offers two functions,
=STDEV.S() for sample standard deviation, and =STDEV.P() for population standard
deviation.

Older versions of Excel support =STDEV() for sample standard deviation, and =STDEVP()
for population standard deviation.

Standard deviation can also be computed on data containing logical values and text, just like variance. If such values are to be counted, the function for sample standard deviation is =STDEVA() and the function for population standard deviation is =STDEVPA(). The treatment of logical and text values remains the same as with the =VARA() and =VARPA() functions.
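
The relationship between standard deviation and variance can be checked in Python. The data list is hypothetical and is used only for illustration.

```python
import math
from statistics import pstdev, pvariance, stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

population_sd = pstdev(data)  # like =STDEV.P()
sample_sd = stdev(data)       # like =STDEV.S()

# Each standard deviation is the square root of the matching variance.
print(population_sd, math.sqrt(pvariance(data)))
print(sample_sd, math.sqrt(variance(data)))
```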

Inter-Quartile Range (IQR)


Microsoft Excel has two functions to compute quartiles. The inter-quartile range has to be calculated as the difference between the quartile 3 and quartile 1 values, for example =QUARTILE.INC(range, 3)-QUARTILE.INC(range, 1).

Quartiles can be calculated using =QUARTILE.INC() or =QUARTILE.EXC(). Both functions calculate the quartiles by calculating percentiles on the data. However, =QUARTILE.EXC() calculates exclusive quartiles and cannot calculate quartile 0 or quartile 4 (the extreme values are excluded). The inclusive function =QUARTILE.INC() can be used to calculate quartiles including quartile 0 and quartile 4.

Both functions have the same syntax, for example =QUARTILE.INC(range, quartile_number), where quartile_number can be between 0 and 4 for =QUARTILE.INC() and between 1 and 3 for =QUARTILE.EXC(). Any quartile number outside these values returns an error.
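
The inclusive and exclusive calculations can be compared in Python using statistics.quantiles. This sketch assumes that its "inclusive" and "exclusive" methods follow the same interpolation rules as =QUARTILE.INC() and =QUARTILE.EXC(); the data list is hypothetical.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7, 8]  # hypothetical data set

# quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
q1_inc, q2_inc, q3_inc = quantiles(data, n=4, method="inclusive")
q1_exc, q2_exc, q3_exc = quantiles(data, n=4, method="exclusive")

iqr_inclusive = q3_inc - q1_inc
iqr_exclusive = q3_exc - q1_exc

print(q1_inc, q3_inc, iqr_inclusive)
print(q1_exc, q3_exc, iqr_exclusive)
```

The two methods agree on the median but place quartile 1 and quartile 3 differently, so the inter-quartile range depends on which function is used.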

Moving Average
A moving average, also called a moving mean or a rolling mean, is a calculation that relies on
a series of averages from data subsets within an entire data set. It's a term statisticians,
technical analysts and financial analysts use to describe changes to averages as new data
becomes available. It explains how a data series changes over a set period. The moving
average also updates to include recent data along with data points from pre-determined
intervals.

Calculate moving average for a certain time period

A simple moving average can be calculated in no time with the AVERAGE function. Suppose you have a list of average monthly temperatures in column B, and you want to find a moving average for 3 months.

Write a usual AVERAGE formula for the first 3 values and enter it in the row corresponding to the 3rd value from the top (cell C4 in this example), and then copy the formula down to the other cells in the column:

=AVERAGE(B2:B4)

You can fix the column with an absolute reference (like $B2) if you want to, but be sure to
use relative row references (without the $ sign) so that the formula adjusts properly for
other cells.

Remembering that an average is computed by adding up values and then dividing the sum by
the number of values to be averaged, you can verify the result by using the SUM formula:

=SUM(B2:B4)/3
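
The same rolling calculation can be sketched in Python as a cross-check. This is a minimal sketch under stated assumptions: the function name moving_average and the temperature values are hypothetical, and the window of 3 matches the 3-month example above.

```python
def moving_average(values, window=3):
    """Return the simple moving averages of `values` over `window` points.

    The first result corresponds to the first `window` values, just as the
    Excel formula is entered next to the 3rd value and then copied down.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and the number of values")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


if __name__ == "__main__":
    temperatures = [10.0, 12.0, 14.0, 13.0, 9.0]  # hypothetical monthly averages
    print(moving_average(temperatures, window=3))
```

Each output value is the sum of three consecutive inputs divided by 3, which is the same check as the SUM formula.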

Kurtosis
Kurtosis is a descriptive statistic that is not as well known as other descriptive statistics
such as the mean and standard deviation. Descriptive statistics give summary information
about a data set or distribution. Just as the mean measures the center of a data set and the
standard deviation measures how spread out the data set is, kurtosis measures the thickness
of the tails of a distribution.

Kurtosis in Excel

With Excel it is very straightforward to calculate kurtosis using the built-in KURT
function, which avoids working through the kurtosis formula by hand. Excel's KURT function
calculates excess kurtosis.

1. Enter the data values into cells.

2. In a new cell type =KURT(

3. Highlight the cells containing the data, or type the range of cells containing the
data.

4. Make sure to close the parentheses by typing )

5. Then press the enter key.

The value in the cell is the excess kurtosis of the data set.

For smaller data sets, there is an alternate strategy that will work:

1. In an empty cell type =KURT(

2. Enter the data values, each separated by a comma.

3. Close the parentheses with )

4. Press the enter key.

This method is less preferable because the data are hidden within the function, and we
cannot do other calculations, such as a standard deviation or mean, with the data that we have
entered.
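
For readers who want to see what the function computes, the calculation can be sketched in Python. This is a minimal sketch, assuming the bias-corrected sample excess kurtosis formula that Microsoft documents for KURT; the function name excess_kurtosis is hypothetical, and the sample values are the ones used in Microsoft's KURT example.

```python
from math import sqrt


def excess_kurtosis(values):
    """Sample excess kurtosis (assumed to match the formula documented for Excel's KURT)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four data points are required")
    mean = sum(values) / n
    # Sample standard deviation (divides by n - 1, like =STDEV.S()).
    s = sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("standard deviation is zero")
    fourth_power_sum = sum(((x - mean) / s) ** 4 for x in values)
    scale = (n * (n + 1)) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_power_sum - correction


if __name__ == "__main__":
    data = [3, 4, 5, 2, 3, 4, 5, 6, 4, 7]
    print(excess_kurtosis(data))
```

A negative result indicates thinner tails than a normal distribution, and a positive result indicates thicker tails.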

UNIT- V

CORRELATION AND REGRESSION

Correlation means a mutual connection between two or more sets of data. In statistics,
bivariate data (paired observations of two random variables) are used to find the
correlation between them. The correlation coefficient measures this correlation: it denotes
how strongly the two random variables are linearly related to each other.

If the correlation coefficient is 0, the bivariate data are not linearly correlated with each
other.

If the correlation coefficient is -1 or +1, the bivariate data are perfectly correlated with
each other.

r=-1 denotes a perfect negative relationship and r=1 denotes a perfect positive relationship.

In general, if the correlation coefficient is close to -1 or +1 then we can say that the
bivariate data are strongly correlated with each other.
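
The definitions that follow match the standard Pearson correlation coefficient. Assuming that is the formula intended here, it can be written as the sample covariance divided by the product of the sample standard deviations (the symbols \bar{x}, \bar{y}, s_x and s_y are notation introduced for this sketch):

```latex
r = \frac{\dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{s_x \, s_y},
\qquad
s_x = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2},
\quad
s_y = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2}
```

The factor 1/(n-1) cancels between numerator and denominator, so the same value of r results if n is used instead.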

Where,

• r: Correlation coefficient.

• x_i: Values of the variable x.

• y_i: Values of the variable y.

• n: Number of samples taken in the data set.

• Numerator: Covariance of x and y.

• Denominator: Product of Standard Deviation of x and Standard Deviation of y.

Example: Consider the following data set :

Finding the Correlation Coefficient in Excel:

1. Using CORREL function

In Excel, to find the correlation coefficient use the formula:

=CORREL(array1,array2)

array1: array of variable x
array2: array of variable y

To insert array1 and array2, just select the cell range for both.

1. Let’s find the correlation coefficient for the variables X and Y1.

array1 : Set of values of X. The cell range is from A2 to A6.

array2 : Set of values of Y1. The cell range is from B2 to B6.

The formula is therefore =CORREL(A2:A6,B2:B6).
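
As a cross-check on the CORREL result, the coefficient can be sketched in Python. This is a minimal sketch: the function name pearson_r is hypothetical, and the sample values are made up for illustration because the original data set is not reproduced here.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products and squares of deviations; the 1/(n-1) factors cancel.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]     # hypothetical values of X
    y1 = [2, 4, 6, 8, 10]   # hypothetical values of Y1 (exactly 2 * X)
    print(pearson_r(x, y1))
```

Because the hypothetical Y1 values are an exact multiple of X, the result should be 1 up to floating-point rounding, which corresponds to a perfect positive relationship.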

Regression Analysis
In statistical modeling, regression analysis is used to estimate the relationships between two
or more variables:

Dependent variable (aka criterion variable) is the main factor you are trying to understand
and predict.

Independent variables (aka explanatory variables, or predictors) are the factors that might
influence the dependent variable.

Regression analysis helps you understand how the dependent variable changes when one of
the independent variables varies, and allows you to determine mathematically which of those
variables really has an impact.

This example shows how to run regression in Excel by using a special tool included with
the Analysis ToolPak add-in.

Enable the Analysis ToolPak add-in

Analysis ToolPak is available in all versions of Excel from 2003 to 365, but is not enabled
by default. So, you need to turn it on manually. Here's how:

1. In your Excel, click File > Options.

2. In the Excel Options dialog box, select Add-ins on the left sidebar, make sure Excel
Add-ins is selected in the Manage box, and click Go.

3. In the Add-ins dialog box, tick Analysis ToolPak, and click OK.

This will add the Data Analysis tools to the Data tab of your Excel ribbon.

Run regression analysis

In this example, we are going to do a simple linear regression in Excel. What we have is a list
of average monthly rainfall for the last 24 months in column B, which is our independent
variable (predictor), and the number of umbrellas sold in column C, which is the dependent
variable. Of course, there are many other factors that can affect sales, but for now we focus
only on these two variables:

With the Analysis ToolPak enabled, carry out these steps to perform regression
analysis in Excel:
1. On the Data tab, in the Analysis group, click the Data Analysis button.

2. Select Regression and click OK.

3. In the Regression dialog box, configure the following settings:


• Select the Input Y Range, which is your dependent variable. In our case, it's umbrella
sales (C1:C25).
• Select the Input X Range, i.e. your independent variable. In this example, it's the
average monthly rainfall (B1:B25).

If you are building a multiple regression model, select two or more adjacent columns with
different independent variables.

• Check the Labels box if there are headers at the top of your X and Y ranges.
• Choose your preferred Output option, a new worksheet in our case.
• Optionally, select the Residuals checkbox to get the difference between the predicted
and actual values.

4. Click OK and observe the regression analysis output created by Excel.
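
The key figures in the regression output (slope, intercept, R Square and residuals) can be reproduced for a simple linear regression with ordinary least squares. The following is a minimal Python sketch, not the ToolPak's own implementation: the function name linear_regression and the rainfall and umbrella figures are hypothetical.

```python
def linear_regression(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared, residuals), where each residual
    is the actual value minus the predicted value.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the independent variable must not be constant")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(e ** 2 for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    # R squared is the share of variation in y explained by the fitted line.
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals


if __name__ == "__main__":
    rainfall = [82, 93, 66, 105, 120, 74]    # hypothetical monthly rainfall
    umbrellas = [15, 18, 11, 22, 26, 13]     # hypothetical umbrellas sold
    slope, intercept, r2, _ = linear_regression(rainfall, umbrellas)
    print(f"umbrellas = {intercept:.3f} + {slope:.3f} * rainfall (R^2 = {r2:.3f})")
```

Here rainfall plays the role of the Input X Range and umbrella sales the Input Y Range; a multiple regression with several X columns would need a matrix-based fit instead.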
