This document provides instructions for analyzing distribution series using MS Excel. It discusses how to calculate key characteristics like mode, median, variance, and standard deviation from distribution data. The steps include running Excel, inputting data, using the Descriptive Statistics tool to generate a report on the data, and calculating specific metrics like mode and median through formulas. Formulas for key characteristics are presented in a table.

Laboratory work No.

Analysis of distribution series

The purpose of the work is to learn to analyze distribution series by means of MS Excel.
Its task is to analyze a statistical distribution series with the help of the "Analysis ToolPak" add-in.

Methodical guidelines

Base formulas for calculating the characteristics of distribution series are presented in Table 5.1.

Table 5.1

Formulas for calculating the characteristics of distribution series

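Mode: $M_o = x_0 + i \cdot \dfrac{f_{M_o} - f_{M_o-1}}{(f_{M_o} - f_{M_o-1}) + (f_{M_o} - f_{M_o+1})}$

Median: $M_e = x_0 + i \cdot \dfrac{\frac{1}{2}\sum f_i - S_{M_e-1}}{f_{M_e}}$

Variance: $\sigma^2 = \dfrac{\sum (x_i - \bar{x})^2 \times f_i}{\sum f_i}$

Standard deviation: $\sigma = \sqrt{\sigma^2}$

Coefficient of variation: $K_\sigma = \dfrac{\sigma}{\bar{x}} \times 100\,\%$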

You can solve the tasks of distribution series analysis in Excel by means of the "Analysis ToolPak" add-in and built-in statistical functions. Let's consider the order of work in Excel.
Analysis of ungrouped statistical data. The work starts with running Excel (as with other applications, from the Start menu or by means of a shortcut). The indicators need not be typed into the input spreadsheet by hand: you can move them from other Microsoft Office documents through the clipboard. Then you can transform and visualize the data (Fig. 5.1).

Fig. 5.1. Input data

The Descriptive Statistics mode is used to generate a one-dimensional statistical report on the basic indicators of position, scattering and asymmetry of the population being analyzed. To move to this mode, open the menu item Data – Data Analysis and select Descriptive Statistics (Fig. 5.2).

Fig. 5.2. Selecting the Descriptive Statistics mode


In the dialog box of this mode (Fig. 5.3) the following parameters are set:
1. Input range, i.e. the reference to the cells that contain the statistical data.
2. Grouped By, i.e. whether the data in the input range are arranged by columns or by rows.
3. Labels in first row, which is activated if the first row (column) of the input range contains labels. If there are no labels, this parameter should be deactivated; in this case standard names for the output data will be created automatically.
4. Output range / New worksheet / New Workbook.
In the Output range position you need to enter a reference to the upper-left cell of the output range in the box. The size of the output range is determined automatically, and a message appears if the output range may overlap the range of the input data.
In the New worksheet position a new worksheet is opened, and Excel enters the results starting with the cell A1. If you want to name the new worksheet, enter the name in the box next to this option.
In the New Workbook position a new workbook is opened, and Excel enters the results starting with the cell A1 of its first worksheet.
5. Summary statistics, which is activated if the output range should contain one field for each Descriptive Statistics indicator.
6. Confidence level for mean, which is activated if the output table should include a row for the maximum sampling error at the prescribed level of reliability.
7. K-th largest, which is activated if the output table should include a row for the k-th largest value of the sample (counting from the maximum xmax). Enter the number k in the box. If k = 1, the row will contain the maximum value of the sample.
8. K-th smallest, which is activated if the output table should include a row for the k-th smallest value of the sample (counting from the minimum xmin). Enter the number k in the box. If k = 1, the row will contain the minimum value of the sample.
The input parameters of the Descriptive Statistics mode are presented in Fig. 5.3, and the calculated parameters of this mode in Fig. 5.4.
Fig. 5.3. The parameters of Descriptive Statistics mode

Fig. 5.4. The calculated parameters of Descriptive Statistics mode

According to the data the coefficient of oscillation is calculated as follows:

$K_R = \dfrac{R}{\bar{x}} \times 100\,\% = \dfrac{154.00}{517.81} \times 100\,\% = 29.74\,\%;$

the coefficient of variation is:

$K_\sigma = \dfrac{\sigma}{\bar{x}} \times 100\,\% = \dfrac{43.53}{517.81} \times 100\,\% = 8.40\,\%.$
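
For readers who want to cross-check the two coefficients outside Excel, here is a minimal Python sketch. It assumes that the standard deviation in the Descriptive Statistics report is the sample one (divisor n - 1); the function name and the sample values are hypothetical and are not the data of Fig. 5.1.

```python
from statistics import mean, stdev

def relative_variation(data):
    """Return (K_R, K_sigma) in percent for ungrouped data."""
    x_bar = mean(data)
    value_range = max(data) - min(data)   # R = x_max - x_min
    sigma = stdev(data)                   # sample standard deviation (assumption)
    return value_range / x_bar * 100, sigma / x_bar * 100

# Hypothetical sample, not the data from Fig. 5.1
k_r, k_sigma = relative_variation([480, 505, 520, 530, 560])
print(f"K_R = {k_r:.2f} %, K_sigma = {k_sigma:.2f} %")
```

If the population standard deviation is needed instead, `statistics.pstdev` can be substituted for `stdev`.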

Analysis of grouped statistical data. After entering the input data or moving them from Microsoft Office documents via the clipboard into an Excel spreadsheet, you can, if necessary, transform and visualize the raw data. In this case, you need to add a column with the individual values of the characteristic (the midpoint of the interval) for each group (Fig. 5.5).
Excel has no built-in function for calculating the weighted arithmetic average, but it can be obtained by combining other functions.
The cell C9 contains the formula =SUMPRODUCT(C3:C8;B3:B8)/SUM(B3:B8), by means of which the average amount of current assets is calculated (Fig. 5.5).

Fig. 5.5. The calculation of the average current assets
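
The SUMPRODUCT/SUM combination can be mirrored in a few lines of Python. This is only a sketch: the function name, the midpoints and the frequencies are hypothetical and do not reproduce the data of Fig. 5.5.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(x_i * f_i) / sum(f_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * f for x, f in zip(values, weights)) / sum(weights)

# Hypothetical grouped data: interval midpoints and group frequencies
midpoints = [2, 4, 6, 8, 10, 12]
frequencies = [2, 4, 6, 10, 5, 3]
print(weighted_mean(midpoints, frequencies))
```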

To determine the mode (3) and the median (4) one should make some calculations (Fig. 5.6–5.7).
The calculation of the mode is presented in Fig. 5.6. The content of the cells (Fig. 5.6) is as follows:
the cell C9 contains the formula =MAX(B3:B8), with the help of which the modal number of companies is calculated;
the cell C10 contains the formula =MATCH(C9;B3:B8;0), by means of which the position of the modal value in the B3:B8 array is calculated;
the cell C11 contains the formula =INDEX(A3:A8;C10;1), i.e. the modal interval of the amount of current assets in the A3:A8 array;
the cell C12 contains the formula =LEFT(C11;1), which extracts the lower limit of the modal interval as the first character of the interval label (this works only when the lower limit is a single digit);
the cell C13 contains the formula =INDEX(B3:B8;C10-1;1), i.e. the number of enterprises in the interval preceding the modal one ($f_{M_o-1}$) in the B3:B8 array;
the cell C14 contains the formula =INDEX(B3:B8;C10+1;1), i.e. the number of enterprises in the interval following the modal one, with a larger amount of current assets ($f_{M_o+1}$), in the B3:B8 array;
the cell C15 contains the formula =C12+2*((C9-C13)/((C9-C13)+(C9-C14))) for calculating the mode of the amount of current assets; the factor 2 is the width of the interval.

Fig. 5.6. The calculation of the mode
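
The chain of cells C9–C15 can be condensed into one function. The sketch below assumes equal interval widths and treats a missing neighbouring interval as having zero frequency; both the function name and the data are hypothetical.

```python
def grouped_mode(lower_limits, width, frequencies):
    """Mode of an interval series with equal interval width."""
    f_mo = max(frequencies)                # like =MAX(B3:B8)
    k = frequencies.index(f_mo)            # like =MATCH(C9;B3:B8;0), but 0-based
    # Neighbouring frequencies; taken as 0 at the edges of the series (assumption)
    f_prev = frequencies[k - 1] if k > 0 else 0
    f_next = frequencies[k + 1] if k < len(frequencies) - 1 else 0
    return lower_limits[k] + width * (f_mo - f_prev) / ((f_mo - f_prev) + (f_mo - f_next))

# Hypothetical series: intervals 1-3, 3-5, ..., 11-13 with their frequencies
print(grouped_mode([1, 3, 5, 7, 9, 11], 2, [2, 4, 6, 10, 5, 3]))
```

Passing the lower limits as numbers avoids parsing them out of the interval label, which is what =LEFT(C11;1) does in the spreadsheet.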

Since the median divides the population in half, the median interval is the first one whose cumulative frequency is equal to or greater than half of the total frequency, while the previous cumulative frequency is less than half of the population size (Fig. 5.7).
The content of the cells (Fig. 5.7) is as follows:
in the cells C3:C8 the cumulative frequency is calculated (for example, the cell C5 contains the formula =C4+B5);
the cell B9 contains the formula =SUM(B3:B8), by means of which the size of the population (the number of enterprises) is calculated;
the cell C10 contains the formula =B9/2, which determines half of the population size (50 % of enterprises);
the cell C11 contains the formula =MATCH(C10;C3:C8;1), i.e. the position in the C3:C8 array of the largest cumulative frequency that is less than or equal to half of the population size;
the cell C12 contains the formula =INDEX(C3:C8;C11;1), i.e. the cumulative frequency taken from the C3:C8 array at the position found in the cell C11;
the cell C13 contains the formula =IF(C10=C12;C11;C11+1), by means of which the position of the median interval is calculated;
the cell C14 contains the formula =INDEX(B3:B8;C13;1) reflecting the frequency of the median interval;
the cell C15 contains the formula =INDEX(A3:A8;C13;1), i.e. the median interval found in the A3:A8 array;
the cell C16 contains the formula =LEFT(C15;1) reflecting the lower limit of the median interval;
the cell C17 contains the formula =INDEX(C3:C8;C13-1;1) calculating the value of the cumulative frequency before the median interval;
the cell C18 contains the formula =C16+2*((B9/2-C17)/C14) calculating the median of the amount of current assets.
Fig. 5.7. The calculation of the median
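
The same search for the median interval can be sketched in Python. The example assumes equal interval widths and takes the cumulative frequency before the first interval as zero; the function name and the data are hypothetical.

```python
from itertools import accumulate

def grouped_median(lower_limits, width, frequencies):
    """Median of an interval series with equal interval width."""
    cumulative = list(accumulate(frequencies))   # like the column C3:C8
    half = cumulative[-1] / 2                    # like =B9/2
    # First interval whose cumulative frequency reaches half of the population
    k = next(j for j, s in enumerate(cumulative) if s >= half)
    s_prev = cumulative[k - 1] if k > 0 else 0   # cumulative frequency before it
    return lower_limits[k] + width * (half - s_prev) / frequencies[k]

# Hypothetical series: intervals 1-3, 3-5, ..., 11-13 with their frequencies
print(grouped_median([1, 3, 5, 7, 9, 11], 2, [2, 4, 6, 10, 5, 3]))
```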
The calculation of the mean square deviation makes it possible to determine the coefficient of variation (Fig. 5.8).
These parameters (Fig. 5.8) were determined with the help of the following formulas:
the cell C10 contains =(SUMPRODUCT(POWER(C3:C8-C9;2);B3:B8))/SUM(B3:B8) for calculating the variance;
the cell C11 contains =SQRT(C10) for calculating the mean square deviation;
the cell C12 contains =(C11/C9)*100 for calculating the coefficient of variation.

Fig. 5.8. The calculation of variance, mean square deviation and coefficient of variation
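
As a cross-check of the three cells C10–C12, the weighted variance, the mean square deviation and the coefficient of variation can be computed together. This is a sketch with a hypothetical function name and hypothetical data.

```python
from math import sqrt

def grouped_variation(midpoints, frequencies):
    """Return (variance, standard deviation, coefficient of variation in %)."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    # Frequency-weighted mean of squared deviations, like the SUMPRODUCT formula
    variance = sum((x - mean) ** 2 * f for x, f in zip(midpoints, frequencies)) / n
    sigma = sqrt(variance)
    return variance, sigma, sigma / mean * 100

# Hypothetical grouped data: interval midpoints and group frequencies
variance, sigma, cv = grouped_variation([2, 4, 6, 8, 10, 12], [2, 4, 6, 10, 5, 3])
print(f"variance = {variance:.2f}, sigma = {sigma:.2f}, K_sigma = {cv:.2f} %")
```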

To calculate the quartile deviation one should determine the lower (first) and the upper (third) quartiles.
The calculation of the first quartile (Fig. 5.9) is similar to the calculation of the median, except for these cells:
the cell C10 containing the formula =B9*0,25;
the cell C18 containing the formula =C16+2*((B9*0,25-C17)/C14).
Fig. 5.9. The calculation of the first quartile
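
Because a quartile differs from the median only in the share of the population that is sought, one function with a share parameter p covers both. The sketch below is an illustration under the assumption of equal interval widths; the function name and the data are hypothetical.

```python
from itertools import accumulate

def grouped_quantile(lower_limits, width, frequencies, p):
    """Quantile of order p (0 < p < 1) of an interval series with equal width."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    cumulative = list(accumulate(frequencies))
    target = p * cumulative[-1]                  # like =B9*0,25 for the first quartile
    # First interval whose cumulative frequency reaches the target share
    k = next(j for j, s in enumerate(cumulative) if s >= target)
    s_prev = cumulative[k - 1] if k > 0 else 0
    return lower_limits[k] + width * (target - s_prev) / frequencies[k]

# Hypothetical series: intervals 1-3, 3-5, ..., 11-13 with their frequencies
lower, freqs = [1, 3, 5, 7, 9, 11], [2, 4, 6, 10, 5, 3]
q1 = grouped_quantile(lower, 2, freqs, 0.25)
q3 = grouped_quantile(lower, 2, freqs, 0.75)
print(q1, q3)
```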

The third quartile is in the range 11 – 13 years and is equal to 11.5. To define it the cell B10 contains the formula =B9*0,75, and the cell B18 contains the formula =C16+2*((B9*0,75-C17)/C14).
So the quartile deviation equals:

$Q = \dfrac{Q_3 - Q_1}{2} = \dfrac{11.5 - 6.6}{2} = 2.45,$

and the quartile coefficient of variation:

$K_Q = \dfrac{Q}{M_e} \times 100\,\% = \dfrac{2.45}{9.26} \times 100\,\% = 26.46\,\%.$
The results of calculating the total variance and its components are
presented in Fig. 5.10.
Fig. 5.10. The calculation of total variance and its components

The content of the cells (Fig. 5.10) is as follows:
the cell D10 contains the formula =(B9+C9)/(5+5) for calculating the average volume of production at enterprises of both types of ownership;
the cell D11 contains the formula =B9/5 for calculating the average volume of production at the state enterprises;
the cell D12 contains the formula =C9/5 for calculating the average volume of production at the private enterprises;
the cell D13 contains the formula =VARP(B4:B8) for calculating the intragroup variance (state enterprises);
the cell D14 contains the formula =VARP(C4:C8) for calculating the intragroup variance (private enterprises);
the cell D15 contains the formula =(D13*5+D14*5)/10 for calculating the average of the group variances;
the cell D16 contains the formula =((POWER(D11-D10;2))*5+(POWER(D12-D10;2))*5)/10 for calculating the intergroup variance;
the cell D17 contains the formula =SUM(D15;D16) for calculating the total variance.
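
The rule that the total variance is the sum of the average intragroup variance and the intergroup variance can be checked with a short Python sketch. The function name and the two groups of five values are hypothetical; `statistics.pvariance` plays the role of VARP.

```python
from statistics import mean, pvariance

def variance_decomposition(groups):
    """Return (average intragroup variance, intergroup variance, total variance)."""
    n = sum(len(g) for g in groups)
    overall_mean = sum(sum(g) for g in groups) / n
    # Group variances weighted by group size, like =(D13*5+D14*5)/10
    within = sum(pvariance(g) * len(g) for g in groups) / n
    # Squared deviations of group means from the overall mean, weighted by group size
    between = sum((mean(g) - overall_mean) ** 2 * len(g) for g in groups) / n
    return within, between, within + between

# Hypothetical production volumes of state and private enterprises
state = [10, 12, 14, 16, 18]
private = [20, 22, 24, 26, 28]
print(variance_decomposition([state, private]))
```

The third value should coincide with the population variance of all ten observations taken together, which gives a quick consistency check.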
The variance and the mean square deviation of an alternative (binary) characteristic are shown in Fig. 5.11.
Fig. 5.11. The calculation of the variance and mean square deviation

These parameters (Fig. 5.11) were determined by means of the following formulas:
the cell E7 contains the formula =(SUM(C4:C6))/SUM(B4:B6) for calculating the average proportion of suitable products in three batches;
the cell E8 contains the formula =1-E7 for calculating the average proportion of defective products;
the cell E9 contains the formula =E7*E8 for calculating the variance of the proportion of suitable products (the variance of an alternative characteristic);
the cell E10 contains the formula =SQRT(E9) for calculating the mean square deviation of the alternative characteristic.
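
The cells E7–E10 translate directly into Python. This is a sketch: the function name and the batch figures are hypothetical and do not reproduce Fig. 5.11.

```python
from math import sqrt

def alternative_variance(batch_sizes, suitable_counts):
    """Return (p, q, variance, standard deviation) of an alternative characteristic."""
    p = sum(suitable_counts) / sum(batch_sizes)   # proportion of suitable products
    q = 1 - p                                     # proportion of defective products
    variance = p * q
    return p, q, variance, sqrt(variance)

# Hypothetical data: three batches and the number of suitable items in each
print(alternative_variance([100, 200, 200], [90, 190, 170]))
```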
The determination of the skewness and kurtosis coefficients involves the
calculation of the moments of the third and fourth orders (Fig. 5.12).
Fig. 5.12. The calculation of the skewness and kurtosis coefficients

The content of the cells (see Fig. 5.12) is as follows:
the cell C11 contains the formula =(SUMPRODUCT(POWER(C3:C8-C9;3);B3:B8))/SUM(B3:B8) for calculating the moment of the third order;
the cell C12 contains the formula =C11/POWER(C10;3) for calculating the skewness coefficient (for this the cell C10 must hold the mean square deviation, not the variance);
the cell C13 contains the formula =(SUMPRODUCT(POWER(C3:C8-C9;4);B3:B8))/SUM(B3:B8) for calculating the moment of the fourth order;
the cell C14 contains the formula =(C13/POWER(C10;4))-3 for calculating the kurtosis coefficient.
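
The moment-based coefficients can be sketched in Python as follows. The function names and the data are hypothetical; the standard deviation is taken as the square root of the second central moment, which is an assumption about what the cell C10 holds in Fig. 5.12.

```python
def central_moment(midpoints, frequencies, order):
    """Frequency-weighted central moment of the given order."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    return sum((x - mean) ** order * f for x, f in zip(midpoints, frequencies)) / n

def skewness_and_kurtosis(midpoints, frequencies):
    """Return (skewness coefficient, excess kurtosis coefficient)."""
    sigma = central_moment(midpoints, frequencies, 2) ** 0.5
    skewness = central_moment(midpoints, frequencies, 3) / sigma ** 3
    kurtosis = central_moment(midpoints, frequencies, 4) / sigma ** 4 - 3
    return skewness, kurtosis

# Hypothetical grouped data: interval midpoints and group frequencies
print(skewness_and_kurtosis([2, 4, 6, 8, 10, 12], [2, 4, 6, 10, 5, 3]))
```

For a symmetric series the skewness coefficient is zero, which is a convenient sanity check for the spreadsheet formulas as well.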
