Unit 3 DS
STATISTICS
A box plot shows how far the data items lie from each other. Lines
called whiskers extend from the upper and lower quartiles, thus, the
plot is also termed a box-and-whisker plot.
import matplotlib.pyplot as plt

# Four samples of 20 values each; one box is drawn per sample
value1 = [82,76,24,40,67,62,75,78,71,32,98,89,78,67,72,82,87,66,56,52]
value2 = [62,5,91,25,36,32,96,95,3,90,95,32,27,55,100,15,71,11,37,21]
value3 = [23,89,12,78,72,89,25,69,68,86,19,49,15,16,16,75,65,31,25,52]
value4 = [59,73,70,16,81,61,88,98,10,87,29,72,16,23,72,88,78,99,75,30]
box_plot_data = [value1, value2, value3, value4]
plt.boxplot(box_plot_data)
plt.show()
RESULT OF CODE :
Pivot Table
• Pivot tables are one of Excel's most powerful features. A pivot table allows you
to extract the significance from a large, detailed data set.
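The same summarising idea can be sketched in Python with the pandas pivot_table method. This is only an illustration: the sales table, its column names and its numbers are made up and do not come from the notes.

```python
import pandas as pd

# Hypothetical sales records; every name and number here is invented.
sales = pd.DataFrame({
    "region":  ["North", "North", "South", "South", "North", "South"],
    "product": ["A", "B", "A", "B", "A", "B"],
    "amount":  [100, 150, 200, 50, 300, 75],
})

# One row per region, one column per product, each cell the summed amount.
summary = sales.pivot_table(index="region", columns="product",
                            values="amount", aggfunc="sum")
print(summary)
```

Six detailed rows collapse into a 2 x 2 summary, which is the kind of condensation a pivot table is used for.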
HEAT MAPS :-
• A heat map (or heatmap) is a data visualization technique that shows magnitude of
a phenomenon as color in two dimensions. The variation in color may be by hue or
intensity, giving obvious visual cues to the reader about how the phenomenon is clustered
or varies over space.
• A heatmap is a two-dimensional graphical
representation of data where the individual
values that are contained in a matrix are
represented as colours
CODE :-
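import matplotlib.pyplot as plt
from pandas import DataFrame

# Each inner list is one row of the matrix (lists keep order and duplicates; sets do not)
data = [[2,3,4,1],[6,3,5,2],[6,3,5,4],[3,7,5,4],[2,8,1,5]]
df = DataFrame(data)
plt.pcolor(df)
plt.show()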
RESULT OF CODE :-
CORRELATION STATISTICS
corrmat = data.corr()
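The line above only works when data is a pandas DataFrame. A minimal sketch under that assumption follows; the column names and values are hypothetical and chosen so the result is easy to check by eye.

```python
import pandas as pd

# Hypothetical table; 'data' is assumed to be a DataFrame, as data.corr() requires.
data = pd.DataFrame({
    "hours":  [1, 2, 3, 4, 5],
    "score":  [2, 4, 6, 8, 10],   # rises in exact proportion to hours
    "errors": [10, 8, 6, 4, 2],   # falls in exact proportion as hours rise
})

# Pairwise Pearson correlation between all numeric columns (the pandas default).
corrmat = data.corr()
print(corrmat)
```

Each entry lies between -1 and 1. Here hours and score correlate at 1 and hours and errors at -1, because both relationships are exactly linear. A matrix like corrmat is also a natural input for a heat map.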
RANDOM VARIABLES
• The use of random variables is most common in probability and statistics, where they
are used to quantify outcomes.
• Risk analysts use random variables to estimate the probability of an adverse event
occurring.
Variance
• Variance is a measure of how data points differ from the mean.
In layman's terms, variance measures how far a set of numbers
is spread out from its mean (average) value.
• Variance is the expected squared deviation of a value from the
mean. The standard deviation is the square root of the variance,
so the two always rise and fall together.
• The higher the variance, the more scattered the data is around
its mean, and the lower the variance, the less scattered it is.
Therefore, it is called a measure of spread of data around the
mean.
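The points above can be checked with a short calculation. This sketch uses the population form of variance (dividing by n). The sample form divides by n - 1 instead, and the notes do not say which one is intended. The function name and both data sets are made up for illustration.

```python
from math import sqrt

def population_variance(values):
    """Mean of the squared deviations from the mean (divides by n)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

# Two hypothetical data sets with the same mean (10) but different spread.
tight = [9, 10, 11]
wide = [0, 10, 20]

for name, values in (("tight", tight), ("wide", wide)):
    var = population_variance(values)
    # Standard deviation is the square root of the variance.
    print(name, "variance =", var, "std dev =", sqrt(var))
```

Both sets have the same mean, yet the wide set has a much larger variance because its points lie farther from that mean.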
COVARIANCE
• Covariance is a measure of the relationship between two random variables and of the extent
to which they change together. In other words, it describes how a change in one variable is
associated with a change in the other. The term is also used for the property of a function of
maintaining its form when the variables are linearly transformed, which is a separate meaning.
Covariance is measured in units obtained by multiplying the units of the two variables.
• Covariance can have both positive and negative values. Based on this, it has two types:
1. Positive covariance: the variables tend to move in the same direction.
2. Negative covariance: the variables tend to move in opposite directions.
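Both types can be seen in a small calculation. This sketch uses the population form of covariance (dividing by n), which is an assumption because the notes do not specify a form. The function and the three variables are hypothetical.

```python
def covariance(xs, ys):
    """Mean of the products of paired deviations from each mean (divides by n)."""
    if len(xs) != len(ys):
        raise ValueError("both variables need the same number of observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical observations.
hours = [1, 2, 3, 4, 5]
score = [2, 4, 6, 8, 10]    # increases with hours
errors = [10, 8, 6, 4, 2]   # decreases as hours increase

print(covariance(hours, score))   # positive covariance
print(covariance(hours, errors))  # negative covariance
```

The sign shows the direction of the relationship. The size depends on the units of both variables, which is why correlation is often used when a unit-free measure is needed.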
Correlation and Linear Transformations of Random Variables