Boxplot
The area inside the box (the middle 50% of the data) is known as the
Interquartile Range (IQR). The IQR is calculated as:
IQR = Q3 - Q1
Outliers are the data points that lie below the lower limit or above the
upper limit. By the usual convention, the limits are calculated as:
Lower limit = Q1 - 1.5 × IQR
Upper limit = Q3 + 1.5 × IQR
Values below the lower limit or above the upper limit are considered
outliers. The minimum and maximum are then taken from the points that lie
within the lower and upper limits.
How to create a box plot?
Let us take some sample data to understand how to create a box plot.
Here are the runs scored by a cricket team in a league of 12 matches:
100, 120, 110, 150, 110, 140, 130, 170, 120, 220, 140, 110.
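To draw a box plot for the given data, we first arrange the data in
ascending order and then find the minimum, first quartile, median,
third quartile and maximum.
In ascending order the data are:
100, 110, 110, 110, 120, 120, 130, 140, 140, 150, 170, 220.
The median is the average of the two central values: (120 + 130) / 2 = 125.
To find the first quartile, we take the first six values and find their median,
which gives Q1 = 110. Likewise, the median of the last six values gives Q3 = 145.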
Note: If the total number of values is odd, we exclude the median while
calculating Q1 and Q3. Here, since there are two central values, we include
them. Next, we calculate the interquartile range: IQR = Q3 - Q1 = 145 - 110 = 35.
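The steps above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the function name five_number_summary is hypothetical, the quartiles use the "median of each half" method described in the note, and the limits assume the common 1.5 × IQR rule.

```python
from statistics import median

# Runs scored by the team in 12 matches (from the text).
runs = [100, 120, 110, 150, 110, 140, 130, 170, 120, 220, 140, 110]

def five_number_summary(values):
    """Box plot statistics using the median-of-halves quartile method."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    # For an odd count the median itself is excluded from both halves.
    upper_half = data[half:] if n % 2 == 0 else data[half + 1:]

    q1, q2, q3 = median(lower_half), median(data), median(upper_half)
    iqr = q3 - q1
    # Assumption: limits follow the common 1.5 * IQR convention.
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr

    inside = [x for x in data if lower_limit <= x <= upper_limit]
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "min": min(inside), "q1": q1, "median": q2, "q3": q3,
        "max": max(inside), "iqr": iqr,
        "lower_limit": lower_limit, "upper_limit": upper_limit,
        "outliers": outliers,
    }

summary = five_number_summary(runs)
print(summary)
```

For this data set the score of 220 falls above the upper limit, so it is reported as an outlier and the maximum used for the whisker is 170.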
What is a Histogram?
A histogram is a graphical representation of the frequency distribution of
a continuous series using rectangles.
The x-axis of the graph represents the class interval, and the y-axis shows the
various frequencies corresponding to different class intervals.
There are no gaps between consecutive rectangles, because a histogram is
drawn only when the data are in the form of the frequency distribution
of a continuous series.
No histogram can be drawn for a data set in the form of a discrete series. This
distinguishes histograms from bar graphs, which can be plotted for both
discrete and continuous series.
The major difference between a histogram and a bar graph is that the former is
two-dimensional; i.e., both the width and length of the rectangles are used for
comparison, whereas the latter is one-dimensional, which means only the length
of the rectangles is used for comparison. A histogram is used to determine the
value of the Mode of a data set in the form of a continuous series.
Types of Histogram
When histograms are drawn based on the data with unequal class intervals,
they are known as Histograms of unequal class intervals.
In the above table, the class interval is calculated as the difference between the
upper-class limit and the lower-class limit, i.e.,
15-10=5, 20-15=5, 25-20=5, 30-25=5, 40-30=10, 60-40=20, and 80-60=20.
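The class-width arithmetic above can be checked with a short Python sketch. The class limits come from the text. The frequencies are hypothetical, since the original table is not reproduced here, and the adjustment step (scaling each frequency by smallest width / class width so that bar areas stay comparable) is a common textbook convention assumed for illustration, not something stated in the text.

```python
# Class limits taken from the text: (lower limit, upper limit).
class_limits = [(10, 15), (15, 20), (20, 25), (25, 30), (30, 40), (40, 60), (60, 80)]

# Class interval = upper-class limit - lower-class limit.
widths = [upper - lower for lower, upper in class_limits]

# Hypothetical frequencies, for illustration only.
frequencies = [4, 10, 16, 22, 20, 18, 8]

def adjusted_frequencies(freqs, class_widths):
    """Scale each frequency to the smallest class width (assumed convention)."""
    smallest = min(class_widths)
    return [f * smallest / w for f, w in zip(freqs, class_widths)]

heights = adjusted_frequencies(frequencies, widths)
for (lower, upper), w, h in zip(class_limits, widths, heights):
    print(f"{lower}-{upper}: width={w}, adjusted frequency={h}")
```

Classes that are two or four times as wide as the smallest one have their bar heights divided by two or four, so the wider bars do not exaggerate their frequencies.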
4. Plotting Histogram:
The scatter diagram method gives the investigator/analyst a visual idea of the
nature of the association between two variables. It is the simplest method of
studying the relationship between two variables, as there is no need to calculate
any numerical value.
3. Positive Correlation
When the points of the scatter diagram cluster around a straight line
(upward slope from left to right), then the correlation is said to be positive.
4. Negative Correlation
When the points of the scatter diagram cluster around a straight line
(downward/negative slope), then the correlation is said to be negative.
5. No Correlation
When the points of the scatter diagram are scattered in a haphazard manner,
there is zero or no correlation.
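The three cases above are read by eye from the diagram. As a numerical cross-check, the sketch below swaps in Pearson's correlation coefficient, which the text itself does not use. The data sets, the function names and the 0.1 tolerance are all hypothetical choices made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

def direction(xs, ys, tol=0.1):
    """Label the correlation; tol is an arbitrary illustrative threshold."""
    r = pearson_r(xs, ys)
    if r > tol:
        return "positive"
    if r < -tol:
        return "negative"
    return "none"

# Hypothetical data sets, for illustration only.
x = [1, 2, 3, 4, 5]
upward = [2, 4, 6, 8, 10]      # points rise from left to right
downward = [10, 8, 6, 4, 2]    # points fall from left to right
haphazard = [3, 1, 4, 1, 3]    # no visible trend

print(direction(x, upward), direction(x, downward), direction(x, haphazard))
```

A coefficient near +1 or -1 corresponds to points clustering tightly around an upward or downward line, while a value near 0 corresponds to the haphazard case.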
How to interpret a Scatter Diagram?
While interpreting a scatter diagram, the following points should be taken into
consideration:
Dense or Scattered Points: If the plotted points are close to each other, then
the analyst can expect a high degree of correlation between the two variables.
However, if the plotted points are widely scattered, then the analyst can expect
a poor correlation between the variables.
Trend or No Trend: If the points plotted on the scatter diagram show a
trend, either upward or downward, then it can be said that the variables are
correlated. However, if the plotted points do not show any trend, then it can be
said that the variables are uncorrelated.
2. First Step: It is the first step of investigating the relationship between two
variables.
3. Unsuitable for Large Observations: If there are more than two variables, it
becomes difficult to draw a scatter diagram.