Data Visualization & Data Exploration - Unit II


Data Visualization & Data Exploration
By:
Dr. Shweta Agarwal
SOB, UPES, KANDOLI CAMPUS
 One of the most effective mechanisms for presenting data in a form
meaningful to decision makers is graphical depiction.
 Through graphs and charts, the decision maker can often get an overall
picture of the data and reach some useful conclusions merely by studying the
chart or graph.
 Data graphs can generally be classified as
 Quantitative
 Qualitative.
 The qualitative graphs are plotted using non-numerical categories.
Quantitative data graphs

 Quantitative data graphs are plotted along a numerical scale. There are seven
types of quantitative data graphs:
 Histogram
 Dot plots
 A stem-and-leaf plot
 Scatter plot
 Frequency Polygons
 Ogives
 Box-and-Whisker Plots
Histogram

 A histogram is a series of contiguous bars or rectangles that represent the
frequency of data in given class intervals.
 A histogram is a useful tool for differentiating the frequencies of class
intervals.
 A quick glance at a histogram reveals which class intervals produce the
highest frequency totals.
 Examination of the histogram reveals where large increases or decreases
occur between classes.
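
As a minimal sketch of how class frequencies behind a histogram could be tallied, the following uses only standard Python. The data values, the class width of 10, and the starting endpoint of 10 are hypothetical, chosen only for illustration; each class is assumed to include its lower endpoint and exclude its upper endpoint.

```python
# Hypothetical data, class width, and first lower endpoint (illustration only).
data = [12, 15, 21, 22, 23, 27, 31, 33, 34, 35, 38, 41, 47, 52]
class_width = 10
first_lower_endpoint = 10


def class_frequencies(values, start, width):
    """Count the values falling in each class interval [lo, lo + width)."""
    n_classes = (max(values) - start) // width + 1
    counts = [0] * n_classes
    for v in values:
        counts[(v - start) // width] += 1
    return [(start + i * width, start + (i + 1) * width, c)
            for i, c in enumerate(counts)]


freqs = class_frequencies(data, first_lower_endpoint, class_width)

# Text rendering: one '#' per observation stands in for the height of each bar.
for lo, hi, count in freqs:
    print(f"{lo:>3} to under {hi:<3} {'#' * count}")
```

The longest row of marks identifies the class interval with the highest frequency, and the change in length from one row to the next shows where the large increases or decreases occur.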
Dot plot

 The dot plot is a relatively simple statistical chart that is generally used to
display continuous, quantitative data.
 In a dot plot, each data value is plotted along the horizontal axis and is
represented on the chart by a dot.
 If multiple data points have the same values, the dots will stack up vertically.
 If there are a large number of close points, it may not be possible to display
all of the data values along the horizontal axis.
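
The stacking behaviour described above can be sketched in text form. This is an illustrative example only: the observations are hypothetical, and whole-number values are assumed so that each value maps to one column on the horizontal axis.

```python
from collections import Counter

# Hypothetical observations, invented for this sketch.
values = [3, 4, 4, 5, 5, 5, 6, 8]


def dot_plot_rows(values):
    """Return text rows (top to bottom) with one '*' per observation."""
    counts = Counter(values)
    axis = range(min(values), max(values) + 1)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):
        # Place a dot wherever the stack for x reaches this level.
        rows.append("".join(" * " if counts[x] >= level else "   " for x in axis))
    rows.append("".join(f"{x:^3}" for x in axis))  # horizontal axis labels
    return rows


rows = dot_plot_rows(values)
print("\n".join(rows))
```

Repeated values (here 4 and 5) produce taller stacks, while a value with no observations (here 7) leaves an empty column.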
Stem-and-leaf plot

 A stem-and-leaf plot is constructed by separating the digits for each number
of the data into two groups, a stem and a leaf.
 The leftmost digits are the stem and consist of the higher valued digits.
 The rightmost digits are the leaves and contain the lower values.
 One advantage of such a distribution, for example when an instructor plots a
set of test scores, is that the viewer can readily see whether the scores are in
the upper or lower end of each bracket and also determine the spread of the scores.
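
A minimal sketch of the construction, assuming two-digit whole-number scores so that the tens digit is the stem and the units digit is the leaf. The scores are hypothetical.

```python
from collections import defaultdict

# Hypothetical test scores, invented for this sketch.
scores = [86, 77, 91, 60, 55, 76, 92, 47, 88, 67, 83, 97, 75, 89, 81]


def stem_and_leaf(values):
    """Group each value's units digit (leaf) under its tens digit (stem)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)


plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row is one bracket of ten, and the leaves show whether the scores sit toward the lower or upper end of that bracket.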
Scatter Plot

 A scatter plot is a two-dimensional graph plot of pairs of points from two
numerical variables.
 It helps the analyst to get an idea of whether the two numerical variables
exhibit any relationship.
Frequency Polygon

 A frequency polygon, like the histogram, is a graphical display of class
frequencies.
 However, instead of using bars or rectangles like a histogram, in a frequency
polygon each class frequency is plotted as a dot at the class midpoint, and the
dots are connected by a series of line segments.
 Construction of a frequency polygon begins by scaling class midpoints along
the horizontal axis and the frequency scale along the vertical axis.
 A dot is plotted for the associated frequency value at each class midpoint.
Connecting these midpoint dots completes the graph.
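
The points of a frequency polygon can be sketched as follows. The class intervals and frequencies are hypothetical; each tuple is assumed to hold (lower endpoint, upper endpoint, frequency).

```python
# Hypothetical frequency distribution: (lower endpoint, upper endpoint, frequency).
classes = [(10, 20, 2), (20, 30, 4), (30, 40, 5), (40, 50, 2), (50, 60, 1)]


def polygon_points(classes):
    """Pair each class midpoint (x) with its class frequency (y)."""
    return [((lo + hi) / 2, freq) for lo, hi, freq in classes]


points = polygon_points(classes)
for midpoint, freq in points:
    print(f"midpoint {midpoint:5.1f} -> frequency {freq}")
```

Joining consecutive points with line segments gives the polygon.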
Ogive

 An ogive (o-jive) is a cumulative frequency polygon.
 Construction begins by labeling the x-axis with the class endpoints and the
y-axis with the frequencies.
 However, the use of cumulative frequency values requires that the scale along
the y-axis be great enough to include the frequency total.
 A dot of zero frequency is plotted at the beginning of the first class, and
construction proceeds by marking a dot at the end of each class interval for
the cumulative value.
 Connecting the dots then completes the ogive.
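
The ogive points can be sketched in the same way. The class intervals and frequencies are hypothetical; each tuple is assumed to hold (lower endpoint, upper endpoint, frequency).

```python
from itertools import accumulate

# Hypothetical frequency distribution: (lower endpoint, upper endpoint, frequency).
classes = [(10, 20, 2), (20, 30, 4), (30, 40, 5), (40, 50, 2), (50, 60, 1)]


def ogive_points(classes):
    """Return (class endpoint, cumulative frequency) pairs for an ogive."""
    # Start with zero frequency at the beginning of the first class.
    endpoints = [classes[0][0]] + [hi for _, hi, _ in classes]
    cumulative = [0] + list(accumulate(freq for _, _, freq in classes))
    return list(zip(endpoints, cumulative))


points = ogive_points(classes)
for endpoint, cum_freq in points:
    print(f"x = {endpoint:3d}, cumulative frequency = {cum_freq}")
```

The last cumulative value equals the frequency total, which is why the y-axis scale must reach at least that total.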
Box-and-Whisker Plots

 A box-and-whisker plot, sometimes called a box plot, is a diagram that
utilizes the upper and lower quartiles along with the median and the two
most extreme values to depict a distribution graphically.
 The plot is constructed by using a box to enclose the median. This box is
extended outward from the median along a continuum to the lower and upper
quartiles, enclosing not only the median but also the middle 50% of the data.
 From the lower and upper quartiles, lines referred to as whiskers are
extended out from the box toward the outermost data values.
 The box-and-whisker plot is determined from five specific numbers:
 The median (Q2)
 The lower quartile (Q1)
 The upper quartile (Q3)
 The smallest value in the distribution
 The largest value in the distribution
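
As a rough sketch, the five numbers can be computed with Python's standard `statistics` module. The data are hypothetical. Quartile conventions differ between textbooks and software, so the "inclusive" method used here is an assumption and may not match the convention used in a given course.

```python
import statistics

# Hypothetical data, invented for this sketch.
data = [27, 14, 52, 21, 34, 18, 41, 25, 30]


def five_number_summary(values):
    """Return the five numbers that determine a box-and-whisker plot."""
    ordered = sorted(values)
    # Assumption: the "inclusive" quartile method; other conventions exist.
    q1, q2, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {"min": ordered[0], "Q1": q1, "median": q2, "Q3": q3, "max": ordered[-1]}


summary = five_number_summary(data)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

The box spans Q1 to Q3 and encloses the median; the whiskers run from the box out to the smallest and largest values.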
Qualitative Graphs

 In contrast to quantitative data graphs that are plotted along a numerical
scale, qualitative graphs are plotted using non-numerical categories. In this
section, we will examine two types of qualitative data graphs:
 pie charts,
 bar charts
Pie Chart

 A pie chart is a circular depiction of data where the area of the whole pie represents
100% of the data and slices of the pie represent a percentage breakdown of the
sublevels.
 Pie charts show the relative magnitudes of the parts to the whole. They are widely used
in business, particularly to depict such things as budget categories, market share, and
time/resource allocations.
 However, the use of pie charts is minimized in the sciences and technology because pie
charts can lead to less accurate judgments than are possible with other types of graphs.
 Generally, it is more difficult for the viewer to interpret the relative size of angles in a
pie chart than to judge the length of rectangles in a bar chart.
 To construct a pie chart from the data, first convert the raw figures to
proportions by dividing each figure by the total of all the figures.
 Because a circle contains 360 degrees, each proportion is then multiplied by
360 to obtain the correct number of degrees to represent each item.
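
The two construction steps can be sketched as follows. The budget categories and amounts are hypothetical, chosen only to illustrate the arithmetic.

```python
# Hypothetical budget figures, invented for this sketch.
budget = {"Salaries": 50, "Rent": 25, "Marketing": 15, "Other": 10}


def pie_angles(figures):
    """Convert raw figures to proportions, then to degrees of the circle."""
    total = sum(figures.values())
    angles = {}
    for label, value in figures.items():
        proportion = value / total          # step 1: share of the whole
        angles[label] = proportion * 360    # step 2: degrees for the slice
    return angles


angles = pie_angles(budget)
for label, degrees in angles.items():
    print(f"{label:<10} {degrees:6.1f} degrees")
```

Because the proportions sum to 1, the slice angles sum to 360 degrees.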
Bar Graph

 A bar graph or chart contains two or more categories along one axis and a series of bars, one
for each category, along the other axis.
 Typically, the length of the bar represents the magnitude of the measure (amount,
frequency, money, percentage, etc.) for each category.
 The bar graph is qualitative because the categories are non-numerical, and it may be either
horizontal or vertical.
 In Excel, horizontal bar graphs are referred to as bar charts, and vertical bar graphs are
referred to as column charts.
 A bar graph generally is constructed from the same type of data that is used to produce a pie
chart. However, an advantage of using a bar graph over a pie chart for a given set of data is
that, for categories that are close in value, it is considered easier to see the difference between
the bars of a bar graph than to discriminate between pie slices.
