
PA-NOTE-6 Data Visualization (different types of chart)

The document explains histograms as a type of bar graph that represents grouped continuous data, detailing various distribution shapes such as bell-shaped, uniform, right-skewed, left-skewed, and bimodal. It also distinguishes histograms from bar graphs and discusses the importance of bin sizes in data grouping. Additionally, it provides an overview of different types of charts and best practices for their design, including column charts, bar graphs, line graphs, dual axis charts, area charts, pie charts, and scatter plots.

Uploaded by

anikeit

Histograms

An extension of the bar graph is the histogram. A histogram is a type of vertical
bar graph in which the bars represent grouped continuous data. The shape of a
histogram can tell you a lot about the distribution of the data, as well as provide you
with information about the mean, median, and mode of the data set. The following
are some typical histograms, with a caption below each one explaining the
distribution of the data, as well as the characteristics of the mean, median, and
mode. Distributions can have other shapes besides the ones shown below, but these
represent the most common ones that you will see when analyzing data. In each of
the graphs below, the distributions are not perfectly shaped, but are shaped enough
to identify an overall pattern.

a)
Figure a represents a bell-shaped distribution, which has a single peak and tapers
off to both the left and to the right of the peak. The shape appears to be symmetric
about the center of the histogram. The single peak indicates that the distribution
is unimodal. The highest peak of the histogram represents the location of the mode
of the data set. The mode is the data value that occurs the most often in a data set.
For a symmetric, unimodal histogram, the values of the mean, median, and mode are
approximately the same and are all located at the center of the distribution.

b)
Figure b represents a distribution that is approximately uniform and forms a
rectangular, flat shape. The frequency of each class is approximately the same.

c)
Figure c represents a right-skewed distribution, which has a peak to the left of the
distribution and data values that taper off to the right. This distribution has a single
peak and is also unimodal. For a histogram that is skewed to the right, the mean is
located to the right on the distribution and is typically the largest of the measures of
central tendency. The mean is the largest because it is strongly affected by
the outliers in the right tail, which pull it to the right. The mode is typically the smallest
value and is located to the left on the distribution, at the
highest point of the peak. The median is located between the mode and the mean.

d)
Figure d represents a left-skewed distribution, which has a peak to the right of the
distribution and data values that taper off to the left. This distribution has a single
peak and is also unimodal. For a histogram that is skewed to the left, the mean is
located to the left on the distribution and is typically the smallest of the measures of
central tendency. The mean is the smallest because it is strongly affected by
the outliers in the left tail, which pull it to the left. The mode is typically the largest
value and is located to the right on the distribution, at the
highest point of the peak. The median is located
between the mode and the mean.
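
The ordering of the mean, median, and mode in skewed data can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module. The two small samples are invented for illustration and are not taken from the figures.

```python
from statistics import mean, median, mode

# Hypothetical right-skewed sample: most values are small, a few are large.
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 15]
# Mirror the sample to obtain a left-skewed one (long tail on the left).
left_skewed = [16 - x for x in right_skewed]


def summarize(data):
    """Return (mode, median, mean) for a list of numbers."""
    return mode(data), median(data), mean(data)


r_mode, r_median, r_mean = summarize(right_skewed)
l_mode, l_median, l_mean = summarize(left_skewed)

# Right-skewed: mode < median < mean.
print("right-skewed:", r_mode, r_median, r_mean)
# Left-skewed: mean < median < mode.
print("left-skewed: ", l_mode, l_median, l_mean)
```

For these samples the median falls between the mode and the mean in both cases. Real data sets can deviate from this ordering, so treat it as a typical pattern rather than a rule.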

e)
Figure e does not have one of the shapes described above. Its only defining
characteristic is that it has two peaks of the same height. This means that the
distribution is bimodal.

While there are similarities between a bar graph and a histogram, such as each bar
being the same width, a histogram has no spaces between the bars. The quantitative
data is grouped according to a determined bin size, or interval. The bin size refers to
the width of each bar, and the data is placed in the appropriate bin.
The bins, or groups of data, are plotted on the x-axis, and the frequencies of the
bins are plotted on the y-axis. A grouped frequency distribution is constructed for
the numerical data, and this table is used to create the histogram. In most cases, the
grouped frequency distribution is designed so there are no breaks in the intervals.
The upper limit of one bin is also the lower limit of the next bin, and a value that
falls exactly on that limit is counted in the next bin. This
means that if you had groups of data with a bin size of 10, the bins would be
represented by the notation [0-10), [10-20), [20-30), etc. If both limits were included, each bin
would contain 11 whole-number values, which is 1 more than the desired bin size of 10. The
bracket shows that the lower limit is included in the bin, and the parenthesis shows that the
upper limit is not.
For whole-number data, the first bin includes the values 0 through 9, and the next bin
includes the values 10
through 19. This makes the bins the proper size. Bin sizes are written in this manner
to simplify the process of grouping the data. The first bin can begin with the smallest
number of the data set and end with the value determined by adding the bin width to
this value, or the bin can begin with a reasonable value that is smaller than the
smallest data value.
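
The grouping rule described above can be sketched in a few lines of Python. This is an illustrative sketch, not a prescribed method: the function names `bin_label` and `grouped_frequencies` and the sample `scores` are assumptions made for the example.

```python
from collections import Counter


def bin_label(value, bin_width=10, start=0):
    """Return the half-open bin [low, high) that contains value."""
    index = (value - start) // bin_width
    low = start + index * bin_width
    return (low, low + bin_width)


def grouped_frequencies(values, bin_width=10, start=0):
    """Build a grouped frequency table, ordered by bin."""
    counts = Counter(bin_label(v, bin_width, start) for v in values)
    return dict(sorted(counts.items()))


# Hypothetical data; 10 and 20 sit exactly on a bin limit.
scores = [0, 4, 9, 10, 10, 19, 20, 25, 29]
table = grouped_frequencies(scores)

for (low, high), count in table.items():
    print(f"[{low}-{high}): {count}")
```

A value of 10 lands in [10-20) rather than [0-10), which is what keeps every bin the same width with no gaps and no double counting.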
Different Types of Graphs and Charts for Presenting Data

To better understand each chart and how it can be used, here's an overview of
each type of chart.

1. Column Chart

A column chart is used to show a comparison among different items, or it can show a
comparison of items over time. You could use this format to see the revenue per
landing page or customers by close date.
Design Best Practices for Column Charts:

• Use consistent colors throughout the chart, selecting accent colors to highlight meaningful data points or changes over time.
• Use horizontal labels to improve readability.
• Start the y-axis at 0 to appropriately reflect the values in your graph.
2. Bar Graph

A bar graph, basically a horizontal column chart, should be used to avoid clutter
when one data label is long or if you have more than 10 items to compare. This type
of visualization can also be used to display negative numbers.
Design Best Practices for Bar Graphs:

• Use consistent colors throughout the chart, selecting accent colors to highlight meaningful data points or changes over time.
• Use horizontal labels to improve readability.
• Start the value axis (the x-axis in a horizontal bar graph) at 0 to appropriately reflect the values in your graph.
3. Line Graph

A line graph reveals trends or progress over time and can be used to show many
different categories of data. You should use it when you chart a continuous data set.
Design Best Practices for Line Graphs:

• Use solid lines only.
• Don't plot more than four lines to avoid visual distractions.
• Use the right height so the lines take up roughly 2/3 of the y-axis' height.
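
One way to apply the 2/3 guideline is to derive the upper limit of the y-axis from the data. The sketch below assumes the axis starts at 0 and that all values are non-negative. The function name `y_axis_upper_limit` and the sample data are hypothetical.

```python
def y_axis_upper_limit(series, fill_fraction=2 / 3):
    """Upper y-axis limit so the highest point sits at fill_fraction of the axis.

    Assumes the y-axis starts at 0 and all values are non-negative.
    """
    if not 0 < fill_fraction <= 1:
        raise ValueError("fill_fraction must be in (0, 1]")
    highest = max(max(values) for values in series)
    return highest / fill_fraction


# Two hypothetical lines plotted on the same chart.
monthly_visits = [[120, 150, 180], [90, 110, 140]]
limit = y_axis_upper_limit(monthly_visits)
print(round(limit))
```

The resulting value can be passed to whatever plotting tool is in use as the maximum of the y-axis.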
4. Dual Axis Chart

A dual axis chart allows you to plot data using two y-axes and a shared x-axis. It's
used with three data sets, one of which is based on a continuous set of data and
another which is better suited to being grouped by category. This should be used to
visualize a correlation or the lack thereof between these three data sets.
Design Best Practices for Dual Axis Charts:

• Use the y-axis on the left side for the primary variable because brains are naturally inclined to look left first.
• Use different graphing styles to illustrate the two data sets, as illustrated above.
• Choose contrasting colors for the two data sets.
5. Area Chart

An area chart is basically a line chart, but the space between the x-axis and the line
is filled with a color or pattern. It is useful for showing part-to-whole relations, such as
showing individual sales reps' contribution to total sales for a year. It helps you
analyze both overall and individual trend information.
Design Best Practices for Area Charts:

• Use transparent colors so information isn't obscured in the background.
• Don't display more than four categories to avoid clutter.
• Organize highly variable data at the top of the chart to make it easy to read.
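
The advice to place highly variable data at the top can be sketched as a sort on each series' variance before stacking. This is only one possible reading of the guideline. The names `stacking_order` and `stacked_tops` and the sales figures are made up for illustration.

```python
from statistics import pvariance


def stacking_order(series_by_name):
    """Order series from least variable (bottom) to most variable (top)."""
    return sorted(series_by_name, key=lambda name: pvariance(series_by_name[name]))


def stacked_tops(series_by_name, order):
    """Return the upper boundary of each stacked layer, bottom layer first."""
    length = len(next(iter(series_by_name.values())))
    running = [0] * length
    tops = {}
    for name in order:
        running = [r + v for r, v in zip(running, series_by_name[name])]
        tops[name] = running
    return tops


# Hypothetical quarterly sales per rep.
sales = {
    "Ana": [10, 30, 5, 40],
    "Ben": [20, 21, 19, 20],
    "Cy": [15, 18, 12, 16],
}
order = stacking_order(sales)
tops = stacked_tops(sales, order)
print(order)
print(tops[order[-1]])  # top boundary equals total sales per quarter
```

The steadiest series forms a flat base, so the more erratic series above it are easier to read, and the top boundary of the last layer shows the overall total.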
7. Pie Chart

A pie chart shows a static number and how categories represent part of a whole, that is,
the composition of something. A pie chart represents numbers in percentages, and
the total sum of all segments needs to equal 100%.
Design Best Practices for Pie Charts:

• Don't illustrate too many categories to ensure differentiation between slices.
• Ensure that the slice values add up to 100%.
• Order slices according to their size.
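
The last two practices can be sketched as a small helper that converts raw counts to percentages and orders the slices by size. The function name `pie_slices` and the traffic figures are hypothetical.

```python
def pie_slices(counts):
    """Convert raw counts to percentages, largest slice first."""
    total = sum(counts.values())
    percentages = {name: 100 * value / total for name, value in counts.items()}
    return dict(sorted(percentages.items(), key=lambda item: item[1], reverse=True))


# Hypothetical visits by traffic source.
traffic = {"Organic": 50, "Paid": 20, "Referral": 30}
slices = pie_slices(traffic)

for name, percent in slices.items():
    print(f"{name}: {percent:.1f}%")
print("total:", sum(slices.values()))
```

Because every percentage is computed from the same total, the slices add up to 100% apart from floating-point rounding.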
9. Scatter Plot Chart

A scatter plot, or scattergram, shows the relationship between two different
variables, or it can reveal distribution trends. It should be used when there are
many different data points, and you want to highlight similarities in the data set. This
is useful when looking for outliers or for understanding the distribution of your data.
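
A relationship seen in a scatter plot can also be summarized numerically. The sketch below computes the Pearson correlation coefficient, which is one common measure and is not named in these notes. The function name `pearson_r` and the data are assumptions for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Hypothetical paired observations: hours studied and test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 70, 74]
r = pearson_r(hours, scores)
print(round(r, 2))
```

A value near 1 or -1 suggests a strong linear relationship, while a value near 0 suggests little linear relationship. Plotting the points is still the best way to spot outliers.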
