Understanding and interpreting box plots

How to read a box plot/Introduction to box plots

Box plots are drawn for groups of W@S scale scores. They enable us to study the distributional characteristics of a group of scores as well as the level of the scores.

To begin with, scores are sorted. Then four equal-sized groups are made from the ordered scores. That is, 25% of all scores are placed in each group. The lines dividing the groups are called quartiles, and the groups are referred to as quartile groups. Usually we label these groups 1 to 4 starting at the bottom.

Definitions

Median
The median (middle quartile) marks the mid-point of the data and is shown by the line that divides the box into two parts. Half the scores are greater than or equal to this value and half are less.

Inter-quartile range
The middle “box” represents the middle 50% of scores for the group. The range of scores from lower to upper quartile is referred to as the inter-quartile range.

Upper quartile
Seventy-five percent of the scores fall below the upper quartile.

Lower quartile
Twenty-five percent of scores fall below the lower quartile.

Whiskers
The upper and lower whiskers represent scores outside the middle 50%. Whiskers often (but not always) stretch over a wider range of scores than the middle quartile groups.

Interpreting box plots/Box plots in general

Box plots are used to show overall patterns of response for a group. They provide a useful way to visualize the range and other characteristics of responses for a large group.

The diagram below shows a variety of different box plot shapes and positions.

Some general observations about box plots

The box plot is comparatively short – see example (2). This suggests that overall students have a high level of agreement with each other.

The box plot is comparatively tall – see examples (1) and (3). This suggests students hold quite different opinions about this aspect or sub-aspect.

One box plot is much higher or lower than another – compare (3) and (4). This could suggest a difference between groups. For example, the box plot for boys may be lower or higher than the equivalent plot for girls. Follow this up by looking at the Items at a Glance reports.

Obvious differences between box plots – see examples (1) and (2), (1) and (3), or (2) and (4). Any obvious difference between box plots for comparative groups is worthy of further investigation.

Your school box plot is much higher or lower than the national reference group box plot. This also suggests an area of difference that could be explored further in the Items in Detail reports and through consultation.

The 4 sections of the box plot are uneven in size – see example (1). This shows that many students have similar views at certain parts of the scale, but in other parts of the scale students are more variable in their views. The long upper whisker in the example means that students' views are varied amongst the most positive quartile group, and very similar for the least positive quartile group. The Items in Detail reports can be used to explore this further.

Same median, different distribution – see examples (1), (2), and (3). The medians (which generally will be close to the average) are all at the same level. However, the box plots in these examples show very different distributions of views.

It is always important to consider the pattern of the whole distribution of responses in a box plot.

Box Plot Explained: Interpretation, Examples, & Comparison

In descriptive statistics, a box plot or boxplot (also known as a box and whisker plot) is a type of chart often used in exploratory data analysis. Box plots visually show the distribution of numerical data and skewness by displaying the data quartiles (or percentiles) and averages.

Box plots show the five-number summary of a set of data: the minimum score, first (lower) quartile, median, third (upper) quartile, and maximum score.

Minimum Score
The lowest score, excluding outliers (shown at the end of the left whisker).

Lower Quartile
Twenty-five percent of scores fall below the lower quartile value (also known as the first quartile).

Median
The median marks the mid-point of the data and is shown by the line that divides the box into two parts (sometimes known as the second quartile). Half the scores are greater than or equal to this value, and half are less.

Upper Quartile
Seventy-five percent of the scores fall below the upper quartile value (also known as the third quartile). Thus, 25% of data are above this value.

Maximum Score
The highest score, excluding outliers (shown at the end of the right whisker).

Whiskers
The upper and lower whiskers represent scores outside the middle 50% (i.e., the lower 25% of scores and the upper 25% of scores).

The Interquartile Range (or IQR)
The box shows the middle 50% of scores (i.e., the range between the 25th and 75th percentile).

Why Are Box Plots Useful?

Box plots divide the data into sections containing approximately 25% of the data in that set. The shape of the box plot will show whether a data set is normally distributed or skewed.
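As a rough illustration of the five-number summary described above, the following Python sketch computes the values a box plot is drawn from. The function name `five_number_summary` and the sample scores are hypothetical. The quartile method ("inclusive" interpolation from the standard library) is an assumption, since different tools use slightly different quartile conventions. As a simplification, the minimum and maximum here are the raw extremes, so outliers are not excluded.

```python
import statistics


def five_number_summary(scores):
    """Return the five values a box plot is built from.

    Assumption: quartiles use the "inclusive" interpolation method; other
    conventions give slightly different values on small samples.
    Simplification: minimum and maximum are the raw extremes, so outliers
    are not excluded here.
    """
    data = sorted(scores)
    if len(data) < 2:
        raise ValueError("need at least two scores")
    # quantiles(..., n=4) returns the three cut points Q1, Q2 (median), Q3.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {
        "minimum": data[0],
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "maximum": data[-1],
    }


# Hypothetical scores, for illustration only.
summary = five_number_summary([7, 1, 9, 3, 5, 2, 8, 4, 6])
# The interquartile range is the length of the box.
iqr = summary["upper_quartile"] - summary["lower_quartile"]
print(summary)
print("IQR:", iqr)
```

With these nine scores, about a quarter of the values fall in each section between consecutive summary values, which is what the four parts of the plot represent.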
Box plots are useful as they provide a visual summary of the data, enabling researchers to quickly identify median values, the dispersion of the data set, and signs of skewness.

Note that the image above represents data that is a perfect normal distribution, and most box plots will not conform to this symmetry (where the two halves of the box are the same length, as are the two whiskers).

Box plots are useful as they show the average score of a data set

The median is the middle value of a set of data (one type of average) and is shown by the line that divides the box into two parts. Half the scores are greater than or equal to this value, and half are less.

Box plots are useful as they show the skewness of a data set

When the median is in the middle of the box, and the whiskers are about the same on both sides of the box, then the distribution is symmetric.

When the median is closer to the bottom of the box, and if the whisker is shorter on the lower end of the box, then the distribution is positively skewed (skewed right).

When the median is closer to the top of the box, and if the whisker is shorter on the upper end of the box, then the distribution is negatively skewed (skewed left).

Box plots are useful as they show the dispersion of a data set

In statistics, dispersion (also called variability, scatter, or spread) is the extent to which a distribution is stretched or squeezed.
The smallest and largest values are found at the end of the ‘whiskers’ and are useful for providing a visual indicator regarding the spread of scores (e.g., the range).

The interquartile range (IQR) is the box in the box plot, showing the middle 50% of scores. It can be calculated by subtracting the lower quartile from the upper quartile (Q3 − Q1).

Box plots are useful as they show outliers within a data set

An outlier is an observation that is numerically distant from the rest of the data.

When reviewing a box plot, an outlier is defined as a data point that is located outside the whiskers of the box plot. For example, a point more than 1.5 times the interquartile range above the upper quartile or below the lower quartile (above Q3 + 1.5 * IQR or below Q1 − 1.5 * IQR).

How To Compare Box Plots

Box plots are a useful way to visualize differences among different samples or groups. They manage to provide a lot of statistical information, including medians, ranges, and outliers.

Note that although box plots have been presented horizontally in this article, it is more common to view them vertically in research papers.

Step 1: Compare the medians of box plots
Compare the respective medians of each box plot. If the median line of a box plot lies outside of the box of a comparison box plot, then there is likely to be a difference between the two groups.

Step 2: Compare the interquartile ranges and whiskers of box plots

Compare the interquartile ranges (that is, the box lengths) to examine how the data are dispersed in each sample. The longer the box, the more dispersed the data. The shorter the box, the less dispersed the data.

Next, look at the values at the end of the two whiskers. This shows the range of scores (another type of dispersion). Larger ranges indicate wider distribution, that is, more scattered data.

Step 3: Look for potential outliers

When reviewing a box plot, an outlier is defined as a data point that is located outside the whiskers of the box plot.

Step 4: Look for signs of skewness

If the data do not appear to be symmetric, does each sample show the same kind of asymmetry?
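The four comparison steps can be sketched numerically, without drawing anything. The Python below is a minimal illustration under stated assumptions, not a definitive procedure. The names `describe`, `medians_differ`, and `skew_hint` are hypothetical, as are the sample groups. It assumes "inclusive" quartile interpolation and the common convention of placing whisker ends at the most extreme scores within 1.5 * IQR of the box. The skewness check simply applies the rule of thumb about median position and whisker length, and reports "unclear" when the two signs disagree.

```python
import statistics


def describe(scores):
    """Summarise one group: box, whisker ends, and outliers.

    Assumptions: "inclusive" quartiles; whiskers stop at the most extreme
    scores within 1.5 * IQR of the box; anything beyond is an outlier.
    """
    data = sorted(scores)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": [x for x in data if x < low_fence or x > high_fence],
    }


def medians_differ(a, b):
    """Step 1: does either median lie outside the other group's box?"""
    a_outside_b = not (b["q1"] <= a["median"] <= b["q3"])
    b_outside_a = not (a["q1"] <= b["median"] <= a["q3"])
    return a_outside_b or b_outside_a


def skew_hint(s):
    """Step 4: rule-of-thumb skew from median position and whisker lengths."""
    lower_box = s["median"] - s["q1"]
    upper_box = s["q3"] - s["median"]
    lower_whisker = s["q1"] - s["whisker_low"]
    upper_whisker = s["whisker_high"] - s["q3"]
    if lower_box < upper_box and lower_whisker < upper_whisker:
        return "positive (right)"
    if lower_box > upper_box and lower_whisker > upper_whisker:
        return "negative (left)"
    if lower_box == upper_box and lower_whisker == upper_whisker:
        return "symmetric"
    return "unclear"


# Hypothetical groups, for illustration only.
group_a = describe([1, 2, 3, 4, 5, 6, 7, 8, 9])
group_b = describe([10, 11, 12, 13, 14, 15, 16, 17, 40])
group_c = describe([1, 2, 3, 4, 5, 7, 10, 14, 20])

print("Step 1, medians differ:", medians_differ(group_a, group_b))
print("Step 2, IQRs:", group_a["iqr"], group_b["iqr"])
print("Step 3, outliers:", group_a["outliers"], group_b["outliers"])
print("Step 4, skew:", skew_hint(group_a), "/", skew_hint(group_c))
```

A median falling outside the other group's box is only a visual rule of thumb. It does not replace a formal significance test.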
Model 1       Sum of Squares   DF   Mean Square   F       Sig.
Regression    .471             4    .118          1.576   .196
Residual      3.590            48   .075
Total         4.061            52

a. Predictors: (Constant), LC, EXT, DEBT, TANG
b. Dependent Variable: DPR