Notes CHAPTER 2


CHAPTER 2 : DATA PRESENTATION

Data presentation is an essential step before any further statistical analysis.

The data are summarized and displayed so that researchers, managers and decision-makers can observe their important features and gain insight into the type of model and analysis that should be used. Common data presentations include the frequency table, bar chart, pie chart, histogram, frequency curve, line graph, stem-and-leaf display and ogive.

Raw Data

Data that have been collected or recorded but have not yet been arranged or processed.

Example 1: Below are the heights of 10 children in cm:

58 22 25 76 30 53 64 47 31 66

These data are also called ungrouped data.

Organizing And Graphing Qualitative Data

Frequency

Definition: The number of observations that fall in a category (qualitative data) or in a class/interval (quantitative data).

Frequency Distribution Table

A frequency distribution for qualitative data lists all the categories and the number of elements that belong to each category.

Prepared by Zaihasra Binti Rukiman

Example:

Car model    Number of cars
Waja         66
Wira         50
Saga         39
Gen-2        25
Total        180
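As a rough illustration of how such a table comes from counting, the sketch below tallies a list of observations by category. The raw list is an assumption: it is rebuilt from the counts in the table, since the notes give only the totals.

```python
from collections import Counter

# Hypothetical raw observations, rebuilt from the counts in the table above.
observations = ["Waja"] * 66 + ["Wira"] * 50 + ["Saga"] * 39 + ["Gen-2"] * 25

# Counter tallies how many observations fall in each category.
frequency = Counter(observations)
total = sum(frequency.values())

print(f"{'Car model':<10}{'Number of cars':>15}")
for model, count in frequency.items():
    print(f"{model:<10}{count:>15}")
print(f"{'Total':<10}{total:>15}")
```

With real survey data, the list of observations would simply replace the reconstructed one.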

Graphical Presentation Of Qualitative Data

1. Pie chart
a. A circle divided into wedges (V-shaped sectors), one for each category.
b. The percentage distribution is normally well presented by a pie chart.
c. A pie chart can be used to represent categorical data. It consists of one or more circles that are divided into sectors. The sectors show the number of objects or the percentage in each group or category. The angle of each sector is proportional to the number or percentage of elements in that category.

[Figure: pie chart titled "Student Grades"]

Eg:

Current asset    RM (million)
Stocks           1520
Cash             720
Others           860

Construct a pie chart for the information above.

2. Bar chart
a. Used to display the frequencies contained in a frequency distribution.


i. y-axis: frequency, relative frequency or percentage
ii. x-axis: category
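For the current asset example given under the pie chart, the sector angles can be worked out as in the sketch below. It assumes only that each angle is the category's share of the total multiplied by 360 degrees; the function name `sector` is illustrative. The same percentages could serve as the y-axis values of a bar chart.

```python
# Current assets in RM (million), from the pie chart example.
assets = {"Stocks": 1520, "Cash": 720, "Others": 860}

total = sum(assets.values())

def sector(value, total):
    """Return (percentage, angle in degrees) for one category."""
    share = value / total
    return share * 100, share * 360

sectors = {name: sector(value, total) for name, value in assets.items()}

for name, (percent, angle) in sectors.items():
    print(f"{name:<8}{percent:6.1f}%{angle:8.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check before drawing the chart.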

Graphical Presentation Of Quantitative Data

1. Histograms
A graph that displays the data by using bars of various heights to represent the frequencies.
a. Similar to the bar charts described above.
b. The only difference is that there are no gaps between the bars.
c. The height of each bar represents the frequency.
d. The bars can be drawn either vertically or horizontally.

2. Frequency polygon
A graph that displays the data by using lines that connect points plotted for the frequencies at the midpoints of the classes. The frequencies are represented by the heights of the points.

3. Ogives
A graph that represents the cumulative frequencies for the classes in a frequency distribution.


4. Stem-and-leaf plots
A stem-and-leaf plot is a data plot that uses part of a data value as the stem (the leading digit) and part of the data value as the leaf (the trailing digit) to form groups or classes. It has the advantage over a grouped frequency distribution of retaining the actual data while showing them in graphic form. Sometimes we can construct a mixture model.

[Figures: stem-and-leaf layouts with the columns Stem | Leaf and Leaf | Stem | Leaf]
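A minimal sketch of a stem-and-leaf plot, using the heights of the 10 children from Example 1 in the Raw Data section. It assumes two-digit whole numbers, so the tens digit is the stem and the units digit is the leaf; the function name `stem_and_leaf` is illustrative.

```python
from collections import defaultdict

# Heights (cm) of the 10 children from the raw-data example.
heights = [58, 22, 25, 76, 30, 53, 64, 47, 31, 66]

def stem_and_leaf(values):
    """Group two-digit values by tens digit (stem) and units digit (leaf)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(heights)
for stem in range(min(plot), max(plot) + 1):
    leaves = " ".join(str(leaf) for leaf in plot.get(stem, []))
    print(f"{stem} | {leaves}")
```

Each row keeps the original values (stem 5 with leaves 3 and 8 stands for 53 and 58), which is the advantage over a grouped frequency table noted above.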


Conclusion

Data can be organized in a meaningful way using frequency distributions. Once the frequency distribution is constructed, representing the data by graphs is a simple task.

Construct The Frequency Tables

- A frequency distribution where several numbers are grouped into one class.
- A grouping of data into mutually exclusive classes showing the number of observations in each class.

Class interval

E.g. 40-49 as class interval
40 = lower class limit (the smallest number that can actually belong to the class)
49 = upper class limit (the largest number that can actually belong to the class)

Class boundaries

The class boundaries are obtained by increasing the upper class limits and decreasing the lower class limits by the same amount so that there are no gaps between consecutive classes. The amount to be added or subtracted is half the difference between the upper limit of one class and the lower limit of the following class. For example, if the class 40-49 is followed by the class 50-59, the amount is (50 - 49)/2 = 0.5.


E.g. for class interval 40-49
Lower class boundary: 40 - 0.5 = 39.5
Upper class boundary: 49 + 0.5 = 49.5

Class width

The difference between the upper and lower boundaries of any class.
E.g. for class interval 40-49: 49.5 - 39.5 = 10
The class width is also the difference between the lower limits of two consecutive classes or the upper limits of two consecutive classes. For whole-number limits it can be computed from the limits of a single class:
E.g. for class interval 40-49: 49 - 40 + 1 = 10

Class mark (midpoint)

The number in the middle of the class.
E.g. for class interval 40-49: (40 + 49)/2 = 44.5
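The three calculations for the class 40-49 can be collected in one small function, sketched below. The function name `class_summary` and the `gap` parameter are illustrative; the sketch assumes that half the gap between consecutive classes is added to the upper limit and subtracted from the lower limit, which matches the 0.5 used in the worked example.

```python
def class_summary(lower, upper, gap=1):
    """Boundaries, width and midpoint of one class.

    gap is the difference between the upper limit of one class and the
    lower limit of the next (1 for whole-number classes such as 40-49, 50-59).
    """
    half = gap / 2
    lower_boundary = lower - half
    upper_boundary = upper + half
    width = upper_boundary - lower_boundary
    midpoint = (lower + upper) / 2
    return lower_boundary, upper_boundary, width, midpoint

print(class_summary(40, 49))  # (39.5, 49.5, 10.0, 44.5)
```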

Cumulative frequency

Gives the total number of values that fall below the upper boundary of each class. Every cumulative class has the same lower limit but a different upper limit.

Cumulative Relative Frequency

Cumulative relative frequency = cumulative frequency of each class / sum of all frequencies

Cumulative Percentage Distributions

Cumulative percentage = cumulative relative frequency × 100

How to develop a grouped frequency distribution?


STEP 1
Determine the range.
R = Highest Value - Lowest Value

STEP 2
Determine the tentative number of classes (k).
k = 1 + 3.33 log N, where N is the number of observations and log is the base-10 logarithm
Always round up to the next whole number.

Note: The number of classes should be between 5 and 20. The actual number of classes may be affected by convenience or other subjective factors.

STEP 3
Find the class width by dividing the range by the number of classes.
class width = range / number of classes, i.e. c = R / k
(Always round up.)

STEP 4
Write the classes or categories starting with the lowest score. Add the class width to the starting point to get the second lower class limit. Add the class width to the second lower class limit to get the third, and so on. Stop when a class includes the highest score. List the lower class limits in a vertical column and enter the upper class limits, which can be easily identified at this stage.

STEP 5
Determine the frequency for each class by referring to the tally column and present the results in a table.

STEP 6
If necessary, find the relative frequencies and/or the relative cumulative frequencies.

When constructing frequency tables, the following guidelines should be followed.


- The classes must be mutually exclusive. That is, each score must belong to exactly one class.
- Include all classes, even if the frequency is zero.
- All classes should have the same width, although it is sometimes impossible to avoid open-ended intervals such as "65 years or older".
- The number of classes should be between 5 and 20.

Example 1: In the previous examination, the marks in the statistics examination for 50 students were as follows:

19 23 47 17 24 21 27 25 23 18
21 69 36 29 27 33 20 65 42 23
30 65 31 70 37 25 40 41 17 18
22 26 71 33 25 73 20 24 65 46
37 76 35 16 27 75 24 63 25 25

You are required to create a frequency table.

Determine the range.
R = Highest Value - Lowest Value
R = 76 - 16 = 60

Determine the tentative number of classes (K).
K = 1 + 3.33 log N
  = 1 + 3.33 log 50
  = 1 + 3.33 (1.69897)
  = 6.66
*Round up the result to the next integer if the decimal part exceeds 0.
K = 7

Find the class width.
class width = range / number of classes, i.e. c = R / K
c = 60 / 7 = 8.57 ≈ 9
*Round up the quotient to the next integer if the decimal part exceeds 0.

Classes    Tally                   Frequency
16 - 24    ///// ///// ///// //    17
25 - 33    ///// ///// ////        14
34 - 42    ///// //                7
43 - 51    //                      2
52 - 60                            0
61 - 69    /////                   5
70 - 78    /////                   5
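The worked example can be checked with a short sketch that follows Steps 1 to 5 on the 50 marks. It assumes whole-number data (so each upper limit is the lower limit plus the width minus 1), a base-10 logarithm, and rounding up in Steps 2 and 3 as in the example; the function name `grouped_frequency` is illustrative.

```python
import math

marks = [19, 23, 47, 17, 24, 21, 27, 25, 23, 18,
         21, 69, 36, 29, 27, 33, 20, 65, 42, 23,
         30, 65, 31, 70, 37, 25, 40, 41, 17, 18,
         22, 26, 71, 33, 25, 73, 20, 24, 65, 46,
         37, 76, 35, 16, 27, 75, 24, 63, 25, 25]

def grouped_frequency(data):
    """Follow Steps 1-5 for whole-number data; returns [((lower, upper), f), ...]."""
    low, high = min(data), max(data)
    data_range = high - low                            # Step 1: range
    k = math.ceil(1 + 3.33 * math.log10(len(data)))    # Step 2: classes, rounded up
    width = math.ceil(data_range / k)                  # Step 3: width, rounded up
    table = []
    lower = low                                        # Step 4: start at lowest score
    while lower <= high:
        upper = lower + width - 1
        f = sum(lower <= x <= upper for x in data)     # Step 5: count per class
        table.append(((lower, upper), f))
        lower += width
    return table

table = grouped_frequency(marks)
for (lower, upper), f in table:
    print(f"{lower} - {upper}: {f}")
```

The loop stops once the lower limit passes the highest score, so the empty class 52 - 60 is still listed, in line with the guideline to include classes with zero frequency.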

Cumulative frequency distribution: The less than cumulative frequency distribution (F<) is constructed by adding the frequencies from the lowest class interval to the highest, while the more than cumulative frequency distribution (F>) is constructed by adding the frequencies from the highest class interval to the lowest.

Relative frequency distribution: A relative frequency distribution indicates the proportion of the total number of observations that occurs in each interval. That is,

relative frequency (rf) = frequency of each class interval / total number of observations
rf = f / Σf

Relative frequencies may be expressed in percent. Hence a relative frequency table is also called a percentage frequency distribution.

Note: A relative cumulative frequency distribution may be constructed by applying the same formula to the less than or more than cumulative frequencies.
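As an illustration, the sketch below computes F<, F>, the relative frequencies and the percentages for the frequency table of the 50 examination marks. The variable names are illustrative; the class labels and frequencies are taken from that table.

```python
from itertools import accumulate

classes = ["16-24", "25-33", "34-42", "43-51", "52-60", "61-69", "70-78"]
freq = [17, 14, 7, 2, 0, 5, 5]
n = sum(freq)

less_than = list(accumulate(freq))                    # F<: add from the lowest class upward
more_than = list(accumulate(reversed(freq)))[::-1]    # F>: add from the highest class downward
relative = [f / n for f in freq]                      # rf = f / sum of all frequencies
percent = [100 * rf for rf in relative]

print(f"{'Class':<7}{'f':>4}{'F<':>5}{'F>':>5}{'%':>8}")
for row in zip(classes, freq, less_than, more_than, percent):
    print("{:<7}{:>4}{:>5}{:>5}{:>8.1f}".format(*row))
```

The last F< value and the first F> value both equal the total number of observations, which is a useful check on the arithmetic.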

Example 2:


By using the data below, you are required to build a frequency distribution that includes the class intervals, class boundaries, relative frequencies, percentage relative frequencies, and the less than and more than cumulative frequencies.

Data presentation by graph

1. Histogram (x-axis = class boundaries, y-axis = frequencies)

Example 4

Class interval    f     Class boundaries
33-41             2     32.5-41.5
42-50             5     41.5-50.5
51-59             11    50.5-59.5
60-68             15    59.5-68.5
69-77             11    68.5-77.5
78-86             3     77.5-86.5
87-95             3     86.5-95.5
                  Σf = 50

[Figure: Histogram showing the marks of students in the statistics examination. x-axis: class boundaries (32.5-41.5 to 86.5-95.5); y-axis: frequencies (0 to 16).]

2. Frequency polygon (x-axis = class midpoints, y-axis = frequencies)

Formed by letting the midpoint of each class represent the data in that class and then connecting the points plotted above the midpoints at the heights of their respective class frequencies.
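For the Example 4 table, the points of the frequency polygon can be listed as in the sketch below. It only computes the coordinates (midpoint, frequency); drawing them with a plotting tool is left out, and the variable names are illustrative.

```python
# Class limits and frequencies from Example 4.
limits = [(33, 41), (42, 50), (51, 59), (60, 68), (69, 77), (78, 86), (87, 95)]
freq = [2, 5, 11, 15, 11, 3, 3]

# Each class is represented by its midpoint (class mark).
midpoints = [(lower + upper) / 2 for lower, upper in limits]

# Points to plot and join with straight lines: (midpoint, frequency).
polygon = list(zip(midpoints, freq))

for x, y in polygon:
    print(f"midpoint {x:5.1f}  frequency {y}")
```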


3. Ogives (x-axis = class boundaries, y-axis = cumulative frequencies). There are two types of ogives.

The less than ogive is an increasing function (plotted from F<, starting from 0).
The more than ogive falls to the right (plotted from F>, starting from the total frequency).
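The points for both ogives of Example 4 can be sketched as follows, using the definitions of F< and F> given earlier in the chapter: the less than values accumulate upward from 0, and the more than values fall from the total frequency to 0. Variable names are illustrative, and only the coordinates are computed.

```python
from itertools import accumulate

# Class boundaries and frequencies from Example 4.
boundaries = [32.5, 41.5, 50.5, 59.5, 68.5, 77.5, 86.5, 95.5]
freq = [2, 5, 11, 15, 11, 3, 3]
total = sum(freq)

# Less than ogive: 0 at the lowest boundary, then running totals at each upper boundary.
cum_less = [0] + list(accumulate(freq))
# More than ogive: the total at the lowest boundary, falling to 0 at the highest.
cum_more = [total - c for c in cum_less]

less_than_ogive = list(zip(boundaries, cum_less))
more_than_ogive = list(zip(boundaries, cum_more))

for (b, lt), (_, mt) in zip(less_than_ogive, more_than_ogive):
    print(f"boundary {b:5.1f}  F< {lt:3d}  F> {mt:3d}")
```

At every boundary the two cumulative values add up to the total frequency of 50.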
