Data-Collection

This document outlines the objectives and methods of data collection and presentation, emphasizing the importance of organizing and interpreting data effectively. It categorizes data into types such as quantitative and qualitative, and discusses various methods for collecting and presenting data, including tables, graphs, and charts. Additionally, it details the characteristics and construction of different graphical representations like histograms, bar graphs, and pie charts.

Uploaded by

kole jgol
Copyright
© All Rights Reserved

Data Collection & Presentation

Objectives: At the end of this chapter, the student will be able to:

1. Identify data to be collected relevant to the problem.

2. Know different types of data.

3. Organize and present the gathered data using appropriate tables and graphs.

4. Employ an appropriate method of data gathering.

The Collection of Data

Collection of Data refers to the process of gathering information through methods such as interviews,
questionnaires, experiments, observation, and documentary analysis. Data should be properly collected
so that an investigator may be able to answer the questions under consideration with a reasonable
degree of confidence. Data are also collections of any number of related observations on one or
more variables. They include statistical facts, historical facts, principles, opinions, and items from various
sources, such as scores, ages, I.Q., income, intelligence test scores, aptitude test results, personality trait
ratings, and others.

Data are the facts and figures that are collected, analyzed, and summarized for presentation and
interpretation. Data may be classified as either quantitative or qualitative. Quantitative data measure
either how much or how many of something, and qualitative data provide labels, or names, for
categories of like items. For example, suppose that a particular study is interested in characteristics such
as age, gender, marital status, and annual income for a sample of 100 individuals. These characteristics
would be called the variables of the study, and data values for each of the variables would be associated
with each individual.

Sample survey methods are used to collect data from observational studies, and experimental design
methods are used to collect data from experimental studies. The area of descriptive statistics is
concerned primarily with methods of presenting and interpreting data using graphs, tables, and
numerical summaries. Whenever statisticians use data from a sample (i.e., a subset of the population)
to make statements about a population, they are performing statistical inference. Estimation and
hypothesis testing are procedures used to make statistical inferences.

Fields such as health care, biology, chemistry, physics, education, engineering, business, and economics
make extensive use of statistical inference. Methods of probability were developed initially for the
analysis of gambling games. Probability plays a key role in statistical inference; it is used to provide
measures of the quality and precision of the inferences. Many of the methods of statistical inference are
described in this article. Some of these methods are used primarily for single-variable studies, while
others, such as regression and correlation analysis, are used to make inferences about relationships
among two or more variables.

What are the Methods of Collection of Data?

1. Direct or interview method

This is done through personal communication with the individual you want to interview.

2. Indirect or questionnaires method

This is done by sending questionnaires to the person from whom you would like to get information.

3. Registration method

This is done by utilizing existing records.

4. Observation method

This can be done directly or indirectly.

5. Experiment method

This can be done by conducting a scientific inquiry.

What is the Classification of Data?

1. Raw Data (Ungrouped Data) are collected data that have not been organized numerically. An
arrangement of raw data in ascending or descending order of magnitude is called an array.

2. Categorical Data are observations that are put in the same or different classes, the classes possessing
qualitative differences.

3. Ranked Data are observations that show their relative position based on some characteristic, without
necessarily yielding a numerical value for that characteristic.

4. Quantitative Data are numerical measurements, such as commodity stocks, prices, costs, and profits,
which are analyzed in relation to consumption, supply, and demand.

5. Discrete Data consist of either a finite number of values or a countable number of values. They are
characterized by gaps in which no real values may be obtained, and are made up of items whose values
have been obtained by counting. Examples: number of books, school enrollment, etc.

6. Continuous Data arise from the measurement of a continuous variable. Examples: weights of children,
school achievement, I.Q., heights of children.

The Presentation of Data

1. Textual Form of Presentation

The data are incorporated into the paragraphs of the discussion. Many people cannot easily
understand or comprehend data set in tabular form unless a preliminary explanation is made.

2. Tabular Form of Presentation

Another presentation of data is in tabular form, a way of classifying related numerical facts in horizontal
arrays and vertical arrays. It is the process of condensing classified data and arranging them in a table.
Data can more readily be understood and comparisons may more easily be made. The most commonly
used tabular summary of data for a single variable is a frequency distribution. A frequency distribution
shows the number of data values in each of several non-overlapping classes. Another tabular summary,
called a relative frequency distribution, shows the fraction, or percentage, of data values in each class.
The most common tabular summary of data for two variables is a cross tabulation, a two-variable
analogue of a frequency distribution.

Constructing a frequency distribution for a quantitative variable requires more care in defining the
classes and the division points between adjacent classes. A frequency distribution would show the
number of data values in each of these classes, and a relative frequency distribution would show the
fraction of data values in each.
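The two tabular summaries described above can be sketched in a few lines of Python. This is only an illustration: the list of marital-status responses and the function names are hypothetical and do not come from the text.

```python
from collections import Counter

# Hypothetical marital-status responses for 10 individuals (illustrative only).
responses = ["single", "married", "married", "single", "widowed",
             "married", "single", "married", "divorced", "married"]


def frequency_distribution(values):
    """Return {class: number of data values in that class}."""
    return dict(Counter(values))


def relative_frequency_distribution(values):
    """Return {class: fraction of all data values in that class}."""
    counts = Counter(values)
    total = len(values)
    return {label: count / total for label, count in counts.items()}


freq = frequency_distribution(responses)
rel_freq = relative_frequency_distribution(responses)

# Print one row per class: label, frequency, relative frequency as a percentage.
for label in sorted(freq):
    print(f"{label:<10}{freq[label]:>3}{rel_freq[label]:>8.0%}")
```

Because the classes do not overlap, the frequencies add up to the number of observations and the relative frequencies add up to 1.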

A cross tabulation is a two-way table with the rows of the table representing the classes of one variable
and the columns of the table representing the classes of another variable. To construct a cross tabulation
using the variables gender and age, gender could be shown with two rows, male and female, and age
could be shown with six columns corresponding to the age classes 20-29, 30-39, 40-49, 50-59, 60-69, and
70-79.
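A cross tabulation of gender by age class might be built as in the sketch below. The sample records are hypothetical, and the helper names (age_class, cross_tabulate) are assumptions made for the illustration; only the two rows and six age classes come from the text.

```python
from collections import Counter

AGE_CLASSES = ["20-29", "30-39", "40-49", "50-59", "60-69", "70-79"]
GENDERS = ["male", "female"]


def age_class(age):
    """Map an age in years (20 to 79) to its class label, e.g. 34 -> '30-39'."""
    if not 20 <= age <= 79:
        raise ValueError(f"age {age} is outside the classes 20-29 to 70-79")
    lower = (age // 10) * 10
    return f"{lower}-{lower + 9}"


def cross_tabulate(records):
    """Count (gender, age class) pairs; rows are genders, columns are age classes."""
    counts = Counter((gender, age_class(age)) for gender, age in records)
    return {g: [counts[(g, c)] for c in AGE_CLASSES] for g in GENDERS}


# Hypothetical sample of (gender, age) records, for illustration only.
sample = [("male", 23), ("female", 35), ("female", 38), ("male", 41),
          ("male", 67), ("female", 72), ("male", 29), ("female", 55)]

table = cross_tabulate(sample)
print("gender  " + "  ".join(AGE_CLASSES))
for gender in GENDERS:
    print(f"{gender:<8}" + "  ".join(f"{n:>5}" for n in table[gender]))
```

Each cell holds the number of individuals who fall in both the row class and the column class, so the cell counts add up to the sample size.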

What are the Parts of a Statistical Table?

1. Table heading

2. Stub

3. Box head

3.1 stub head

3.2 master caption

3.3 column caption

3.4 row caption

4. Body

Table Number
Title

Stub Head      Master Caption
               Column Caption   Column Caption   Column Caption   Column Caption
Row Caption    (body)
Row Caption    (body)
Row Caption    (body)

Analyzing rows and columns. This simple example began with a discussion of the row-points in the table
shown above. However, one may rather be interested in the column totals, in which case one could plot
the column points in a small-dimensional space, which satisfactorily reproduces the similarity (and
distances) between the relative frequencies for the columns, across the rows, in the table shown above. In
fact it is customary to simultaneously plot the column points and the row points in a single graph, to
summarize the information contained in a two-way table.

The Graphical Form of Presentation

Another method of presenting data is by using graphs or charts, the most commonly used being
line diagrams, bar charts, pie diagrams, pictorial graphs, and statistical maps. A graph turns data or the
results of statistical analysis into a diagram that is easily understood at a glance. To appeal to a person’s
sense of sight, as much data as possible is conveyed in a condensed, quick, and accurate manner by
putting the data in diagrammatic form. Graphs are also commonly used in quality control activities.

A number of graphical methods are available for describing data. A bar graph is a graphical device for
depicting qualitative data that have been summarized in a frequency distribution. Labels for the
categories of the qualitative variable are shown on the horizontal axis of the graph. A bar above each
label is constructed such that the height of each bar is proportional to the number of data values in the
category.

The Basic Steps for Drawing Graphs

a. Be clear on purpose of drawing the graph. When drawing the graph, the most
important thing is to be clear about the purpose of drawing it. Then parallel with the
purpose, collect the information and the data.

b. Arrange the data and information into graph form. It is difficult to hold the interest
of anyone, or to convince anyone, with information and data as they are. It is therefore essential to
process the data first, for example by taking averages so that comparisons can be made.

c. Select the graph. Examine each graph’s advantages and disadvantages: match these
to the purpose of usage before deciding on the graph to be used.

d. Decide on the graph title. Reaffirm the purpose of drawing the graph as in step “a”,
then, bearing that purpose in mind, decide on a graph title that will: 1) be concise; 2) convey
the facts at a glance in an easily understandable manner; 3) hold the interest of the
person; 4) attract the attention of the person; 5) be supplemented by a subtitle if the main title is
insufficient to explain the content of the graph.

e. Decide the composition and color shades. In attempting to draw a good graph, do
not overdo by placing too much emphasis on composition and color shades. Be
cautious or it could be a failure.

f. Draw a draft for the graph. Try to draw a freehand draft of the graph.
Examine the graph size, scale, units and the total balance of the completed graph.

g. Draw the graph. When preparations on steps (a-f) are completed, you are ready for
the actual drawing. Bear these points in mind as you proceed.

1. Be definite on the base line (the line where the scale is zero).

2. When drawing the scale lines, they must be lighter than the base line.

3. In a graph where there are different units for entry, use double scales.

4. In a graph where there are many lines, bars, or sectors, put in an index to identify
each one. Make the main lines bolder or create the differences by changing the
colors.

5. The numeric values of the scales for the X and Y axes should be such that the
coordinate values in the graphs are easily understandable.

6. The lettering should be legible and easily read.

7. Numbers should be limited to three figures only.

8. When there are many lines, bars or sections, rank them in the order of
importance.

9. Always include an explanation of the scale units, the scale numbers, and the index.

10. Balance the X and Y axes.

The Histogram

Histogram refers to a graphic representation of a frequency distribution using vertical lines or rectangles.
It accommodates any number of categories at any level of measurement such as nominal,
ordinal, or interval. As a frequency distribution chart, the histogram puts the shape of the dispersion into
graph form. It examines the quality of a group, and seeks to show what manner of dispersion exists
around which central value. Even if the average value is the same, the broader the width of dispersion,
the lower the quality. A histogram is a type of bar graph in which the data exist over a range that is
divided into intervals. In each interval, the occurrences of the data are tallied into a frequency chart and
drawn in graph form. When the distribution condition, the dispersion situation, and the distribution
shape are known, changes in quality characteristics can be accurately judged. In other words, when data
are grouped in an easily readable form, the histogram becomes a very useful tool.

A histogram is the most common graphical presentation of quantitative data that have been
summarized in a frequency distribution. The values of the quantitative variable are shown on the
horizontal axis. A rectangle is drawn above each class such that the base of the rectangle is equal to the
width of the class interval and its height is proportional to the number of data values in the class.
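The tallying of data into intervals can be sketched as follows, with a text histogram in which the length of each bar is proportional to the class frequency. The measurements, the starting value, and the class width are hypothetical values chosen for the illustration, and integer data are assumed so that each class covers whole values.

```python
def tally_into_classes(values, start, width, n_classes):
    """Count how many integer values fall in each class of the given width."""
    counts = [0] * n_classes
    for v in values:
        index = (v - start) // width
        if 0 <= index < n_classes:
            counts[index] += 1
    return counts


def print_histogram(counts, start, width):
    """Print one row per class; the bar length equals the class frequency."""
    for i, count in enumerate(counts):
        lower = start + i * width
        upper = lower + width - 1
        print(f"{lower:>3}-{upper:<3} | " + "#" * count)


# Hypothetical integer measurements (illustrative only).
measurements = [48, 52, 55, 51, 49, 50, 53, 57, 51, 47, 52, 50, 56, 49, 51]

counts = tally_into_classes(measurements, start=45, width=3, n_classes=5)
print_histogram(counts, start=45, width=3)
```

With these assumed values the classes are 45-47, 48-50, 51-53, 54-56, and 57-59, and the printed bars show a single peak in the middle class.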

The Reading from the Shape of the Distribution

a. Standard Histogram. The right and left sides of the peak are symmetrical. This is a
histogram where there is consistency in the work process.

b. Comb-like Histogram. This shape results when there is a peculiarity in the measurement method or in
the rounding of the data, for example when the histogram class interval is not an integer multiple of
the unit of measurement.

c. Cliff-like Histogram. This shape is seen when the data of the things which are
outside of a certain specification are picked out from the total. In this chart, all
things below a certain specification value have been picked out from the total;
even those that were below the specification value are, when the measurements are
rechecked, now within the specification set.

d. High Plateau Histogram. When the differences in the average values are very small,
there will be no peaks but a flat top. When this shape is obtained, search for the
factor that differentiates the average values, and divide (stratify) the data into different
histograms.

e. Bi-Modal Histogram. A bi-modal histogram occurs when different sets of data with
different average values are placed into one graph.

f. Isolated Island Histogram. This histogram occurs when there is an error in the process,
sampling, data collection method, or measurement method. Investigate
the cause by looking back into past daily reports and records.

The Bar Graph

Bar Graph. A graphic representation of frequencies using vertical or horizontal bars, similar to a
histogram. It is one of the most common and widely used graphical devices.

Bar Chart. The bar chart is a graph for the comparison of independent elements. Generally, the vertical
axis indicates the size of the numeric quantity (degree, number of cases, defect ratio, cost, etc.), while
the horizontal axis indicates the characteristic values (defect items, defect causes, etc.). A bar chart is used
for the comparison of quantities. Therefore, the information is correctly read by
comparing the proportions of the bar lengths. Bar charts cannot indicate any
change in a time series, but they can be very effective graphs for comparing quantities at a
specific time.

The Characteristics of Bar Charts

a. The highest value and the lowest value are easily found.

b. A small difference between the items compared, which would be difficult to detect
in numeric form, is easily detected here.

c. The ranking of the comparison objects can be detected.

d. The comparison data (previous months, year, etc.) can be collectively shown.

How to Draw a Bar Chart

1. Draw the vertical axis and the horizontal axis. The vertical axis is generally
placed on the left-hand side of the graph, so that the two axes form an L shape. Draw the axes in
bold lines.

2. Marking the scales:

a. Put the largest value on the top of the vertical axis and move downwards in the
order.

b. The base point is generally zero.

c. Where changes need to be made easier to see, the base point may be set at a value other than zero.

d. Put the scales in round numbers.

e. Scale marking should be shorter and lighter than axes lines.

f. Scale markings are placed usually on the inner sides.

g. Space intervals between one bar and another bar should be in the ratio of 2:1 of
the column width.

h. However, if there are many bars, they can be closer together.

i. The column width should be equal throughout.

j. The item names should be equal throughout.

k. The item name should be written in the center of the bar column.

3. Putting in hatching.

a. When there is no necessity to differentiate between the items, the hatching can
be similar.

b. Where there is a necessity to differentiate between the items, use different types
of hatching.

c. If the bars are crowded, oblique lines placed alternately may make the bars
appear bent, so be careful of this.

4. In the case of numbers with extreme differences in range, use a wavy line to indicate this.

5. When comparison is done for figures with small differences, shorten the mid section
by putting a wavy line to it.

6. Checking the graph.

a. Are the numeric value and the bar length in concurrence?

b. Are there any mistakes in writing down the numeric values of the scales?

c. Do the bars appear bent?

d. Is the hatching done as requested?

e. Is the graph neat and tidy?

7. Titling the graph

a. Write the title larger than the scale numbers and the items.

b. The title must be concise and easily understandable.

8. Putting in the precaution notes.

a. The size of the letters must be smaller than the item.

b. The notes should be clustered at one place.

c. The notes should be concise, using as few words as possible.

The Pie Chart or Circle graph

Pie Chart (Circle Graph) can provide a fast and easy presentation of nominal data divided into a few
categories. Whether the data are population figures, sales turnover, productivity levels, budget
figures, defect totals, accident occurrences, etc., the area of each sector lets us intuitively grasp the
composition ratio of each category in the pie chart. A donut chart is a form of pie chart in which a
concentric circle is drawn so that the data name, statistics, etc. can be entered in the graph. The graph
resembles a donut, as its name aptly suggests. A pie chart enables one to grasp at a glance the
composition ratio of each category, such as by characteristic element.

A pie chart is another graphical device for summarizing qualitative data. The size of each slice of the pie
is proportional to the number of data values in the corresponding class.

How to Draw a Pie Chart

1. Draw a circle, placing a line from the center to the right in the horizontal position.
This will be the base line.

2. Take the angle for each item; draw the division line.

a. The cut-off fan-shaped portion is called a pie-shape.

b. The usual practice is to start from the item with the biggest percentage and enter the items in a
clockwise direction.

c. Fit in the item “Others” at the end.

3. Write in the item name and percentage into the pie.

a. Write the item name in a horizontal position with the percentage below.

b. The percentage is written up to one decimal point.

c. Should the pie be small, use the indicator lines to denote the wordings outside
the pie.

d. As a rule, the item “Others” should not be shaded.

4. In a donut chart, the center of the donut will be for the statistics.
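The angle for each item follows from its share of the total: angle = (item count ÷ total) × 360 degrees. The sketch below computes the percentage (to one decimal place) and the angle for each item, ordered from the biggest share down with the item “Others” placed last. The defect counts and the function name pie_slices are hypothetical.

```python
def pie_slices(counts, others_label="Others"):
    """Return (item, percentage, angle in degrees) in drawing order.

    Items are sorted from the largest share to the smallest, with the
    catch-all item (if present) placed at the end.
    """
    total = sum(counts.values())
    ordered = sorted(
        counts.items(),
        key=lambda pair: (pair[0] == others_label, -pair[1]),
    )
    return [
        (item, round(100 * n / total, 1), 360 * n / total)
        for item, n in ordered
    ]


# Hypothetical defect counts by cause (illustrative only).
defects = {"Scratch": 45, "Others": 15, "Dent": 30, "Stain": 10}

slices = pie_slices(defects)
for item, percent, angle in slices:
    print(f"{item:<8}{percent:>6.1f}%{angle:>8.1f} deg")
```

The angles always add up to 360 degrees, which is a quick check before drawing the division lines.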

The Line Graph

Line Graph is a graph suitable for plotting information on changes over time.

How to Draw the Line Graph

1. Draw the vertical axis and the horizontal axis, marking in the scales respectively.

2. Plot the data.

3. Join the plotted data into the line.

4. For data values that are far above the rest, put a wavy break in the vertical axis, thereby shortening
the scale midway.

5. When plotting different sets of data, it is best to change the line style to differentiate
them.

The Use of the Line Graph

1. To explain the change in quantity. It can be used for the explanation of sales results,
productivity achievement, defect ratio, etc. Merely enumerating the numbers involves no
skill. When the same numbers are depicted in a line graph, they appeal to the
sense of sight and carry more persuasive power. In this instance, it must
be noted that the graph must be accurate and easily understood.

2. It can be used for the control of periodic changes in quality, cost, delivery time, etc.
In concrete terms the characteristic of the achievement value to be controlled
should be plotted on the graph, and when it is compared to the standard value or
the objective value, the problem can be detected early and preventive action can be
taken.

3. To analyze the level and to have a grasp of the points movement in an abnormal
process. It is also useful to analyze improvement activities that must be carried out.

4. To detect the problem areas in the present state. When the present state is shown in
the graph, it would show whether the present level is normal or whether it is
necessary to carry out improvement. It also picks up the problem areas where the
action must be taken.

5. For effective confirmation of countermeasure steps. Compare the data obtained
before and after the countermeasures by drawing them as a line graph, which is then
used to check the effectiveness of the countermeasures.

6. For preparation of report. The data obtained from workplace experiment or market
survey is rearranged into a line graph which summarizes a large quantity of
information into a concise and easily understood form. This can be widely used in
the arrangement of information.

The Scatter Diagram

Scatter Diagram. A scatter diagram examines the relationship between one set of data and another, and
the level of that relationship. The objective of drawing a scatter diagram is to pair two sets of data and to
examine the distribution pattern:

a. The relationship between cause (factor) and result (characteristic).

b. The relationship between characteristic (result) and result (characteristic).

c. The relationship between cause (factor) and cause (factor).

Situations:

a. Relationship between the number of complaints and the number of machines under
repair by a serviceman.

b. Relationship between the number of years of sales experience and sales figures.

c. Relationship between the number of projects completed by a circle and the number of
meetings conducted.

d. Relationship between monthly sales turnover of any product and gross profit.

The most important point to note in reading scatter diagrams is to check whether the data are
stratified or not. When the data are stratified, a correlation may be seen to exist where there seemed to
be none, and where there seemed to be a correlation, none may exist.

When judging the correlation in this manner, it is important to note the range of the data and to
carefully read the graph.

A scatter diagram does not explain why there is a correlation, so it is vital to examine the two sets of data
technically.

What to Note in a Scatter Diagram

To ensure the effective application of a scatter diagram, check the following points:

a. Is the data accumulation too scanty?

b. Are there mistakes in measurement?

c. Are the results obtained from the scatter diagram being applied to the next action?

How to Draw a Scatter Diagram

a. To examine the relationships between two sets of data, collect the corresponding
group of data.

b. Decide which set of data goes on the x-axis and which on the y-axis; in the case of a factor and a
characteristic, the factor is X and the characteristic is Y. Also determine the largest value
and the smallest value of each set.

c. Draw the horizontal axis (X) and the vertical axis (Y). Each
scale starts from the smallest value of X and Y respectively. Make the two scales as closely identical as
possible.

d. Plot a point where the X and Y values of each pair intersect. If the data values overlap, make
concentric circles.

e. Lastly, enter the data sampling, collection period, objective, product name,
manufacturing group, operator’s name, and the date manufactured.
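A scatter diagram is read by eye, but the level of the relationship between the two sets of data can also be summarized numerically. The sketch below uses the Pearson correlation coefficient, a technique that the text does not name, applied to hypothetical pairs of sales experience (X) and sales figures (Y). The function name and the data are assumptions for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired observations (between -1 and 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long series with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical pairs: years of sales experience (X) and sales figures (Y).
years = [1, 2, 3, 4, 5, 6]
sales = [12, 15, 19, 22, 24, 30]

r = pearson_r(years, sales)
print(f"r = {r:.3f}")
```

A value near +1 or -1 corresponds to points lying close to a straight line, and a value near 0 to no linear pattern. As with the diagram itself, the coefficient does not explain why a correlation exists, and it should be recomputed for each stratum when the data are stratified.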

The Frequency Curve

Frequency Curve refers to the graphic representation of the number of scores in each interval of the
distribution.

Frequency Distribution is the tabular arrangement of the given data by using categories or classes and
their corresponding frequencies.

Frequency Polygon is a line graph of class frequencies plotted against class marks. It is made by
connecting the midpoints of the rectangular tops in the histogram, or simply joining the plotted points
for the class marks and their corresponding frequencies. This kind of graphical presentation can also
accommodate categories of wide range, but is more useful for data such as ordinal and interval because
it stresses continuity along a scale.

Expected Frequency is the theoretical frequency for a cell in a contingency table or multinomial table,
computed on the basis of some hypothesis.

Observed Frequency is the actual frequency count of any observation and recorded in a cell of a
contingency table or multinomial table.

Cumulative Frequency refers to a frequency obtained by cumulating, or successively adding, the
individual frequencies from the bottom or from the top. The greater-than cumulative frequency adds
the frequencies successively starting from the highest class limit, and the less-than cumulative frequency
starting from the lowest class limit.

Kinds of Cumulative Frequency

1. The less-than cumulative frequency distribution gives, for each class interval, the sum of
the frequencies of all values less than the upper class boundary of that
interval.

2. The greater-than cumulative frequency distribution gives, for each class interval, the sum of
the frequencies of all values greater than the lower class boundary of that
interval.

Steps in Constructing a Frequency Distribution


1. Find the range by getting the difference between the highest and lowest values in
the set of data.

2. Find the number of class intervals or categories desired. The ideal number of class
intervals is somewhere between 5 and 15.

3. Find the approximate size of the class interval by dividing the range by the desired
number of class intervals.

4. Write the intervals starting with the lowest lower limit as determined by the
researcher’s choice. The upper limit of each interval is its lower limit plus the size of the class interval
minus 1.

5. Find the class frequencies for each class interval by referring to the tally column.

6. Compute the class mark (midpoint) of each class interval by adding its lower and upper limits, then
dividing the sum by 2.

Example: consider the following raw data.

85 90 88 86 80

74 81 85 81 90

82 87 84 89 72

70 77 71 85 78

76 74 70 73 74

89 83 90 74 90

78 88 85 81 89

86 91 84 90 88

76 75 83 70 80

75 79 86 80 76

Solution:

Step 1: Arrange the data in descending order (read down each column).

91 88 84 80 75

90 87 84 79 74

90 87 83 79 74

90 86 83 78 74

90 86 82 78 74

90 86 81 77 73

89 85 81 76 72

89 85 81 76 71

88 85 80 76 70

88 85 80 75 70

Step 2: Determine the range of the score in the above data.

Range = Highest Value – Lowest Value

Range = 91 – 70

Range = 21

Step 3: Determine the number of classes, using a class interval (size) of 3.

Number of Classes = (Range ÷ Class Interval) + 1

Number of Classes = (21 ÷ 3) + 1

Number of Classes = 8

Note:

a. If series contains less than 50 cases, 10 classes or less just enough. The usual class intervals
are 3, 5, and 10.

b. If series contains 50 to 100 cases, 10 to 15 classes are suggested.

c. If more than 100 cases, 15 or more classes are good.

Step 4: Find the starting point of the class limits.

a. Divide the highest value or score by the class interval or size. Take note of the remainder.

91 ÷ 3 = 30, remainder 1

b. Subtract the remainder from the highest value.

91 – 1 = 90 is the starting point of the class limits.

Step 5: You can write the class limits or intervals in either descending or ascending order. Note that the
lower and upper limits of every class interval are included in the class size.

Midpoint = (Lower Limit + Upper Limit) ÷ 2

Midpoint = (90 + 92) ÷ 2

Midpoint = 91

Class Limits/Intervals  Midpoint  Class Boundaries  Tally       Frequency  Cf<  Cf>  Cf<%  Cf>%

90 – 92                 91        89.5 – 92.5       IIIII-I     6          50   6    100   12
87 – 89                 88        86.5 – 89.5       IIIII-II    7          44   13   88    26
84 – 86                 85        83.5 – 86.5       IIIII-IIII  9          37   22   74    44
81 – 83                 82        80.5 – 83.5       IIIII-I     6          28   28   56    56
78 – 80                 79        77.5 – 80.5       IIIII-II    7          22   35   44    70
75 – 77                 76        74.5 – 77.5       IIIII-I     6          15   41   30    82
72 – 74                 73        71.5 – 74.5       IIIII-I     6          9    47   18    94
69 – 71                 70        68.5 – 71.5       III         3          3    50   6     100

N = 50
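The construction steps can be checked with a short Python sketch. It starts from the ordered array of Step 1 and a class size of 3, and recomputes the class limits, midpoints, frequencies, and both cumulative frequencies. Class boundaries are taken as the class limits minus and plus 0.5, following the definition of class boundaries. The function name build_table and the dictionary keys are assumptions made for the illustration.

```python
# Ordered scores from Step 1 of the worked example (50 values).
scores = [
    91, 90, 90, 90, 90, 90, 89, 89, 88, 88,
    88, 87, 87, 86, 86, 86, 85, 85, 85, 85,
    84, 84, 83, 83, 82, 81, 81, 81, 80, 80,
    80, 79, 79, 78, 78, 77, 76, 76, 76, 75,
    75, 74, 74, 74, 74, 73, 72, 71, 70, 70,
]

CLASS_SIZE = 3  # class size used in the worked example


def build_table(values, class_size):
    """Return one dict per class interval, from the highest class down."""
    highest, lowest = max(values), min(values)
    # Step 4: starting lower limit = highest value minus its remainder.
    lower = highest - highest % class_size
    n = len(values)
    rows = []
    while lower + class_size - 1 >= lowest:
        upper = lower + class_size - 1
        rows.append({
            "limits": (lower, upper),
            "midpoint": (lower + upper) / 2,
            "boundaries": (lower - 0.5, upper + 0.5),
            "f": sum(lower <= v <= upper for v in values),
        })
        lower -= class_size
    # "Greater than" accumulates from the highest class downward;
    # "less than" is everything not yet counted above the current class.
    running = 0
    for row in rows:
        running += row["f"]
        row["cf_gt"] = running
        row["cf_lt"] = n - running + row["f"]
        row["cf_lt_pct"] = 100 * row["cf_lt"] / n
        row["cf_gt_pct"] = 100 * row["cf_gt"] / n
    return rows


table = build_table(scores, CLASS_SIZE)
for row in table:
    low, high = row["limits"]
    print(f"{low}-{high}  mid {row['midpoint']:.0f}  f {row['f']}  "
          f"cf< {row['cf_lt']}  cf> {row['cf_gt']}")
```

Run on the ordered array, the sketch yields eight classes from 90-92 down to 69-71, with the same Frequency, Cf<, and Cf> columns as the table above.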

Class Interval is the distance between the upper and lower limits of a step of test scores in a grouped
frequency distribution.

Class Mark is the mid-value of a class in a frequency distribution.

Class Boundaries are values obtained from a frequency distribution by increasing the upper class limits
and decreasing the lower class limits by the same amount so that there are no gaps between
consecutive classes. These are carried out to one more decimal place than the recorded observation.

Class Frequency is the number of observations belonging to a class interval, or the number of items
within the category.

The Source of Data

1. Documentary Sources. This source may be taken from primary or secondary information.

2. Field Sources. These include living persons who have sufficient knowledge about social
conditions or have been in intimate contact with the subject over a considerable period of
time.
