Handouts 2 ENDATA130 Data Collection and Organization
The document discusses various data gathering techniques and concepts in data management including data organization. It covers topics like methods of data collection, types of questionnaires, types of questions, data management components, constructing frequency distributions, and alternative approaches like stem-leaf plots.
Learning Outcomes:
At the end of the discussion and presentations, the student should be able to:
- Determine the various data gathering techniques;
- Explore the concepts in data management; and
- Perform data organization.

HANDOUTS 2
ENDATA130
Prepared by: Y.E. Fernandez

Topic Coverage
- Data Gathering/Collection Technique
- Data Management - Data Organization

Methods of Data Gathering/Collection:
1) Asking questions, through
   - direct / interview form
   - indirect / questionnaire form (survey)
2) Observation
3) Use of existing data
4) Experimentation
5) Simulation

Types of Questionnaires:
Unstructured – the questions are asked in no particular order or arrangement, as long as every question needed to answer the questions posed in the study is asked.
Structured – the questions are arranged according to the order of the statement of the problem.

Types of Questions:
Open-ended – those which can be answered in any form and length.
Close-ended – those for which the researcher provides a number of possible responses to choose from.
   - may include alternate-response questions, multiple-choice questions, scaled opinionnaires, attitude scales using the Likert scale, etc.
NOTE: The questions commonly asked for descriptive statistics are about demographic data.

Data Management
Data Management includes:
Data Organization – involves meaningful organization of data into a frequency distribution or tabular form.
Data Presentation – involves the use of statistical graphs and charts.

Two Types of Data File / Set:
Raw Data or Ungrouped Data – refers to the data in the original form as they are collected.
Organized Data or Grouped Data – refers to the data already systematically organized into a frequency distribution.

Frequency Distribution – the organization of raw/ungrouped data in table form, using classes and frequencies.
Frequency – the number of values in a specific class of the distribution.

Types of Frequency Distribution
1) Categorical Frequency Distribution - applicable for nominal and ordinal level data.
2) Grouped Frequency Distribution - applicable for numeric data such as interval and ratio level data.

For Categorical Frequency Distribution
Construction of Categorical Frequency Distribution
   - features of the table include:
Class    Tally    Frequency    Percent
(refer to the WORD or DOCS file for illustration.)

For Grouped Frequency Distribution
Features of Grouped Frequency Distribution:
1) Class or class interval – specific range of values whose frequency is obtained.
2) Class limits – the values included in a given class:
   a) Lower class limit (LL) – lowest value in a given class.
   b) Upper class limit (UL) – highest value in a given class.
3) Range (R) – the difference between the highest and lowest values in the data file/set.
   Where: R = highest value - lowest value in the raw data
4) Class size/width/length (C) – number of units of numeric value in a given class.
   Where: C = UL – LL + 1 (for data recorded in whole numbers)
   Or: C = Range / desired number of classes
5) Class boundaries – the values that separate the classes so that there are no gaps in the frequency distribution.
   (Basic Rule: The class limits should have the same decimal place value as the data, but the class boundaries should have one additional place value and end with 5.)
   - refer to the board for illustration
6) Class midpoint (M) – middlemost value in a given class.
   Where: M = (UL + LL) / 2

Guidelines for classes
1) There should be between 5 and 20 classes.
2) The class width should be an odd number. For whole-number data, this guarantees that the class midpoints are integers instead of decimals.
3) The classes must be mutually exclusive. This means that no data value can fall into two different classes.
4) The classes must be all-inclusive or exhaustive. This means that all data values must be included.
5) The classes must be continuous. There are no gaps in a frequency distribution. Classes that have no values in them must be included (unless it is the first or last class, which is dropped).
6) The classes must be equal in width. The exception here is the first or last class. It is possible to have a "below ..." or "... and above" class. This is often used with ages.
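
The class features and guidelines above can be sketched in Python. This is only an illustration under stated assumptions: the data are whole numbers, the `scores` list is made up, and the function names `class_width` and `build_classes` are hypothetical helpers, not part of the handout.

```python
import math

def class_width(data, desired_classes):
    """C = Range / desired number of classes, rounded up to an odd whole number."""
    r = max(data) - min(data)                   # R = highest value - lowest value
    c = max(1, math.ceil(r / desired_classes))  # round up to a whole number
    return c if c % 2 == 1 else c + 1           # guideline: odd class width

def build_classes(data, width):
    """List classes from the lowest value until the highest value is covered.

    Assumes whole-number data, so UL = LL + C - 1 and boundaries end in .5.
    """
    classes = []
    ll = min(data)                              # LL of the first class
    while True:
        ul = ll + width - 1                     # from C = UL - LL + 1
        classes.append({
            "LL": ll,
            "UL": ul,
            "lower_boundary": ll - 0.5,
            "upper_boundary": ul + 0.5,
            "midpoint": (ul + ll) / 2,          # M = (UL + LL) / 2
        })
        if ul >= max(data):                     # all-inclusive: highest value covered
            break
        ll = ul + 1                             # next class starts right after UL
    return classes

# Hypothetical raw data, for illustration only.
scores = [12, 15, 21, 24, 30, 33, 37, 41, 45, 48]
width = class_width(scores, desired_classes=5)
classes = build_classes(scores, width)
for c in classes:
    print(c)
```

The loop keeps adding classes until the highest value is covered, so the final count can exceed the desired number of classes by one. That is a design choice of this sketch, made so that the classes stay exhaustive.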
The Reasons for Constructing a Frequency Distribution
1) To organize the data in a meaningful, intelligible way.
2) To enable the reader to determine the nature or shape of the distribution.
3) To facilitate computational procedures for measures of average (central tendency) and spread (dispersion) of the data set.
4) To enable the researcher to draw charts and graphs for data presentation.
5) To enable the reader to make comparisons among different data sets.
NOTE: The Distribution. The distribution is a summary of the frequency of individual values or ranges of values for a variable. This is the main concern of performing data organization.

Construction of Grouped Frequency Distribution
1) From the data file, determine the range.
2) Decide on the number of classes (arbitrary)
   - the number of classes should be neither too many nor too few (usually between 5 and 20 classes).
3) Determine the class size (C) using the range and the desired number of classes. The class size is usually rounded up to the nearest odd number so that the midpoint of each class will also be a whole number.
4) Set up the classes by making the lowest value in the data file the LL of the first or lowest class.
5) Make a tally from the data file to complete the grouped frequency distribution.
(refer to the WORD or DOCS file for illustration)

Kinds of Frequencies:
1) Frequency, f = based on the result of the usual tally of raw data.
2) “Less than” cumulative frequency, <f = sum of the frequencies of all values up to but not exceeding the upper boundary of a given class interval.
3) “Greater than” cumulative frequency, >f = sum of the frequencies of all values greater than the lower boundary of a given class.
4) Relative frequency = frequency of a given class divided by the total frequency.
5) Percentage relative frequency = relative frequency multiplied by 100 (expressed in % value).

Alternative Approach for Data Organization Using Stem-Leaf Plot:
Stem-Leaf Plot
   - a method of organizing data that combines sorting and graphing.
   - a data plot that uses part of the data value as the stem and the remaining part as the leaf to form groups or classes.

Construction of a Stem-Leaf Plot
1) Arrange the raw data in order according to magnitude.
2) Separate the data according to their digits. (see illustration on the board)
3) Identify the stem (the leading digit of each value) and the leaf (the trailing digits of each value).
4) Construct the stem-leaf plot.

Example of Constructing a Stem-Leaf Plot
Construct a Stem-leaf Plot for the following data on Age of Movie-goers
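
A minimal sketch of the stem-leaf construction steps, in Python. The ages below are hypothetical values chosen for illustration, not the handout's data, and `stem_leaf` is an assumed helper name. The sketch assumes non-negative whole numbers and uses the last digit as the leaf.

```python
from collections import defaultdict

def stem_leaf(values):
    """Group whole-number values by stem (leading digits) and leaf (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):        # step 1: arrange by magnitude
        stem, leaf = divmod(v, 10)  # steps 2-3: separate the stem and the leaf
        plot[stem].append(leaf)
    return dict(plot)               # stems appear in ascending order

# Hypothetical ages, for illustration only.
ages = [18, 21, 25, 25, 32, 34, 19, 40, 47, 23]

# Step 4: print one row per stem, with its leaves in order.
for stem, leaves in stem_leaf(ages).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
# prints:
# 1 | 8 9
# 2 | 1 3 5 5
# 3 | 2 4
# 4 | 0 7
```

A stem with no values is simply omitted in this sketch. A fuller version might print empty stems as well, so that gaps in the data remain visible.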