Data Management


Topic Outline
• Types of Data Management
• Types of Variables
• Scales of Measurement of Data
• Ways of Presenting Data
Objective:
• To understand the concept of a
frequency distribution table and how to
present data using different types of
graphs.
Data Management or Statistics
• The science of collecting, organizing, presenting,
analyzing, and interpreting numerical data.

• It may also refer to the mere tabulation of numeric information, as in
reports of stocks and market transactions, or to the body of
techniques used in processing or analyzing data.
Data Management or Statistics
Statistics involves sampling methods.

Population – the entirety of the group, including all the
members that form a set of data.

Sample – contains a few members of the population. Samples
are taken to represent the characteristics or traits of the
population.
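The relationship between a population and a sample can be sketched in a few lines of Python. This is only an illustration: the population of 100 member IDs, the sample size of 10, and the seed are assumptions made up for the example.

```python
import random

# Hypothetical population: ID numbers of 100 members (illustrative only).
population = list(range(1, 101))

# Fix the seed so the sketch is repeatable; any seed would do.
rng = random.Random(42)

# Draw a simple random sample of 10 members without replacement.
sample = rng.sample(population, 10)

print(len(population), len(sample))
```

Every member of the sample comes from the population, and no member is drawn twice.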
Types of Data Management or Statistics

• Descriptive Statistics – is concerned with collecting,
organizing, presenting, and analyzing numerical data. The
statistician tries to describe or summarize a situation. The summary can also
be represented with graphs.

• It describes how high or low the values in a data set are, or how one
set of data compares with another. It involves one specific population.
Types of Data Management or Statistics

Common tools are:

• Measures of Central Tendency (a middle value that describes the
population)
• Measures of Variability (the spread of the data set)

Example:
In a math class, 15 out of 35 students received a passing
mark. The average score of the class is 62 out of 100.
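The passing figure in this example can be turned into a percentage, which is itself a descriptive summary. The short Python sketch below uses only the two counts given in the example; the variable names are chosen for illustration.

```python
# Figures taken from the example: 15 of 35 students passed.
passed, class_size = 15, 35

# Proportion of the class that passed, expressed as a percentage.
pass_rate = passed / class_size * 100

print(f"Pass rate: {pass_rate:.1f}%")
```

The result describes this one class only; it makes no claim about other classes.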
Types of Data Management or Statistics

• Inferential Statistics – also called Statistical Inference or
Inductive Statistics, is concerned with analyzing
organized data to arrive at predictions or inferences.

• It implies that before carrying out an inference, appropriate and
correct descriptive measures or methods are employed to bring
out good results.
Types of Data Management or Statistics

Common tools are:

• Hypothesis Testing
• Regression Analysis

Example: In a sample survey, 65% of Filipino Generation
Z prefer milk tea to coffee, while 34% of Filipino
Millennials prefer milk tea to coffee.
Variables
• The characteristic that is being studied is called a variable.
• It varies across individuals or objects.
• Examples include age, race, gender, intelligence, personality
type, attitudes, political or religious affiliation, height,
weight, marital status, eye color, etc.
Two Types of Variable/Data
1. Qualitative
2. Quantitative
2.1. Discrete
2.2. Continuous
Qualitative
• Represents differences in quality, character, or kind but
not in amount.

Example: Sex, birthplace or geographical location, religious
preference, marital status, eye color, brand, etc.
Quantitative
• Numeric in nature and can be ordered or ranked.

Example: Weight, height, age, test scores, speed, body
temperature, grades, etc.

• Can be categorized as discrete or continuous.
Discrete Variables
• A variable whose values can be counted exactly using
whole numbers.

Example: Number of enrollees, drop-outs, deaths, number of
students in a classroom, number of computers functioning,
number of calls received.
Continuous Variables
• A variable that can assume any numerical value over an
interval or intervals. It can yield decimals or fractions.

Example: Height, weight, temperature, time.
Scales of Measurement of Data
1. Nominal or Categorical Data
2. Ordinal Data
3. Interval Data
4. Ratio Data
Nominal or Categorical Data
• Uses numbers for the purpose of identifying membership
in a group or category.

• Examples:
(a) Electric consumption: (1) residential, (2) commercial, (3) industrial,
(4) government, (5) others
(b) Gender of NEUST BSIT/BSN students: (1) male, (2) female
(c) Field of study: (1) BSN, (2) BIT, (3) BSChem, (4) ECET
Ordinal Data
• Connotes ranking or inequality.
• Examples:
(1) Grades (1, 2, 3)
(2) Socioeconomic status (low, medium, or high)
(3) Intelligence (above average, average, below average)
(4) Build of people (small, medium, large, extra large)
(5) Contest (first, second, third)
(6) Likert scale (Strongly Disagree, Disagree, Undecided,
Agree, Strongly Agree)
Interval Data
Properties:
• Is measured in the form of a number. Example: you can measure
temperature using a thermometer.
• Has rank and order. Example: 1°C is always lower than 5°C.
• Is equidistant, meaning it has equally spaced intervals.
• Doesn’t have a meaningful zero (0). Example: 0°C does not mean there is no temperature.
• Can be negative. Example: -12°C.
Ratio Data
Properties:
• Is measured in the form of a number. Example: you can measure
distance traveled in kilometers.
• Has rank and order. Example: 1 km is always shorter than 5 km.
• Is equidistant, meaning it has equally spaced intervals.
• Has a meaningful zero (0). Example: you traveled 0 km, that is, you have
not traveled at all; zero votes.
• Can never be negative. Example: there is no -5 km and there are no negative
votes.
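One way to see why a meaningful zero matters is to compare the same two temperatures on the Celsius scale (interval) and the Kelvin scale (ratio). The Python sketch below is illustrative; the two temperatures and the helper name `celsius_to_kelvin` are assumptions chosen for the example.

```python
def celsius_to_kelvin(c):
    """Convert Celsius (interval scale) to Kelvin (ratio scale, true zero)."""
    return c + 273.15

a_c, b_c = 10.0, 20.0
a_k, b_k = celsius_to_kelvin(a_c), celsius_to_kelvin(b_c)

# Differences are meaningful on both scales and agree.
diff_c = b_c - a_c
diff_k = b_k - a_k

# Ratios are only meaningful on the scale with a true zero.
ratio_c = b_c / a_c  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = b_k / a_k  # about 1.035

print(diff_c, round(diff_k, 2), ratio_c, round(ratio_k, 3))
```

The difference of 10 degrees survives the change of scale, while the apparent ratio of 2 does not. This is why statements such as "twice as much" are reserved for ratio data.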
Exercise: Identify the type of variable and the level of
measurement of each of the following.
Dependent and Independent Variable

Dependent Variable – the variable whose value is being
predicted; the outcome.
Independent Variable – the predictor.

Example 1: To predict the effect of the amount of sunlight on the growth of a
certain plant (independent: amount of sunlight; dependent: plant growth).

Example 2: To evaluate the effect of using computers on the
performance of the students (independent: computer use; dependent: student performance).
Data
The primary element of data management is the data.
• A collection of observations on one or more variables.
• Factual information such as measurements or statistics used as a
basis for reasoning, discussion, or calculations.
• Information in numerical form that can be digitally transmitted
or processed (Merriam-Webster Dictionary).
• The raw material with which the statistician works. It can be obtained
through surveys, experiments, numerical records, and other
modes of research.
Primary and Secondary Data
• Primary Data refer to information that is gathered directly
from the original source or that is based on direct or first-hand
experience (e.g., surveys, interviews, observations,
registrations, autobiographies, diaries, etc.).

• Secondary Data refer to information taken from
published or unpublished data previously gathered by
other individuals or agencies (e.g., books, magazines,
newspapers, internet, etc.).
Data Collection and Presentation
Types of Data
1. Primary – first hand or original sources
2. Secondary – data that researchers take from other
sources.
Data Presentation
Statistical data should be presented systematically.

Note: The mere gathering of information or data is not a
small task. A greater task is to make the data comprehensible
and meaningful.
Ways of Presenting Data
• The data gathered are summarized and presented in
different forms:

1. Textual Form – the data are incorporated in the text of
the report.
2. Tabular Form – the data are presented in rows and
columns.
3. Graphical Form – the data are presented as graphs
so that the information is easy to digest.
Methods of Organizing Raw Data
Raw Data – data collected in an investigation that are not yet
organized systematically.

Array – an ordering of the observations from smallest to
largest or vice versa. Its advantage is that the low and high
values can be readily perceived. The process is tedious,
especially if the raw data are numerous.
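Arranging raw data into an array is exactly what a sort does. The Python sketch below uses made-up ages that are not from the source, and shows how the lowest and highest values become easy to read once the data are ordered.

```python
# Hypothetical raw data (illustrative ages, not from the source).
raw_data = [34, 21, 45, 28, 19, 52, 40]

ascending = sorted(raw_data)                 # smallest to largest
descending = sorted(raw_data, reverse=True)  # largest to smallest

# In an array the extremes sit at the two ends.
lowest, highest = ascending[0], ascending[-1]

print(ascending, lowest, highest)
```

Sorting by hand is tedious for large data sets, which is why a tool is normally used for this step.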
Methods of Organizing Raw Data
Example: A nationwide travel agency offers special rates for package tours
during summer. To economize on advertising, brochures will be sent only to
certain age groups. The agency retrieves its
previous customers from its files and groups them according to
ages. Only those age groups with at least people are sent brochures.

• The following are the ages of the previous customers:


True Limits
• A point that represents the halfway point between successive
classes is called a true limit or class boundary. It is obtained by
adding the upper limit of one class and the lower limit of the next
class and then dividing by 2.

Example Class Mark or Mid values: 10-19 class interval


Example Class Boundary:
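The two calculations can be sketched in Python for the 10-19 class interval. The sketch assumes that the class mark is the midpoint of the class limits (the usual definition) and that the next class is 20-29; neither is stated in the surviving text. The function names are illustrative.

```python
def class_mark(lower, upper):
    """Midpoint of a class interval."""
    return (lower + upper) / 2

def class_boundary(upper_of_class, lower_of_next):
    """Halfway point between two successive classes."""
    return (upper_of_class + lower_of_next) / 2

# Class mark of the 10-19 class.
mark = class_mark(10, 19)

# Boundary between 10-19 and an assumed next class of 20-29.
boundary = class_boundary(19, 20)

print(mark, boundary)
```

Under these assumptions the class mark is 14.5 and the class boundary between the two classes is 19.5.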
Seatwork
Graphical Representation of Frequency Distribution
• Graphical Forms – often more helpful in making a stronger
visual impact. Three charts that help portray a frequency
distribution graphically are the histogram, the frequency polygon,
and the cumulative frequency polygon (ogive).
Histogram
A histogram is a graphic representation of a frequency distribution in which
adjoining vertical rectangles are drawn on the horizontal axis, with
the centers of the bases located at the class marks. The class
boundaries are plotted along the horizontal axis and the frequencies
along the vertical axis; a histogram drawn this way is called
a frequency histogram.
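The counting step behind a histogram can be sketched in Python. The scores and class limits below are made up for illustration. The bars are printed sideways as text rather than drawn as vertical rectangles, so this is a rough stand-in for a real chart.

```python
# Hypothetical scores and class limits (illustrative only).
scores = [12, 15, 21, 22, 25, 27, 31, 33, 34, 35, 38, 41, 44, 47]
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]

# Tally how many scores fall inside each class.
frequencies = [sum(lo <= s <= hi for s in scores) for lo, hi in classes]

# Print one bar per class; bar length equals the class frequency.
for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo}-{hi} | {'#' * f} ({f})")
```

The frequencies add up to the number of scores, which is a quick check that no value was missed or counted twice.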
Histogram
Uses:
• Histograms are useful for visualizing the distribution of
data.
• They can show the skewness of the plotted data.
• These charts can also help in predicting the future performance of
the process being measured.
• Histograms are helpful in estimating the spread of the data, which the
standard deviation measures.
Frequency Polygon
A frequency polygon is a closed figure of n sides constructed by plotting the class marks
against the frequencies. In constructing a frequency polygon, an additional
class mark with zero frequency is added at each end in order to close the polygon.

• Commonly known as a line graph.
• Shows the class midpoints on the x-axis and the frequencies
on the y-axis.
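The points of a frequency polygon can be computed before any drawing is done. The Python sketch below uses a made-up frequency distribution. It closes the polygon by adding one extra class mark with zero frequency at each end, which is a common convention assumed here and not a rule quoted from the slides.

```python
# Hypothetical frequency distribution (illustrative only).
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]
frequencies = [2, 4, 5, 3]
width = 10  # class size, assumed equal for all classes

# Class marks: midpoint of each class.
marks = [(lo + hi) / 2 for lo, hi in classes]

# Add one zero-frequency class mark at each end to close the polygon.
x = [marks[0] - width] + marks + [marks[-1] + width]
y = [0] + frequencies + [0]

points = list(zip(x, y))
print(points)
```

Joining these points in order with straight lines gives the polygon, with class marks on the x-axis and frequencies on the y-axis.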
Frequency Polygon
Uses:
• Frequency polygons are a statistical tool equivalent to a
histogram and are used to represent and compare sets of data.
• In cumulative form (the cumulative frequency polygon, or ogive), they are
a good choice for displaying cumulative frequency
distributions.
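A cumulative frequency distribution is a running total of the class frequencies. The Python sketch below uses a made-up distribution and pairs each running total with the upper class boundary, taken as the upper limit plus 0.5 on the assumption that the data are whole numbers.

```python
from itertools import accumulate

# Hypothetical frequency distribution (illustrative only).
classes = [(10, 19), (20, 29), (30, 39), (40, 49)]
frequencies = [2, 4, 5, 3]

# Running total of the frequencies ("less than" cumulative frequency).
cumulative = list(accumulate(frequencies))

# Upper class boundary of each class: upper limit plus 0.5 for integer data.
upper_boundaries = [hi + 0.5 for _, hi in classes]

ogive_points = list(zip(upper_boundaries, cumulative))
print(ogive_points)
```

The last cumulative frequency always equals the total number of observations.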
Assignment: Frequency Distribution Table and Presentation

Objective: To understand the concept of a frequency distribution
table and how to present data using different types of graphs.

Instructions:
1. Data Collection: Collect data on a topic of your choice. This could be
anything from the number of hours you spend on different activities in a day,
to the heights of your classmates, or the number of cars passing by your house
in an hour. Make sure you have at least 30 data points.
2. Frequency Distribution Table: Create a frequency distribution table for
your data.
3. Graphical Presentation: Present your data graphically. Create at least
three different types of graphs: a histogram, a frequency polygon, and an
ogive. Make sure to label your axes and provide a title for each graph.
4. Analysis: Write a brief analysis of your data. What does your frequency
distribution table tell you about your data? What can you infer from your
graphs? Do you see any patterns or trends?
5. Presentation: Prepare a presentation of your findings. This could be a
PowerPoint presentation, a poster, or a report. Make sure to include your
frequency distribution table, your graphs, and your analysis.
Measures of Central Tendency
• A well-known application of statistics is the measure of central
tendency.

• There are many ways of describing a given set of data. A
good number of descriptive measures exist in statistics, and their use
depends largely on the nature of the data and the intended purpose of
the description. The measures of central tendency are among them;
they are used to summarize a large set of raw data
so that its essential meaning can be extracted.
Measures of Central Tendency
• Central tendency is a statistical measure that helps you find the
middle or the average of a dataset.
• The most commonly used measures of central tendency are the mean,
median, and mode.

• Mode – the most frequent value in the dataset.
• Median – the middle number in an ordered dataset.
• Mean – the sum of all values divided by the total number of
values.
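All three measures are available in Python's standard `statistics` module. The sketch below applies them to a small made-up list of scores; the data are illustrative only.

```python
import statistics

# Hypothetical quiz scores (illustrative only).
scores = [6, 7, 8, 8, 9, 10, 15]

mean = statistics.mean(scores)      # sum of values / number of values
median = statistics.median(scores)  # middle value after sorting
mode = statistics.mode(scores)      # most frequent value

print(mean, median, mode)
```

Here the single high score of 15 pulls the mean (9) above the median (8), while the mode (8) is unaffected.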
Measures of Central Tendency
• The choice of central tendency measure depends on the level of
measurement of your data. For nominal level data, you can only
use the mode to find the most frequent value. For ordinal level or
ranked data, you can also use the median to find the value in the
middle of your dataset. For interval or ratio level data, you can also use the mean.

• It’s important to note that, in addition to central tendency, the
variability and distribution of your dataset are important to
understand when performing descriptive statistics.
Median
The median is the midpoint of the data array. Before finding the
value, the data must be arranged in order, from least to greatest or
vice versa. The median will either be a specific value or will fall
between two values.
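The two cases, a specific value or a point between two values, correspond to an odd or an even number of observations. The Python sketch below is a minimal hand-written version for a non-empty list. The function name and the sample values are illustrative, and averaging the two middle values in the even case is the usual convention, assumed here.

```python
def median(values):
    """Return the middle value of the data after sorting."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: a specific value
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: between two values

odd_case = median([5, 1, 9])       # ordered: 1, 5, 9
even_case = median([5, 1, 9, 3])   # ordered: 1, 3, 5, 9

print(odd_case, even_case)
```

With three values the median is the middle one, 5. With four values it falls halfway between 3 and 5, giving 4.0.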
Mode
The mode is the number, value, or observation that appears the
greatest number of times in the data set.
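A data set can have more than one value tied for the highest count. The Python sketch below returns every such value; the function name `modes` and the sample data are illustrative assumptions.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest count."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)

single = modes([2, 3, 3, 5])    # one value occurs most often
tied = modes([1, 1, 2, 2, 4])   # two values tie for most often

print(single, tied)
```

The first list has a single mode, 3. The second has two modes, 1 and 2.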
Seatwork. Solve the following problems:

1. The grades of 20 high school students in mathematics are 75, 78,
79, 80, 83, 84, 85, 86, 88, 89, 90, 90, 90, 91, 92, 93, 94, 94, 95.
Find the mean, median, and mode.

2. The mean of a set of 40 numbers is 25. Find the sum of the 40
numbers.

3. The average IQ of 10 students in a Mathematics subject is 114. Nine
of the students have IQs of 101, 125, 118, 128, 106, 115, 100, 118,
and 108. Find the IQ of the tenth student.

4. Find the mean, median, and mode of the scores of Grade 7 students
in a 30-item summative assessment in math.

Class Limits | Frequency (f) | Lower Boundary | Class Mark (x) | fx | Cumulative Frequency (cf)
8-11         | 6             |                |                |    |
12-15        | 5             |                |                |    |
16-19        | 7             |                |                |    |
20-23        | 10            |                |                |    |
24-27        | 1             |                |                |    |
28-31        | 1             |                |                |    |
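The columns of such a table feed directly into the grouped mean and median. The Python sketch below works through a different, made-up table so that the seatwork stays unsolved. It assumes whole-number class limits, the grouped mean as the sum of fx divided by the total frequency, and the commonly used interpolation formula for the grouped median. None of these formulas are quoted from the slides.

```python
# Hypothetical grouped data (illustrative only): (lower limit, upper limit, f).
table = [(10, 19, 3), (20, 29, 7), (30, 39, 6), (40, 49, 4)]

n = sum(f for _, _, f in table)

# Grouped mean: sum of f * class mark, divided by the total frequency.
mean = sum(f * (lo + hi) / 2 for lo, hi, f in table) / n

# Grouped median: locate the class where the cumulative frequency reaches n/2.
cf_before = 0
for lo, hi, f in table:
    if cf_before + f >= n / 2:
        lower_boundary = lo - 0.5  # assumes integer class limits
        width = hi - lo + 1        # class size
        median = lower_boundary + (n / 2 - cf_before) / f * width
        break
    cf_before += f

print(n, mean, median)
```

For this made-up table the total frequency is 20, the grouped mean is 30.0, and the grouped median is 29.5.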
