1.4 - About Data
Welcome everybody
- Data
- Types of data
- Characteristics of Data
Data is everywhere
Your age, your height, your weight, your hobby, the name of your
Numerical Data:
Numerical data are values for which it is sensible to add, subtract, take averages, etc.
Example:
- height of a person,
- price of a commodity,
- weight of students,
- CGPA of a student,
- Time
- Temperature
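As a small illustration of why arithmetic is meaningful for numerical data, the sketch below adds, subtracts, and averages a handful of heights. The values are made up for the example and are not from the lecture.

```python
from statistics import mean

# Hypothetical heights of five people, in centimetres (illustrative values only).
heights_cm = [160.0, 172.5, 168.0, 181.5, 158.0]

total = sum(heights_cm)                      # adding is meaningful
spread = max(heights_cm) - min(heights_cm)   # subtracting is meaningful
average = mean(heights_cm)                   # averaging is meaningful

print(f"total = {total}, range = {spread}, average = {average}")
```

The same operations would make little sense for names or hobbies, which is what separates numerical from categorical data.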
Data are either Numerical or Categorical; numerical data are either Continuous or Discrete.
Discrete Data:
Discrete numerical data are counted and can take only certain values,
typically whole non-negative numbers.
Example:
- number of students,
- number of chairs in a room,
- number of children in a family,
- number of days in a month,
- number of courses,
- runs,
- number of wickets,
- goals scored,
- points in a game.
Continuous Data:
Continuous numerical data are measured and can take any numerical
value within a range. Thus, the value can be a whole number or a
fraction.
Example:
- height of a person,
- price of a commodity,
- weight of students,
- CGPA of a student,
- Time
- temperature
- Sales of a shop,
Note that some numbers are only labels:
- NID number,
- telephone number,
- PIN/TIN number.
Although all of these are discrete numbers, they belong to the class of categorical data.
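The distinction between counted values, measured values, and numbers used only as labels can be sketched in a few lines. The records and the helper `is_discrete` below are hypothetical, and checking values this way is only a rough heuristic: whether data are discrete or continuous really depends on how they were obtained (counting versus measuring).

```python
# Hypothetical records (illustrative values only).
children_per_family = [2, 0, 3, 1, 2]            # discrete: counted, whole numbers
weights_kg = [61.4, 72.0, 58.75]                 # continuous: measured, fractions allowed
phone_numbers = ["01711000001", "01711000002"]   # identifiers: digits used as labels

def is_discrete(values):
    """Return True if every value is a whole, non-negative number."""
    return all(float(v).is_integer() and v >= 0 for v in values)

print(is_discrete(children_per_family))   # whole-number counts
print(is_discrete(weights_kg))            # contains fractional measurements

# Identifiers are stored as strings so that nobody averages them by accident;
# the only sensible operations are comparing and counting distinct values.
distinct_numbers = len(set(phone_numbers))
print(distinct_numbers)
```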
Categorical Data:
Categorical data are descriptive; they are simply names or labels for
individuals. This type of data is called qualitative or enumeration data, and the
characteristic it records is called an attribute.
Example:
- Name
- Your department,
- Hobby
- Gender,
- Passed or failed,
- Religion,
- rich or poor.
Categorical data can be divided into two types: nominal and ordinal.
Nominal Data:
Nominal data are classified by quality (attribute) rather than on a numerical scale. The
levels of the data have no ordering. A good way to remember this is
that “nominal” sounds a lot like “name”, and nominal data are kind of like
names or labels for each element.
Note that nominal data can only be summarized by counting, for example with a frequency table.
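A frequency table for nominal data is just a count of how often each category occurs. The following is a minimal sketch using Python's `collections.Counter`; the hobby list is invented for the example.

```python
from collections import Counter

# Hypothetical hobbies reported by eight students (illustrative values only).
hobbies = ["reading", "cricket", "music", "cricket",
           "reading", "cricket", "gardening", "music"]

frequency = Counter(hobbies)

# Print the frequency table, most common category first.
for category, count in frequency.most_common():
    print(f"{category:<10} {count}")

# The most frequent category (the mode) is also well defined for nominal data.
mode_category, mode_count = frequency.most_common(1)[0]
print("mode:", mode_category)
```

Note that no average of the hobbies is computed: with no numerical scale and no ordering, counting is the only operation used here.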
Ordinal Data:
Categorical variables whose values have a meaningful order, rank, or rating scale
are called ordinal. Ordinal data express relative differences
through ordering or ranking, so ordinal data can be
compared. One can count and order, but not measure, ordinal data. Ordinal
data typically measure non-numeric concepts such as satisfaction, discomfort, etc.
Example:
- Grades,
- Rich or poor,
- Social class,
- Level of satisfaction,
- Professional level.
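Because ordinal categories can be ordered but not measured, one way to work with them in code is to attach an explicit rank to each level. The sketch below assumes a three-level satisfaction scale; the scale, the names `LEVELS` and `RANK`, and the responses are all hypothetical.

```python
# Assumed ordering of satisfaction levels, lowest to highest (hypothetical scale).
LEVELS = ["low", "medium", "high"]
RANK = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical survey responses (illustrative values only).
responses = ["high", "low", "medium", "high", "medium"]

# Ordering is meaningful: sort by rank, not alphabetically.
ordered = sorted(responses, key=RANK.__getitem__)

# Comparison is meaningful too.
is_higher = RANK["high"] > RANK["medium"]

# With an odd number of responses, the middle one (the median) is well defined.
median_level = ordered[len(ordered) // 2]

print(ordered, is_higher, median_level)
```

The ranks only encode order. The gap between "low" and "medium" is not claimed to equal the gap between "medium" and "high", so averaging the ranks would not be justified.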
- Discrete (page, goals, runs, wicket, no. of people)
- Continuous (age, weight, income, marks, %, average)
- Nominal (name, city, dept., hobby, school)
- Ordinal (taste, grade, social class, profession level)
Time Series Data
Time series data are collected over time. A time series is a collection of
observations recorded at successive points in time, such as a value for
each month of the year, the number of students admitted each year, or yearly exports.
Year GDP (million $)
2007 300
2008 320
2009 350
2010 455
2011 530
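Since the observations in a time series are ordered in time, a natural first calculation is the change from one period to the next. The sketch below uses the GDP figures from the table above. The year-over-year growth formula is a common choice added here for illustration, not something the lecture prescribes.

```python
# GDP figures from the table above, keyed by year (million $).
gdp = {2007: 300, 2008: 320, 2009: 350, 2010: 455, 2011: 530}

years = sorted(gdp)  # keep the observations in time order

# Year-over-year percentage change: (current - previous) / previous * 100.
growth_pct = {}
for previous, current in zip(years, years[1:]):
    growth_pct[current] = (gdp[current] - gdp[previous]) / gdp[previous] * 100

for year, growth in growth_pct.items():
    print(f"{year}: {growth:.1f}%")
```

The first year has no growth figure because there is no earlier observation to compare it with.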
Data Sources
There are many sources from which to collect data. Data can be collected directly or indirectly.
Data that are collected directly are known as primary data.
Primary data are original data, collected specially for a
specific study.
Someone collected the data first hand from the original source.
- Surveys,
- Census,
- Observational studies,
- Experiments,
- Clinical trials
Data that are collected indirectly are known as secondary data. These
data were not originally collected for the purpose of the study.
Secondary data are data that are being reused.
- Registry,
- Websites (Google, Yahoo),
- Magazine,
- books,
- TV,
- newspaper,
- Radio,
- Journals
You should have proper data for analysis; otherwise, however well you
carry out the analysis, it will be in vain or meaningless.