Lecture_2_Basics of Data Science (1)
Course Code: DS101
Course Name: Basics of Data Science
Introduction to Data Science
2. ALPHABETIC DATA
Alphabetic data consists of the letters A to Z and a to z,
along with the blank space.
Different Types Of Data
3. ALPHANUMERIC DATA
Alphanumeric data consists of letters, digits and special
characters such as #, $ and %.
Examples include house number 10-A, the date 14-August-1947
and F-16.
4. GRAPHIC DATA
Graphic data, or image data, consists of charts, graphs
and images. Examples include a collection of maps of
countries or a collection of family pictures.
5. AUDIO DATA
Audio data consists of sounds and voices. Examples
include radio programmes, radio news and songs.
6. VIDEO DATA
Video data consists of moving pictures. Examples
include movies, TV dramas and TV news.
7. MIXED DATA
Mixed data combines two or more types of data.
For example, a TV drama consists of audio as well as
video data. Another example of mixed data is a student
admission form, because students provide several types
of data on it: numeric, alphabetic, alphanumeric and
graphic (image) data, as listed below:
Numeric data: marks obtained by the student
Alphabetic data: name, father's name, etc.
Alphanumeric data: address
Graphic data: picture of the student
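The three text-based categories above can be told apart with a simple check. The sketch below is only an illustration: the function name `classify_text_value` is hypothetical, the sample name "Ali Khan" is invented, and the rule that numeric data means digits with an optional sign and decimal point is an assumption, since the slides do not define it here.

```python
import re

def classify_text_value(value: str) -> str:
    """Classify a text value as numeric, alphabetic, or alphanumeric."""
    text = value.strip()
    if not text:
        return "empty"
    # Assumption: numeric data is digits with an optional sign and decimal point.
    if re.fullmatch(r"[+-]?\d+(\.\d+)?", text):
        return "numeric"
    # Alphabetic data: only letters and blank spaces.
    if all(ch.isalpha() or ch.isspace() for ch in text):
        return "alphabetic"
    # Anything else mixes letters, digits and special characters.
    return "alphanumeric"

for sample in ["85", "Ali Khan", "10-A", "14-August-1947", "F-16"]:
    print(sample, "->", classify_text_value(sample))
```

Graphic, audio and video data are not covered by this check, because they are stored as files rather than as short text values.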
Quantitative data are measures of values or
counts and are expressed as numbers.
Qualitative vs Quantitative
Example: What do we know about Arrow the Dog?
Qualitative:
He is brown and black
He has long hair
He has lots of energy
Quantitative:
Discrete:
He has 4 legs
He has 2 brothers
Continuous:
He weighs 25.5 kg
He is 565 mm tall
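The Arrow example can be written down as a small record. This is a rough sketch, not a general rule: the field names are invented, and treating text as qualitative, whole numbers as discrete and decimal numbers as continuous is a simplifying assumption that only works when measurements are deliberately stored as decimals.

```python
# Hypothetical record for Arrow the Dog; field names are illustrative.
arrow = {
    "colour": "brown and black",  # qualitative
    "hair": "long",               # qualitative
    "legs": 4,                    # quantitative, discrete (a count)
    "brothers": 2,                # quantitative, discrete (a count)
    "weight_kg": 25.5,            # quantitative, continuous (a measurement)
    "height_mm": 565.0,           # quantitative, continuous (a measurement)
}

def kind_of_data(value):
    # Assumption: str = qualitative, int = discrete, float = continuous.
    if isinstance(value, str):
        return "qualitative"
    if isinstance(value, int):
        return "quantitative (discrete)"
    if isinstance(value, float):
        return "quantitative (continuous)"
    return "unknown"

kinds = {field: kind_of_data(value) for field, value in arrow.items()}
for field, kind in kinds.items():
    print(f"{field}: {kind}")
```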
More Examples:
Qualitative:
Your friends' favourite holiday destination
The most common given names in your town
How people describe the smell of a new perfume
Quantitative:
Height (Continuous)
Weight (Continuous)
Petals on a flower (Discrete)
Customers in a shop (Discrete)
How can you use quantitative and qualitative data?
Frequency counts:
The graphs below arrange the quantitative and qualitative data to show
the frequency distribution of the data.
[Figure panels: Quantitative Data; Qualitative Data]
Because absolute frequencies can be calculated for both quantitative and
qualitative data, relative frequencies such as percentages, proportions,
rates and ratios can also be produced. For example, the graphs above
show that 4 people (20%) worked fewer than 30 hours per week, and that
6 people (30%) are teachers.
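Absolute and relative frequencies can be computed in a few lines. The sketch below uses made-up data: only the 6 teachers out of 20 people come from the example above, while the other occupations and the name `frequency_table` are assumptions for illustration.

```python
from collections import Counter

# Hypothetical survey of 20 people; only "6 teachers" matches the slide.
occupations = ["teacher"] * 6 + ["nurse"] * 5 + ["engineer"] * 5 + ["clerk"] * 4

def frequency_table(values):
    """Return {category: (absolute count, percentage of all values)}."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, 100 * count / total)
            for category, count in counts.items()}

table = frequency_table(occupations)
for category, (count, percent) in table.items():
    print(f"{category}: {count} people ({percent:.0f}%)")
```

The same function works for quantitative data once the values are grouped into ranges, such as hours worked per week.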
Descriptive (summary) statistics:
As shown in the graph below, data collected over time indicate that an
employee's annual income increased by 5% every five years. Therefore, if the
rate of increase continues to follow the same pattern, the employee's annual
income in 2015 can be projected to be $46,305, which is the 2010 wage of
$44,100 increased by a further 5%.
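The projection above is a single compounding step. A minimal sketch, assuming the 5% increase applies once per five-year period; the function name `project_wage` is hypothetical.

```python
def project_wage(current_wage, rate_per_period, periods=1):
    """Compound a wage forward by a fixed rate for a number of periods."""
    return round(current_wage * (1 + rate_per_period) ** periods, 2)

wage_2010 = 44_100
# One five-year period at 5% takes the 2010 wage to the 2015 projection.
wage_2015 = project_wage(wage_2010, 0.05)
print(f"Projected 2015 wage: ${wage_2015:,.0f}")
```

Passing `periods=2` would extend the same pattern by another five years, which is only meaningful if the trend is assumed to continue.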
Unstructured data is data that lacks any predefined model or format.
It resides in various formats such as text, images, audio and video
files.
In HTML, text and other data are organized with tags. These tags
partly organize the file and help the browser render it and
make sense of it. However, on a different webpage the number and
type of tags used might be completely different.
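To see how the tags vary from page to page, the tags in two pages can be counted. This sketch uses Python's standard `html.parser` module; the two page strings and the names `TagCounter` and `count_tags` are invented for illustration.

```python
from collections import Counter
from html.parser import HTMLParser

class TagCounter(HTMLParser):
    """Counts every opening tag seen while parsing."""
    def __init__(self):
        super().__init__()
        self.tags = Counter()

    def handle_starttag(self, tag, attrs):
        self.tags[tag] += 1

def count_tags(html_text):
    parser = TagCounter()
    parser.feed(html_text)
    parser.close()
    return dict(parser.tags)

# Two hypothetical pages holding similar information with different tags.
page_a = "<html><body><h1>Staff</h1><p>Ali</p><p>Sara</p></body></html>"
page_b = "<html><body><table><tr><td>Ali</td></tr></table></body></html>"

print(count_tags(page_a))
print(count_tags(page_b))
```

Both pages are organized by tags, but a program written for one layout would not find the names in the other, which is why HTML counts as only partly structured.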
JSON
An employee data JSON file is shown in the diagram. JSON files
naturally have a tree-like structure that provides some organisation,
though it is weaker than a table's. As a result, the data can be partly
analysed using simple filters, although this is harder than with
structured data.
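A simple filter over JSON data might look like the sketch below. The employee records are invented, since the diagram's contents are not available here, and the field names are assumptions.

```python
import json

# Hypothetical employee data; not the file shown in the diagram.
employee_json = """
{"employees": [
  {"name": "Ali", "department": "Sales", "skills": ["Excel"]},
  {"name": "Sara", "department": "IT", "skills": ["Python", "SQL"],
   "manager": {"name": "Omar"}}
]}
"""

data = json.loads(employee_json)

# Simple filter: names of employees in the IT department.
it_staff = [e["name"] for e in data["employees"]
            if e.get("department") == "IT"]

# Records need not share the same fields, so missing keys must be handled.
managers = [e.get("manager", {}).get("name") for e in data["employees"]]

print(it_staff)
print(managers)
```

The need for `.get()` with a default shows the weaker organisation: unlike a table, nothing guarantees that every record has a `manager` field.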
What is Data Collection?
Data collection is the process of collecting, measuring
and analyzing different types of information using a
set of standard validated techniques.
Primary data refers to information that has never been used before. Data
obtained using primary data collection techniques is typically thought
to be the best type of data for study.
II. Observations
Researchers use this technique to observe their surroundings
and document their results. It can be used to assess how
various people behave in scenarios that are
controlled (everyone is aware that they are being watched) and
uncontrolled (no one is aware that they are being watched).
Focus groups can be time-consuming and difficult to run, but they can
reveal some of the best information in difficult
circumstances.
V. Oral Histories
Although this method is quick and simple, you should use only
reliable websites when gathering data.
II. Government Archives
The drawback is that the data is not always easily accessible, for
a variety of reasons.
Example:
Minutes: 0  1  2  3  4  5  6  7  8  9  10  11  12
People:  6  2  3  5  2  5  0  0  2  3  7   4   1
How to Show Data
Line Graphs
Line Graph: a graph that shows information that is
connected in some way (such as change over time)
Example: You are learning facts about dogs, and
each day you do a short test to see how good you
are. These are the results:
Table: Facts I got Correct
Day 1   Day 2   Day 3   Day 4
3       4       12      15
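A line graph of these results would join the four points in order, so the steepness of each segment shows how much the score changed from one day to the next. The sketch below computes those changes directly rather than drawing the graph; the variable names are illustrative.

```python
days = ["Day 1", "Day 2", "Day 3", "Day 4"]
correct = [3, 4, 12, 15]

# Change between consecutive days: the rise of each line segment.
changes = [later - earlier for earlier, later in zip(correct, correct[1:])]

for start, end, change in zip(days, days[1:], changes):
    print(f"{start} to {end}: {change:+d} facts")

# Index of the steepest segment, i.e. the biggest single-day improvement.
steepest = changes.index(max(changes))
print("Biggest jump:", days[steepest], "to", days[steepest + 1])
```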
https://www.excel-easy.com/functions/statistical-functions.html
https://www.excel-easy.com/examples/box-whisker-plot.html