FDS Module I-I
FOUNDATIONS OF
DATA SCIENCE
Course Outcomes
CO1: Recall the basic concepts of big data and data science. (PO1, PO2, PO12)
CO2: Utilize statistical concepts of big data collection, data analysis, modelling, and inference. (PO1, PO2, PO4, PO5, PO12)
CO3: Identify appropriate data mining algorithms to solve real-world problems. (PO1, PO2, PO3, PO4, PO5, PO12)
Information is data that has been processed, organized, and structured. It gives the data context and supports decision making.
Data Vs Information
Data Vs Information Examples Chart
Data: each individual homework and test grade of a student in one class
Information: the student’s average grade for each class
Data: typing the words “cat videos” in your computer search engine (input)
Information: the list of search results that includes a variety of cat videos on the internet (output)
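The grade example from the chart can be sketched in a few lines of Python. The grade values are invented for illustration; the point is only that the raw list is data, while the computed average is information.

```python
# Data: individual homework and test grades of one student in one class
# (the values are made up for illustration).
grades = [78, 85, 92, 88, 67]

# Information: the raw grades processed into a single, meaningful figure.
average_grade = sum(grades) / len(grades)

print(f"Grades (data): {grades}")
print(f"Average grade (information): {average_grade:.1f}")
```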
Knowledge
Information can be converted into knowledge about historical patterns and future trends.
The bit
The Byte
Kilobyte (1024 Bytes)
Megabyte (1024 Kilobytes)
Gigabyte (1,024 Megabytes, or 1,048,576 Kilobytes)
Terabyte (1,024 Gigabytes)
Petabyte (1,024 Terabytes, or 1,048,576 Gigabytes)
Exabyte (1,024 Petabytes)
Zettabyte (1,024 Exabytes)
Yottabyte (1,024 Zettabytes, or 1,208,925,819,614,629,174,706,176 bytes)
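The unit ladder can be checked with a short Python sketch, assuming the binary convention used in this list, where each unit is 1,024 times the previous one. The helper names `bytes_in` and `humanize` are hypothetical and exist only for this illustration.

```python
# Binary unit ladder: each unit is 1,024 times the previous one.
UNITS = ["Byte", "Kilobyte", "Megabyte", "Gigabyte", "Terabyte",
         "Petabyte", "Exabyte", "Zettabyte", "Yottabyte"]


def bytes_in(unit: str) -> int:
    """Return how many bytes one `unit` holds (hypothetical helper)."""
    return 1024 ** UNITS.index(unit)


def humanize(n_bytes: int) -> str:
    """Express a byte count in the largest unit that keeps the value below 1,024."""
    value = float(n_bytes)
    for unit in UNITS:
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}s"
        value /= 1024


for unit in UNITS:
    print(f"1 {unit} = {bytes_in(unit):,} bytes")
```

For example, `humanize(1536)` gives `"1.50 Kilobytes"`, and `bytes_in("Yottabyte")` equals 1024 raised to the power 8.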
Sources of Big Data
Social networking sites: Sites such as Facebook, Google, and LinkedIn generate huge amounts of data every day, as they have billions of users worldwide.
E-commerce sites: Sites like Amazon, Flipkart, and Alibaba generate huge volumes of logs from which users' buying trends can be traced.
Weather stations: Weather stations and satellites produce very large volumes of data, which are stored and processed to forecast the weather.
Telecom companies: Telecom giants like Airtel and Vodafone study user trends and publish their plans accordingly; for this they store the data of their millions of users.
Share market: Stock exchanges across the world generate huge amounts of data through their daily transactions.
Types Of Big Data
Structured
Unstructured
Semi-structured
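As a rough illustration of the three types, the Python sketch below handles one small sample of each. The sample values are invented, and the choice of CSV, JSON, and free text as representatives is an assumption made for this example, not something the slides specify.

```python
import csv
import io
import json

# Structured: fixed columns, every row follows the same schema.
structured = io.StringIO("order_id,customer,amount\n1,Asha,250\n2,Ravi,120\n")
rows = list(csv.DictReader(structured))
total = sum(int(row["amount"]) for row in rows)

# Semi-structured: self-describing keys, but fields may vary per record.
semi_structured = '[{"user": "asha", "tags": ["sale"]}, {"user": "ravi"}]'
records = json.loads(semi_structured)
users_with_tags = [r["user"] for r in records if "tags" in r]

# Unstructured: free text with no predefined fields; only generic
# operations such as word counting apply without further processing.
unstructured = "Great phone, fast delivery. Battery could be better."
word_count = len(unstructured.split())

print(total, users_with_tags, word_count)
```

The structured sample can be queried by column name straight away, the semi-structured one needs a check for missing keys, and the unstructured one offers no fields at all.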
Characteristics of Big Data
Applications of Big data
Big data
https://www.youtube.com/watch?v=TzxmjbL-i4Y
Definition – Data Warehouse
Data warehousing (DW) is the process of collecting and managing data from varied sources to provide meaningful business insights.
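The definition can be made concrete with a toy sketch that uses Python's built-in `sqlite3` module as a stand-in for a warehouse. The two source systems, the `sales` table, and all values are hypothetical; a real warehouse would be far larger and would normally run on a dedicated platform.

```python
import sqlite3

# Two hypothetical source systems that store sales in different shapes.
web_orders = [("2024-01-05", "laptop", 2), ("2024-01-06", "phone", 1)]
store_sales = [{"day": "2024-01-05", "item": "phone", "qty": 3}]

# The "warehouse": one in-memory SQLite table with a common schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (day TEXT, item TEXT, qty INTEGER, source TEXT)")

# Collect: load each source, mapping it onto the common schema.
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, 'web')", web_orders)
conn.executemany(
    "INSERT INTO sales VALUES (:day, :item, :qty, 'store')", store_sales
)

# Insight: total units sold per item across all sources.
totals = conn.execute(
    "SELECT item, SUM(qty) FROM sales GROUP BY item ORDER BY item"
).fetchall()
print(totals)
conn.close()
```

Once both sources sit in one table, a single query answers a question that neither source could answer alone.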
Data Warehouse
Definition – Data Mining
In simple words, data mining is the process of extracting usable information from a larger set of raw data.
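One simple instance of this idea is counting which items are frequently bought together. The sketch below is a minimal illustration, not a full mining algorithm: the shopping baskets are invented, and `min_support` is an assumed threshold chosen for the example.

```python
from collections import Counter
from itertools import combinations

# Raw data: hypothetical shopping baskets.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "jam"},
    {"bread", "milk", "jam"},
]

# Count how often each pair of items appears in the same basket.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

# Keep only pairs seen in at least `min_support` baskets (assumed threshold).
min_support = 2
frequent_pairs = {pair: n for pair, n in pair_counts.items() if n >= min_support}
print(frequent_pairs)
```

Here the usable result extracted from the raw baskets is that bread and milk are bought together in three of the four transactions.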
Data Mining
https://www.youtube.com/watch?v=CCnCABJhAdU
https://www.youtube.com/watch?v=lSwIe0TMUhc