Introduction To Data Science and Big Data

This document provides an introduction to data science and big data. It discusses key concepts like artificial intelligence, machine learning, and the five V's of big data. Data science is used to extract insights from huge amounts of structured, semi-structured, and unstructured data to help with decision making and problem solving. Machine learning enables systems to learn from past data without being explicitly programmed. The document also outlines some applications of data science like driverless cars, e-commerce, and banking. It discusses the data science lifecycle and relationship between data science and information systems.

Uploaded by

Sushant Shinde


Unit - 1 Introduction to Data Science and Big Data

Data science is concerned with how data can be used to support
better decision-making and to solve complex problems more simply.

It processes huge amounts of structured, semi-structured, and
unstructured data to extract meaningful insights. From these insights,
patterns can be identified that help in making decisions about
seizing new business opportunities, improving products and services,
and ultimately growing the business.

AI
• Artificial intelligence is a technology with which we can create
intelligent systems that simulate human intelligence.

• An artificial intelligence system does not need to be explicitly
pre-programmed for every situation it handles.

Machine Learning

It is about extracting knowledge from data.

Machine learning is a subfield of artificial intelligence that enables
machines to learn from past data or experience without being
explicitly programmed.
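
As a rough illustration of "learning from past data", the sketch below fits a straight line to a few past observations and then predicts an unseen case. The data values and the names `past_data`, `fit_line`, and `predict` are made up for this example; least-squares line fitting is only one simple learning technique among many.

```python
# Past observations: (hours studied, exam score). Values are made up for illustration.
past_data = [(1, 52), (2, 54), (3, 56), (4, 58), (5, 60)]


def fit_line(points):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The rule "score = slope * hours + intercept" is never written by hand;
# its two numbers are estimated from the past data.
slope, intercept = fit_line(past_data)


def predict(hours):
    """Predict a score for a number of hours not seen in the past data."""
    return slope * hours + intercept


print(predict(6))
```

No rule relating hours to score was typed in by the programmer; only the general form of the model was chosen, and the data determined the rest.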
Big Data –
It is a collection of data sets so large and complex that it
becomes difficult to process using traditional database management
system (DBMS) tools.

"Big Data" consists of very large volumes of heterogeneous data that
are being generated, often at high speed.

Applications of Data Science

Transport
Data science has also entered the transport field, for example in
driverless cars. Driverless cars are expected to help reduce the
number of accidents.
In a driverless car, training data is fed into an algorithm, and the
data is analyzed with the help of data science techniques.

E-commerce
E-commerce websites such as Amazon and Flipkart use data science to
provide a better user experience through personalized recommendations.
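
A minimal sketch of one way personalized recommendations can work is shown below: items are suggested because other customers with overlapping purchases bought them. The customers, products, and the `recommend` function are hypothetical, and real e-commerce sites use far more elaborate methods that these notes do not describe.

```python
from collections import Counter

# Hypothetical purchase histories; customer and product names are made up.
purchases = {
    "asha": {"phone", "phone case", "charger"},
    "ravi": {"phone", "phone case"},
    "meera": {"phone", "headphones"},
    "john": {"laptop", "mouse"},
}


def recommend(customer, purchases, top_n=2):
    """Suggest items bought by customers who share at least one purchase."""
    own = purchases[customer]
    scores = Counter()
    for other, items in purchases.items():
        if other == customer:
            continue
        overlap = len(own & items)  # number of items both customers bought
        if overlap == 0:
            continue
        # Items the other customer has that this customer lacks,
        # weighted by how similar the two purchase histories are.
        for item in items - own:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]


print(recommend("ravi", purchases))
```

Here "ravi" shares two purchases with "asha" and one with "meera", so the charger ranks above the headphones.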

Banking
Banking is one of the biggest application areas of data science.
Banks can manage their resources efficiently and make smarter
decisions through fraud detection, management of customer data,
risk modeling, real-time predictive analytics, customer
segmentation, etc.

Manufacturing
– Optimizing production
– Reducing costs
– Boosting the profits

Data Explosion

The rapid increase in the amount of data generated and stored in
computing systems, to the point where managing the data becomes
difficult, is called "Data Explosion".

The key drivers of data growth are the following:

– Increase in storage capacities.
– Cheaper storage.
– Increase in data processing capabilities by modern computing
devices.
– Data generated and made available by different sectors

Five V’s of big data

Big data can be identified by a few characteristics that are
specific to it. These are known as the five V's of big data.

• Volume
It refers to the amount of data that exists.
If the volume of data is large enough, it can be considered big data.

• Velocity
It refers to how quickly data is generated and how quickly that data
moves.

• Variety
It refers to the diversity of data types.
An organization might obtain data from a number of different
sources, both inside and outside the enterprise, and these sources
may vary in value.

• Veracity
It refers to the quality and accuracy of data. Gathered data could have
missing pieces, may be inaccurate, or may not be able to provide real,
valuable insight.

• Value
It refers to the value that big data can provide, and it relates directly
to what organizations can do with the collected data.
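
Of the five V's, veracity lends itself most readily to a quick check. The sketch below measures what fraction of records have a missing value for a given field. The records and the `missing_share` function are hypothetical, and missing values are only one of several aspects of data quality.

```python
# Hypothetical sensor records; None marks a missing value.
records = [
    {"id": 1, "temperature": 21.5, "city": "Pune"},
    {"id": 2, "temperature": None, "city": "Mumbai"},
    {"id": 3, "temperature": 22.1, "city": None},
    {"id": 4, "temperature": 20.9, "city": "Pune"},
]


def missing_share(records, field):
    """Fraction of records in which `field` is absent or None."""
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


for field in ("id", "temperature", "city"):
    print(field, missing_share(records, field))
```

A high share of missing values in an important field is a warning that conclusions drawn from the data may not be reliable.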

Relation between DS and IS

DS is about the discovery of knowledge from data.
Data science is a field in which information and knowledge are
extracted from data using different algorithms and processes.
Data science is used in business functions such as strategy formation
and decision making.

IS is about design practices for storing and retrieving information.

IS is used in areas such as knowledge management and data
management.

Data Science Lifecycle –

The data science life cycle is a definite procedure with five
important steps: gathering, cleaning, exploring, modeling, and
interpreting data.

Gathering/Collecting Data

Before creating any new product, organizations need to collect data
to research demand, customer preferences, competitors, etc.
If this data is not collected in advance, the failure rate for the
new product is said to be 80 percent or even higher.

There are two main methods of data collection:

1. Primary Data Collection

• Interviews
• Observations
• Surveys and Questionnaires
• Focus Groups
• Oral Histories

2. Secondary Data Collection

Secondary data refers to data that has already been collected
by someone else.
• It is much less expensive and easier to collect than
primary data.
• While primary data collection provides more authentic and
original data, there are numerous instances where secondary
data collection provides great value to organizations.
Example: the Internet

Cleaning Data
This step involves scrubbing and filtering the data.
Here we remove duplicate or irrelevant observations.
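
The cleaning step can be sketched in a few lines. The example below drops exact duplicate rows and rows that fail a relevance check. The rows, the `clean` function, and the rule "keep only Indian customers" are assumptions made for illustration, not part of the original notes.

```python
def clean(rows, is_relevant):
    """Drop exact duplicate rows and rows that fail the relevance check, keeping order."""
    seen = set()
    cleaned = []
    for row in rows:
        # A hashable fingerprint of the row, independent of key order.
        key = tuple(sorted(row.items()))
        if key in seen or not is_relevant(row):
            continue
        seen.add(key)
        cleaned.append(row)
    return cleaned


# Hypothetical sales records.
rows = [
    {"customer": "A", "country": "IN", "amount": 120},
    {"customer": "A", "country": "IN", "amount": 120},  # duplicate observation
    {"customer": "B", "country": "US", "amount": 80},   # irrelevant to a study of Indian customers
    {"customer": "C", "country": "IN", "amount": 45},
]

result = clean(rows, lambda r: r["country"] == "IN")
print(result)
```

Passing the relevance rule in as a function keeps the sketch general, since what counts as "irrelevant" depends on the question being studied.
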
Exploring Data
Modeling Data
Interpreting Data
