Data Science Presentation

Data science is the study of data to extract insights for business, heavily relying on programming languages like Python and R, as well as statistics and machine learning. It encompasses processes such as data mining, cleansing, analysis, and visualization, with applications in areas like fraud detection and recommender systems. Learning data science requires a solid foundation in mathematics, programming, and statistics, and is considered a challenging field to master.

Uploaded by

Sinoy
Copyright
© All Rights Reserved

DATA SCIENCE VOCABULARY

WHAT IS DATA SCIENCE?

Data science is the study of data to extract meaningful insights for business. It is one of the most widely used techniques among Artificial Intelligence and Machine Learning engineers. For example, when you log on to an e-commerce website and browse categories and products before making a purchase, you generate data that helps analysts understand your purchasing behavior.

What Does DATA SCIENCE Deal With?

Data science deals with the processes of data mining, cleansing, analysis, visualization, and actionable insight generation. A data scientist must have basic knowledge of mathematics, computer programming, and statistics to solve complex data problems efficiently and boost business revenue.

Programming Languages

Programming languages, especially Python and R, play a vital role in data organization, visualization, and investigation. Python is a high-level programming language with free libraries for data analysis, and it is popular among data scientists.

R is another popular language. Its best feature is data visualization. It is often used for tasks such as social media post analysis.

Other languages also support data science, such as Java 8 (with lambdas) and Scala. SQL is used for structured data and NoSQL for unstructured data.
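
As a minimal sketch of the structured versus unstructured distinction, the snippet below stores fixed-schema rows in an in-memory SQLite database using Python's standard `sqlite3` module and, for contrast, parses a schema-free record as a JSON document. The table name, columns, and sample values are hypothetical, and the JSON document only stands in for the kind of record a NoSQL document store might hold.

```python
import json
import sqlite3

# Structured data: every row follows the same fixed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, product TEXT, price REAL)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        ("alice", "laptop", 900.0),
        ("bob", "mouse", 20.0),
        ("alice", "keyboard", 50.0),
    ],
)

# SQL query: total spending per customer.
rows = conn.execute(
    "SELECT customer, SUM(price) FROM purchases GROUP BY customer ORDER BY customer"
).fetchall()
totals = dict(rows)
conn.close()

# Schema-free data: fields can differ from one document to the next.
# (Hypothetical review record, stored here as JSON text.)
review_doc = json.loads(
    '{"customer": "bob", "text": "Great mouse!", "tags": ["fast", "cheap"]}'
)

print(totals)
print(review_doc["tags"])
```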
Statistics and Probability

Statistics and probability are considered essential elements of data science because they form the numerical foundation for analyzing data and estimating likelihood. It is difficult to do data science without basic knowledge of statistics and probability.
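
To make this concrete, here is a minimal sketch of a few basic statistics and an empirical probability, using only Python's standard `statistics` module. The `daily_orders` values are invented sample data, not figures from the presentation.

```python
import statistics

# Hypothetical sample: number of orders received on each of seven days.
daily_orders = [12, 15, 11, 14, 18, 15, 13]

mean_orders = statistics.mean(daily_orders)      # central tendency
median_orders = statistics.median(daily_orders)  # middle value when sorted
stdev_orders = statistics.stdev(daily_orders)    # sample standard deviation

# Empirical probability that a day has more than 14 orders:
# the share of observed days on which that happened.
busy_days = sum(1 for orders in daily_orders if orders > 14)
p_busy = busy_days / len(daily_orders)

print(mean_orders, median_orders, round(stdev_orders, 2), round(p_busy, 2))
```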
Machine Learning

Machine learning is a part of data science that enables a system to process data sets autonomously, without human intervention. It uses different algorithms to work on massive volumes of data generated from various sources, makes predictions, analyzes patterns, and gives recommendations. Real-life examples of machine learning include fraud detection and client retention.
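
As a small illustration of an algorithm learning from data and then making a prediction, the sketch below fits a straight line by ordinary least squares in plain Python. The data (months as a customer versus number of purchases) is invented, and least squares is only one simple technique chosen for the example; the slides do not name a specific algorithm.

```python
# Hypothetical training data: months as a customer -> number of purchases.
months = [1, 2, 3, 4]
purchases = [3, 5, 7, 9]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# "Training": estimate the line's parameters from the data.
slope, intercept = fit_line(months, purchases)

# "Prediction": apply the learned line to an unseen input (month 5).
prediction = slope * 5 + intercept
print(slope, intercept, prediction)
```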
DATA SCIENCE AND ARTIFICIAL INTELLIGENCE

[Figure: diagram relating Data Science, Analytics, and ML/DL/Artificial Intelligence]

“Data science produces insights. Machine learning produces …”
DATA SCIENCE APPLICATION EXAMPLES
• Recommender systems
• The ability to offer unique personalized service
• Increase sales, click-through rates, conversions, …
• Netflix recommender system valued at $1B per year
• Amazon recommender system drives a 20-35% lift in sales annually
• Collaborative filtering at scale
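
The collaborative filtering idea mentioned above can be sketched in a few lines: predict how a user would rate an unseen item from the ratings of users with similar tastes. This is a toy, user-based version with cosine similarity in plain Python; the users, items, and ratings are invented, and systems running "at scale" rely on far more elaborate methods than this.

```python
from math import sqrt

# Hypothetical ratings: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"film_a": 5, "film_b": 4, "film_c": 1},
    "ben": {"film_a": 5, "film_b": 5, "film_d": 4},
    "cat": {"film_a": 1, "film_c": 5, "film_d": 2},
}


def cosine_similarity(u, v):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)


def predict_rating(user, item):
    """Similarity-weighted average of other users' ratings for `item`."""
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[item]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None


# ann has not rated film_d; estimate her rating from ben and cat.
predicted = predict_rating("ann", "film_d")
print(round(predicted, 2))
```

Because ann's ratings resemble ben's much more than cat's, the estimate lands closer to ben's rating of film_d than to cat's.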
DATA SCIENCE APPLICATION EXAMPLES
• Fraud detection
• Investigate fraud patterns in past data
• Early detection is important
• Before damage propagates
• Harder than late detection
• Precision is important
• False positives and false negatives are both bad
• Real-time analytics
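
The point about false positives and false negatives can be made concrete with two standard metrics: precision (how many flagged transactions were really fraud) and recall (how many real frauds were flagged). The sketch below computes both in plain Python; the label lists are invented for illustration.

```python
# Hypothetical labels for ten transactions: 1 = fraudulent, 0 = legitimate.
actual = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
flagged = [1, 0, 1, 1, 0, 0, 0, 0, 1, 0]  # what the detector predicted

pairs = list(zip(actual, flagged))
true_positives = sum(1 for a, f in pairs if a == 1 and f == 1)
false_positives = sum(1 for a, f in pairs if a == 0 and f == 1)  # false alarms
false_negatives = sum(1 for a, f in pairs if a == 1 and f == 0)  # missed fraud

# Precision: share of flagged transactions that were actually fraudulent.
precision = true_positives / (true_positives + false_positives)
# Recall: share of fraudulent transactions that were actually flagged.
recall = true_positives / (true_positives + false_negatives)

print(precision, recall)
```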
DATA SCIENCE APPLICATION EXAMPLES
• Predicting why patients are being readmitted to hospitals
• Reduce costs
• Improve population health
• Find the “why” behind specific populations being readmitted
• Data lakes of multiple data sources
• Investigate ties between readmission and socioeconomic data
Simplified life cycle in DATA SCIENCE

[Figure: simplified data science life cycle. Core tasks: Data Acquisition; Making Data Trustable & Usable; Management of Big Data; Modeling & Analysis; Dissemination & Visualization. Also shown: Data Security & Privacy; Ethics, Policy & Social Impact; Data Preservation; Applications no. 1 to no. 4. Source label: CANADIAN DATA SCIENCE WORKSHOP]
STEPS IN DATA LIFE CYCLE

Making Data Trustable & Usable
• Data cleaning
• Sampling
• Data provenance

Big Data Management
• Data lakes
• Batch & online access
• Platforms

Modelling & Analysis
• Models & methods for data lakes
• Unsupervised classification & AI

Data Visualization & Dissemination
• Visualization for wider audience
• Visualization for data exploration
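
The data cleaning and sampling steps listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the records, field names, and cleaning rules (drop duplicate ids, drop rows with a missing age, normalize city names) are hypothetical, and real pipelines would choose rules that fit their own data.

```python
import random

# Hypothetical raw records with typical quality problems.
raw_records = [
    {"id": 1, "age": "34", "city": " Toronto "},
    {"id": 2, "age": "", "city": "Montreal"},     # missing age
    {"id": 1, "age": "34", "city": " Toronto "},  # duplicate id
    {"id": 3, "age": "41", "city": "vancouver"},
    {"id": 4, "age": "29", "city": "Calgary"},
]


def clean(records):
    """Drop duplicate ids and rows with a missing age; normalize types and text."""
    seen_ids = set()
    cleaned = []
    for rec in records:
        if rec["id"] in seen_ids or not rec["age"].strip():
            continue
        seen_ids.add(rec["id"])
        cleaned.append(
            {
                "id": rec["id"],
                "age": int(rec["age"]),
                "city": rec["city"].strip().title(),
            }
        )
    return cleaned


cleaned = clean(raw_records)

# Sampling: draw a simple random sample without replacement.
# A fixed seed makes the draw repeatable.
rng = random.Random(42)
sample = rng.sample(cleaned, k=2)

print(cleaned)
print(sample)
```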
STEPS IN DATA LIFE CYCLE

• Cleaning for data
• Support for provenance
• Data preparation
management
• DL
• ML
• Visual analytics


Frequently Asked Questions about Data Science

Who can learn data science?

Anyone who is passionate about data science can learn it. There are some prerequisites, such as mathematics, statistics, programming knowledge, and familiarity with data science tools and SQL/NoSQL.

If you are eager to dive into the vast sea of data science and ready to learn all the necessary elements and tools, then data science is the right choice for you.
Is Data Science hard or easy to learn?

Becoming a data scientist is not an easy task. You have to invest your energy and time to become a data scientist. You may have read online that data science is easy to learn, but that is not the case.

To learn data science, you have to understand statistics and mathematical concepts, as well as programming languages (like Python and R) to organize and visualize data. You also need to understand the concepts of machine learning, SQL for structured data, and NoSQL for unstructured data.
If you consider all of the above topics calmly, you will conclude that it is an uphill task, not a straightforward one.
You have to build real-world models to prove yourself as a data scientist. Understanding all the processes, from building models to testing and deployment, requires a lot of work. We can say that learning data science requires time, and mastering it requires a lot of time too.
THE END
