Data Science and Data Scientist

Dr. Alex Liu, Principal Data Scientist

© 2015 IBM Corporation


Data Science Example

Google Flu Trends Analytics

Detecting outbreaks two weeks ahead of the CDC.

Estimating which cities are most at risk.



Data Science Example



More data science examples …

Capability: Know Everything about your Customer
  Analyze all sources of data to know your customers as individuals.
  Outcome: Creates customized offers up to 125x faster with better results.

Capability: Innovate New Products at Speed and Scale
  Capture all sources of feedback and analyze vast data to drive innovation.
  Outcome: Reduced processing time in half.

Capability: Instant Awareness of Fraud and Risk
  Analyze all available data, detect fraud and manage risk in real-time.
  Outcome: Identified fraud which previously went undetected.

Capability: Exploit Instrumented Assets
  Predict and prevent maintenance, develop new products & services.
  Outcome: Loads hurricane data in seconds and performs risk analysis in near real-time
  for greater reliability.



Data Science – One Definition by Drew Conway



Data Science Definition

• Data Science is an interdisciplinary field concerned with the processes and systems used to
extract knowledge or insights from large volumes of data, whether structured or unstructured.
It continues earlier data analysis fields such as data mining and predictive analytics, as
well as knowledge discovery and data mining (KDD).
• Data Science is about turning data into insights.



Data Science is a process

The 4Es: Equation – Estimation – Evaluation – Explanation
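
As a rough illustration of the four steps, the sketch below walks a toy problem through Equation, Estimation, Evaluation and Explanation. It is not taken from the slides: the data values, the choice of a straight-line model, and the function names `estimate_line` and `rmse` are all assumptions made for the example.

```python
import math

# Hypothetical toy data, invented purely for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]


def estimate_line(xs, ys):
    """Estimation: ordinary least squares fit of the equation y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def rmse(xs, ys, a, b):
    """Evaluation: root mean squared error of the fitted line on the data."""
    squared_errors = [(y - (a + b * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Equation: we assume a linear relationship, y = a + b * x.
a, b = estimate_line(xs, ys)   # Estimation
error = rmse(xs, ys, a, b)     # Evaluation

# Explanation: state the result in terms a decision maker can use.
print(f"Each unit increase in x is associated with a change of about {b:.2f} in y "
      f"(intercept {a:.2f}, RMSE {error:.2f}).")
```

In practice each step offers many alternatives (other model families, estimation algorithms and evaluation metrics); the straight line and RMSE here are just one simple combination.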



Data Science – a new science paradigm

• Data Science is a new science paradigm, under which the knowledge discovery processes
and systems are dramatically different from those of the past, and even how scientists work
and get organized is dramatically different.
• Data Science is a new research paradigm, under which researchers must obtain intelligent
assistance to deal with huge amounts of data, a large selection of equations and models, a
large selection of estimation algorithms, and complicated evaluation and explanation of results.



Data Scientist



Data Scientist – A Definition

• A data scientist is a scientific professional who processes large amounts of data to
discover insights.
• A data scientist represents an evolution from a business or data analyst role. The formal
training is similar, with a solid foundation typically in computer science and applications,
modeling, statistics, analytics, math or even applied social science. What sets the data
scientist apart is strong business acumen, coupled with the ability to communicate findings to
both business and IT leaders in a way that can influence how an organization approaches a
business challenge. Good data scientists will not just address business problems; they will
pick the right problems that have the most value to the organization.
• Whereas a traditional data analyst may look only at data from a single source – a CRM
system, for example – a data scientist will most likely explore and examine data from multiple
disparate sources. The data scientist will sift through all incoming data with the goal of
discovering a previously hidden insight, which in turn can provide a competitive advantage or
address a pressing business problem. A data scientist does not simply collect and report on
data, but also looks at it from many angles, determines what it means, then recommends
ways to apply the data.

Source: http://www-01.ibm.com/software/data/infosphere/data-scientist/



Data Scientist Skills

Data: Data Sources; Data Storage; Data Cleaning; Feature Extraction
Equation (MODELS): Regression; Decision Tree; RMS; Bayesian & Causality; Time Series
Estimation (ALGORITHMS & COMPUTING): MLE; ITERATIVE (MapReduce & Spark); R; SPSS
Evaluation (STATISTICS & Visualization): RMSE; Confusion Matrix; ROC Curve
Explanation: Business Acumen; Subject Knowledge; Communication
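
Two of the evaluation items named in the skills map, the confusion matrix and the ROC curve, can be made concrete with a small sketch. The labels below are hypothetical and the helper `confusion_matrix` is written for this example only; it does not come from the slides or from any particular library. The true positive rate and false positive rate it yields correspond to a single point on a ROC curve.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs for a binary classifier (1 = positive)."""
    counts = Counter(zip(actual, predicted))
    return {
        "tp": counts[(1, 1)],  # positives correctly flagged
        "fn": counts[(1, 0)],  # positives missed
        "fp": counts[(0, 1)],  # negatives wrongly flagged
        "tn": counts[(0, 0)],  # negatives correctly passed
    }


# Hypothetical labels, invented for illustration.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

cm = confusion_matrix(actual, predicted)

# One ROC point: the true positive rate against the false positive rate.
tpr = cm["tp"] / (cm["tp"] + cm["fn"])
fpr = cm["fp"] / (cm["fp"] + cm["tn"])

print(cm)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

A full ROC curve would repeat this calculation at many decision thresholds on a model's scores; the sketch fixes one threshold to keep the arithmetic visible.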

