2 DS # 1 Introduction To DS
Data Science
9/14/2019
Media Use-case
Data Scientists
A data scientist is a practitioner with sufficient knowledge across the overlapping areas of expertise in:
Business needs,
Domain knowledge,
Analytical skills, and
Programming expertise
who manages the end-to-end scientific method across the big data lifecycle in order to:
Bring structure to the data,
Find compelling patterns in it, and
Advise executives on the implications for products, processes, and decisions
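As a rough illustration of the first two activities above (bringing structure to data and finding patterns in it), the following minimal Python sketch parses raw text lines into records and computes one simple summary. The sample data, the field names (`date`, `category`, `views`), and the function names are all hypothetical and invented for this example; they do not come from the slides.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw records, invented for illustration only.
RAW_EVENTS = [
    "2019-09-01,news,120",
    "2019-09-01,sports,300",
    "2019-09-02,news,95",
    "2019-09-02,sports,410",
    "2019-09-03,news,101",
]


def structure(lines):
    """Bring structure to raw text lines by parsing them into dicts."""
    records = []
    for line in lines:
        date, category, views = line.split(",")
        records.append({"date": date, "category": category, "views": int(views)})
    return records


def average_views_by_category(records):
    """Look for a simple pattern: the mean number of views per category."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["category"]].append(record["views"])
    return {category: mean(values) for category, values in grouped.items()}


records = structure(RAW_EVENTS)
summary = average_views_by_category(records)
print(summary)
```

The third activity, advising on the implications, is left out on purpose: interpreting a summary like this for a product or process decision is a human judgment, not something the sketch automates.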
Introduction to Hadoop (https://hadoop.apache.org/)
Works on both large and small data sets
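The slide does not say how Hadoop processes data. Hadoop's best-known programming model is MapReduce, and the sketch below imitates its three steps (map, shuffle, reduce) for a word count in plain, single-process Python. It does not call any Hadoop API; the function names and the sample input are assumptions made for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one input document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group the emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    pairs = []
    for document in documents:
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle(pairs))


counts = word_count(["big data small data", "data science"])
print(counts)
```

The same three functions work unchanged on two documents or on many, which is the property the slide points to. In a real cluster the map and reduce calls would run in parallel on different machines, which this sketch does not attempt.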
Introduction to R
R
Is an open-source programming language
Is freely available
Has GUI support and is easy to learn
Is a software environment for statistical computing and graphics
Has advanced graphics for representing information
Is widely used among statisticians and data miners
Has a large number of packages
Allows multiple ways to do the same thing
Requires the command line for customization
Can be connected to many database engines
Introduction to R (Cont…)
Introduction to Mahout (https://mahout.apache.org/)
Mahout is used to create scalable, performant (efficient) machine learning applications
References
Some references for this chapter are:
www.edureka.in/data-science