INTRO TO DATA SCIENCE
BitBootCamp 2014
Data Types
DATA
STRUCTURED
UNSTRUCTURED
Data Generation:
Machine
Structured or Unstructured
Human
Structured or Unstructured
DATA
Data Generation:
Machine
Structured
DATA
Working with Data
UNDERSTAND
PREDICT
INFLUENCE
Reporting
Business- Influence
Pivot Tables
Machine Learning:
Supervised
Unsupervised
Update Business Process
WEEKS 1 & 2
WEEKS 2 & 3
Data Tools
EXCEL
UNIX
SQL
JAVA
HADOOP
Data Tools vs. Volume
[Chart: Complexity versus Data Volume, with the volume axis marked at 1 M, 100 M, and > 100 M]
Exploring Data Tools
1. Store Data
2. View Data / Search
3. Sort / Filter
4. Merge Datasets
5. Manipulate Data
6. Summarize
7. Best Practices
Unix for Data Science
Excel for Data Science
SQL for Data Science
Hive for Data Science
Java for Data Science
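
As a rough illustration of steps 1 through 6 in the list above, here is a minimal Python sketch that uses only the standard library. Python is not one of the tools named in these slides; it is used here only because it keeps the example short. The sample orders, the customer-to-region lookup, and the function names (`store`, `load`, `run_pipeline`) are all hypothetical and invented for this example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data, invented for illustration only.
ORDERS = [
    {"order_id": "1", "customer_id": "c1", "amount": "20.0"},
    {"order_id": "2", "customer_id": "c2", "amount": "35.5"},
    {"order_id": "3", "customer_id": "c1", "amount": "12.5"},
]
CUSTOMERS = {"c1": "East", "c2": "West"}  # customer_id -> region


def store(rows):
    """Step 1, Store Data: serialize rows as CSV text (stand-in for a file)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["order_id", "customer_id", "amount"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def load(text):
    """Step 2, View Data: parse the CSV text back into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def run_pipeline():
    rows = load(store(ORDERS))

    # Step 2, Search: find every order placed by customer c1.
    found = [r for r in rows if r["customer_id"] == "c1"]

    # Step 3, Sort / Filter: keep orders above 15, largest first.
    large = sorted(
        (r for r in rows if float(r["amount"]) > 15),
        key=lambda r: float(r["amount"]),
        reverse=True,
    )

    # Step 4, Merge Datasets: attach each customer's region to the order.
    merged = [dict(r, region=CUSTOMERS[r["customer_id"]]) for r in rows]

    # Step 5, Manipulate Data: convert the amount column from text to float.
    for r in merged:
        r["amount"] = float(r["amount"])

    # Step 6, Summarize: total order amount per region.
    totals = defaultdict(float)
    for r in merged:
        totals[r["region"]] += r["amount"]

    return found, large, merged, dict(totals)


if __name__ == "__main__":
    found, large, merged, totals = run_pipeline()
    print(totals)
```

The same six operations map onto the tools listed in the deck in different ways (for example, a join in SQL or a pivot table in Excel); this sketch only shows the shape of the workflow on a tiny in-memory dataset.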