Introduction to DA
Data Overview
1. Data Engineer – Builds systems to collect and clean data (e.g., Hadoop
pipelines).
2. Data Scientist – Uses stats/ML to find insights (e.g., predictive models).
3. Business Analyst – Translates data insights into business decisions.
Phases (P-D-E-M-D-O)
1. Problem Identification – Define the business question (e.g., "Why are sales
dropping?").
2. Data Collection – Gather data from databases, surveys, sensors, etc.
3. Exploration & Cleaning – Handle missing data, remove duplicates.
4. Modeling – Apply algorithms (e.g., regression, clustering).
5. Deployment – Implement the solution (e.g., a recommendation engine).
6. Outcome – Review results and improve.
Exam Tip: Remember "People Don’t Eat Moldy Donuts, Okay?" (P-D-E-M-D-O).
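Worked example (phases 3 and 4): a minimal sketch of cleaning followed by a simple regression, using only the Python standard library. The records, the variable names, and the helper `predict` are hypothetical and chosen for illustration; real projects would normally use a data library instead of hand-written loops.

```python
from statistics import mean

# Hypothetical raw records: (month_index, sales). None marks a missing value.
raw = [(1, 100.0), (2, 110.0), (2, 110.0), (3, None), (4, 130.0), (5, 140.0)]

# Exploration & Cleaning: drop exact duplicates and rows with missing sales.
seen = set()
clean = []
for row in raw:
    if row in seen or row[1] is None:
        continue
    seen.add(row)
    clean.append(row)

# Modeling: ordinary least-squares fit of sales = slope * month + intercept.
xs = [x for x, _ in clean]
ys = [y for _, y in clean]
x_bar, y_bar = mean(xs), mean(ys)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
intercept = y_bar - slope * x_bar


def predict(month):
    """Estimate sales for a given month index from the fitted line."""
    return slope * month + intercept


print(clean)
print(slope, intercept, predict(3))
```

Here the duplicate month-2 row and the missing month-3 row are removed before fitting, so the model is trained on four clean points and can then estimate the missing month.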
Definition
Data mining: extracting hidden patterns from large datasets (e.g., finding
customer buying habits).
Models
Steps (D-C-P-M-E-D)
Applications
Retail: Market basket analysis (e.g., "Customers who buy X also buy Y").
Banking: Fraud detection (e.g., unusual transactions).
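Worked example (market basket analysis): a minimal sketch that counts how often items are bought together and computes a simple confidence score for "customers who buy X also buy Y". The baskets and the helper `confidence` are hypothetical; production systems typically use dedicated association-rule algorithms rather than this brute-force pair count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, one set of items per transaction.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

item_counts = Counter()
pair_counts = Counter()
for basket in baskets:
    item_counts.update(basket)
    # Sort so each pair is always stored in the same order.
    pair_counts.update(combinations(sorted(basket), 2))


def confidence(x, y):
    """Share of baskets containing x that also contain y."""
    pair = tuple(sorted((x, y)))
    return pair_counts[pair] / item_counts[x]


print(pair_counts.most_common(1))
print(confidence("bread", "butter"))
print(confidence("milk", "bread"))
```

In this toy data every basket with bread also contains butter, so the rule "bread, then butter" has the highest confidence, while "milk, then bread" holds in only half of the milk baskets.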
Challenges
Memory Trick for the steps (D-C-P-M-E-D): "Dirty Cats Prefer Milk Every Day"
(Define, Collect, Preprocess, Model, Evaluate, Deploy).
Descriptive Analytics
"What happened?"
Tools: Dashboards, reports (e.g., monthly sales summary).
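Worked example (descriptive analytics): a minimal sketch of a monthly sales summary of the kind a report or dashboard might show. The transactions and field names are hypothetical.

```python
from collections import defaultdict

# Hypothetical transactions: (ISO date, amount).
sales = [
    ("2024-01-05", 200.0),
    ("2024-01-20", 150.0),
    ("2024-02-03", 300.0),
    ("2024-02-14", 100.0),
    ("2024-02-28", 50.0),
]

# Group by month (the first 7 characters of the date, "YYYY-MM").
monthly = defaultdict(lambda: {"total": 0.0, "orders": 0})
for date, amount in sales:
    month = date[:7]
    monthly[month]["total"] += amount
    monthly[month]["orders"] += 1

for month in sorted(monthly):
    stats = monthly[month]
    average = stats["total"] / stats["orders"]
    print(f"{month}: total={stats['total']:.2f}, "
          f"orders={stats['orders']}, avg={average:.2f}")
```

The summary only describes what already happened (totals, counts, averages); it makes no claim about causes or future values.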
Diagnostic Analytics
"Why did it happen?"
Predictive Analytics
"What is likely to happen?"
Prescriptive Analytics
"What should we do?"
Example: Recommending the best marketing strategy.
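Worked example (prescriptive analytics): a minimal sketch that recommends a marketing strategy by comparing expected profit. All strategy names, costs, reach figures, conversion rates, and the `PROFIT_PER_SALE` constant are invented assumptions; in practice these inputs would come from predictive models or past campaign data.

```python
# Hypothetical candidate strategies with assumed cost, reach, and conversion rate.
strategies = {
    "email": {"cost": 1000.0, "reach": 20000, "conversion": 0.010},
    "social_ads": {"cost": 5000.0, "reach": 50000, "conversion": 0.012},
    "tv_spot": {"cost": 20000.0, "reach": 200000, "conversion": 0.004},
}
PROFIT_PER_SALE = 30.0  # assumed profit per converted customer


def expected_profit(strategy):
    """Expected sales times profit per sale, minus the campaign cost."""
    expected_sales = strategy["reach"] * strategy["conversion"]
    return expected_sales * PROFIT_PER_SALE - strategy["cost"]


# Prescriptive step: pick the action with the highest expected profit.
best = max(strategies, key=lambda name: expected_profit(strategies[name]))

for name, strategy in strategies.items():
    print(f"{name}: expected profit = {expected_profit(strategy):.2f}")
print("Recommended strategy:", best)
```

The point of the sketch is the final step: prescriptive analytics turns estimates into a recommended action, here by choosing the option that maximizes a stated objective.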
Big Data: datasets too large or complex for traditional tools (e.g., social media data).
Future Trends