Data Analyst
Key Topics:
o Descriptive Statistics (mean, median, mode, variance, standard deviation)
o Probability theory basics (independent events, conditional probability)
o Distributions (Normal, Binomial, Poisson)
o Hypothesis Testing (p-values, t-tests, chi-square tests)
Resources:
o Khan Academy: Probability & Statistics
o Coursera: Statistics for Data Science
Practice:
o Solve basic statistical problems using real datasets.
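As a starting point for the practice item above, here is a minimal sketch of the descriptive statistics, a Binomial probability, and a one-sample t statistic, using only Python's standard library. The `scores` list, the helper `binomial_pmf`, and the hypothesised mean `mu0` are illustrative assumptions, not part of any dataset named in this plan. Turning the t statistic into a p-value needs a t-distribution table or a statistics package, which this sketch leaves out.

```python
import math
import statistics

# Hypothetical sample; replace with a column from a real dataset.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

summary = {
    "mean": statistics.mean(scores),
    "median": statistics.median(scores),
    "mode": statistics.mode(scores),
    "population_variance": statistics.pvariance(scores),
    "population_std_dev": statistics.pstdev(scores),
    "sample_variance": statistics.variance(scores),  # divides by n - 1
}


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# One-sample t statistic: how far the sample mean is from mu0,
# measured in standard errors. mu0 is an assumed value for illustration.
mu0 = 4
n = len(scores)
standard_error = statistics.stdev(scores) / math.sqrt(n)
t_stat = (statistics.mean(scores) - mu0) / standard_error

for name, value in summary.items():
    print(f"{name}: {value}")
print("P(X = 2 | n = 4, p = 0.5):", binomial_pmf(2, 4, 0.5))
print("t statistic:", round(t_stat, 3))
```

Note the difference between `pvariance` (population, divides by n) and `variance` (sample, divides by n - 1); t-tests use the sample version.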
Key Topics:
o Excel Formulas (SUM, AVERAGE, IF, VLOOKUP, HLOOKUP)
o Data cleaning techniques (removing duplicates, handling missing data)
o Pivot Tables & Charts
o Excel Add-ins for data analysis
Resources:
o Exceljet: Formulas and Functions
o Microsoft Excel Documentation
Practice:
o Analyze datasets in Excel and create pivot tables and charts.
Tasks:
o Perform a complete data analysis on a sample dataset using Excel.
o Create detailed reports and visualizations.
o Explore advanced Excel tools like Power Query and Power Pivot.
Key Topics:
o Database Fundamentals (Tables, Relationships)
o SQL Syntax (SELECT, WHERE, ORDER BY)
o Joins (INNER, LEFT, RIGHT, FULL)
o Aggregation (GROUP BY, HAVING)
Resources:
o W3Schools: SQL Tutorial
o Mode Analytics: SQL Tutorial
Practice:
o Perform basic queries on sample databases like Chinook or Northwind.
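The query topics above can be tried without installing a database server. The sketch below uses Python's built-in `sqlite3` module with an in-memory database; the `customers` and `orders` tables and their rows are made up for illustration and are not the Chinook or Northwind schemas. Only INNER and LEFT joins are shown, because RIGHT and FULL joins are only available in newer SQLite releases.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 30.0), (3, 2, 20.0);
    """
)

# LEFT JOIN keeps customers with no orders; GROUP BY aggregates per customer.
per_customer = conn.execute(
    """
    SELECT c.name, COUNT(o.id) AS n_orders, COALESCE(SUM(o.amount), 0) AS total
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
    """
).fetchall()

# INNER JOIN drops customers without orders; HAVING filters the groups.
big_spenders = conn.execute(
    """
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    HAVING SUM(o.amount) > 25
    ORDER BY total DESC
    """
).fetchall()

print(per_customer)
print(big_spenders)
conn.close()
```

WHERE filters rows before grouping, while HAVING filters the groups after aggregation, which is why the spending threshold sits in HAVING here.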
Key Topics:
o Subqueries and Nested Queries
o Window Functions
o Common Table Expressions (CTEs)
o Indexing and Optimization Techniques
Resources:
o LeetCode: SQL Problems
o SQLZoo: Advanced SQL Tutorials
Practice:
o Solve complex SQL problems and optimize queries for large datasets.
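To make CTEs and window functions concrete, the following sketch computes a running total per region. It again uses `sqlite3` with an in-memory database and assumes the bundled SQLite is version 3.25 or later, which is when window functions were added. The `sales` table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('North', 1, 60), ('North', 1, 40), ('North', 2, 150),
        ('South', 1, 80), ('South', 2, 60);
    """
)

running_totals = conn.execute(
    """
    -- CTE: collapse individual sales into one row per region and month.
    WITH monthly AS (
        SELECT region, month, SUM(amount) AS total
        FROM sales
        GROUP BY region, month
    )
    -- Window function: cumulative sum within each region, ordered by month.
    SELECT
        region,
        month,
        total,
        SUM(total) OVER (PARTITION BY region ORDER BY month) AS running_total
    FROM monthly
    ORDER BY region, month
    """
).fetchall()

for row in running_totals:
    print(row)
conn.close()
```

Unlike GROUP BY, the window function keeps every row of `monthly` and adds the aggregate as an extra column.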
Tasks:
o Build a complete relational database.
o Perform end-to-end data extraction, transformation, and loading (ETL).
o Write complex queries to generate business insights from a real-world dataset.
Key Topics:
o Python Syntax & Data Structures (Lists, Dictionaries, Sets, Tuples)
o Loops, Conditions, and Functions
o Basic Libraries (NumPy, Pandas)
Resources:
o Codecademy: Python 3 Course
o Real Python: Python Basics
Practice:
o Write Python scripts to manipulate simple datasets.
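A small script along these lines covers most of the syntax topics listed above: tuples as records, a dictionary for totals, a set for unique values, a loop, a condition, and a function. The function name `total_by_category` and the sample `records` are illustrative assumptions.

```python
def total_by_category(records):
    """Sum quantities per category from (category, quantity) tuples."""
    totals = {}
    for category, quantity in records:  # tuple unpacking in a loop
        totals[category] = totals.get(category, 0) + quantity
    return totals


# Hypothetical dataset: a list of (category, quantity) tuples.
records = [("apple", 3), ("pear", 2), ("apple", 5)]

totals = total_by_category(records)                 # dictionary
categories = {category for category, _ in records}  # set of unique names
large_orders = [r for r in records if r[1] >= 3]    # list filtered by a condition

print(totals)
print(sorted(categories))
print(large_orders)
```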
Key Topics:
o DataFrames: Creation, Indexing, and Slicing
o Data Cleaning (handling missing values, duplicates)
o Merging, Joining, and Concatenating DataFrames
Resources:
o Pandas Documentation
o Kaggle: Pandas Exercises
Practice:
o Perform data wrangling on complex datasets, cleaning and preparing data for analysis.
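The cleaning and merging topics above might look like this in practice. This is a minimal sketch that assumes pandas is installed; the `orders` and `customers` frames and their column names are invented for illustration. Filling missing amounts with the median is one possible choice, not a general recommendation.

```python
import pandas as pd

# Hypothetical raw data: one exact duplicate row and one missing amount.
orders = pd.DataFrame(
    {
        "order_id": [1, 1, 2, 3],
        "customer_id": [10, 10, 11, 12],
        "amount": [100.0, 100.0, None, 50.0],
    }
)
customers = pd.DataFrame({"customer_id": [10, 11], "name": ["Ana", "Ben"]})

# 1. Remove exact duplicate rows.
clean = orders.drop_duplicates().copy()

# 2. Fill missing amounts with the median of the known amounts.
clean["amount"] = clean["amount"].fillna(clean["amount"].median())

# 3. Left merge keeps every order, even when the customer is unknown.
merged = clean.merge(customers, on="customer_id", how="left")

print(merged)
print("Orders without a matching customer:", merged["name"].isna().sum())
```

After a left merge, counting missing values in the joined columns is a quick check for keys that failed to match.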
Tasks:
o Perform a complete data analysis project using Python.
o Load, clean, and manipulate the data, then extract insights.
o Present your findings in a well-documented Jupyter Notebook.
Key Topics:
o Matplotlib & Seaborn Basics (Line Plots, Bar Plots, Histograms)
o Advanced Visualization Techniques (Heatmaps, Pair Plots)
o Interactive Visualizations with Plotly
Resources:
o Matplotlib Documentation
o Seaborn Documentation
Practice:
o Create visualizations for various datasets, focusing on storytelling.
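For the basic plot types listed above, a sketch like the following draws a bar plot and a histogram side by side with Matplotlib. It assumes Matplotlib is installed and selects the non-interactive `Agg` backend so it runs without a display; the region names and numbers are made-up sample data. Seaborn and Plotly are left out to keep the example dependency-light.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; remove this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical data for illustration.
regions = ["North", "South", "West"]
revenue = [120, 95, 140]
order_values = [12, 15, 15, 18, 21, 22, 22, 25, 30, 41]

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(9, 4))

# Bar plot: compare one value per category.
ax_bar.bar(regions, revenue)
ax_bar.set_title("Revenue by region")
ax_bar.set_ylabel("Revenue")

# Histogram: show how a numeric variable is distributed.
ax_hist.hist(order_values, bins=5)
ax_hist.set_title("Distribution of order values")
ax_hist.set_xlabel("Order value")
ax_hist.set_ylabel("Count")

fig.tight_layout()

# Save to an in-memory buffer; pass a file name instead to write a PNG to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

Titles and axis labels carry much of the storytelling, so it is worth setting them on every chart rather than relying on defaults.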
Key Topics:
o Introduction to Power BI/Tableau
o Connecting to Data Sources
o Creating Dashboards and Reports
o Data Transformation in BI Tools
Resources:
o Microsoft Power BI Learning
o Tableau Public Resources
Practice:
o Build simple dashboards using Power BI or Tableau.
Day 26-27: Advanced BI Tools
Key Topics:
o Custom Visualizations
o DAX Functions (Power BI)
o Advanced Calculations and Filters
o Publishing and Sharing Reports
Resources:
o Power BI: DAX Guide
o Tableau: Calculations & Functions
Practice:
o Create advanced dashboards, incorporating interactivity and complex calculations.
Tasks:
o Complete a capstone project that includes data extraction (SQL), data analysis (Python), and visualization and reporting (Excel/Power BI/Tableau).
o Create a professional portfolio showcasing your projects.
o Review all topics, fill any knowledge gaps, and prepare for job applications.