Python Data Analytics GenAI Course Plan

The document outlines a comprehensive course plan covering Python, Data Analytics, and Generative AI over 20 sessions. Each session includes key topics, practical focuses, and hands-on activities designed to build skills in data analysis, machine learning, and the integration of SQL with generative AI tools. The course culminates in building a CSV analytics tool and a RAG-based chatbot, emphasizing real-world applications and advanced features.

Course Plan: Python, Data Analytics, and Generative AI

Session 1: Python Refresher


Topics

- Python essentials (data types, loops, conditionals)

- Functions and modules

- File handling (CSV/Excel)

- Pythonic coding practices

Practical Focus

- Write a function to load a CSV file and summarize basic statistics.
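
One possible shape for this exercise, as a minimal sketch using only the standard library. The function name `summarize_csv` and the sample columns are assumptions made for illustration; the course plan does not prescribe them.

```python
import csv
import io
import statistics


def summarize_csv(file_obj):
    """Return basic statistics for every fully numeric column of a CSV file."""
    rows = list(csv.DictReader(file_obj))
    summary = {}
    if not rows:
        return summary
    for column in rows[0].keys():
        try:
            values = [float(row[column]) for row in rows]
        except (TypeError, ValueError):
            continue  # skip columns that hold text or missing values
        summary[column] = {
            "count": len(values),
            "mean": statistics.mean(values),
            "min": min(values),
            "max": max(values),
        }
    return summary


# Hypothetical sample data; a real file would be opened with open(path, newline="").
sample = io.StringIO("region,units,price\nNorth,10,2.5\nSouth,20,3.5\n")
stats = summarize_csv(sample)
print(stats["units"])
```

Passing a file object rather than a path keeps the function easy to test with in-memory data.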

Session 2: Data Analysis with pandas and NumPy


Topics

- pandas DataFrame basics: loading, slicing, merging

- NumPy arrays: indexing, slicing, reshaping

- Descriptive statistics (mean, median, variance)

Practical Focus

- Analyze a CSV dataset (e.g., sales data) to extract summary statistics.
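
A small sketch of what this analysis might look like with pandas. The sales columns (`region`, `product`, `units`, `unit_price`) are invented sample data, not a dataset supplied by the course.

```python
import io

import pandas as pd

# Hypothetical sales data held in memory so the example is self-contained.
csv_text = """region,product,units,unit_price
North,A,10,2.0
North,B,5,4.0
South,A,20,2.0
South,B,15,4.0
"""
sales = pd.read_csv(io.StringIO(csv_text))
sales["revenue"] = sales["units"] * sales["unit_price"]

# Descriptive statistics for one column; "var" is the sample variance (ddof=1).
overall = sales["revenue"].agg(["mean", "median", "var"])

# Total revenue per region.
by_region = sales.groupby("region")["revenue"].sum()

print(overall)
print(by_region)
```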

Session 3: Data Wrangling and Cleaning


Topics

- Handling missing data: dropna, fillna

- String manipulations and date conversions

- Combining and reshaping datasets (merge, concat, pivot)

Practical Focus

- Clean a messy dataset by handling missing values, converting data types, and merging files.
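
The three steps named above could be sketched as follows. The `orders` and `customers` tables are made-up examples, and filling missing amounts with the median is just one reasonable choice among several.

```python
import pandas as pd

# Hypothetical messy data: a missing customer id and a non-numeric amount.
orders = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "customer_id": [101, 102, None, 101],
    "amount": ["10.5", "n/a", "7.0", "3.5"],
    "order_date": ["2024-01-05", "2024-01-06", "2024-01-07", "2024-01-08"],
})
customers = pd.DataFrame({"customer_id": [101, 102], "name": ["Ana", "Ben"]})

# 1. Missing values: drop rows that cannot be linked to a customer.
cleaned = orders.dropna(subset=["customer_id"]).copy()

# 2. Type conversion: ids to int, amounts to float, dates to datetime.
cleaned["customer_id"] = cleaned["customer_id"].astype(int)
cleaned["amount"] = pd.to_numeric(cleaned["amount"], errors="coerce")
cleaned["amount"] = cleaned["amount"].fillna(cleaned["amount"].median())
cleaned["order_date"] = pd.to_datetime(cleaned["order_date"])

# 3. Merge: attach customer names to each order.
merged = cleaned.merge(customers, on="customer_id", how="left")
print(merged)
```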

Session 4: Exploratory Data Analysis (EDA)


Topics

- Visualizing distributions (histograms, box plots)


- Correlation analysis and heatmaps

- Identifying patterns and outliers

Practical Focus

- Perform EDA on a dataset (e.g., customer data) to identify trends and relationships.
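
A minimal numeric sketch of two of the EDA steps listed above, correlation analysis and outlier detection. The customer data is invented, and the 1.5 × IQR rule is one common convention for flagging outliers, not something the course plan specifies. Plots are left out so the example runs without a display.

```python
import pandas as pd

# Hypothetical customer data with one unusually high spender.
customers = pd.DataFrame({
    "age": [25, 32, 41, 38, 29, 50, 45],
    "annual_spend": [200, 260, 340, 310, 240, 400, 2000],
})

# Pairwise Pearson correlations (the matrix a heatmap would visualize).
corr = customers.corr()

# Flag values outside 1.5 * IQR of the quartiles as outliers.
q1 = customers["annual_spend"].quantile(0.25)
q3 = customers["annual_spend"].quantile(0.75)
iqr = q3 - q1
is_outlier = (customers["annual_spend"] < q1 - 1.5 * iqr) | (
    customers["annual_spend"] > q3 + 1.5 * iqr
)
outliers = customers[is_outlier]

print(corr)
print(outliers)
```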

Session 5: Introduction to Machine Learning


Topics

- Machine Learning basics: supervised vs. unsupervised

- Overview of ML workflow: data preprocessing → model training → evaluation

- Common ML use cases in business

Practical Focus

- Discuss business-relevant ML use cases and map them to available datasets.

Session 6: Supervised Learning – Regression


Topics

- Linear Regression: simple and multiple

- Scikit-learn ML pipeline

- Model evaluation: MSE, RMSE, MAE

Practical Focus

- Build a linear regression model to predict sales/revenue from a dataset.
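
A sketch of this exercise with scikit-learn on synthetic data. The features `ad_spend` and `stores` and the coefficients used to generate `revenue` are assumptions chosen so the result can be checked; a real dataset would replace them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data: revenue depends linearly on two features plus noise.
rng = np.random.default_rng(0)
ad_spend = rng.uniform(0, 100, size=200)
stores = rng.integers(1, 10, size=200)
revenue = 3.0 * ad_spend + 50.0 * stores + rng.normal(0, 5, size=200)

X = np.column_stack([ad_spend, stores])
X_train, X_test, y_train, y_test = train_test_split(
    X, revenue, test_size=0.25, random_state=0
)

# Multiple linear regression on the training split.
model = LinearRegression().fit(X_train, y_train)
pred = model.predict(X_test)

# The three metrics listed in the topics, computed on held-out data.
mse = mean_squared_error(y_test, pred)
rmse = np.sqrt(mse)
mae = mean_absolute_error(y_test, pred)
print(f"MSE={mse:.2f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

RMSE is in the same units as the target, which usually makes it easier to explain to a business audience than MSE.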

Session 7: Supervised Learning – Classification


Topics

- Logistic Regression, Decision Trees

- Evaluation metrics: accuracy, precision, recall, F1-score

- Confusion matrix interpretation

Practical Focus

- Train a logistic regression model to classify customers as likely churners or not.
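
One way this exercise might be set up, using synthetic data in place of a real churn dataset. The features `tenure_months` and `support_tickets` and the rule that generates the labels are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic customers: short tenure and many tickets raise churn probability.
rng = np.random.default_rng(1)
n = 400
tenure_months = rng.uniform(1, 60, size=n)
support_tickets = rng.poisson(2, size=n)
logit = 1.5 - 0.08 * tenure_months + 0.6 * support_tickets
churned = (rng.uniform(size=n) < 1 / (1 + np.exp(-logit))).astype(int)

X = np.column_stack([tenure_months, support_tickets])
X_train, X_test, y_train, y_test = train_test_split(
    X, churned, test_size=0.25, random_state=1, stratify=churned
)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
pred = clf.predict(X_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, pred)
print(cm)
# Precision, recall, and F1-score per class, plus overall accuracy.
print(classification_report(y_test, pred))
```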

Session 8: Model Evaluation and Validation


Topics

- Cross-validation (K-Fold)

- Hyperparameter tuning (GridSearchCV, RandomizedSearchCV)

- Bias-variance tradeoff

Practical Focus

- Perform cross-validation and hyperparameter tuning on a classification or regression model.
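
A compact sketch of K-Fold cross-validation followed by a grid search. The decision tree, the synthetic dataset, and the parameter grid are arbitrary choices made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Baseline: accuracy of an untuned tree on each of the 5 folds.
baseline_scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=cv)

# Grid search: every parameter combination is scored with the same folds.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4, 8, None], "min_samples_leaf": [1, 5, 20]},
    cv=cv,
    scoring="accuracy",
)
search.fit(X, y)

print("baseline mean accuracy:", baseline_scores.mean())
print("best params:", search.best_params_)
print("best mean accuracy:", search.best_score_)
```

Limiting depth or raising the minimum leaf size trades variance for bias, which ties the search back to the bias-variance topic above.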

Session 9: Feature Engineering


Topics

- Encoding categorical variables

- Feature scaling (standardization, normalization)

- Creating new features from existing data

Practical Focus

- Engineer features (e.g., date-based, interactions) to improve a machine learning model.
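
The feature types mentioned above could be sketched like this with pandas. The `orders` table and the derived column names are hypothetical.

```python
import pandas as pd

# Hypothetical order data.
orders = pd.DataFrame({
    "order_date": pd.to_datetime(["2024-01-05", "2024-01-06", "2024-02-10"]),
    "channel": ["web", "store", "web"],
    "units": [2, 5, 1],
    "unit_price": [10.0, 4.0, 30.0],
})
features = orders.copy()

# Date-based features.
features["month"] = features["order_date"].dt.month
features["is_weekend"] = features["order_date"].dt.dayofweek >= 5

# Interaction feature: product of two existing columns.
features["order_value"] = features["units"] * features["unit_price"]

# One-hot encoding of a categorical column.
features = pd.get_dummies(features, columns=["channel"])

# Standardization (pandas uses the sample standard deviation, ddof=1).
features["units_scaled"] = (
    features["units"] - features["units"].mean()
) / features["units"].std()

print(features)
```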

Session 10: Unsupervised Learning – Clustering


Topics

- K-means clustering

- Applications: customer segmentation, anomaly detection

- Cluster evaluation: silhouette score

Practical Focus

- Perform K-means clustering to segment customers and analyze cluster profiles.
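
A sketch of the segmentation exercise on synthetic data. The two customer groups and their features (`annual_spend`, `visits_per_month`) are invented so that the clusters are easy to recover; real data is rarely this clean.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Two synthetic customer groups: low spenders and high spenders.
rng = np.random.default_rng(0)
customers = pd.DataFrame({
    "annual_spend": np.concatenate([rng.normal(200, 20, 50), rng.normal(1000, 50, 50)]),
    "visits_per_month": np.concatenate([rng.normal(2, 0.5, 50), rng.normal(8, 1.0, 50)]),
})

# Scale first so both features contribute equally to the distance.
X = StandardScaler().fit_transform(customers)
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

# Silhouette score ranges from -1 to 1; higher means better-separated clusters.
score = silhouette_score(X, kmeans.labels_)

# Cluster profiles: mean of each original feature per cluster.
customers["cluster"] = kmeans.labels_
profiles = customers.groupby("cluster").mean()

print(f"silhouette score: {score:.3f}")
print(profiles)
```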

Session 11: Ensemble Methods


Topics

- Random Forest and Gradient Boosting (XGBoost/LightGBM)

- Bagging vs. Boosting

- Practical tips for tuning ensembles

Practical Focus

- Train a Gradient Boosting model to improve classification accuracy.
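
A sketch comparing a single tree, a bagging-style ensemble, and a boosting ensemble on the same split. It uses scikit-learn's `GradientBoostingClassifier` as a stand-in for XGBoost or LightGBM so that no extra packages are needed; the dataset is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    # Baseline: one unpruned decision tree.
    "single_tree": DecisionTreeClassifier(random_state=0),
    # Bagging: many trees trained independently on bootstrap samples.
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Boosting: trees trained in sequence, each correcting earlier errors.
    "gradient_boosting": GradientBoostingClassifier(
        n_estimators=100, learning_rate=0.1, random_state=0
    ),
}

# Test-set accuracy for each model.
scores = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
for name, accuracy in scores.items():
    print(f"{name}: {accuracy:.3f}")
```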


Session 12: SQL for Business Analytics


Topics

- Writing advanced SQL queries (joins, subqueries, window functions)

- Query optimization and indexing

- Integrating SQL queries into Python (using sqlite3 or SQLAlchemy)

Practical Focus

- Query and analyze data from an SQL database integrated with a Python script.
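
A self-contained sketch using the standard-library `sqlite3` module with an in-memory database. The `customers` and `orders` tables are hypothetical, and the window function assumes the bundled SQLite is version 3.25 or later, which is the case for most current Python builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
CREATE INDEX idx_orders_customer ON orders(customer_id);
INSERT INTO customers VALUES (1, 'North'), (2, 'North'), (3, 'South');
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 300.0), (4, 3, 80.0);
""")

# Join + aggregate in a CTE, then rank customers within each region.
query = """
WITH totals AS (
    SELECT c.region, c.id AS customer_id, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.region, c.id
)
SELECT region, customer_id, total,
       RANK() OVER (PARTITION BY region ORDER BY total DESC) AS region_rank
FROM totals
ORDER BY region, region_rank
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With pandas installed, the same query can be loaded straight into a DataFrame via `pandas.read_sql_query(query, conn)` for further analysis.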

Session 13: Introduction to Generative AI (GenAI)


Topics

- Overview of Generative AI (text generation, summarization)

- Working with pre-trained LLMs (e.g., Hugging Face transformers)

- Introduction to prompt engineering

Practical Focus

- Generate text summaries or insights from a dataset using an LLM.

Session 14: Retrieval-Augmented Generation (RAG) – Concepts


Topics

- How RAG combines retrieval systems with generative models

- Use cases for RAG in business (Q&A, report generation, decision support)

- Overview of vector databases (e.g., FAISS, Pinecone)

Practical Focus

- Sketch a workflow where queries fetch relevant data to feed into a generative model.
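
The workflow can be sketched in a few functions: embed the query, rank stored documents by similarity, and place the best matches into a prompt. Everything here is a toy stand-in. Word counts replace a real embedding model, a Python list replaces a vector database such as FAISS or Pinecone, and no generative model is actually called.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; a real system would store embeddings in a vector DB.
DOCUMENTS = [
    "Q1 revenue in the North region grew 12 percent year over year.",
    "Customer churn in the South region rose after the price change.",
    "The warehouse migration finished in March with no downtime.",
]


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, context):
    """Combine retrieved context and the user question into one prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


question = "Why did churn rise in the South?"
top = retrieve(question, DOCUMENTS, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

The resulting `prompt` string is what would be sent to the generative model in the final step.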

Session 15: Building the CSV Analytics Tool – Design


Topics

- Requirements for a CSV analytics tool (querying, summarizing, filtering)

- Efficient file handling for large datasets (chunking)

- Designing user-friendly outputs (charts, tables)

Practical Focus

- Draft the logic for a CSV analytics module that summarizes key metrics interactively.
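
A draft of the chunked-summary logic, assuming pandas. The function name `summarize_in_chunks` and its parameters are hypothetical, and the tiny `chunksize` is only there to force several chunks in the demo; real files would use something like 100,000 rows.

```python
import io

import pandas as pd


def summarize_in_chunks(source, value_column, group_column, chunksize=2):
    """Aggregate sum, count, and mean per group without loading the whole file."""
    totals = None
    for chunk in pd.read_csv(source, chunksize=chunksize):
        part = chunk.groupby(group_column)[value_column].agg(["sum", "count"])
        # Add this chunk's partial sums; groups missing on one side count as 0.
        totals = part if totals is None else totals.add(part, fill_value=0)
    if totals is None:
        raise ValueError("no data rows found")
    # The mean is derived at the end, since means of chunks cannot be averaged.
    totals["mean"] = totals["sum"] / totals["count"]
    return totals


# Hypothetical data; a file path can be passed instead of a StringIO object.
csv_text = "region,amount\nNorth,10\nSouth,20\nNorth,30\nSouth,40\nEast,5\n"
summary = summarize_in_chunks(io.StringIO(csv_text), "amount", "region")
print(summary)
```

Only sums and counts are carried between chunks, so memory use stays bounded by the chunk size and the number of groups.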

Session 16: Implementing the CSV Analytics Tool


Topics

- Building core functionalities: query execution, metric calculations, visualizations

- Error handling and logging

- Exporting insights (e.g., saving summaries to Excel/CSV)

Practical Focus

- Build the CSV analytics tool and test it on real-world datasets.
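
A minimal sketch of two of the pieces listed above, error handling with logging and exporting a summary. The names `compute_metric`, `export_summary`, and `METRICS` are invented for this example; it uses only the standard library and writes CSV rather than Excel.

```python
import csv
import io
import logging
import statistics

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("csv_tool")

# Supported metrics, mapped to the functions that compute them.
METRICS = {"mean": statistics.mean, "median": statistics.median, "max": max, "min": min}


def compute_metric(rows, column, metric):
    """Return one metric for one column, or None if the request cannot be served."""
    if metric not in METRICS:
        logger.error("Unknown metric %r", metric)
        return None
    try:
        values = [float(row[column]) for row in rows]
    except KeyError:
        logger.error("Column %r not found", column)
        return None
    except ValueError:
        logger.error("Column %r contains non-numeric values", column)
        return None
    if not values:
        logger.warning("No rows to summarize")
        return None
    return METRICS[metric](values)


def export_summary(summary, file_obj):
    """Write metric/value pairs as a two-column CSV."""
    writer = csv.writer(file_obj)
    writer.writerow(["metric", "value"])
    for name, value in summary.items():
        writer.writerow([name, value])


# Hypothetical input; real use would read from and write to files on disk.
rows = list(csv.DictReader(io.StringIO("units\n10\n20\n60\n")))
summary = {m: compute_metric(rows, "units", m) for m in ("mean", "median")}
out = io.StringIO()
export_summary(summary, out)
print(out.getvalue())
```

Returning `None` and logging the reason keeps an interactive session alive when a user mistypes a column or metric name.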

Session 17: SQL Integration for RAG


Topics

- Querying SQL databases for context retrieval

- Converting SQL results into context for LLMs

- Handling large datasets and dynamic query results

Practical Focus

- Write Python code to retrieve data from SQL, format it, and prepare it for a generative model.
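
One way to approach this, sketched with `sqlite3`. The helper `rows_to_context`, the `sales` table, and the `column=value` text format are assumptions; the row cap is a simple guard against large results overflowing a model's context window.

```python
import sqlite3


def rows_to_context(cursor, max_rows=50):
    """Format query results as 'column=value' lines, capped at max_rows."""
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchmany(max_rows)
    lines = [
        ", ".join(f"{col}={val}" for col, val in zip(columns, row)) for row in rows
    ]
    # If anything is left on the cursor, tell the model the result was cut off.
    if cursor.fetchone() is not None:
        lines.append(f"(only the first {max_rows} rows are shown)")
    return "\n".join(lines)


# Hypothetical in-memory database standing in for a real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "2024-01", 1200.0), ("North", "2024-02", 1500.0), ("South", "2024-01", 900.0)],
)

# Parameterized query: user-supplied values are never pasted into the SQL text.
cursor = conn.execute(
    "SELECT region, month, revenue FROM sales WHERE region = ? ORDER BY month",
    ("North",),
)
context = rows_to_context(cursor)
print(context)
```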

Session 18: Building the RAG-Based Chatbot


Topics

- Connecting the chatbot to SQL and CSV modules

- Structuring prompts dynamically based on user queries

- Handling missing data or ambiguous queries

Practical Focus

- Build an initial RAG-based chatbot pipeline that retrieves context and generates responses.
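
A skeleton of such a pipeline, kept deliberately small. The in-memory `KNOWLEDGE` dictionary stands in for the SQL and CSV modules, keyword matching stands in for real retrieval, and `fake_generate` is a placeholder for an LLM call; all of these names are hypothetical.

```python
from typing import Callable, Dict, List

# Stand-in for the SQL and CSV modules: facts keyed by topic.
KNOWLEDGE: Dict[str, List[str]] = {
    "revenue": ["North revenue 2024-01: 1200", "North revenue 2024-02: 1500"],
    "churn": ["South churn rate 2024-01: 4.2 percent"],
}


def retrieve_context(question: str) -> List[str]:
    """Collect facts for every known topic mentioned in the question."""
    lowered = question.lower()
    hits: List[str] = []
    for topic, facts in KNOWLEDGE.items():
        if topic in lowered:
            hits.extend(facts)
    return hits


def build_prompt(question: str, context: List[str]) -> str:
    """Build the prompt dynamically from the retrieved facts."""
    joined = "\n".join(f"- {fact}" for fact in context)
    return f"Context:\n{joined}\n\nQuestion: {question}\nAnswer:"


def answer(question: str, generate: Callable[[str], str]) -> str:
    """Retrieve context, then generate; fall back when input or data is missing."""
    if not question.strip():
        return "Please enter a question."
    context = retrieve_context(question)
    if not context:
        topics = ", ".join(sorted(KNOWLEDGE))
        return f"I could not find data for that question. Try asking about: {topics}"
    return generate(build_prompt(question, context))


def fake_generate(prompt: str) -> str:
    """Placeholder for a real model call; echoes the prompt it received."""
    return "MODEL SAW:\n" + prompt


print(answer("What was revenue in the North?", fake_generate))
print(answer("What is the weather?", fake_generate))
```

Passing the model in as a function makes it straightforward to swap the placeholder for a real LLM client later without touching the retrieval or prompt logic.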

Session 19: Testing and Refining the GenAI Project


Topics

- Testing edge cases for the CSV tool and RAG chatbot

- Handling incomplete user inputs or noisy data

- Improving performance and response accuracy


Practical Focus

- Test the combined system, focusing on query accuracy, response quality, and speed.

Session 20: Advanced Features and Final Review


Topics

- Adding advanced features: embedding-based similarity search, interactive filtering

- Business scalability considerations (security, multi-user support)

- Future enhancements: extending RAG or adding predictive analytics

Practical Focus

- Explore extensions, such as adding ML-driven recommendations or summarization features to the chatbot.
