Python for Data Science
Course objectives:
● Advanced understanding of Python programming, including OOP, functional
programming, and memory optimization.
● Experience with multithreading, multiprocessing, and asynchronous programming for
high-performance applications.
● Hands-on work with Python libraries (NumPy, Pandas, Matplotlib) for data
manipulation and analysis.
● Fundamental knowledge of machine learning techniques using Python, with
real-world capstone projects.
● This training course balances deep Python programming concepts with practical
applications, preparing students for both technical coding challenges and data
science projects.
Session 1: Review of Python Basics
● What is Python? (History, usage)
● Installing Python and IDEs (VS Code, PyCharm, Jupyter)
● Running Python scripts and using the interpreter
● Variables, Data Types, Type Conversion
● Input/Output
● Basic Operators (Arithmetic, Comparison, Logical)
● Conditional Statements (if, else, elif)
● Loops: for and while
● Break, continue, and pass statements
● Defining Functions, Arguments, Return Values
● Variable Scope (local/global)
● Creating and using modules, importing libraries
● Capstone project - 1
Session 2: Data Structures
● Lists, tuples, sets, and dictionaries - create/update/delete
● List comprehensions
● Advanced iteration techniques (enumerate, zip)
● Capstone project - 2
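A minimal sketch of the kind of code this session covers, combining a dictionary comprehension, a filtered list comprehension, zip, and enumerate. The names and scores are hypothetical sample data chosen only for the example.

```python
# Hypothetical sample data for illustration only.
names = ["Asha", "Ben", "Chen"]
scores = [82, 67, 91]

# zip pairs items from both lists; the dict comprehension builds a lookup table.
score_by_name = {name: score for name, score in zip(names, scores)}

# List comprehension with a condition: keep only names scoring 80 or more.
high_scorers = [name for name, score in score_by_name.items() if score >= 80]

# Sort names by score (highest first), then number them with enumerate.
ranked = sorted(score_by_name, key=score_by_name.get, reverse=True)
ranking = [f"{rank}. {name}" for rank, name in enumerate(ranked, start=1)]

print(high_scorers)
print(ranking)
```

Passing start=1 to enumerate avoids manual index arithmetic when a one-based rank is wanted.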
Session 3: File Handling, Lambda Functions, and Asynchronous Programming
● Reading and writing files (text, CSV, JSON)
● Exception handling and custom error messages
● Lambda functions, map(), filter(), and reduce()
● Writing custom decorators, function introspection
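The functional topics above can be sketched together as follows. The decorator name log_calls and the function total_of_even_squares are hypothetical, introduced only for illustration. In Python 3, reduce is imported from functools rather than being a built-in.

```python
import functools


def log_calls(func):
    """Hypothetical decorator that records each call's arguments on the wrapper."""
    @functools.wraps(func)  # keeps __name__ and __doc__ intact for introspection
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []  # call history stored as a function attribute
    return wrapper


@log_calls
def total_of_even_squares(numbers):
    """Sum the squares of the even numbers in an iterable."""
    evens = filter(lambda n: n % 2 == 0, numbers)   # keep even values
    squares = map(lambda n: n * n, evens)           # square each one
    return functools.reduce(lambda acc, n: acc + n, squares, 0)  # fold into a sum


result = total_of_even_squares([1, 2, 3, 4])
print(result)
print(total_of_even_squares.__name__, len(total_of_even_squares.calls))
```

Without functools.wraps, the decorated function would report its name as "wrapper", which is the kind of detail function introspection makes visible.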
Asynchronous Programming:
● Introduction to Asynchronous Programming
● The asyncio module and async/await syntax
● Event Loops, Tasks, and Futures
● Multithreading vs Multiprocessing (use of threading and multiprocessing
modules)
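A minimal sketch of coroutines, tasks, and the event loop, assuming a hypothetical fetch coroutine that sleeps in place of real I/O. It covers only the asyncio topics; the threading and multiprocessing modules are not shown here.

```python
import asyncio


async def fetch(label, delay):
    """Hypothetical stand-in for an I/O-bound call; sleeps instead of doing real I/O."""
    await asyncio.sleep(delay)  # yields control to the event loop while waiting
    return f"{label} done"


async def main():
    # create_task schedules each coroutine on the running event loop right away.
    tasks = [
        asyncio.create_task(fetch(label, delay))
        for label, delay in [("a", 0.02), ("b", 0.01)]
    ]
    # gather waits for all tasks; results keep the order the tasks were passed in.
    return await asyncio.gather(*tasks)


# asyncio.run creates an event loop, runs main() to completion, then closes the loop.
results = asyncio.run(main())
print(results)
```

Although task "b" finishes first, gather returns results in submission order, so the caller does not need to track completion order.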
Session 4: OOP Essentials
● Classes, objects, and constructors
● Class vs instance variables
● Methods: instance methods, class methods, static methods
● Inheritance, multi-level inheritance, and method overriding
● Abstract base classes and interfaces (abc module)
● super() function and cooperative multiple inheritance
● Dunder methods (__str__, __repr__, __len__, __call__, etc.)
● Operator overloading (e.g., +, -, * for custom objects)
● Context managers and the with statement (__enter__ and __exit__)
● Capstone project - 3: Bank Account Simulation
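Several of the OOP topics above can be seen together in a short sketch. The Shape, Rectangle, and Square classes are hypothetical examples and are not taken from the capstone project.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    """Abstract base class: subclasses must implement area()."""

    @abstractmethod
    def area(self):
        ...

    def __repr__(self):
        # Dunder method controlling the developer-facing string form.
        return f"{type(self).__name__}(area={self.area()})"

    def __lt__(self, other):
        # Operator overloading: lets shapes be compared with < and sorted by area.
        return self.area() < other.area()


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        # super() reuses the parent constructor instead of duplicating it.
        super().__init__(side, side)


shapes = sorted([Rectangle(2, 5), Square(3)])
print(shapes)
```

Calling Shape() directly raises TypeError because area() is abstract, which is how the abc module enforces an interface.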
Session 5: Python Module and Package Management
● Creating and Using Custom Modules
● Structuring Python Packages
● Using __init__.py and Package Imports
● Managing Dependencies with requirements.txt and pyproject.toml
● Virtual Environments: venv, pipenv, virtualenv
SOLID Principles:
● Single Responsibility Principle (SRP)
● Open/Closed Principle (OCP)
● Liskov Substitution Principle (LSP)
● Interface Segregation Principle (ISP)
● Dependency Inversion Principle (DIP)
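As one possible illustration of the Dependency Inversion Principle, the sketch below has a high-level class depend on an abstraction rather than on a concrete implementation. Notifier, InMemoryNotifier, and OrderService are hypothetical names invented for this example.

```python
from typing import Protocol


class Notifier(Protocol):
    """Abstraction the high-level code depends on."""

    def send(self, message: str) -> None:
        ...


class InMemoryNotifier:
    """Concrete implementation that stores messages in a list (useful in tests)."""

    def __init__(self):
        self.sent = []

    def send(self, message: str) -> None:
        self.sent.append(message)


class OrderService:
    """High-level policy: knows only the Notifier abstraction, not a concrete class."""

    def __init__(self, notifier: Notifier):
        self._notifier = notifier

    def place_order(self, item: str) -> None:
        self._notifier.send(f"Order placed: {item}")


notifier = InMemoryNotifier()
OrderService(notifier).place_order("book")
print(notifier.sent)
```

Because OrderService receives its notifier from outside, an email or SMS implementation could be swapped in without modifying OrderService, which also touches on the Open/Closed Principle.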
Session 6: Working with APIs
● Introduction to REST APIs
● Flask and FastAPI Frameworks Overview
● Building a Simple REST API with Flask
● Parsing JSON and XML Responses
● Authentication and Token Management in APIs (e.g., JWT)
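Parsing JSON and XML responses can be sketched with the standard library alone. The response bodies below are hypothetical strings standing in for what an API might return; no network call is made.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical response bodies for illustration only.
json_body = '{"user": {"id": 7, "name": "Asha"}, "active": true}'
xml_body = "<users><user id='7'><name>Asha</name></user></users>"

# json.loads turns a JSON string into nested dicts, lists, and scalars.
data = json.loads(json_body)
user_name_from_json = data["user"]["name"]
is_active = data["active"]  # JSON true becomes Python True

# ElementTree parses XML into a tree of elements with attributes and text.
root = ET.fromstring(xml_body)
user = root.find("user")
user_id_from_xml = int(user.get("id"))     # attributes are always strings
user_name_from_xml = user.findtext("name")

print(user_name_from_json, is_active, user_id_from_xml, user_name_from_xml)
```

JSON carries types such as numbers and booleans, while XML attribute and text values arrive as strings and need explicit conversion.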
Session 7: Database Integration with Python
● Introduction to SQL and NoSQL Databases
● Connecting to Databases using sqlite3 and MySQL Connector/Python
● Performing CRUD Operations
● SQLAlchemy for Object-Relational Mapping (ORM)
● Capstone project - 3
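The CRUD operations listed above can be sketched with the built-in sqlite3 module and an in-memory database. The students table and its rows are hypothetical sample data.

```python
import sqlite3

# ":memory:" creates a temporary database that disappears when the connection closes.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, grade INTEGER)"
)

# Create: parameter placeholders (?) keep values separate from the SQL text.
conn.executemany(
    "INSERT INTO students (name, grade) VALUES (?, ?)",
    [("Asha", 82), ("Ben", 67)],
)

# Update
conn.execute("UPDATE students SET grade = ? WHERE name = ?", (70, "Ben"))

# Delete
conn.execute("DELETE FROM students WHERE name = ?", ("Asha",))
conn.commit()

# Read
rows = conn.execute("SELECT name, grade FROM students ORDER BY id").fetchall()
conn.close()
print(rows)
```

Using placeholders instead of string formatting is the usual way to avoid SQL injection when values come from user input.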
Session 8: Data Science with Python - Focus on Libraries
● NumPy:
○ NumPy array creation, indexing, and slicing
○ Vectorized operations for speed improvements
○ Broadcasting and reshaping arrays
● Pandas:
○ Efficient data cleaning and transformation with Pandas
○ Handling large datasets with optimised operations
○ Aggregation and pivot tables
● Matplotlib and Seaborn:
○ Customising plots for data visualisation
○ Plotting statistical data and distributions
○ Advanced Seaborn visualisations (pairplots, heatmaps)
● Capstone project - 4
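A minimal sketch of the NumPy topics in this session (array creation, vectorized operations, broadcasting, and reshaping), assuming NumPy is installed. The small array is hypothetical sample data.

```python
import numpy as np

# Hypothetical 2 x 3 table: rows are samples, columns are features.
data = np.array([[1.0, 2.0, 3.0],
                 [3.0, 6.0, 9.0]])

# Vectorized reduction along axis 0 gives one mean per column, shape (3,).
col_means = data.mean(axis=0)

# Broadcasting: the (3,) vector is stretched across both rows of the (2, 3) array.
centered = data - col_means

# Reshaping: -1 lets NumPy infer the length, flattening to shape (6,).
flat = centered.reshape(-1)

print(col_means)
print(centered)
print(flat.shape)
```

The subtraction runs without an explicit Python loop, which is where the speed improvement of vectorized operations comes from.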
Session 9: Machine Learning with Python (scikit-learn)
● Introduction to Machine Learning:
○ Supervised vs Unsupervised Learning
○ Overview of the scikit-learn Library
○ Loading and Preparing Datasets
● Supervised Learning:
○ Linear and Logistic Regression
○ Decision Trees and Random Forests
● Unsupervised Learning:
○ Clustering (K-Means)
○ Dimensionality Reduction (PCA)
● Model Evaluation:
○ Train-Test Split
○ Cross-Validation
○ Model Accuracy and Performance Metrics (Precision, Recall, F1-Score)
● Capstone project - 5
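To make the evaluation metrics concrete, the sketch below computes precision, recall, and F1-score in plain Python from their definitions. It deliberately does not use scikit-learn, which provides equivalent functions in sklearn.metrics; the helper name precision_recall_f1 and the label lists are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical labels: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Here there are 2 true positives, 1 false positive, and 1 false negative, so precision and recall are both 2/3 and F1, their harmonic mean, is also 2/3.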
---------- END OF PROGRAM ----------