Advanced Python for Data Science – Syllabus
Course Objective:
To equip learners with advanced Python skills, including efficient data processing, working with libraries, advanced programming
techniques, and practical data manipulation needed in data science.
❖ Unit-wise Python
❖ Project Ideas
Unit-wise Syllabus
Unit 1: Pythonic Programming & Best Practices
• Writing idiomatic Python code (PEP 8, Zen of Python)
• Comprehensions:
▪ List, Set, Dictionary Comprehensions
• Generators and Iterators
▪ yield, generator functions, memory efficiency
• Lambda, map(), filter(), reduce()
• enumerate, zip, any, all, sorted with key functions
• Using *args, **kwargs
• with context managers and custom context managers

Unit 2: Object-Oriented Programming in Depth
• Advanced class features:
▪ Class and static methods
▪ Properties and descriptors
▪ Inheritance, method overriding
• Special methods (__str__, __repr__, __eq__, etc.)
• Operator overloading
• Abstract Base Classes
• Encapsulation, polymorphism, composition

Unit 3: File and Directory Handling
• Working with file paths using os, pathlib
• Reading/Writing CSV, JSON, and TSV files
• Handling large files
• Directory traversal and file search
• shutil for file operations

Unit 4: Error Handling & Debugging
• Custom exception classes
• Exception chaining
• Logging: logging module
▪ Levels, handlers, log files
• Debugging techniques using pdb
• Using assertions and warnings

Unit 5: Working with Dates and Times
• datetime, dateutil, calendar, and time
• Parsing and formatting dates
• Time arithmetic and differences
• Time zones and localization
• Working with timestamps

Unit 6: Functional Programming & Decorators
• Higher-order functions
• Closures
• Decorator basics and custom decorators
• Chaining decorators
• Memoization and caching with functools

Unit 7: Data Handling with NumPy
• NumPy arrays vs Python lists
• Array creation, indexing, slicing
• Array operations and broadcasting
• Boolean indexing
• Aggregations, sorting, reshaping
• Reading/writing .npy, .npz, .txt files
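
As a taste of the Unit 7 topics, the sketch below combines an aggregation, broadcasting, and boolean indexing on one small array. The array values and variable names are made up for illustration and are not part of the course material.

```python
import numpy as np

# Hypothetical data: rows are samples, columns are features.
data = np.array([[1.0, 10.0],
                 [2.0, 20.0],
                 [3.0, 30.0]])

# Aggregation along axis 0 gives one mean per column, shape (2,).
col_means = data.mean(axis=0)

# Broadcasting: the (2,) vector is applied to each of the three rows.
centered = data - col_means

# Boolean indexing: keep rows whose first feature exceeds its column mean.
above = data[data[:, 0] > col_means[0]]

print(col_means)
print(centered)
print(above)
```

No Python loop is needed for any of the three steps, which is the main practical difference from working with plain lists.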
Unit 8: Data Manipulation with Pandas
• Series and DataFrame creation
• Indexing and slicing
• Handling missing values
• Filtering, sorting, and selection
• GroupBy operations and aggregation
• Merging and joining datasets
• Reading/writing CSV, Excel, JSON
• Pivot tables and cross-tabulation

Unit 9: Regular Expressions
• Introduction to re module
• Pattern matching with match(), search(), findall(), sub()
• Greedy vs non-greedy matching
• Named groups and compiling patterns
• Use cases in data cleaning (e.g., extracting emails, phone numbers)

Unit 10: Working with APIs & Web Data
• Using requests to fetch web data
• Parsing JSON responses
• Error handling in APIs
• Introduction to REST APIs
• Using APIs for data collection (e.g., GitHub, OpenWeather)

Unit 11: Web Scraping Basics
• HTML structure overview
• Using BeautifulSoup for parsing
• Navigating and extracting HTML elements
• Handling pagination
• Writing scraped data to files

Unit 12: Data Serialization & Storage
• JSON and Pickle formats
• pickle module (saving Python objects)
• Working with SQLite (sqlite3)
• Connecting Python with relational DBs (basic queries)

Unit 13: Multiprocessing and Performance
• Difference between threading and multiprocessing
• Using concurrent.futures
• ThreadPoolExecutor and ProcessPoolExecutor
• Parallel data processing basics
• Profiling Python code: cProfile, timeit

Unit 14: Intro to Data Visualization
• Using matplotlib for basic plotting
• Line, bar, scatter, histogram, boxplot
• Customizing plots (titles, labels, legends)
• Introduction to seaborn for statistical visualization
• Saving figures

Unit 15: Mini Projects
1. Calculator App (CLI)
2. Build a CLI or Jupyter-based Data Cleaning Tool
3. Automate Web Scraping and Store Data in CSV
4. Data Summarization Dashboard using Pandas
5. Explore APIs (e.g., COVID or GitHub) and store insights.
6. Parse log files and generate reports.
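
Mini project 6 (parsing log files and generating reports) could start from a sketch like the one below. It uses a compiled pattern with named groups, as covered in Unit 9. The log line format, the sample lines, and the summarize() helper are assumptions made for illustration; a real project would adapt the pattern to its own logs and read lines from a file.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "<date> <time> <LEVEL> <message>"
LINE_RE = re.compile(
    r"^(?P<date>\S+) (?P<time>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$"
)

def summarize(lines):
    """Count log entries per level, skipping lines that do not match."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[m.group("level")] += 1
    return counts

# Made-up sample input standing in for the contents of a log file.
sample = [
    "2024-01-01 10:00:00 INFO service started",
    "2024-01-01 10:00:05 ERROR disk full",
    "2024-01-01 10:00:09 INFO retrying",
    "not a log line",
]

report = summarize(sample)
for level, n in sorted(report.items()):
    print(f"{level}: {n}")
```

Because summarize() accepts any iterable of strings, an open file object can be passed in directly, which reads the log one line at a time instead of loading it all into memory.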