Syllabus Programming For Data Science - AIML
Note: The examiner will set nine questions in total. Question one is compulsory and has six parts of
2.5 marks each, drawn from all units. The remaining eight questions carry 15 marks each, with two
questions set from each unit. Students must attempt five questions in total: the compulsory first
question and one question from each unit.
Unit 1
Overview of Python Programming Concepts: The concept of data types; variables, assignments; numerical
types; operators and expressions; Control Structures; String manipulations; File Handling – creating,
reading/writing text/number files; Dictionaries; Functions; OOPs Concepts
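A minimal sketch of how several Unit 1 topics (functions, file handling for number files, dictionaries) might fit together. The file name marks.txt, the sample values and the helper names are illustrative assumptions, not part of the prescribed syllabus.

```python
import os
import tempfile


def write_numbers(path, numbers):
    """Write one number per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for value in numbers:
            handle.write(f"{value}\n")


def read_numbers(path):
    """Read the file back, converting each non-empty line to a float."""
    with open(path, encoding="utf-8") as handle:
        return [float(line) for line in handle if line.strip()]


def summarize(numbers):
    """Collect basic statistics in a dictionary."""
    return {
        "count": len(numbers),
        "total": sum(numbers),
        "maximum": max(numbers),
    }


# Use a temporary folder so the example leaves no files behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "marks.txt")  # hypothetical file name
    write_numbers(path, [2.5, 15, 7])         # hypothetical sample data
    summary = summarize(read_numbers(path))

print(summary)
```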
Unit 2
Introduction to NumPy - Creation of Arrays, Array generation from a Uniform distribution, Random array
generation, Reshaping, Maximum and minimum, Arithmetic operations, Mathematical functions,
Bracket Indexing and Selection, Broadcasting, Indexing a 2D array (matrices);
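The NumPy topics listed above could be illustrated roughly as follows. This is a sketch only: the seed, the array shapes and the variable names are arbitrary choices made for illustration.

```python
import numpy as np

# Random array generation from a uniform distribution on [0, 1).
rng = np.random.default_rng(seed=0)       # seed chosen arbitrarily
uniform = rng.uniform(low=0.0, high=1.0, size=6)

# Array creation and reshaping: six values arranged as 2 rows x 3 columns.
grid = np.arange(6).reshape(2, 3)         # [[0, 1, 2], [3, 4, 5]]

# Maximum and minimum over the whole array.
largest = grid.max()
smallest = grid.min()

# Broadcasting: the length-3 vector is added to every row of the 2x3 array.
row_offsets = np.array([10, 20, 30])
shifted = grid + row_offsets              # [[10, 21, 32], [13, 24, 35]]

# Bracket indexing on a 2D array: one element, then a whole column.
element = grid[1, 2]                      # row 1, column 2
first_column = grid[:, 0]                 # every row, column 0

# Mathematical functions apply element by element.
roots = np.sqrt(grid)
```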
Data Manipulation with Pandas - Creating a Series from lists, arrays and dictionaries, Storing data in series
from intrinsic sources, Creating Data Frames, Imputation, Grouping and aggregation, Merging, Joining,
Concatenation, Finding or Checking for Null Values, Reading data from CSV, TXT, Excel and web sources.
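A short sketch of the Pandas operations named above, using a small made-up table. The column names, the values and the choice of mean imputation are assumptions for illustration; a course may well use other data and other imputation strategies.

```python
import numpy as np
import pandas as pd

# Creating a Series from a list and from a dictionary.
from_list = pd.Series([10, 20, 30])
from_dict = pd.Series({"a": 1.0, "b": 2.0})

# Creating a DataFrame with one missing value (hypothetical data).
frame = pd.DataFrame({
    "unit": ["U1", "U1", "U2", "U2"],
    "marks": [12.0, np.nan, 9.0, 15.0],
})

# Checking for null values: count of missing entries per column.
missing_per_column = frame.isnull().sum()

# Imputation: replace the missing mark with the column mean.
filled = frame.fillna({"marks": frame["marks"].mean()})

# Grouping and aggregation: mean mark per unit.
mean_by_unit = filled.groupby("unit")["marks"].mean()

# Merging with a second table on the shared "unit" column.
titles = pd.DataFrame({
    "unit": ["U1", "U2"],
    "title": ["Python", "NumPy and Pandas"],
})
merged = filled.merge(titles, on="unit", how="left")

# Concatenation: stack two frames vertically and renumber the rows.
stacked = pd.concat([filled, filled], ignore_index=True)
```

Reading from files follows the same pattern with pd.read_csv and pd.read_excel, which are omitted here because they need an input file.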
Unit 3
Introduction to Visualization - Installing and setting up visualization libraries, Canvas and Axes, Subplots,
Common plots – scatter, histogram, boxplot, Logarithmic scale, Placement of ticks and custom tick labels,
Pandas Viz, Style Sheets, Plot type, Area, Barplots, Histograms, Line Plots, Scatter Plots, BoxPlots,
Hexagonal Bin Plot, Kernel Density Estimation plot (KDE), Distribution Plots, Categorical Data Plots,
Combining Categorical Plots, Matrix Plots, Regression Plots, Grids; Python Visualizations toolkits/libraries.
Unit 4
Introduction to Machine Learning with Scikit-Learn & PyTorch - Data Representation and basic functions,
Estimator, parameters & model validation, Model Selection, Curve, Grid search, Feature engineering, Naive
Bayes Classification, Linear regression, SVM, etc.; Overview of other Python ML/Deep Learning
toolkits/Libraries.
Introduction to NLP with NLTK and its functions and modules, such as part-of-speech tagging, tokenization,
parsing, segmentation, recognition, cleaning & normalization of text, etc.; Overview of other Python NLP
toolkits/Libraries.
Course outcomes
Understand and implement the basics of programming in Python.
Apply the Numpy package for numerical calculations in Python.
Apply the Pandas package for loading and preprocessing data in Python.
Implement various data visualization tools of Python on real world data.
Understand and implement the Machine Learning Concepts in Python.
Textbooks:
1. Charles Dierbach, Introduction to Computer Science Using Python, Wiley Publications, Second
Edition, 2015
2. Mark Lutz, Learning Python, O’Reilly Publications, Fifth Edition, 2015
3. Jake VanderPlas, Python Data Science Handbook, O’Reilly, 2016
Reference Books:
Paul Barry, Head First Python, O’Reilly Publications, Second Edition, 2010
Reference Websites: (NPTEL, SWAYAM, Coursera, edX, Udemy, official documentation web links)
https://swayam.gov.in/nd1_noc19_cs59/preview
https://www.python.org/
https://www.datacamp.com/