Syllabus Programming For Data Science - AIML

Programming for Data Science & AIML

Course code          PCC-CSE-250G
Category             Professional Core Course
Course title         Programming for Data Science & AIML
Scheme and Credits   L  T  P  Credits
                     3  0  0  3
Class work           25 Marks
Exam                 75 Marks
Total                100 Marks
Duration of Exam     03 Hours

Note: The examiner will set nine questions in total. Question one is compulsory and has 6 parts of 2.5 marks
each, drawn from all units. The remaining eight questions carry 15 marks each, with two questions set from
each unit. Students must attempt five questions in total: the compulsory first question and one question
from each unit (15 + 4 x 15 = 75 marks).

Objectives of the course:

• To impart the basic concepts of Python programming.
• To understand the concepts and usage of the NumPy and Pandas packages for numerical data calculations in
Python.
• To understand the concepts and applications of various Python data visualization tools on real-world
data.
• To understand and implement Machine Learning concepts in Python.

Unit 1
Overview of Python Programming Concepts: The concept of data types; variables, assignments; numerical
types; operators and expressions; Control Structures; String manipulations; File Handling – creating,
reading/writing text/number files; Dictionaries; Functions; OOPs Concepts
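
As an illustration of how several Unit 1 topics fit together, the following is a minimal sketch using only the Python standard library. The function names, the class name, and the file name marks.txt are made up for this example and are not part of the syllabus. It writes numbers to a text file, reads them back, counts words with a dictionary, and wraps a running mean in a small class.

```python
import os
import tempfile


def write_numbers(path, numbers):
    """Write one number per line to a text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for number in numbers:
            handle.write(f"{number}\n")


def read_numbers(path):
    """Read the file back, converting each non-empty line to a float."""
    with open(path, encoding="utf-8") as handle:
        return [float(line) for line in handle if line.strip()]


def word_counts(text):
    """Return a dictionary mapping each lowercase word to its count."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


class RunningMean:
    """A small class (hypothetical name) that accumulates a mean."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def add(self, value):
        self.count += 1
        self.total += value

    def mean(self):
        if self.count == 0:
            raise ValueError("no values added yet")
        return self.total / self.count


# Use a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "marks.txt")
    write_numbers(path, [2.5, 15, 7.5])
    values = read_numbers(path)

tracker = RunningMean()
for value in values:
    tracker.add(value)

print(values)
print(tracker.mean())
print(word_counts("Data science needs data"))
```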

Unit 2
Introduction to NumPy - Creation of Arrays, Array generation from Uniform distribution, Random array
generation, Reshaping, Maximum and minimum, Arithmetic operations, Mathematical functions,
Bracket Indexing and Selection, Broadcasting, Indexing a 2D array (matrices).
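
The NumPy topics listed above can be sketched in a few lines. This is only an illustrative example, assuming NumPy is installed; the variable names and the seed value are arbitrary choices made here.

```python
import numpy as np

# Random array drawn from a uniform distribution on [0, 1).
rng = np.random.default_rng(seed=0)
uniform = rng.uniform(low=0.0, high=1.0, size=6)

# Array creation and reshaping: 12 integers arranged as 3 rows x 4 columns.
matrix = np.arange(12).reshape(3, 4)

# Maximum, minimum, and a mathematical function applied element-wise.
largest = matrix.max()
smallest = matrix.min()
roots = np.sqrt(matrix)

# Broadcasting: the (4,) vector of column means is subtracted from
# every row of the (3, 4) matrix.
column_means = matrix.mean(axis=0)
centered = matrix - column_means

# Bracket indexing and selection on a 2D array.
element = matrix[1, 2]      # row 1, column 2
block = matrix[:2, 1:3]     # first two rows, columns 1 and 2
selected = matrix[matrix > 8]  # boolean mask selection

print(uniform)
print(centered)
print(element, block, selected)
```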

Data Manipulation with Pandas - Creating a Series from lists, arrays and dictionaries, Storing data in series
from intrinsic sources, Creating Data Frames, Imputation, Grouping and aggregation, Merging, Joining,
Concatenation, Checking for Null Values, Reading data from csv, txt, excel, web.
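
A short sketch of the Pandas topics above, assuming pandas and NumPy are installed. The student names, sections, marks, and room numbers are made-up illustrative data, not taken from any real dataset.

```python
import numpy as np
import pandas as pd

# Creating a Series from a list and from a dictionary.
from_list = pd.Series([10, 20, 30], index=["a", "b", "c"])
from_dict = pd.Series({"a": 1.5, "b": 2.5})

# Creating a DataFrame with one missing value (made-up data).
scores = pd.DataFrame({
    "student": ["Asha", "Ben", "Chen", "Dev"],
    "section": ["A", "A", "B", "B"],
    "marks": [72.0, np.nan, 88.0, 64.0],
})

# Checking for null values.
missing = int(scores["marks"].isnull().sum())

# Imputation: fill the missing mark with the mean of the observed marks.
imputed = scores.assign(marks=scores["marks"].fillna(scores["marks"].mean()))

# Grouping and aggregation: mean mark per section.
section_mean = imputed.groupby("section")["marks"].mean()

# Merging with a second table on the shared "section" column.
rooms = pd.DataFrame({"section": ["A", "B"], "room": ["R101", "R202"]})
merged = imputed.merge(rooms, on="section", how="left")

# Concatenation: stack two frames with the same columns.
stacked = pd.concat([imputed, imputed], ignore_index=True)

print(missing)
print(section_mean)
print(merged)
```

Reading from files follows the same pattern with pd.read_csv and pd.read_excel, which take a path and return a DataFrame.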

Unit 3
Introduction to Visualization - Installing and setting up visualization libraries, Canvas and Axes, Subplots,
Common plots – scatter, histogram, boxplot, Logarithmic scale, Placement of ticks and custom tick labels,
Pandas Viz, Style Sheets, Plot type, Area, Barplots, Histograms, Line Plots, Scatter Plots, BoxPlots,
Hexagonal Bin Plot, Kernel Density Estimation plot (KDE), Distribution Plots, Categorical Data Plots,
Combining Categorical Plots, Matrix Plots, Regression Plots, Grids; Python Visualizations toolkits/libraries.

Unit 4
Introduction to Machine Learning with scikit-learn & PyTorch - Data Representation and basic functions,
Estimator, parameters & model validation, Model Selection, Curve, Grid search, Feature engineering, Naive
Bayes Classification, Linear regression, SVM, etc.; Overview of other Python ML/Deep Learning
toolkits/libraries.

Introduction to NLP with NLTK and its functions and modules, such as part-of-speech tagging, tokenization,
parsing, segmentation, recognition, cleaning & normalization of text, etc.; Overview of other Python NLP
toolkits/libraries.

Course outcomes
• Understand and implement the basics of programming in Python.
• Apply the NumPy package for numerical calculations in Python.
• Apply the Pandas package for loading and preprocessing data in Python.
• Implement various Python data visualization tools on real-world data.
• Understand and implement Machine Learning concepts in Python.

Textbooks:
1. Charles Dierbach, Introduction to Computer Science Using Python, Wiley Publications, Second
Edition, 2015
2. Mark Lutz, Learning Python, O’Reilly Publications, Fifth Edition, 2015
3. Jake VanderPlas, Python Data Science Handbook, O’Reilly, 2016

Reference Books:
Paul Barry, Head First Python, O’Reilly Publications, Second Edition, 2010

Reference Websites: (nptel, swayam, coursera, edx, udemy, official documentation weblink)

https://swayam.gov.in/nd1_noc19_cs59/preview

https://www.python.org/

https://www.datacamp.com/
