Data Science Hons Syllabus and Structure V8
D.K.T.E. Society’s
TEXTILE & ENGINEERING INSTITUTE
(An Autonomous Institute)
Rajwada, Ichalkaranji – 416115.
Syllabus
of
Data Science (Honors)
(With effect from June 2020)
Syllabus Structure
Sr.  Course  Course Title            Category  L  T  P  Contact   Credits  Theory    Theory     Theory  Practical  Practical  Total
No.  Code                                               Hrs./wk.           CIE SE-I  CIE SE-II  SEE     CIE        SEE
1    CSL701  Basic Statistics        BSC       2  1  -  3         3        25        25         50      -          -          100
     Total                                     2  1  -  3         3        25        25         50      -          -          100
L: Lecture, T: Tutorial, P: Practical
SE-I: Semester Examination-I, SE-II: Semester Examination-II
CIE: Continuous In Semester Evaluation, SEE: Semester End Examination
Course Category  HSMC  BSC  ESC  PCC  PEC  OEC  MC  PST
Credits          --    3    --   --   --   --   --  --
Cumulative Sum   --    --   --   --   --   --   --  --
HSMC: Humanities, Social Science & Management Course; BSC: Basic Science Course; ESC: Engineering Science Course;
PCC: Professional Core Courses; PEC: Professional Elective Courses; OEC: Open Elective Courses; MC: Mandatory Courses;
PST: Project / Seminar / Ind. Training
Text Books:
--
Reference Books:
--
Useful Links:
1. https://www.coursera.org/learn/basic-statistics
Teaching and evaluation Scheme for year 2020-21
Third Year B. Tech. (Semester – V) In Data Science Honors for Computer Science and
Engineering, Electronics, Electronics and Telecommunication
Sr.  Course  Course Title            Category  L  T  P  Contact   Credits  Theory    Theory     Theory  Practical  Practical  Total
No.  Code                                               Hrs./wk.           CIE SE-I  CIE SE-II  SEE     CIE        SEE
1    CSL702  Exploratory Data        PCC       3  -  -  3         3        25        25         50      -          -          100
             Analysis and Feature
             Engineering
2    CSP703  Introduction to Data    PCC       2  -  2  4         3        -         -          -       50         50         100
             Science in Python
     Total                                     5  -  2  7         6        25        25         50      50         50         200
L: Lecture, T: Tutorial, P: Practical
SE-I: Semester Examination-I, SE-II: Semester Examination-II
CIE: Continuous In Semester Evaluation, SEE: Semester End Examination
Course Category  HSMC  BSC  ESC  PCC  PEC  OEC  MC  PST
Credits          --    03   --   --   --   --   --  --
Cumulative Sum   --    --   --   06   --   --   --  --
HSMC: Humanities, Social Science & Management Course; BSC: Basic Science Course; ESC: Engineering Science Course;
PCC: Professional Core Courses; PEC: Professional Elective Courses; OEC: Open Elective Courses; MC: Mandatory Courses;
PST: Project / Seminar / Ind. Training
Textbooks:
1. Suresh Kumar Mukhiya, Usman Ahmed, “Hands-On Exploratory Data Analysis with Python”, Packt
Publishing, ISBN 978-1-78953-725-3
2. Sinan Ozdemir, Divya Susarla, “Feature Engineering Made Easy”, Packt Publishing,
ISBN 978-1-78728-760-0
3. Howard J. Seltman, “Experimental Design and Analysis”,
http://www.stat.cmu.edu/~hseltman/309/Book/Book.pdf
4. Max Kuhn, Kjell Johnson, “Feature Engineering and Selection: A Practical Approach for Predictive
Models”, 1st Edition, Chapman & Hall/CRC Data Science Series, ISBN-13 978-1-138-07922-9
Reference Books:
1. John W. Tukey, “Exploratory Data Analysis”, 1st Edition, Pearson Education, ISBN 0134995457,
9780134995458
Useful Links:
1. https://www.coursera.org/learn/exploratory-data-analysis
2. https://www.kaggle.com/pavansanagapati/a-simple-tutorial-on-exploratory-data-analysis
3. https://www.kaggle.com/learn/feature-engineering
4. https://machinelearningmastery.com/discover-feature-engineering-how-to-engineer-features-and-how-to-get-good-at-it/
Third Year B. Tech. (Semester – V)
CSP703: Introduction to Data Science in Python
Lab Scheme:
Practical: 02 Hrs./Week
Credits: 01
Evaluation Scheme:
CIE: 50 Marks
SEE: 50 Marks
Course Outcomes:
On completion of the course, the student will be able to:
Understand techniques such as lambdas and manipulating comma-separated values (CSV) files
Describe common Python functionality and features used for data science
Query DataFrame structures for cleaning and processing
Explain distributions, sampling, and t-tests
UNIT-I Fundamentals of Data Manipulation with Python 06 Hours
Python Functions; Python Types and Sequences; More on Strings; Demonstration: Reading and Writing
CSV Files; Python Dates and Times; Advanced Python Objects and map(); Advanced Python Lambda and
List Comprehensions; Numerical Python Library (NumPy); Manipulating Text with Regular Expressions
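Illustrative sketch (not part of the prescribed syllabus): the Unit I topics of CSV handling, map() with a lambda, list comprehensions, and regular expressions can be combined in a few lines. The example uses only the Python standard library; the sample records, column names, and e-mail domains are invented for illustration.

```python
import csv
import io
import re

# Hypothetical sample data; an in-memory buffer stands in for a CSV file on disk.
raw = (
    "name,email,score\n"
    "Asha,asha@example.com,82\n"
    "Ravi,ravi@example.org,67\n"
    "Meera,meera@example.com,91\n"
)

# Read the CSV rows into a list of dictionaries keyed by the header row.
rows = list(csv.DictReader(io.StringIO(raw)))

# map() with a lambda: convert the score column from text to int.
scores = list(map(lambda r: int(r["score"]), rows))

# List comprehension: keep the names of students scoring above 80.
high = [r["name"] for r in rows if int(r["score"]) > 80]

# Regular expression: capture the domain part of each e-mail address.
domain_re = re.compile(r"@([\w.-]+)$")
domains = [domain_re.search(r["email"]).group(1) for r in rows]

# Write a derived two-column table back out in CSV form.
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerow(["name", "domain"])
writer.writerows(zip((r["name"] for r in rows), domains))
csv_text = out.getvalue()
```

To work with a real file, the `io.StringIO` buffers would be replaced by `open(path, newline="")` handles, where `path` is whatever file the exercise uses.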
List of Experiments
(It should consist of 10-12 experiments based on the following topics.)
1 Write a Python program to demonstrate array creation techniques.
2 Write a Python program to demonstrate indexing in a NumPy array.
3 Write Python functions to find the maximum, the sum, and the average of the numbers in a
NumPy array.
4 Write a Python program to demonstrate basic operations on a single array and on multiple arrays.
5 Write a Python program to demonstrate unary and binary operators in NumPy.
6 Write a Python program to demonstrate lambda expressions.
7 Write a Python program to import data from a comma-separated values (CSV) file, manipulate
the data, and export it to a CSV file.
8 Write a Python program to demonstrate string manipulation and regular expressions.
9 Write a Python program to demonstrate viewing/inspecting data, selection, data cleaning,
filtering, sorting, groupby, join/combine, and statistics on a DataFrame.
10 Write a Python program to demonstrate filtering data stored in a DataFrame (single-condition
filtering, multiple-condition filtering).
11 Write a Python program to get a list of the column headers from a Pandas DataFrame, delete
DataFrame columns by name or index, and add a new column to an existing DataFrame.
12 Write a Python program to demonstrate cleaning and processing data in a DataFrame.
13 Write a Python program to visualize data using the Matplotlib or Seaborn data visualization
library.
14 Write a Python program to demonstrate the one-sample t-test, two-sample t-test, and
paired-sample t-test.
15 Write a Python program to demonstrate Analysis of Variance (ANOVA).
16 Write a program to generate a normally distributed random variable, a binomially
distributed random variable, and a Bernoulli-distributed random variable.
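Illustrative sketch for experiments 14 and 16 (an assumption about how they might be approached, not a prescribed solution): the t statistics are computed from their textbook formulas using only the standard library, and the two-sample case uses the Welch (unequal-variance) form. Function names, the seed, and the distribution parameters are chosen for illustration. In the lab, `scipy.stats` functions such as `ttest_1samp`, `ttest_ind`, and `ttest_rel` would also give p-values.

```python
import math
import random
import statistics


def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation, n - 1 denominator
    return (mean - mu0) / (sd / math.sqrt(n)), n - 1


def paired_t(before, after):
    """Paired t-test: a one-sample test on the pairwise differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)


def welch_t(x, y):
    """Two-sample t statistic without assuming equal variances (Welch form)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se


# Random variates from three distributions, with a fixed seed for repeatability.
rng = random.Random(42)
normal_draws = [rng.gauss(50.0, 5.0) for _ in range(1000)]         # Normal(50, 5)
bernoulli_draws = [1 if rng.random() < 0.3 else 0 for _ in range(1000)]  # Bernoulli(0.3)


def binomial(n, p):
    """One Binomial(n, p) draw as the number of successes in n Bernoulli(p) trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


binomial_draws = [binomial(10, 0.3) for _ in range(1000)]           # Binomial(10, 0.3)
```

A t statistic on its own is not a test decision; it still has to be compared against the t distribution with the stated degrees of freedom.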
Teaching and evaluation Scheme for year 2020-21
Third Year B. Tech. (Semester – VI) In Data Science Honors for Computer Science and
Engineering, Electronics, Electronics and Telecommunication
Sr.  Course  Course Title            Category  L  T  P  Contact   Credits  Theory    Theory     Theory  Practical  Practical  Total
No.  Code                                               Hrs./wk.           CIE SE-I  CIE SE-II  SEE     CIE        SEE
1    CSL704  Big Data Analytics      PCC       3  -  -  3         3        25        25         50      -          -          100
2    CSL705  Applied Text Mining     PCC       2  -  2  4         3        -         -          -       50         50         100
             in Python
     Total                                     5  -  2  7         6        25        25         50      50         50         200
L: Lecture, T: Tutorial, P: Practical
SE-I: Semester Examination-I, SE-II: Semester Examination-II
CIE: Continuous In Semester Evaluation, SEE: Semester End Examination
Course Category  HSMC  BSC  ESC  PCC  PEC  OEC  MC  PST
Credits          --    --   --   06   --   --   --  --
Cumulative Sum   --    03   --   06   --   --   --  --
HSMC: Humanities, Social Science & Management Course; BSC: Basic Science Course; ESC: Engineering Science Course;
PCC: Professional Core Courses; PEC: Professional Elective Courses; OEC: Open Elective Courses; MC: Mandatory Courses;
PST: Project / Seminar / Ind. Training
Reference Books:
1. Seema Acharya, Subhasini Chellappan, “Big Data Analytics”, Wiley.
2. Chris Eaton, Dirk deRoos et al., “Understanding Big Data”, McGraw Hill.
3. G. James, D. Witten, T. Hastie, R. Tibshirani, “An Introduction to Statistical Learning: with
Applications in R”, Springer.
4. Douglas Eadline, “Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the
Apache Hadoop 2 Ecosystem”, Pearson Education.
5. E. Capriolo, D. Wampler, J. Rutherglen, “Programming Hive”, O’Reilly.
6. Lars George, “HBase: The Definitive Guide”, O’Reilly.
7. Alan Gates, “Programming Pig”, O’Reilly.
Useful Links:
1. Analytics Vidhya (http://www.analyticsvidhya.com/)
2. Dataversity (http://www.dataversity.net/)
3. R Bloggers (http://www.r-bloggers.com/)
4. SmartData Collective (http://www.smartdatacollective.com/)
5. Data Science Central (http://www.datasciencecentral.com/)
6. Planet Big Data (http://planetbigdata.com/)
Third Year B. Tech. (Semester – VI)
CSP705: Applied Text Mining in Python
Lab Scheme:
Lecture: 01 Hrs./Week
Practical: 02 Hrs./Week
Credits: 03
Evaluation Scheme:
CIE: 50 Marks
SEE: 50 Marks
Course Outcomes:
On completion of the course, the student will be able to:
Understand how text is handled in Python
Apply basic natural language processing methods
Write code that groups documents by topic
Describe the NLTK framework for manipulating text
Unit I Working with Text in Python 06 Hours
Introduction to Text Mining, Handling Text in Python, Regular Expressions,
Demonstration: Regex with Pandas and Named Groups, Internationalization and Issues with Non-ASCII
Characters
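Illustrative sketch (not part of the prescribed syllabus): named groups and non-ASCII handling can be tried with the standard library alone. The date pattern, sample sentence, and helper name `strip_accents` are assumptions made for this example. The same named-group pattern can be passed to pandas `Series.str.extract`, which uses the group names as column names.

```python
import re
import unicodedata

# Named groups label each captured field, so matches read like records.
date_re = re.compile(r"(?P<day>\d{1,2})[/-](?P<month>\d{1,2})[/-](?P<year>\d{4})")

text = "Visits on 04/10/2020 and 15-11-2020 were logged."
dates = [m.groupdict() for m in date_re.finditer(text)]


def strip_accents(s):
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", s)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


word = "caf\u00e9"                      # 'café' with a precomposed e-acute
ascii_like = strip_accents("caf\u00e9 na\u00efve")
encoded = word.encode("utf-8")          # the accented letter takes two bytes in UTF-8
```

Stripping accents is lossy and only suits tasks such as loose matching; for storage and display the original Unicode text should be kept.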
List of Experiments
(It should consist of 10-12 experiments based on the following topics.)
Sr.  Course  Course Title            Category  L  T  P  Contact   Credits  Theory    Theory     Theory  Practical  Practical  Total
No.  Code                                               Hrs./wk.           CIE SE-I  CIE SE-II  SEE     CIE        SEE
1    CSP706  Time Series Analysis    PCC       2  -  2  4         3        -         -          -       50         50         100
L: Lecture, T: Tutorial, P: Practical
SE-I: Semester Examination-I, SE-II: Semester Examination-II
CIE: Continuous In Semester Evaluation, SEE: Semester End Examination
Course Category  HSMC  BSC  ESC  PCC  PEC  OEC  MC  PST
Credits          --    --   --   05   --   --   --  --
Cumulative Sum   --    03   --   12   --   --   --  --
HSMC: Humanities, Social Science & Management Course; BSC: Basic Science Course; ESC: Engineering Science Course;
PCC: Professional Core Courses; PEC: Professional Elective Courses; OEC: Open Elective Courses; MC: Mandatory Courses;
PST: Project / Seminar / Ind. Training
Text Books:
--
Reference Books:
--
Useful Links:
1. https://www.coursera.org/learn/practical-time-series-analysis
2. https://onlinecourses.nptel.ac.in/noc21_ch28/preview
Final Year B. Tech. (Semester – VII)
CSP707: Capstone Project
Lab Scheme:
Practical: 02 Hrs./Week
Credits: 02
Evaluation Scheme:
CIE: 50 Marks
SEE: 50 Marks
Course Outcomes:
On completion of the course, a team of students will be able to:
Analyze the problem statement
Build the SRS and design document
Develop the code according to the design
Test the developed software
Write the report
Students will form groups for the capstone project. Each group will submit the completed project work to the
department at the end of Semester VII, as listed below.
1. The working project.
2. The project report, complete in all respects, covering the following:
i. Problem specifications
ii. System definition: requirement analysis
iii. System design: dataflow diagrams, database design
iv. System implementation: algorithm, code documentation
v. Test results and test report
vi. In case of an object-oriented approach, the appropriate process should be followed.
The CIE will be jointly assessed by a panel of teachers appointed by the Head of the Institution. The SEE will
be conducted by internal and external examiners appointed by the CoE.