Cse Ds 3 1 Sem Cs Syllabus Ug r20


R-20 Syllabus for CSE-DS, JNTUK w. e. f. 2020 – 21

JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY KAKINADA


KAKINADA – 533 003, Andhra Pradesh, India

DEPARTMENT OF CSE - DATA SCIENCE

COURSE STRUCTURE AND SYLLABUS


For UG – R20

B. Tech - COMPUTER SCIENCE AND ENGINEERING with Specialization


DATA SCIENCE
(Applicable for batches admitted from 2020-2021)


III B. Tech – I Semester
S.No  Course Code   Courses                                                   Hours per week Credits
                                                                              L    T    P    C
1     PC            Compiler Design                                           3    0    0    3
2     PC            Operating Systems                                         3    0    0    3
3     PC            Machine Learning                                          3    0    0    3
4     Open          Open Elective-I:                                          3    0    0    3
      Elective/Job  Open Electives offered by other departments /
      Oriented      Optimization in Operations Research (Job oriented course)
5     PE            Professional Elective-I:                                  3    0    0    3
                    1. Software Engineering
                    2. Object Oriented Analysis and Design
                    3. DevOps
                    4. Internet of Things
6     PC            Operating Systems & Compiler Design Lab                   0    0    3    1.5
7     PC            Machine Learning Lab                                      0    0    3    1.5
8     SO            Skill Oriented Course - III:                              0    0    4    2
                    1. Continuous Integration and Continuous Delivery
                       using DevOps OR
                    2. Helical Insight
9     MC            Employability Skills-I                                    2    0    0    0
10    PR            Summer Internship 2 Months (Mandatory) after second year  0    0    0    1.5
                    (to be evaluated during V semester)
Total credits                                                                                21.5
11    Minor         Data Warehousing and Data Mining $                        3    0    2    4
$ - Integrated Course

III B Tech I Sem
L T P C
3 0 0 3
COMPILER DESIGN

Course Objectives:
Understand the basic concepts of compiler design and its different phases, which will be helpful in
constructing new tools like LEX, YACC, etc.

Course Outcomes: At the end of the course, student will be able to
• Demonstrate phases in the design of compiler
• Organize Syntax Analysis, Top Down and LL(1) grammars
• Design Bottom Up Parsing and Construction of LR parsers
• Analyze synthesized, inherited attributes and syntax directed translation schemes
• Determine algorithms to generate code for a target machine

UNIT I:
Lexical Analysis: Language Processors, Structure of a Compiler, Lexical Analysis, The Role of
the Lexical Analyzer, Bootstrapping, Input Buffering, Specification of Tokens, Recognition of
Tokens, Lexical Analyzer Generator-LEX, Finite Automata, Regular Expressions and Finite
Automata, Design of a Lexical Analyzer Generator.

UNIT II:
Syntax Analysis: The Role of the Parser, Context-Free Grammars, Derivations, Parse Trees,
Ambiguity, Left Recursion, Left Factoring, Top Down Parsing: Pre Processing Steps of Top
Down Parsing, Backtracking, Recursive Descent Parsing, LL (1) Grammars, Non-recursive
Predictive Parsing, Error Recovery in Predictive Parsing.

UNIT III:
Bottom Up Parsing: Introduction, Difference between LR and LL Parsers, Types of LR Parsers,
Shift Reduce Parsing, SLR Parsers, Construction of SLR Parsing Tables, More Powerful LR
Parsers, Construction of CLR (1) and LALR Parsing Tables, Dangling Else Ambiguity, Error
Recovery in LR Parsing, Handling Ambiguous Grammars with LR Parsers.

UNIT III:
Syntax Directed Translation: Syntax-Directed Definitions, Evaluation Orders for
SDD’s, Applications of Syntax Directed Translation, Syntax-Directed Translation Schemes,
Implementing L-Attributed SDD’s. Intermediate Code Generation: Variants of Syntax Trees,
Three Address Code, Types and Declarations, Translation of Expressions, Type Checking, Control
Flow, Backpatching, Intermediate Code for Procedures.

UNIT IV:
Run Time Environments: Storage Organization, Run Time Storage Allocation, Activation
Records, Procedure Calls, Displays. Code Optimization: The Principal Sources of Optimization,
Basic Blocks, Optimization of Basic Blocks, Structure Preserving Transformations, Flow Graphs,
Loop Optimization, Data-Flow Analysis, Peephole Optimization.

UNIT V:
Code Generation: Issues in the Design of a Code Generator, Object Code Forms, Code
Generation Algorithm, Register Allocation and Assignment.

Text Books:
1. Compilers: Principles, Techniques and Tools, Second Edition, Alfred V. Aho, Monica S.
Lam, Ravi Sethi, Jeffrey D. Ullman, Pearson Publishers, 2007.

Reference Books:
1. Compiler Construction, Principles and Practice, Kenneth C Louden, Cengage Learning,
2006
2. Modern compiler implementation in C, Andrew W Appel, Revised edition, Cambridge
University Press.
3. Optimizing Compilers for Modern Architectures, Randy Allen, Ken Kennedy, Morgan
Kaufmann, 2001.
4. Levine, J.R., T. Mason and D. Brown, Lex and Yacc, edition, O'Reilly & Associates, 1990

III B Tech I Sem
L T P C
3 0 0 3
OPERATING SYSTEMS

Course Objectives:
The objectives of this course are to:
• Introduce the internal operation of modern operating systems
• Define and explain processes and threads, mutual exclusion, CPU scheduling, deadlock,
memory management, and file systems
• Understand File Systems in Operating Systems like UNIX/Linux and Windows
• Understand Input Output Management and the use of Device Drivers and Secondary Storage
(Disk) Mechanisms
• Analyze Security and Protection Mechanisms in Operating Systems

Course Outcomes:
After completing the course, the students should be able to:
• Describe various generations of Operating Systems and the functions of an Operating System
• Describe the concepts of program, process and thread, analyze various CPU Scheduling
Algorithms and compare their performance
• Solve Inter Process Communication problems using Mathematical Equations by various
methods
• Compare various Memory Management Schemes, especially Paging and Segmentation in
Operating Systems, and apply various Page Replacement Techniques
• Outline File Systems in Operating Systems like UNIX/Linux and Windows

UNIT I:
Operating Systems Overview: Operating system functions, Operating system structure, Operating
systems operations, Computing environments, Open-Source Operating Systems.
System Structures: Operating System Services, User and Operating-System Interface, system
calls, Types of System Calls, system programs, operating system structure, operating system
debugging, System Boot.

UNIT II:
Process Concept: Process scheduling, Operations on processes, Inter-process communication,
Communication in client server systems.
Multithreaded Programming: Multithreading models, Thread libraries, Threading issues. Process
Scheduling: Basic concepts, Scheduling criteria, Scheduling algorithms, Multiple processor
scheduling, Thread scheduling.
Inter-process Communication: Race conditions, Critical Regions, Mutual exclusion with busy
waiting, Sleep and wakeup, Semaphores, Mutexes, Monitors, Message passing, Barriers, Classical
IPC Problems - Dining philosophers problem, Readers and writers problem.

UNIT III:
Memory-Management Strategies: Introduction, Swapping, Contiguous memory allocation, Paging,
Segmentation.
Virtual Memory Management: Introduction, Demand paging, Copy-on-write, Page replacement,
Frame allocation, Thrashing, Memory-mapped files, Kernel memory allocation.

UNIT IV:
Deadlocks: Resources, Conditions for resource deadlocks, Ostrich algorithm, Deadlock detection and
recovery, Deadlock avoidance, Deadlock prevention.
File Systems: Files, Directories, File system implementation, management and optimization.
Secondary-Storage Structure: Overview of disk structure and attachment, Disk scheduling, RAID structure,
Stable storage implementation.

UNIT V:
System Protection: Goals of protection, Principles and domain of protection, Access matrix, Access control,
Revocation of access rights.
System Security: Introduction, Program threats, System and network threats, Cryptography for security,
User authentication, implementing security defenses, Firewalling to protect systems and networks,
Computer security classification.
Case Studies: Linux, Microsoft Windows.

Text Books:
1. Silberschatz A, Galvin P B, and Gagne G, Operating System Concepts, 9th edition, Wiley, 2013.
2. Tanenbaum A S, Modern Operating Systems, 3rd edition, Pearson Education, 2008. (for Inter-process
Communication and File systems.)

Reference Books:
1. Dhamdhere D M, Operating Systems A Concept Based Approach, 3rd edition, Tata McGraw-Hill,
2012.
2. Stallings W, Operating Systems -Internals and Design Principles, 6th edition, Pearson Education, 2009
3. Nutt G, Operating Systems, 3rd edition, Pearson Education, 2004.

e-Resources:
1) https://nptel.ac.in/courses/106/105/106105214/

III B Tech I Sem
L T P C
3 0 0 3
MACHINE LEARNING

Course Objectives:
• Identify problems that are amenable to solution by ANN methods, and which ML methods may be
suited to solving a given problem.
• Formalize a given problem in the language/framework of different ANN methods (e.g., as a search
problem, as a constraint satisfaction problem, as a planning problem, as a Markov decision process, etc.).

Course Outcomes: After the completion of the course, student will be able to
• Explain the fundamental usage of the concept of a Machine Learning system
• Demonstrate various regression techniques
• Analyze the Ensemble Learning Methods
• Illustrate the Clustering Techniques and Dimensionality Reduction Models in Machine Learning
• Discuss the Neural Network Models and fundamental concepts of Deep Learning

Unit I:
Introduction- Artificial Intelligence, Machine Learning, Deep learning, Types of Machine Learning
Systems, Main Challenges of Machine Learning.
Statistical Learning: Introduction, Supervised and Unsupervised Learning, Training and Test Loss,
Tradeoffs in Statistical Learning, Estimating Risk Statistics, Sampling distribution of an estimator,
Empirical Risk Minimization.

Unit II:
Supervised Learning (Regression/Classification): Basic Methods: Distance based Methods, Nearest
Neighbours, Decision Trees, Naive Bayes, Linear Models: Linear Regression, Logistic Regression,
Generalized Linear Models, Support Vector Machines, Binary Classification: Multiclass/Structured
outputs, MNIST, Ranking.

Unit III:
Ensemble Learning and Random Forests: Introduction, Voting Classifiers, Bagging and Pasting, Random
Forests, Boosting, Stacking.
Support Vector Machine: Linear SVM Classification, Nonlinear SVM Classification, SVM Regression,
Naïve Bayes Classifiers.

Unit IV:
Unsupervised Learning Techniques: Clustering, K-Means, Limits of K-Means, Using Clustering for Image
Segmentation, Using Clustering for Preprocessing, Using Clustering for Semi-Supervised Learning,
DBSCAN, Gaussian Mixtures.
Dimensionality Reduction: The Curse of Dimensionality, Main Approaches for Dimensionality Reduction,
PCA, Using Scikit-Learn, Randomized PCA, Kernel PCA.

Unit V:
Neural Networks and Deep Learning: Introduction to Artificial Neural Networks with Keras,
Implementing MLPs with Keras, Installing TensorFlow 2, Loading and Preprocessing Data with
TensorFlow.

Text Books:
1. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition, O’Reilly
Publications, 2019
2. Data Science and Machine Learning: Mathematical and Statistical Methods, Dirk P. Kroese, Zdravko
I. Botev, Thomas Taimre, Radislav Vaisman, 25th November 2020

Reference Books:
1. Machine Learning: A Probabilistic Perspective, Kevin P. Murphy, MIT Press, 2012.

III B Tech I Sem
L T P C
3 0 0 3
OPTIMIZATION IN OPERATIONS RESEARCH
(Job oriented course)

Course Objectives:
• To define an objective function and constraint functions in terms of design variables, and then state
the optimization problem.
• To state single variable and multi variable optimization problems, without and with constraints.
• To explain the linear programming technique for an optimization problem and define slack and surplus
variables, using the Simplex method.
• To state transportation and assignment problems as linear programming problems and solve them by the
Simplex method.
• To study and explain nonlinear programming techniques, unconstrained or constrained, and define
exterior and interior penalty functions for optimization problems.

Course Outcomes: At the end of the course, student will be able to
• State and formulate the optimization problem, without and with constraints, by using design
variables from an engineering design problem.
• Apply classical optimization techniques to minimize or maximize a multi-variable objective
function, without or with constraints, and arrive at an optimal solution.
• Apply and Solve transportation and assignment problem by using Linear programming Simplex
method.
• Apply gradient and non-gradient methods to nonlinear optimization problems and use interior or
exterior penalty functions for the constraints to derive the optimal solutions
• Formulate and apply Dynamic programming technique to inventory control, production planning,
engineering design problems etc. to reach a final optimal solution from the current optimal solution.

UNIT I:
Introduction and Classical Optimization Techniques: Statement of an Optimization problem, design
vector, design constraints, constraint surface, objective function, objective function surfaces, classification
of Optimization problems.
Classical Optimization Techniques: Single variable Optimization, multi variable Optimization without
constraints, necessary and sufficient conditions for minimum/maximum, multivariable Optimization with
equality constraints. Solution by method of Lagrange multipliers, multivariable Optimization with inequality
constraints, Kuhn – Tucker conditions

UNIT II:
Linear Programming : Standard form of a linear programming problem, geometry of linear programming
problems, definitions and theorems, solution of a system of linear simultaneous equations, pivotal reduction
of a general system of equations, motivation to the simplex method, simplex algorithm, Duality in Linear
Programming, Dual Simplex method.

UNIT III:
Transportation Problem: Finding initial basic feasible solution by north-west corner rule, least cost
method and Vogel’s approximation method, testing for optimality of balanced transportation problems,
Special cases in transportation problem.
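
The north-west corner rule listed in this unit can be sketched in a few lines of Python. This is only an illustration and not part of the prescribed syllabus: the function name `north_west_corner` and the supply and demand figures are hypothetical, and the problem is assumed to be balanced (total supply equals total demand).

```python
def north_west_corner(supply, demand):
    """Return an initial basic feasible allocation for a balanced problem."""
    if sum(supply) != sum(demand):
        raise ValueError("problem must be balanced: total supply != total demand")
    supply = list(supply)   # work on copies so the caller's lists are untouched
    demand = list(demand)
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0               # start in the north-west (top-left) cell
    while i < len(supply) and j < len(demand):
        qty = min(supply[i], demand[j])
        alloc[i][j] = qty
        supply[i] -= qty
        demand[j] -= qty
        # Move down when the row is exhausted, otherwise move right.
        if supply[i] == 0 and i < len(supply) - 1:
            i += 1
        else:
            j += 1
    return alloc


# Hypothetical data: three sources, four destinations, total of 75 units each.
allocation = north_west_corner([20, 30, 25], [10, 25, 15, 25])
for row in allocation:
    print(row)
```

The result is only a starting solution; testing it for optimality is a separate step.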

UNIT IV:
Nonlinear Programming: Unconstrained cases, One – dimensional minimization methods: Classification,
Fibonacci method and Quadratic interpolation method, Univariate method, Powell’s method and steepest
descent method.
Constrained cases– Characteristics of a constrained problem, Classification, Basic approach of Penalty
Function method; Basic approaches of Interior and Exterior penalty function methods, Introduction to
convex Programming Problem.

UNIT V:
Dynamic Programming: Dynamic programming multistage decision processes, types, concept of sub
optimization and the principle of optimality, computational procedure in dynamic programming, examples
illustrating the calculus method of solution, examples illustrating the tabular method of solution.

Text Books:
1. “Engineering Optimization: Theory and Practice”, S. S. Rao, New Age International (P) Limited, 3rd
edition, 1998.
2. “Introductory Operations Research”, H.S. Kasana & K.D. Kumar, Springer (India), Pvt. Ltd.

Reference Books:
1. “Optimization Methods in Operations Research and systems Analysis”, by K.V. Mital and C.
Mohan, New Age International (P) Limited, Publishers, 3rd edition, 1996.
2. Operations Research, Dr. S. D. Sharma, Kedarnath Ramnath & Co.

III B Tech I Sem
L T P C
3 0 0 3
SOFTWARE ENGINEERING
(Professional Elective-I)

Course Objectives:
This course is designed to:
• Give exposure to phases of Software Development, common process models including Waterfall, and
the Unified Process, and hands-on experience with elements of the agile process
• Give exposure to a variety of Software Engineering practices such as requirements analysis and
specification, code analysis, code debugging, testing, traceability, and version control
• Give exposure to Software Design techniques

Course Outcomes:
Students taking this subject will gain software engineering skills in the following areas:
• Ability to transform an Object-Oriented Design into high quality, executable code
• Skills to design, implement, and execute test cases at the Unit and Integration level
• Compare conventional and agile software methods

UNIT I:
The Nature of Software, The Unique Nature of WebApps, Software Engineering, The Software Process,
Software Engineering Practice, Software Myths. A Generic Process Model, Process Assessment and
Improvement, Prescriptive Process Models, Specialized Process Models, The Unified Process, Personal and
Team Process Models, Process Technology.

UNIT II:
Agility, Agility and the Cost of Change, Agile Process, Extreme Programming (XP), Other Agile Process
Models, A Tool Set for the Agile Process, Software Engineering Knowledge, Core Principles, Principles
That Guide Each Framework Activity, Requirements Engineering, Establishing the Groundwork, Eliciting
Requirements, Developing Use Cases, Building the Requirements Model, Negotiating Requirements,
Validating Requirements.

UNIT III:
Requirements Analysis, Scenario-Based Modeling, UML Models That Supplement the Use Case, Data
Modeling Concepts, Class-Based Modeling, Requirements Modeling Strategies, Flow-Oriented Modeling,
Creating a Behavioral Model, Patterns for Requirements Modeling, Requirements Modeling for WebApps.

UNIT IV:
Design within the Context of Software Engineering, The Design Process, Design Concepts, The Design
Model, Software Architecture, Architectural Genres, Architectural Styles,
Assessing Alternative Architectural Designs, Architectural Mapping Using Data Flow, Components,
Designing Class-Based Components, Conducting Component-Level Design, Component-Level Design for
WebApps, Designing Traditional Components, Component-Based Development.

UNIT V:
The Golden Rules, User Interface Analysis and Design, Interface Analysis, Interface Design Steps, WebApp
Interface Design, Design Evaluation, Elements of Software Quality Assurance, SQA Tasks, Goals &
Metrics, Statistical SQA, Software Reliability, A Strategic Approach to Software Testing, Strategic Issues,
Test Strategies for Conventional Software, Test Strategies for Object-Oriented Software, Test Strategies for
WebApps, Validation Testing, System Testing, The Art of Debugging, Software Testing Fundamentals,
Internal and External Views of Testing, White-Box Testing, Basis Path Testing.

Text Books:
1. Software Engineering: A Practitioner’s Approach, Roger S. Pressman, Seventh Edition, McGraw Hill
Higher Education.
2. Software Engineering, Ian Sommerville, Ninth Edition, Pearson.

Reference Books:
1. Software Engineering, A Precise Approach, Pankaj Jalote, Wiley India, 2010.
2. Software Engineering, Ugrasen Suman, Cengage.

e-Resources:
1) https://nptel.ac.in/courses/106/105/106105182/

III B Tech I Sem
L T P C
3 0 0 3
OBJECT ORIENTED ANALYSIS AND DESIGN
(Professional Elective-I)

Course Objectives: The main objective is for the students to:
• Become familiar with all phases of OOAD.
• Master the main features of the UML.
• Master the main concepts of Object Technologies and how to apply them at work, and develop the
ability to analyze and solve challenging problems in various domains.
• Learn the Object design Principles and understand how to apply them towards Implementation.

Course Outcomes: After finishing this course student will be able to:
• Analyze the nature of complex system and its solutions.
• Illustrate & relate the conceptual model of the UML, identify & design the classes and relationships
• Analyze & Design Class and Object Diagrams that represent Static Aspects of a Software System and
apply basic and Advanced Structural Modeling Concepts for designing real time applications.
• Analyze & Design behavioral aspects of a Software System using Use Case, Interaction and Activity
Diagrams.
• Analyze & Apply techniques of State Chart Diagrams and Implementation Diagrams to model
behavioral aspects and Runtime environment of Software Systems.

UNIT I:
Introduction: The Structure of Complex systems, The Inherent Complexity of Software, Attributes of
Complex System, Organized and Disorganized Complexity, Bringing Order to Chaos, Designing Complex
Systems. Case Study: System Architecture: Satellite-Based Navigation

UNIT II:
Introduction to UML: Importance of modeling, principles of modeling, object oriented modeling,
conceptual model of the UML, Architecture, and Software Development Life Cycle. Basic Structural
Modeling: Classes, Relationships, common Mechanisms, and diagrams. Case Study: Control System:
Traffic Management.

UNIT III:
Class & Object Diagrams: Terms, concepts, modeling techniques for Class & Object Diagrams.
Advanced Structural Modeling: Advanced classes, advanced relationships, Interfaces, Types and Roles,
Packages. Case Study: AI: Cryptanalysis.

UNIT IV:
Basic Behavioral Modeling-I: Interactions, Interaction diagrams, Use cases, Use case Diagrams, Activity
Diagrams. Case Study: Web Application: Vacation Tracking System

UNIT V:
Advanced Behavioral Modeling: Events and signals, state machines, processes and Threads, time and
space, state chart diagrams. Architectural Modeling: Component, Deployment, Component diagrams and
Deployment diagrams
Case Study: Weather Forecasting

Text Books:
1. Grady Booch, Robert A. Maksimchuk, Michael W. Engle, Bobbi J. Young, Jim Conallen,
Kelli A. Houston, “Object-Oriented Analysis and Design with Applications”, 3rd edition, 2013,
Pearson.
2. Grady Booch, James Rumbaugh, Ivar Jacobson: The Unified Modeling Language User Guide,
Pearson Education.

Reference Books:
1. Meilir Page-Jones: Fundamentals of Object Oriented Design in UML, Pearson Education.
2. Pascal Roques: Modeling Software Systems Using UML2, WILEY- Dreamtech India Pvt. Ltd.
3. Atul Kahate: Object Oriented Analysis & Design, The McGraw-Hill Companies.
4. Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Unified
Process, Craig Larman, Pearson Education.

III B Tech I Sem
L T P C
3 0 0 3
DevOps
(Professional Elective-I)

Course Objectives:
• To introduce the basic concepts of Information Systems.
• To understand the Management Control Framework and the Application Control
Framework.

Course Outcomes: At the end of the course, student will be able to
• Enumerate the principles of continuous development and deployment, automation of configuration
management, inter-team collaboration, and IT service agility.
• Describe DevOps & DevSecOps methodologies and their key concepts
• Illustrate the types of version control systems, continuous integration tools, continuous monitoring
tools, and cloud models
• Set up complete private infrastructure using version control systems and CI/CD tools
• Acquire the knowledge of maturity model, Maturity Assessment

UNIT I:
Phases of Software Development Life Cycle, Values and principles of agile software development.

UNIT II:
Fundamentals of DevOps: Architecture, Deployments, Orchestration, Need, Instance of applications,
DevOps delivery pipeline, DevOps eco system.

UNIT III:
DevOps adoption in projects: Technology aspects, Agiling capabilities, Tool stack implementation, People
aspect, processes

UNIT IV:
CI/CD: Introduction to Continuous Integration, Continuous Delivery and Deployment, Benefits of CI/CD,
Metrics to track CI/CD practices

UNIT V:
DevOps Maturity Model: Key factors of DevOps maturity model, stages of DevOps maturity model, DevOps
maturity Assessment

Text Books:
1. The DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in
Technology Organizations, Gene Kim, John Willis, Patrick Debois, Jez Humble, 1st Edition,
O’Reilly publications, 2016.
2. What is DevOps? Infrastructure as Code, 1st Edition, Mike Loukides, O’Reilly publications, 2012.

III B Tech I Sem
L T P C
3 0 0 3
INTERNET OF THINGS
(Professional Elective-I)

Course Objectives:
From the course the student will learn
• the application areas of IoT
• the revolution of Internet in Mobile Devices, Cloud & Sensor Networks
• building blocks of Internet of Things and characteristics

Course Outcomes:
By the end of the course, student will be able to
• Review Internet of Things (IoT).
• Demonstrate various business models relevant to IoT.
• Construct designs for web connectivity
• Organize sources of data acquisition related to IoT, integrate to enterprise systems.
• Describe IoT with Cloud technologies.

UNIT I:
The Internet of Things: An Overview of Internet of Things, Internet of Things Technology behind IoTs,
Sources of the IoTs, Examples of IoTs, Design Principles for Connected Devices, Internet connectivity,
Application Layer Protocols: HTTP, HTTPS, FTP

UNIT II: Business Models for Business Processes in the Internet of Things, IoT/M2M Systems Layers
and Designs Standardizations, Modified OSI Stack for the IoT/M2M Systems, ETSI M2M domains and
High-level capabilities, Communication Technologies, Data Enrichment and Consolidation and Device
Management Gateway, Ease of designing and affordability.

UNIT III: Design Principles for the Web Connectivity for connected-Devices, Web Communication
protocols for Connected Devices, Message Communication protocols for Connected Devices, Web
Connectivity for connected-Devices.

UNIT IV: Data Acquiring, Organizing and Analytics in IoT/M2M, Applications/Services/Business
Processes, IoT/M2M Data Acquiring and Storage, Business Models for Business Processes in the Internet
of Things, Organizing Data, Transactions, Business Processes, Integration and Enterprise Systems.

UNIT V: Data Collection, Storage and Computing Using a Cloud Platform for IoT/M2M
Applications/Services, Data Collection, Storage and Computing Using cloud platform, Everything as a
service and Cloud Service Models, IoT cloud-based services using the Xively (Pachube/COSM), Nimbits
and other platforms, Sensor, Participatory Sensing, Actuator, Radio Frequency Identification, and Wireless
Sensor Network Technology, Sensors Technology, Sensing the World.

Text Books:
1. Internet of Things: Architecture, Design Principles and Applications, Raj Kamal, McGraw Hill Higher
Education
2. Internet of Things, A. Bahga and V. Madisetti, University Press, 2015

Reference Books:
1. Designing the Internet of Things, Adrian McEwen and Hakim Cassimally, Wiley
2. Getting Started with the Internet of Things, Cuno Pfister, O’Reilly

III B Tech I Sem
L T P C
0 0 3 1.5
OPERATING SYSTEMS & COMPILER DESIGN LAB

Course Objectives:
The main objective of this course is to implement operating systems and compiler design concepts.

Course Outcomes:
By the end of the course, student will be able to
• Implement various scheduling, page replacement algorithms and algorithms related to deadlocks
• Design programs for shared memory management and semaphores
• Determine predictive parsing table for a CFG
• Apply Lex and Yacc tools
• Examine LR parser and generate SLR Parsing table

List of Experiments:

1. Simulate the following CPU scheduling algorithms:
a) Round Robin b) SJF c) FCFS d) Priority
2. Simulate the following:
a) Multiprogramming with a fixed number of tasks (MFT)
b) Multiprogramming with a variable number of tasks (MVT)
3. Simulate the following page replacement algorithms:
a) FIFO b) LRU c) LFU
4. Write a C program that illustrates two processes communicating using shared memory
5. Write a C program to simulate producer and consumer problem using semaphores
6. Simulate Banker's Algorithm for Deadlock Avoidance.
7. Simulate Banker's Algorithm for Deadlock Prevention.
8. Write a C program to identify different types of Tokens in a given Program.
9. Write a Lex Program to implement a Lexical Analyzer using Lex tool.
10. Write a C program to Simulate Lexical Analyzer to validate a given input String.
11. Write a C program to implement the Brute force technique of Top down Parsing.
12. Write a C program to implement a Recursive Descent Parser.
13. Write a C program to compute the First and Follow Sets for the given Grammar.
14. Write a C program for eliminating the left recursion and left factoring of a given grammar.
15. Write a C program to check the validity of input string using Predictive Parser.
16. Write a C program for implementation of LR parsing algorithm to accept a given input string.
17. Write a C program for implementation of a Shift Reduce Parser using Stack Data Structure to accept
a given input string of a given grammar.
18. Simulate the calculator using LEX and YACC tool.
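
As an illustration of experiment 3(a), FIFO page replacement can be simulated as sketched below. The sketch is written in Python for brevity, although several experiments in this list ask for C programs. The function name `fifo_page_faults` and the reference string are assumptions chosen for the example, not part of the syllabus.

```python
from collections import deque


def fifo_page_faults(reference_string, num_frames):
    """Count page faults under FIFO replacement with a fixed number of frames."""
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    frames = deque()   # oldest resident page sits at the left end
    resident = set()   # same pages as `frames`, kept for fast membership tests
    faults = 0
    for page in reference_string:
        if page in resident:
            continue               # hit: FIFO order is not updated on a hit
        faults += 1
        if len(frames) == num_frames:
            evicted = frames.popleft()   # evict the page loaded earliest
            resident.remove(evicted)
        frames.append(page)
        resident.add(page)
    return faults


# Hypothetical reference string with three frames.
refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(fifo_page_faults(refs, 3))  # prints 15
```

LRU and LFU differ only in which resident page is chosen for eviction, so the same loop structure can be reused for parts (b) and (c).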

III B Tech I Sem
L T P C
0 0 3 1.5
MACHINE LEARNING LAB

Course Objectives:
This course will enable students to learn and understand different Data sets in implementing the machine
learning algorithms.

Course Outcomes (COs): At the end of the course, student will be able to
• Implement procedures for the machine learning algorithms
• Design and Develop Python programs for various Learning algorithms
• Apply appropriate data sets to the Machine Learning algorithms
• Develop Machine Learning algorithms to solve real world problems

Requirements: Develop the following programs using Anaconda/Jupyter/Spyder and evaluate the ML
models.

Experiment-1:
Implement and demonstrate the FIND-S algorithm for finding the most specific hypothesis based on a given
set of training data samples. Read the training data from a .CSV file.
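As an illustration of Experiment-1, the following is a minimal sketch of FIND-S. The attribute names and the four training rows are hypothetical, and the CSV content is held in a string so the example is self-contained; in the lab the same reader would be pointed at a .CSV file.

```python
import csv
import io

# Hypothetical training data standing in for the .CSV file.
CSV_TEXT = """sky,temp,humidity,wind,enjoy
sunny,warm,normal,strong,yes
sunny,warm,high,strong,yes
rainy,cold,high,strong,no
sunny,warm,high,weak,yes
"""

def find_s(rows, target="enjoy", positive="yes"):
    """Return the most specific hypothesis consistent with the positive rows."""
    hypothesis = None
    for row in rows:
        if row[target] != positive:
            continue                      # FIND-S ignores negative examples
        attrs = {k: v for k, v in row.items() if k != target}
        if hypothesis is None:
            hypothesis = attrs            # first positive example, taken as-is
        else:
            for name, value in attrs.items():
                if hypothesis[name] != value:
                    hypothesis[name] = "?"   # generalise the conflicting attribute
    return hypothesis

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))
print(find_s(rows))
```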

Experiment-2:
For a given set of training data examples stored in a .CSV file, implement and demonstrate the Candidate-
Elimination algorithm to output a description of the set of all hypotheses consistent with the training
examples.

Experiment-3:
Write a program to demonstrate the working of the decision tree based ID3 algorithm. Use an appropriate
data set for building the decision tree and apply this knowledge to classify a new sample.
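A compact sketch of the ID3 idea in Experiment-3 is shown here under simplifying assumptions: categorical attributes only, no pruning, and a small hypothetical data set. The helper names (`entropy`, `information_gain`, `id3`, `classify`) are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attr, target):
    """Reduction in entropy obtained by splitting rows on attr."""
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy([r[target] for r in rows]) - remainder

def id3(rows, attrs, target):
    """Build a nested-dict decision tree; leaves are class labels."""
    labels = [r[target] for r in rows]
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]
    best = max(attrs, key=lambda a: information_gain(rows, a, target))
    rest = [a for a in attrs if a != best]
    return {best: {v: id3([r for r in rows if r[best] == v], rest, target)
                   for v in {r[best] for r in rows}}}

def classify(tree, sample):
    """Walk the tree for one sample (raises KeyError on unseen attribute values)."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][sample[attr]]
    return tree

# Hypothetical training data.
data = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = id3(data, ["outlook", "windy"], "play")
print(tree)
print(classify(tree, {"outlook": "rainy", "windy": "yes"}))
```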

Experiment-4:
Exercises to solve the real-world problems using the following machine learning methods: a) Linear
Regression b) Logistic Regression c) Binary Classifier
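For part (a) of Experiment-4, simple linear regression can be sketched with the closed-form least-squares solution. The data below is hypothetical, and logistic regression and the binary classifier are not covered by this sketch.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```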

Experiment-5: Develop a program for Bias, Variance, Remove duplicates, Cross Validation
Experiment-6: Write a program to implement Categorical Encoding, One-hot Encoding
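The two encodings named in Experiment-6 can be sketched in plain Python as follows. The function names and the sample column are assumptions; a lab solution might instead rely on a library such as pandas or scikit-learn.

```python
def label_encode(values):
    """Map each category to an integer code (categories sorted alphabetically)."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    return [index[v] for v in values], categories

def one_hot_encode(values):
    """Turn each value into a 0/1 vector with a single 1 at its category position."""
    codes, categories = label_encode(values)
    vectors = [[1 if i == code else 0 for i in range(len(categories))] for code in codes]
    return vectors, categories

# Hypothetical categorical column.
colors = ["red", "green", "blue", "green"]
print(label_encode(colors))
print(one_hot_encode(colors))
```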

Experiment-7:
Build an Artificial Neural Network by implementing the Back propagation algorithm and test the same
using appropriate data sets.

Experiment-8:
Write a program to implement k-Nearest Neighbor algorithm to classify the iris data set. Print both correct
and wrong predictions.
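The k-Nearest Neighbor procedure of Experiment-8 can be sketched as below. To stay self-contained the sketch uses a tiny hypothetical two-feature data set rather than the iris data, and the third test point is deliberately mislabeled so that a wrong prediction is printed.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical (features, label) pairs.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B")]
test = [((1.1, 1.0), "A"), ((5.1, 5.0), "B"), ((3.5, 3.5), "A")]

for features, actual in test:
    predicted = knn_predict(train, features)
    status = "correct" if predicted == actual else "wrong"
    print(features, "actual:", actual, "predicted:", predicted, status)
```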

Experiment-9: Implement the non-parametric Locally Weighted Regression algorithm in order to fit data
points. Select appropriate data set for your experiment and draw graphs.
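One possible sketch of Locally Weighted Regression for a single input variable is given below. It assumes a Gaussian kernel with bandwidth `tau` and fits a weighted straight line around each query point; the data is hypothetical and plotting is omitted.

```python
import math

def lwr_predict(xs, ys, x0, tau=1.0):
    """Predict y at x0 using a locally weighted straight-line fit."""
    # Gaussian kernel: training points close to x0 receive weights near 1.
    weights = [math.exp(-((x - x0) ** 2) / (2 * tau ** 2)) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    slope = sxy / sxx
    return mean_y + slope * (x0 - mean_x)

# Hypothetical noisy, non-linear data.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.8, 0.9, 0.1, -0.8]
for x0 in (0.5, 2.0, 3.5):
    print(x0, round(lwr_predict(xs, ys, x0, tau=0.8), 3))
```

A smaller `tau` makes the fit follow the data more closely, while a very small value can drive all weights toward zero, so the bandwidth is worth varying when drawing the graphs.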

Experiment-10:
Assuming a set of documents that need to be classified, use the naïve Bayesian Classifier model to perform
this task. Built-in Java classes/API can be used to write the program. Calculate the accuracy, precision, and
recall for your data set.
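A small sketch of Experiment-10 follows, written in Python rather than Java and assuming a multinomial naïve Bayes model with add-one (Laplace) smoothing. The documents, labels, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Collect per-class document counts and word counts from (text, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    for text, label in docs:
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest log posterior; unknown words are skipped."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for word in text.split():
            if word in vocab:
                # Add-one smoothing avoids zero probabilities.
                score += math.log((word_counts[label][word] + 1) / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

def evaluate(actual, predicted, positive):
    """Accuracy, precision, and recall with respect to the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    accuracy = sum(a == p for a, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical labelled documents.
train_docs = [("good great film", "pos"), ("great acting good plot", "pos"),
              ("bad boring film", "neg"), ("boring plot bad acting", "neg")]
test_docs = [("good film", "pos"), ("bad plot", "neg")]
model = train_nb(train_docs)
predicted = [predict_nb(model, text) for text, _ in test_docs]
print(predicted, evaluate([label for _, label in test_docs], predicted, "pos"))
```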

Experiment-11: Apply EM algorithm to cluster a Heart Disease Data Set. Use the same data set for
clustering using k-Means algorithm. Compare the results of these two algorithms and comment on the
quality of clustering. You can add Java/Python ML library classes/API in the program.
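The k-Means half of Experiment-11 can be sketched as follows for one-dimensional data. The points and starting centroids are hypothetical (not the Heart Disease Data Set), and the EM algorithm is not shown.

```python
def kmeans_1d(points, centroids, iterations=10):
    """Plain k-means on numbers: assign to nearest centroid, then recompute means."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Hypothetical data with two obvious groups and a poor starting guess.
points = [1, 2, 3, 10, 11, 12]
centroids, clusters = kmeans_1d(points, [1, 2])
print(centroids, clusters)
```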

Experiment-12: Exploratory Data Analysis for Classification using Pandas or Matplotlib.

Experiment-13:
Write a Python program to construct a Bayesian network considering medical data. Use this model to
demonstrate the diagnosis of heart patients using standard Heart Disease Data Set

Experiment-14:
Write a program to Implement Support Vector Machines and Principal Component Analysis

Experiment-15:
Write a program to Implement Principal Component Analysis

L T P C
III B Tech I Sem
0 0 4 2
CONTINUOUS INTEGRATION AND CONTINUOUS DELIVERY USING DevOps
(Skill Oriented Course III)

Course Outcomes:
At the end of the Course, Student will be able to:
➢ Understand the why, what and how of DevOps adoption
➢ Attain literacy on DevOps
➢ Align capabilities required in the team
➢ Create an automated CICD pipeline using a stack of tools

List of Exercises:

Note: There are online courses indicated in the reference links section. Learners need to go through the
contents in order to perform the given exercises

Exercise 1:
Reference course name: Software Engineering and Agile software development
Get an understanding of the stages in software development lifecycle, the process models, values and
principles of agility and the need for agile software development. This will enable you to work in projects
following an agile approach to software development.
Solve the questions given in the reference course name to gauge your understanding of the topic

Exercise 2:
Reference course name: Development & Testing with Agile: Extreme Programming
Get a working knowledge of using extreme automation through XP programming practices of test first
development, refactoring and automating test case writing.
Solve the questions in the “Take test” module given in the reference course name to gauge your
understanding of the topic

Exercise 3:
Module name: DevOps adoption in projects
It is important to comprehend the need to automate the software development lifecycle stages through
DevOps. Gain an understanding of the capabilities required to implement DevOps, continuous integration
and continuous delivery practices.
Solve the questions given in Quiz1, Quiz2, Quiz 3

Exercise 4:
Module name: Implementation of CICD with Java and open source stack
Configure the web application and its version control with Git, using Git commands and version control
operations.

Exercise 5:
Module Name: Implementation of CICD with Java and open source stack
Configure a static code analyzer which will perform static analysis of the web application code and identify
the coding practices that are not appropriate. Configure the profiles and dashboard of the static code analysis
tool.

Exercise 6:
Module Name: Implementation of CICD with Java and open source stack
Write a build script to build the application using a build automation tool like Maven. Create a folder
structure that will run the build script and invoke the various software development build stages. This script
should invoke the static analysis tool and unit test cases and deploy the application to a web application
server like Tomcat.

Exercise 7:
Module Name: Implementation of CICD with Java and open source stack
Configure the Jenkins tool with the required paths, path variables, users and pipeline views.

Exercise 8:
Module name: Implementation of CICD with Java and open source stack
Configure the Jenkins pipeline to call the build script jobs and configure to run it whenever there is a change
made to an application in the version control system. Make a change to the background color of the landing
page of the web application and check if the configured pipeline runs.

Exercise 9:
Module name: Implementation of CICD with Java and open source stack
Create a pipeline view of the Jenkins pipeline used in Exercise 8. Configure it with user defined messages.

Exercise 10:
Module name: Implementation of CICD with Java and open source stack
In the configured Jenkins pipeline created in Exercises 8 and 9, implement quality gates for static analysis of
code.

Exercise 11:
Module name: Implementation of CICD with Java and open source stack
In the configured Jenkins pipeline created in Exercises 8 and 9, implement quality gates for unit testing.

Exercise 12:
Module name: Course end assessment
In the configured Jenkins pipeline created in Exercises 8 and 9, implement quality gates for code coverage.

Reference Books:
1. Learning Continuous Integration with Jenkins: A beginner's guide to implementing Continuous
Integration and Continuous Delivery using Jenkins - Nikhil Pathania, Packt publication
[https://www.amazon.in/Learning-Continuous-Integration-Jenkins-Pathania/dp/1785284835]
2. Jenkins 2 – Up and Running: Evolve Your Deployment Pipeline for Next Generation Automation -
Brent Laster, O’Reilly publication
[https://www.amazon.in/Jenkins-2-Running-Brent-Laster/dp/1491979593]

Hardware and software configuration:
1. Git [GitHub or GitLab]
2. SonarQube
3. Jenkins
4. JUnit
5. Eclipse
6. Tomcat server
7. Maven
8. Cobertura or JaCoCo
9. Java SDK
10. All necessary drivers and jar files for connecting the software
11. Windows machine with 16GB RAM

Web Links: (Courses mapped to Infosys Springboard platform)

1. https://infyspringboard.onwingspan.com/en/app/toc/lex_auth_013382690411003904735_shared/overview
[Software Engineering and Agile software development]
2. https://infyspringboard.onwingspan.com/en/viewer/html/lex_auth_01350157819497676810467
[Development & Testing with Agile: Extreme Programming]
3. https://infyspringboard.onwingspan.com/en/viewer/html/lex_auth_01353898917192499226_shared
[DevOps CICD]

L T P C
III B Tech I Sem
0 0 4 2
HELICAL INSIGHT
(Skill Oriented Course III)

Course Objectives:
The main objective of the course is to understand the business intelligence tool Helical Insight, a new
framework for data analysis.

Course Outcomes:
By the end of the course, the student will be able to
• develop data analysis on top of their data and embed it
• support better business decision-making
• implement their own custom business processes easily

List of Experiments:

1. Installation of Helical Insight
2. Develop a Helical Insight application with various Filters
3. Develop a Helical Insight application to export Reports and Dashboards
4. Develop a Helical Insight application to export Reports and Dashboards
5. Develop a Helical Insight application to Integrate using iFrame
6. Develop a Helical Insight application to customize Tables with Drill Down Function
7. Develop a Helical Insight application to customize Adhoc Charts with Drill Down Functions, Axis
Chart, Non-Axis Chart and Gauge Chart.
8. Develop a Helical Insight application with various operations in Dashboard Designer
9. Develop a Helical Insight application with Geographical Maps
10. Develop a Helical Insight application on Tomcat with MySQL
11. Develop a Helical Insight application with SQL Server database
12. Develop a Helical Insight application with SQLite database
13. Develop a Helical Insight application with HBase
14. Develop a Helical Insight application with MongoDB using Apache Drill
15. Develop a Helical Insight application with Hive

Web Resources:
1. https://www.helicalinsight.com/helical-insight-installation-guide/
2. https://www.helicalinsight.com/deploy-helical-insight-application-tomcat-mysql/
3. https://www.helicalinsight.com/installation-of-sql-server-on-windows-creating-data-source-connection-in-helical-insight-to-sql-server/
4. https://www.helicalinsight.com/open-source-bi-to-sqlite/
5. https://www.helicalinsight.com/open-source-business-intelligence-bi-tool-for-hbase/
6. https://www.helicalinsight.com/connecting-mongodb-using-apache-drill/
7. https://www.helicalinsight.com/open-source-bi-tool-for-hive/

L T P C
III B Tech I Sem
2 0 0 0
EMPLOYABILITY SKILLS-I

Course Objectives:
The main objective of this course is to assist students in developing employability skills and personal
qualities related to gaining and sustaining employment.

Course Outcomes: At the end of the course, the student will be able to
• Understand corporate etiquette
• Make presentations effectively with appropriate body language
• Be composed with a positive attitude
• Understand the core competencies to succeed in professional and personal life

UNIT I:
Analytical Thinking & Listening Skills: Self-Introduction, Shaping Young Minds - A Talk by Azim Premji
(Listening Activity), Self-Analysis, Developing Positive Attitude, Perception.
Communication Skills: Verbal Communication; Non-Verbal Communication (Body Language)

UNIT II:
Self-Management Skills: Anger Management, Stress Management, Time Management, Six Thinking Hats,
Team Building, Leadership Qualities
Etiquette: Social Etiquette, Business Etiquette, Telephone Etiquette, Dining Etiquette

UNIT III:
Standard Operation Methods: Note Making, Note Taking, Minutes Preparation, Email & Letter Writing
Verbal Ability: Synonyms, Antonyms, One Word Substitutes, Correction of Sentences, Analogies, Spotting
Errors, Sentence Completion, Course of Action, Sentence Assumptions, Sentence Arguments, Reading
Comprehension, Practice work

UNIT IV:
Job-Oriented Skills –I: Group Discussion, Mock Group Discussions

UNIT V:
Job-Oriented Skills –II: Resume Preparation, Interview Skills, Mock Interviews

Text Books and Reference Books:


1. Barun K. Mitra, Personality Development and Soft Skills, Oxford University Press, 2011.
2. S.P. Dhanavel, English and Soft Skills, Orient Blackswan, 2010.
3. R.S. Aggarwal, A Modern Approach to Verbal & Non-Verbal Reasoning, S. Chand & Company Ltd.,
2018.
4. Raman, Meenakshi & Sharma, Sangeeta, Technical Communication Principles and Practice, Oxford
University Press, 2011.

e-resources:
1. www.indiabix.com
2. www.freshersworld.com

L T P C
III B Tech I Sem Minor
3 0 2 4
DATA WAREHOUSING AND DATA MINING

Course Objectives:
The main objective of the course is to
• Inculcate Conceptual, Logical, and Physical design of Data Warehouses, OLAP applications and
OLAP deployment
• Design a data warehouse or data mart to present information needed by management in a form that is
usable
• Emphasize hands-on experience working with real data sets
• Test real data sets using popular data mining tools such as WEKA and Python libraries

Course Outcomes:
By the end of the course, the student will be able to
• Design a data mart or data warehouse for any organization
• Extract knowledge using data mining techniques and list the various algorithms used in information
analysis of data mining techniques
• Demonstrate the working of algorithms for data mining tasks such as association rule mining and
classification on realistic data
• Implement and analyze knowledge flow applications on data sets and apply suitable
visualization techniques to output analytical results

UNIT I:
Data Warehousing and Online Analytical Processing: Data Warehouse: Basic concepts, Data Warehouse
Modelling: Data Cube and OLAP, Data Warehouse Design and Usage, Data Warehouse Implementation,
Introduction: Why and What is data mining, What kinds of data can be mined, What kinds of patterns can be
mined, Which technologies are used, Which kinds of applications are targeted.

UNIT II:
Data Pre-processing: An Overview, Data Cleaning, Data Integration, Data Reduction, Data Transformation
and Data Discretization.

UNIT III:
Classification: Basic Concepts, General Approach to solving a classification problem, Decision Tree
Induction: Attribute Selection Measures, Tree Pruning, Scalability and Decision Tree Induction

UNIT IV:
Association Analysis: Problem Definition, Frequent Itemset Generation, Rule Generation: Confidence-Based
Pruning, Rule Generation in Apriori Algorithm, Compact Representation of Frequent Itemsets, FP-Growth
Algorithm.

UNIT V:
Cluster Analysis: Overview, Basics and Importance of Cluster Analysis, Clustering techniques, Different
Types of Clusters; K-means: The Basic K-means Algorithm, K-means Additional Issues, Bisecting
K-means.

Software Requirements: WEKA Tool/Python/R-Tool/Rapid Tool/Oracle Data mining


List of Experiments:
1. Creation of a Data Warehouse.
➢ Build Data Warehouse/Data Mart (using open source tools like Pentaho Data Integration Tool,
Pentaho Business Analytics; or other data warehouse tools like Microsoft-SSIS, Informatica,
Business Objects, etc.)
➢ Design multi-dimensional data models namely Star, Snowflake and Fact Constellation schemas for
any one enterprise (ex. Banking, Insurance, Finance, Healthcare, Manufacturing, Automobiles, Sales
etc.).
➢ Write ETL scripts and implement using data warehouse tools.
➢ Perform various OLAP operations such as slice, dice, roll up, drill down and pivot

2. Explore machine learning tool “WEKA”
➢ Explore WEKA Data Mining/Machine Learning Toolkit.
➢ Download and/or install the WEKA data mining toolkit.
➢ Understand the features of WEKA toolkit such as Explorer, Knowledge Flow interface,
Experimenter, command-line interface.
➢ Navigate the options available in WEKA (ex. Select attributes panel, Preprocess panel, Classify
panel, Cluster panel, Associate panel and Visualize panel)
➢ Study the ARFF file format. Explore the available data sets in WEKA. Load a data set (ex. Weather
dataset, Iris dataset, etc.)
➢ Load each dataset and observe the following:
1. List the attribute names and their types
2. Number of records in each dataset
3. Identify the class attribute (if any)
4. Plot Histogram
5. Determine the number of records for each class.
6. Visualize the data in various dimensions

3. Perform data preprocessing tasks and Demonstrate performing association rule mining on data sets
➢ Explore various options available in Weka for preprocessing data and apply Unsupervised filters like
Discretization, Resample filter, etc. on each dataset
➢ Load weather.nominal, Iris, Glass datasets into Weka and run the Apriori
algorithm with different support and confidence values.
➢ Study the rules generated. Apply different discretization filters on numerical attributes and run the
Apriori association rule algorithm. Study the rules generated.
➢ Derive interesting insights and observe the effect of discretization in the rule generation process.

4. Demonstrate performing classification on data sets
➢ Load each dataset into Weka and run the ID3 and J48 classification algorithms. Study the classifier output.
Compute entropy values, Kappa statistic.
➢ Extract if-then rules from the decision tree generated by the classifier. Observe the confusion matrix.
➢ Load each dataset into Weka and perform Naïve-Bayes classification and k-Nearest Neighbour
classification. Interpret the results obtained.
➢ Plot ROC curves
➢ Compare classification results of ID3, J48, Naïve-Bayes and k-NN classifiers for each dataset,
deduce which classifier performs best and which performs worst for each dataset, and justify.

5. Demonstrate performing clustering of data sets
➢ Load each dataset into Weka and run simple k-means clustering algorithm with different values of k
(number of desired clusters).
➢ Study the clusters formed. Observe the sum of squared errors and centroids, and derive insights.
➢ Explore other clustering techniques available in Weka.
➢ Explore visualization features of Weka to visualize the clusters. Derive interesting insights and
explain.

6. Demonstrate knowledge flow application on data sets
➢ Develop a knowledge flow layout for finding strong association rules by using Apriori, FP Growth
algorithms
➢ Set up the knowledge flow to load an ARFF file (batch mode) and perform a cross validation using the J48
algorithm
➢ Demonstrate plotting multiple ROC curves in the same plot window by using J48 and Random Forest
tree

7. Write a Python program to generate frequent item sets / association rules using Apriori algorithm
8. Write a Python program for cluster analysis using the simple k-means algorithm.
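
For item 7, frequent itemset generation with the Apriori algorithm can be sketched as below. The transactions and the minimum support count are hypothetical, and rule generation from the frequent itemsets is not included.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]   # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent

# Hypothetical market-basket transactions.
baskets = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]
frequent = apriori(baskets, min_support=3)
for itemset, count in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```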

Text Books:
1. Data Mining Concepts and Techniques, 3/e, Jiawei Han, Micheline Kamber, Elsevier, 2011.
2. Introduction to Data Mining: Pang-Ning Tan, Michael Steinbach, Vipin Kumar, Pearson, 2012.

Reference Books:
1. Data Mining Techniques and Applications: An Introduction, Hongbo Du, Cengage Learning.
2. Data Mining: Vikram Pudi and P. Radha Krishna, Oxford Publisher.
3. Data Mining and Analysis - Fundamental Concepts and Algorithms; Mohammed J. Zaki, Wagner
Meira, Jr, Oxford
4. Data Warehousing Data Mining & OLAP, Alex Berson, Stephen Smith, TMH.
5. http://onlinecourses.nptel.ac.in/noc18_cs14/preview
(NPTEL course by Prof. Pabitra Mitra)
6. http://onlinecourses.nptel.ac.in/noc17_mg24/preview
(NPTEL course by Dr. Nandan Sudarshanam & Dr. Balaraman Ravindran)
7. http://www.saedsayad.com/data_mining_map.htm
