Data Science VI Sem Syllabus - 1
COURSE OBJECTIVES: Introduce deep learning fundamentals and major algorithms, their
problem settings, and their applications to solving real-world problems.
COURSE OUTCOMES: After completing the course, the student should be able to:
Unit II: Deep Feedforward Neural Networks, Gradient Descent (GD), Momentum-Based
GD, Nesterov Accelerated GD, Stochastic GD, AdaGrad, Adam, RMSProp, Auto-encoder,
Regularization in auto-encoders, Denoising auto-encoders, Sparse auto-encoders,
Contractive auto-encoders, Variational auto-encoder, Auto-encoders' relationship with PCA
and SVD, Dataset augmentation.
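Illustrative sketch for Unit II (not part of the prescribed syllabus): the difference between plain gradient descent and momentum-based gradient descent can be seen on a one-dimensional quadratic. The function names, the test function f(x) = (x - 3)^2, and the learning-rate and momentum values below are assumptions chosen only for illustration.

```python
def gradient_descent(grad, x0, lr=0.1, momentum=0.0, steps=100):
    """Minimise a 1-D function given its gradient.

    With momentum=0.0 this is plain gradient descent; with momentum>0
    a velocity term accumulates a decaying sum of past gradients.
    """
    x = x0
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(x)
        x = x + velocity
    return x


def grad_f(x):
    """Gradient of the illustrative objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


plain = gradient_descent(grad_f, x0=0.0)
with_momentum = gradient_descent(grad_f, x0=0.0, momentum=0.9, steps=500)
```

Both runs should approach the minimiser x = 3; on this simple objective the momentum run oscillates around the minimum before settling, which is why it is given more steps here.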
Unit III: Introduction to Convolutional Neural Networks (CNN) and its architectures,
CNN terminologies: ReLU activation function, Stride, padding, pooling, convolution
operations, Convolutional kernels, types of layers: Convolutional, pooling, fully connected,
Visualizing CNN, CNN examples: LeNet, AlexNet, ZF-Net, VGGNet, GoogLeNet, ResNet,
RCNN, etc. Deep Dream, Deep Art. Regularization: Dropout, DropConnect, unit pruning,
stochastic pooling, artificial data, injecting noise in input, early stopping, limiting the
number of parameters, Weight decay, etc.
TEXTBOOKS RECOMMENDED:
1. Ian Goodfellow, Yoshua Bengio and Aaron Courville; Deep Learning, MIT Press.
2. Charu C. Aggarwal "Neural Networks and Deep Learning: A Textbook", Springer.
3. Francois Chollet, "Deep Learning with Python", Manning Publications.
REFERENCE BOOKS:
LIST OF EXPERIMENTS:
Unit-I
Computer Network: Definitions, goals, components, Architecture, Classifications &
Types. Layered Architecture: Protocol hierarchy, Design Issues, Interfaces and Services,
Connection-Oriented & Connectionless Services, Service primitives, Design issues & its
functionality. ISO-OSI Reference Model: Principle, Model, Descriptions of various layers and
its comparison with TCP/IP. Principles of physical layer: Media, Bandwidth, Data rate and
Modulations.
Unit-II
Data Link Layer: Need, Services Provided, Framing, Flow Control, Error control. Data
Link Layer Protocol: Elementary & Sliding Window protocol: 1-bit, Go-Back-N, Selective
Repeat, Hybrid ARQ. Protocol verification: Finite State Machine Models & Petri net
models. ARP/RARP/GARP.
Unit-III
MAC Sublayer: MAC Addressing, Binary Exponential Back-off (BEB) Algorithm,
Distributed Random Access Schemes/Contention Schemes: for Data Services (ALOHA and
Slotted ALOHA), for Local-Area Networks (CSMA, CSMA/CD, CSMA/CA), Collision Free
Protocols: Basic Bit Map, BRAP, Binary Count Down, MLMA. Limited Contention Protocols:
Adaptive Tree Walk, Performance Measuring Metrics. IEEE Standards 802 series & their
variants.
Unit-IV
Network Layer: Need, Services Provided, Design issues, Routing algorithms: Least
Cost Routing algorithm, Dijkstra's algorithm, Bellman-Ford algorithm, Hierarchical
Routing, Broadcast Routing, Multicast Routing. IP Addresses, Header format, Packet
forwarding, Fragmentation and reassembly, ICMP, Comparative study of IPv4 & IPv6.
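Illustrative sketch for Unit-IV (not part of the prescribed syllabus): least-cost routing with Dijkstra's algorithm can be sketched with a priority queue. The function name, the adjacency-dictionary representation, and the four-router topology with its link costs are hypothetical; the sketch assumes non-negative link costs.

```python
import heapq


def dijkstra(graph, source):
    """Return the least-cost distance from source to every reachable node.

    graph maps each node to a dict of {neighbour: link_cost}; all costs
    are assumed to be non-negative.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for v, cost in graph.get(u, {}).items():
            candidate = d + cost
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                heapq.heappush(heap, (candidate, v))
    return dist


# Hypothetical four-router topology with symmetric link costs.
example_graph = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}

distances = dijkstra(example_graph, "A")
```

In this topology the cheapest route from A to D goes through B and C rather than over the direct B-D link, which is the kind of result a link-state routing table would record.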
Unit-V
Transport Layer: Design Issues, UDP: Header Format, Per-Segment Checksum,
Carrying Unicast/Multicast Real-Time Traffic, TCP: Connection Management, Reliability of
Data Transfers, TCP Flow Control, TCP Congestion Control, TCP Header Format, TCP
Timer Management. Application Layer: WWW and HTTP, FTP, SSH, Email (SMTP, MIME,
IMAP), DNS, Network Management (SNMP).
References:
List of Experiments:
Course Outcomes:
1. Students should be able to understand the concepts and challenges of Big Data.
2. Students should be able to demonstrate knowledge of Big Data analytics.
3. Students should be able to develop Big Data solutions using the Hadoop ecosystem.
4. Students should be able to gain hands-on experience with large-scale analytics tools.
5. Students should be able to analyse social network graphs.
Course Content
Unit 1: Introduction to Big Data, Big Data characteristics, Types of Big Data, Traditional
versus Big Data, Evolution of Big Data, Challenges with Big Data, Technologies available for
Big Data, Infrastructure for Big Data, Use of Data Analytics, Desired properties of a Big
Data system.
Unit 3: Introduction to Hive, Hive Architecture, Hive Data types, Hive Query
Language, Introduction to Pig, Anatomy of Pig, Pig on Hadoop, Use Case for Pig, ETL
Processing, Data types in Pig, Running Pig, Execution model of Pig, Operators, Functions.
TEXT BOOKS:
1. Parasram, S. V. N. (2020). “Digital Forensics with Kali Linux, Second Edition:
Perform Data Acquisition, Data Recovery, Network Forensics, and Malware Analysis
with Kali Linux”. United Kingdom: Packt Publishing, Limited.
2. Carrier, B. (2005). “File System Forensic Analysis”. United Kingdom: Pearson
Education.
REFERENCE BOOKS:
1. Kruse, W. G., Heiser, J. G. (2001). Computer Forensics: Incident Response
Essentials. (n.p.): Pearson Education.
2. Parasram, S. V. N., Joseph, D., Samm, A. (2017). Digital Forensics with Kali Linux:
Perform Data Acquisition, Digital Investigation, and Threat Analysis Using Kali
Linux Tools. United Kingdom: Packt Publishing.
RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA, BHOPAL
UNIT-I
Object Oriented and Object Relational Databases: Modeling Complex Data Semantics,
Specialization, Generalization, Aggregation and Association, Objects, Object Identity and its
implementation, Clustering, Equality and Object Reference, Architecture of Object Oriented
and Object Relational databases, Persistent Programming Languages, Cache Coherence. Case
Studies: GemStone, O2, ObjectStore, SQL3, Oracle xxi, DB2.
UNIT-II
Deductive Databases: Datalog and Recursion, Evaluation of Datalog programs, Recursive
queries with negation. Parallel and Distributed Databases: Parallel architectures, shared
nothing/shared disk/shared memory based architectures, Data partitioning, Intra-operator
parallelism, pipelining. Distributed Data Storage: Fragmentation & Replication, Location
and Fragment Transparency, Distributed Query Processing and Optimization, Distributed
Transaction Modeling and Concurrency Control, Distributed Deadlock, Commit Protocols,
Design of Parallel Databases, and Parallel Query Evaluation.
UNIT-III
Advanced Transaction Processing: Advanced transaction models: Savepoints, Nested and
Multilevel Transactions, Compensating Transactions and Sagas, Long Duration Transactions,
Weak Levels of Consistency, Transaction Work Flows, Transaction Processing Monitors,
Shared disk systems.
UNIT-IV
Active Databases and Real Time Databases: Triggers in SQL, Event, Condition and Action
(ECA) Rules, Query Processing and Concurrency control, Recursive query processing,
Compensation and Database Recovery, multi-level recovery.
UNIT-V
Image and Multimedia Databases: Modeling and Storage of Image and Multimedia Data, Data
Structures: R-tree, k-d tree, Quad trees, Content Based Retrieval: Color Histograms,
Textures, etc., Image Features, Spatial and Topological Relationships, Multimedia Data
Formats, Video Data Model, Audio & Handwritten Data, Geographic Information Systems
(GIS).
WEB Database
Accessing Databases through WEB, WEB Servers, XML Databases, Commercial Systems –
Oracle xxi, DB2.
BOOKS
1. Elmasri, “Fundamentals of Database Systems”, 4th Edition, Pearson Education.
2. R. Ramakrishnan, “Database Management Systems”, 1998, McGraw Hill
International Editions.
3. A. K. Elmagarmid, “Database Transaction Models for Advanced Applications”, Morgan
Kaufmann.
4. J. Gray and A. Reuter, “Transaction Processing: Concepts and Techniques”, Morgan
Kaufmann.
5. S. Abiteboul, R. Hull and V. Vianu, “Foundations of Databases”, 1995, Addison-
Wesley Publishing Co., Reading, Massachusetts.
6. W. Kim, “Modern Database Systems”, 1995, ACM Press, Addison-Wesley.
7. D. Maier, “The Theory of Relational Databases”, 1993, Computer Science Press,
Rockville, Maryland.
UNIT-I
Introduction - History of IR - Components of IR - Issues - Open source search engine
frameworks - The impact of the web on IR - The role of artificial intelligence (AI) in IR - IR
versus Web Search - Components of a search engine - Characterizing the web.
UNIT-II
Boolean and Vector space retrieval models - Term weighting - TF-IDF weighting - Cosine
similarity - Preprocessing - Inverted indices - Efficient processing with sparse vectors -
Language Model based IR - Probabilistic IR - Latent Semantic Indexing - Relevance feedback
and query expansion.
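Illustrative sketch for UNIT-II (not part of the prescribed syllabus): TF-IDF weighting and cosine similarity in the vector space model can be sketched with sparse dictionaries. The function names, whitespace tokenisation, the raw-count term frequency with log(N/df) inverse document frequency, and the three sample documents are assumptions made for illustration; other weighting variants are common.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build one sparse {term: weight} vector per document.

    Weight = raw term frequency * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        vectors.append(
            {t: term_freq[t] * math.log(n_docs / doc_freq[t]) for t in term_freq}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical three-document collection.
sample_docs = ["web search engine", "web search ranking", "naive bayes classifier"]
vectors = tf_idf_vectors(sample_docs)
```

Only terms present in a document are stored, so the dot product touches just the terms of one vector; this is the sparse-vector efficiency the unit refers to.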
UNIT-III
Web search overview, web structure, the user, paid placement, search engine optimization,
Web Search Architectures - crawling - meta-crawlers, Focused Crawling - web indexes -
Near-duplicate detection - Index Compression - XML retrieval.
UNIT-IV
Link Analysis - Hubs and authorities - PageRank and HITS algorithms - Searching and
Ranking - Relevance Scoring and ranking for Web - Similarity - Hadoop & MapReduce -
Evaluation - Personalized search - Collaborative filtering and content-based recommendation
of documents and products - Handling the invisible Web - Snippet generation - Summarization -
Question Answering - Cross-Lingual Retrieval.
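Illustrative sketch for UNIT-IV (not part of the prescribed syllabus): PageRank can be approximated by repeated redistribution of rank along links (power iteration). The function name, the damping factor of 0.85, the fixed iteration count, and the four-page link graph are assumptions for illustration; the sketch also assumes every link target appears as a key in the graph.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    links maps each page to the list of pages it links to. Every link
    target is assumed to appear as a key. Pages without out-links spread
    their rank evenly over all pages.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph: C is linked to by every other page.
example_links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(example_links)
```

The ranks form a probability distribution over pages; in this graph the page with the most in-links receives the highest score and the page with no in-links the lowest.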
UNIT-V
Information filtering: organization and relevance feedback - Text Mining - Text classification
and clustering - Categorization algorithms: naive Bayes, decision trees and nearest neighbor -
Clustering algorithms: agglomerative clustering, k-means, expectation maximization (EM).
References:
1. C. Manning, P. Raghavan and H. Schütze, Introduction to Information Retrieval,
Cambridge University Press, 2008.
2. Ricardo Baeza-Yates and Berthier Ribeiro-Neto, Modern Information Retrieval: The
Concepts and Technology behind Search, 2nd Edition, ACM Press Books, 2011.
3. Bruce Croft, Donald Metzler and Trevor Strohman, Search Engines: Information
Retrieval in Practice, 1st Edition, Addison Wesley, 2009.
4. Mark Levene, An Introduction to Search Engines and Web Navigation, 2nd Edition,
Wiley, 2010.
Course Contents:
Unit-I: Fundamentals of Agile Process: Introduction and background, Agile Manifesto and
Principles, Stakeholders and Challenges, Overview of Agile Development Models: Scrum,
Extreme Programming, Feature Driven Development, Crystal, Kanban, and Lean Software
Development.
Unit-II: Agile Projects: Planning for Agile Teams: Scrum Teams, XP Teams, General Agile
Teams, Team Distribution; Agile Project Lifecycles: Typical Agile Project Lifecycles, Phase
Activities, Product Vision, Release Planning: Creating the Product Backlog, User Stories,
Prioritizing and Estimating, Creating the Release Plan; Monitoring and Adapting: Managing
Risks and Issues, Retrospectives.
Unit-V: Agile Software Design and Development: Agile design practices, Role of design
Principles, Need and significance of Refactoring, Refactoring Techniques, Continuous
Integration, Automated build tools, Version control; Agility and Quality Assurance: Agile
Interaction Design, Agile approach to Quality Assurance, Test Driven Development, Pair
programming: Issues and Challenges.
Recommended Books:
1. Robert C. Martin, Agile Software Development: Principles, Patterns and Practices,
Prentice Hall, 2013.
2. Kenneth S. Rubin, Essential Scrum: A Practical Guide to the Most Popular Agile
Process, Addison-Wesley, 2012.
3. James Shore and Shane Warden, The Art of Agile Development, O’Reilly Media,
2007.
4. Craig Larman, Agile and Iterative Development: A Manager’s Guide, Addison-
Wesley, 2004.
5. Ken Schwaber, Mike Beedle, Agile Software Development with Scrum, Pearson,
2001.
6. Cohn, Mike, Agile Estimating and Planning, Pearson Education, 2006.
7. Cohn, Mike, User Stories Applied: For Agile Software Development, Addison-Wesley,
2004.
Online Resources:
1. IEEE Transactions on Software Engineering
2. IEEE Transactions on Dependable and Secure Computing
3. IET Software
4. ACM Transactions on Software Engineering and Methodology (TOSEM)
5. ACM SIGSOFT Software Engineering Notes
Detailed Contents:
UNIT I: Introduction: Origins and challenges of NLP, Human languages, Application and
Goals of NLP, Main approach of NLP, Knowledge in speech and language processing,
Ambiguity, Models and algorithms, Formal language and Natural Language, Regular
Expression, and automata.
UNIT II: Text Pre-processing, Tokenization, Feature Extraction from text. Morphology:
Inflectional morphology, Derivational morphology, Finite state morphological parsing,
Morphology and Indian languages. Part of Speech Tagging: Rule based, Stochastic POS,
HMM, Transformation based tagging (TBL). N-Grams: Simple N-grams, Smoothing,
Backoff, Entropy. Handling of unknown words, Named entities, Multi word expressions.
UNIT III: Parsing: Syntactic and statistical parsing, parsing algorithms, hybrid of rule based
and probabilistic parsing, scope ambiguity and attachment ambiguity resolution, Tree banks.
Discourse and dialogue: discourse and dialogue analysis, anaphora resolution, named entity
resolution, event anaphora, Information extraction and retrieval. Hidden Markov and
Maximum Entropy models, Viterbi algorithms and EM training.
UNIT IV: Semantic analysis, Semantic attachments – Word Senses, Relations between
Senses, Word Sense Disambiguation, WSD using Supervised, Dictionary & Thesaurus,
Bootstrapping methods – Word Similarity using Thesaurus and Distributional methods.
Compositional semantics.
Speech Processing: Speech and phonetics, Vocal organ, Phonological rules and Transducer,
Probabilistic models: Spelling error, Bayesian method to spelling, Minimum edit distance,
Bayesian method of pronunciation variation.
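Illustrative sketch for the Speech Processing topics (not part of the prescribed syllabus): minimum edit distance between two strings can be computed by dynamic programming. The function name is hypothetical, and the sketch assumes unit cost for insertion, deletion, and substitution (the Levenshtein variant); some textbooks charge 2 for substitution instead.

```python
def min_edit_distance(source, target):
    """Levenshtein distance with unit insert, delete and substitute costs.

    Only the previous row of the dynamic-programming table is kept, so
    memory use is proportional to len(target).
    """
    previous = list(range(len(target) + 1))  # distance from "" to target[:j]
    for i, s_char in enumerate(source, start=1):
        current = [i]  # distance from source[:i] to ""
        for j, t_char in enumerate(target, start=1):
            substitution = 0 if s_char == t_char else 1
            current.append(
                min(
                    previous[j] + 1,                 # delete s_char
                    current[j - 1] + 1,              # insert t_char
                    previous[j - 1] + substitution,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


distance = min_edit_distance("kitten", "sitting")
```

A spelling corrector could use such a distance to rank candidate corrections, with the Bayesian methods listed above supplying the probabilities of each candidate.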
Reference Books:
1. Breck Baldwin, Language Processing with Java and LingPipe Cookbook, Atlantic
Publisher.
2. Richard M. Reese, Natural Language Processing with Java, O’Reilly Media.
3. Nitin Indurkhya and Fred J. Damerau, Handbook of Natural Language Processing,
Chapman and Hall/CRC Press.
4. Tanveer Siddiqui, U.S. Tiwary, Natural Language Processing and Information
Retrieval, Oxford University Press.