MTech Data Science Syllabus
Vision
"To build a rich intellectual potential embedded with interdisciplinary knowledge, human values and
professional ethics among the youth, aspirant of becoming engineers and technologists, so that they
contribute to society and create a niche for a successful career".
Mission
"To become a leading and unique institution of higher learning, offering state-of-the-art education,
research and training in engineering and technology to students who are able and eager to become
change agents for the industrial and economic progress of the nation. To nurture and sustain an
academic ambience conducive to the development and growth of committed professionals for
sustainable development of the nation and to accomplish its integration into the global economy. To
strive to be a Swachch NIT providing a guarantee of clean hostel, clean mess, clean water and
hygienic & wholesome food to its students".
Vision
To be recognized globally for imparting computer science education and research of high distinction,
both of value and relevance to society.
Mission
M1: To impart contemporary knowledge and skill relevant to the field of Computer Science and
Engineering to maximize employability and potential.
M2: To strengthen multifaceted competence in the different core and allied areas of Computer
Science in order to nurture creativity, innovations and out-of-the-box thinking.
M3: To promote research and expertise in Computer Science and Engineering in order to serve the
needs of Industry, Government and Society and motivate the students for lifelong learning.
M4: To inculcate professional ethics and social values through co-curricular and extra-curricular
activities for holistic nation building.
Dr. B R AMBEDKAR NATIONAL INSTITUTE OF TECHNOLOGY, JALANDHAR
G T Road Bye Pass, Jalandhar-144011, Punjab (India)
Department of Computer Science and Engineering
Scheme of M. Tech. in Data Science & Engineering
PEO1: Analyze, design and formulate complex research problems by applying in-depth domain
knowledge and research principles.
PEO4: Engage in lifelong learning and remain an active learner, pursuing technical and professional
growth along with the betterment of society and the country.
PO1: An ability to independently carry out research/investigation and development work to solve
practical problems.
PO3: Students should be able to demonstrate a degree of mastery over the area as per the
specialization of the program. The mastery should be at a level higher than the requirements in the
appropriate bachelor program.
PSO1: Apply mathematical, statistical concepts and programming skills to analyse and develop
innovative solutions for the problems in the domain of data science.
PSO2: Demonstrate inferential analysis and enrich skills required to qualify for Employment,
Higher studies and Research in Data science with ethical values.
Eligibility criteria for admission to M.Tech (Data Science & Engineering), Full Time
(CCMT & Self-Sponsored), for the Academic batch 2023 onwards
The following are the relevant branches for the M. Tech. Data Science & Engineering (CCMT) Programme:
B.E./B.Tech. in Computer Engineering, Computer Science, Computer Science and
Engineering, Computer Science and Information Technology, Computer Technology,
Information Technology, Computer and Communication Engineering, Computer Engineering
and Application, Computer Networking, Computer Science and Systems Engineering,
Computer Science and Technology, Computing in Computing, Computing in Multimedia,
Computing in Software, Information and Communication Technology, Information
Engineering, Information Science, Information Science and Engineering, Software
Engineering, Information Technology and Engineering, and other relevant branches in Computer
Science and Information Technology.
i. GATE Qualified candidates shall be admitted without any entrance test, based upon the GATE
score.
ii. Non-GATE candidates shall be admitted based upon the admission test to be carried out by the
department.
The reservation rules will be followed as per Government of India guidelines prevailing at the
time of admission, at the Institute level.
Number of seats:
Programme                                No. of seats (CCMT)   No. of seats (Self-sponsored)
M. Tech. in Data Science & Engineering   20                    10
SEMESTER II
                                                                   Teaching Load
S. No  Course Code  Course Title                                   L   T   P   Credit
1.     CS-542       Big Data Analytics                             3   0   0   3
2.     CS-546       Deep Learning                                  3   0   0   3
3.     CS-602       Data Engineering and Visualization             3   0   0   3
4.     CS-606       Optimization Techniques for Data Science       3   0   0   3
5.     CS-6XX       Elective III                                   3   0   0   3
6.     CS-6XX       Elective IV                                    3   0   0   3
7.     CS-622       Data Engineering and Visualization Laboratory  0   0   3   2
8.     CS-624       Big Data Analytics Laboratory                  0   0   3   2
       TOTAL                                                       18  0   6   22
SEMESTER III
                                                                   Teaching Load
S. No  Course Code  Course Title                                   L   T   P   Credit
1.     CS-601       Project Seminar / Independent study            0   0   6   3
2.     CS-600       Dissertation Phase-I                           0   0   12  6
       TOTAL                                                       0   0   18  9
SEMESTER IV
                                                                   Teaching Load
S. No  Course Code  Course Title                                   L   T   P   Credit
1.     CS-600       Dissertation Phase-II                          0   0   24  12
       TOTAL                                                       0   0   24  12
ELECTIVE I and II
Elementary Data Structures and Complexity Analysis: Overview of Basic Data Structures: Arrays,
Linked List, Stack, Queues. Implementation of Sparse Matrices, Algorithm Complexity: Average,
Best- and worst-case analysis, asymptotic notations, Simple Recurrence Relations and use in
algorithm analysis.
Search Structures: Binary search trees, AVL trees, 2-3 trees, 2-3-4 trees, Red-black trees, B-trees,
weak AVL tree.
Graph Algorithms: Representation of Graphs, Traversals, Single-source shortest path Algorithms,
All-pairs shortest path algorithms, Subgraphs, Disjoint Graphs, Connected Components,
Articulation Points, Spanning tree, Minimum Spanning Tree Algorithms, Topological sort.
String Matching Algorithms: Introduction, The Brute-Force Algorithm, Rabin-Karp
Algorithm, String Matching with Finite automata, Knuth-Morris-Pratt Algorithm,
Aho-Corasick algorithm.
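As an illustration of one algorithm named in this unit, the Knuth-Morris-Pratt approach can be sketched as below. This is a minimal sketch rather than a prescribed implementation; the function names `build_failure` and `kmp_search` are illustrative assumptions and do not come from the syllabus.

```python
def build_failure(pattern):
    """Failure (prefix) table: fail[i] is the length of the longest proper
    prefix of pattern[:i + 1] that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0  # length of the currently matched prefix
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]  # fall back to a shorter border
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_search(text, pattern):
    """Return the start indices of all (possibly overlapping) matches."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    matches = []
    k = 0  # number of pattern characters matched so far
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches
```

The text is scanned once and the pattern index only moves back through the failure table, so the search takes time linear in the combined length of text and pattern.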
Heap Structures: Min-max heaps, Deaps, Leftist heaps, Binomial heaps, Fibonacci heaps, Skew
heaps
Multimedia Structures: Segment trees, k-d trees, Point Quad trees, MX-Quad trees, R-trees,
Splay tree; Bags and Sets.
Text Books:
1. E. Horowitz, S. Sahni and Dinesh Mehta, Fundamentals of Data Structures in C++,
Galgotia, 1999.
2. Adam Drozdek, Data Structures and Algorithms in C++, Second Edition, Thomson
Learning, Vikas Publishing House, 2001.
Reference Books:
1. G. Brassard and P. Bratley, Algorithmics: Theory and Practice, Prentice-Hall, 1988.
2. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, "Introduction to
Algorithms", PHI.
Department: Computer Science and Engineering
Programme: M. Tech. in DSE
Course Title: Statistical Foundations of Data Science
Course Code: CS-603
Course Designation: Core
Credit Scheme (L T P C): 3 0 0 3
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Gain a basic understanding of different types of statistical and probability
models for data science.
CO2 Understand the concept of random variables and applications of random
variables in probability distributions.
CO3 Gain knowledge of sampling theory, distributions and hypothesis testing.
CO4 Apply hypothesis testing, interval estimation and point estimation in
advanced applications and analysis.
Topics to be covered:
Introduction: Big Data and Data Science; Current landscape of perspectives; Impact of Big Data;
Impact of Dimensionality; Aims of High-dimensional statistical learning; Statistics: Descriptive and
Inferential principles; Real-world problems.
Probability: The concept of probability, The axioms of probability, Some important theorems on
Probability, Assignment of Probabilities, Conditional Probability, Theorems on conditional
probability, Independent Events, Bayes’ Theorem.
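To make the last item concrete, Bayes' theorem for a binary hypothesis H and evidence E reads P(H|E) = P(E|H) P(H) / [P(E|H) P(H) + P(E|not H) P(not H)]. The sketch below evaluates it for a hypothetical screening test; the function name `posterior` and all the numbers are illustrative assumptions, not values from the syllabus.

```python
def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) by Bayes' theorem for a binary hypothesis H.

    prior               = P(H)
    likelihood          = P(E | H)
    false_positive_rate = P(E | not H)
    """
    # Total probability of the evidence (law of total probability).
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
```

With these assumed numbers the posterior is 0.0095 / 0.059, roughly 0.16: even after a positive result the hypothesis remains unlikely because the prior is small.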
Random Variables: Random variables, Discrete probability distributions, Distribution functions for
discrete random variables, Continuous probability distribution, Distributions for continuous random
variables, joint distributions, Independent random variables, Mathematical Expectation.
Probability Distributions: The Binomial Distribution, The Normal Distribution, The Poisson
Distribution, Relations between different distributions, Central limit theorem, Uniform distribution,
Chi-square Distribution, Exponential distribution; Maximum Likelihood Estimator.
Analysis of variance: One Way Classification: ANOVA for fixed effect model, ANOVA for
Random Effect Model, Two-way Classification (one observation per cell): ANOVA for fixed effect
model, ANOVA for Random Effect Model.
Text Books:
1. Fan, J., Li, R., Zhang, C.-H., and Zou, H. (2020). Statistical Foundations of Data
Science. CRC Press.
Reference Books:
1. Carlos Fernandez-Granda, Probability and Statistics for Data Science,
Center for Data Science, NYU, 2017, available online.
2. Avrim Blum, John Hopcroft, and Ravi Kannan, Foundations of Data
Science, Cambridge University Press, March 2020, available online.
3. Dirk P. Kroese, Zdravko Botev, Thomas Taimre, Radislav Vaisman, Data
Science and Machine Learning: Mathematical and Statistical Methods, CRC
Press.
4. James D. Miller, Statistics for Data Science, Packt Publishing, 2019.
5. Davy Cielen, Arno D. B. Meysman, and Mohamed Ali, Introducing Data
Science, Manning Publications, 2016, available online.
Learning
Forms of Learning, Supervised Learning and Unsupervised Learning, Regression and Classification,
Support vector machine, Artificial Neural Networks, Ensemble Learning, and Reinforcement
Learning.
Decision Tree Learning: Decision tree representation, appropriate problems for decision tree
learning, Univariate Trees (Classification and Regression), Multivariate Trees, Basic Decision Tree
Learning algorithms, Hypothesis space search in decision tree learning, Inductive bias in decision
tree learning, Issues in decision tree learning.
Support Vector Machine: Maximum margin linear separators, Quadratic Programming Solution to
finding maximum margin separators, Kernels for learning non-linear functions.
Artificial Neural Network: Neural network representation, Gradient Descent, Back propagation
Algorithm.
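The gradient descent update that underlies backpropagation can be illustrated on a one-dimensional function. This is only a sketch under stated assumptions: the objective f(x) = (x - 3)^2, the learning rate, the step count and the function names are chosen for illustration and are not taken from the syllabus.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient: x <- x - learning_rate * grad(x)."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    # Derivative of the assumed objective f(x) = (x - 3)^2.
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
```

For this quadratic each step shrinks the distance to the minimiser x = 3 by a factor of 0.8, so the iterate converges quickly; in a neural network the same update is applied to every weight, with the gradients supplied by backpropagation.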
Fairness & Consent: Algorithmic Fairness, Correct but Misleading results, The Ethical Cycle,
Principle of Informed Consent, Data Ownership, Limit on Usage, Five Cs, Ethics and Technology
Privacy: Ethics and Security Training, Regulation, Ethics in Data-Driven Culture, History of
Privacy, Degree of Privacy, Modern Privacy Risks, Privacy in Targeted Ads
Data Validity: Validity of data, Errors in Data Processing, Errors in Design, Choice of Measures,
Managing Change
Ethics in Action: Societal Implications and Consequences, Ossification and Surveillance, Medical
Applications and Implications, Law enforcing Applications
Basics of Intellectual Property Law: Introduction to Intellectual Property Law; Evolutionary past;
Intellectual Property Law Basics; Types of Intellectual Property; Innovations and Inventions of Trade
related Intellectual Property Rights; Introduction to Copyrights; Principles of Copyright;
Fundamentals of Patent and Cyber Law: Introduction to Patent Law; Rights and Limitations;
Introduction to Cyber Law; Information Technology Act; and Cyber Crime and E-commerce
Text Books:
1. Ethics and Data Science, by Mike Loukides, Hilary Mason, and DJ Patil (2018)
nodes by changing link, Segregating odd & even nodes together, binary Search, number of
elements in a loop, print nth element from the last, finding the middle element.
4. Implementation of binary trees and various operations based on them such as preorder/ inorder/
postorder traversal using stack, level order traversal, level order traversal in spiral form, left/
right/ top/ bottom view of the tree, vertical order traversal, printing sum of inorder predecessor
& successor for each node.
5. Implementation of binary search tree and various problem solving based on them such as finding
minimum/ maximum element, traversal in ascending/ descending order, kth largest & kth
minimum element, converting binary tree to binary search tree, finding a pair with a given sum.
6. Implementation of AVL trees and various operations based on them such as insertion, deletion
and traversal.
7. Implementation of Red/Black trees and various operations based on them such as insertion,
deletion and traversal.
8. Implementation of B trees and various operations based on them such as insertion, deletion and
traversal.
9. Implementation of heaps & deaps along with various operations based on them such as insertion,
deletion, extracting minimum/ maximum element, heap sort, priority queues.
10. Implementation of Fibonacci & binomial heaps with various operations based on them such as
insertion, union, deletion, extracting minimum/ maximum element.
11. Implementation of various graph-based algorithms: Dijkstra’s shortest path, Warshall’s all-pairs
shortest path, breadth- & depth-first search.
12. Implementation of greedy algorithms like Kruskal’s & Prim’s to find the minimum spanning tree
from a given set of nodes and edges.
13. Implementation of various string matching algorithms like brute force, Rabin-Karp,
Knuth-Morris-Pratt and using finite automata.
9. Programs for loading and creating data sets and performing various operations on the data.
10. Programs for Data visualization.
Course Assessment Methods: Assignments for each topic to be evaluated in the lab, and final
evaluation at the end which includes Viva Voce, Conduct of experiment.
Course Outcomes
The course will enable the student to:
CO1 Acquire the basics needed to build machine learning models in Python using the popular
libraries NumPy and scikit-learn.
CO2 Gain hands-on knowledge of methods and theories in the field of machine
learning, covering its principles, techniques, and applications.
CO3 Apply best practices for machine learning development for model
generalization to data and tasks in the real world.
List of Practicals (This is only the suggested list of Practicals. Instructor may frame additional
Practicals relevant to the course contents.):
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Understand Big Data and its Applications.
CO2 Understand the fundamentals of Hadoop and Yarn.
CO3 Apply the various fundamentals of Big Data analytics using Hadoop
ecosystem tools.
CO4 Analyze the different types of data which can be handled by NoSQL.
Topics to be covered:
Introduction: Big Data Overview, Big data analytics in industry verticals, Data Storage and
Analysis
Pig & Hive: Introduction to Pig, Execution modes, Data-Processing, Grouping and Joining in Pig,
Hive Shell, Hive Query Language, Hive Configuration, Tables, Partitioning, Serialization and
Deserialization, Bucketing.
Spark: Resilient Distributed Datasets, Transformations and Actions, Dataframes and SparkSQL
NoSQL: Different types of databases, CAP theorem, HDFS as a foundation for NoSQL, Data
Control, definition and manipulation, Types of NoSQL databases, Advantages, NewSQL,
Comparison of SQL, NoSQL and NewSQL.
Text Books:
1. Tom White, “Hadoop: The Definitive Guide”, Third Edition, O’Reilly
Media, 2012.
2. David Loshin, Big Data Analytics: From Strategic Planning to Enterprise Integration with
Tools, Techniques, NoSQL, and Graph.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Deep Neural Networks: Gradient Descent, Backpropagation, Vanishing & exploding gradient
problem, Empirical Risk Minimization, regularization.
Recurrent Neural Networks: RNN vs CNN, Back propagation through time, Long Short Term
Memory, Gated Recurrent Units, Bidirectional LSTMs, Bidirectional RNNs.
Applications of Deep Learning: Large Scale Deep Learning, Computer Vision, Speech
Recognition, Natural Language Processing, Other Applications.
Text Books:
1. Deep Learning, Ian Goodfellow, Yoshua Bengio and Aaron Courville,
MIT Press, 2016.
2. Deep Learning with Python, Francois Chollet, 2017.
Reference Books:
1. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow,
Aurélien Géron, O’Reilly, Second Edition, 2019.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Comprehend the practice of iterative, methodical exploration of an
organization’s data with emphasis on statistical analysis to automate and
CO2 To organize and critically apply the concepts and methods of business
analytics that support the decision process in business operations.
Prescriptive analytics, creating data for analytics through designed experiments, creating data for
analytics through Active learning, creating data for analytics through Reinforcement learning
Data Visualization: Use of Data Visualization, Data Examples, Basic Charts, Multidimensional
Visualization, and Specialized visualization: Network Data, Hierarchical Data, Geographical Data.
Text Books:
1. Shmueli, G., Bruce, P. C., Yahav, I., Patel, N. R., & Lichtendahl Jr, K. C.
(2017). Data Mining for Business Analytics: Concepts, Techniques, and
Applications in R. John Wiley & Sons.
2. Han, J., Pei, J., & Tong, H. (2022). Data Mining: Concepts and Techniques.
Morgan Kaufmann.
Reference Books:
1. Hastie, T., Tibshirani, R. and Friedman, J., The Elements of Statistical
Learning, Springer (2009), 2nd Edition.
2. Simon, P., The Visual Organization: Data Visualization, Big Data, and the
Quest for Better Decisions, John Wiley & Sons (2014).
3. Glenn J. Myatt, Making Sense of Data: A Practical Guide to Exploratory Data Analysis and
Data Mining.
techniques
CO2 To build the foundation for unconstrained and convex optimization
techniques
CO3 Implement and analyse the Optimization Techniques
Topics to be covered:
Introduction to Linear Programming: Connections with Geometry. Simplex Method: Duality
Theorem. Complementary Slackness. Farkas' Lemma. Revised Simplex Method.
Convex Optimization (CO): Convex sets, functions, and programs and their properties, Basics of
convex analysis, cones, Optimality conditions, introduction to duality theory, theorems of
alternatives, Algorithms: Unconstrained minimization, Descent methods, Newton’s method, and
Interior-point methods.
Fuzzy Systems: Fuzzy set Theory, Optimization of Fuzzy systems. Non-traditional optimization
algorithms.
Application of the CO machinery in Data Science: Maximum likelihood and elements of Robust
Statistics
Regularization: ridge regression, LASSO, and their analysis, Elements of Bayesian Statistics, SVD,
Matrix norms, Robust Principal Component Analysis, Matrix completion, Examples from
engineering and machine learning.
Text Books:
1. C. S. Beightler, D. T. Phillips and D. J. Wilde, "Foundations of Optimization",
Prentice Hall, 1979.
Reference Books:
2. E. K. P. Chong and S. H. Zak, "An Introduction to Optimization", John Wiley and
Sons, 2013.
3. A. Beck, First-Order Methods in Optimization, MOS-SIAM Series on
Optimization, 2017.
4. K. Lange, "Optimization", Springer-Verlag, 2004.
5. David G. Luenberger (1969). Optimization by Vector Space Methods, John Wiley
& Sons (NY).
Credit Scheme (L T P C): 0 0 3 2
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Prepare for data summarization, query, and analysis.
CO2 Apply data modelling techniques to large data sets.
CO3 Build a complete business data analytic solution.
List of Experiments (This is only the suggested list of Practicals. The instructor may frame
additional Practicals relevant to the course contents.)
Data Analysis:
Apply Data Cleaning, Concepts of Exploratory Data Analysis on publicly available datasets like:
Charles Book Club, German Credit, Tayko Software Cataloger, Voter Persuasion, Taxi cancellation,
Bath Soap , Future Fund Raising, Bankruptcy, Titanic, etc.
Data Visualization:
Hands-on Visualization Tools like Knime, Tableau
2-D: bar charts, clustered bar charts, dot plots, connected dot plots, pictograms, proportional shape
charts, bubble charts, radar charts, polar charts, range charts, box-and-whisker plots,
univariate scatter plots, histograms, word cloud, pie chart, waffle chart, stacked bar chart,
back-to-back bar chart, treemap and all relevant 2-D charts.
3-D: surfaces, contours, hidden surfaces, pm3d coloring, 3D mapping; multi-dimensional data
visualization; manifold visualization; graph data visualization; Annotation.
ELECTIVE I AND II
Department: Computer Science and Engineering
Programme: M. Tech. in DSE
Course Title: Advanced Databases and Data Mining
Course Code: CS-503
Course Designation: Elective
Credit Scheme (L T P C): 3 0 0 3
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Understand the advanced concepts of Databases, security and privacy
issues and the techniques to handle the same.
CO2 Design and develop data warehouse and data mart models and their
applications.
CO3 Learn and apply data mining techniques to find hidden
patterns and turn them into useful knowledge.
CO4 Identify business applications and trends of data mining.
Topics to be covered:
Basics: Formal review of relational database and FDs Implication, Closure, its correctness
Normalization and Joins: 3NF and BCNF, Decomposition and synthesis approaches,
Review of SQL99, Basics of query processing, external sorting, file scan. Processing of
joins, materialized vs. pipelined processing, query transformation rules, DB transactions,
ACID properties, interleaved executions, schedules, serializability, Query optimization,
object-oriented databases, and multimedia databases.
Advanced Databases: Big Data Revolution - CAP Theorem - Birth of NoSQL - Document
Databases - XML Databases - JSON Document Databases - Graph Databases.
Column Databases - Data Warehousing Schemes - Columnar Alternative - Sybase IQ - C-Store and
Vertica - Column Database Architectures - SSD and In-Memory Databases - In-Memory Databases -
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Gain a basic understanding of the fundamental concepts of digital image
processing systems.
CO2 Evaluate the techniques for image enhancement and image restoration in the
spatial and frequency domains.
CO3 Categorize various compression techniques.
CO4 Interpret image segmentation and representation techniques.
CO5 Acquire an appreciation for image processing issues and techniques
and apply these techniques to real-world problems.
Topics to be covered:
Digital Image Fundamentals: Image understanding, Different stages of image processing and
analysis, Components of image processing system, Sampling and Quantization, Some basic
relationships between pixels such as neighbours, connectivity and distance measures.
Image Compression: Fundamentals of image compression, error criterion, Coding Inter-pixel and
Psycho visual redundancy, Image Compression models, Error free compression: Huffman,
Arithmetic, Run length Coding, Lossy Compression: Block Transform Coding based on DCT and
DWT, Image Compression standard: JPEG.
Morphological image processing: Basic Morphology concepts, Binary dilation and erosion,
Opening and Closing operations, Basic Morphological Algorithms: Boundary extraction, Hole
Filling, Extraction of Connected Components.
Image Segmentation and Edge Detection: Fundamentals, Point, Line and Edge Detection:
Detection of isolated points, lines, Basic Edge Detection, Advanced Edge detection using Canny
edge detector, Laplacian edge detector and Laplacian of Gaussian edge detector. Edge Linking and
Boundary Detection, Thresholding: Basic Global Thresholding and Optimum Global Thresholding
using Otsu’s Method, Region Based Segmentation: Region Growing, Region Splitting and Merging.
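As an illustration of optimum global thresholding, Otsu's method picks the grey level that maximises the between-class variance of the two resulting pixel groups. The sketch below is a plain-Python illustration under stated assumptions: the function name `otsu_threshold`, the flat list-of-integers input and the default of 256 grey levels are choices made here, not part of the syllabus.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximising between-class variance.

    Pixels with value <= t form class 0, the rest form class 1.
    `pixels` is assumed to be a flat list of integers in [0, levels).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    w0 = 0      # number of pixels in class 0 so far
    sum0 = 0    # sum of grey levels in class 0 so far
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue          # class 0 still empty
        w1 = total - w0
        if w1 == 0:
            break             # class 1 empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total^2.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t
```

When several thresholds give the same variance, as with a perfectly bimodal input, this sketch returns the smallest of them; library implementations may break such ties differently.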
Recognition and Interpretation: Pattern and pattern classes, Decision Theoretic methods:
minimum distance classifier, matching by correlation, Structural Methods: Matching Shape
Numbers.
Text Books:
1. Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing”,
Pearson Education, 3/e, 2007.
2. Milan Sonka, Vaclav Hlavac and Roger Boyle, “Digital Image Processing
and Computer Vision”, Cengage Learning, 2007.
3. Anil K. Jain, “Fundamentals of Digital Image Processing”, Pearson
Education, 1988.
Reference Books:
1. B. Chanda, “Digital Image Processing and Analysis”, PHI Learning Pvt.
Ltd., 2011.
2. William K. Pratt, “Digital Image Processing”, Wiley-Interscience, 4/e,
2007.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Explain the core concepts of the cloud computing paradigm.
CO2 Discuss system, network and storage virtualization and outline their role in
enabling the cloud computing system model.
CO3 Illustrate the fundamental concepts of cloud storage and demonstrate their
use in storage systems.
CO4 Analyze various cloud programming models and apply them to solve
problems on the cloud.
Topics to be covered:
Introduction: Cloud-definition, benefits, usage scenarios, History of Cloud Computing, Cloud
Architecture, Types of Clouds, Business models around Clouds, Major Players in Cloud Computing,
Research issues and challenges in Cloud Computing, Examples: Eucalyptus, Nimbus, OpenNebula,
Cloud simulator: CloudSim.
Cloud Services: Types of Cloud services: SaaS, IaaS, PaaS, CaaS,
Service providers – Google, Amazon, Microsoft Azure, IBM, Salesforce.
Resource Management: Server consolidation, Dynamic resource provisioning, Resource
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Articulate the protocols and standards designed for IoT.
CO2 Understand the various data analytics and machine learning techniques
used in IoT.
CO3 Critically analyze and compare the various IoT and WSN protocols.
CO4 Develop practical skills in implementing IoT solutions using a variety
of sensors.
Topics to be covered:
Introduction to IoT: Overview of IoT, IoT applications, enabling technologies and services, IoT
architecture, stack and protocols, IoT characteristics, levels and challenges, Cyber physical systems
vs IoT, Wireless sensor network vs IoT.
IoT Hardware and Sensors: Introduction to sensor interfacing, types of sensors (gas sensor,
obstacle sensor, heartbeat sensor, gyro sensor, GPS, pH sensor, etc.), controlling sensors through
webpages, IoT microcontrollers-8051, Advanced RISC Machine-ARM7.
IoT Protocols: Messaging and Transport protocol - Message Queuing Telemetry Transport (MQTT),
Constrained Application Protocol, XMPP and DDS Protocols, Extensible Messaging and Presence
Protocol (XMPP), Data Distribution Service (DDS). Transport Protocols - Bluetooth Low Energy,
Light Fidelity (Li-Fi). Addressing and Identification protocols - Internet Protocol Version 4 (IPv4),
Internet Protocol Version 6 (IPv6), IPv6 versus IPv4, Legacy of IPv4 Devices, Switching Over to
IPv6, Uniform Resource Identifier (URI).
IoT Cloud Computing and Edge Computing: Challenges, Selection of Cloud Service Provider:
An Overview, Introduction to Fog Computing, Working of Fog Computing, Difference Between
Edge and Fog Computing, Cloud Computing: Security Aspects
Data Analytics and Machine Learning: Data Analysis, Machine Learning — Visualising,
Supervised Learning, Unsupervised Learning, Types of Machine Learning Models - Classification,
Regression, and Clustering, Model Building process, Modelling algorithms, Model performance, Big
data platform and pipeline.
IoT Applications: IFTTT, IFTTT versus Other Cloud Services, Smart Perishable Tracking with IoT
and Sensors, Smart Healthcare: Elderly Fall Detection with IoT and Sensors, Smart Inflight
Lavatory Maintenance with IoT, IoT-Based Application to Monitor Water Quality, Smart
Warehouse Monitoring, Smart Retail: IoT Possibilities in the Retail Sector, Prevention of
Drowsiness of Drivers by IoT-Based Smart Driver Assistance, System to Measure Collision Impact
in an Accident with IoT, Integrated Vehicle Health Management.
Text Books
1. Shriram K Vasudevan, Abhishek S Nagarajan, RMD Sundaram, "Internet of Things", Wiley, 2020.
2. David Hanes, Gonzalo Salgueiro, Patrick Grossetete, Robert Barton, Jerome Henry, IoT
Fundamentals: Networking Technologies, Protocols, and Use Cases for the Internet of Things,
Cisco Press, 2017, ISBN: 1587144565.
Reference Books
1. Perry Lea, IoT and Edge Computing for Architects: Implementing edge and IoT systems from
sensors to clouds with communication systems, analytics, and security, 2nd Edition, Packt
Publishing, 2020, ISBN: 1839214805.
2. S. Misra, A. Mukherjee, and A. Roy, Introduction to IoT, Cambridge University Press, 2021,
ISBN: 1108959741.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Introduction: NLP introduction, origins of NLP, Language and Knowledge, The challenges of NLP,
Language and Grammar, Processing Indian Languages, NLP applications, Some successful Early
NLP systems, Information Retrieval
Word Level Analysis: Introduction, Regular Expressions, Finite State Automata, Morphological
Parsing, Spelling Error Detection and Correction, Words and Word Classes, Part-of-Speech Tagging.
Natural Language Generation: Introduction, Architecture of NLG Systems, Generation Tasks and
Representations, Application of NLG.
Lexical Resources: Introduction, WordNet, FrameNet, Stemmers, Part-of-Speech Tagger.
Text Books
1. D. Jurafsky and J. H. Martin, “Speech and Language Processing”, Prentice Hall, 2/e, 2008.
2. Tanveer Siddiqui and U. S. Tiwary, “Natural Language Processing and Information Retrieval”,
Oxford Higher Education, 2008.
3. L. M. Iwanska and S. C. Shapiro, “Natural Language Processing and Knowledge Representation”,
AAAI Press, 2000.
Reference Books
1. Akshar Bharati, Vineet Chaitanya and Rajeev Sangal, “NLP: A Paninian Perspective”, Prentice
Hall, New Delhi, 2004.
CO4 Evaluate and optimize search engines and develop optimized web sites.
Topics to be covered:
Introduction: Information retrieval process, Indexing, Information retrieval model, Boolean retrieval
model, Dictionary and Postings: Tokenization, Stop words, Stemming, Inverted index, Skip pointers,
Phrase queries,
Tolerant Retrieval: Wildcard queries, Permuterm index, Bigram index, Spelling correction, Edit
distance, Jaccard coefficient, Soundex, Evaluation: Precision, Recall, F-measure, Normalized recall,
Latent Semantic Indexing, Eigenvectors, Singular value decomposition, Low-rank approximation,
Problems with Lexical Semantics, Query Expansion: Relevance feedback, Rocchio algorithm,
Probabilistic relevance feedback, Query Expansion and its types, Query drift,
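Two of the spelling-correction measures named in the Tolerant Retrieval unit can be sketched in a few lines of Python. This is a minimal illustration only: the function names are hypothetical, the edit distance is the plain Levenshtein variant (unit cost for insert, delete and substitute), and the Jaccard coefficient is taken over unpadded character bigrams, which is one common choice and not something the syllabus prescribes.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def bigrams(term: str) -> set:
    """Set of character bigrams of a term (no boundary padding; an assumption)."""
    return {term[i:i + 2] for i in range(len(term) - 1)}


def jaccard(a: str, b: str) -> float:
    """Jaccard coefficient |X & Y| / |X | Y| over the bigram sets of two terms."""
    x, y = bigrams(a), bigrams(b)
    if not x and not y:
        return 1.0                          # convention chosen here for two empty sets
    return len(x & y) / len(x | y)
```

In a spelling corrector, the bigram overlap is typically used as a cheap filter for candidate terms, and edit distance is then computed only on the survivors.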
Content Based Image retrieval: Challenges in Image retrieval, Image representation, Indexing and
retrieving images, Relevance feedback, Open source Search engine Frameworks: Components of a
Search engine, Web search: The structure of the web, Queries and users, Static ranking, Dynamic
ranking, Evaluating web search, The user, paid placement, Search engine optimization/spam, Web
Search Architectures: Crawling, Meta-crawlers and Focused Crawling, Web Crawlers: Crawling the
web, Document feeds, Storing documents and detecting duplicates,
Index Compression: Statistical properties of terms in IR, Dictionary compression, Postings File
compression, XML retrieval: XML Indexing and Search, Data vs. Text-centric XML, Text-Centric XML
retrieval, A vector space model for XML retrieval, Link Analysis: hubs and authorities, PageRank and
HITS algorithms.
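The PageRank topic in the Link Analysis unit can be illustrated with a short power-iteration sketch. The function name, the damping factor of 0.85, the fixed iteration count and the even redistribution of rank from pages without out-links are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration for PageRank.

    links maps each page to the list of pages it links to.
    Pages with no out-links spread their rank evenly over all pages.
    """
    nodes = sorted(set(links) | {v for outs in links.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Every page receives the teleportation share first.
        new = {u: (1.0 - damping) / n for u in nodes}
        for u in nodes:
            outs = links.get(u, [])
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:
                for v in nodes:
                    new[v] += damping * rank[u] / n
        rank = new
    return rank
```

For example, `pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})` ranks page `c` highest, since both other pages link to it.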
Text Books
1. Introduction to Information Retrieval by C. Manning, P. Raghavan, and H. Schütze, Cambridge
University Press.
2. Modern Information Retrieval: The Concepts and Technology behind Search by Ricardo Baeza
Yates and Berthier Ribeiro Neto, ACM Press.
Reference Books
1. Search Engines: Information Retrieval in Practice by Bruce Croft, Donald Metzler and Trevor
Strohman, Addison Wesley.
2. Gerald Kowalski, Mark T. Maybury: Information Retrieval Systems: Theory and Implementation,
Kluwer Academic Press, 1997.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Fuzzy logic: Introduction to Fuzzy logic, Fuzzy sets and membership functions, Operations on Fuzzy
sets, Fuzzy relations, rules, propositions, implications and inferences, Defuzzification techniques,
Fuzzy logic controller design, Some applications of Fuzzy logic.
Genetic Algorithms: Concept of "Genetics" and "Evolution" and its application to probabilistic
search techniques, Basic GA framework and different GA architectures, GA operators: Encoding,
Crossover, Selection, Mutation, etc., Solving single-objective optimization problems using GAs.
Artificial Neural Networks: Biological neurons and their working, Simulation of biological neurons
for problem-solving, Different ANN architectures, Training techniques for ANNs, Applications of
ANNs to solve some real-life problems.
Text Books
1. Neural Networks, Fuzzy Logic and Genetic Algorithms: Synthesis & Applications, S. Rajasekaran,
G. A. Vijayalakshmi Pai, PHI.
Reference Books
1. Genetic Algorithms: Search and Optimization, D. E. Goldberg.
2. Neuro-Fuzzy Systems, Chin Teng Lin, C. S. George Lee, PHI.
requirements.
CO4 Development of case studies related to environmentally responsible
strategies.
Topics to be covered:
Fundamentals: Business, IT, and the Environment, carbon footprint, regulations and industry
initiatives, green maturity model for virtualization.
Operating system support: Power supply, Storage, video card, Display, web, temporal and spatial
data mining, materials recycling, Tele-computing.
Middleware support for green computing: Tools for monitoring, HPC computing, Green Mobile,
Embedded computing and networking, Management frameworks, Standards and metrics for
computing green.
Green Networking: Where to save energy in wired networking, Taxonomy of green networking
research, Adaptive link rate, Interface proxying, Energy-aware infrastructure, Energy-aware
application.
Green Cellular Networking: Survey, measuring greenness metrics, Energy saving in base stations,
Research issues, Challenges, Future generation wireless systems, Wireless sensor network for green
networking.
Green Compliance: Socio-cultural aspects of Green IT, Green Enterprise Transformation Roadmap,
Protocols, Standards, and Audits, Emergent Carbon Issues.
Case Studies: The Environmentally Responsible Business Strategies (ERBS), Case Study Scenarios
for Trial Runs, Applying Green IT Strategies and Applications to a Home, Hospital, Packaging
Industry and Telecom Sector.
Text Books
1. Bud E. Smith, “Green Computing: Tools and Techniques for Saving Energy, Money, and
Resources”, Auerbach Publications, 2018.
2. Jason Harris, “Green Computing and Green IT Best Practices on Regulations and Industry
Initiatives, Virtualization, Power Management, Materials Recycling and Telecommuting”, Emereo
Publishing, 2008.
3. Bhuvan Unhelkar, “Green IT Strategies and Applications-Using Environmental Intelligence”,
CRC Press, June 2011.
Reference Books
1. Toby Velte, Anthony Velte, Robert Elsenpeter, “Green IT: Reduce Your Information System's
Environmental Impact While Adding to the Bottom Line”, McGraw-Hill, 2008.
Course Outcomes
CO2 Illustrate the concepts of Bitcoin and their usage; the importance of consensus.
CO3 To deploy blockchain and Implement Ethereum blockchain smart
contracts.
CO4 To implement Blockchain for different use cases.
Topics to be covered:
Introduction: Need for Distributed Record Keeping, Modeling faults and adversaries, Byzantine
Generals problem, Multichain and its features, Consensus algorithms, Requirements for the
consensus protocols and their scalability problems, Nakamoto Concept with Blockchain based
cryptocurrency, Technologies Borrowed in Blockchain – hash pointers, consensus, byzantine fault-
tolerant distributed computing, digital cash etc.
Basic Distributed Computing and basic Crypto primitives: Atomic Broadcast, Consensus,
Byzantine Models of fault tolerance, Hash functions, Puzzle-friendly Hash, Collision-resistant hash,
digital signatures, public key crypto, verifiable random functions, Zero-knowledge systems.
Blockchain 1.0: Bitcoin blockchain, the challenges, and solutions, proof of work (PoW), Proof of
stake (PoS), alternatives to Bitcoin consensus, Bitcoin scripting language and their use.
Blockchain 2.0 and Blockchain 3.0: Ethereum and Smart Contracts, The Turing Completeness of
Smart Contract Languages and verification challenges, using smart contracts to enforce legal
contracts, comparing Bitcoin scripting vs. Ethereum Smart Contracts, Hyperledger fabric, the plug
and play platform and mechanisms in permissioned blockchain
Privacy, Security issues in Blockchain: Pseudo-anonymity vs. anonymity, Zcash and zk-SNARKs
for anonymity preservation, attacks on Blockchains such as Sybil attacks, selfish mining and 51%
attacks, advent of Algorand and sharding-based consensus algorithms to prevent these attacks.
Case Studies: Blockchain in Financial Service, Supply Chain Management and Government
Services.
Text Books
1. Melanie Swan, “Blockchain”, First Edition, O’Reilly, Jan 2015.
2. Narayanan, Bonneau, Felten, Miller and Goldfeder, “Bitcoin and Cryptocurrency Technologies –
A Comprehensive Introduction”, Princeton University Press.
3. Andreas M. Antonopoulos, “Mastering Bitcoin: Unlocking Digital Cryptocurrencies”, O’Reilly
Media Inc, 2015.
4. Josh Thompson, “Blockchain: The Blockchain for Beginners, Guide to Blockchain Technology
and Blockchain Programming”, Create Space Independent Publishing Platform, 2017.
5. Imran Bashir, “Mastering Blockchain: Distributed ledger technology, decentralization, and
smart contracts explained”, Packt Publishing.
Reference Books
1. Zero to Blockchain - An IBM Redbooks course, by Bob Dill, David Smits -
https://www.redbooks.ibm.com/Redbooks.nsf/RedbookAbstracts/crse0401.html
2. Hyperledger Fabric - https://www.hyperledger.org/projects/fabric
Binocular Stereopsis: Camera and Epipolar Geometry, Planar Scenes and Homography, Depth
estimation and multi-camera views, 3-D reconstruction, Auto-calibration, DLT and RANSAC,
Structure from Motion, Hough Transform, Interest Point Detection, Edge Detection, Local Binary
Pattern and its variants, Convolution and Filtering, Gaussian derivative filters, Gabor Filters, DWT,
Data Visualization and Dashboard Design: Job responsibilities of BI analysts like creating data
visualization and dashboards, importance of data visualization, types of visualization methods,
determining effective visualization method, common characteristics and types of dashboards,
guidelines for designing dashboard.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
White noise model, Random walks, moving average, Invertibility, ARIMA Models, Autoregressive
processes, Fitting an AR process, Yule–Walker equations, General linear process, Wold
decomposition theorem, Time series Forecasting, Exponential smoothing, Holt-Winters,
Box-Jenkins forecasting, Optimality models for exponential smoothing, Model selection for time series
forecasting.
Gaussian Process, Gaussian Regression, Vector autoregression models VAR, Structural Form,
Reduced Form, Parameter Estimation, Kernel Methods for forecasting, Adaptive filtering mechanism
for forecasting, Statistical Testing for stationarity, Augmented Dickey-Fuller, Kwiatkowski–
Phillips–Schmidt–Shin Test, Goodness of estimation.
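As a small illustration of the exponential smoothing topic listed in this unit, the sketch below implements simple (single) exponential smoothing with the recurrence s_t = alpha * x_t + (1 - alpha) * s_{t-1}. The function name and the choice to initialise the level with the first observation are assumptions; Holt-Winters adds trend and seasonal terms that are not shown here.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels s_t for a sequence of observations.

    s_0 is set to the first observation (one common initialisation),
    then s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    values = list(series)
    if not values:
        return []
    levels = [float(values[0])]
    for x in values[1:]:
        levels.append(alpha * x + (1.0 - alpha) * levels[-1])
    return levels
```

The last smoothed level serves as the one-step-ahead forecast; a larger `alpha` makes the forecast react faster to recent observations.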
Text Books
1. Aileen Nielsen, Practical Time Series Analysis: Prediction with Statistics and Machine
Learning, 1st ed., 2019, ISBN: 1492041653.
2. Terence C. Mills, Applied Time Series Analysis: A Practical Guide to Modelling and
Forecasting, 1st ed., 2019, Academic Press, ISBN: 978-0-12-813117-6.
Reference Books
1. George E. P. Box, Gwilym M. Jenkins, Gregory C. Reinsel, Greta M. Ljung, Time Series
Analysis: Forecasting and Control, 5th ed., 2016, ISBN: 9781118675021.
Topics to be covered:
Introduction to game theory, Games and solutions. Game theory and mechanism
design, Games to play, Strategic form games. Matrix and continuous games. Dominant and
dominated strategies, Nash Equilibrium: existence and uniqueness. Pareto efficiency.
Nash bargaining solution, Evolutionary games. Evolutionarily stable strategies, Games with
incomplete information: Mixed and behavioral strategies. Bayesian Nash equilibrium,
Applications in auctions.
Different auction formats. Revenue and efficiency properties of different auctions, Repeated and
evolutionary games, Cooperative games, Shapley value and applications, Matching algorithms,
TTC and Gale-Shapley algorithm.
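The Gale-Shapley algorithm named in the matching unit can be sketched as follows. This is a minimal version under stated assumptions: both sides have the same number of participants, every preference list is complete and strict, and the function and variable names are hypothetical.

```python
def gale_shapley(proposer_prefs, receiver_prefs):
    """Deferred acceptance: proposers propose in preference order,
    receivers tentatively hold the best proposal seen so far.

    Both arguments map a participant to an ordered list of the other side.
    Returns a dict proposer -> matched receiver.
    """
    # rank[r][p] is the position of proposer p in receiver r's list (lower is better).
    rank = {r: {p: i for i, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}   # index of next receiver to try
    engaged_to = {}                                # receiver -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p                      # r was unmatched: accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p                      # r prefers p: swap
            free.append(current)
        else:
            free.append(p)                         # r rejects p
    return {p: r for r, p in engaged_to.items()}
```

The result is stable and optimal for the proposing side, regardless of the order in which free proposers are processed.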
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 Basic understanding of cyber-physical systems (CPS), characteristics and its applications.
CO2 Understanding about the basic principles, techniques and applications of machine learning.
CO3 Broad understanding of machine learning algorithms and their use in data-driven knowledge
discovery and program synthesis.
CO4 The ability to adapt or combine some of the key elements of existing machine learning
algorithms to design new algorithms for CPS.
Topics to be covered:
Cyber-Physical Systems (CPS) Overview: Introduction – CPS analysis by example, Application
domains, Significance; Hybrid system vs. CPS; Emergence of CPS; CPS Characterization; Analysis
of Representative CPS Domains.
Learning Tools and Techniques for CPS: Introduction to learning models; data pre-processing and
tools used for pre-processing; Programming for learning models.
Supervised and Unsupervised learning models for CPS: Bayesian Networks; KNN classifier;
Decision Trees; Random Forest; Support Vector Machines; Introduction to Neural Networks;
Feedforward Neural Networks; Sequence modelling using RNN; Echo State Networks; Gated RNNs;
K-Means Clustering and its applications towards CPS, Graph theoretic clustering for CPS.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Matrices & Operators: Observables, The Pauli Operators, Outer Products & matrix representation,
Representation of operators using matrices, Eigenvalues & Eigenvectors, Pauli Matrix, Hermitian,
unitary and normal operators, Projection Operator, Positive Operators, Trace of an operator, important
properties of Trace, Expectation Value of Operator.
Classical data in quantum computers: Representing data in quantum computers, Access models,
Implementations, Block encodings, Importance of quantum memory models, QRAM architectures
and noise resilience, Working with classical probability distributions, Retrieving data.
SVE-based quantum algorithms: Spectral norm and the condition number estimation, estimating
quality of representations, Extracting the SVD representations, Singular value estimation of a product
of two matrices, Slow algorithms for log-determinant.
Quantum algorithms for Monte Carlo: Monte Carlo with quantum computing, Bounded output,
Bounded ℓ2 norm, Bounded variance and its applications, Dimensionality reduction, Unsupervised
and Supervised algorithms.
q-means: The k-means algorithm, The q-means algorithm and its analysis.
Introduction of Finance in Data Analytics: Understanding the scope of Data Science and its
solutions in Finance, how data science fits into existing infrastructure, assessing new options such
as Cloud and Artificial Intelligence technology, describing finance management, its components &
architecture.
Role and Responsibilities of Data Science in Finance: The key role and responsibility of Data
Science in Accounts and Finance, the functional area of Data Analytics Tools, Query Tools and
Reporting, OLAP and its applications, the key processes and metrics of accounts and finance with the
help of data science.
Data Science Workflow in Finance: Analysis of the role of data science in Financial Management, such as
creating data visualizations of Finance management, Descriptive Analytics to derive insights from
data, Workflow Optimization and Process Improvements in Finance and Accounts records, analysis
of the predictive learning process and core concepts of data analytics and data science in Finance
Management.
Predictive Analytics application and Challenges of data science in Finance: Predictive Analytics
core concepts, Machine Learning applications in Finance, Limitations of Data Science adoption in
Finance, Challenges in transition, Data Science industry use cases in the Finance domain.
Text Books
1. Ruey S. Tsay (2012), “An Introduction to Analysis of Financial Data with R”, Wiley,
ISBN: 978-0-470-89081-3.
2. Irene Aldridge and Marco Avellaneda, “Big Data Science in Finance”, ISBN: 978-1119602989.
Course Assessment Methods: Two sessional exams and one end-semester exam, along with
assignments, presentations and class tests, which may be conducted by the course coordinator in
lieu of internal assessment.
Course Outcomes
The course will enable the student to:
CO1 To get acquainted with fundamentals of Bioinformatics.
CO2 Understanding about the basic principles, techniques, and applications of bioinformatics.
CO3 Broad understanding of big data analytics, machine learning algorithms and their use in
data-driven knowledge discovery and program synthesis.
CO4 To use different big data analytics tools for bioinformatics analysis.
Topics to be covered:
The Beginning of Bioinformatics: Margaret Dayhoff, Richard Eck, Robert Ledley, and the
Beginning of Bioinformatics; Definition of Bioinformatics; Bioinformatics Versus Computational
Biology; Goals of Bioinformatic Analysis; Bioinformatics Technical Toolbox.
Data, Databases, Data Format, Database Search, Data Retrieval Systems, and Genome
Browsers: Genomic Data; Sequence Data Formats - FASTA Format, PHYLIP Format; Conversion
of Sequence Formats Using Readseq; Primary Sequence Databases—GenBank, EMBL-Bank,
DDBJ, Sequence Submission to the Databases, Availability of the Submitted Sequence to the Public,
Sequence Flatfile Format, Sequence Accession Numbers and Redundancy in Primary Databases,
Divisions of the NCBI Primary Sequence Database; Secondary Databases; Some Examples of
Publicly Available Secondary and Specialized Databases; Data Retrieval; An Example of Retrieval
of mRNA/Gene Information; Data Visualization in Genome Browsers; Using Map Viewer to Search
the Genome.
Sequence Alignment and Similarity Searching in Genomic Databases - BLAST and FASTA:
Evolutionary Basis of Sequence Alignment; Sequence Identity and Sequence Similarity; Global
Versus Local Alignment; Pairwise and Multiple Alignment; Alignment Algorithms, Gaps, and Gap
Penalties; Scoring Matrix, Alignment Score, and Statistical Significance of Sequence Alignment;
Database Searching with the Heuristic Versions of the Smith-Waterman Algorithm—BLAST and
FASTA; Sequence Comparison, Synteny, and Molecular Evolution.
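The local alignment idea behind the Smith-Waterman algorithm mentioned in this unit can be sketched as a short dynamic program. This is an illustration under assumptions: the function name is hypothetical, the scoring values (match +2, mismatch -1, gap -1) are arbitrary, the gap penalty is linear, and only the best score is returned, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between sequences a and b.

    Each cell takes the maximum of 0 (start a new local alignment),
    a diagonal move (match or mismatch) and a gap in either sequence.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best
```

The floor at 0 is what distinguishes local from global alignment: a poorly matching prefix is discarded instead of dragging the score down. BLAST and FASTA approximate this exhaustive computation with heuristics to make database-scale search practical.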
Big Data Analytics in Bioinformatics: Introduction - Big data in bioinformatics, Types of big data
in bioinformatics, Big data problems in bioinformatics, Techniques for big data Analytics;
Architectures for big data analytics - MapReduce architecture, Fault tolerant graph architecture,
Streaming graph architecture.
Machine learning for big data analytics: Feature Selection, Supervised Learning, Unsupervised
learning, Deep Learning, Inference of Large Scale GRN with Association Rule Mining; Challenges
and issues in big data analytics - Challenges in big data analytics, Issues in big data Analytics;
Tools for big data analytics in bioinformatics: Tools for microarray data analysis; Tools for gene-
gene network analysis; Tools for PPI data analysis; Tools for sequence analysis; Tools for pathway
analysis;