M.Tech - Computer Vision and Image Processing
In recent times, image and video data have grown dramatically in every conceivable
field, driven by the proliferation of digital capture devices and by the internet increasingly
becoming a multimedia phenomenon. Consequently, Computer Vision and Image
Processing has emerged as a promising field of study and research, owing to its widespread
applications in managing the huge influx of image and video data.
Computer Vision started with building machines that can interpret visual data as humans do and
supply inputs to robots; it now has wider objectives, serving applications such as search engines,
computational photography, medical imaging, vision for computer graphics and many more.
Areas like document and medical image analysis are also developing rapidly. The field of
robotics has abundant potential to serve in medical surgery, defense, home security and the
community at large. With advancements in supporting technologies such as digital cameras
and video equipment, Computer Vision and Image Processing will become increasingly
capable and affordable.
The issues and scope for research in this area of specialization are so vast that it is vital to offer a
specialized programme in this area. With this as the goal, the University is offering a two-year
M.Tech programme in Computer Vision and Image Processing. The objective is to create
professionals and researchers with the necessary expertise to handle the various real-world
problems where image processing techniques might provide robust solutions.
The programme includes core courses in Digital Image Processing, Signal Processing, Video
Processing, and Computer Vision with the necessary background covered in mathematical
courses. The programme has intensive coursework for three semesters with suitable elective
courses, followed by a dissertation in which students conduct research in this field of
study. The department has a well-established research facility, the “Amrita – Cognizant Innovation
Lab”, which helps students build applications on real-time image and video data.
Students have abundant opportunities to pursue internships in major companies and R&D labs
like ISRO, NPOL etc. Bright career opportunities are available to students in top companies and
research labs.
CURRICULUM
First Semester
Credits 20
* Non Credit Course
Second Semester
Credits 18
* Non Credit Course
Third Semester
E Elective I 3 0 0 3
E Elective II 3 0 0 3
16CV798 P Dissertation 10
Credits 16
Fourth Semester
Credits 12
Total Credits 66
List of Courses
Foundation Core
Electives
Project Work
16CV798 Dissertation 10
16CV799 Dissertation 12
16MA605 LINEAR ALGEBRA AND PARTIAL DIFFERENTIAL EQUATIONS 4-0-0-4
Vector Spaces: Vector spaces - Subspaces - Linear independence - Basis - Dimension - Inner
products - Orthogonality - Orthogonal basis - Gram-Schmidt Process - Change of basis -
Orthogonal complements - Projection on subspace - Least Squares Principle.
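As an illustration of the Gram-Schmidt process and projection onto a subspace listed above, the following is a minimal sketch in plain Python. The function names (`dot`, `gram_schmidt`, `project`) and the tolerance value are assumptions chosen for this example and are not part of the syllabus.

```python
import math

def dot(u, v):
    """Standard inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis for the span of `vectors`.

    Uses the modified Gram-Schmidt ordering: each earlier basis vector's
    component is removed from the running remainder in turn.
    """
    basis = []
    for v in vectors:
        w = [float(x) for x in v]
        for q in basis:
            coeff = dot(w, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # a (near) zero remainder means v was linearly dependent
            basis.append([wi / norm for wi in w])
    return basis

def project(x, basis):
    """Orthogonal projection of x onto the span of an orthonormal basis."""
    proj = [0.0] * len(x)
    for q in basis:
        c = dot(x, q)
        proj = [p + c * qi for p, qi in zip(proj, q)]
    return proj

# The third vector is the sum of the first two, so the basis has two vectors.
basis = gram_schmidt([[1, 1, 0], [1, 0, 1], [2, 1, 1]])
closest = project([1, 0, 0], basis)
```

The projection returned by `project` is the point of the subspace closest to `x` in the Euclidean norm, which is the idea behind the least squares principle.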
Linear Transformations: Positive definite matrices - Matrix norm and condition number - QR
Decomposition - Linear transformation - Relation between matrices and linear transformations -
Kernel and range of a linear transformation - Change of basis - Nilpotent transformations -
Similarity of linear transformations - Diagonalisation and its applications - Jordan form and
rational canonical form. Introduction to Normed Linear spaces, Banach spaces and Hilbert spaces.
Partial Differential Equations: Basic definitions, Model Equations: Elliptic, Parabolic and
Hyperbolic PDEs, Solving PDEs Numerically - Elliptic, Parabolic and Hyperbolic Equations.
Finite Element Method.
1. Howard Anton and Chris Rorres, “Elementary Linear Algebra”, Tenth Edition, John
Wiley and Sons, 2010.
2. Gilbert Strang, “Linear Algebra and Its Applications”, Fourth Edition, Cengage, 2006.
3. Justin Solomon, “Mathematical Methods for Computer Vision, Robotics, and Graphics”,
Stanford University, 2013.
4. Lawrence C. Evans, “Partial Differential Equations”, American Mathematical Society,
2010.
5. Gilles Aubert, Pierre Kornprobst, “Mathematical Problems in Image Processing: Partial
Differential Equations”, Springer, 2006.
Introduction to Stacks and Queues, Linked list, B-Tree, B+ Tree. Multidimensional Data:
Introduction – Need for Multidimensional Data Structures. Multidimensional Point Data
Structures: Point Quadtrees – Trie-based Quadtrees – KD Trees (Point and Trie). Object based
and Image based Representations: Interior based representations: Cells and tiling, Ordering Space
– Blocks - Non Orthogonal Blocks - Region Quadtrees - Region Octrees. Boundary Based
Representations: Boundary Model - Voronoi Diagrams - Properties - Delaunay Graphs -
Delaunay Triangulations.
TEXTBOOKS/REFERENCES:
One Dimensional and Two Dimensional Signals and Systems - Separable Signals - Periodic
Signals - General Periodicity - 1D & 2D Discrete Space Systems - 1D & 2D Convolution.
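To make the link between separable signals and 2D convolution concrete, here is a small sketch in plain Python. It computes the full 2D discrete convolution directly and, for a separable kernel, as a row pass followed by a column pass. The function names and the sample taps are illustrative assumptions only.

```python
def convolve1d(signal, taps):
    """Full 1D discrete convolution of two finite sequences."""
    out = [0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for k, t in enumerate(taps):
            out[i + k] += s * t
    return out

def convolve2d(image, kernel):
    """Full 2D discrete convolution: y[m][n] = sum over k,l of x[k][l] * h[m-k][n-l]."""
    rows, cols = len(image), len(image[0])
    krows, kcols = len(kernel), len(kernel[0])
    out = [[0] * (cols + kcols - 1) for _ in range(rows + krows - 1)]
    for i in range(rows):
        for j in range(cols):
            for k in range(krows):
                for l in range(kcols):
                    out[i + k][j + l] += image[i][j] * kernel[k][l]
    return out

def convolve_separable(image, col_taps, row_taps):
    """2D convolution with the separable kernel h[k][l] = col_taps[k] * row_taps[l]."""
    # Filter every row, then every column of the intermediate result.
    rows_done = [convolve1d(row, row_taps) for row in image]
    cols_done = [convolve1d(list(col), col_taps) for col in zip(*rows_done)]
    return [list(row) for row in zip(*cols_done)]  # transpose back

image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
col_taps, row_taps = [1, 0, -1], [1, 2, 1]
kernel = [[c * r for r in row_taps] for c in col_taps]
same = convolve2d(image, kernel) == convolve_separable(image, col_taps, row_taps)
```

For a K x K separable kernel, the two-pass form needs on the order of 2K multiplications per output sample instead of K squared, which is why separability matters in filter design.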
Continuous Space Fourier Transform - Sampling in One and Two Dimensions - Ideal
Rectangular Sampling – Sampling Theorem - General Case – Change of Sampling Rate -
Sampling Lattice – Reconstruction – Downsampling and Upsampling by integers.
Discrete Space Transforms – 1D & 2D Discrete Fourier Series - 1D & 2D Discrete Fourier
Transform - Properties – Discrete Time Fourier Transform - Short Time Fourier Transform – Fast
Fourier Transform - Wavelet Transform - Gabor Transform.
Filter Design Fundamentals - Ideal and Finite order Filters - Two Dimensional Filter Design -
FIR and IIR filter design - Window Functions - Rectangular and Rotated windows.
1. Rafael C. Gonzalez and Richard E. Woods, “Digital Image Processing”, Third Edition,
Pearson Education, 2009.
2. Milan Sonka, Vaclav Hlavac and Roger Boyle, “Image Processing, Analysis and Machine
Vision”, Third Edition, Cengage Learning, 2007.
3. William K. Pratt, “Digital Image Processing”, Fourth Edition, Wiley Interscience, 2007.
4. Anil K Jain, “Fundamentals of Digital Image Processing”, Prentice Hall, 1989.
1. Yao Wang, Jorn Ostermann and Ya-Qin Zhang, “Video Processing and
Communications”, Prentice Hall, 2001.
2. A. Murat Tekalp, “Digital Video Processing”, Pearson, 1995.
1. Douglas C. Montgomery and George C. Runger, “Applied Statistics and Probability for
Engineers”, Third Edition, John Wiley & Sons Inc., 2003.
2. A. Papoulis and Unnikrishna Pillai, “Probability, Random Variables and Stochastic
Processes”, Fourth Edition, McGraw Hill, 2002.
3. J Ravichandran, “Probability and Statistics for Engineers”, First Edition, Wiley, 2012.
4. Scott L. Miller, Donald G. Childers, “Probability and Random Processes”, Academic
Press, 2012.
5. Kalyanmoy Deb, “Optimization for Engineering Design: Algorithms and Examples”,
Prentice Hall, 2002.
6. Singiresu S. Rao, “Engineering Optimization: Theory and Practice”, Third Edition, New
Age Publishers, 2003.
7. Justin Solomon, “Mathematical Methods for Computer Vision, Robotics, and Graphics”,
Stanford University, 2013.
Image Morphology: Binary and gray scale Morphological analysis - Dilation and Erosion -
Skeletons and Object Marking – Granulometry – Morphological Segmentation. Feature
extraction: Global image measurement, feature specific measurement, characterizing shapes,
Hough Transform. Representation and Description: Region Identification – Contour Based and
Region Based Shape Representation and Description – Shape Classes. Flexible shape extraction:
active contours, Flexible shape models: active shape and active appearance. Texture
representation and analysis: Statistical Texture Description – Syntactic Texture Description
Methods – Hybrid Texture description Methods – Texture Recognition Method Applications.
Image Understanding: Control Strategies –RANSAC – Point Distribution Models – Scene
Labeling and Constraint Propagation. Image Data Compression: Predictive Compression
Methods – Vector Quantization, DCT, Wavelet, JPEG.
1. Milan Sonka, Vaclav Hlavac and Roger Boyle, “Image Processing, Analysis and Machine
Vision”, Third Edition, Cengage Learning, 2007.
2. Tinku Acharya, Ajoy K Ray, “Image Processing - Principles and Applications”, Wiley,
2005.
3. John C. Russ, “The Image Processing Handbook”, Sixth Edition, CRC Press, 2007.
4. Mark S. Nixon, Alberto S. Aguado, “Feature Extraction and Image Processing”, Second
Edition, Academic Press, 2008.
Introduction - Pattern recognition systems - The design cycle - Learning and adaptation - Linear
models for classification - Discriminant functions (Two and multiple classes) - Least squares
classification functions - Fisher’s discriminant analysis for two and multiple classes -
Probabilistic generative models - Maximum likelihood solution. Kernel methods: Constructing
kernels - Kernel density estimators - Nearest neighbor methods - Gaussian processes and
classification - Sparse kernel machines - Support vector machines - Maximum margin classifiers
- Multi-class support vector machine. Graphical models: Bayesian networks - Generative models
- Linear Gaussian models - Conditional independence. Mixture models and Expectation
maximization: K-means clustering - Mixtures of Gaussians - Expectation maximization for Gaussian
mixtures. Continuous latent variables: Principal component analysis - Applications of principal
component analysis - PCA for higher dimensional data - Factor analysis. Sequential data:
Markov models - Hidden Markov models - Maximum likelihood for HMM - Forward-backward
algorithm. Combining models - Tree based models - Decision trees - Classification and
regression trees (CART).
Deep learning for High-level Vision: Introduction to Deep Learning, main types of Deep
Architectures, Application of Deep Learning Architecture to Computer Vision.
Image Formation: Geometric image formation, Photometric image formation. Camera Models
and Calibration: Camera Projection Models – Orthographic, Affine, Perspective, Projective
models. Projective Geometry, Transformation of 2D and 3D, Internal Parameters, Lens
Distortion Models. Local Feature Detectors and Descriptors: Hessian corner detector, Harris
Corner Detector, LoG detector, DoG detector, SIFT, PCA-SIFT, GLOH, SURF, HOG,
Pyramidal HOG, PHOW. Calibration Methods: Linear, Direct, Indirect and Multiplane methods -
Pose Estimation. Stereo and Multi-view Geometry: Epipolar Geometry, Rectification and Issues
related to Stereo, General Stereo with E Matrix Estimation, Stratification for 2 Cameras,
Extensions to Multiple Cameras, Self-Calibration with Multiple Cameras, 3D reconstruction of
cameras and structures, Three View Geometry.
1. Forsyth and Ponce, “Computer Vision – A Modern Approach”, Second Edition, Prentice
Hall, 2011.
2. Emanuele Trucco and Alessandro Verri, “Introductory Techniques for 3-D Computer
Vision”, Prentice Hall, 1998.
3. Olivier Faugeras, “Three Dimensional Computer Vision”, MIT Press, 1993.
4. Richard Szeliski, “Computer Vision: Algorithms and Applications”, Springer, 2011.
5. Milan Sonka, Vaclav Hlavac and Roger Boyle, “Image Processing, Analysis and Machine
Vision”, Third Edition, CL Engineering, 2013.
This course is intended to be a self-study course. Each student selects an area of self-study in
consultation with the Faculty, then collects and studies basic and recent research articles (project
reports, review articles, published journal articles and book chapters) on the topic. Students
are required to make two in-class presentations. The seminars are evaluated for grading
purposes by a panel of at least two Faculty members.
Topics to be covered:
Selection of project domain; Publication ethics, Tools and evaluation. Selection of tentative
project area and process of literature survey – Literature survey components and procedures.
Basic components of a research paper – procedures and processes, Journal types, Scopus, Web
of Science, Thomson Reuters, Science Citation Index, H-index, Google citations. Presentation of
selected project proposal – Oral presentation. Preparation of a report on the selected project
proposal in LaTeX format.
Attending special invited lectures, practical orientation in searching and collecting literature
through library, online tools, presenting a seminar on selected project proposal and submitting
project report prepared using LaTeX.
TEXTBOOKS/REFERENCES:
1. H.L. Hirsch, Essential Communication Strategies for Scientists, Engineers and Technology
Professionals, Second Edition, New York: IEEE Press, 2002.
2. P.V. Anderson, Technical Communication: A Reader-Centered Approach, Sixth Edition,
Cengage Learning India Pvt. Ltd., New Delhi, 2008, (Reprint 2010).
3. W. Strunk Jr. and E.B. White, The Elements of Style, New York: Allyn and Bacon, 1999.
1. Tomas Akenine-Moller, Eric Haines and Naty Hoffman, “Real-Time Rendering”, Third
Edition, A K Peters Ltd, 2008.
2. Matt Pharr and Greg Humphreys, “Physically Based Rendering: From Theory to
Implementation”, Second Edition, Morgan Kaufmann, 2010.
3. Lars Linsen, Hans Hagen and Bernd Hamann, “Visualization in Medicine and Life
Sciences”, Springer-Verlag Berlin Heidelberg, 2008.
4. Donald Hearn and Pauline Baker, “Computer Graphics”, Second Edition, Prentice Hall
of India, 1994.
Case Study: Face Detection and Recognition, Natural Scene Videos, Crowd Analysis, Video
Surveillance, Traffic Monitoring, Intelligent Transport System.
Shape from X – Shape from Stereo, Shape from Shading, Shape from Silhouette, Shape from
Texture and Shape from Focus. Shape Representation: Statistical Shape Models, Active Shape
Models, Combined Appearance Models, Active Appearance Models, View-based Appearance
Models, Tracking with View-based Appearance Models. Object Recognition: Shape
Correspondence and Shape Matching, PCA, Shape Priors for Recognition, Finding Templates
and Recognition, Recognition by Relations between Templates, Robotic vision, Computer
Vision on the GPU. Tracking & Video Analysis: Tracking and Motion Understanding - Kalman
filters, Condensation algorithm, particle filters, Bayesian filters, Hidden Markov models, Change detection and
Model-based tracking.
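As a pointer to what the Kalman filter topic involves, the following is a minimal sketch of a scalar (one-dimensional) Kalman filter in plain Python. It assumes a constant-state model with additive Gaussian process and measurement noise; the function name, parameter names and default values are assumptions made for this illustration, and practical trackers use the vector form with a motion model.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant between measurements.

    x is the state estimate and p its variance; both are updated per measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state model is x_k = x_(k-1) + noise, so only the
        # uncertainty grows, by the process noise variance.
        p = p + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

# Repeated readings of the same value pull the estimate towards that value.
track = kalman_1d([5.0] * 50, process_var=1e-4, meas_var=0.1)
```

A large measurement variance makes the gain small, so the filter trusts its prediction more; a large process variance has the opposite effect.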
1. Forsyth and Ponce, “Computer Vision – A Modern Approach”, Second Edition, Prentice
Hall, 2011.
2. Emanuele Trucco and Alessandro Verri, “Introductory Techniques for 3-D Computer
Vision”, Prentice Hall, 1998.
3. Olivier Faugeras, “Three Dimensional Computer Vision”, MIT Press, 1993.
4. Richard Szeliski, “Computer Vision: Algorithms and Applications”, Springer, 2011.
5. Richard Hartley and Andrew Zisserman, “Multiple View Geometry in Computer Vision”,
Second Edition, Cambridge University Press, 2004.
Design and Fabrication of Soft Zoom Lens Applied in Robot Vision - Methods for Reliable
Robot Vision with a Dioptric System - An Approach for Optimal Design of Robot Vision
Systems -Visual Motion Analysis for 3D Robot Navigation in Dynamic Environments - A Visual
Navigation Strategy based on Inverse Perspective Transformation - Vision-based Navigation
Using an Associative Memory. Vision Based Robotic Navigation: Application to Orthopedic
Surgery - Navigation and Control of Mobile Robot Using Sensor Fusion - Visual Navigation for
Mobile Robots - Interactive Object Learning and Recognition with Multiclass Support Vector
Machines - Recognizing Human Gait Types - Environment Recognition System for Biped Robot
Walking Using Vision Based Sensor Fusion - Non Contact 2D and 3D Shape Recognition by
Vision System for Robotic Prehension - Image Stabilization in Active Robot Vision - Real-Time
Stereo Vision Applications - Robot vision using 3D TOF systems - Calibration of Non-SVP
Hyperbolic Catadioptric Robotic Vision Systems - Computational Modeling, Visualization, and
Control of 2-D and 3-D Grasping under Rolling Contacts - Towards Real Time Data Reduction
and Feature Abstraction for Robotics Vision - LSCIC Pre Coder for Image and Video
Compression - The Robotic Visual Information Processing System Based on Wavelet
Transformation and Photoelectric Hybrid - Direct Visual Servoing of Planar Manipulators Using
Moments of Planar Targets - Industrial Robot Manipulator Guarding Using Artificial Vision -
Remote Robot Vision Control of a Flexible Manufacturing Cell - Robot Vision in Industrial
Assembly and Quality Control Processes - Multi-Task Active-Vision in Robotics.
Medical imaging modalities: Planar X-Ray imaging - X-Ray Computed Tomography – Magnetic
Resonance Imaging – Nuclear Imaging – Ultrasonography – Other modalities. Image file
formats: DICOM and other medical image file formats. Image Enhancement: Fundamental
enhancement techniques- Adaptive Image Filtering – Enhancements by multiscale non linear
operators. Image segmentation: Overview and fundamentals of medical image segmentation –
Segmentation using Graph cuts – Segmentation using fuzzy clustering – Neural networks –
Deformable models. Medical image registration: Rigid body transformation – Non rigid body
transformation – Pixel based registration – Surface based registration – Intensity based
registration. Medical image fusion: Linear and non linear methods – Wavelet based fusion –
Pyramidal fusion schemes – Edge preserving fusion algorithms. Validation of medical image
analysis techniques. Case Study: Brain image analysis and atlas construction – Tumour image
analysis and treatment planning – Lung Nodule Analysis - Retinal Image Processing and analysis
- X-Ray image processing and Analysis. Tools: VTK / ITK, Fiji, MeVisLab.
Architecture and Design: Introduction - Architecture of content-based image and video retrieval -
Designing an image retrieval system - Designing a video retrieval system. Feature extraction and
similarity measure: Color - Texture - Shape - Spatial relationships - MPEG 7 features. Modeling
and analysis of images: Classification and clustering - Annotation and semantic based retrieval
of visual data - Probabilistic models - Relevance feedback. Standards for image data
management: Standards relevant to Content based image retrieval - Image compression - Query
Specification - Metadata description. Analysis of video: Feature extraction - Semantics
understanding - Summarization - Indexing and retrieval of video - Mining large databases.
Applications: Architectural and engineering design - Fashion and interior design - Journalism
and advertising - Medical Diagnosis - Geographical Information Systems and Remote Sensing -
Education and Training - Web Searching.
1. Oge Marques and Borko Furht, “Content Based Image and Video Retrieval”, Multimedia
Systems and Applications, Springer, 2002.
2. Oge Marques, “Practical Image and Video Processing”, Wiley IEEE Press, 2011.
3. Borko Furht and Oge Marques, “Handbook of Video Databases: Design and
Applications”, CRC Press, 2003.
4. Yogita Mistry and Dr. D.T. Ingole, “Survey on Content Based Image Retrieval Systems”,
International Journal of Innovative Research in Computer and Communication
Engineering, 2013.
Introduction: Data capture - Document image understanding - Concepts and components. Pixel
level processing: Thresholding – Basic Document Image Binarization – Edge Preserving
Binarization Techniques – Combining Different Binarization Techniques - Case Study:
Binarization of Historical Documents.
Noise reduction and Enhancement: Noise in Conventional and Camera Captured Document
Images – Border Noise and Frame Noise Removal – Circular Noise and Stroke like Pattern
Noise Removal – Clutter Noise Removal – Bleed Through Removal - Case Study: Removing
noise from Historical Documents.
Feature level processing: Introduction - Polygonalization - Critical point detection - Line and
curve fitting - Shape description and recognition - Case Study: Detection of lines, curves and
angles from engineering drawings/maps.
Methodologies: Projection Profile Analysis – Run Length Smearing – Recursive X-Y Cut –
White Space Analysis - Connected component computation - Top Down and Bottom Up
Strategies.
Skew detection and correction: Projection Profile based Skew Detection – Nearest Neighbour
Clustering based Skew Detection – Hough Transform based Skew Detection - Slant Estimation.
Layout analysis - Geometric and logic layout analysis - Page decomposition – Region
segmentation - Labelling and classification.
Text analysis and recognition: Text segmentation – Script, Language and Font identification -
Character segmentation – Machine printed character recognition - Hand written character
recognition – Case Study: Optical Character Recognition (OCR) - Tesseract OCR - Form
Processing - Recognition of Braille characters. Commercial state and future trends.
Introduction to Image Fusion – Spatial and transform domain fusion schemes – fusion rules -
Current trends in super-resolution image reconstruction- Introduction – Geometric
transformation models – Image degradation models – State-of-the-art SR methods.
Multiresolution analysis – Fundamental principles – wavelet based fusion scheme – pyramid
based fusion scheme – ‘À trous’ wavelet fusion scheme. Image fusion using ICA. Image fusion
using optimization of statistical measurements – Introduction – Mathematical preliminaries –
Dispersion Minimisation fusion based methods – Kurtosis Maximisation Fusion based methods.
Fusion of edge maps using statistical approaches – Introduction – Automatic edge detection –
ROC analysis. Region-based multi focus image fusion – spatial domain fusion – fusion using
segmented regions. Pixel level image fusion metrics – Signal level performance evaluation – Edge
based metrics – performance of fusion metrics. Objectively adaptive image fusion - forward
adaptive – feed-back adaptive schemes – evaluation parameters – Optimal video fusion.
Performance evaluation of image fusion techniques – Signal-to-Noise Ratio (SNR) – Peak
Signal-to-Noise Ratio (PSNR) – Mean Square Error (MSE) – Mutual Information – Fusion
Factor – Fusion Symmetry.
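Two of the evaluation measures listed above have simple closed forms: MSE is the average squared pixel difference between a reference image and a test image, and PSNR is 10·log10(peak² / MSE). The sketch below computes both in plain Python for images stored as nested lists; the function names and the default peak value of 255 (8-bit images) are assumptions made for this example.

```python
import math

def mse(reference, test):
    """Mean Square Error between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            total += (a - b) ** 2
            count += 1
    return total / count

def psnr(reference, test, peak=255.0):
    """Peak Signal-to-Noise Ratio in decibels; infinite for identical images."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

reference = [[0, 0], [0, 0]]
fused = [[0, 0], [0, 10]]
error = mse(reference, fused)      # one pixel differs by 10 out of four pixels
quality_db = psnr(reference, fused)
```

Both measures need a reference image, which is often unavailable in fusion; that is one motivation for reference-free measures such as Mutual Information and Fusion Factor.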
Case study : Out-of-focus image fusion – Multi-modal image fusion – Image fusion in remote
sensing applications – Image fusion in medical applications.
1. Tania Stathaki, “Image Fusion - Algorithms and Applications”, First Edition, Academic Press,
2008.
2. Rick S. Blum and Zheng Liu, “Multi-Sensor Image Fusion and Its Applications”, CRC Press,
2005.
1. Ingemar Cox, Matthew Miller and Jeffrey Bloom, “Digital
Watermarking: Principles and Practice”, Morgan Kaufmann Series in Multimedia
Information and Systems, 2008.
2. Stefan Katzenbeisser and Fabien A. P. Petitcolas, “Information Hiding Techniques for
Steganography and Digital Watermarking”, Artech House, 1999.
3. Juergen Seitz, “Digital Watermarking for Digital Media”, Information Science
Publishing, 2005.
4. B. Schneier, “Applied Cryptography”, Second Edition, Wiley, 1996.
5. A. Menezes, P. van Oorschot and S. Vanstone, “Handbook of Applied Cryptography”, CRC
Press, 1997.
Pattern Analysis and Statistical Learning: Statistical classification, Visual pattern representation,
statistical learning. Unsupervised Learning for Visual Pattern Analysis: Cluster analysis,
Clustering algorithms, Representational Models Component Analysis: Overview, Generative and
Discriminative models. Manifold Learning: global, local and hybrid methods.
Functional Approximations: Modeling and Approximating visual data, Lifting schemes,
Temporal filtering in video coding.
Supervised Learning for Visual Classification: Support vector machine, Boosting algorithm.
Statistical Motion Analysis, Tracking of Visual Objects, Robust Visual Tracking, Multi Target
Tracking in Video. Cognition Process: cognitive model, brain research and visual science, visual
cognition, cognitive mechanisms.
1. Nanning Zheng and Jianru Xue, “Statistical Learning and Pattern Analysis for Image and
Video Processing (Advances in Pattern Recognition Series)”, Springer-Verlag London
Limited, 2009.
2. Christopher M Bishop, “Pattern Recognition and Machine Learning” Springer, 2006.
3. Ian T. Nabney, “NETLAB: Algorithms for Pattern Recognition (Advances in Pattern
Recognition series)”, Springer, 2002.
Thinking in Parallel: Parallelism Vs. Concurrency, Types and levels of parallelism, Different
grains of parallelism, Flynn’s classification of multi-processors, Introduction to parallelization
and vectorization: Data dependencies, Bernstein conditions for Detection of Parallelism,
Motivation for Heterogeneous Computing, Definition of thread and process, Parallel
programming models, Parallel Programming constructs: Synchronization, Deadlocks, Critical
sections and Data sharing.
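Bernstein's conditions state that two statements can run in parallel when neither writes a location the other reads and they do not write the same location. A minimal sketch of that check in plain Python follows; the function name and the encoding of read and write sets as collections of variable names are assumptions for this illustration.

```python
def bernstein_parallel(reads1, writes1, reads2, writes2):
    """Return True if statements S1 and S2 satisfy Bernstein's conditions."""
    r1, w1 = set(reads1), set(writes1)
    r2, w2 = set(reads2), set(writes2)
    flow_dependence = w1 & r2    # S2 reads something S1 writes
    anti_dependence = r1 & w2    # S2 overwrites something S1 reads
    output_dependence = w1 & w2  # both write the same location
    return not (flow_dependence or anti_dependence or output_dependence)

# S1: a = b + c   and   S2: d = e + f   share nothing, so they may run in parallel.
independent = bernstein_parallel({"b", "c"}, {"a"}, {"e", "f"}, {"d"})

# S1: a = b + c   and   S2: d = a + 1   have a flow dependence on a.
dependent = bernstein_parallel({"b", "c"}, {"a"}, {"a"}, {"d"})
```

Two statements that only read a common variable still satisfy the conditions, since shared reads create no dependence.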
Optimizations and Tools: Memory coalescing; thread and warp divergence, avoiding bank
conflicts, Reduction operation using prefix sum example. Usage of shared memory optimally,
Performance issues in algorithms, Need of profilers and analyzers, Introduction to CUDA Tools:
GDB, MemCheck, Command line & Visual Profilers, Parallel Nsight: Debugger, Analyzer &
Graphics Inspector.