Projects

Uploaded by

Văn Minh

Outline

1. Introduction
2. The Research Pipeline
3. Scene Text Detection and Recognition
4. License Plate Detection and Recognition
5. Anomaly Detection
6. Face Recognition
7. Facial Age Estimation
8. Face Swapping and Reenactment

Digital Surveillance Systems and Application


Introduction (1/2) – Detection and Recognition
• Scene Text Detection and Recognition
• License Plate Detection and Recognition
• Anomaly Detection

Introduction (2/2) – Facial Analysis
• Face Recognition
• Facial Age Estimation
• Face Swapping and Reenactment

The Research Pipeline

1. Define the Problem: Topics will be announced on Moodle by 10/8.
2. Collect the Data: Because most of the datasets are “unbalanced”, each group needs to collect and annotate the augmented data.
3. Train the Model: Follow the suggested GitHub repository and train the model to solve the specified problem.
4. Evaluation and Analysis: Follow the instructions to evaluate the performance and conduct the error analysis.

Note: After the project presentation, please upload 1) the slides; 2) the trained model; and 3) the collected
data with annotation to Google Drive.
Scene Text Detection and Recognition
• Reference-1:
➢ Wang et al. "AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text
Spotting." ECCV, 2020.
• Reference-2:
➢ Wang et al. "Shape Robust Text Detection with Progressive Scale Expansion Network." CVPR,
2019.
➢ Yue et al. "RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition."
ECCV, 2020.
• Reference-3:
➢ Long et al. "TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes." ECCV,
2018.
➢ Sheng et al. "NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition."
ICDAR, 2019.

Each group needs to collect 1000 training images containing scene text in “Chinese characters” as the
augmented data.
1. Each image should be captured in an in-the-wild environment.
2. Each image should be labeled with the detection boxes and the scene text as the ground truth (GT).
3. The test set will be provided by TAs.
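
One way to store the required labels is a JSON record per image. The sketch below is only an assumption about a workable layout: the function `make_record` and the field names (`image`, `instances`, `box`, `text`) are hypothetical, and the format expected by the TAs or by the suggested repositories may differ.

```python
import json


def make_record(image_name, instances):
    """Build one annotation record: a detection box plus a transcription per text instance."""
    record = {"image": image_name, "instances": []}
    for box, text in instances:
        x_min, y_min, x_max, y_max = box
        # Reject boxes with no area and empty transcriptions early.
        if x_max <= x_min or y_max <= y_min:
            raise ValueError(f"degenerate box {box} in {image_name}")
        if not text:
            raise ValueError(f"empty transcription in {image_name}")
        record["instances"].append({"box": [x_min, y_min, x_max, y_max], "text": text})
    return record


# Hypothetical example image with a single text instance.
record = make_record("street_0001.jpg", [((120, 40, 360, 110), "出口")])
line = json.dumps(record, ensure_ascii=False)  # keep Chinese characters readable
```

Writing one such line per image gives a file that is easy to validate before converting it to whatever format the chosen repository needs.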
License Plate Detection and Recognition
• Reference-1:
➢ Zhu et al. "Fourier Contour Embedding for Arbitrary-Shaped Text Detection." CVPR, 2021.
➢ Li et al. "Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition."
AAAI, 2019.
• Reference-2:
➢ Liao et al. "Real-Time Scene Text Detection with Differentiable Binarization." AAAI, 2020.
➢ Lee et al. "On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention." CVPRW, 2020.
• Reference-3:
➢ Zhang et al. "Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection."
CVPR, 2020.
➢ Yue et al. "RobustScanner: Dynamically Enhancing Positional Clues for Robust Text
Recognition." ECCV, 2020.

Each group needs to collect 1000 training images containing license plates as the augmented data.
1. Each image should be captured in an in-the-wild environment.
2. Each image should be labeled with the detection box and the license plate number as the
ground truth (GT).
3. The test set will be provided by TAs.
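
When checking predictions against the labeled boxes and plate numbers, a common approach is to require both a sufficient box overlap and an exact text match. The sketch below assumes axis-aligned boxes and an IoU threshold of 0.5; the helper names and the threshold are assumptions, and the protocol used on the TAs' test set is not specified here.

```python
def box_iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def plate_correct(pred_box, pred_text, gt_box, gt_text, iou_threshold=0.5):
    """A prediction counts as correct if the box overlaps enough and the text matches exactly."""
    return box_iou(pred_box, gt_box) >= iou_threshold and pred_text == gt_text
```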

Anomaly Detection
• Yi et al. "Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation." ACCV, 2020.
• Gudovskiy, Denis, Shun Ishizaka, and Kazuki Kozuka. "CFLOW-AD: Real-Time Unsupervised Anomaly
Detection with Localization via Conditional Normalizing Flows." WACV, 2022.

For anomaly detection:

1. Each group needs to run the code on the MVTec-AD dataset.
2. The evaluation should also be performed on the test data of the MVTec-AD dataset.
3. TAs will provide the original and the cropped versions of the MVTec-AD dataset.
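
Results on MVTec-AD are commonly reported as AUROC over per-image anomaly scores. The sketch below is a dependency-free version of that metric, assuming label 1 marks an anomalous image and a higher score means more anomalous; the function name is hypothetical, and the metric required for this project is not stated here.

```python
def auroc(scores, labels):
    """Area under the ROC curve, computed as the fraction of correctly ordered pairs."""
    anomalous = [s for s, y in zip(scores, labels) if y == 1]
    normal = [s for s, y in zip(scores, labels) if y == 0]
    if not anomalous or not normal:
        raise ValueError("need at least one normal and one anomalous sample")
    wins = 0.0
    for a in anomalous:
        for n in normal:
            if a > n:
                wins += 1.0   # anomalous image ranked above a normal one
            elif a == n:
                wins += 0.5   # ties count as half
    return wins / (len(anomalous) * len(normal))
```

The pairwise loop is quadratic in the number of images, which is acceptable for a test set of this size; a library routine would be the usual choice for larger data.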

Facial Analysis (1/2)
• Facial Age Estimation
➢ Cao et al. "Rank consistent ordinal regression for neural networks with application to age
estimation." Pattern Recognition Letters, 2020.
➢ Zhang et al. "C3AE: Exploring the limits of compact model for age estimation." CVPR, 2019.

Each group needs to collect 100 subjects (identities) as the augmented data.
1. Each subject should include more than 7 images spanning ages 0–70.
2. Each image should be labeled with the subject’s name, 5-point landmarks, face
detection boxes, and age information.

Note: The 5-point landmarks and face detection boxes can be annotated manually or produced by any other method,
e.g., MTCNN: https://github.com/ipazc/mtcnn
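
As a rough illustration of the MTCNN route, the sketch below converts one detection, in the dictionary form that the package's `detect_faces` returns (`box` as x, y, width, height plus five named `keypoints`), into an annotation record. The record layout, the function names, and the sample values are assumptions; `detect_with_mtcnn` additionally assumes the `mtcnn` package and OpenCV are installed.

```python
LANDMARK_ORDER = ["left_eye", "right_eye", "nose", "mouth_left", "mouth_right"]


def to_annotation(subject, age, detection):
    """Convert one MTCNN-style detection dict into a (hypothetical) annotation record."""
    x, y, w, h = detection["box"]  # MTCNN reports [x, y, width, height]
    keypoints = detection["keypoints"]
    return {
        "subject": subject,
        "age": age,
        "box": [x, y, x + w, y + h],  # stored here as corner coordinates
        "landmarks": [list(keypoints[name]) for name in LANDMARK_ORDER],
    }


def detect_with_mtcnn(image_path):
    """Run the detector on one image; third-party imports are kept local to this function."""
    import cv2
    from mtcnn import MTCNN
    image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
    return MTCNN().detect_faces(image)


# Made-up detection used only to show the conversion.
sample = {
    "box": [10, 20, 30, 40],
    "confidence": 0.99,
    "keypoints": {
        "left_eye": (15, 30), "right_eye": (35, 30), "nose": (25, 40),
        "mouth_left": (17, 50), "mouth_right": (33, 50),
    },
}
annotation = to_annotation("subject_001", 25, sample)
```

Automatic detections should still be reviewed by hand, since the labels are used as ground truth.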
Facial Analysis (2/2)
• Face Swapping and Reenactment
➢ Siarohin, Aliaksandr, et al. "First Order Motion Model for Image Animation." NeurIPS, 2019.
➢ Zhang et al. "Real-Time Audio-Guided Multi-Face Reenactment." ICASSP, 2021.
• Face Recognition
➢ Wang, Hao, et al. "CosFace: Large Margin Cosine Loss for Deep Face Recognition." CVPR, 2018.
➢ Deng et al. "ArcFace: Additive Angular Margin Loss for Deep Face Recognition." CVPR, 2019.

Each group needs to collect 100 subjects (identities) as the augmented data.
1. Each subject should include more than 20 images spanning head yaw angles 0°–90°,
and half of the images should have a pose greater than 45°.
2. Each image should be labeled with the subject’s name, 5-point landmarks, face
detection boxes, and pose information.

Note: The 5-point landmarks and face detection boxes can be annotated manually or produced by any other method,
e.g., MTCNN: https://github.com/ipazc/mtcnn
Scene Text Detection and Recognition (1/3)
Paper Title: AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
Conference: ECCV 2020
Authors: Wang, Wenhai and Liu, Xuebo and Ji, Xiaozhong and Xie, Enze and Liang, Ding and Yang, ZhiBo
and Lu, Tong and Shen, Chunhua and Luo, Ping
Paper: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123590443.pdf
Github: https://github.com/whai362/AE_TextSpotter

Scene Text Detection and Recognition (2/3)
Detection Module
Paper Title: Shape Robust Text Detection with Progressive Scale Expansion Network
Conference: CVPR 2019
Authors: Wenhai Wang, Enze Xie, Xiang Li, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao
Paper: https://openaccess.thecvf.com/content_CVPR_2019_paper.pdf
Github: https://github.com/open-mmlab/mmocr

Recognition Module
Paper Title: RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
Conference: ECCV 2020
Authors: Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang
Paper: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123640137.pdf
Github: https://github.com/open-mmlab/mmocr

[Figures for Scene Text Detection and Recognition (2/3): Detection Module and Recognition Module]

Scene Text Detection and Recognition (3/3)
Detection Module

Paper Title: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes
Conference: ECCV 2018
Authors: Shangbang Long, Jiaqiang Ruan, Wenjie Zhang, Xin He, Wenhao Wu, Cong Yao
Paper: https://openaccess.thecvf.com/content_ECCV_2018_paper.pdf
Github: https://github.com/open-mmlab/mmocr

Recognition Module

Paper Title: NRTR: A No-Recurrence Sequence-to-Sequence Model For Scene Text Recognition
Conference: ICDAR 2019
Authors: Fenfen Sheng, Zhineng Chen, Bo Xu
Paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=8978180&tag=1
Github: https://github.com/open-mmlab/mmocr

[Figures for Scene Text Detection and Recognition (3/3): Detection Module and Recognition Module]

License Plate Detection and Recognition (1/3)
Detection Module

Paper Title: Fourier Contour Embedding for Arbitrary-Shaped Text Detection
Conference: CVPR 2021
Authors: Yiqin Zhu, Jianyong Chen, Lingyu Liang, Zhanghui Kuang, Lianwen Jin, Wayne Zhang
Paper: https://openaccess.thecvf.com/content/CVPR2021/papers/Zhu_Fourier_Contour_Embedding_for_Arbitrary-Shaped_Text_Detection_CVPR_2021_paper.pdf
Github: https://github.com/open-mmlab/mmocr

Recognition Module

Paper Title: Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition
Conference: AAAI 2019
Authors: Hui Li, Peng Wang, Chunhua Shen, Guyu Zhang
Paper: https://ojs.aaai.org/index.php/AAAI/article/view/4881/4754
Github: https://github.com/open-mmlab/mmocr

[Figures for License Plate Detection and Recognition (1/3): Detection Module and Recognition Module]

License Plate Detection and Recognition (2/3)
Detection Module

Paper Title: Real-time Scene Text Detection with Differentiable Binarization
Conference: AAAI 2020
Authors: Minghui Liao, Zhaoyi Wan, Cong Yao, Kai Chen, Xiang Bai
Paper: https://ojs.aaai.org/index.php/AAAI/article/view/6812/6666
Github: https://github.com/open-mmlab/mmocr

Recognition Module

Paper Title: On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
Conference: CVPRW 2020
Authors: Junyeop Lee, Seong Joon Oh, Sungrae Park, Seonghyeon Kim, Jeonghun Baek, Hwalsuk Lee
Paper: https://openaccess.thecvf.com/content_CVPRW_2020/papers/w34/Lee_On_Recognizing_Texts_of_Arbitrary_Shapes_With_2D_Self-Attention_CVPRW_2020_paper.pdf
Github: https://github.com/open-mmlab/mmocr

[Figures for License Plate Detection and Recognition (2/3): Detection Module and Recognition Module]

License Plate Detection and Recognition (3/3)
Detection Module

Paper Title: Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection
Conference: CVPR 2020
Authors: Shi-Xue Zhang, Xiaobin Zhu, Jie-Bo Hou, Chang Liu, Chun Yang, Hongfa Wang, Xu-Cheng Yin
Paper: https://openaccess.thecvf.com/content_CVPR_2020/papers/Zhang_Deep_Relational_Reasoning_Graph_Network_for_Arbitrary_Shape_Text_Detection_CVPR_2020_paper.pdf
Github: https://github.com/open-mmlab/mmocr

Recognition Module

Paper Title: RobustScanner: Dynamically Enhancing Positional Clues for Robust Text Recognition
Conference: ECCV 2020
Authors: Xiaoyu Yue, Zhanghui Kuang, Chenhao Lin, Hongbin Sun, Wayne Zhang
Paper: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123640137.pdf
Github: https://github.com/open-mmlab/mmocr

[Figures for License Plate Detection and Recognition (3/3): Detection Module and Recognition Module]

Anomaly Detection (1/2)
Paper Title: Patch SVDD: Patch-level SVDD for Anomaly Detection and Segmentation
Conference: ACCV 2020
Authors: Yi, Jihun and Yoon, Sungroh
Paper: https://openaccess.thecvf.com/content/ACCV2020/papers/Yi_Patch_SVDD_Patch-level_SVDD_for_Anomaly_Detection_and_Segmentation_ACCV_2020_paper.pdf
Github: https://github.com/nuclearboy95/Anomaly-Detection-PatchSVDD-PyTorch

Anomaly Detection (2/2)
Paper Title: CFLOW-AD: Real-Time Unsupervised Anomaly Detection with Localization via Conditional
Normalizing Flows
Conference: WACV 2022
Authors: Gudovskiy, Denis and Ishizaka, Shun and Kozuka, Kazuki
Paper: https://arxiv.org/pdf/2107.12571.pdf
Github: https://github.com/gudovskiy/cflow-ad

Face Recognition (1/2)
Paper Title: CosFace: Large Margin Cosine Loss for Deep Face Recognition
Conference: CVPR 2018
Authors: Wang, Hao and Wang, Yitong and Zhou, Zheng and Ji, Xing and Gong, Dihong and Zhou,
Jingchao and Li, Zhifeng and Liu, Wei
Paper: https://openaccess.thecvf.com/content_cvpr_2018/html/Wang_CosFace_Large_Margin_CVPR_2018_paper.html
Github: https://github.com/ZhaoJ9014/face.evoLVe

Face Recognition (2/2)
Paper Title: ArcFace: Additive Angular Margin Loss for Deep Face Recognition
Conference: CVPR 2019
Authors: Jiankang Deng, Jia Guo, Niannan Xue, Stefanos Zafeiriou
Paper: https://openaccess.thecvf.com/content_CVPR_2019/html/Deng_ArcFace_Additive_Angular_Margin_Loss_for_Deep_Face_Recognition_CVPR_2019_paper.html
Github: https://github.com/ZhaoJ9014/face.evoLVe
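
To make the difference between the two face recognition losses concrete, the sketch below computes a margin-based softmax loss for a single sample. CosFace subtracts the margin from the target cosine, s(cos θ − m), while ArcFace adds it to the angle, s·cos(θ + m). The function name `margin_loss` and the defaults (s = 64, m = 0.35 for CosFace, m = 0.5 for ArcFace) are illustrative assumptions rather than the settings of the face.evoLVe repository, and the sketch omits the special handling some implementations apply when θ + m exceeds π.

```python
import math


def margin_loss(cosines, target, kind="arcface", s=64.0, m=None):
    """Cross-entropy over scaled cosines, with a margin applied to the target class only."""
    logits = []
    for j, c in enumerate(cosines):
        if j != target:
            logits.append(s * c)                      # non-target classes: plain scaled cosine
        elif kind == "cosface":
            margin = 0.35 if m is None else m
            logits.append(s * (c - margin))           # additive cosine margin
        else:
            margin = 0.5 if m is None else m
            theta = math.acos(max(-1.0, min(1.0, c)))
            logits.append(s * math.cos(theta + margin))  # additive angular margin
    # Numerically stable log-sum-exp.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]


# Cosine similarities between one embedding and three class centres (made-up values).
cosines = [0.8, 0.1, -0.2]
loss_arc = margin_loss(cosines, target=0, kind="arcface")
loss_cos = margin_loss(cosines, target=0, kind="cosface")
```

In both cases the margin lowers the target logit, so the loss stays larger than plain softmax until the embedding moves closer to its class centre.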

Age Estimation (1/2)
Paper Title: Rank consistent ordinal regression for neural networks with application to age estimation
Journal: Pattern Recognition Letters (PRL) 2020
Authors: Wenzhi Cao, Vahid Mirjalili, Sebastian Raschka
Paper: https://www.sciencedirect.com/science/article/pii/S016786552030413X
Github: https://github.com/Raschka-research-group/coral-cnn
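
The label encoding behind rank-consistent ordinal regression can be sketched as follows: an age rank is expanded into K − 1 binary targets ("is the rank greater than k?"), and the predicted rank is the number of binary outputs above 0.5. The helper names are hypothetical and this is not the API of the coral-cnn repository; ranks are zero-based here.

```python
def coral_levels(rank, num_ranks):
    """Encode a zero-based rank as num_ranks - 1 binary targets."""
    if not 0 <= rank < num_ranks:
        raise ValueError("rank out of range")
    return [1 if rank > k else 0 for k in range(num_ranks - 1)]


def coral_predict(probabilities):
    """Predicted zero-based rank: how many binary tasks fire above 0.5."""
    return sum(1 for p in probabilities if p > 0.5)
```

For example, with six ranks, rank 3 becomes `[1, 1, 1, 0, 0]`, and binary outputs `[0.9, 0.8, 0.6, 0.3, 0.1]` decode back to rank 3.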

Age Estimation (2/2)
Paper Title: C3AE: Exploring the Limits of Compact Model for Age Estimation
Conference: CVPR 2019
Authors: Chao Zhang, Shuaicheng Liu, Xun Xu , Ce Zhu
Paper: https://openaccess.thecvf.com/content_CVPR_2019/html/Zhang_C3AE_Exploring_the_Limits_of_Compact_Model_for_Age_Estimation_CVPR_2019_paper.html
Github: https://github.com/whai362/AE_TextSpotter

Face Swapping and Reenactment (1/2)
Paper Title: First Order Motion Model for Image Animation
Conference: NeurIPS 2019
Authors: Aliaksandr Siarohin, Stéphane Lathuilière, Sergey Tulyakov, Elisa Ricci, Nicu Sebe
Paper: https://papers.nips.cc/paper/2019/file/31c0b36aef265d9221af80872ceb62f9-Paper.pdf
Github: https://github.com/AliaksandrSiarohin/first-order-model

Face Swapping and Reenactment (2/2)
Paper Title: Real-Time Audio-Guided Multi-Face Reenactment
Conference: ICASSP 2021
Authors: Jiangning Zhang, Xianfang Zeng, Chao Xu, and Yong Liu
Paper: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9552566
Github: https://github.com/zhangzjn/APB2FaceV2
