
Htran DAV C0 Introduction

The document outlines a course on Data Analysis and Visualization (AC4110E) led by Assoc. Prof. Thanh-Hai Tran, detailing its objectives, grading structure, and course modules. It covers the significance of data analysis and visualization, challenges faced in the field, and practical applications in various domains. Students will learn techniques for data representation and engage in projects to extract insights from multimedia data.


Course code: AC4110E

DATA ANALYSIS AND VISUALIZATION


Lecturer: Assoc. Prof. Thanh-Hai Tran

Outline

• Introduction to data analysis and visualization
• Syllabus of the course
• Position of the course in the ET-E16 program
• Course grading
• Tools
• References

Introduction to Data Analysis and Visualization (DAV)

Source: https://www.ontotext.com/knowledgehub/fundamentals/dikw-pyramid/

Introduction to Data Analysis and Visualization (DAV)

Source: https://www.i-scoop.eu/big-data-action-value-context/dikw-model/

Introduction to data analysis and visualization

• Data refers to discrete, objective facts in digital form.
• Data are the basic building blocks that, when organized and arranged in different ways, lead to information that is useful in answering questions about the business.
• Some forms of data
▪ CSV files
▪ Database tables
▪ Document formats (Excel, PDF, Word, and so on)
▪ HTML files
▪ JSON files
▪ Text files
▪ XML files
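
As a small illustration of the forms listed above, the same records can arrive as CSV text or as JSON text and still be loaded into one common in-memory form. This is only a sketch using Python's standard library; the sample records and the field names `city` and `temp_c` are invented for illustration.

```python
import csv
import io
import json

# Hypothetical sample data: the same two records in two formats.
CSV_TEXT = "city,temp_c\nHanoi,31.5\nHue,29.0\n"
JSON_TEXT = '[{"city": "Hanoi", "temp_c": 31.5}, {"city": "Hue", "temp_c": 29.0}]'


def load_csv(text):
    """Parse CSV text into a list of dicts, converting temp_c to float."""
    rows = csv.DictReader(io.StringIO(text))
    return [{"city": r["city"], "temp_c": float(r["temp_c"])} for r in rows]


def load_json(text):
    """Parse JSON text into a list of dicts (numbers are already typed)."""
    return json.loads(text)


csv_records = load_csv(CSV_TEXT)
json_records = load_json(JSON_TEXT)

# Both formats yield the same list of records once loaded.
print(csv_records == json_records)
```

CSV carries every value as text, so the loader has to convert types itself, whereas JSON already distinguishes numbers from strings.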

Introduction to data analysis and visualization

Source: https://lawtomated.com/structured-data-vs-unstructured-data-what-are-they-and-why-care/

What is data analysis and visualization

• Information is processed data presented as an answer to a business question. Data becomes information when we add a relationship or an association.
• Knowledge emerges when humans interpret and organize information and use it to drive decision-making.
• Data analysis (also known as data analytics) relies on mathematical algorithms to determine relationships in the data that can yield insight.
• Data visualization helps us gain insight (the hidden truth) into the data or information.

Why do we need data analysis and visualization?

Why do we need data analysis and visualization?

Score distribution for the A1 subject group in Hà Giang, 2018

Why do we need data analysis and visualization?

Source: https://fmarket.vn

Why do we need data analysis and visualization?

Source: https://iboard.ssi.com.vn/

What is data analysis and visualization

What is data analysis and visualization

Data visualization

What is data analysis and visualization

• Example of data visualization

Example of data visualization

Grad-CAM visualization. a) DU. b) GC. c) EC. d) Gastritis. e) Esophagitis

Nguyen, Phuong-Thao, et al. "Automatic classification of upper gastrointestinal tract diseases from endoscopic images." 2022 11th International Conference on Control, Automation and Information Sciences (ICCAIS). IEEE, 2022.

Example of data visualization

t-SNE visualization of the deep feature space in the cross-validation fold leaving the last subject out of the MICAGes dataset. First column: original feature space; second column: MvDA common space; third column: MvDA-vc common space; fourth column: pc-MvDA common space. Best viewed in color.

Tran, Hoang-Nhat, et al. "Pairwise-covariance multi-view discriminant analysis for robust cross-view human action recognition." IEEE Access 9 (2021): 76097-76111.

What is data analysis and visualization

• Example (figure panels: 1 joint, 10 joints, 20 joints)

What is data analysis and visualization

Introduction to data analysis and visualization

• What: (concept)
o Data analysis is the process of
▪ examining, cleaning, transforming, and interpreting data
▪ to uncover meaningful insights, trends, and patterns that can support decision-making
o Data visualization is
▪ the graphical representation of information and data;
▪ through visual elements such as charts, graphs, and maps, it allows stakeholders to understand complex data more effectively and identify patterns, trends, and outliers.
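
The four verbs in the definition of data analysis above (examining, cleaning, transforming, interpreting) can be sketched as a tiny pipeline. This is a minimal illustration in standard-library Python, not the course's prescribed workflow; the score list, the 10-point scale, and the 0.7 threshold are all assumptions made up for the example.

```python
from statistics import mean

# Hypothetical raw scores; None marks a missing value, and one entry is text.
raw_scores = [7.5, None, 8.0, "6.5", 9.0, None, 5.0]


def examine(values):
    """Examine: count how many entries are missing."""
    return sum(1 for v in values if v is None)


def clean(values):
    """Clean: drop missing entries and convert the rest to float."""
    return [float(v) for v in values if v is not None]


def transform(values, scale=10.0):
    """Transform: rescale scores from [0, scale] to [0, 1]."""
    return [v / scale for v in values]


def interpret(values, threshold=0.7):
    """Interpret: summarize with the mean and the share at or above a threshold."""
    share = sum(1 for v in values if v >= threshold) / len(values)
    return {"mean": mean(values), "share_above": share}


missing = examine(raw_scores)
summary = interpret(transform(clean(raw_scores)))
print(missing, summary)
```

Dropping missing entries is only one possible cleaning choice; imputing them is another, and the choice changes the summary.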

Introduction to data analysis and visualization

• Why: (significance)
o Data analysis
▪ Informed decision making
▪ Identify trends and patterns
▪ Problem-solving and root cause analysis
▪ Risk management
▪ Performance evaluation
▪ Prediction
o Data visualization
▪ Simplify complex data for better understanding
▪ Highlight key trends and insights
▪ Facilitate faster decision-making
▪ Communicate information to both technical and non-technical audiences

Introduction to data analysis and visualization

• How:

Main challenges of DAV

• Data quality issues
o Incomplete or missing data
o Inconsistent data formats
o Noisy data
• Data volume and scalability
o Large datasets
o Real-time analysis
• Data privacy and security
o Sensitive information exposure
o Ethical concerns
• Understanding and interpreting visualizations
o Misleading graphs and charts
o Overplotting in charts
• Data bias and ethical implications
o Algorithmic bias
o Cherry-picking data
• Integration of multisource data
o Heterogeneous data sources
o Data latency
• Performance and scalability in visualization
o Rendering complex visuals
o Mobile-platform and cross-platform compatibility
• Technology trends
o Rapid evolution of tools

Objective of the course

• In this course, students gain background knowledge of the data analysis process.
• They then study techniques for representing data in compact form.
• By the end of the course, students will
o practice through projects involving content analysis,
o visually represent content from multimedia data,
o extract useful information from a variety of data sources for specific fields such as economics, education, and healthcare.

Course modules

1. Preprocessing techniques (missing data, noise, normalization)
2. Data visualization (low-dimensional and high-dimensional data visualization techniques, e.g. SOM, Manifold, NMF)
3. Correlation and Feature Selection (ANOVA, Chi-Square Test, Pearson)
4. Multimodality analysis and visualization (late fusion, early fusion, CCA)

Final projects
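
To make modules 1 and 3 concrete, here is a minimal sketch of filling a missing value, normalizing, and computing the Pearson correlation coefficient. Mean imputation and z-score normalization are chosen here only as simple, common options; the slides do not say which techniques the course uses, and the `hours` and `score` data are invented.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical measurements; None marks a missing value.
hours = [1.0, 2.0, None, 4.0, 5.0]
score = [2.0, 4.0, 6.0, 8.0, 10.0]


def impute_mean(values):
    """Replace missing entries with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def z_score(values):
    """Normalize to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


hours_filled = impute_mean(hours)
r = pearson(z_score(hours_filled), z_score(score))
print(hours_filled, round(r, 3))
```

The Pearson coefficient is unchanged by z-score normalization, so the normalization step here only demonstrates the preprocessing. It matters more for methods that are sensitive to scale.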

Data analysis and visualization in multimedia ecosystem

[Diagram with three columns]
• Multimedia elements: Video, Audio, Animation, Graphics, Text
• Multimedia ecosystem: Applications; Synthesized Content Production; Rendering and Production; Analysis and Understanding; Multimedia Signal Management and Processing; Processing and Representation; Multimedia Communication; Multimedia Management Systems; Multimedia Data Collection
• Used technology: Software/API/UI/UX; Artificial Intelligence (Machine learning/Deep learning); Techniques/Algorithms; Sensors, Devices and Hardware platform

Source: ET E16 programme




Position of Data analysis and visualization

[Diagram, by layer]
• Multimedia Applications
o Applications in: Education, Operations, Public, Business office, Home, Information systems, Art, entertainment, biomedical, social network, robotics, etc.
o Software platforms: Web, mobile, desktop, Edge, etc.
o Multimedia authoring and programming: language programming and tools for games, effects, etc. (Unity, C/C#, Java)
o HCI - UI/UX: user models, task analysis, interface design, interactivity, usability test and evaluation
• Rendering & Production
o Content design: Media Arts, Design thinking, Scenario and Scripting, Visual Presentation, Culture and Communication
o Synthesized content production: computer graphic (2/3D modeling and animation), computational photography, image effect (wrapping/blending/morphing), AR-VR, Games, e-music (delay, echo, chorus, equalization, 3D audio effects), multimedia element combination, text summarization, chat bot, etc.
• Analysis and Understanding
o Computer Vision: Object detection, Tracking, Video segmentation and summarization, multimodality, etc.
o Machine learning: classification and regression, Deep Learning, etc.
o Speech Recognition and Synthesis: Acoustic/Phonetic model, Language model, Decoding
• Multimedia Processing and Representation
o Image/Video representation and processing: color space, geometric transform, filtering, de-noising, enhancement, manipulation, segmentation, feature extraction
o Music and Sound Processing: filtering, de-noising, enhancement, source separation, voice/music detection
o Speech and Text Processing: linguistic and non-linguistic speech features, speech signal representation, word segmentation, POS tagging, syntactic/semantic parsing
o Image and Video/Audio Coding and Compression: Still image compensation, motion estimation, video compression, compression standards
o Multidimensional Signal Processing: random process, information theory, 2-D and 3-D signal processing theory, super-resolution methods, etc.
• Multimedia Management
o Big data and Analytics: crowd sourcing, large scale and big multimedia data retrieval; Inference methods, hidden semantic analysis, big heterogeneous data
o Multimedia Indexing and Retrieval: Data models, database management system (indexing, similarity, search, ranked retrieval, user relevance feedback)
• Multimedia data collection
o System Software: Computer Architecture, Operating Systems, Embedded systems, etc.
o Multimedia Sensors/IoT Devices: audio/video sensors, display and interactive systems
• Multimedia Communication
o Transmission (Edge/cloud, 4-5G,…): Mobile multimedia, transmission mechanisms, next-generation wired/wireless networks, transport protocols, modeling of multimedia networks
o Services, Security and Privacy: Transmission reliability, QoS, QoE; Security and privacy for multimedia data.

Course Grading

• Process grading (30%):
o Short exercises, quizzes
o Weekly exercise (1-2 students): each group has a week to prepare and present
o Class attendance
o Assignments
• Final grading based on projects (70%):
o Groups of 2 students
o Evaluation based on the quality and size of the project
o Individual-based (quality of report, code, presentation and answers)
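
The weighting above amounts to a weighted sum of the two components. A minimal sketch, assuming both components are marked on the same scale; the 10-point scale and the example marks are assumptions, not stated on the slide.

```python
PROCESS_WEIGHT = 0.3  # process grading, from the slide
PROJECT_WEIGHT = 0.7  # final project, from the slide


def final_grade(process, project):
    """Weighted sum of the two components, both on the same scale."""
    return PROCESS_WEIGHT * process + PROJECT_WEIGHT * project


# Hypothetical marks on an assumed 10-point scale.
grade = final_grade(process=8.0, project=9.0)
print(round(grade, 2))
```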

Capstone project

• The results will be presented at the end of the course.
• Every member is required to contribute to the project and the presentation.
• Project report:
o Source code: save your code in one zip file
o Readme.txt: describe clearly how to set up, compile, and run your program
o Written report:
▪ Introduce the problem to be solved and the data sets used
▪ Details of the methods for analyzing data
▪ Results of different evaluations, new conclusions/findings, …
▪ The main components of your code
▪ The difficulties in this project, and your proposed solutions

Tools

References

• Lectures:
• Reference books:
o Grus, Joel. Data Science from Scratch: First Principles with Python. O'Reilly Media, Inc., 2015.
o Phuong Vo et al. Python: Data Analytics and Visualization. Packt Publishing, 2017.

Conclusions

• Concept of data analysis and visualization


• Significance and main steps of DAV
• Tools for DAV
• References

THANK YOU!
