Introduction to Multimedia Databases
This document provides an introduction to multimedia databases. It discusses the explosion of digitized text, images, audio, and video data and how this data is stored and processed more easily in digital form. Examples are given of applications that involve storing, retrieving, processing and sharing multimedia data, such as journalism, searching movies and the web. The characteristics of different media types like text, audio, images and video are described. Digital representation and compression techniques for these formats are also outlined. The document concludes with an overview of multimedia information retrieval systems and their typical components like archiving, feature extraction, searching and querying.


Module 1

INTRODUCTION TO
MULTIMEDIA DATABASES
Prof. Dr. Naomie Salim
Faculty of Computer Science & Information
Systems
Universiti Teknologi Malaysia

The Explosion of Digital Multimedia Information
We interact with multimedia every day
Large amounts of text, images, speech & video are converted to digital form
Advantages of digitized data over
analog
Easy storage
Easy processing
Easy sharing

Give examples of multimedia applications that deal with storing, retrieving, processing and sharing of multimedia data

Eg 1. Journalism
A journalist is to write an article about the influence of alcohol on driving
Investigation involved:
Collect news articles about accidents,
scientific reports, television
commercials, police interviews, medical
experts interviews

Illustration:
Search photo archives, stock footage
companies for good photos
shocking, funny, etc.

Other examples
Searching movies
Based on taste of movies already seen
Based on movies a friend favors

Searching on web
Eg. searching the Australian Open website (http://www.ausopen.org)
Integrate conceptual terms + interesting
events
give info about video segments
showing female American tennis players
going to the net

Retrieval problems
EMPLOYEE (Name: char(20), City:
Char(20), Photo: Image)
How do you select employees in Skudai?
How do you select employees who wear a headscarf (tudung), wear glasses, are fair-skinned and have a mole under the lips?

Characteristics of Media
Data
Medium: a type of information representation
Alphanumeric
Audio, video and image traditionally in
analog representation;

Static vs dynamic
Static: do not have time dimensions
(alphanumeric data, images, graphics)
Dynamic: have time dimensions (video,
animation, audio)

Multimedia
Collection of media types used together

Digital representation of
text
OCR (Optical character recognition) techniques
convert analog text to digital text
Eg. of digital representation: ASCII
Uses 8 bits per character
Chinese characters require more space
Storage requirements depend on number of characters

Structured documents becoming more popular


Docs consist of titles, chapters, sections, paragraphs,
etc.
Standards like HTML and XML used to encode structured
information

Compression of text
Huffman, arithmetic coding
Since storage requirements for text are not too high, compression is less important

Digital representation of
audio
Audio: air pressure waves with frequency and amplitude
Humans hear 20-20,000 Hertz
Low amplitude: soft sound

Digitizing pressure
waveforms
Transform into electrical signal (by microphone)
Convert into discrete values
Sampling: continuous time axis divided into small, fixed intervals
Quantization: determination of amplitude of the audio signal at the beginning of each time interval
Humans cannot notice the difference between analog & digital with sufficiently fine sampling & quantization
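The two steps above can be sketched for a pure tone; the sample rate, bit depth and the 1 kHz test tone below are illustrative choices, not values from the slides:

```python
# Sketch of sampling & quantization of a continuous audio signal.
import math

def sample_and_quantize(signal, duration, sample_rate, bits):
    """Sample a continuous signal at fixed intervals and round each
    amplitude (assumed in [-1, 1]) to one of 2**bits discrete levels."""
    levels = 2 ** bits
    step = 2.0 / levels                  # amplitude range covered per level
    samples = []
    n = int(duration * sample_rate)
    for i in range(n):
        t = i / sample_rate              # sampling: fixed time grid
        x = signal(t)
        q = round((x + 1.0) / step)      # quantization: nearest level
        q = min(q, levels - 1)
        samples.append(q)
    return samples

tone = lambda t: math.sin(2 * math.pi * 1000 * t)   # a 1 kHz test tone
pcm = sample_and_quantize(tone, duration=0.001, sample_rate=44100, bits=16)
```

Each quantized value is within half a quantization step of the original amplitude, which is why sufficiently fine sampling and quantization are inaudible.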

Audio storage
requirements
Example of CD audio:
16 bits per sample
44,100 samples per second
Two (stereo) channels
Requirements = 16 * 44,100 * 2 bits ≈ 1.4 Mbit per second
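The bit-rate arithmetic can be spelled out directly:

```python
# Sketch of the CD-audio storage calculation from the slide.
bits_per_sample = 16
samples_per_second = 44100   # standard CD rate (sometimes rounded to 44,000)
channels = 2                 # stereo

bits_per_second = bits_per_sample * samples_per_second * channels
megabits_per_second = bits_per_second / 1_000_000   # ~1.4 Mbit per second
```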

Compression (examples)
Masking: discard soft sounds made inaudible by louder sounds
Speech: coding of lower-frequency sounds only
MPEG: audio compression standards

Digital representation of
image
Scan analog photos & pictures using
scanner
Analog image approximated by rectangle of
small dots
In digital camera, ADC is built-in

Image consists of many small dots or picture elements (pixels)
Gray scale: 1 byte (8 bits) per pixel
Color: 3 colors (RGB) of one byte each
Data required for 1 rectangular screen:
A = x * y * b
A: number of bytes needed, x: # pixels per horizontal line, y: # pixels per vertical line, b: bytes per pixel

Image compression
Exploit redundancy in image &
properties of human perception
Spatial redundancy: pixels in certain
area often appear similar (golden sand,
blue sky)
Human tolerance: error still allows
effective communication

Eg. of image compression:
Transform coding
Fractal image coding

Digital representation of
video

Sequence of frames or images presented at a fixed rate
Digital video obtained by digitizing analog videos or from digital cameras
Playing 25 frames per second gives the illusion of a continuous view

Amount of data to represent video:
1 second of video: 512 lines, 512 pixels per line, 24 bits (3 bytes) per pixel, 25 frames per second
512 * 512 * 3 * 25 bytes ≈ 19 Mbytes
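The storage arithmetic can be spelled out, taking 1 Mbyte as 10**6 bytes to match the slide's rounding:

```python
# Sketch of the uncompressed-video storage calculation from the slide.
lines = 512
pixels_per_line = 512
bytes_per_pixel = 3          # 24 bits per pixel
frames_per_second = 25

bytes_per_frame = lines * pixels_per_line * bytes_per_pixel
bytes_per_second = bytes_per_frame * frames_per_second
mbytes_per_second = bytes_per_second / 1_000_000   # ~19.7, "19 Mbytes" on the slide
```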

Compression of video

Compressing frames of video: similar to image compression
Reduce redundancy & exploit human perception properties
Temporal redundancy: neighboring frames normally similar; removed by applying motion estimation & compensation
Each image divided into fixed-sized blocks
For each block in an image, the most similar block in the previous image is determined & the pixel difference computed
Together with the displacement between the two blocks, this difference is stored or transmitted
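The block-matching steps above can be sketched as follows, assuming exhaustive search with a sum-of-absolute-differences cost; the frames, block size and search range are illustrative:

```python
# Sketch of block-matching motion estimation on gray-value frames
# represented as nested lists.

def sad(prev, top, left, block):
    """Sum of absolute differences between `block` and the same-sized
    region of `prev` whose top-left corner is (top, left)."""
    return sum(
        abs(prev[top + r][left + c] - block[r][c])
        for r in range(len(block))
        for c in range(len(block[0]))
    )

def best_displacement(prev, block, top, left, search=2):
    """Find the displacement (dy, dx) into the previous frame whose
    block matches `block` (located at (top, left)) most closely."""
    h, w = len(prev), len(prev[0])
    bh, bw = len(block), len(block[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ty, tx = top + dy, left + dx
            if 0 <= ty <= h - bh and 0 <= tx <= w - bw:
                cost = sad(prev, ty, tx, block)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]

# The scene content moved one pixel to the right between the frames, so
# the encoder only needs the displacement plus a (here zero) residual.
prev_frame = [[0, 0, 0, 0],
              [0, 9, 8, 0],
              [0, 7, 6, 0],
              [0, 0, 0, 0]]
cur_block = [[9, 8], [7, 6]]          # block at (1, 2) in the current frame
dy, dx = best_displacement(prev_frame, cur_block, top=1, left=2)
```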

MPEG-1 (VHS quality, pixel-based coding): coding of video data up to a speed of 1.5 Mbit per second
MPEG-2 (pixel-based coding): coding of video data up to a speed of 10 Mbit per second
MPEG-4 (multimedia data, object-based coding): coding of video data up to a speed of 40 Mbit per second; tools for decoding & representing video objects; supports content-based indexing & retrieval

How to search for images or multimedia data?
Analyze one by one?
No! Takes too long!
Have to use metadata instead of searching directly: search the metadata that has been added to the objects
Metadata requirements to be valuable for searching:
Description of multimedia object should be as complete as possible
Storage of metadata must not take too much overhead
Comparison of two metadata values must be efficient

Metadata of Multimedia
Objects
Descriptive data
Gives format or factual info about the multimedia object
Eg.: author name, creation date, length of multimedia object, representation technique
Eg. standard for descriptive data: Dublin Core
Can use SQL (metadata condition in WHERE clause)
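A minimal sketch of such a query, reusing the EMPLOYEE schema from the earlier "Retrieval problems" slide; the names and the use of SQLite are illustrative:

```python
# Sketch: descriptive metadata queried with plain SQL; the image itself
# would normally live in a content server, with only a reference here.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMPLOYEE (Name CHAR(20), City CHAR(20), Photo BLOB)"
)
conn.executemany(
    "INSERT INTO EMPLOYEE VALUES (?, ?, ?)",
    [("Aina", "Skudai", b"..."), ("Ben", "Johor Bahru", b"...")],
)

# Descriptive metadata can go straight into the WHERE clause ...
rows = conn.execute(
    "SELECT Name FROM EMPLOYEE WHERE City = ?", ("Skudai",)
).fetchall()
# ... but "employees who wear glasses" cannot be answered this way:
# nothing in the raw Photo bytes is directly comparable by SQL.
```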

Metadata of Multimedia
Objects (cont.)
Annotations
Textual description of contents of objects
Eg.: photo description in Facebook
Either free format or sequence of keywords
Manual text annotations allow Information Retrieval techniques to be used, but:
Time consuming, expensive
Subjective, incomplete
Structured concepts (eg. semantic web, ER-like schema) can be used to describe content through concepts, their relationships to each other & to the MM object, but:
Also slow and expensive

Metadata of Multimedia
Objects (cont.)
Features
Derive characteristics from MM object
itself
Need language to describe features, eg.
MPEG-7
Process to capture features from MM
object is called feature extraction
Performed automatically, sometimes with
human support

Two feature classes:
Low-level features
High-level features

Low-level Features

Grasp data patterns & statistics of the MM object
Depend strongly on the medium
Extraction performed automatically
Eg. for text:
List of keywords with frequency indicators
Eg. for audio:
Representation
Amplitude-time sequence: quantification of air pressure at each sample
Silence: 0; above silence: +ve amplitude; below silence: -ve amplitude
Eg. of low-level features derived:
Energy (loudness of signal), ZCR (zero crossing rate: frequency of sign change; high indicates speech), silence ratio (low indicates music)
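A minimal sketch of two of the audio features named above, energy and zero crossing rate, operating on a plain list of amplitude samples; the sample lists are made up for illustration:

```python
# Sketch of two audio low-level features: energy and zero crossing rate.

def energy(samples):
    """Average squared amplitude: a rough measure of loudness."""
    return sum(s * s for s in samples) / len(samples)

def zero_crossing_rate(samples):
    """Fraction of consecutive sample pairs whose signs differ;
    high values tend to indicate speech rather than music."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(samples) - 1)

noisy = [1, -1, 1, -1, 1, -1, 1, -1]      # sign alternates every sample
smooth = [1, 2, 3, 2, 1, -1, -2, -1]      # only one sign change
```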

Low-level features
(cont.)
Eg. for images:
Color histograms: # pixels having color in a certain range
Spatial relationships: eg. blue pattern appears above yellow (beach photo)
Contrast: # dark spots neighboring light spots
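A small sketch of a color histogram, assuming pixels are (R, G, B) triples and each channel is reduced to a few coarse ranges; the "beach photo" pixels are made up for illustration:

```python
# Sketch of a color histogram over coarse (R, G, B) ranges.

def color_histogram(pixels, bins_per_channel=4):
    """Count pixels per coarse (R, G, B) range; 4 bins per channel
    gives 4**3 = 64 buckets in total."""
    bin_width = 256 // bins_per_channel
    hist = {}
    for r, g, b in pixels:
        key = (r // bin_width, g // bin_width, b // bin_width)
        hist[key] = hist.get(key, 0) + 1
    return hist

# A "beach photo": mostly blue-sky pixels plus a few sand-colored ones.
sky = [(30, 60, 220)] * 6
sand = [(220, 190, 120)] * 2
hist = color_histogram(sky + sand)
```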

Eg. for video:
Use low-level features for images
Eg. of temporal dimension: shot change, detected when the pixel difference between two images is higher than a certain threshold
Shot: sequence of images taken with the same camera position
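The threshold test above can be sketched as follows; the frames (flat lists of gray values) and the threshold are made up for illustration:

```python
# Sketch of threshold-based shot-change detection.

def shot_changes(frames, threshold):
    """Indices i where frame i differs 'too much' from frame i-1."""
    boundaries = []
    for i in range(1, len(frames)):
        diff = sum(abs(a - b) for a, b in zip(frames[i - 1], frames[i]))
        if diff > threshold:
            boundaries.append(i)
    return boundaries

frames = [
    [10, 10, 10, 10],     # shot 1
    [11, 10, 10, 12],     # small change: same shot
    [200, 200, 0, 0],     # abrupt change: a new shot starts here
    [201, 199, 0, 1],
]
cuts = shot_changes(frames, threshold=50)
```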

High-level features
Features which are meaningful to the end user, such as golf course, forest
How can we bridge the semantic gap between low-level and high-level features?
High-level feature extraction from low-level features:
Eg. text containing the words football, referee indicates a football-match text
Eg. speech-to-text translators (low-level audio features to text)
Eg. video, domain-specific: loud sound from the crowd, a round object passing the white line indicate a goal

Multimedia Information
Retrieval System (MIRS)

Component of MIRS: Archiving
MM data stored separately from its metadata
Voluminous
Visible or audible delays in playback unacceptable
MM data managed separately in a MM content server
Objects get an identification at storage time, to be used by other parts of the MIRS
Have to deal with compression and decompression

Component of MIRS: Feature Extraction (Indexing)
Extraction of metadata (annotations, descriptions,
features) from incoming multimedia object
Algorithms have to consider extraction
dependencies. Eg.:
Video object segmented, choose key frame for each
segment
Extract low-level features from key frame
Based on low-level features, classify into shots of
audience, fields, close-ups
For field shots, detect positions of players
Extract body related features of players
Determine where net playing begins and ends

Have to consider incremental maintenance (modification of MM objects, extractors, etc.)

Incremental Maintenance
in ACOI Feature
Extraction Architecture

Component of MIRS: Searching
Multimedia queries are diverse, can be specified in many different ways
No exact match; many ways to describe MM objects
Specifying information need:
Direct: user specifies the info need herself
Indirect: user relies on other users

Possible Querying
Scenarios

Possible Querying
Scenarios (cont.)
Queries based on Profile
Users expose preferences in one way or
another
Preferences stored in user profile in
MIRS
Can use the profile of a trusted friend if not sure

Queries based on Descriptive Data
Based on format and facts about the MM object
Eg. all movies with Director = Steven

Possible Querying
Scenarios (cont.)
Queries based on Annotations
Text-based: keywords or natural language
Eg. Show me the video in which Barack Obama shakes hands with Mahathir Mohamad
Set of keywords derived from the query & compared with keywords in the annotations of movies

Queries based on Features
Content-based queries
Features derived (semi-)automatically from the content of the MM object
Low & high level features used
Eg. Find all photos with a color distribution like this photo
Eg. Give me all football videos in which a goal is scored within the last ten minutes

Possible Querying
Scenarios (cont.)
Query by example
Give an example MM object
MIRS extracts all kinds of features from the MM object
Resulting query based on these features

Similarity
Degree to which the query & a MM object of the MIRS are similar
Similarity calculated by MIRS based on metadata of the MM object & the query
Tries to estimate the relevance of the MM object to the user
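The slides do not fix a particular similarity measure; cosine similarity over feature vectors is one common, minimal choice:

```python
# Sketch of cosine similarity between a query vector and an object's
# feature vector: 1.0 for identical directions, 0.0 for orthogonal ones.
import math

def cosine_similarity(q, d):
    dot = sum(a * b for a, b in zip(q, d))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_d = math.sqrt(sum(b * b for b in d))
    if norm_q == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_q * norm_d)

query = [1.0, 0.0, 2.0]          # e.g. keyword weights or color features
same = cosine_similarity(query, [2.0, 0.0, 4.0])     # same direction
other = cosine_similarity(query, [0.0, 3.0, 0.0])    # orthogonal
```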

General Retrieval Model

Relevance Feedback
Helps when the user doesn't know exactly what he is looking for, causing problems in query formulation
Interactive approach:
User issues a starting query, MIRS composes a result set, user judges the output (relevant/not), MIRS refines the query & repeats
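One classic way to implement this loop (not named on the slides) is Rocchio-style query refinement over feature vectors; the weights below are conventional defaults, not values from the source:

```python
# Sketch of Rocchio relevance feedback: move the query vector toward
# objects judged relevant and away from those judged non-relevant.

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.25):
    """Return the refined query vector for the next retrieval round."""
    def mean(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(col) / len(vectors) for col in zip(*vectors)]
    rel, nonrel = mean(relevant), mean(nonrelevant)
    return [
        alpha * q + beta * r - gamma * n
        for q, r, n in zip(query, rel, nonrel)
    ]

start = [1.0, 0.0]
refined = rocchio(start, relevant=[[0.0, 2.0]], nonrelevant=[[2.0, 0.0]])
```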

Component of MIRS: Browsing
Users sometimes cannot precisely specify what they want, but can recognize it when they see it
Browsing lets the user scan through objects
Exploits hyperlinks which lead the user from one object to another
When an object is shown, the user judges its relevance & proceeds accordingly
If objects are huge, icons are used
Starting point:
A query that describes the info need, or the system provides a starting point
User can ask for another starting point if not satisfied
Can classify objects based on topics & subtopics

Component of MIRS: Output Presentation (Play)
When MIRS returns a list of objects, the system has to decide whether the user has the right to see them
User interface should be able to show all kinds of MM data
What if objects are huge and the result set large?
Give the user a perception of the content of the object
Extract & present essential info for the user to browse & select objects
Text: title, summary, places where keywords occur
Audio: tune, start of song

Component of MIRS: Output Presentation (cont.)
Streaming
Content sent to the client at a specific rate and, except for buffering, played directly
Audio & video delivered as a continuous stream of packets
When resources become scarce:
Use switched Ethernet instead of shared Ethernet
Use disk striping
Skip frames during play-back
Fragment content over several content servers (need a logical component between client & servers to direct client requests to the corresponding server)

Quality of MIRS
Recall = r/R
Precision = r/n
r: # of relevant objects returned by the system
n: # of objects retrieved
R: # of relevant objects in the collection
Relevance judged by humans; refer to TREC (Text Retrieval Conference)
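The two measures can be sketched over sets of object identifiers; the example sets are made up for illustration:

```python
# Sketch of the recall and precision computations from the slide.

def recall_precision(retrieved, relevant):
    """r = relevant objects actually returned; recall = r/R, precision = r/n."""
    r = len(retrieved & relevant)
    recall = r / len(relevant)       # R: relevant objects in the collection
    precision = r / len(retrieved)   # n: objects retrieved
    return recall, precision

retrieved = {"v1", "v2", "v3", "v4"}          # n = 4
relevant = {"v1", "v3", "v7", "v8", "v9"}     # R = 5
recall, precision = recall_precision(retrieved, relevant)
```

Here only 2 of the 5 relevant objects were returned (recall 0.4), and 2 of the 4 returned objects were relevant (precision 0.5).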

Exercise
Discuss the role of DBMS in storing
MM objects
Discuss the role of Information
Retrieval systems in storing MM
objects

End of Module 1
