Multimedia Data Mining

1. Multimedia data mining involves extracting patterns and knowledge from multimedia data such as images, video, and audio. It differs from traditional data mining due to the complex, unstructured nature of multimedia data.
2. There are three main approaches to automatic annotation in multimedia data mining: assigning keywords, clustering documents and then assigning keywords, and mining concepts without manual annotation by using contextual information.
3. Key aspects of multimedia data mining include media compression/storage, streaming, editing, indexing/retrieval, and creating interactive multimedia systems. Pattern discovery is a core part of the mining process.

Uploaded by

Ankita Singh
Copyright
© Attribution Non-Commercial (BY-NC)

MULTIMEDIA DATA MINING

Introduction
Data management lies at the heart of a multimedia information system. The spatial,
temporal, storage, retrieval, integration and presentation requirements of multimedia data
differ significantly from those of traditional data. Hence the goal of a multimedia data
management system is to allow efficient storage and manipulation of multimedia data in all
its varied forms.

There are four types of multimedia data: audio data, which includes sound, speech,
and music; image data (black-and-white and colour images); video data, which includes
time-aligned sequences of images; and electronic or digital ink, which is a sequence of
time-aligned 2D or 3D coordinates of a stylus, a light pen, data glove sensors, or a similar
device. All of this data is generated by specific kinds of sensors.

The concept of mining in multimedia is also referred to as automatic annotation or annotation
mining. There appear to be three main pattern discovery approaches that have been used for
automatic annotation in multimedia data mining. These approaches differ primarily in how
external knowledge is provided to mine concepts.

The first approach consists of assigning keywords or classifying the data. The second
approach to automatic annotation is through clustering: multimedia documents are
clustered first, and the resulting clusters are then assigned keywords by an annotator. The
third approach does not rely on a manual annotator; it tries to mine concepts from
contextual information.

Multimedia Data Mining (MDM) is a part of multimedia technology, which covers the
following areas.

• Media compression and storage.
• Delivering streaming media over networks with the required quality of service.
• Media restoration, transformation, and editing.
• Media indexing, summarization, search, and retrieval.
• Creating interactive multimedia systems for learning/training and creative art
production.
• Creating multimodal user interfaces.

MULTIMEDIA DATA MINING ARCHITECTURE

The data mining process consists of several stages, which are interrelated and
interactive. The main stages of the data mining process are

(1) domain understanding;

(2) data selection;

(3) cleaning and preprocessing;

(4) discovering patterns;

(5) interpretation;

(6) reporting and using discovered knowledge.

Figure: Multimedia Data Mining Architecture


1. The domain understanding stage requires learning how the results of data mining will
be used, so as to gather all relevant prior knowledge before mining.

2. The data selection stage requires the user to target a database or select a subset of
fields or data records to be used for data mining. Proper domain understanding at this
stage helps in the identification of useful data. This is the most time-consuming stage
of the entire data mining process for business applications; data are never clean and in
a form suitable for data mining. For multimedia data mining, this stage is generally
not an issue, because the data are not in relational form and there are no subsets of
fields to choose from.

3. The next stage in a typical data mining process is the preprocessing step, which involves
integrating data from different sources and making choices about representing or
coding certain data fields that serve as inputs to the pattern discovery stage. Such
representation choices are needed because certain fields may contain data at levels of
detail not considered suitable for the pattern discovery stage. The preprocessing stage
is of considerable importance in multimedia data mining, given the unstructured
nature of multimedia data.

4. The pattern discovery stage is the heart of the entire data mining process. It is the
stage where the hidden patterns and trends in the data are actually uncovered. There
are several approaches to the pattern discovery stage. These include association,
classification, clustering, regression, time-series analysis and visualization. Each of
these approaches can be implemented through one of several competing
methodologies, such as statistical data analysis, machine learning, neural networks
and pattern recognition. It is because of the use of methodologies from several
disciplines that data mining is often viewed as a multidisciplinary field.

5. The interpretation stage of the data mining process is used to evaluate the quality of
the discovery and its value, in order to determine whether a previous stage should be
revisited. Proper domain understanding is crucial at this stage to put a value on the
discovered patterns.

6. The final stage of the data mining process consists of reporting and putting to use the
discovered knowledge to generate new actions, products, services or marketing
strategies, as the case may be.

Advantages of Multimedia Data Mining

1. Generation of indexing schemes based on terms related to regularities discovered
in other media types (semantic extraction), on structural patterns discovered in
multimedia (graph indexing), and on one case library and its dynamic nature
2. Retrieval: flexibility in formulating queries
3. Adaptation of the new case description based on the user's feedback
4. The case-based mechanism provides for incorporation and management of the
discovered knowledge
5. Multimedia data mining can improve the case-based system and discover unknown
patterns
6. Modular approach to the case-based reasoning multimedia data mining model

Multimedia Miner

1. Multimedia Data Cube
• Image Excavator (extraction of images)
• Preprocessor (feature extractor)
• User interface
• Search engine
2. Multimedia Miner
• Characterizer, comparator
• Classifier, associator

Multi-Dimensional Search in Multimedia Databases

Multi-Dimensional Analysis in Multimedia Databases

Features and Standards of Multimedia Data Mining

Different image attributes such as colour, edges, shape, and texture are used to
extract features for mining. Feature extraction based on these attributes may be
performed at the global or local level. For example, colour histograms may be used as
features to characterize the overall distribution of colour in an image. Similarly, the shape
of a segmented region may be represented as a feature vector of Fourier descriptors to
capture the global shape property of the region, or a shape could be described in terms of
salient points or segments to provide localized descriptions. Global descriptors are generally
easy to compute, provide a compact representation, and are less prone to segmentation errors.
However, such descriptors may fail to uncover subtle patterns or changes in shape, because
global descriptors tend to integrate the underlying information. Local descriptors, on the
other hand, tend to generate more elaborate representations and can yield useful results even
when part of the underlying attribute, for example the shape of a region, is occluded or
missing. In the case of video, additional attributes resulting from object and camera motion
are used.
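To make the global versus local distinction concrete, the following is a minimal sketch of a quantized colour histogram computed once over a whole image and once per tile. The function names, the tile size and the tiny two-colour image are all hypothetical, chosen only for illustration; real systems typically use finer quantization and other colour spaces.

```python
from collections import Counter

def quantize(value, bins):
    """Map an 8-bit channel value (0-255) to a bin index in 0..bins-1."""
    return value * bins // 256

def colour_histogram(pixels, bins=2):
    """Normalized histogram over quantized (R, G, B) triples."""
    counts = Counter(
        (quantize(r, bins), quantize(g, bins), quantize(b, bins))
        for (r, g, b) in pixels
    )
    total = len(pixels)
    return {key: n / total for key, n in counts.items()}

def local_histograms(image, block):
    """One histogram per block x block tile, keyed by the tile's (top, left)."""
    height, width = len(image), len(image[0])
    result = {}
    for top in range(0, height, block):
        for left in range(0, width, block):
            tile = [
                image[y][x]
                for y in range(top, min(top + block, height))
                for x in range(left, min(left + block, width))
            ]
            result[(top, left)] = colour_histogram(tile)
    return result

# Hypothetical 2x4 image: left half red, right half blue.
RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED, RED, BLUE, BLUE],
         [RED, RED, BLUE, BLUE]]

global_hist = colour_histogram([p for row in image for p in row])
tiles = local_histograms(image, block=2)
```

Here the global histogram only records that half of the pixels are red and half are blue, while the per-tile histograms also show where each colour occurs. This is the kind of detail a global descriptor integrates away.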
In the case of audio, both temporal and spectral domain features have been
employed. Examples of the features used include short-time energy, pause rate,
zero-crossing rate, normalized harmonicity, fundamental frequency, frequency spectrum,
bandwidth, spectral centroid, spectral roll-off frequency and band energy ratio. Many
researchers have found the cepstral-based features, Mel-Frequency Cepstral Coefficients
(MFCC) and Linear Predictive Coefficients (LPC), very useful, especially in mining tasks
involving speech recognition.
The MPEG-7 standard provides a good representative set of features for multimedia
data. The features are referred to as descriptors in MPEG-7. The MPEG-7 Visual description
tools describe visual data such as images and videos, while the Audio description tools
account for audio data. The MPEG-7 visual description defines the following main features
for color attributes: Color Layout Descriptor, Color Structure Descriptor, Dominant Color
Descriptor and Scalable Color Descriptor. The Color Layout Descriptor is a compact and
resolution-invariant descriptor that is defined in the YCbCr color space to capture the
spatial distribution of color over major image regions. The Color Structure Descriptor
captures both color content and information about its spatial arrangement using a
structuring element that is moved over the image. The Dominant Color Descriptor
characterizes an image or an arbitrarily shaped region by a small number of representative
colors. The Scalable Color Descriptor is a color histogram in the HSV color space encoded
by a Haar transform to yield a scalable representation. While the above features are defined
with respect to an image or a part of it, the feature Group of Frames-Group of Pictures Color
(GoFGoPColor) describes the color histogram aggregated over multiple frames of a video.
MPEG-7 provides two main shape descriptors; others are based on these and
additional semantic information. The Region Shape Descriptor describes the shape of a
region using the Angular Radial Transform (ART). The description is provided in terms of 40
coefficients and is suitable for complex objects consisting of multiple disconnected regions
and for simple objects with or without holes. The Contour Shape Descriptor describes the
shape of an object based on its outline. The descriptor uses the curvature scale space
representation of the contour.
The motion descriptors in MPEG-7 are defined to cover a broad range of applications.
The Motion Activity Descriptor captures the intuitive notion of intensity or pace of action in
a video clip. The descriptor provides information on the intensity, direction, and spatial and
temporal distribution of activity in a video segment. The spatial distribution of activity
indicates whether the activity is spatially limited or not. Similarly, the temporal distribution
of activity indicates how the level of activity varies over the entire segment. The
Camera Motion Descriptor specifies the camera motion types and their quantitative
characterization over the entire video segment. The Motion Trajectory Descriptor describes
the motion trajectory of a moving object based on the spatiotemporal localization of
trajectory points. The description provided is at a fairly high level, as each moving object is
indicated by one representative point at any time instant. The Parametric Motion Descriptor
describes motion, both global and object motion, in a video segment by describing the
evolution of arbitrarily shaped regions over time using a two-dimensional geometric
transform.
The MPEG-7 Audio standard defines two sets of audio descriptors. The first set consists
of low-level features, which are meant for a wide range of applications. The descriptors in
this set include Silence, Power, Spectrum, and Harmonicity. The Silence Descriptor simply
indicates that there is no significant sound in the audio segment. The Power Descriptor
measures temporally smoothed instantaneous signal power. The Spectrum Descriptor
captures properties such as the audio spectrum envelope, spectrum centroid, spectrum
spread, spectrum flatness, and fundamental frequency. The second set of audio descriptors
consists of high-level features, which are meant for specific applications. The features in
this set include Audio Signature, Timbre, and Melody. The Signature Descriptor is designed
to generate a unique identifier for identifying audio content. The Timbre Descriptor captures
perceptual features of instrument sound. The Melody Descriptor captures monophonic
melodic information and is useful for matching melodies. In addition, the high-level
descriptors in MPEG-7 Audio include descriptors for automatic speech recognition, sound
classification and indexing.
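Two of the temporal audio features named earlier in this section, short-time energy and zero-crossing rate, are simple enough to sketch directly. The code below is an illustrative sketch only: the frame size, hop size and sample values are hypothetical, and it does not implement the MPEG-7 descriptors themselves.

```python
def frames(signal, size, hop):
    """Yield successive frames of `size` samples, advancing by `hop` samples."""
    for start in range(0, len(signal) - size + 1, hop):
        yield signal[start:start + size]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

# Hypothetical signal: a rapidly alternating frame, then a constant frame.
signal = [1.0, -1.0, 1.0, -1.0, 0.5, 0.5, 0.5, 0.5]

features = [
    (short_time_energy(f), zero_crossing_rate(f))
    for f in frames(signal, size=4, hop=4)
]
# features == [(1.0, 1.0), (0.25, 0.0)]
```

The first frame has high energy and a sign change between every pair of samples; the second has lower energy and no crossings. Per-frame values like these are what a mining step would consume as feature vectors.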

Implementation using WEKA Tool


Weka (Waikato Environment for Knowledge Analysis) is a popular suite of machine
learning software written in Java, developed at the University of Waikato, New Zealand.
Weka is free software available under the GNU General Public License.

Weka supports several standard data mining tasks, more specifically data
preprocessing, clustering, classification, regression, visualization, and feature selection.
All of Weka's techniques are predicated on the assumption that the data is available as a
single flat file or relation, where each data point is described by a fixed number of attributes.

Problem Statement: We consider a single multimedia schema describing a video
library. The application should allow multiple movie makers working simultaneously to
store, remove and manipulate different kinds of multimedia data. We assume that some
material is gathered from a database.

This application helps the manager of a video library to group the customers according to
the purchase language, rating, cast, etc.

Working
1. Loading The Data

After Weka 3.6 has been installed, we launch the Explorer application of Weka. We then
need to load the dataset, which we have created as a .csv file. Click on choose file, change
the file type to .csv, browse to the desired location and select the file, as shown in the
figure below.
Our dataset contains the following attributes:

1. F_ID: a unique id by which the film is recognised
2. Genre: a nominal attribute with the following categories
• Drama
• Fiction
• Action
• Horror
• Comedy
3. Name: the name of the film
4. Duration: a numerical attribute containing continuous values; it
describes the duration of the film
5. Lead: the lead actor of the movie
6. Director: the director of the film
7. Release date
8. Certification: a categorical attribute with the following values
• U
• A
• U/A
9. Rating: a numerical attribute
10. Language: a nominal attribute with the values
• English
• Hindi
• Tamil
• Kannada

2. Basic Statistics
Once the data set has been loaded, Weka recognizes the attributes and, during the
scan of the data, computes some basic statistics on each attribute. The left panel in the
figure below shows the list of recognized attributes, while the top panels indicate the names
of the base relation (or table) and the current working relation.

Clicking on any attribute in the left panel shows the basic statistics on that
attribute. For categorical attributes, the frequency of each attribute value is shown, while
for continuous attributes we can obtain the min, max, mean, standard deviation, etc. The
figure below illustrates this. It also shows the type of the attribute (numeric, nominal, etc.).
For the nominal attribute “Lead” shown below, it tells us the number of distinct values and
lists each value along with its number of occurrences.
The visualization graph shown below is a cross tabulation between two attributes:
the certification and the language of the films.
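The statistics described above can be sketched in a few lines. This is an illustrative sketch, not Weka's code; the three helper functions are hypothetical names, and the records are invented because the actual dataset values are not given in the text.

```python
import statistics
from collections import Counter

# Hypothetical records using attribute names from the dataset description.
films = [
    {"Duration": 90,  "Certification": "U",   "Language": "English"},
    {"Duration": 120, "Certification": "A",   "Language": "Hindi"},
    {"Duration": 150, "Certification": "U",   "Language": "Hindi"},
    {"Duration": 120, "Certification": "U/A", "Language": "English"},
]

def numeric_summary(rows, attribute):
    """Min, max, mean and sample standard deviation of a numeric attribute."""
    values = [row[attribute] for row in rows]
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
    }

def value_counts(rows, attribute):
    """Frequency of each distinct value of a nominal attribute."""
    return Counter(row[attribute] for row in rows)

def cross_tab(rows, first, second):
    """Count of records for every (first, second) value pair."""
    return Counter((row[first], row[second]) for row in rows)

duration_stats = numeric_summary(films, "Duration")
language_counts = value_counts(films, "Language")
cert_by_language = cross_tab(films, "Certification", "Language")
```

`cert_by_language[("U", "Hindi")]` then gives the number of Hindi films with a U certificate, which corresponds to one cell of the cross tabulation.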

3. Selecting and Filtering Data

In our sample data file, each record is uniquely identified by an F_id (the "id" attribute).
We need to remove this attribute before the data mining step. We can do this by using the
attribute filters in Weka. In the "Filter" panel, click on the "Choose" button. This shows
a popup window with a list of available filters. Expand "filters", then expand "unsupervised",
then expand "attribute" and select the “Remove” filter, as shown in the figure below.
After this, click on the text box immediately to the right of the "Choose" button. In the
resulting dialog box, enter the index of the attribute to be filtered out. In this case, we enter
1, which is the index of the "F_id" attribute. Make sure that the "invertSelection" option is
set to false (otherwise everything except attribute 1 will be filtered out). Then click "OK",
as illustrated in the figure below.
Now the filter box shows "Remove -R 1". Click on the "Apply" button. The
filter removes the F_id attribute, as shown in the figure below. We now save
the resulting intermediary dataset as “media2.arff”.
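The effect of the Remove filter with 1-based indices and the invertSelection option can be sketched as follows. This is not Weka's implementation; `remove_attributes` is a hypothetical helper and the two rows are invented.

```python
def remove_attributes(header, rows, indices, invert_selection=False):
    """Drop the attributes at the given 1-based indices.

    With invert_selection=True, keep only those attributes instead.
    """
    selected = {i - 1 for i in indices}
    keep = [
        col for col in range(len(header))
        if (col in selected) == invert_selection
    ]
    new_header = [header[c] for c in keep]
    new_rows = [[row[c] for c in keep] for row in rows]
    return new_header, new_rows

header = ["F_ID", "Genre", "Name", "Duration"]
rows = [
    ["1", "Drama", "Film A", "110"],   # hypothetical records
    ["2", "Comedy", "Film B", "95"],
]

new_header, new_rows = remove_attributes(header, rows, indices=[1])
# new_header == ["Genre", "Name", "Duration"]
```

Removing the first attribute shifts every later index down by one, which is why "Duration", originally the fourth attribute, is addressed as index 3 in the discretization step.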

4. Discretization

Some techniques, such as association rule mining, can only be performed on
categorical data. This requires discretizing numeric or continuous attributes.
There are 2 such attributes in this data set: "duration" and “rating”. In the case of the
"rating" attribute, the only possible values are 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10. We
have opted to keep all of these values in the data. This means we can discretize simply by
removing the keyword "numeric" as the type of the "rating" attribute in the ARFF file and
replacing it with the set of discrete values. We do this directly by opening the media2.arff
file in WordPad, as shown in the figure below.
We save the changes made in WordPad as a new intermediary data set, media3.arff.
Opening this file in Weka, we see that the “rating” attribute is no longer numeric; it
is now discretized. This is shown in the figure below.
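The manual edit described above amounts to rewriting one header line of the ARFF file. A scripted version might look like the sketch below. The ARFF content is a hypothetical, shortened stand-in for media2.arff, and `make_nominal` is an illustrative helper, not part of Weka.

```python
import re

def make_nominal(arff_text, attribute, values):
    """Replace '@attribute <name> numeric' with a nominal value set."""
    nominal = "{" + ",".join(str(v) for v in values) + "}"
    pattern = re.compile(
        r"^(@attribute\s+" + re.escape(attribute) + r"\s+)numeric[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )
    # A function replacement avoids any escaping issues in the new text.
    return pattern.sub(lambda m: m.group(1) + nominal, arff_text)

# Hypothetical, shortened header standing in for media2.arff.
media2 = """@relation media2

@attribute Genre {Drama,Fiction,Action,Horror,Comedy}
@attribute Duration numeric
@attribute Rating numeric

@data
Drama,110,7
"""

media3 = make_nominal(media2, "Rating", range(0, 11))
```

Only the declaration of the named attribute changes; the "Duration" line and the data rows are left untouched.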

Next, the “duration” attribute has to be discretized. This is a continuous attribute, so we
need to apply another filter, as shown below. We open the filter dialog box and expand
"filters"; under this we expand "unsupervised" and then "attribute", and select the
Discretize filter (weka.filters.unsupervised.attribute.Discretize). After this, click again on
the text box next to the "Choose" button. Another dialog opens; in it, change the attribute
indices to 3, which is the index of the duration attribute. Set the number of bins to 4 and
click "OK". This is illustrated in the figure below.
Now click on "Apply" to apply the filter. The “duration” attribute is now discretized and
has 4 discrete values such as (-inf-93], (93-117], etc., as shown in the figure below. We
again save the intermediate dataset, this time as media4.arff.
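Assuming the filter is used in its equal-width mode, the binning can be sketched as below. The duration values are hypothetical, chosen so that the cut points match the labels quoted above, and the label format only imitates those labels.

```python
def equal_width_cuts(values, bins):
    """Interior cut points splitting [min, max] into `bins` equal-width intervals."""
    low, high = min(values), max(values)
    width = (high - low) / bins
    return [low + width * i for i in range(1, bins)]

def bin_label(value, cuts):
    """Label a value with its interval, e.g. (-inf-93] or (93-117]."""
    edges = ["-inf"] + [f"{c:g}" for c in cuts]
    for i, cut in enumerate(cuts):
        if value <= cut:
            return f"({edges[i]}-{edges[i + 1]}]"
    return f"({edges[-1]}-inf)"

# Hypothetical durations in minutes.
durations = [69, 93, 100, 117, 141, 165]

cuts = equal_width_cuts(durations, bins=4)      # [93.0, 117.0, 141.0]
labels = [bin_label(d, cuts) for d in durations]
```

With a minimum of 69 and a maximum of 165, each of the four bins is 24 minutes wide. A duration of exactly 93 falls in (-inf-93] because the intervals are closed on the right.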

Next we open this media4.arff file in WordPad and change the discrete values
assigned to duration to more meaningful ones. We replace all occurrences of (-inf-93]
with the value 0_93; this can be done with the "Replace All" option, as shown in the figure
below.

The same procedure is followed for all the values of the “Duration” attribute. Then we
save the file as media-final.arff. The values that the duration attribute can take are shown in
the following two figures.
5. Association Mining
Now that all the attributes have been discretized, we can perform association mining
on the dataset. The most commonly used algorithm is Apriori, which we also use here.
Go to the "Associate" tab in Weka. In that tab, click on "Choose" and select Apriori as the
associator. Then click on the text box next to the "Choose" button. A dialog box appears.
Here we change the number of rules from its default value to 20; this means that the program
will report no more than the top 20 rules. The upper bound for minimum support is set to
1.0 (100%) and the lower bound to 0.1 (10%). Apriori in Weka starts with the upper bound
support and incrementally decreases the support (by delta increments, which by default are
set to 0.05 or 5%). The algorithm halts when either the specified number of rules has been
generated or the lower bound for minimum support is reached. The significance testing
option is only applicable in the case of confidence and is by default not used (-1.0). The
figure below shows the final dialog box for Apriori.
Now we click "OK" and then click on "Start". The results are displayed as shown below.
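The support-lowering loop described above can be sketched as follows. This is a simplified illustration, not Weka's implementation: the transactions are hypothetical, the function names are invented, and the minimum confidence of 0.9 is an assumption the text does not state.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions containing every item of `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise search: keep a k-itemset only if all its (k-1)-subsets
    are frequent and its own support reaches `min_support`."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support(frozenset([i]), transactions) >= min_support]
    frequent = list(level)
    while level:
        prev = set(level)
        candidates = {a | b for a in level for b in level
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates
                 if all(c - {i} in prev for i in c)
                 and support(c, transactions) >= min_support]
        frequent.extend(level)
    return frequent

def rules(transactions, min_support, min_confidence):
    """All rules lhs -> rhs from frequent itemsets with enough confidence."""
    found = []
    for itemset in frequent_itemsets(transactions, min_support):
        for size in range(1, len(itemset)):
            for lhs in map(frozenset, combinations(sorted(itemset), size)):
                conf = support(itemset, transactions) / support(lhs, transactions)
                if conf >= min_confidence:
                    found.append((lhs, itemset - lhs, conf))
    return found

def mine(transactions, num_rules=20, upper=1.0, lower=0.1, delta=0.05,
         min_confidence=0.9):  # min_confidence is an assumed value
    """Lower the minimum support from `upper` in steps of `delta` until
    `num_rules` rules are found or `lower` is reached."""
    steps = 0
    while True:
        min_support = max(lower, round(upper - steps * delta, 10))
        found = rules(transactions, min_support, min_confidence)
        if len(found) >= num_rules or min_support <= lower:
            break
        steps += 1
    found.sort(key=lambda rule: -rule[2])
    return found[:num_rules], min_support

# Hypothetical records, one set of attribute=value items per film.
films = [
    {"Language=Hindi", "Certification=U", "Genre=Drama"},
    {"Language=Hindi", "Certification=U", "Genre=Comedy"},
    {"Language=English", "Certification=A", "Genre=Action"},
    {"Language=Hindi", "Certification=U", "Genre=Drama"},
]

top_rules, support_used = mine(films, num_rules=2)
```

On this toy data no rule exists until the minimum support has dropped to 0.75, at which point the two rules linking Language=Hindi and Certification=U appear and the search stops.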
6. Visualization
The relationship between attributes can be shown as a graph by plotting the attributes
whose relation is to be visualized on the X and Y axes, using the “Visualize” button in the
top panel of the Weka Explorer.

Pressing the Visualize button brings up the following screen.

Suppose we need to visualize the relation between the Language and Lead attributes. Select
the corresponding pair by clicking on the red square as shown above. The following screen
appears, showing the relation between Lead and Language.
Conclusion
Multimedia data mining techniques are an active and growing area of research. In
digital library projects, there is a need for multimedia data mining for the conversion and
preservation of multimedia information. A data mining strategy is needed for the
conversion of multimedia files in the libraries. Digital libraries, which are to a large extent
accessible through the web, must present multimedia information effectively; only then is
the purpose of these libraries served properly. To serve this purpose, a data mining strategy
must be formed that considers the standards, features and available techniques.
