
Image Mining Method and Frameworks: Shaikh Nikhat Fatma

Copyright
© Attribution Non-Commercial (BY-NC)

International Journal Of Computational Engineering Research (ijceronline.com) Vol. 2 Issue. 8

Image Mining Method and Frameworks



Shaikh Nikhat Fatma

Department Of Computer, Mumbai University, Pillais Institute Of Information Technology, New Panvel

Abstract:
Image mining deals with the extraction of image patterns from a large collection of images. It differs from low-level computer vision and image processing techniques: the focus of image mining is on extracting patterns from a large collection of images, whereas the focus of computer vision and image processing techniques is on understanding and/or extracting specific features from a single image. While there is some overlap between image mining and content-based retrieval (both deal with large collections of images), image mining goes beyond the problem of retrieving relevant images. In image mining, the goal is the discovery of image patterns that are significant in a given collection of images.

Keywords: Image mining (IM); function-driven; knowledge-driven; information-driven; knowledge remounting
I. INTRODUCTION

Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in images or image databases. It draws on ideas from computer vision, image processing, image retrieval, data mining, machine learning, databases and AI. The fundamental challenge in image mining is to determine how the low-level pixel representation contained in a raw image or image sequence can be effectively and efficiently processed to identify high-level spatial objects and relationships. A typical image mining process involves pre-processing, transformations and feature extraction, mining (to discover significant patterns from the extracted features), evaluation and interpretation, and obtaining the final knowledge. Various techniques from existing domains are also applied to image mining, including object recognition, learning, clustering and classification, to name a few. Association rule mining is a well-known data mining technique that aims to find interesting patterns in very large databases. Some preliminary work has been done to apply association rule mining to sets of images to find interesting patterns [4,5,7].

Research in image mining can be broadly classified into two main directions. The first direction involves domain-specific applications, where the focus is on the process of extracting the most relevant image features into a form suitable for data mining. The second direction involves general applications, where the focus is on the process of generating image patterns that may be helpful in understanding the interaction between high-level human perceptions of images and low-level image features. The latter may lead to improvements in the accuracy of images retrieved from image databases.

The remainder of the paper is organized as follows. Section II explains the image mining process. Section III reviews how image mining is actually done. Section IV gives an example. Section V gives an overview of image mining frameworks. Section VI concludes, and the last section lists the references for this paper.

II. THE IMAGE MINING PROCESS

Figure 1 shows the image mining process. The images from an image database are first preprocessed to improve their quality. These images then undergo various transformations and feature extraction to generate the important features from the images. With the generated features, mining can be carried out using data mining techniques to discover significant patterns. The resulting patterns are evaluated and interpreted to obtain the final knowledge, which can then be used in applications.

||Issn 2250-3005(online)||

||December||2012||


Figure 1: The image mining process

It should be noted that image mining is not simply an application of existing data mining techniques to the image domain. This is because there are important differences between relational databases and image databases:

(a) Absolute versus relative values. In relational databases, the data values are semantically meaningful. For example, "age is 35" is well understood. However, in image databases, the data values themselves may not be significant unless the context supports them. For example, a grey scale value of 46 could appear darker than a grey scale value of 87 if the surrounding context pixel values are all very bright.

(b) Spatial information (independent versus dependent position). Another important difference between relational databases and image databases is that the implicit spatial information is critical for the interpretation of image contents, but there is no such requirement in relational databases. As a result, image miners try to overcome this problem by first extracting position-independent features from images before attempting to mine useful patterns from the images.

(c) Unique versus multiple interpretation. A third important difference is that images can have multiple interpretations for the same visual patterns. The traditional data mining approach of associating a pattern with a single class (interpretation) will not work well here. A new class of discovery algorithms is needed to cater to the special needs of mining useful patterns from images.

III. METHOD FOR IMAGE MINING

In this section, we present the algorithms needed to perform the mining of associations within the context of images. The four major image mining steps are as follows:

1. Feature extraction. Segment images into regions identifiable by region descriptors (blobs). Ideally one blob represents one object. This step is also called segmentation.
2. Object identification and record creation. Compare objects in one image to objects in every other image. Label each object with an id. We call this step the preprocessing algorithm.
3. Create auxiliary images. Generate images with identified objects to interpret the association rules obtained from the following step.
4. Apply a data mining algorithm to produce object association rules.
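The segmentation step can be illustrated with a minimal sketch. The paper does not specify a segmentation algorithm, so the code below assumes the simplest possible case: a small grid of integer pixel values in which 0 is background and each 4-connected region of equal value is one blob. The function name `label_blobs` and the sample grid are hypothetical.

```python
from collections import deque


def label_blobs(grid):
    """Label 4-connected regions of equal, non-background pixel value.

    Assumption of this sketch: value 0 is background and every other
    value marks foreground. Returns (labels, count); labels has the
    same shape as grid and holds 0 for background or a blob id >= 1.
    """
    rows, cols = len(grid), len(grid[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or labels[r][c]:
                continue  # background, or already part of a blob
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one blob
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not labels[ny][nx]
                            and grid[ny][nx] == grid[r][c]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


# Hypothetical 4x5 image with three separate regions.
image = [
    [1, 1, 0, 0, 2],
    [1, 1, 0, 0, 2],
    [0, 0, 0, 0, 0],
    [0, 3, 3, 3, 0],
]
labels, count = label_blobs(image)
```

Real images would need a similarity threshold on colour or texture rather than exact equality, but the output has the same shape: one id per blob, ready for the object identification step.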

The idea of this method is to select a collection of images that belong to a specific field (e.g. weather). After the selection stage, we extract the objects from each image and index all the images with their objects in a transaction database. The database contains the image identification and the objects that belong to each image, together with their features. After creating the transaction database that contains all images and their features, we use the proposed data mining methods to find association rules between the objects. This helps with prediction (e.g. if an image of the sky contains black clouds then it will rain (65%)) [5]. The following block diagram presents the proposed IM method:


Figure 2: Block diagram of image mining method

(1) Select a collection of images that belong to the same field (e.g. medical images, geographical images, images of persons, etc.).

(2) Image Retrieval. Image mining requires that images can be retrieved according to some requirement specifications. In the proposed work, image retrieval is based on derived or logical features, such as objects of a given type or individual objects or persons, obtained using edge detection techniques [8]. After we extract the objects, we encode them as follows: O1: circle. O2: triangle. O3: square.

Figure 3: Example of an image

Figure 4: Object extraction using edge detection
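The paper does not say which edge detector is used for object extraction. As an illustration only, the sketch below assumes a Sobel operator on a greyscale image stored as a list of rows, with a hypothetical threshold chosen for the sample data. It marks a pixel as an edge when the sum of the absolute horizontal and vertical gradients reaches the threshold.

```python
def sobel_edges(img, threshold):
    """Return a binary edge map using the Sobel operator.

    img is a list of rows of grey values. Border pixels are left as 0
    because the 3x3 kernel does not fit there. The gradient magnitude
    is approximated by |gx| + |gy|.
    """
    rows, cols = len(img), len(img[0])
    edges = [[0] * cols for _ in range(rows)]
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            # Horizontal gradient: right column minus left column.
            gx = ((img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1])
                  - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]))
            # Vertical gradient: bottom row minus top row.
            gy = ((img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1])
                  - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1]))
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


# Hypothetical 7x7 image: a bright 3x3 square on a dark background.
square = [[255 if 2 <= y <= 4 and 2 <= x <= 4 else 0 for x in range(7)]
          for y in range(7)]
edges = sobel_edges(square, threshold=200)
```

The edge map outlines the square while leaving its uniform interior unmarked; a contour-following or shape-fitting step would then be needed to decide whether the outline is a circle, triangle or square.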



(3) Image Indexing. Image mining systems require a fast and efficient mechanism for the retrieval of image data. Conventional database systems such as relational databases facilitate indexing on primary or secondary key(s). We create two databases. The first one contains all the objects that have been extracted from the images, together with their features [5].

Table 1: First database contains the objects and their features

Therefore the association rule with spatial relationships could be: V-Next-to([red, circle, small], [blue, square, *]) ^ H-Next-to([red, circle, *], [yellow, *, large]) → Overlap([red, circle, *], [green, *, *]) (34%). In this example, only three dimensions were needed, and we made use of the wildcard * to replace absent values. The second database contains all the images and the objects that belong to each image.

Table 2: Second database contains each image and its objects
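The contents of the two tables are not reproduced here, so the following sketch uses a hypothetical layout to show how the two databases and the wildcard * could work together. The object ids follow the encoding given earlier (O1: circle, O2: triangle, O3: square); the colours, sizes, image ids and function names are assumptions made for illustration.

```python
# First database (hypothetical rows): object id -> (colour, shape, size).
objects = {
    "O1": ("red", "circle", "small"),
    "O2": ("yellow", "triangle", "large"),
    "O3": ("blue", "square", "large"),
}

# Second database (hypothetical rows): image id -> ids of objects it contains.
images = {
    "img1": ["O1", "O2"],
    "img2": ["O1", "O3"],
    "img3": ["O2", "O3"],
}


def matches(pattern, features):
    """True if every non-wildcard field of the pattern equals the feature."""
    return all(p == "*" or p == f for p, f in zip(pattern, features))


def images_with(pattern):
    """Ids of images that contain at least one object matching the pattern."""
    return sorted(
        img for img, ids in images.items()
        if any(matches(pattern, objects[o]) for o in ids)
    )


red_circles = images_with(("red", "circle", "*"))
large_things = images_with(("*", "*", "large"))
```

Keeping object features and image membership in separate tables means a descriptor such as [red, circle, *] is resolved once against the object table and then looked up in the image table, which is the access pattern the mining step relies on.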

(4) The next step is applying the proposed mining techniques using the data of the images that have been indexed to the database.

(5) After that, we use the first proposed algorithm to find the frequent item sets from the specific table, and the result will be the following:

(6) In the final step, we use the second proposed algorithm to find association rules between the objects, which gives the following results:

(1) {O4,O4} → {O2,O2} [100%]
(2) {O2,O4,O4} → {O2} [100%]
(3) {O3,O4} → {O2} [100%]
(4) {O3} → {O2,O4} [100%]
(5) {O2,O2} → {O4} [100%]
(6) {O4,O4} → {O2} [100%]
(7) {O3} → {O2} [100%]
(8) {O3} → {O4} [100%]

IV. EXAMPLE

This section gives a simple example illustrating how the image mining algorithms work with n=10. The original images and their corresponding blobs are shown in Figure 5 and Figure 6. Association rules corresponding to the identified objects are also shown. The 10 representative images were chosen from the image set created for the experiments.

Figures 5 and 6 show the original image at the left, with several geometric shapes on a white background. These images are labeled with an image id. These images are the only input data for the program; no domain knowledge is used. Then a series of blob images is shown, each containing one blob. These images are labeled with the id obtained by preprocessing. Each blob has a position close (most times equal) to that of its corresponding geometric shape. There are some cases in which one blob corresponds to several geometric shapes. For instance, in image 013 object 2 corresponds to the triangle and object 3 corresponds to the circle. In image 131 object 4 corresponds to all the shapes in the image.

The data mining was done with 20% support and 70% confidence. The output is a set of association rules whose support and confidence are above these thresholds. The 66 rules obtained by the program are shown. Let us analyse some of these rules. The first rule, {3} → {2}, means that if there is a circle in the image then there is also a triangle. In fact, with these simple images, there was never a circle without a triangle. In this case the rule actually has a higher support and a higher confidence than those obtained by the program (50% and 83% respectively). This happened because the circle had two different object identifiers: 3 and 7. The rule {2,3,5} → {8} says that if there are a circle, a triangle and a hexagon then there is also a square. Once again, images containing the first three shapes always contained the square. Another interesting rule is {3,11} → {2}. In this case the rule says that if there are a circle and an ellipse then there is also a triangle; once again the rule is valid, although it has low support. It is important to note that several incorrect or useless blob matches such as 9, 10, 13, 14 and 16 are filtered out by the 30% support. That is the case for images 029, 108, 119, 131 and 144. There are no rules that involve these identified objects (matched blobs).
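The effect of thresholds and of split object identifiers can be sketched in a few lines. The code below is not the paper's algorithm: it is a brute-force enumeration of itemsets (adequate only for tiny examples, unlike Apriori-style level-wise search), and the ten transactions are hypothetical data in which the same circle is sometimes labelled 3 and sometimes 7.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item of itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def mine_rules(transactions, min_support, min_confidence):
    """Enumerate rules lhs -> rhs that meet both thresholds (brute force)."""
    items = sorted(set().union(*transactions))
    rules = []
    for size in range(2, len(items) + 1):
        for combo in combinations(items, size):
            whole = frozenset(combo)
            s = support(whole, transactions)
            if s == 0 or s < min_support:
                continue  # infrequent itemset: no rule can come from it
            for k in range(1, size):
                for lhs in combinations(combo, k):
                    lhs = frozenset(lhs)
                    confidence = s / support(lhs, transactions)
                    if confidence >= min_confidence:
                        rules.append((set(lhs), set(whole - lhs), s, confidence))
    return rules


# Hypothetical blob ids per image: 2 = triangle, 3 and 7 = the same circle.
raw = [{2, 3}, {2, 3}, {2, 3}, {2, 3}, {2, 7}, {2, 7}, {2, 5}, {5}, {5, 8}, {8}]
# Merge the duplicate identifier 7 into 3.
merged = [{3 if obj == 7 else obj for obj in t} for t in raw]

split_support = support(frozenset({2, 3}), raw)
merged_support = support(frozenset({2, 3}), merged)
rules = mine_rules(merged, min_support=0.2, min_confidence=0.7)
```

With these made-up transactions, the itemset {2, 3} is found in 4 of 10 images while the circle is split across two ids and in 6 of 10 once the ids are merged. This is the same mechanism by which the program in the example reports lower support and confidence for {3} → {2} than the images actually justify.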

Figure 5: First part of images and blobs

Figure 6: Second part of images and blobs

RULES GENERATED

V. IMAGE MINING FRAMEWORKS

Early work in image mining has focused on developing a suitable framework to perform the task of image mining. The image database containing raw image data cannot be used directly for mining purposes. Raw image data needs to be processed to generate information that is usable by high-level mining modules. An image mining system is often complicated because it employs various approaches and techniques, ranging from image retrieval and indexing schemes to data mining and pattern recognition. A good image mining system is expected to provide users with effective access to the image repository and to generate the knowledge and patterns underlying the images. Such a system typically encompasses the following functions: image storage, image processing, feature extraction, image indexing and retrieval, and pattern and knowledge discovery.

1. Function-Driven Image Mining Framework Model

The function-driven image mining framework model is usually organized into modules with different functions. It divides the function model into two modules:
1) Data obtaining, pre-treatment and saving module: mainly used for image acquisition, original image storage and searching [8].
2) Image mining module: used for mining image patterns and meanings.
There are 4 function modules included in this system.


Figure 6: Function-driven image mining framework model

1) Image data acquisition: gets images from the multimedia database.
2) Preprocessor: extracts image features [10].
3) Search engine: uses the image features to match queries.
4) Knowledge discovery module: mines the images.

The Diamond Eye [7] is an image mining system that enables scientists to locate and catalog objects of interest in large image collections. This system employs data mining and machine learning techniques to enable both scientists and remote systems to find, analyze, and catalog spatial objects, such as volcanos and craters, and dynamic events, such as eruptions and satellite motion, in large scientific datasets and real-time image streams under varying degrees of a priori knowledge. The architecture of the Diamond Eye system is also based on module functionality.

2. Information-Driven Image Mining Framework Model

While the function-driven framework serves the purpose of organizing and clarifying the different roles and tasks to be performed in image mining, it fails to emphasize the different levels of information representation necessary for image data before meaningful mining can take place. Zhang et al. propose an information-driven framework that aims to highlight the role of information at various levels of representation (see Figure 7). The framework distinguishes four levels of information, given below. This model emphasizes the different roles of the different levels of image data and provides a description mechanism for each of the four levels [4].

The Four Information Levels

We describe the four information levels in the proposed framework.

1. Pixel Level

The Pixel Level is the lowest layer in an image mining system. It consists of raw image information such as image pixels and primitive image features such as color, texture, and edge information [10].

2. Object Level

The focus of the Object Level is to identify domain-specific features such as objects and homogeneous regions in the images. An object recognition module consists of four components: model database, feature detector, hypothesizer and hypothesis verifier. The model database contains all the models known to the system. The models contain important features that describe the objects. The primitive image features detected at the Pixel Level are used to help the hypothesizer assign likelihoods to the objects in the image. The verifier uses the models to verify the hypotheses and refine the object likelihoods. The system finally selects the object with the highest likelihood as the correct object.
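The interplay of model database, hypothesizer and verifier can be made concrete with a toy sketch. The paper gives no scoring formulas, so everything below is an assumption for illustration: the model database contents, the likelihood defined as the fraction of a model's features that were detected, and the fixed penalty the verifier applies for each detected feature a model cannot explain.

```python
# Hypothetical model database: object name -> features that describe it.
MODEL_DB = {
    "table": {"flat_top", "four_legs", "brown"},
    "bird": {"wings", "beak", "feathers"},
    "ball": {"round", "smooth"},
}


def hypothesize(detected, model_db):
    """Assign each model a likelihood: share of its features that were detected."""
    return {name: len(feats & detected) / len(feats)
            for name, feats in model_db.items()}


def verify(hypotheses, detected, model_db, penalty=0.1):
    """Refine likelihoods by penalising detected features a model cannot explain."""
    refined = {}
    for name, score in hypotheses.items():
        unexplained = len(detected - model_db[name])
        refined[name] = max(0.0, score - penalty * unexplained)
    return refined


def recognize(detected, model_db=MODEL_DB):
    """Select the model with the highest refined likelihood."""
    refined = verify(hypothesize(detected, model_db), detected, model_db)
    return max(refined, key=refined.get)


# Features a (hypothetical) Pixel Level detector reported for one region.
best = recognize({"wings", "beak", "sky"})
```

Separating hypothesis generation from verification keeps the cheap scoring pass over all models apart from the more careful check, which in a real system would compare geometry or appearance against the stored model rather than count features.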


To improve the accuracy of object recognition, image segmentation is performed on partially recognized image objects rather than by randomly segmenting the image. The techniques include: characteristic maps to locate a particular known object in images, machine learning techniques to generate recognizers automatically, and the use of a set of examples already labelled by a domain expert to find common objects in images. Once the objects within an image can be accurately identified, the Object Level is able to deal with queries such as "Retrieve images of round table" and "Retrieve images of birds flying in the blue sky". However, it is unable to answer queries such as "Retrieve all images concerning Graduation ceremony" or "Retrieve all images that depict a sorrowful mood".

Figure 7: Information-driven image mining framework model

3. Semantic Concept Level

While objects are the fundamental building blocks in an image, there is a semantic gap between the Object Level and the Semantic Concept Level. Abstract concepts such as happy and sad, and the scene information, are not captured at the Object Level. Such information requires domain knowledge as well as state-of-the-art pattern discovery techniques to uncover useful patterns that are able to describe the scenes or the abstract concepts. Common pattern discovery techniques include image classification, image clustering, and association rule mining. With the Semantic Concept Level, queries involving high-level reasoning about the meaning and purpose of the objects and scene depicted can be answered. Thus, we will be able to answer queries such as "Retrieve the images of a football match" and "Retrieve the images depicting happiness". It would be tempting to stop at this level. However, careful analysis reveals that there is still one vital piece of missing information: the domain knowledge external to images. A query like "Retrieve all medical images with high chances of blindness within one month" requires linking the medical images with the medical knowledge of the chance of blindness within one month. Neither the Pixel Level, the Object Level, nor the Semantic Concept Level is able to support such queries.

4. Pattern and Knowledge Level

At this level, we are concerned not just with the information derivable from images, but also with all the domain-related alphanumeric data. The key issue here is the integration of knowledge discovered from the image databases and the alphanumeric databases. A comprehensive image mining system would not only mine useful patterns from large collections of images but also integrate the results with alphanumeric data to mine for further patterns.
For example, it is useful to combine heart perfusion images and the associated clinical data to discover rules in high-dimensional medical records that may suggest early diagnosis of heart disease. IRIS, an Integrated Retinal Information System, is designed to integrate both patient data and the corresponding retinal images to discover interesting patterns and trends in diabetic retinopathy. The BRAin-Image Database is another image mining system, developed to discover associations between structures and functions of the human brain. The brain modalities were studied by the image mining process, and the brain functions (deficits/disorders) are obtainable from the patients' relational records. The two kinds of information are used together to perform the functional brain mapping.

Discovering knowledge from data stored in alphanumeric databases, such as relational databases, has been the focal point of much work in data mining. However, with advances in secondary storage capacity, coupled with a relatively low storage cost, more and more non-standard data is being accumulated. One category of non-standard data is image data (others include free text, video, sound, etc.). There is currently a very substantial collection of image data that can be mined to discover new and valuable knowledge. The central research issue in image mining is how to pre-process image sets so that they can be represented in a form that supports the application of data mining algorithms. A common representation is that of feature vectors, where each image is represented as a vector. Typically each vector represents some subset of feature values taken from some global set of features. A trivial example is where images are represented as primitive shape and colour pairs [9]. Thus the global set of tuples might be:

{{blue square}, {red square}, {yellow square}, {blue circle}, {red circle}, {yellow circle}}

which may be used to describe a set of images:

{{blue square}, {red square}, {red circle}}
{{red square}, {yellow square}, {blue circle}, {yellow circle}}
{{red box}, {red circle}, {yellow circle}}

However, before this can be done it is first necessary to identify the image objects of interest (i.e. the squares and circles in the above example). A common approach to achieving this is known as segmentation. Segmentation is the process of finding regions in an image (usually referred to as objects) that share some common attributes (i.e. they are homogeneous in some sense) [9].
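The shape and colour pair representation described above maps directly onto binary feature vectors. The sketch below is one possible encoding, assuming a fixed ordering of the global set of tuples and one vector position per tuple; the names `GLOBAL_FEATURES` and `to_vector` are hypothetical, and only the first two example images are encoded.

```python
# Global set of (colour, shape) tuples, in a fixed order chosen for this sketch.
GLOBAL_FEATURES = [
    ("blue", "square"), ("red", "square"), ("yellow", "square"),
    ("blue", "circle"), ("red", "circle"), ("yellow", "circle"),
]


def to_vector(image_objects):
    """Encode an image as a 0/1 vector over the global feature set."""
    present = set(image_objects)
    return [1 if feature in present else 0 for feature in GLOBAL_FEATURES]


# The first two images from the example in the text.
images = [
    [("blue", "square"), ("red", "square"), ("red", "circle")],
    [("red", "square"), ("yellow", "square"),
     ("blue", "circle"), ("yellow", "circle")],
]
vectors = [to_vector(img) for img in images]
```

Each vector has one position per global tuple, so every image has the same length regardless of how many objects it contains, which is the property standard data mining algorithms need.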
The process of image segmentation can be helped or enhanced for many applications if there is some application-dependent domain knowledge that can be used in the process. In the context of the work described here, the authors are interested in MRI brain scans, and in particular a specific feature within these scans called the Corpus Callosum. An example image is given in Figure 8. The Corpus Callosum is of interest to researchers for a number of reasons:

1. The size and shape of the Corpus Callosum have been shown to be correlated with sex, age, neurodegenerative diseases (such as epilepsy) and various lateralized behaviours in people.
2. It is conjectured that the size and shape of the Corpus Callosum reflect certain human characteristics (such as mathematical or musical ability).
3. It is a very distinctive feature in MRI brain scans.

Several studies indicate that the size and shape of the Corpus Callosum in human brains are correlated with sex, age, brain growth and degeneration, handedness and various types of brain dysfunction.

Figure 8: Corpus callosum in a midsagittal brain MRI image.

In order to find such correlations in living brains, Magnetic Resonance Imaging (MRI) is regarded as the best method to obtain cross-sectional area and shape information about the Corpus Callosum. In addition, MRI is fast and safe, without any radiation exposure to the subject such as with x-ray CT. Since manual tracing of the Corpus Callosum in MRI data is time consuming, operator dependent, and does not directly give quantitative measures of cross-sectional areas or shape, there is a need for automated and robust methods for localization, delineation and shape description of the Corpus Callosum [9].

The four information levels can be further generalized to two layers: the Pixel Level and the Object Level form the lower layer, while the Semantic Concept Level and the Pattern and Knowledge Level form the higher layer. The lower layer contains raw and extracted image information and mainly deals with image analysis, processing, and recognition. The higher layer deals with high-level image operations such as semantic concept generation and knowledge discovery from image collections. The information in the higher layer is normally more semantically meaningful than that in the lower layer. By proposing a framework based on the information flow, we are able to focus on the critical areas to ensure that all the levels can work together seamlessly. In addition, this framework highlights that we are still very far from being able to fully discover useful domain information from images.

3. Knowledge-Driven Image Mining Framework Model

The function-driven model is derived from image mining applications, and the information-driven model is organized around the different information layers. The essence of image mining is to find knowledge, yet the above two models do not consider how the mined knowledge is used. Moreover, throughout the whole process, the user is in a passive position, merely receiving the mined patterns and knowledge. Because image data is itself unstructured or semi-structured, remounting may occur in image mining, which raises the question of how to mine the maximum knowledge from the mining process. We should keep in mind that the knowledge the user wants is the significant knowledge [6].

1) Image choosing: The aim of image choosing is to determine the object of image mining, which is the original image data in the image database that meets the user's requirement.
2) Image disposal: This refers to digital image management and image identification, for example removing noise from the image, correcting a distorted image, or recovering a low-information image [10].
3) Character (feature) pickup: Feature information, such as color, shape and position, is extracted and stored. The feature base is very important because it must support queries on the image data.
4) Character choosing and optimizing: The stored image features may be overabundant, which may affect the operation of the key mining approach, so feature selection should be carried out before image mining. If each feature is represented by a feature vector, this approach is called dimension reduction. Besides feature selection, we should sometimes optimize the selection, including reducing data noise, discretizing continuous data and making discrete data continuous.
5) Image mining: Use image mining to mine the data in the images and find related patterns. At present, the commonly used methods all come from the traditional data mining area, such as statistical analysis, association rule analysis and machine learning.
6) Explanation and comment combining: The pattern/knowledge base stores the knowledge units that represent image logic concepts; we need to integrate the data to find more potential patterns or knowledge. When mining patterns, all redundant or useless patterns should be removed, and the useful patterns converted into knowledge that can be understood by the user.
7) Image sample training: Through image sample training, validity and accuracy can be greatly improved [6].
8) Alternating learning: Users can learn domain knowledge from the system's mining, and can also input domain knowledge into the system, including how to split abstract knowledge into knowledge units [6].
9) Domain knowledge: In the course of mining, all the earlier approaches, models and knowledge can be used in the discovery of new knowledge by the system [6].


Figure 9: Knowledge-driven image mining framework

VI. CONCLUSIONS

Image mining is an advanced field of data mining, and it faces great challenges in solving the problems of various systems. The main objective of image mining is to reduce data loss and to extract information that is meaningful to human needs. It retrieves the best-matching images from a collection of images with respect to a query image. The framework models represent a first step towards capturing the different levels of information present in image data and addressing the question of what the issues and challenges are in discovering useful patterns and knowledge at each level.

References
[1] J. Han and M. Kamber. Data Mining: Concepts and Techniques, Morgan Kaufmann, USA, 2001.
[2] Margaret H. Dunham. Data Mining: Introductory and Advanced Topics, Southern Methodist University.
[3] R. Gonzalez and R. Woods. Digital Image Processing, Addison-Wesley Publications Co, March 1992.
[4] Ji Zhang, Wynne Hsu, Mong Li Lee. An Information-driven Framework for Image Mining, in Proceedings of 12th International Conference on Database and Expert Systems Applications (DEXA), Munich, Germany, 2001.
[5] Hilal M. Yousif, Abdul-Rahman Al-Hussaini, Mohammad A. Al-Hamami. Using Image Mining to Discover Association Rules between Image Objects.
[6] Yu Changjin, Xia Hongxia. The Investigation of Image Mining Framework, Wuhan University of Technology, Wuhan, China.
[7] Joseph Roden, Michael Burl and Charless Fowlkes. The Diamond Eye Image Mining System, Jet Propulsion Laboratory.
[8] J. Zhang, W. Hsu and M. L. Lee. Image Mining: Issues, Frameworks and Techniques, Proc. of Second International Workshop on Multimedia Data Mining (MDM/KDD'2001), San Francisco, CA, USA, August, 2001.
[9] Ashraf Elsayed, Frans Coenen, Marta García-Fiñana and Vanessa Sluming. Segmentation for Medical Image Mining: A Technical Report, The University of Liverpool, Liverpool L69 3BX, UK.
[10] Dr. V. Mohan, A. Kannan. Color Image Classification and Retrieval using Image Mining Techniques, International Journal of Engineering Science and Technology, Vol. 2(5), 2010, 1014-1020.
[11] M. Antonie, O. R. Zaiane, A. Coman. Application of Data Mining Techniques for Medical Image Classification, in Proceedings of the Second International Workshop on Multimedia Data Mining (MDM/KDD'2001), San Francisco, CA, USA, August, 2001.