Using The Apriori Algorithm For Medical Image Classification
The document discusses using the Apriori algorithm for medical image classification. It describes acquiring medical images, enhancing them, extracting features, and then classifying the images using association rule mining with Apriori. The method involves discovering association rules between extracted image features and image categories (normal vs. abnormal) to build a classifier. This classifier could then help analyze large collections of medical images and reduce radiologist workload.
Image Classification
SORINA GHITA

Agenda
- Introduction
- The method used
- Image acquisition
- Image enhancement
- Feature extraction
- Classification
- Conclusion

Introduction
The number of medical images is growing, especially in radiology. Only a small percentage of cases actually contain a malformation to be detected. It is very important to detect a malformation at an early stage, when it can be treated well. Detection is much easier at a late stage, but treatment is then much more difficult and more costly. Image analysis is done by radiologists, which is time-consuming. The number of images is growing faster than radiologists can analyze them. Because more images must be analyzed with very high accuracy and reliability, there is a need for software that can help reduce the radiologist's workload. New machine-learning-based algorithms might be trained on a small set of images and then used to classify a large collection of images.

The method used
The image classification process that will be used is the following:
1. Image acquisition
2. Image enhancement
3. Feature extraction
4. Classification

Image acquisition
- Real medical images raise many privacy issues.
- Alliance with the Institute for Mother and Childcare Alfred D. Rusescu (IOMC), which will collaborate with me in this research.
- Research focused on various congenital malformations found in ultrasound screening of infants and children, or on ultrasound detection of fetal Down syndrome. (For example, a fetal ultrasound may detect signs of Down syndrome in the first half of pregnancy. A fetal ultrasound image can show a more prominent (than normal) area at the back of the neck of the unborn child. This prominence is detected by measuring the distance between the skin surface and the neck bones.)

Image acquisition (2)
Creation of a database for storing the medical images.
Technology used:
- Oracle Database 11g Enterprise Edition (with the Oracle Multimedia feature)
- Oracle Application Express (formerly Oracle HTML DB) to load the images into the Oracle Database
Application created: DICOM image archive.

Image acquisition (3)
Preferred situation for image acquisition: loading images into the database by capturing them directly from the medical devices.

Image enhancement
A preprocessing phase (data cleaning phase) is necessary to improve the quality of the images and make the feature extraction phase more reliable. Techniques used: a cropping operation and image enhancement. The cropping operation (eliminating the unwanted parts of the image) will be done automatically by sweeping through the image and cutting away, horizontally and vertically, those parts whose mean is less than a certain threshold. Image enhancement (to diminish the effect of over-brightness or over-darkness in the images and accentuate the image features) improves the quality of the image. Histogram equalization increases the contrast range in an image by increasing the dynamic range of grey levels (or colors).

Feature extraction
Features relevant to the classification will be extracted from the cleaned images. The extracted features will be organized in a database in the form of transactions, which in turn constitute the input for the classification algorithms used. The transactions are of the form {ImageID, Class Label, F1, F2, ..., Fn}, where F1, ..., Fn are the n features extracted for a given image. The database will be constructed by merging some features that already exist in the original database with new visual content features extracted from the medical images using image processing techniques.

Classification
Method used: association rule mining using the Apriori algorithm. Association rule mining typically aims at discovering associations between items in a transactional database.
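To make the transaction format concrete, here is a minimal Python sketch of how the features of one image could be turned into a transaction of items. The feature names, the assumption that feature values are normalized to [0, 1], the three equal-width bins, and the function names are all illustrative assumptions; the slides do not specify how features are encoded as items.

```python
def discretize(value: float) -> str:
    """Map a feature value in [0, 1] to one of three equal-width bins.

    Assumption: values are already normalized to [0, 1]; the bin names
    are hypothetical and chosen only for illustration.
    """
    bins = ("low", "mid", "high")
    index = min(int(value * len(bins)), len(bins) - 1)
    return bins[index]


def to_transaction(image_id: str, label: str, features: dict):
    """Build one transaction {ImageID, Class Label, F1, ..., Fn}.

    Each numeric feature becomes an item such as "texture_contrast=low",
    and the class label becomes the item "class=<label>".
    """
    items = {f"{name}={discretize(value)}" for name, value in features.items()}
    items.add(f"class={label}")
    return image_id, frozenset(items)


if __name__ == "__main__":
    # Hypothetical feature values for a single image.
    image_id, items = to_transaction(
        "img001",
        "abnormal",
        {"mean_intensity": 0.9, "texture_contrast": 0.2},
    )
    print(image_id, sorted(items))
```

Turning continuous measurements into a small number of named intervals is one common way to obtain discrete items; other encodings would work equally well with the rule mining step.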
Given a set of transactions D = {T1, ..., Tn} and a set of items I = {i1, ..., im} such that any transaction T in D is a set of items in I, an association rule is an implication A => B, where the antecedent A and the consequent B are subsets of a transaction T in D, and A and B have no common items. For the association rule to be acceptable, the conditional probability of B given A has to be higher than a threshold called the minimum confidence. Association rule mining is normally a two-step process: in the first step, frequent item-sets are discovered (i.e. item-sets whose support is no less than a minimum support), and in the second step, association rules are derived from the frequent item-sets. Information specific to image analysis: the number of occurrences of a particular feature on the image with uniform feature characteristics.

Classification (2)
I will use the Apriori algorithm to discover association rules between the features extracted from the image database and the category to which each image belongs. The antecedent of each rule is composed of a conjunction of features from the image (color, texture), while the consequent of the rule is always the category to which the image belongs. A rule describes a frequent set of features for one of the categories, normal or abnormal. After all the features have been merged and put in the transactional database, the next step is applying the Apriori algorithm to find the association rules in the database, constrained as described above, with the antecedent being the features and the consequent being the category. The association rules will be used to construct a classification system that categorizes the images as normal or abnormal.

Classification (3)
The most delicate part of classification with association rule mining will be the construction of the classifier itself. The main question is how to build a powerful classifier from these associations.
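Before building the classifier, it helps to see what the rule discovery step produces. The following is a minimal Python sketch of the two-step process described above: frequent item-sets are found level by level, then rules whose consequent is a class label are derived from them. It is a sketch under stated assumptions, not the implementation used in this research: the item strings, the "class=" prefix, the function names, and the thresholds are all hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for item-sets with support >= min_support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the item-set.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent single items.
    current = {}
    for item in {i for t in transactions for i in t}:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s

    frequent = {}
    k = 1
    while current:
        frequent.update(current)
        k += 1
        # Join: unions of two frequent (k-1)-item-sets that have exactly k items.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        nxt = {}
        for c in candidates:
            # Prune: every (k-1)-subset of a candidate must itself be frequent.
            if all(frozenset(sub) in current for sub in combinations(c, k - 1)):
                s = support(c)
                if s >= min_support:
                    nxt[c] = s
        current = nxt
    return frequent


def class_rules(frequent, min_confidence, class_prefix="class="):
    """Derive rules 'features => class' from the frequent item-sets."""
    rules = []
    for itemset, supp in frequent.items():
        labels = [i for i in itemset if i.startswith(class_prefix)]
        # Constraint: exactly one class item, and at least one feature item.
        if len(labels) != 1 or len(itemset) < 2:
            continue
        antecedent = itemset - set(labels)
        # Confidence = support(A and B) / support(A); A is frequent because
        # every subset of a frequent item-set is frequent.
        confidence = supp / frequent[antecedent]
        if confidence > min_confidence:  # "higher than" the threshold
            rules.append((antecedent, labels[0], supp, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical transactions; real items would come from extracted features.
    data = [
        frozenset({"intensity=high", "texture=coarse", "class=abnormal"}),
        frozenset({"intensity=high", "texture=coarse", "class=abnormal"}),
        frozenset({"intensity=high", "texture=smooth", "class=normal"}),
        frozenset({"intensity=low", "texture=smooth", "class=normal"}),
    ]
    for ante, label, supp, conf in class_rules(apriori(data, 0.5), 0.9):
        print(sorted(ante), "=>", label, f"support={supp:.2f} confidence={conf:.2f}")
```

Restricting the consequent to a single class item is what keeps the rule set relatively small: item-sets without a class label, or rules between features only, are never emitted.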
The first intuition in building the classification system is to assign the image to the class that has the most applicable rules. This works when the number of rules extracted for each class is balanced. In other cases, further tuning of the classification system will be required. Tuning the classifier mainly consists of finding optimal confidence intervals such that both the overall recognition rate and the recognition rate of abnormal cases are at their maximum values. In dealing with medical images, it is very important that the false negative rate be as low as possible. It is better to misclassify a normal image than an abnormal one. In the tuning phase it is therefore important to take into consideration the recognition rate of abnormal images. It is not enough to recognize some images; the classifier must be able to recognize those that are abnormal.

Classification (4)
By applying the Apriori algorithm with additional constraints on the form of the rules to be discovered, a relatively small set of association rules will be generated, associating sets of features with class labels. These association rules will constitute the classification model. The discovery of association rules in the image feature database will represent the training phase of the classifier. To classify a new image, it is enough to extract the features from the image, as was done for the training set, and apply the association rules to the extracted features to identify the class the new image falls into.

Conclusion
The number of images is growing, especially in radiology. The big players in CAD (computer-aided diagnosis) focus on helping radiologists extract many features by hand from patients' images. This still helps one radiologist with one patient and does not scale to analyzing large numbers of patients and determining whether there is a malformation or not.
All players can identify the obvious cases of malformations, but the biggest problem is the large number of false positives (saying the image contains a malformation when it does not). By analyzing a large collection of images, and thus reducing the workload, one can focus on early screening, saving lives and saving money. This solution will facilitate digital storage and distribution of patient images across the healthcare network, increase diagnostic accuracy, and optimize decision time, allowing a detailed analysis of a large number of images in the shortest time. This will increase efficiency and quality of care.

Questions and answers