Retinal Blood Vessels Classification Into Arteries and Veins
Fig. 1. Pipeline followed in the work for A/V classification.

A. Pre-Processing
Due to the local luminosity and contrast variability in retinal images [7], it is very important to implement a method that normalizes luminosity and contrast, both intra- and inter-image.
As a result, two operations were performed sequentially on the original image. Their main goal was to enhance the image and normalize its illumination, making it possible to extract more meaningful color features.
First, we applied an image denoising operation to remove the noise associated with any acquisition system, as well as any image artifacts that may exist.
Afterwards, we performed homomorphic filtering for illumination correction: assuming an image I(x) is non-uniformly illuminated, we can obtain an undegraded image r(x) by removing the illumination field i(x) that introduces the variability mentioned above. The equation that represents this operation is presented in (1), the method underlying it is described in Fig. 2, and an example of its application is shown in Fig. 3.

Fig. 4. (a) Skeleton of vessels (b) Skeleton with identified crossings (c) Labeled regions colored.

C. Feature Extraction
Global Features
Beginning with the global features, the intensity values of the red, green and blue channels of the RGB image and, on the HSV image, the hue, saturation and value were obtained. Also, since veins usually have a larger diameter than arteries, the vessel diameter was extracted by applying the distance transform to the vessels and saving the computed result on the skeleton pixel.
Local Features
To compute local features, taking into account the diameter of the vessel and following the approach of Zamperini et al. [1] mentioned above, we extracted new features based on statistical measures of the color components and color derivatives, computed in disks of different diameters (large and small, as shown in Fig. 5).

Fig. 5. Small area (p1) vs large area (p2), used for feature extraction [1].

With this approach, rotation-invariant (due to the disk-like shape) and size-invariant (due to the adaptive window size) features were obtained.
Besides that, since arteries and veins usually alternate near the optic disk [11], the distance to the optic disk was computed.
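The diameter feature mentioned above can be illustrated with a short sketch. The helper below is only an assumption of how such a feature might be computed (the function name and the toy mask are ours, not the paper's): it applies SciPy's Euclidean distance transform to a binary vessel mask and doubles the distance at skeleton pixels to approximate the local diameter.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def diameter_at_skeleton(vessel_mask, skeleton_mask):
    """Approximate the vessel diameter at each skeleton pixel.

    The Euclidean distance transform of the vessel mask gives, at each
    vessel pixel, the distance to the nearest background pixel; on the
    centerline (skeleton) this is roughly the vessel radius, so twice
    that value approximates the local diameter.
    """
    dist = distance_transform_edt(vessel_mask)
    diameters = np.zeros_like(dist)
    diameters[skeleton_mask] = 2.0 * dist[skeleton_mask]
    return diameters

# Toy example: a horizontal "vessel" 5 pixels thick, skeleton on its center row.
vessel = np.zeros((11, 20), dtype=bool)
vessel[3:8, :] = True
skeleton = np.zeros_like(vessel)
skeleton[5, :] = True

d = diameter_at_skeleton(vessel, skeleton)
```

The computed value is only stored on skeleton pixels, matching the description above of saving the result on the skeleton.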
The optic disk was detected by applying a Difference of Gaussians filter for blob detection. For false-positive reduction, the region with the highest intensity in the red channel was chosen. Due to luminosity variations, even after applying the homomorphic filtering, the wrong region can be detected, as shown in Fig. 6, which illustrates a limitation of this algorithm.

Fig. 6. (a) Detection of the Optic Disk (b) Failed detection due to luminosity variations.

The distance to the image center, which has a similar physical meaning to the distance to the optic disk, was also calculated.
Line Features
Finally, the remaining group of features, the line features, contains several statistical measurements of the vessel profile. These were obtained by computing the orientation of each segment (through the fitting of an ellipse), taking the perpendicular line centered on the region centroid, and discretizing and skeletonizing that line. From this profile we extracted the kurtosis, skewness, mean and standard deviation [8].
A total of 61 features were obtained.

Feature Selection
In this work, recursive feature elimination (RFE) with cross-validation and ANOVA (with different numbers of chosen features) were used for feature selection. Besides that, the full feature set was also used, after some manual preprocessing to remove features with a considerable number of zeros, which have no predictive power. For this task, several classifiers were combined with these feature selection techniques, as shown in Fig. 7.

Fig. 7. Pipeline of feature selection and classification tasks.

In RFE, all features are selected in the first step. In every subsequent iteration, the n worst features (those with the lowest feature importance, in the case of tree-based models) are eliminated. The number of features left at the step that yields the maximum score on the validation data is considered the optimal number of features.
One-way analysis of variance (ANOVA) is used to determine whether there are any statistically significant differences between the means of two or more independent groups.

D. Classification
The Decision Tree classifier applies a straightforward idea to solve the classification problem: it poses a series of carefully crafted questions about the attributes of the test record. Each time it receives an answer, a follow-up question is asked, until a conclusion about the class label of the record is reached.
Random Forests are an extension of Decision Trees. As the name suggests, this algorithm creates a forest: an ensemble of decision trees. In general, the more trees in the forest, the more robust the classifier tends to be. In these two algorithms, tunable parameters include the tree depth and the maximum number of leaf nodes, among others.
The k-nearest neighbors (kNN) algorithm is a nonparametric method based on the similarity of nearby instances. It does not require a model, and uses the data directly for classification. The typically tuned hyperparameter is the number of neighbors, which is usually chosen to be odd to avoid ties.
Logistic regression is a classifier that finds the best-fitting model to describe the relationship between the dependent (outcome) and independent (predictor/explanatory) variables, generating coefficients that maximize the likelihood of observing the sample values. The typically tuned hyperparameter is the cost.
The Multilayer Perceptron is a class of feedforward artificial neural network with nodes that use nonlinear activation functions. The hyperparameters chosen for tuning were the activation function; alpha, a regularization parameter that penalizes weights with large magnitudes; the hidden layer size; the learning rate (which can be, for instance, adaptive, varying when the loss is constant from iteration to iteration); and the solver (Adam/SGD/RMSProp).

E. Post-Processing
After obtaining the first prediction for the full skeleton, each region is iteratively searched and checked to see whether the number of points classified as artery (with a fixed threshold of 0.5 on the classification probability) is larger than the number of points classified as vein. Accordingly, the probability of the artery points is replaced by the mean probability of the vein points in the same segment, or vice-versa.
As a second post-processing step, points that lie in an area closer to the optic disk than 20% of the maximum distance to it are removed from the skeleton. This is done because vessels closer to the optic disk are usually easier to classify, so their prediction should not need to be changed.
Afterwards, for every intersection of the skeleton, we collect all the neighbors in a 5-by-5 window and determine which region each belongs to. We then compute the mean probability of each segment being a vein, to find the predominant class in the neighborhood. For the regions that do not belong to the predominant class, we replace their mean probability with the maximum probability of the other regions. An example is shown in Fig. 8.

Fig. 8. Example of the second post processing step, for one region. 0.3 is changed into 0.7, the maximum between 0.5 and 0.7.
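The segment-level harmonization used as the first post-processing step can be sketched as follows. This is a simplified illustration under our own reading of the swap rule (minority-class points receive the mean probability of the majority class); the function name and the exact rule are assumptions, not the authors' code.

```python
import numpy as np

def harmonize_segment(artery_prob, threshold=0.5):
    """Majority-vote harmonization of one skeleton segment.

    Points whose predicted class disagrees with the segment majority
    have their artery probability replaced by the mean probability of
    the majority-class points.
    """
    probs = np.asarray(artery_prob, dtype=float).copy()
    is_artery = probs > threshold
    if is_artery.sum() > (~is_artery).sum():
        # Artery majority: overwrite vein-classified points.
        probs[~is_artery] = probs[is_artery].mean()
    elif (~is_artery).any():
        # Vein majority (or tie): overwrite artery-classified points.
        probs[is_artery] = probs[~is_artery].mean()
    return probs
```

For example, a segment with probabilities [0.9, 0.8, 0.7, 0.3] has an artery majority, so the single vein-classified point (0.3) is replaced by the artery mean, 0.8.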
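The propagation from the skeleton to the full vessel (Fig. 9) could be implemented, for instance, by assigning to each vessel pixel the label of its nearest skeleton pixel. The sketch below is one plausible realization using SciPy's distance transform with `return_indices`; it is not necessarily the method used in this work.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def propagate_labels(vessel_mask, skeleton_labels):
    """Propagate per-pixel class labels from the skeleton to the vessel.

    `skeleton_labels` holds an integer class per skeleton pixel
    (0 means "not a skeleton pixel"). Every vessel pixel receives the
    label of its nearest skeleton pixel; background stays 0.
    """
    # Distance transform over non-skeleton pixels, also returning, for
    # every pixel, the indices of the nearest skeleton pixel.
    _, (ri, ci) = distance_transform_edt(skeleton_labels == 0,
                                         return_indices=True)
    propagated = skeleton_labels[ri, ci]
    return np.where(vessel_mask, propagated, 0)

# Toy example: a vessel band whose centerline is split into two classes.
vessel = np.zeros((11, 20), dtype=bool)
vessel[3:8, :] = True
skel = np.zeros((11, 20), dtype=int)
skel[5, :10] = 1   # e.g. artery
skel[5, 10:] = 2   # e.g. vein
out = propagate_labels(vessel, skel)
```

Each half of the band inherits the class of the nearest centerline segment, which mirrors the skeleton-to-vessel propagation evaluated in the results below.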
Fig. 9. Propagation from the skeleton to the vessel.

IV. EXPERIMENTAL RESULTS
A. Model Choice
5-fold cross-validation was performed for model selection. There was no significant difference between the feature selection methods and using all the features, as shown in Fig. 10, so all features ended up being used, as the computational time was not significantly different either. No pre-processing was used, as initial trials showed it led to higher variability.

Fig. 10. 5-fold CV results for all the classifiers and all the pathways tested.

The best model was shown to be the multilayer perceptron, with a rectified linear unit as the activation function, an alpha of 0.0001, 100 nodes in one hidden layer, a constant learning rate and Adam as the optimizer.
Using the second post-processing step, the accuracy on both the vessels and the skeleton decreased, and the standard deviation increased, on the 5 validation images, when compared to using only the first post-processing step. As such, the second post-processing step was not used in the final model. Propagation from the skeleton to the vessel increased the average accuracy by around 7%, indicating its usefulness.

B. Test Set
The final model was then tested on 20 test images. An area under the curve of 75% was obtained (as shown in Fig. 10), with a 68% accuracy at the optimal threshold. Moreover, the sensitivity and specificity were shown to be 67% and 68%, respectively, showing that the model can classify veins as well as arteries.

V. CONCLUSION
Artery/Vein classification was shown to be a difficult task, with several challenges in terms of algorithmic approaches and code implementation.
As possible improvements, more approaches could be attempted in the pre-processing step, such as the Retinex algorithm, or an algorithm based on image dehazing to remove shadows, leading to less dramatic changes in the color of the image than the homomorphic filtering that was used.
The intersection detection methodology, using the hit-or-miss transform, led to some points being detected that were not intersections. Post-processing that uses the neighborhood of a candidate intersection could be further tested to check whether it is a false positive.
Regarding feature extraction, more features, such as texture-related features, could be extracted from the vessels, as a way to describe the central reflex. With a higher number of features, the feature selection methods should lead to a larger difference in accuracy than the one seen in this work.
As for label propagation in post-processing, it should be performed only on smaller vessels, perhaps using a threshold to distinguish between large and small vessels. This is because larger vessels are typically well classified already.
As always, having more data could help our training algorithm generalize better, and with a larger amount of data and data augmentation, deep learning algorithms could be applied.

VII. REFERENCES
1. Zamperini, A., et al., Effective features for artery-vein classification in digital fundus images. Computer-Based Medical Systems (CBMS), 2012 25th International Symposium on. 2012. IEEE.
2. Ikram, M.K., et al., Are retinal arteriolar or venular diameters associated with markers for cardiovascular disorders? The Rotterdam Study. Invest Ophthalmol Vis Sci, 2004. 45(7): p. 2129-34.
3. Karssemeijer, N., et al., Automatic classification of retinal vessels into arteries and veins. 2009. 7260: p. 72601F.
4. Dashtbozorg, B., A.M. Mendonca, and A. Campilho, An automatic graph-based approach for artery/vein classification in retinal images. IEEE Trans Image Process, 2014. 23(3): p. 1073-83.
5. Niemeijer, M., et al., Automated measurement of the arteriolar-to-venular width ratio in digital color fundus photographs. IEEE Trans Med Imaging, 2011. 30(11): p. 1941-50.
6. Joshi, V.S., et al., Automated method for identification and artery-venous classification of vessel trees in retinal vessel networks. PLoS One, 2014. 9(2): p. e88061.
7. Nieuwenhuis, C. and M. Yan, Knowledge based image enhancement using neural networks. Pattern Recognition, 2006. ICPR 2006. 18th International Conference on. 2006. IEEE.
8. Xu, X., et al., An improved arteriovenous classification method for the early diagnostics of various diseases in retinal image. Comput Methods Programs Biomed, 2017. 141: p. 3-9.