Automatic Thresholding Using Modified Valley Empha PDF
Research Article
Abstract: Otsu's method is one of the most well-known methods for automatic thresholding, which serves as an important
algorithm category for image segmentation. However, it fails if the histogram is close to unimodal or has large intra-class
variances. To alleviate this limitation, improved Otsu's methods such as the valley emphasis method and weighted object
variances method have been proposed, which still yield non-optimal segmentation performance in some cases. In this study, a
modified valley metric using the second-order derivative is proposed to improve Otsu's algorithm. Experiments are first
conducted on five typical test images whose histograms are unimodal, multimodal or have large intra-class variances, and then
expanded to a larger data set consisting of 22 cell images. The proposed algorithm is compared with original Otsu's method and
existing improved algorithms. Four evaluation metrics including misclassification error, foreground recall, Dice similarity
coefficient and Jaccard index are adopted to quantitatively measure the segmentation performance. Results show that the
proposed algorithm achieves the best segmentation results on both data sets, quantitatively and qualitatively. The proposed
algorithm adapts Otsu's method to more image subtypes, indicating wider applicability in automatic thresholding and image
segmentation.
Fig. 1 Thresholds of various Otsu methods on image coins
(a) Original image, (b) Ground truth, (c) The desired threshold and threshold values acquired by Otsu, VE, NVE, WOV, Cao's method and the proposed algorithm

compared with standard Otsu's method. Besides these improved one-dimensional Otsu's methods mentioned above, many two-dimensional Otsu's methods have also been published [27–30]. Despite better performance, the two-dimensional Otsu's algorithms are generally more time consuming than one-dimensional Otsu's methods. In this paper, we focus on improving the performance of the one-dimensional Otsu's algorithm.

Fig. 1 shows the threshold values acquired by several one-dimensional Otsu's algorithms on a typical image named coins. Despite the effectiveness declared in the published literature, these algorithms could not find the desired threshold value, resulting in non-optimal segmentation results. To further improve Otsu's method to fit more cases, in this paper, we introduce a modified valley metric and propose an improved Otsu's method based on it. The main contributions of this paper are as follows:

(i) We have introduced a modified valley metric using the second-order derivative instead of the grey-level probability used in the original VE, and proposed an improved Otsu's algorithm based on the modified valley metric;
(ii) Four quantitative evaluation metrics including misclassification error (ME), foreground recall (FRecall), Dice similarity coefficient (DSC) and Jaccard index (Jac) are employed for algorithm validation. Experiments on five typical test images as well as on a larger data set consisting of 22 cell images demonstrate that the proposed method has the overall best segmentation performance compared with existing Otsu's method, VE, NVE, WOV and Cao's method.
(iii) The proposed algorithm could achieve much better segmentation results on images having unimodal histograms or large intra-class variances, without compromising its segmentation performance on images with bimodal or multimodal histograms. It adapts Otsu's method to more image subtypes and makes it suitable for wider applications in real-world image segmentation scenarios.

$$p_i = \frac{f(i)}{M \times N}, \quad i \in [0, L) \tag{1}$$

$$P_1(t) = \sum_{i=t+1}^{L-1} p_i = 1 - P_0(t) \tag{3}$$

$$\mu_0(t) = \frac{\sum_{i=0}^{t} i \cdot p_i}{\sum_{i=0}^{t} p_i} = \frac{\sum_{i=0}^{t} i \cdot p_i}{P_0(t)} \tag{4}$$

$$\mu_1(t) = \frac{\sum_{i=t+1}^{L-1} i \cdot p_i}{\sum_{i=t+1}^{L-1} p_i} = \frac{\sum_{i=t+1}^{L-1} i \cdot p_i}{P_1(t)} \tag{5}$$

$$\mu(t) = \frac{\sum_{i=0}^{L-1} i \cdot p_i}{\sum_{i=0}^{L-1} p_i} = \sum_{i=0}^{L-1} i \cdot p_i = P_0(t) \cdot \mu_0(t) + P_1(t) \cdot \mu_1(t) \tag{6}$$

For a threshold t, the between-class variance in Otsu's method is defined as

$$\sigma(t) = P_0(t) \cdot (\mu_0(t) - \mu(t))^2 + P_1(t) \cdot (\mu_1(t) - \mu(t))^2 \tag{7}$$

Then the best threshold t* can be acquired by solving the optimization problem below

$$t^* = \arg\max_{t} \sigma(t) = \arg\max_{t} \left[ P_0(t) \cdot (\mu_0(t) - \mu(t))^2 + P_1(t) \cdot (\mu_1(t) - \mu(t))^2 \right] \tag{8}$$

2.2 Improved Otsu's methods

With the observation that the best threshold value should lie at the valley between two peaks, or at the bottom rim of a single peak, Ng proposed an improved VE scheme. The main idea of VE is to select a threshold value with a small probability of occurrence and to maximize the between-class variance in Otsu's method [13]. The best threshold of VE can be determined as follows:

$$t^* = \arg\max_{t} \; \sigma(t) \cdot (1 - p_t)$$
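As an illustration of (1)–(8) and of the VE weighting, the following is a minimal NumPy sketch, not the authors' Matlab implementation. It assumes that P0(t) is the cumulative probability of grey levels 0..t, as implied by the denominator in (4); the function names and the small cut-off used to skip empty classes are hypothetical choices made for this example.

```python
import numpy as np


def between_class_variance(hist):
    """Return sigma(t) of (7) for every threshold t, and the probabilities p_i of (1)."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()                       # (1): p_i = f(i) / (M * N)
    levels = np.arange(p.size)
    P0 = np.cumsum(p)                     # assumed: P0(t) = sum of p_i for i <= t
    P1 = 1.0 - P0                         # (3)
    m = np.cumsum(levels * p)             # running first moment up to t
    mu = m[-1]                            # global mean, (6)
    valid = (P0 > 1e-12) & (P1 > 1e-12)   # skip thresholds that leave a class empty
    mu0 = np.zeros_like(p)
    mu1 = np.zeros_like(p)
    mu0[valid] = m[valid] / P0[valid]           # (4)
    mu1[valid] = (mu - m[valid]) / P1[valid]    # (5)
    sigma = np.where(valid, P0 * (mu0 - mu) ** 2 + P1 * (mu1 - mu) ** 2, 0.0)
    return sigma, p


def otsu_threshold(hist):
    """Threshold maximising the between-class variance, as in (8)."""
    sigma, _ = between_class_variance(hist)
    return int(np.argmax(sigma))


def valley_emphasis_threshold(hist):
    """Threshold maximising sigma(t) * (1 - p_t), the VE objective."""
    sigma, p = between_class_variance(hist)
    return int(np.argmax(sigma * (1.0 - p)))
```

In this sketch a pixel with grey level at or below the returned t is assigned to class 0. When several thresholds share the same objective value, `np.argmax` returns the smallest one, which is a tie-breaking choice of the example rather than something stated in the text.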
$$t^* = \arg\max_{t} \sigma(t) = \arg\max_{t} \left[ P_0(t) \cdot P_0(t) \cdot (\mu_0(t) - \mu(t))^2 + P_1(t) \cdot (\mu_1(t) - \mu(t))^2 \right] \tag{13}$$

Cao et al. presented an improved Otsu's method whose objective function not only maximizes the between-class variance but also maximizes the distance between the mean values of the two classes. The new objective function in Cao's work is defined as

$$\sigma(t) = P_0(t) \cdot P_1(t) \cdot \left( (\mu_0(t) - \mu_1(t))^2 + (\mu_0(t) - \mu(t))^2 + (\mu_1(t) - \mu(t))^2 \right) \tag{14}$$

Although the VE, NVE and WOV algorithms as well as Cao's method perform better than standard Otsu's method in most cases, there are still certain types of images they cannot properly process. As shown in Fig. 1, the thresholds acquired by VE, NVE, WOV and Cao's method are not located at the real valley and differ significantly from the desired value. In the following section, a new valley metric is introduced, and automatic thresholding using the modified VE is proposed.

3 Proposed automatic thresholding

3.1 Valley metric using second-order derivative

As is known from the discussion above, VE is not sufficient in some thresholding cases. In order to further improve the performance of Otsu's method, we evaluate the valley using the second-order derivative. Let s(x) be the envelope function of a histogram, assumed to be twice differentiable, and let s′(x) and s′′(x) be its first-order and second-order derivatives, respectively. Then s′(x) equals 0 at both peaks and valleys, whereas s′′(x) is positive at valleys and negative at peaks. Fig. 2 shows the envelope curve of the histogram of image coins and its corresponding second-order difference curve. We can clearly see that points around the valley of the histogram envelope curve correspond to positive second-order difference values. In particular, each valley point in the histogram envelope curve corresponds to a peak point with a positive value in the second-order difference curve computed from the original histogram.

Motivated by the mathematical property mentioned above, we define the valley metric as

$$w_v(x) = \frac{s''(x) - \min(s''(\cdot))}{\max(s''(\cdot)) - \min(s''(\cdot))} \tag{15}$$

Fig. 2 Histogram of coins and its corresponding second-order difference curves

where p_x is the probability of occurrence of grey level x, and p(⋅) represents the probability of all grey levels.

In our implementation, the histogram of the image is first smoothed using an average filter in order to alleviate the second-order difference anomaly caused by abnormal fluctuations of the histogram. The average smoothing process can be described as

$$\tilde{f}(i) = \frac{\sum_{j=-k}^{k} f(i + j)}{2k + 1} \tag{19}$$

where 2k + 1 is the filter size, and k is set to five as suggested in the NVE method [18].

3.2 Automatic thresholding using modified valley metric

Taking the valley metric in (18) into consideration, the final objective function of our proposed algorithm is defined as follows:

$$\delta(t) = \sigma(t) \cdot w_v(t) \tag{20}$$

In (20), the between-class variance term σ(t) and the valley metric w_v(t) are multiplied together. There are two advantages to using (20) as the objective function. On the one hand, both terms in (20) are as large as possible when δ(t) reaches its largest value. On the other hand, the objective function is parameter-free, which means we do not have to choose a tradeoff parameter between σ(t) and w_v(t).
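The histogram smoothing in (19), the min-max normalized valley metric and the product objective in (20) might be sketched as follows. This is only an illustration under stated assumptions: the second-order difference is taken as the central difference p(t+1) − 2p(t) + p(t−1), the histogram is zero-padded at its ends during smoothing, and σ(t) is computed from the unsmoothed histogram. None of these details are confirmed by the text available here, and all function names are hypothetical.

```python
import numpy as np


def between_class_variance(p):
    """sigma(t) of (7), using the algebraically equivalent form (mu*P0 - m)^2 / (P0*P1)."""
    levels = np.arange(p.size)
    P0 = np.cumsum(p)
    P1 = 1.0 - P0
    m = np.cumsum(levels * p)
    mu = m[-1]
    denom = P0 * P1
    sigma = np.zeros_like(p)
    valid = denom > 1e-12                 # skip thresholds that leave a class empty
    sigma[valid] = (mu * P0[valid] - m[valid]) ** 2 / denom[valid]
    return sigma


def smooth_histogram(hist, k=5):
    """Average filter of size 2k + 1 as in (19); zero padding at the ends is an assumption."""
    kernel = np.ones(2 * k + 1) / (2 * k + 1)
    return np.convolve(np.asarray(hist, dtype=float), kernel, mode="same")


def valley_metric(hist, k=5):
    """Min-max normalized second-order difference of the smoothed histogram."""
    f = smooth_histogram(hist, k)
    p = f / f.sum()
    d2 = np.zeros_like(p)
    d2[1:-1] = p[2:] - 2.0 * p[1:-1] + p[:-2]   # assumed central second difference
    span = d2.max() - d2.min()
    if span == 0:
        return np.zeros_like(p)                 # flat histogram: no valley information
    return (d2 - d2.min()) / span


def modified_valley_threshold(hist, k=5):
    """Threshold maximising delta(t) = sigma(t) * w_v(t), as in (20)."""
    hist = np.asarray(hist, dtype=float)
    sigma = between_class_variance(hist / hist.sum())   # assumed: raw histogram for sigma
    return int(np.argmax(sigma * valley_metric(hist, k)))
```

Because the two terms are multiplied, a grey level near a histogram peak (where the normalized second-order difference is close to 0) is suppressed even if its between-class variance is high, which is the behaviour described for the objective above.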
Consequently, the proposed improved Otsu's method is as follows:

$$t^* = \arg\max_{t} \sigma(t) \cdot w_v(t) = \arg\max_{t} \left( P_0(t) \cdot (\mu_0(t) - \mu(t))^2 + P_1(t) \cdot (\mu_1(t) - \mu(t))^2 \right) \cdot \frac{\Delta^2 p_t - \min(\Delta^2 p(\cdot))}{\max(\Delta^2 p(\cdot)) - \min(\Delta^2 p(\cdot))} \tag{21}$$

4 Experimental results and analysis

Experiments are conducted using Matlab R2012b on a PC with Intel Core 2.30 GHz CPU and 4.0 GB memory. First, five typical test images named rice, coins, handwriting, printing character and license plate are selected to test algorithm performance, whose histograms include unimodal, bimodal and multimodal distributions (Fig. 3). Among them, image coins has significantly large intra-class variances in its histogram. Each of the images is manually labelled with the binary segmentation ground truth, partly based on which the algorithm evaluation will be conducted. Subsequently, a larger data set containing 22 cell images is constructed to further test the proposed algorithm. All 22 images are collected from the internet, and we manually label the segmentation ground truth for each image. More details about the data set and the validation experiments will be introduced later.

In order to verify the effectiveness of the proposed algorithm, results are compared with those of Otsu's method [15], VE [13], NVE [19], WOV [23] and Cao's method [26], and four evaluation metrics including ME [13, 19, 23], FRecall, DSC [31, 32] and Jac [33, 34] are adopted to quantitatively measure the algorithm's performance. The metric ME for effectiveness evaluation is defined as follows:

where Fo and Bo are the pixel sets of foreground and background segmented by the automatic thresholding method, and FT, BT are the manually labelled foreground and background pixel sets which serve as ground truth. Consequently, ME represents the ratio of pixels in an image misclassified by the automatic thresholding method. Obviously, better automatic thresholding corresponds to a smaller ME value. On the other hand, FRecall is used to assess the foreground extraction ability of a method, and it can be expressed as

$$\mathrm{DSC} = \frac{2TP}{2TP + FP + FN} \tag{24}$$

In this paper, TP represents the number of correctly detected foreground pixels, and FP, FN are the number of incorrectly detected foreground pixels and the number of missed foreground pixels, respectively.

The last evaluation metric is Jac and it is defined as

$$\mathrm{Jac} = \frac{|F_o \cap F_T|}{|F_o \cup F_T|} \tag{25}$$

where Fo and FT are the same as those introduced in ME.

In this paper, the performance of the proposed method is evaluated on both data sets mentioned above and compared with original Otsu's method, VE, NVE, WOV and Cao's method. In the following parts, algorithm evaluation using a data set of five typical test images is first conducted in four aspects, including the objective function comparison, rationality analysis of the threshold value, quantitative evaluation and qualitative evaluation. Next, a larger data set of 22 cell images is introduced, and further experiments are conducted on this larger data set.
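The four evaluation metrics can be computed from a predicted binary mask and a ground-truth mask roughly as sketched below. DSC and Jac follow (24) and (25). The ME and FRecall expressions are not reproduced in the text above, so the sketch assumes their common forms: ME as the fraction of pixels whose label disagrees with the ground truth, and FRecall as TP / (TP + FN), which is consistent with the remark that labelling every pixel as foreground yields an FRecall of 100%. The function name and the handling of empty masks are hypothetical.

```python
import numpy as np


def segmentation_metrics(pred, truth):
    """Return ME, FRecall, DSC and Jac for two binary masks (nonzero = foreground)."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.count_nonzero(pred & truth)      # correctly detected foreground
    fp = np.count_nonzero(pred & ~truth)     # background labelled as foreground
    fn = np.count_nonzero(~pred & truth)     # missed foreground
    tn = np.count_nonzero(~pred & ~truth)    # correctly detected background
    n = pred.size
    return {
        # assumed form: share of pixels assigned to the wrong class
        "ME": 1.0 - (tp + tn) / n,
        # assumed form: recall of the foreground class
        "FRecall": tp / (tp + fn) if (tp + fn) else 0.0,
        # (24)
        "DSC": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 1.0,
        # (25): |Fo ∩ FT| / |Fo ∪ FT|
        "Jac": tp / (tp + fp + fn) if (tp + fp + fn) else 1.0,
    }
```

For example, a mask obtained with a threshold t could be evaluated as `segmentation_metrics(image > t, ground_truth)`; whether the foreground is the brighter or the darker class depends on the image, so the comparison direction is an assumption to be set per test image.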
Fig. 3 Test images and their manually labelled binary segmentation ground truth
(a) Original image, (b) Ground truth, (c) Histograms of each image
4.1 Objective function evaluation

The main idea of the proposed method is to take the second-order difference-based valley metric into consideration to construct a more effective objective function. In this section, the objective function of the proposed method will be evaluated on images rice and coins and compared with that of Otsu's method.

Fig. 4 demonstrates the difference between our proposed objective function and that of Otsu's method. The primary impression of the figure is that our modified objective functions are not as smooth as that of Otsu's method, which may be due to the fluctuation of the histogram. However, the trend is distinct: the objective function values are enhanced at valley points and weakened at peaks. This effect is more obvious on image coins shown in Fig. 4b.

4.2 Rationality analysis of the threshold value

To validate the rationality of the threshold of the proposed algorithm, all five typical test images are used for testing. Throughout the experiments, the filter size used in NVE is set to 11 as suggested in the literature [19]. Table 1 lists the threshold values of each method, and it is obvious that the results of Otsu's method, VE and NVE are extremely close on all images. Fig. 5 shows the threshold locations of each method on the histograms of all test images except for image coins, which is shown in Fig. 1c. The histogram distributions of images handwriting and printing character are close to unimodal, a case in which Otsu's method is known to perform badly. In particular, the grey level of the foreground characters in image handwriting is lower and closer to the background grey level, making the problem more difficult. From Fig. 5 we can find that the threshold values acquired by the proposed method for these two images both lie at the bottom rim of the peak, and only the proposed algorithm and Cao's method generate thresholds close to the left bottom rim for image handwriting. For the remaining images with bimodal or multimodal histogram distributions, the proposed algorithm also chooses the thresholds closest to the real valleys among all compared methods, indicating its general effectiveness in all test cases. In particular, as is shown in Fig. 1c, the proposed method could identify the threshold value most similar to the desired value for image coins, whose histogram has large intra-class variances and is challenging for existing Otsu's method and other improved methods.

4.3 Quantitative evaluation

In order to evaluate the performance of all test methods, we have manually labelled the binary segmentation ground truth for the five test images (Fig. 3). The ground truth images are labelled at pixel level and saved as png format files. ME, FRecall, DSC and Jac metric values are then calculated based on the ground truth images to quantitatively evaluate the segmentation performance of each method.

Table 2 lists the ME values of each algorithm on all five images. We can see that the proposed algorithm gains the smallest ME values on three of the five test images; the exceptions are image handwriting and image license plate, for each of which its ME is one percent larger than the smallest ME. In addition, the average ME of the proposed method on the five test images is far less than those of most of the other Otsu-based algorithms, including Otsu's method, VE, NVE and WOV, and one percent smaller than that of Cao's method. As smaller ME values indicate more effective segmentation ability, the ME results have verified the effectiveness of our proposed algorithm.

Table 3 shows the FRecall values of all test methods. The proposed method obtains the best average result and has the largest FRecall values for almost all test images except for image handwriting. Although a larger FRecall value generally corresponds to better foreground detection, this may not be the case if the foreground pixels are overly labelled compared with the ground truth. According to (23), if we label all pixels in an image as foreground pixels, the FRecall can reach 100%, which is not the result we expect. Here the result for image handwriting encounters a similar situation, and the good segmentation result of our proposed method will be demonstrated in the qualitative evaluation section.

The DSC values of each algorithm on the five test images are shown in Table 4. As introduced before, larger DSC values indicate better algorithm performance. From Table 4 we can conclude that the proposed algorithm, whose average value is 0.92, generally performs better than the compared methods on the test images. Table 5 shows the Jac values of each algorithm on the five test images. The average Jac value of the proposed method is larger than those of all compared algorithms, once again confirming the superior performance of the proposed algorithm.

4.4 Qualitative evaluation of segmentation results

Fig. 6 shows the thresholding segmentation results of each method on the five test images. Images of each column from left to right are original images, results of Otsu's method, VE, NVE, WOV, Cao's method and the proposed method. There is not much difference among the methods for image printing character and image license plate. For image rice, although all methods miss some foreground object pixels at the bottom area due to uneven illumination, our algorithm misses the least. As for image coins, whose histogram has large intra-class variances, the proposed method is the only one which could correctly segment out all coins
Fig. 6 Segmentation results of each automatic thresholding method. From left to right: original images, Otsu's results, VE's results, NVE's results, WOV's
results, Cao's results, and the proposed results
Fig. 7 Segmentation results of each automatic thresholding method on a subset of the larger data set. From left to right: original images, Otsu's results, VE's
results, NVE's results, WOV's results, Cao's results, and proposed results
without any local cavities. When applying the proposed method on image handwriting, despite a much lower FRecall value acquired compared with most of the other Otsu-based methods shown in Table 3, the lower ME and higher DSC and Jac values assure that the proposed method could almost segment all words from the image, generating the best result. When compared with Cao's method for image handwriting, three of the four evaluation metric values including ME, DSC and Jac are better for Cao's method, and only the value of FRecall shows a better result for our proposed method. However, from the segmentation results in Fig. 6 we could see that despite some over labelled foreground at the left lower corner, our proposed method could segment much clearer characters compared with Cao's method. Therefore, based on the thresholding segmentation results on all five test images described above, the proposed method shows the best segmentation performance, indicating its overall effectiveness for various image subtypes.

4.5 Further evaluation on a larger data set

In order to further evaluate the proposed algorithm, we conduct experiments on a larger data set consisting of 22 cell images. All the original cell images are collected from internet, and the ground truth of each image is manually labelled before algorithm validation.

The proposed algorithm and the five compared methods are evaluated both quantitatively and qualitatively on the 22 cell images. Fig. 7 shows part of the segmentation results. It is obvious that most segmentation results of the proposed method are better compared to other algorithms. On one hand, from the segmentation
Table 7 Average running time of each algorithm on images of different sizes (ms)
Image size     Otsu's method [9]   VE [18]   NVE [10]   WOV [12]   Cao's method [26]   Proposed method
256 × 256      0.3588              0.3588    0.4524     0.3588     0.3588              0.7644
512 × 512      0.4524              0.4680    0.5928     0.4526     0.4524              0.8736
1024 × 1024    1.1544              1.1700    1.3104     1.1388     1.1388              1.5912
2048 × 2048    4.1496              4.3680    4.2588     4.1184     4.1340              4.6488
4096 × 4096    15.6313             15.8653   16.2709    15.6937    15.6937             16.1461
images in rows 2, 3 and 5, we can see that foreground objects in the segmentation images generated by the proposed method contain fewer black holes, which correspond to missed detections of foreground pixels. On the other hand, the threshold values obtained using the proposed method and WOV are more reasonable than those obtained using Otsu's method, VE, NVE and Cao's method on the image in the 4th row in Fig. 7. While the results of Otsu's method, VE, NVE and Cao's method in the 4th row obviously miss most of the foreground objects, the results of the proposed method and WOV are more consistent with the manually labelled ground truth. Moreover, compared to the results of WOV, foreground objects segmented using the proposed method are more complete. Despite the advantages of the proposed method, we should also notice its drawbacks in some of the segmentation results. Taking the results in rows 1 and 5 as an example, while the proposed method detects more complete foreground objects, more background pixels are incorrectly classified. Taking all the segmentation results into consideration, the proposed method performs better than all the compared algorithms.

Table 6 shows the average metric values of each method on the 22 cell images. The proposed method obtains the best values on three of the four metrics, which are ME, DSC and Jac. For FRecall, the result of the proposed method (86.04%) is slightly smaller than that of the WOV method (88.26%). However, the WOV method has the largest average ME value (21.65%) among all test algorithms, which is about 7 percentage points larger than that of the proposed method (14.57%). The quantitative evaluation results on the larger data set verify the effectiveness of the proposed algorithm.

4.6 Algorithm complexity

We compare the algorithm complexity of all test methods shown in this paper. Let n be the number of pixels in the image and L the number of grey levels; the time complexity of Otsu's method is then O(n). The extra computation burden of the proposed algorithm includes (i) the average filtering of the histogram, which needs (2k + 1)L additions and (2k + 1)L multiplications; (ii) the calculation of the second-order difference, which needs 2L subtractions; (iii) the calculation of the valley metric, which includes 2L subtractions and L divisions; (iv) the calculation of the new objective function, which needs L multiplications. The total computation increase is (4k + 8)L, where k and L are constants whose values are 5 and 256, respectively. Therefore, our proposed method has the same time complexity as Otsu's method, which is O(n). Similarly, the time complexity of all other compared methods in this paper is also O(n). Table 7 lists the average running time over 100 runs of each algorithm on images of different sizes. When the image size is small (no more than 1024 × 1024), the extra time consumption of our proposed method compared with all other methods is about 0.3–0.4 ms. However, as image size increases, the average running times of all methods become much more similar, which is consistent with the previous analysis that all methods have the same time complexity of O(n). When the image size reaches 4096 × 4096, all methods could still achieve real-time image processing since the average running time is below 50 ms.

5 Conclusion

In this paper, we have introduced a modified valley metric and presented an improved Otsu's method for automatic image thresholding. The proposed valley metric is constructed using the second-order derivative and introduced into the objective function of Otsu's method to make the threshold more likely to lie at the valley between peaks of the image histogram. Experimental verification is conducted on five typical images as well as on a larger data set of 22 cell images with manually labelled ground truth. The proposed method is compared with existing Otsu-based methods including standard Otsu's method, VE, NVE and WOV, as well as the recently published Cao's method. The proposed method could significantly improve Otsu's method in segmenting images with unimodal histograms and images having large intra-class variances. It has a similar time complexity to the compared methods and is shown to be the most effective algorithm among all test methods by both quantitative and qualitative evaluations, exhibiting the best flexibility and performance for segmenting images of various histogram distributions.

6 Acknowledgments

This work is partly supported by the National Natural Science Foundation of China (Grant Nos. 61866031, 61862053, 61762074 and 31860030), and the Science Technology Foundation for Middle-aged and Young Scientist of Qinghai University (Grant Nos. 2016-QGY-5, 2017-QGY-4 and 2018-QGY-6).

7 References

[1] Cao, J.F., Chen, L.C., Wang, M., et al.: ‘Implementing a parallel image edge detection algorithm based on the Otsu-Canny operator on the Hadoop platform’, Comput. Intell. Neurosci., 2018, 2018, pp. 1–12
[2] Chen, C.T., Tsao, C.K., Lin, W.C.: ‘Medical image segmentation by a constraint satisfaction neural network’, IEEE Trans. Nucl. Sci., 1991, 38, (2), pp. 678–686
[3] Lin, L., Yang, W., Li, C., et al.: ‘Inference with collaborative model for interactive tumor segmentation in medical image sequences’, IEEE Trans. Cybern., 2017, 46, (12), pp. 2796–2809
[4] Han, B., Wu, Y.: ‘A novel active contour model based on modified symmetric cross entropy for remote sensing river image segmentation’, Pattern Recogn., 2017, 67, pp. 396–409
[5] Zhang, L., Kong, H., Chin, C.T., et al.: ‘Segmentation of cytoplasm and nuclei of abnormal cells in cervical cytology using global and local graph cuts’, Comput. Med. Imaging Graph., 2014, 38, (5), pp. 369–380
[6] Wang, D.W.: ‘Hybrid fitting energy-based fast level set model for image segmentation solving by algebraic multigrid and sparse field method’, IET Image Process., 2017, 12, (4), pp. 539–545
[7] Zheng, J., Zhang, D.H., Huang, K.D., et al.: ‘Adaptive image segmentation method based on the fuzzy c-means with spatial information’, IET Image Process., 2018, 12, (5), pp. 785–792