A Rapid Knowledge-Based Partial Supervision Fuzzy C-Means For
DOI: 10.1002/ima.22335
RESEARCH ARTICLE

1 Department of Computer Applications, Kalasalingam Academy of Research and Education (Deemed to be University), Krishnankoil, Tamil Nadu, India
2 Department of Computer Science and Applications, The Gandhigram Rural Institute (Deemed to be University), Gandhigram, Tamil Nadu, India
3 Department of Radiology and Imaging Sciences, Sri Ramachandra University Medical College, Chennai, Tamil Nadu, India

Correspondence
Thiruvenkadam Kalaiselvi, Department of Computer Science and Applications, The Gandhigram Rural Institute (Deemed to be University), Gandhigram, Tamil Nadu, India.
Email: [email protected]

Abstract
The proposed work aims to quicken the magnetic resonance imaging (MRI) brain tissue segmentation process using knowledge-based partial supervision fuzzy c-means (KPSFCM) with a graphics processing unit (GPU). The proposed KPSFCM contains three steps: knowledge-based initialization, modification, and optimization. The knowledge-based initialization step extracts initial centers for KPSFCM from the input MR images using Gaussian-based histogram smoothing. The modification step changes the membership function of PSFCM, guided by the labeled patterns of the cerebrospinal fluid portion. Finally, the optimization step is achieved through size-based optimization (SBO), adjacency-based optimization (ABO), and parallelism-based optimization (PBO). SBO and ABO are algorithm-level optimization techniques on the central processing unit (CPU), whereas PBO is a hardware-level optimization technique implemented on the GPU using compute unified device architecture (CUDA). The performance of KPSFCM is tested with online and clinical datasets. The proposed KPSFCM gives better segmentation accuracy than 14 state-of-the-art methods, but it is computationally expensive. When the optimization techniques (SBO and ABO) are included, the execution time on the CPU is reduced 13-fold. Finally, with the inclusion of PBO, the GPU implementation is 19 times faster than the optimized CPU implementation.

KEYWORDS
brain scans, GPU CUDA, labeled patterns, parallel computing, tissue segmentation

Int J Imaging Syst Technol. 2019;1–14. wileyonlinelibrary.com/journal/ima © 2019 Wiley Periodicals, Inc.
2 SRIRAMAKRISHNAN ET AL.
learning. A labeled pattern assigns the belongingness of a data point to a particular cluster using application knowledge. Labeled patterns significantly improve the results of the clustering process. Generally, labeled patterns are small in number and unlabeled patterns are large in number. Labeled patterns contribute more to the accuracy of the clustering process than unlabeled patterns. Partial supervision helps to identify the hidden structure of the data and is more suitable for brain tissue segmentation when a bias field is present.

Generally, PSFCM computation is an iterative procedure and requires manual initialization of the number of clusters (c) and the initial centers (ICs). These drawbacks are overcome by the proposed KPSFCM with automatic initialization using a Gaussian-based histogram smoothing technique. The membership function of PSFCM is modified for brain tissue segmentation and is computed over the brain portion using labeled patterns. The modified membership function improves the accuracy of the segmentation. The performance of KPSFCM is measured by the Dice coefficient and is compared with 14 state-of-the-art methods.

The iterative nature of PSFCM makes it a time-consuming process.12 Furthermore, an MRI volume in the form of 2D slices takes much time to compute. Therefore, KPSFCM was extended with several optimization techniques at the algorithm level (size-based optimization [SBO] and adjacency-based optimization [ABO]) as well as hardware-level parallelism-based optimization (PBO). For hardware-level optimization, the graphics processing unit (GPU) is very useful for speeding up KPSFCM. Researchers have observed that general-purpose GPU computation gives attractive results in accelerating various applications, including medical image analysis.13-15

The rest of the paper is organized as follows: Section 2 describes related works relevant to this experiment. Section 3 contains the methodology of KPSFCM. Section 4 describes the systems, materials, and evaluation metrics used to test the performance of KPSFCM. Results and discussion are detailed in Section 5 and concluding remarks are given in Section 6.

2 | RELATED WORKS

The proposed method is compared with the following 14 state-of-the-art methods, which are described in this section. K-means (KM) is an unsupervised iterative method that gives sharp classification.16 FCM is also an unsupervised clustering technique, which classifies the given data by fuzzifying the elements with different degrees of membership.17 FFCM assigns c membership grades to every pixel.18 To reduce the computational time in FFCM, a hard membership can be assigned to pixels for updating centers in each iteration step. FAST is a fully automatic method for segmenting the brain tissue using a hidden Markov random field model and the expectation-maximization algorithm.19 SPM5 and SPM8 are two variants of the SPM toolbox. These segmentation methods work with parameter estimation of a Gaussian mixture model and atlas registration.20 GAMIXTURE is a statistical model approximated by a Gaussian function. The mixture model helps to segment the brain into pure and mixture classes.21 Self-organizing map (SOM) is a type of artificial neural network (ANN) capable of efficient brain tissue segmentation in the presence of noise and bias field.22 KNN implements self-trained prior probability-based atlases, which help to obtain the maximum probability from the tissues.23 FANTASM was proposed to enhance the alternative fuzzy c-means algorithm and improve its robustness to noise.24 The improvement is formulated by directly injecting a new term into the objective function. The membership value of each pixel depends not only on the data at that pixel but also on the neighboring membership values. PVC is a maximum a posteriori classifier that is optimized by the iterative conditional algorithm.25 Tissue distributions are estimated by combining the tissue measurement model with a spatial prior model. A semiautomated brain tissue segmentation method was developed based on a hybrid hierarchical approach that combines a brain atlas with LS-SVM.26 This is a three-step process of skull stripping, CSF removal, and tissue segmentation using FAST from the FMRIB software library (FSL-FAST) and LS-SVM. FSL-FAST contributes to the first two steps, skull stripping and CSF removal.27,28 In the third step, the LS-SVM classifier helps to segment the GM and WM tissues.

Brain tissue segmentation using FCM for a large dataset is a time-consuming process.29 The br-FCM is a GPU-based parallel FCM for medical image segmentation that has been implemented on various GPU cards.30 FCM and type-2 fuzzy c-means (T2FCM) algorithms have been implemented on the GPU for medical images.31 This method reduces the execution time by up to 80% for FCM and 74% for T2FCM. Parallelized FCM on general images is 11 times faster than the conventional central processing unit (CPU) implementation.32 A segmentation method was developed for color images of polyurethane foam with fungus and was compared with sequential FCM implemented in C++ and MATLAB.33 The authors achieved a 10-fold speedup of their parallel proposal compared with the FCM implemented in C++ for an object area of 310k pixels, and a 50- to 100-fold speedup compared with the FCM implemented in MATLAB for an object area of 260k pixels.

From the literature review, it is observed that numerous methods have been developed to analyze brain-related diseases using segmentation techniques, and unsupervised algorithms play a major role among them. Usually, unsupervised algorithms require a medical expert's support to set up parameters for the initialization process ahead of
clustering. A few algorithms require a preprocessing decision when noise or a bias field is present in the image. Generally, machine learning algorithms are complex in nature, and huge medical volumes demand tremendous computation power. These factors motivate us to develop a fully automatic method with self-initialization that performs tissue segmentation in the presence of a bias field. In addition, the complexity of the algorithm is addressed by the computation power of the GPU.

3 | KNOWLEDGE-BASED PARTIAL SUPERVISION FUZZY C-MEANS

The block diagram of KPSFCM is shown in Figure 1. The proposed KPSFCM has three major steps to enhance the performance of PSFCM for segmenting brain tissues. They are as follows:
Step 1: Initialization
Step 2: Modification
Step 3: Optimization

In the first step, automatic initialization of KPSFCM is done using the anatomy details, the intensity characteristics of MR brain images, and histogram smoothing techniques. Automatic initialization overcomes the random initialization overhead of PSFCM. In the second step, the membership function is modified using the labeled patterns of the CSF region. These two novel techniques improve the segmentation accuracy of the KPSFCM algorithm. The third step aims to quicken the proposed KPSFCM algorithm by including the following three optimization processes:

1. SBO: A bounding box of the region of interest (ROI), here the brain portion, is used throughout the algorithm for segmentation.
2. ABO: The adjacency property present in the brain slices is used for the centers' initialization process. The final centers (FCs) of a slice are used to initialize the centers of adjacent slices. This helps to reduce the number of iterations and thus the processing time.
3. PBO: Parallel processing of the MRI volume is used to implement KPSFCM on the GPU with the compute unified device architecture (GPU-CUDA) programming model.

3.1 | Initialization

Before the KPSFCM algorithm is initialized, the data must be prepared for processing. The preprocessing of the proposed work starts with brain portion extraction, or skull stripping, using the brain extraction method (BEM). The original MRI of head scans contains the brain portion along with nonbrain tissues such as skull, scalp, fat, and so on, as shown in Figure 2A. BEM is a knowledge-based method for skull stripping proposed by Kalaiselvi.34 This method gives a better brain portion than the brain extraction tool, brain surface extractor, and model-based level set. A sample slice from the IBSR20 dataset and its skull-stripped image using BEM are shown in Figure 2.

After extracting the brain portion, the histogram-based initialization is performed. Prior knowledge about brain anatomy can play an important role in the initialization process. In this work, prior knowledge is extracted from the anatomical structure and intensity characteristics of MR brain scans and is used to initialize the centers of the KPSFCM process.

The middle slice of the volume contains all the tissue regions, and the histogram of the middle slice from IBSR20 is shown in Figure 3. Since it contains several peaks, it cannot give any information about the number of clusters (c) and the ICs. Gaussian distribution-based histogram smoothing helps to produce the peaks of the major tissues. In the smoothed histogram, four peaks are found for the GM, WM, CSF, and background, as shown in Figure 4. The cluster count (c) is taken from the number of peaks in the smoothed histogram, and the center of each cluster is set from the gray value of its corresponding peak. The nontissue (background) region arises from the void nuclear magnetic resonance signal and is omitted from the calculation of the number of clusters. This knowledge-based initialization is used for the FCM process instead of the manual selection of c and the random initialization of ICs.

FIGURE 3 Histogram of the middle slice of the IBSR20 205_3 dataset [Color figure can be viewed at wileyonlinelibrary.com]

FIGURE 4 Histogram smoothing using Gaussian distribution on IBSR20 205_3 middle image [Color figure can be viewed at wileyonlinelibrary.com]

$$J_{ps} = \sum_{i=1}^{c}\sum_{k=1}^{n} u_{ik}^{m}\, d_{ik}^{2} + \delta \sum_{i=1}^{c}\sum_{k=1}^{n} \left(u_{ik} - f_{ik} b_k\right)^{m} d_{ik}^{2}, \tag{3}$$
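The knowledge-based initialization of Section 3.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the kernel width `sigma`, the minimum peak height `min_fraction`, the border handling, the rule that the darkest peak is the background, and the synthetic histogram are all assumptions made for illustration.

```python
import math

def gaussian_kernel(sigma):
    """Normalized discrete Gaussian kernel truncated at 3 * sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(t * t) / (2.0 * sigma * sigma))
               for t in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth_histogram(hist, sigma=4.0):
    """Convolve a gray-level histogram with a Gaussian (borders clamped)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    last = len(hist) - 1
    smoothed = []
    for g in range(len(hist)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(g + j - radius, 0), last)
            acc += w * hist[idx]
        smoothed.append(acc)
    return smoothed

def find_peaks(smoothed, min_fraction=0.05):
    """Gray levels that are local maxima and reach min_fraction of the tallest bin."""
    floor = min_fraction * max(smoothed)
    peaks = []
    for g, value in enumerate(smoothed):
        left = smoothed[g - 1] if g > 0 else float("-inf")
        right = smoothed[g + 1] if g < len(smoothed) - 1 else float("-inf")
        if value >= floor and value > left and value >= right:
            peaks.append(g)
    return peaks

def knowledge_based_init(hist, sigma=4.0):
    """Return (c, initial_centers); the darkest peak is dropped as background."""
    peaks = find_peaks(smooth_histogram(hist, sigma))
    tissue_peaks = peaks[1:]  # assumption: the darkest peak is the background
    return len(tissue_peaks), tissue_peaks

def synthetic_histogram():
    """Toy 256-bin histogram: background at 0, three tissue modes, and a ripple."""
    modes = [(0, 3000.0), (60, 400.0), (120, 1000.0), (180, 900.0)]
    hist = []
    for g in range(256):
        value = sum(h * math.exp(-((g - mu) ** 2) / (2.0 * 8.0 ** 2))
                    for mu, h in modes)
        hist.append(value + 5.0 * (g % 3))  # ripple adds many spurious raw peaks
    return hist

c, centers = knowledge_based_init(synthetic_histogram())
print(c, centers)
```

On the toy histogram the smoothing leaves one background peak and three tissue peaks, so the sketch reports three clusters whose initial centers are the gray values of the tissue peaks.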
The modified membership matrix $U = [u_{ik}]$ is defined as

$$u_{ik} = \frac{1}{1 + \delta b_k}\left[\frac{1 - b_k \sum_{l=1}^{c} f_{lk}}{\sum_{l=1}^{c}\left(\dfrac{d_{ik}}{d_{lk}}\right)^{2}}\right] + \delta f_{ik} b_k \tag{6}$$

Membership modification gives a strong association between the Boolean vector and the scaling factor. This factor helps to find the suitable cluster for a data point when a bias field is present. Therefore, the modified membership function makes the method robust against noise and bias field.

The centers ($v_i$) are updated as

$$v_i = \frac{\sum_{k=1}^{n} u_{ik}^{2}\, x_k}{\sum_{k=1}^{n} u_{ik}^{2}}, \tag{7}$$

where $u_{ik}$ is the degree of membership of $x_k$ in cluster i; δ is a scaling factor that maintains a balance between the supervised and unsupervised data and is suggested to be proportional to the ratio n/p, where p denotes the number of labeled patterns; $d_{ik} = \lVert x_k - v_i \rVert$ is the Euclidean distance between the kth data point and the ith center; and $v_i$ is the center of cluster i.

The stopping criterion of the KPSFCM is given by

$$\sum_{i=1}^{c} \operatorname{abs}\left(v_i - \tilde{v}_i\right) < \varepsilon, \tag{8}$$

where $\tilde{v}_i$ is the updated value of $v_i$ and ε ∈ [0, 1]. The algorithm of the proposed KPSFCM is given in Algorithm 1.

Algorithm 1
Step 3: Compute the membership values of labeled data $f_{ik}$.
Step 4: Compute the membership matrix ($u_{ik}$) using Equation (6).
Step 5: Compute the Euclidean distance between data points ($x_k$) and centers ($v_i$).
Step 6: Update the centers using Equation (7).
Step 7: If the stopping criterion given in Equation (8) is met, go to step 8; else go to step 4.
Step 8: Outputs: Membership matrix ($u_{ik}$), FCs ($v_i$), and iteration count.

3.3 | Optimization

The proposed KPSFCM is further optimized using the knowledge about the ROI, the adjacency properties of MR slices, and a parallel computing technique. These three kinds of optimization are named SBO, ABO, and PBO.

3.3.1 | Size-based optimization

Under this optimization (SBO), KPSFCM aims to reduce the computational time by reducing the image size. In the MRI slices, background data points occupy more pixels than the ROI, as shown in Figure 5A. KPSFCM spends much of its time processing these unwanted background data points. Therefore, a bounding box algorithm is used to remove the background data points, as shown in Figure 5C. This process reduces the image size considerably. The size reduction does not affect the segmentation accuracy but reduces the processing time. The ROI is much smaller near the bottom-end and top-end slices of a volume.
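The bounding-box step of SBO can be sketched as follows. This is an illustrative version only: it assumes a skull-stripped slice stored as a list of rows in which background pixels are exactly 0, and the function names `roi_bounding_box` and `crop_to_roi` are hypothetical.

```python
def roi_bounding_box(image, background=0):
    """Return inclusive (top, bottom, left, right) of the non-background pixels."""
    rows = [r for r, row in enumerate(image)
            if any(p != background for p in row)]
    if not rows:
        return None  # the slice holds background only
    cols = [c for c in range(len(image[0]))
            if any(row[c] != background for row in image)]
    return rows[0], rows[-1], cols[0], cols[-1]

def crop_to_roi(image, box):
    """Keep only the pixels inside the bounding box."""
    top, bottom, left, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]

# Toy 5 x 6 slice: zeros are background, nonzero values are brain tissue.
slice_2d = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 7, 9, 0, 0],
    [0, 5, 8, 9, 6, 0],
    [0, 0, 6, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
box = roi_bounding_box(slice_2d)
roi = crop_to_roi(slice_2d, box)
pixels_before = len(slice_2d) * len(slice_2d[0])
pixels_after = len(roi) * len(roi[0])
print(box, pixels_before, pixels_after)
```

In this toy slice the clustering would process 12 pixels instead of 30, while every tissue pixel is kept, which is why the cropping saves time without discarding any tissue data.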
FIGURE 5 Size-based optimization using bounding box. A, Sample T1 weighted coronal image. B, Image with ROI bounding box. C, ROI.
ROI, region of interest
3.3.2 | Adjacency-based optimization

FCs of a slice are very close to those of the adjacent slices. Hence, the FCs of the current slice are used as the ICs of the next slice, as shown in the flowchart in Figure 1. With this type of initialization, the algorithm takes very few iterations to converge. The middle slice is the source slice that initiates the proposed work, and the clustering process continues toward the upper and lower end slices.

3.3.3 | Parallelism-based optimization

The proposed implementation on a single-threaded CPU is found to be time-consuming for large datasets (BRATS2012 and BRATS2014). The GPU is specialized hardware introduced by NVIDIA Corporation for addressing this computational problem. Under PBO, the KPSFCM algorithm is parallelized using the GPU-CUDA programming model based on the thread-per-voxel scheme.

Parallel KPSFCM method
Data parallelization on the GPU is a powerful method using the thread-per-voxel scheme. A heterogeneous CPU and GPU implementation gives a higher speedup ratio than running everything on the GPU alone.36 The parallel KPSFCM design consists of two parts: sequential, data-dependent code executed on the CPU (host) and parallel, data-independent code executed on the GPU (device). The sequential CPU code contains the centers' initialization, memory allocation on the device, data transfer from host to device, and the launch of the kernels. The GPU code contains three kernels for label creation, updating the membership function, and updating the centers. The block diagram of the proposed parallel KPSFCM is shown in Figure 6. The parallel KPSFCM clustering algorithm can be divided into the following three main stages, and the step-by-step procedure is given in Algorithm 2.
1. Initialization and data transfer
2. Label creation and update of the membership function
3. Update of the centers

Algorithm 2
Step 1: Inputs: MR volume data points, number of clusters (c), and centers ($v_i$).
Step 2: Allocate the required memory on device.
Step 3: Transfer the data from host memory to device memory.
Step 4: Calculate the number of thread blocks and grids from the number of voxels.
Step 5: Execute kernel 1 for labeled and unlabeled data creation with thread per voxel.
Step 6: Execute kernel 2 to update the membership matrix with thread per voxel.
Step 7: Execute kernel 3 to update the clusters' center with thread per image.
Step 8: If the stopping criterion given in Equation (8) is met for all images, go to step 9; else go to step 6.
Step 9: Transfer the final membership matrix from device to host memory.
Step 10: Release the device memory.
Step 11: Output: Segmented volume.

1. Initialization and data transfer

In the heterogeneous implementation, the CPU part holds the number of clusters (c) and the centers of all clusters, assigned from the knowledge-based histogram smoothing of each slice. The device's global memory is allocated for the volume data, membership matrix, and cluster centers. After the memory is allocated on the device, the data are transferred from host to device. All the arrays are defined in a 1D pattern, which helps to calculate the required number of CUDA blocks and threads. Then the CPU starts the kernel execution with a sufficient number of parallel threads. After the computation, the final membership matrix is transferred from the device to the host and the device memory is released.

2. Label creation and update of membership function

The host calls the three CUDA kernels one after another. The first kernel executes with thread per voxel and assigns the labeled and unlabeled patterns. The second kernel handles the heavy calculations, such as the Euclidean distance between each data point and center, and updates the membership values in parallel threads as given in Equation (6).

3. Update the centers

Two summation operations are needed to update the centers ($v_i$), as given in Equation (7). Such a strong dependency makes parallelizing this step of the algorithm infeasible. One solution to this problem is to use thread per image within the third kernel. Thread per image executes the serial code inside the third kernel to perform the two summation operations. The reason for using thread per image within the kernel is to avoid the huge data transfer between the CPU and GPU that the two summation operations would otherwise require. One disadvantage of using a GPU is the cost of transferring data between the CPU and GPU. This takes place over a PCI-Express bus, which has a maximum transfer rate of 2 GB/s, a factor of 87 less than the memory bandwidth of the onboard QUADRO GPU memory. The host executes the second and third kernels repeatedly until all the images satisfy the termination condition given in Equation (8).

4 | SYSTEMS, MATERIALS, AND METRICS

The configurations of the CPU and GPU systems used in our experiment are given in Table 1. This experiment uses both clinical and online datasets. The real-time clinical datasets were collected from KGS Scan Centre and Meenakshi Mission Hospital & Research Centre (MMHRC), Madurai. Online datasets are collected from the Internet brain segmentation repository

TABLE 1 Configurations of the CPU and GPU systems used in the experiment

| Features | CPU | GPU |
|---|---|---|
| Processor name | Intel i5-2500 | NVIDIA Quadro K5000 |
| Speed | 3.4 GHz | 1.4 GHz |
| Count | 1 | 8 |
| Number of cores | 4 | 1536 (8 × 192) |
| Memory | 4 GB | 4 GB |
| Operating system | Windows 8 64 bit | Windows 8 64 bit |
| Programming language | C | CUDA 7.5 |
| Graphics clock | 810 MHz | 706 MHz |
| Memory bandwidth | 21 GB/s | 173 GB/s |
| Power consumption | 95 W | 122 W (auxiliary power required) |
| Transistor count | 1400 million | 3540 million |
| Others | --- | Compute capability version 3.0; memory clock 5.4 GHz; max grid dimension (2 147 483 647, 65 535, 65 535); max thread dimension (1024, 1024, 64); registers per block 49 152; threads per block 1024 |

Abbreviations: CPU, central processing unit; GPU, graphics processing unit.
Volume No. Dataset name Gender Age First slice Last slice Total number of slices
1 1_24 Female 35 −1 63 65
2 2_4 Female 34 1 65 65
3 4_8 Female 29 7 67 61
4 5_8 Female 20 1 60 60
5 6_10 Male 22 1 63 63
6 7_8 Male 29 1 60 60
7 8_4 Male 27 1 63 63
8 11_3 Male 28 1 63 63
9 12_3 Male 38 1 63 63
10 13_3 Male 32 1 63 63
11 15_3 Male 31 1 60 60
12 16_3 Female 36 1 60 60
13 17_3 Female 29 1 63 63
14 100_23 Female 23 1 63 63
15 110_3 Male 25 0 63 64
16 111_2 Male 27 0 63 64
17 112_2 Male 32 1 63 63
18 191_3 Female 32 1 63 63
19 202_3 Female 28 1 63 63
20 205_3 Female 24 1 63 63
(IBSR) and the details are given in Table 2. The IBSR dataset contains 20 normal T1 coronal volumes with gold standard images, which were created by the Center for Morphometric Analysis at Massachusetts General Hospital and are used for brain tissue segmentation.37 The brain scans are commonly known in the literature as IBSR20. Of the 20 subjects, 10 were male and 10 were female. The subjects' ages lie between 20 and 38. The volume size is nearly 256 × 256 × 60, with a voxel resolution of 1 mm × 1 mm × 3 mm. Some datasets (2_4, 4_8, 5_8, 6_10, 15_3, 16_3, and 17_3) were affected by low-contrast scans and intensity nonuniformity artifacts caused by magnetic fields, radio frequency coils, and noise factors.

The performance of KPSFCM for segmenting brain tissues is estimated by the Dice coefficient.38 The Dice value ranges from zero for complete disagreement to one for complete agreement. The Dice coefficient has been widely accepted by the medical community for evaluating the accuracy of segmented images.39 The Dice coefficient is computed as:

$$\text{Dice coefficient} = \frac{2 \times TP}{2 \times TP + FP + FN} \tag{9}$$

where TP is the number of true positives, that is, the number of pixels that overlap between the segmented result and the ground truth, FP is the number of false positives, and FN is the number of false negatives.

5 | RESULTS AND DISCUSSION

The experiments were carried out by applying KPSFCM and its optimized variants, namely KPSFCM+SBO, KPSFCM+ABO, and KPSFCM+SBO+ABO, to the material pool. The computational time of the parallel implementation in the GPU-CUDA programming model, named KPSFCM+PBO, is compared with that of the optimized CPU implementation.

The qualitative results on the clinical datasets are discussed before those on the online repository. Qualitative results of KPSFCM on the clinical dataset are shown in Figure 7. The first row shows the original MR images of the clinical datasets and the second row displays the segmented results in terms of GM and WM. For the clinical dataset, gold standard results are not available for computing quantitative results.

Figure 8 shows the segmentation results of various methods on a sample image. GM is shown in green, WM in blue, CSF in red, and the background in black. KPSFCM produces segmentation results close to the gold standard, marked with a circle in Figure 8C. Segmentation results of KPSFCM on IBSR20 are shown in Figure 9. The dataset names are given on the x-axis and their average Dice values on the y-axis. GM ranges from 0.64 (6_10) to 0.9 (13_3), giving a mean value of 0.81. WM ranges from 0.68 (16_3) to 0.87 (13_3), giving a mean value of 0.80.
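The per-tissue Dice values reported here follow Equation (9) and can be computed as in the short sketch below. The label coding (0 for background, 1 for CSF, 2 for GM, 3 for WM), the flattened toy label lists, and the convention of returning 1.0 when a tissue is absent from both images are assumptions made for illustration.

```python
def dice_coefficient(segmented, truth, label):
    """Dice = 2*TP / (2*TP + FP + FN) for one tissue label, as in Equation (9)."""
    tp = fp = fn = 0
    for s, t in zip(segmented, truth):
        if s == label and t == label:
            tp += 1          # pixel labeled as the tissue in both images
        elif s == label:
            fp += 1          # tissue only in the segmented result
        elif t == label:
            fn += 1          # tissue only in the ground truth
    denominator = 2 * tp + fp + fn
    # Assumed convention: a tissue absent from both images counts as agreement.
    return 2.0 * tp / denominator if denominator else 1.0

# Toy flattened label images: 0 background, 1 CSF, 2 GM, 3 WM (assumed coding).
segmented = [0, 2, 2, 3, 3, 3, 1, 2]
truth     = [0, 2, 3, 3, 3, 3, 1, 1]

dice_gm = dice_coefficient(segmented, truth, 2)
dice_wm = dice_coefficient(segmented, truth, 3)
dice_csf = dice_coefficient(segmented, truth, 1)
print(dice_gm, dice_wm, dice_csf)
```

For the toy labels, GM has TP = 1, FP = 2, FN = 0, so its Dice value is 2/4 = 0.5; WM has TP = 3, FP = 0, FN = 1, giving 6/7; CSF has TP = 1, FP = 0, FN = 1, giving 2/3.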
FIGURE 8 Brain tissue segmentation results by various methods on sample image. GM is shown in green, WM in blue, CSF in red and
background in black. A, Original image. B, Gold standard. C, Proposed KPSFCM method. D, K-means. E, FCM. F, FFCM. FCM, fuzzy c-means;
GM, gray matter; KPSFCM, knowledge-based partial supervision fuzzy c-means; WM, white matter [Color figure can be viewed at
wileyonlinelibrary.com]
CSF segmentation results lie in the range from 0.1 (13_3) to 0.5 (5_8), giving a mean value of 0.31. CSF exhibited poor results for two reasons. The first reason is that CSF and GM have close intensities in T1-weighted MRI, which causes misclassification in CSF segmentation.26 The second reason is that CSF has void black intensity in T1-weighted MRI, which is considered as background in some datasets.

TABLE 3 Average Dice coefficient (μ ± σ) values of existing methods and proposed method
Abbreviations: CSF, cerebrospinal fluid; GM, gray matter; WM, white matter.

Segmentation accuracy of KPSFCM is compared with 14 state-of-the-art methods; those results were gathered from Kasiri et al26 and Valverde et al40 and are given in Table 3. The Dice value is given in the form of the mean (μ) followed by the SD (σ). If two methods have the same mean value, then the one with the smaller SD is considered the better. KPSFCM yields higher Dice values for GM, WM, and CSF segmentation than the state-of-the-art methods. Some existing methods do not consider CSF segmentation, and in these CSF is combined with GM. A plot of the values given in Table 3 is shown in Figure 10. From Figure 10, all the proposed method variants yielded better and more consistent results than the state-of-the-art methods.

FIGURE 10 Stock chart of the results produced by various methods for IBSR20 evaluated using the Dice coefficient (μ ± σ). Results of A, GM; B, WM; and C, CSF. CSF, cerebrospinal fluid; GM, gray matter; WM, white matter [Color figure can be viewed at wileyonlinelibrary.com]

The overall performance of all the methods, measured in terms of the Dice value, the number of iterations, and the computation time of traditional FCM, KPSFCM, and its three variants (KPSFCM+SBO, KPSFCM+ABO, and KPSFCM+SBO+ABO), is given in Table 4. KPSFCM and its three variants give very similar segmentation results. The SBO technique causes a small deviation in KPSFCM+SBO and KPSFCM+SBO+ABO: SBO removes more background data points and changes the cluster centers of the CSF region. This is why the CSF segmentation accuracy goes down in both methods when compared with KPSFCM. The ABO technique reduces the processing time and does not affect the segmentation accuracy.

Abbreviations: CSF, cerebrospinal fluid; GM, gray matter; KPSFCM, knowledge-based partial supervision fuzzy c-means; WM, white matter.

The average number of iterations taken by the traditional FCM and KPSFCM is given in Table 4. The stopping criterion of all methods is the same, as given in Equation (8). The traditional FCM with random initialization took an average of 12 iterations per dataset. KPSFCM and KPSFCM+SBO took nine iterations per dataset. KPSFCM+ABO and the optimized KPSFCM+SBO+ABO took four iterations per dataset to converge. ABO reduces the number of iterations needed to converge and makes the computation faster. The average number of iterations taken by KPSFCM+SBO+ABO is about half that of the KPSFCM method and one third that of traditional FCM.
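The effect of ABO on the iteration count can be illustrated with a toy experiment. The sketch below is not KPSFCM: it runs plain fuzzy c-means with fuzzifier 2 and no labeled patterns on scalar gray values, updates the centers as in Equation (7), and stops by the criterion of Equation (8). The two synthetic "slices", the fixed poor starting guess that stands in for random initialization, and the tolerance `eps` are assumptions.

```python
def fcm_1d(data, centers, eps=0.01, max_iter=100):
    """Plain fuzzy c-means (m = 2) on scalar data; returns (centers, iterations)."""
    centers = list(centers)
    for iteration in range(1, max_iter + 1):
        num = [0.0] * len(centers)
        den = [0.0] * len(centers)
        for x in data:
            # Inverse squared distances; the small constant avoids division by zero.
            inv = [1.0 / ((x - v) ** 2 + 1e-12) for v in centers]
            total = sum(inv)
            for i, w in enumerate(inv):
                u2 = (w / total) ** 2      # squared membership of x in cluster i
                num[i] += u2 * x
                den[i] += u2
        updated = [n / d for n, d in zip(num, den)]   # center update, Equation (7)
        change = sum(abs(v - nv) for v, nv in zip(centers, updated))
        centers = updated
        if change < eps:                   # stopping criterion, Equation (8)
            break
    return centers, iteration

def toy_slice(shift):
    """Gray values of three tissue modes; adjacent slices differ by a small shift."""
    return [mu + t + shift for mu in (60, 120, 180) for t in range(-5, 6)]

middle = toy_slice(0)
neighbor = toy_slice(2)

# The middle slice starts from histogram peaks; its FCs seed the adjacent slice.
fcs_middle, _ = fcm_1d(middle, [60.0, 120.0, 180.0])
_, cold_iters = fcm_1d(neighbor, [10.0, 20.0, 30.0])   # poor initial guess
warm_fcs, warm_iters = fcm_1d(neighbor, fcs_middle)    # ABO-style initialization
print(cold_iters, warm_iters)
```

Because the adjacent slice differs only slightly from the middle slice, the run seeded with the middle slice's FCs starts close to its fixed point and needs fewer iterations than the run started from the poor guess, which mirrors the reduction reported for the ABO variants.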
The average computational time taken by the traditional FCM, KPSFCM, KPSFCM+SBO, KPSFCM+ABO, and KPSFCM+SBO+ABO methods on IBSR20 is given in Table 4. The traditional FCM took 33 s/dataset and the proposed KPSFCM took 503.7 s/dataset. KPSFCM+SBO and KPSFCM+ABO took an average of 80.2 s/dataset and 155.7 s/dataset, respectively. The average time taken by the optimized KPSFCM+SBO+ABO is 39 s/dataset. The image size plays a major role in computational time. SBO reduces the image size to the ROI, which helps the algorithm execute much faster. KPSFCM+SBO+ABO yields a 12.9-fold speedup over the plain KPSFCM method. KPSFCM+SBO+ABO is slower than traditional FCM because KPSFCM includes knowledge-based computation.

The optimized KPSFCM is taken for the GPU implementation under PBO. In the GPU implementation, threads are created based on the thread-per-voxel scheme. The average computation time for the IBSR20 dataset with CPU-C and GPU-CUDA is shown in Figure 11. The average computation time is 39 s/dataset on the CPU and 2.01 s/dataset on the GPU. The GPU-CUDA implementation is 19.4 times faster than the optimized CPU implementation.

6 | CONCLUSIONS

In this work, a fully automatic and knowledge-based system is developed for enhancing the performance of PSFCM to segment the brain tissues. The accuracy of the segmented results was calculated using gold standards and compared with state-of-the-art methods. KPSFCM used three optimization techniques at the algorithm level and the hardware level to reduce the segmentation time considerably. The GPU implementation of the present work is 19.4 times faster than the optimized CPU implementation.

ACKNOWLEDGMENTS

We gratefully acknowledge the support of NVIDIA Corporation Private Ltd., USA, with the donation of the QUADRO K5000 GPU used for this research. The authors wish to thank all the reviewers and the associate editor for their fruitful comments and suggestions for significant improvement of the manuscript.

ORCID

Thiruvenkadam Kalaiselvi https://orcid.org/0000-0002-0197-2077

REFERENCES

1. Prince JL, Links JM. Medical Imaging Signals and Systems. 2nd ed. Pearson Prentice Hall; 2014.
2. Schmitter D, Roche A, Marechal B, et al. An evaluation of volume-based morphometry for prediction of mild cognitive impairment and Alzheimer's disease. Neuroimage Clin. 2015;7:7-17.
3. Cardenas VA, Studholme C, Gazdzinski S, Durazzo TC, Meyerhoff DJ. Deformation based morphometry of brain changes in alcohol dependence and abstinence. Neuroimage. 2007;34(3):879-887.
4. Savitz JB, Rauch SL, Drevets WC. Clinical application of brain imaging for the diagnosis of mood disorders: the current state of play. Mol Psychiatry. 2013;18(5):528-539.
5. Shirvany Y, Porras AR, Kowkabzadeh K, Mahmood Q, Lui HS, Persson M. Investigation of brain tissue segmentation error and its effect on EEG source localization. Proc Int Conf IEEE Eng Med Biol Soc (EMBC). 2012;1522-1525.
6. Stoffers D, Sheldon S, Kuperman JM, Goldstein J, Corey-Bloom J, Aron AR. Contrasting gray and white matter changes in preclinical Huntington disease an MRI study. Neurology. 2010;74(15):1208-1216.
7. Ueda K, Fujiwara H, Miyata J, et al. Investigating association of brain volumes with intracranial capacity in schizophrenia. Neuroimage. 2010;49(3):2503-2508.
8. Diniz PRB, Murta-Junior LO, Brum DG, Araujo DB, Santos AC. Brain tissue segmentation using q-entropy in multiple sclerosis magnetic resonance images. Braz J Med Biol Res. 2010;43(1):77-84.
9. Kalaiselvi T, Sriramakrishnan P, Somasundaram K. Survey of using GPU CUDA programming model in medical image analysis. Inform Med Unlocked. 2017;9:133-144.
10. Kalaiselvi T, Sriramakrishnan P. Rapid brain tissue segmentation process by modified FCM algorithm with CUDA enabled GPU machine. Int J Imaging Syst Technol. 2018;28(3):163-174.
11. Pedrycz W. Fuzzy clustering with partial supervision. IEEE Trans Syst Man Cybern. 1997;27(5):787-795.
12. Al-Ayyoub M, Abu-Dalo AM, Jararweh Y, Jarrah M, Sa'd MA. A GPU-based implementations of the fuzzy c-means algorithms for medical image segmentation. J Supercomput. 2015;71(8):3149-3162.
13. Eklund A, Andersson M, Knutsson H. True 4D image denoising on the GPU. J Biomed Imaging. 2011;(8):1-16.
14. Olmedo E, De La Calleja J, Benitez A, Medina MA. Por Computadora, L. D. P. Point to point processing of digital images using parallel computing. Int J Comput Sci. 2012;9(3):251-276.
15. Massanes F, Cadennes M, Brankov JG. Compute-unified device architecture implementation of a block-matching algorithm for multiple graphical processing unit cards. J Electronic Imaging. 2011;20(3):033004.
16. Kalaiselvi T, Somasundaram K, Vijayalakshmi S. Segmentation of brain portion from MRI of head scans using K-means cluster. Int J Comput Intelligence Inform. 2011;1(1):75-79.
17. Pal NR, Bezdek JC. On cluster validity for the fuzzy c-means model. IEEE Trans Fuzzy Syst. 1995;3(3):370-379.
18. Biniaz A, Abbasi A, Shamsi M. Fast FCM algorithm for brain MR image segmentation. Int Conf Fuzzy Inform Eng. 2012;1:1-8.
19. Zhang Y, Brady M, Smith S. Segmentation of brain MR images through a hidden Markov random field model and the expectation-maximization algorithm. IEEE Trans Med Imaging. 2001;20:45-57.
20. http://www.fil.ion.ucl.ac.uk/spm.
29. Ali N, Cherradi B, Abbassi A, et al. GPU fuzzy c-means algorithm implementations: performance analysis on medical image segmentation. Multimed Tools Appl. 2018;77(16):21221-21243.
30. Al-Ayyoub M, Abu-Dalo AM, Jararweh Y, Jarrah M, Sa'd MA. A GPU-based implementations of the fuzzy c-means algorithms for medical image segmentation. J Supercomput. 2015;71(8):3149-3162.
31. Shehab MA, Al-Ayyoub M, Jararweh Y. Improving FCM and T2FCM algorithms performance using GPUs for medical images segmentation. Paper presented at: Proceedings of the International Conference on Information and Communication Systems (ICICS); 2015:130-135.
32. Li H, Yang Z, He H. An improved image segmentation algorithm based on GPU parallel computing. J Softw. 2014;9(8):1985-1990.
33. Rowiska Z, Gocawski J. CUDA based fuzzy c means acceleration for the segmentation of n images with fungus grown in foam matrices. Image Process Commun. 2012;17(4):191-200.
34. Somasundaram K, Kalaiselvi T. Automatic brain extraction methods for T1 magnetic resonance images using region labeling and morphological operations. Comput Biol Med. 2011;41(8):716-725.
35. Bouchachia A, Pedrycz W. Enhancement of fuzzy clustering by mechanisms of partial supervision. Fuzzy Set Syst. 2006;157(13):1733-1759.
21. Ruan S, Jaggi C, Xue J, Fadili J, Bloyet D. Brain tissue classifica- 36. Kirk DB, Hwu WW. Programming Massively Parallel Processor:
tion of magnetic resonance images using partial volume modeling. A Hands-on Approach. Second ed. London: Elsevier; 2012.
IEEE Trans Med Imaging. 2000;19(12):1179-1187. 37. Internet Brain Segmentation Repository. Center for Morphometric
22. Tian D, Fan L. A brain MR images segmentation method based on Analysis at Massachusetts General Hospital. http://www.cma.mgh.
SOM neural network. Int Conf Bioinform Biomed Eng. 2007;1: harvard.edu/ibsr/index.html. Accessed August 1, 2018.
686-689. 38. Dice LR. Measures of the amount of ecologic association between
23. De Boer R, Vrooman HA, Van Der Lijn F, et al. White matter species. Ecology. 1945;26(3):297-302.
lesion extension to automatic brain tissue segmentation on MRI. 39. Guindon B, Zhang Y. Application of the Dice coefficient to accu-
Neuroimage. 2009;45(4):1151-1161. racy assessment of object-based image classification. Can J
24. Pham DL. Robust fuzzy segmentation of magnetic resonance Remote Sens. 2017;43(1):48-61.
images. Paper presented at: Proceedings of the 14th IEEE Sympo- 40. Valverde S, Oliver A, Cabezas M, Roura E, Llado X. X. Comparison
sium on Computer-Based Medical Systems; 2001; Bethesda, MD, of 10 brain tissue segmentation methods using revisited IBSR annota-
USA:127-131. tions. J Magn Reson Imaging. 2015;41(1):93-101.
25. Shattuck DW, Sandor-Leahy SR, Schaper KA, Rottenberg DA,
Leahy RM. Magnetic resonance image tissue classification using a
partial volume model. Neuroimage. 2001;13:856-876.
26. Kasiri K, Kazemi K, Dehghani MJ, Helfroush MS. A hybrid hier- How to cite this article: Sriramakrishnan P,
archical approach for brain tissue segmentation by combining Kalaiselvi T, Somasundaram K, Rangasami R. A
brain atlas and least square support vector machine. J Med Signals rapid knowledge-based partial supervision fuzzy c-
Sensors. 2013;3(4):232-243.
means for brain tissue segmentation with
27. Smith SM. Fast robust automated brain extraction. Hum Brain
CUDA-enabled GPU machine. Int J Imaging Syst
Mapp. 2002;17(3):143-155.
28. Smith SM, Jenkinson M, Woolrich MW, et al. Advances in func- Technol. 2019;1–14. https://doi.org/10.1002/ima.
tional and structural MR image analysis and implementation as 22335
FSL. Neuroimage. 2004;23:208-219.