Assignment 4
depicts the literature review of the existing big data classification methods.
Section 3 presents the proposed big data classification method, and Section 4
discusses its results. Finally, Section 5 concludes the paper.
The main aim of the research is to establish a big data classification model by
integrating the Exponentially Weighted Moving Average (EWMA) with the Bat
Algorithm (BA). The big data obtained from the distributed sources is fed to the
mapper phase, which performs feature selection using the E-Bat algorithm.
Effective feature selection enhances the classification accuracy. The selected
features are fed to the reducer for data classification, and the reducer uses a
Neural Network (NN), which is trained using the proposed E-Bat algorithm.
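As a point of reference for the EWMA component named above, the standard recurrence s_t = alpha * x_t + (1 - alpha) * s_(t-1) can be sketched as follows. This is only a minimal illustration of the EWMA itself. How the proposed method combines it with the bat update is not described in this section, and the function name `ewma` and the smoothing factor `alpha` are assumptions made for the example.

```python
def ewma(values, alpha=0.5):
    """Exponentially weighted moving average of a sequence.

    alpha is an assumed smoothing factor in (0, 1]; larger values
    give more weight to the most recent observation.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for x in values:
        if not smoothed:
            # The first smoothed value is taken to be the first observation.
            smoothed.append(float(x))
        else:
            # Blend the new observation with the previous smoothed value.
            smoothed.append(alpha * x + (1.0 - alpha) * smoothed[-1])
    return smoothed
```

With `alpha=0.5`, each smoothed value is the midpoint between the new observation and the previous smoothed value, so older observations fade out geometrically.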
4. Assumptions
5. Hypothesis
6. Research questions
7. Literature review
Eight works from the literature related to the data classification framework in the big data
environment are presented in Table 1.
Furthermore, the approach requires only one hyperparameter, which avoids the potential errors caused by excessive parameter settings.
Cost effective solution
8. Research Methodology
a) Sample
b) Research Design
c) Tools for data collection
9. Data Analysis (Methods)
12. Conclusion
The paper presents a big data classification method aimed at meeting the
rising demands of high volume, high velocity, high value, high veracity,
and huge variety. The classification is performed using the MapReduce
framework so that the data from the distributed sources is handled in
parallel. The MapReduce framework analyzes the big data in two steps to
yield the classified results. The first step is feature extraction, in
which the mappers extract the optimal features from the data using the
proposed E-Bat algorithm. The second step is classification, which is
performed in the reducers that are provided with the NN. The weights of
the NN are tuned optimally using the proposed EBatNN algorithm. The final
output of the MapReduce framework is the classified big data, which forms
the clusters for the whole big data. The experimentation of the proposed
big data classification is performed using four standard databases taken
from the UCI Machine Learning Repository.
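The two-step flow described above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the proposed method: the feature scoring in the mapper (gap between class means) is a placeholder for the E-Bat selection, and the perceptron in the reducer is a stand-in for the NN trained by EBatNN. The names `feature_scores`, `mapper`, `reducer`, and the parameters `k`, `epochs`, and `lr` are hypothetical, and labels are assumed to be binary (0 or 1).

```python
from collections import Counter

def feature_scores(rows):
    """Placeholder score (not E-Bat): gap between the class means per feature."""
    n_features = len(rows[0][0])
    sums = {0: [0.0] * n_features, 1: [0.0] * n_features}
    counts = {0: 0, 1: 0}
    for x, y in rows:
        counts[y] += 1
        for j, v in enumerate(x):
            sums[y][j] += v
    return [
        abs(sums[1][j] / max(counts[1], 1) - sums[0][j] / max(counts[0], 1))
        for j in range(n_features)
    ]

def mapper(partition, k):
    """Map step: pick the k best-scoring feature indices for one partition."""
    scores = feature_scores(partition)
    top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]
    return sorted(top), partition

def reducer(mapped, k, epochs=20, lr=0.1):
    """Reduce step: merge the mappers' selections, then train a classifier."""
    # Keep the k feature indices chosen by the most mappers.
    votes = Counter(j for selected, _ in mapped for j in selected)
    features = sorted(j for j, _ in votes.most_common(k))
    rows = [row for _, partition in mapped for row in partition]

    # Perceptron training (stand-in for the NN tuned by EBatNN).
    weights, bias = [0.0] * len(features), 0.0
    for _ in range(epochs):
        for x, y in rows:
            z = bias + sum(w * x[j] for w, j in zip(weights, features))
            error = y - (1 if z > 0 else 0)
            bias += lr * error
            weights = [w + lr * error * x[j] for w, j in zip(weights, features)]

    def predict(x):
        z = bias + sum(w * x[j] for w, j in zip(weights, features))
        return 1 if z > 0 else 0

    return features, predict
```

A caller would split the data into partitions of `(feature_vector, label)` pairs, apply `mapper` to each partition (in a real deployment, in parallel), and pass the list of mapper outputs to `reducer`, which returns the selected feature indices and a prediction function.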