2019 IEEE 4th International Conference on Computer and Communication Systems

A Machine Learning Algorithm TsF K-NN Based on Automated Data Classification for Securing Mobile Cloud Computing Model

Anunaya Inani1
Chakradhar Verma2
Computer Science and Engineering Department,
Computer Science and Engineering Department,
Gurukul Institute of Engineering & Technology1
University College of Engineering 2
Rajasthan Technical University, Kota, Rajasthan, INDIA
Email:[email protected]
Email:[email protected]

Suvrat Jain3
Computer Science Department,
Neerja Modi School3
Email:[email protected]

978-1-7281-1322-7/19/$31.00 ©2019 IEEE

Abstract - Mobile cloud computing (MCC) is a fast-growing technology area on which the research community has recently embarked. Today, mobile data can include financial transactions such as electronic payments, M-wallets, and sensitive multimedia content. The explosive volume of personal data on mobile devices has drawn more attention to secure data storage than to data privacy and confidentiality levels. In this scenario, machine learning (ML) plays an important role in electronic data management: it is always expensive and hard to manage data manually, without machine learning techniques that use metadata. Many machine learning algorithms have been proposed to address diverse data management issues, yet predicting which data in a document is top secret and which is public remains a challenging research task. The contribution of this article is to demonstrate how to secure the secrecy and privacy of mobile data storage in a cloud communication framework through automatic data classification. Mobile training datasets are classified with the Training dataset Filtration Key Nearest Neighbor (TsF-KNN) classifier, which classifies data by the confidentiality level of the record with higher accuracy and better running time than traditional K-NN algorithms. The confidential category is then secured with existing cryptographic tools to assure data privacy and confidentiality. Simulation results demonstrate that this reduces overall cost and processing time while increasing system performance and sustainability.

Keywords - Mobile Cloud Computing; Machine Learning; Data Security; Asymmetric Key Cryptography; Data Classification.

I. INTRODUCTION
Mobile cloud computing is a rapidly evolving field that involves a wide range of technologies and applications and touches almost every customer through mobile devices [1]. MCC removes geographical restrictions and enables clients to do what they want anywhere and at any time over the internet [2]. Mobile devices face challenges such as low processing power, limited performance, short battery life, lack of quality of service (QoS), limited energy, a shared remote network, and limited storage capacity [3]. Mobile applications therefore push computing towards the cloud, since smartphone users place heavy demands on processing power and data storage. Another significant aspect of MCC is the security and privacy of mobile data storage. MCC provides data sharing between data service operators and mobile clients, and this data is stored in various geographical locations [4]. Such mobile data is therefore exposed to higher risks to confidentiality, integrity, and availability than in the traditional computational model. Mobile clients hesitate to share confidential documents with intermediary cloud storage providers because backup and restore operations pass through unknown hands. Attackers can compromise consumers' trusted data through illegal access [5]. In addition, clients are concerned about their private data being compromised by high-level attacks against user-specific applications, and rely on mechanisms such as intrusion detection systems (IDS) deployed across cloud storage systems [6]. Compromise and exploitation of this sensitive data has a serious negative impact on clients, whether individuals or organizations. We must therefore protect such valuable mobile data in cloud environments. Existing mobile cloud storage frameworks [7] apply security algorithms to encrypt data without considering its confidentiality level, which may be impractical. Treating public and confidential data in the same fashion and at the same security level incurs unnecessary operating cost and increases processing time.

Machine learning is an application of artificial intelligence (AI) that gives systems the ability to learn and improve automatically from experience without being explicitly programmed. Machine learning focuses on the development of computer programs that can access data and use it to learn for themselves. Data mining is one of the most significant applications of machine learning. Every instance in a dataset used by machine learning algorithms is represented by a few predictive features [8]. Machine learning
algorithms are often categorized as supervised or unsupervised. Supervised machine learning algorithms apply what has been learned in the past to new data, using labelled examples to predict future events. In contrast, unsupervised machine learning algorithms are used when the training information is neither classified nor labelled [9]. The K-Nearest Neighbour (K-NN) algorithm is one of the simplest classification algorithms. K-NN provides good accuracy and adapts more flexibly to new developments than other algorithms. Its simplicity rests on the Euclidean distance and cosine similarity functions that are generally used to classify the data. The objective of this research is to determine the confidentiality class of the data in a file through a machine learning algorithm, and to reduce the encryption and decryption workload by encrypting only critical confidential data. The proposed framework increases mobile device efficiency and decreases storage complexity and the running time of encrypting and decrypting mobile data. Combined with ML algorithms, it yields a more secure and effective model that contributes mobile data secrecy along with integrity in the mobile ecosystem. The paper is organized as follows: related work is presented in Section II; our proposed secure mobile cloud model with the efficient TsF K-NN machine learning algorithm is in Section III; results and discussion are presented in Section IV; and the conclusion is in Section V.

II. RELATED WORK AND EXISTING SOLUTIONS
The MCC paradigm allows consumers to access and manage their application data from smartphones, improving the capabilities of smart devices by moving storage- and compute-intensive tasks from the mobile device to the cloud. The major security concerns in MCC are mobile application data security, client authentication, and privacy. The smartphone may also be a source of location tracking, in part through location-based services [10]. Because of the poor computing capability of mobile devices, encryption algorithms with large keys are not feasible to run on the mobile device. MCC requires a fast encryption method with the least storage, processing, and communication overhead [11]. MCC offers its clients great advantages in accessing data from anywhere, on any device connected to the internet. In addition, cloud storage is more affordable because there is no need to purchase or maintain costly hardware. It can also hold backups at any time, so that data can be recovered after accidental damage or loss.

Mobile User's Data Privacy Threats and Defense Mechanisms: Jalaluddin Khan et al. [12] present a survey of various defence mechanisms with respect to mobile cloud security, data secrecy issues, threats, and vulnerabilities. The mobile ecosystem is an immense, developing field wherever it meets the internet, and almost every conceivable safety step can be hindered by cyber attackers through malicious code injected into smartphone applications. The physical security of mobile cloud storage is the highest consideration for protecting mobile devices. If cloud storage devices are compromised, the user's extremely critical data stored on these high-risk assets, such as sensitive personal documents (address, location, contact details, official secret files), may be lost. Mobile browser vulnerabilities are the most common vector for cyber attacks and phishing scams, in which hackers pose as authentic authorities in cyberspace, for example on social networks, acquire critical information such as login passwords and secret PINs, gain full access, and expose the mobile user's highly confidential data. In the mobile ecosystem, the multiuser login service (MLS) is another major source of mobile data leakage, where a user's confidential information is used with single sign-on (SSO) across multiple social media applications such as Facebook, LinkedIn, and Twitter. The survey [12] outlines the advantages of biometric methods for verifying clients. A biometric strategy is the automated use of behavioural or physiological characteristics to determine or verify identity. Machine learning algorithms play an important role in protecting the privacy of critical data.

A main improvement to the K-NN method was proposed by Hart [13] in 1968 to decrease the training storage requirement and improve the efficiency of K-NN. This method was named Condensed Nearest Neighbours (CNN). In 1972, Gates [14] introduced the Reduced Nearest Neighbours (RNN) rule, an extension of the CNN method. RNN is more costly to compute than CNN, yet it improves the classification result. Later, in the classification stage, RNN is more affordable than CNN in terms of computing and data storage. In 2002, Maleq Khan and colleagues [15] proposed two new techniques to improve the accuracy and speed of K-NN. They proposed a similarity metric called Higher Order Bit Similarity (HOBS), which considers similarity in the most significant bit positions, starting from the leftmost, highest-order bit. Angiulli [16] proposed a fast variant of K-NN called the Fast Condensed Nearest Neighbour (FCNN) rule. This algorithm does not depend on the order of the data and has low complexity.

Data classification is an essential procedure that classifies data assets by a value reflecting the risks associated with their storage, processing, transmission, and sensitivity. The baseline criterion of data classification is the filtering of information according to its specific sensitivity and security requirements. Information must be classified as public (non-sensitive) or highly confidential according to the risk of illegal exposure, alteration, and access. Unfortunately, some organizations do not categorize their client data and treat all of it as highly classified; as a result, they may frequently assign incorrect parameters to protect such critical data. In addition, employees may share valuable information in the public domain without knowing its adverse effect, because there are no clear guidelines on data classification procedures that state what to do with which data. A machine learning data classification scheme allows us to protect such crucial data for enterprises and their users.
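To make the condensing idea behind Hart's CNN [13], surveyed earlier in this section, more concrete, the following is a minimal Python sketch under stated assumptions. It is not the implementation of [13] or of any cited work: the function names, the use of Euclidean 1-NN, and the two-dimensional toy samples with "public" and "confidential" labels are all hypothetical choices made for illustration.

```python
import math

def nearest_label(store, point):
    """Label of the stored sample closest to `point` (1-NN, Euclidean distance)."""
    _, label = min(store, key=lambda s: math.dist(s[0], point))
    return label

def condense(samples):
    """Condensed Nearest Neighbour sketch in the spirit of Hart [13].

    `samples` is a list of (feature_tuple, label) pairs. Samples that the
    current store misclassifies are added to it, and passes repeat until a
    full pass adds nothing.
    """
    store = [samples[0]]                  # seed the store with the first sample
    changed = True
    while changed:
        changed = False
        for sample in samples:
            features, label = sample
            if sample in store:           # each sample is stored at most once
                continue
            if nearest_label(store, features) != label:
                store.append(sample)
                changed = True
    return store

# Hypothetical toy data: two well-separated classes.
training = [
    ((0.0, 0.0), "public"), ((0.2, 0.1), "public"), ((0.1, 0.3), "public"),
    ((5.0, 5.0), "confidential"), ((5.2, 4.9), "confidential"),
    ((4.8, 5.1), "confidential"),
]
store = condense(training)
print(len(store), "of", len(training), "samples kept")
```

On this toy data the store keeps one sample per class, which illustrates why condensing reduces the storage and the number of distance computations needed at classification time.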

In this research article, we concentrate on the following problems that adversely affect mobile cloud computing services. First, mobile subscribers worry about their confidential mobile data, for example mobile banking threats and malicious add-ons introduced by attackers over mobile wireless networks. Second, encrypting all data is wasteful and impractical, because it takes a great deal of time when done without knowledge of the secrecy and privacy level of the data. We therefore propose a model in which mobile subscribers classify their critical mobile data with a machine learning algorithm, TsF-KNN [17], and then use public key cryptographic algorithms to assure the secrecy and integrity of the top-secret data. In what follows, we introduce our proposed model based on machine-classified datasets in detail, drawing on the literature review that guided us in constructing the model.

III. PROPOSED WORK
Mobile users rely on several applications to fulfil their everyday needs. Cyber attacks are an enormous problem, with a harmful impact on millions of devices connected through cyberspace. The security risks to the large amount of mobile data grow as demand increases day by day. Standard conventions are therefore needed to protect clients' valuable data in the mobile ecosystem from attackers. To achieve these goals, machine learning algorithms such as TsF K-NN [17] play an important role in applying data security where it is most needed. This automatic classification process is more accurate and more affordable for filtering critical data over wide area networks, allowing authorities to meet their critical data management needs whenever required. Another important aspect is that, when classified data is handled by external assets, knowing the data category makes it possible to provide better security between mobile devices and mobile cloud data centers.

Our proposed model thus falls into two parts. The first part is the data classification process, which was already proposed by researchers [17]; we apply the TsF-KNN algorithm to our training datasets and, after the filtration process, obtain the classified data. In the second part, we use asymmetric key cryptographic algorithms such as ECC, RSA, and ElGamal with different key sizes to assure the secrecy and integrity of the top-secret data. This data categorization makes resource utilization more efficient and reduces data processing time.

A. Data Classification Process
The traditional K-NN algorithm has high computational complexity in the classification step, which lowers the efficiency of data filtration. To reduce this complexity at the testing phase, the authors of [17] proposed a method that focuses on one specific training dataset from the large pool of training datasets. In this method, the researchers [17] first predict document properties, such as the file name, based on the file metadata and the properties of the test file, and determine the nearest sub-training dataset. The classifier then uses this specialised training dataset, instead of the entire training dataset, to predict the class of the document data. The data in a document may be mixed in nature, that is, it may fall into multiple categories. File attributes can therefore be used to predict whether a piece of data falls into the confidential or the non-confidential category. The data of each attribute in a file has a different value and importance with regard to data security. Once the security level of the document data has been identified, a stronger security strategy can be applied to the confidential data in the record. It is important to understand which data in a document are confidential and which are non-confidential before outsourcing the record to internet-based storage servers. Every document may be separated into three major parts: file metadata, the list of attributes, and the data of the attributes. In the TsF-KNN algorithm, the authors of [17] proposed a new filtration system that reduces the training dataset load on the classifier. This system, named the "bi-gram with Dice coefficient" model, is applied before the K-NN classifier. This procedure reduces the time complexity of K-NN at the testing phase and improves data classification accuracy. The aim of TsF-KNN is to integrate K-NN with a bi-gram model to increase the classification efficiency of K-NN and improve its accuracy. The main advantage of this filtration process is that it reduces the number of iterations in the algorithm and the computational cost. The mobile training sample datasets for our experiment were extracted from the Open Mobile Data by MobiPerf Google repository [18], and a few of the datasets were created by the researchers.

B. Proposed Secure Mobile Cloud Computing Model Based on Machine Classified Data
There are two types of data security management procedures, of which the machine learning data classification method is used in the proposed framework, as shown in Fig. 1.

1) Public Data or Non-Confidential Data
Public data is general data that is openly available to all for sharing and copying with no privacy issues, for example historical data, surveys, and news media. This data does not require any special safeguards and is openly published through approved channels. We therefore propose that the secure hypertext transfer protocol (HTTPS) and the Transport Layer Security (TLS) protocol are adequate for encrypting and decrypting such data in client-server transmissions.

2) Highly Confidential Data or Top Classified Data
Highly confidential data is critical data that is protected by legislation or by financial and business agreements, such as research data, HRD employee data, medical records, donor information, bank details, and payment card data. Business organizations and individuals must safeguard the personal and internal data they retain, because of its copyright, sensitivity, and secrecy, against illegal alteration, communication, storage, and usage.
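The two parts of the model described in Sections III-A and III-B (training-set filtration with character bi-grams and the Dice coefficient, K-NN classification on the selected subset, and a protection choice per class) can be sketched in Python as follows. This is an illustrative sketch only, not the TsF-KNN implementation of [17]: the sub-training sets, the attribute names, the use of the Dice score as the K-NN similarity, and the return values of `protection_for` are assumptions introduced here.

```python
from collections import Counter

def bigrams(text):
    """Character bi-grams of a lower-cased string, e.g. 'data' -> ['da', 'at', 'ta']."""
    t = text.lower()
    return [t[i:i + 2] for i in range(len(t) - 1)]

def dice(a, b):
    """Dice coefficient: 2 * shared bi-grams / (bi-grams in a + bi-grams in b)."""
    ca, cb = Counter(bigrams(a)), Counter(bigrams(b))
    total = sum(ca.values()) + sum(cb.values())
    if total == 0:
        return 0.0
    return 2.0 * sum((ca & cb).values()) / total

def select_training_set(file_name, training_sets):
    """Filtration step: pick the sub-training set whose name best matches the file name."""
    return max(training_sets, key=lambda name: dice(file_name, name))

def knn_label(attribute, labelled, k=3):
    """Majority vote of the k most similar labelled attributes.
    Using the Dice score as the similarity is an assumption of this sketch."""
    ranked = sorted(labelled, key=lambda item: dice(attribute, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def protection_for(label):
    """Section III-B routing: asymmetric encryption for confidential data, TLS only for public data."""
    return "asymmetric-encryption" if label == "confidential" else "tls-only"

# Hypothetical sub-training sets, keyed by a file-name pattern.
TRAINING_SETS = {
    "employee_records": [
        ("employee salary", "confidential"), ("salary account", "confidential"),
        ("employee bank account", "confidential"), ("department name", "public"),
        ("office department", "public"), ("office building", "public"),
    ],
    "news_archive": [
        ("headline", "public"), ("publication date", "public"),
        ("author email", "confidential"),
    ],
}

chosen = select_training_set("employee_record_2019.csv", TRAINING_SETS)
for attribute in ("salary of employee", "department"):
    label = knn_label(attribute, TRAINING_SETS[chosen])
    print(chosen, "|", attribute, "|", label, "|", protection_for(label))
```

Because the filter discards every sub-training set except the best match, the K-NN step compares the test attribute with six labelled items here rather than all nine, which is the kind of saving the filtration step is meant to provide on larger pools.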

Only an authorized authority may obtain this data, for business or personal use, whenever required. Table I shows the data classification with different categories.

Figure 1. Proposed model.

TABLE I. DATA CLASSIFICATION IN DIFFERENT CATEGORIES

S.No  Data Types            Sensitivity Level
1     SSN                   High
2     Address               High
3     Phone Number          High
4     Birthplace            High
5     Gender                High
6     Employee salary       High
7     Credit Card Number    High
8     Bank account Number   High
9     Email address         High
10    Other Details         Low

C. Asymmetric Key Cryptographic System
Modern public key cryptography systems such as elliptic curve cryptography (ECC) are among the most suitable for handling such critical data and preventing illegal access, and ECC is recommended by the U.S. National Security Agency (NSA). In contrast, traditional cryptosystems such as RSA, which is based on prime factorization, and ElGamal, which is based on the discrete logarithm problem, consume more power and time, and their large key sizes mean that encrypting and decrypting large datasets requires more memory. ECC is efficient and scalable, and processes data significantly faster with small key sizes at an equal level of security, which makes it more attractive for mobile devices, which are constrained in processing power, storage, and network connectivity. To maintain such secrecy levels, we use several asymmetric key cryptographic algorithms to secure this valuable data.

IV. EXPERIMENTAL SETUP AND RESULTS
We developed a Java simulator in the NetBeans SDK to measure the performance of our proposed model. The experiments were conducted on Windows 7 Professional with an Intel(R) Core i3 processor running at 2.43 GHz and 4 GB of RAM. We used the built-in cryptography classes of the Java environment to simulate ECC, RSA, and ElGamal with different key sizes for mobile training dataset files of various sizes. We evaluated the performance of these public key cryptographic algorithms with various key size parameters on data blocks from multiple text files. ECC offers significant encryption and decryption time benefits over the other asymmetric algorithms, RSA and ElGamal, as shown in Fig. 2 and Fig. 3.

Figure 2. Performance evaluation of encryption process using proposed framework.

Figure 3. Performance evaluation of decryption process using proposed framework.

ECC with a small key size of 160 bits meets higher-level security demands amid the explosive growth of internet-enabled mobile devices with wireless connectivity across the globe.
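The kind of measurement described in this section, timing encryption and decryption of data blocks for different key sizes, can be illustrated with the following Python sketch. It is not the Java simulator used in the experiments and it says nothing about ECC or ElGamal performance: it times unpadded textbook RSA only, the moduli are built from well-known Mersenne primes purely to obtain two different key sizes (they are not secure RSA primes), and all names and block contents are hypothetical.

```python
import time

# Known Mersenne primes, used only to obtain moduli of two different sizes.
# They are NOT suitable as real RSA primes.
PRIME_PAIRS = {
    "196-bit modulus": (2**89 - 1, 2**107 - 1),
    "1128-bit modulus": (2**521 - 1, 2**607 - 1),
}
E = 65537  # commonly used public exponent

def make_key(p, q):
    """Return (n, d) for textbook RSA with public exponent E."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))   # modular inverse, Python 3.8+
    return n, d

def encrypt(m, n):
    return pow(m, E, n)

def decrypt(c, n, d):
    return pow(c, d, n)

def time_round_trip(p, q, blocks):
    """Encrypt then decrypt every block; return (encrypt_seconds, decrypt_seconds)."""
    n, d = make_key(p, q)
    start = time.perf_counter()
    cipher = [encrypt(m, n) for m in blocks]
    enc_seconds = time.perf_counter() - start
    start = time.perf_counter()
    plain = [decrypt(c, n, d) for c in cipher]
    dec_seconds = time.perf_counter() - start
    assert plain == blocks                # the round trip must restore the data
    return enc_seconds, dec_seconds

# Hypothetical 11-byte records, each small enough to fit in the smaller modulus.
blocks = [int.from_bytes(f"record-{i:04d}".encode(), "big") for i in range(200)]
for name, (p, q) in PRIME_PAIRS.items():
    enc, dec = time_round_trip(p, q, blocks)
    print(f"{name}: encrypt {enc * 1e3:.2f} ms, decrypt {dec * 1e3:.2f} ms")
```

Absolute timings depend on the machine and the runtime, so only the relative growth with key size is meaningful in a sketch like this.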

Table II compares the encryption and decryption times of the algorithms for different file sizes.

TABLE II. COMPARISON OF ALGORITHM ENCRYPTION AND DECRYPTION TIMES WITH THEIR FILE SIZE

Simulation results show that ECC outperforms traditional cryptographic algorithms such as RSA and ElGamal and is more suitable for wireless systems such as mobile devices: it is more reliable and efficient, occupies less storage, executes faster, and provides equal security with a smaller key size than the other algorithms. Our model is more beneficial in providing a higher level of mobile data confidentiality than secure mobile computing solutions that treat all data as confidential and therefore use their resources inefficiently.

V. CONCLUSION
In this research article, we have proposed a privacy-based secure mobile cloud data repository model that uses mobile training datasets classified by the TsF-KNN algorithm together with public key cryptographic algorithms, which decreases computation time and promises privacy and integrity for the critical data category. TsF-KNN is an extension of the traditional K-NN algorithm that classifies data attributes into two classes, confidential and non-confidential, with high precision and low computational complexity. The efficiency of our proposed model has been demonstrated through simulation results. As future work, we will explore other machine learning approaches to enhance the secrecy of crucial data handled in cloud storage environments.

REFERENCES

[1] Anunaya Inani, Manoj Singh, Ravish Saxena: A Secure Mobile Cloud Computing Framework Based on Data Classification Using Asymmetric Key Cryptography. Elsevier SSRN, ICToCT 2018
[2] Lo'ai Tawalbeh, Nour S. Darwazeh, Raad S. Al-Qassas and Fahd AlDosari: A Secure Cloud Computing Model based on Data Classification. First International Workshop on Mobile Cloud Computing Systems, Management, and Security (MCSMS-2015)
[3] Hoang T. Dinh, Chonho Lee, Dusit Niyato and Ping Wang: A survey of mobile cloud computing: architecture, applications, and approaches. School of Computer Engineering, Nanyang Technological University (NTU), Singapore. Wireless Communications and Mobile Computing 2013; 13:1587–1611 © 2011 John Wiley & Sons, Ltd. DOI: 10.1002/wcm
[4] Weiguang Song, Xiaolong Su: Review of Mobile cloud computing. DCST CUMT, XuZhou, JiangSu, China. IEEE ©2011 978-1-61284-486-2/111
[5] Ogigau-Neamtiu F. Cloud Computing Security Issues. Journal of Defense Resources Management 2012; 3(2):141-148.
[6] Mazhar Ali, Samee U. Khan, Athanasios V. Vasilakos: Security in cloud computing: Opportunities and challenges. Information Sciences 305 (2015) 357–383. 0020-0255/ Elsevier 2015
[7] Wu J, Ping L, Ge X, Wang Y, Fu J. Cloud Storage as the Infrastructure of Cloud Computing. International Conference on Intelligent Computing and Cognitive Informatics (ICICCI), 22-23 June 2010; 380-383
[8] Kotsiantis, S.B., Zaharakis, I.D., Pintelas, P.E., Machine learning: a review of classification and combining techniques, Artif Intell Rev, pp. 159-190 (2006).
[9] Jain, A.K., Murty, M.N., Flynn, P., Data Clustering: a review, ACM Computing Surveys vol. 31, 264-323 (1999)
[10] M. Reza Rahimi, Jian Ren, Chi Harold Liu, Athanasios V. Vasilakos, Nalini Venkatasubramanian: Mobile Cloud Computing: A Survey, State of Art and Future Directions. © Springer Science+Business Media New York 2013
[11] Faiqa Maqsood, Muhammad Ahmed, Muhammad Mumtaz Ali, Munam Ali Shah: Cryptography: A Comparative Analysis for Modern Techniques. International Journal of Advanced Computer Science and Applications Vol. 8, No. 6, 2017
[12] Jalaluddin Khan, Haider Abbas, Jalal Al-Muhtadi: Survey on Mobile User's Data Privacy Threats and Defense Mechanisms. International Workshop on Cyber Security and Digital Investigation (CSDI 2015) Elsevier © 2015
[13] Hart, P., The condensed nearest neighbor rule, IEEE Transactions on Information Theory, vol. 14, pp. 515–516, (1968).
[14] Gates, G., The reduced nearest neighbor rule, IEEE Transactions on Information Theory, vol. 18, pp. 431–433, (1972).
[15] Maleq, K., Qin, D., William P., K-Nearest Neighbor Classification on Spatial Data Streams using P-Trees, Advances in Knowledge Discovery and Data Mining, Lecture Notes in Computer Science, vol. 23, pp. 517-528 (2002).
[16] Fabrizio, A., Fast condensed nearest neighbor rule, Technical report, Proceedings of the 22nd International Conference on Machine Learning, Bonn, Germany, (2005).
[17] Munwar Ali, Low Tang Jung: Confidentiality Based File Attributes and Data Classification using TsF-KNN. 978-1-4673-6537-6/15/$31.00 ©2015 IEEE
[18] Open Mobile Data by MobiPerf. https://console.developers.google.com/storage/openmobiledata_public
