XML Encryption
Master of Technology
in
by
Certificate
This is to certify that the work in the thesis entitled Survey on XML Encryption
by Saurabh Kumar Sah is a record of original research work carried out
by him under my supervision and guidance in partial fulfillment of the
requirements for the award of the degree of Master of Technology with the
specialization of Software Engineering in the Department of Computer Science and
Engineering, National Institute of Technology Rourkela. Neither this thesis nor
any part of it has been submitted for any degree or academic award elsewhere.
I hereby declare that all work contained in this report is my own work unless
otherwise acknowledged. Also, none of my work has been submitted for any
academic degree. All sources of quoted information have been acknowledged by
means of appropriate references.
Every transaction on the Internet involves some kind of data, and data can be
transferred in various ways. Nowadays, XML is widely used for transferring
and storing data, so there must be some mechanism to protect it. In
most of the literature, the two most important techniques used for securing XML
data are XML Signature and XML Encryption. These two techniques sign and
encrypt XML data using cryptographic functions, and their results are also
represented in XML format. Both are standards published by the W3C. In this
thesis we focus on XML Encryption.
In this study, W3C standards are used to encrypt sensitive XML data. JavaScript
is used to implement the encryption of XML data, with Node.js as the software
platform providing the runtime environment. The time elapsed for encryption and
decryption is also measured. We have used the AES and Triple DES algorithms to
encrypt XML data, and RSA to encrypt the symmetric key. The "xml-encryption"
library is used for encryption and decryption. The time analysis for encryption
and decryption is also shown as a graph.
Keywords: XML, XML Encryption, XML Parser, DOM
Contents
Certificate
Acknowledgement
Abstract
List of Tables
1 Introduction
1.1 Thesis Organization
2 Theoretical Background
2.1 XML
2.2 Basic Cryptography Concepts
2.3 XML Encryption [1]
2.4 Encryption Granularity
2.4.1 Encrypting an XML Element
2.4.2 Encrypting XML Element Content (Elements)
2.4.3 Encrypting XML Element Content (Character Data)
2.4.4 Encrypting Arbitrary Data and XML Documents
2.5 Processing Rules
2.6 XML Parser or API
2.6.1 DOM (Document Object Model) API
3 Literature Review
3.1 Related Work
3.2 Motivation
3.3 Objective
Bibliography
Chapter 1
Introduction
Present security techniques are not perfect, because they do not provide a
sufficiently high level of security and flexibility for securing business data on
the web. A technique is therefore needed that fulfills this goal, so that
systems become more secure and flexible. Business transactions use a great deal
of sensitive data every day, which is why the security of this data is essential.
For instance, Secure Sockets Layer (SSL) provides secure exchange of important
data between a web server and a browser, but once the data reaches the server
it becomes vulnerable, because SSL does not provide security on the server [2].
That is, it provides security in transit only. Therefore we cannot rely on this
technique alone in most business transactions, because it leaves data
unprotected on the server side. If the data is encrypted before it is
transmitted, there is very little chance of an attacker or third party
exploiting it on open servers.
Securing confidential, sensitive information means ensuring its
non-repudiation, integrity and authenticity. The most widely used method of
meeting these requirements for the transmission of data is to use digital
certificates to digitally sign and encrypt the data.
The Public Key Infrastructure (PKI) provides the standards and policies that
can be applied to sensitive data for signing and encrypting with the help of
public keys, private keys and the certificates generated.
In the current scenario, XML is widely used for transferring and storing data
on the Internet. The most important quality which makes XML
particular user can be decrypted only by the owner of the corresponding
private key (i.e., the recipient, provided they have properly secured access to
the private key). Even if the data is intercepted by someone without the private
key, they cannot read it because it is encrypted. Importantly, this privacy is
achieved without sharing any secret data along with the original message, as is
required in symmetric cryptography. In the symmetric case the key must be
exchanged between sender and receiver, which becomes a problem when the number
of users is very high. That is why public key cryptography was introduced.
Although confidentiality is the first property of cryptography that generally
comes to mind, the relation between the private and public keys also provides
functions for which no secret-key cryptography is required. It provides
authentication, non-repudiation and integrity. This is analogous to signatures
in the paper world; since it involves digital data, it is called a
digital signature.
To create a digital signature for any data, the data to be signed is
transformed by an algorithm that takes as input the private key of the sender.
Since a transformation made with the sender's private key can be reversed only
if the inverse transformation takes the sender's public key as a parameter, a
recipient of the transformed data can be certain of the source of the data (the
identity of the sender). If the data can be verified using the sender's public
key, then it must have been signed using the corresponding private key (to
which only the sender should have access).
For signature verification to be meaningful, the verifier must have assurance
that the public key really belongs to the sender (otherwise an impostor could
claim to be the sender, presenting her own public key in place of the real one).
A certificate, issued by a Certification Authority, is an assertion of the
validity of the binding between the certificate's subject and his/her public
key, so that other users can be confident that the public key does indeed
correspond to the subject who claims it as her own.
Largely because of the performance characteristics of public key algorithms,
the complete data is generally not itself transformed directly with the
private key. Instead, a small unique piece of the data, called a digest
or hash, is transformed. Since the hashing algorithm is very sensitive
to any change in the source data, the hash of the original allows a recipient to
verify that the data was not altered (by comparing the hash that was sent to
them with the hash they compute from the data they received). Furthermore, by
transforming the hash with their private key, the sender also allows the
recipient to verify that it was really the sender who performed the
transformation (because the recipient can use the sender's public key to
reverse it). The hash of the data, transformed with the sender's private key,
thus acts as a digital signature for that data and can be transmitted openly
along with the data to the recipient. The recipient verifies the signature by
computing the hash of the message and supplying it as input to a verification
algorithm together with the signature that accompanied the message and the
sender's public key. If the verification succeeds, the recipient can be
confident of both the integrity and the authenticity of the data.
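The hash-then-sign procedure described above can be illustrated with a minimal sketch using the built-in Node.js crypto module. The key pair, the sample message and the function names signMessage and verifyMessage are assumptions introduced only for illustration; they are not part of the implementation reported in this thesis.

```javascript
const crypto = require('crypto');

// Hypothetical sender key pair, generated here only for the example.
// A real deployment would load keys bound to a certificate.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});

// Sender: hash the message with SHA-256 and sign the digest with the private key.
function signMessage(message, senderPrivateKey) {
  return crypto.sign('sha256', Buffer.from(message, 'utf8'), senderPrivateKey);
}

// Recipient: recompute the hash and check it against the signature
// using the sender's public key.
function verifyMessage(message, signature, senderPublicKey) {
  return crypto.verify('sha256', Buffer.from(message, 'utf8'), senderPublicKey, signature);
}

const message = '<Order><Amount>100</Amount></Order>';
const signature = signMessage(message, privateKey);

// The untouched message verifies; a modified message does not.
const okOriginal = verifyMessage(message, signature, publicKey);
const okTampered = verifyMessage(message.replace('100', '900'), signature, publicKey);
console.log(okOriginal, okTampered);
```

Signing only the digest keeps the expensive private key operation independent of the message size, which is the performance argument made in the paragraph above.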
XML signatures are digital signatures designed for use in XML
transactions. The standard defines a schema for capturing the result of a
digital signature operation applied to arbitrary (but often XML) data. Like
non-XML digital signatures (e.g. PKCS), XML signatures add authentication, data
integrity and support for non-repudiation to the data that they sign. However,
unlike non-XML digital signature standards, XML Signature has been designed to
both account for and take advantage of XML and the Internet.
A fundamental feature of XML Signature is the ability to sign only specific
portions of the XML document rather than the whole document. This is relevant
when a single XML document has a long history in which the different parts are
authored at different times by different parties, each one signing only those
portions relevant
1.1 Thesis Organization
Chapter 2
Theoretical Background
2.1 XML
The acronym XML stands for Extensible Markup Language. It has been a W3C
Recommendation since 1998 [3]. The W3C is a consortium of information
technology experts that publishes technical specifications and recommendations
to ensure the long-term growth of the World Wide Web. XML was designed to
transport and store data. In addition, it should be human-readable to some
degree for debugging and other administrative work. An XML document consists of
markup and character data. Markup is defined as all tags, references,
declarations, sections, and comments. Character data is all text that is not
markup. There are a few rules for how markup elements can be used, which result
in strict, tree-structured documents. A simple XML example is shown in
Figure 2.1. The core concept is the XML element, which consists of a start tag,
an end tag and content. Tags must be well formed, i.e. for each start tag there
must exist an end tag with the same name in the document. Otherwise, a parser
may throw an exception. All tags must be nested correctly. That means all end
tags must be closed in the reverse order of the start tags. Each XML document
has a header and a body. The header specifies that the text document is an XML
document. Figure 2.1 is a simple XML document and its header is the first line
<?xml version="1.0"?>. Besides the version attribute, it can also contain
attributes like encoding or standalone to signal to the XML parser how to
interpret the body's content correctly.
2.2 Basic Cryptography Concepts
dates back to ancient Egypt, yet it is still essential to securing data
today. Indeed, encryption is absolutely necessary when sending sensitive data
over an unprotected environment like the Internet. The three basic types of
algorithms used for encryption are:
Hashing
DES was one of the first widely used algorithms; however, it has been
broken and is no longer considered secure. AES has not been broken and is used
by the US government, while IDEA is favored in European countries. RC stands
for "Ron's Code" and is a family of algorithms designed by Ron Rivest, starting
in 1987. Blowfish is a strong open-source symmetric algorithm created in 1993.
Asymmetric cryptographic algorithms differ from symmetric algorithms in that
they require two keys to encrypt and decrypt data rather than the symmetric
algorithm's single key.
Asymmetric or public key encryption uses two mathematically related keys: a
public key, known by everyone, to encrypt messages and a private key, known
only by the recipient of the message, to decrypt the data.
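The difference between the two families can be made concrete with a short sketch based on the built-in Node.js crypto module. The sample plaintext, the choice of AES-128-CBC and RSA-OAEP, and the helper names aesEncrypt and aesDecrypt are assumptions made for illustration only.

```javascript
const crypto = require('crypto');

// Symmetric: the same shared key encrypts and decrypts (AES-128 in CBC mode).
function aesEncrypt(plaintext, key, iv) {
  const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
  return Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
}

function aesDecrypt(ciphertext, key, iv) {
  const decipher = crypto.createDecipheriv('aes-128-cbc', key, iv);
  return Buffer.concat([decipher.update(ciphertext), decipher.final()]).toString('utf8');
}

const sharedKey = crypto.randomBytes(16); // both parties must hold this secret
const iv = crypto.randomBytes(16);
const symmetricRoundTrip = aesDecrypt(
  aesEncrypt('sensitive data', sharedKey, iv), sharedKey, iv);

// Asymmetric: anyone may encrypt with the public key,
// but only the holder of the private key can decrypt (RSA with OAEP padding).
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});
const padding = crypto.constants.RSA_PKCS1_OAEP_PADDING;
const sealed = crypto.publicEncrypt(
  { key: publicKey, padding }, Buffer.from('sensitive data', 'utf8'));
const asymmetricRoundTrip = crypto.privateDecrypt(
  { key: privateKey, padding }, sealed).toString('utf8');

console.log(symmetricRoundTrip, asymmetricRoundTrip);
```

In the symmetric half the shared key has to reach the recipient by some secure channel, whereas in the asymmetric half only the public key is distributed.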
2.3 XML Encryption [1]
2.4 Encryption Granularity
Suppose we want to encrypt the CreditCard element of the XML document shown
in Figure 2.1. After encryption, this element is replaced by an EncryptedData
element. Figure 2.3 shows the document after the CreditCard element is
encrypted.
Suppose we want to encrypt only the child elements within the CreditCard
element; this is the element-content case. Figure 2.4 shows how the document
looks after encryption.
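The two granularities can be contrasted with a minimal sketch in Node.js. It uses plain string handling instead of a DOM parser, a hypothetical PaymentInfo document, and Node's default block padding, so it illustrates the shape of the output rather than an interoperable implementation. The namespace and the Type and Algorithm identifiers are those of the W3C XML Encryption recommendation; the function names are assumptions.

```javascript
const crypto = require('crypto');

const XMLENC_NS = 'http://www.w3.org/2001/04/xmlenc#';

// Encrypt text with AES-128-CBC. The random IV is prepended to the
// ciphertext and the result is base64 encoded for the CipherValue element.
function encryptToCipherValue(plainXml, key) {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-128-cbc', key, iv);
  const body = Buffer.concat([cipher.update(plainXml, 'utf8'), cipher.final()]);
  return Buffer.concat([iv, body]).toString('base64');
}

// Build the EncryptedData element; type is 'Element' or 'Content'.
function wrapInEncryptedData(cipherValue, type) {
  return `<EncryptedData xmlns="${XMLENC_NS}" Type="${XMLENC_NS}${type}">` +
    `<EncryptionMethod Algorithm="${XMLENC_NS}aes128-cbc"/>` +
    `<CipherData><CipherValue>${cipherValue}</CipherValue></CipherData>` +
    `</EncryptedData>`;
}

// Element granularity: the whole element, tags included, is replaced.
// Simplification: nested elements with the same name are not handled.
function encryptElement(xml, tag, key) {
  const pattern = new RegExp(`<${tag}[\\s>][\\s\\S]*?</${tag}>`);
  return xml.replace(pattern, (match) =>
    wrapInEncryptedData(encryptToCipherValue(match, key), 'Element'));
}

// Content granularity: only what lies between the start and end tags is replaced.
function encryptContent(xml, tag, key) {
  const pattern = new RegExp(`(<${tag}[^>]*>)([\\s\\S]*?)(</${tag}>)`);
  return xml.replace(pattern, (match, open, inner, close) =>
    open + wrapInEncryptedData(encryptToCipherValue(inner, key), 'Content') + close);
}

// Hypothetical input document used only for this example.
const doc = '<PaymentInfo><Name>John Smith</Name>' +
  '<CreditCard Limit="5000"><Number>4019 2445 0277 5567</Number></CreditCard>' +
  '</PaymentInfo>';
const key = crypto.randomBytes(16);

const elementEncrypted = encryptElement(doc, 'CreditCard', key);
const contentEncrypted = encryptContent(doc, 'CreditCard', key);
console.log(elementEncrypted);
console.log(contentEncrypted);
```

With element granularity an observer cannot even tell that a CreditCard element was present, while with content granularity the CreditCard tag and its Limit attribute remain readable and only the children are hidden.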
Figure 2.6 shows how arbitrary data or a whole XML file can be encrypted. Within
the EncryptedData element resides the key, which may itself be encrypted. If the
key is encrypted, it is contained within an EncryptedKey element. Encrypting
an EncryptedData element is called super-encryption.
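The relationship between EncryptedData and EncryptedKey can be sketched as follows. A fresh AES-256 session key encrypts the data and is then wrapped with the recipient's RSA public key. The key pair, the sample element and the function names are assumptions for illustration; the sketch relies only on the built-in Node.js crypto module and is not the "xml-encryption" library used in this work. Values are extracted with a regular expression purely to keep the example short.

```javascript
const crypto = require('crypto');

const XENC = 'http://www.w3.org/2001/04/xmlenc#';
const DSIG = 'http://www.w3.org/2000/09/xmldsig#';
const OAEP = crypto.constants.RSA_PKCS1_OAEP_PADDING;

// Encrypt the data with a random session key, then wrap that key with RSA-OAEP
// and carry it inside KeyInfo/EncryptedKey.
function encryptWithWrappedKey(plainXml, recipientPublicKey) {
  const sessionKey = crypto.randomBytes(32);
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv('aes-256-cbc', sessionKey, iv);
  const data = Buffer.concat([iv, cipher.update(plainXml, 'utf8'), cipher.final()]);
  const wrappedKey = crypto.publicEncrypt(
    { key: recipientPublicKey, padding: OAEP }, sessionKey);
  return [
    `<EncryptedData xmlns="${XENC}" Type="${XENC}Element">`,
    `<EncryptionMethod Algorithm="${XENC}aes256-cbc"/>`,
    `<KeyInfo xmlns="${DSIG}">`,
    `<EncryptedKey xmlns="${XENC}">`,
    `<EncryptionMethod Algorithm="${XENC}rsa-oaep-mgf1p">`,
    `<DigestMethod xmlns="${DSIG}" Algorithm="${DSIG}sha1"/>`,
    `</EncryptionMethod>`,
    `<CipherData><CipherValue>${wrappedKey.toString('base64')}</CipherValue></CipherData>`,
    `</EncryptedKey>`,
    `</KeyInfo>`,
    `<CipherData><CipherValue>${data.toString('base64')}</CipherValue></CipherData>`,
    `</EncryptedData>`,
  ].join('');
}

// Unwrap the session key with the private key, then decrypt the data.
// The first CipherValue is the wrapped key, the second is the encrypted data.
function decryptWithWrappedKey(encryptedXml, recipientPrivateKey) {
  const values = [...encryptedXml.matchAll(/<CipherValue>([^<]+)<\/CipherValue>/g)]
    .map((match) => match[1]);
  const [wrappedKeyB64, dataB64] = values;
  const sessionKey = crypto.privateDecrypt(
    { key: recipientPrivateKey, padding: OAEP }, Buffer.from(wrappedKeyB64, 'base64'));
  const data = Buffer.from(dataB64, 'base64');
  const decipher = crypto.createDecipheriv('aes-256-cbc', sessionKey, data.subarray(0, 16));
  return Buffer.concat([decipher.update(data.subarray(16)), decipher.final()])
    .toString('utf8');
}

// Hypothetical recipient key pair and input element.
const { publicKey, privateKey } = crypto.generateKeyPairSync('rsa', {
  modulusLength: 2048,
});
const original = '<CreditCard><Number>4019 2445 0277 5567</Number></CreditCard>';
const encryptedXml = encryptWithWrappedKey(original, publicKey);
const recovered = decryptWithWrappedKey(encryptedXml, privateKey);
console.log(recovered === original);
```

Only the short session key passes through the slow RSA operation, while the bulk of the data is handled by the faster symmetric cipher.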
The DOM API is a platform- and language-independent interface that defines
how to represent and interact with documents such as XML and HTML.
It provides dynamic access to the document and can dynamically update
the style and content of the XML document. A DOM API first converts the XML
document into a DOM tree in which each element and piece of data in the
document is represented by a node. The DOM tree is held in memory. Figure 2.7
shows the transformation.
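The idea of turning a document into an in-memory tree of nodes can be illustrated with a deliberately simplified sketch. It is not a DOM implementation: attributes are discarded, and comments, CDATA sections and entities are not handled. The node shape and the names parseToTree and countElements are assumptions made for this example.

```javascript
// Parse a simplified XML string into a tree of element and text nodes.
function parseToTree(xml) {
  const root = { type: 'document', children: [] };
  const stack = [root]; // currently open elements, innermost last
  const tokens = xml.match(/<[^>]+>|[^<]+/g) || [];
  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.startsWith('<?')) {
      continue; // skip the XML declaration (header)
    }
    if (token.startsWith('</')) {
      // An end tag must close the innermost open element.
      const name = token.slice(2, -1).trim();
      if (stack.length === 1 || parent.name !== name) {
        throw new Error(`Unexpected end tag </${name}>`);
      }
      stack.pop();
    } else if (token.startsWith('<')) {
      // Start tag or self-closing tag; only the element name is kept.
      const selfClosing = token.endsWith('/>');
      const name = token.slice(1, selfClosing ? -2 : -1).trim().split(/\s+/)[0];
      const node = { type: 'element', name, children: [] };
      parent.children.push(node);
      if (!selfClosing) stack.push(node);
    } else if (token.trim() !== '') {
      // Character data becomes a text node.
      parent.children.push({ type: 'text', value: token });
    }
  }
  if (stack.length !== 1) {
    throw new Error(`Unclosed element <${stack[stack.length - 1].name}>`);
  }
  return root;
}

// Count the element nodes in a tree by walking it recursively.
function countElements(node) {
  return (node.children || []).reduce(
    (total, child) => total + countElements(child),
    node.type === 'element' ? 1 : 0);
}

// Hypothetical input document used only for this example.
const tree = parseToTree(
  '<?xml version="1.0"?><PaymentInfo><Name>John Smith</Name>' +
  '<CreditCard><Number>4019</Number></CreditCard></PaymentInfo>');
const elementCount = countElements(tree);
console.log(tree.children[0].name, elementCount);
```

Because the whole tree is built before any node can be visited, memory use grows with the size of the document, which is one reason the number of elements matters for processing time.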
Chapter 3
Literature Review
3.1 Related Work
Tao-Ku Chang et al. presented the design and implementation of an
application program interface for securing XML documents [9]. In this
paper, they use some real cases to demonstrate that it is necessary to design
an appropriate API for protecting XML documents through subtree encryption.
Their goal is to increase productivity and reduce the cost of maintaining this
kind of software, for which they propose a document security language (DSL) API.
They also describe the implementation of the DSL API, and use experimental
results to demonstrate its practicality.
David C. Yen et al. discussed the impact and implementation of XML on
business-to-business commerce [10]. This paper presents an impact analysis of
the Extensible Markup Language (XML). The authors also highlight how each
business partner within a supply chain can generate its own data exchange
format by adopting an XML metadata management system on the local side.
Following a brief introduction to the information technology for
Business-to-Business (B2B) and Business-to-Customer (B2C) Electronic Commerce
(EC), the impact of XML on the future business world is discussed.
Juha-Miikka Nurmilaakso et al. described XML-based e-business frameworks and
their standardization [11]. This paper analyzes the properties and
standardization of 12 prominent XML-based e-business frameworks. Their analysis
focuses on the commonalities, differences and regularities among these
e-business frameworks and their standardization.
Alexandros Kaliontzoglou et al. presented a secure e-Government platform
architecture for small to medium sized public organizations [12]. Small to
medium sized public organizations (SMPOs) share some of their e-Government
requirements with their larger counterparts, such as the pressing needs for
interoperability, security and ease of use. In addition, they have some
specific needs that are either unique to their context or more demanding
because of their characteristics. These are cost and resource considerations,
improved accessibility and greater scalability because of the larger number of
citizens and organizations served, and automated processing because of the
limited number of trained personnel. The paper first proposes an architecture
for a secure e-Government platform based on Web Services, which addresses the
above requirements. In addition, a specific service is built upon the proposed
platform, in which a municipality produces and securely delivers a digital
birth certificate to a citizen or another municipality.
Paul Kearney described message-level security for web services
[13]. This paper gives an overview of the emerging consensus on security for
collaborative business using web services. The most common security measure,
security at the transport layer, may be sufficient for simple applications.
However, for more complex scenarios, e.g. more than two parties or multiple
web services, whole messages or selected parts of messages may be encrypted and
signed to protect the confidentiality and integrity of web service messages.
Hye-Kyeong Ko et al. studied the efficiency of secure XML
broadcasting [14]. In this paper, a labeling scheme is proposed to support fast
reconstruction of XML documents in the context of a well-known method called
XML pool encryption. The proposed labeling scheme supports the speedy inference
of structure information in all portions of the document. The binary
representation of the proposed labeling scheme is also investigated. In the
experimental results, the proposed labeling scheme is efficient in searching
for the location of decrypted data.
Carlos Gutiérrez et al. presented the practical application of a process for
eliciting and designing security in web service systems [15]. In this paper,
they present the application of the Process for Web Service Security (PWSSec),
developed by the authors, to a real web service-based research project.
They also show how security in inter-organizational information systems can be
analyzed, designed and implemented by applying PWSSec, which combines risk
analysis and management with a security architecture and a standards-based
approach. They additionally present a tool built to support the PWSSec process.
Ernesto Damiani et al. presented the design and implementation of an access
control processor for XML documents [21]. In this paper, an access control
system for XML is described that allows the definition and enforcement of
access restrictions directly on the structure and content of XML documents,
thereby providing a simple and effective way for users to protect information
at the same granularity level provided by the language itself.
Peter Michalek dissected the application security of XML schemas [22]. This
article describes the state of the art and possible directions for the future.
It summarizes industry efforts and focuses on application security related XML
schemas being developed within OASIS (Advancing Open Standards for the
Information Society).
Denis Trček presented an integral framework for the security management of
information systems [23]. This article presents an attempt at managing the
security of e-business systems that is based on integrating existing approaches
in a balanced way. To foster practical use of the model presented in the paper,
brief background knowledge in related areas is given.
Alfredo Cuzzocrea et al. presented privacy-preserving OLAP over distributed
XML data [24]. This paper proposes a novel Secure Multiparty Computation
(SMC)-based privacy-preserving OLAP framework for distributed collections
of XML documents. The framework has many novel features, ranging from
nice theoretical properties to an effective and efficient protocol, called the
Secure Distributed OLAP aggregation protocol (SDO). The efficiency of
this approach has been validated by an experimental evaluation over distributed
collections of synthetic, benchmark and real-life XML documents.
Jae-Gil Lee et al. described secure query processing against encrypted
XML data using query-aware decryption [25]. In this paper, they proposed the
notion of Query-Aware Decryption for efficient processing of queries against
encrypted XML data.
Dr Renato Iannella presented ODRL (Open Digital Rights Language), an XML
language for digital rights management [26].
3.2 Motivation
As XML is widely used in business transactions, the security of XML
data is essential to meet business requirements. Data in transit and on servers
can be effectively secured by the XML Signature and XML Encryption techniques.
Conventional techniques use SSL/TLS, which is not sufficient because it does
not secure the data once it reaches the server side. This drawback can be
removed by using XML Signature and XML Encryption following the W3C standards.
Here we focus on XML Encryption. Different authors have previously introduced
different techniques, such as the document security language, but those
techniques had some drawbacks. The effectiveness of XML signing and
encryption also depends on parsers such as DOM. Here we have used AES and
3DES for encryption.
3.3 Objective
The objective of this thesis is to apply the AES and 3DES algorithms to encrypt
sensitive data. As these are symmetric key algorithms, the RSA algorithm is
used to encrypt the secret key. The time elapsed in encryption with these
algorithms is also calculated. Here we have used a DOM parser. The time
analysis for encryption and decryption is also shown as a graph.
Chapter 4
Implementation and Results
4.1 Introduction
We have applied the AES and Triple DES algorithms to encrypt XML data. As these
are symmetric key algorithms, the key used to encrypt the data is itself
encrypted using the RSA algorithm. The time taken for encryption and decryption
is also calculated, and graphs of encryption and decryption time are drawn for
different XML files.
4.2 Implementation
Language Used: JavaScript
Libraries Used:
4.3 Results
Figure 4.1 shows the input XML file. For encryption we have used AES-128,
AES-256 and 3DES in CBC mode to encrypt the XML file. To encrypt the key,
RSA is used and the result is placed in the EncryptedKey element. The output
file is stored on the system in encrypted form. After encryption, the console
shows a message that the file was encrypted successfully. The encryption time
is also shown in the console. Figure 4.2 shows a snapshot of the console output
after encrypting the given XML. After encryption, an encrypted file is generated
which contains the encrypted data in the standard format. Figure 4.3 shows the
output encrypted file. The decryption process recovers the original XML that
was encrypted. Figure 4.4 shows the console output for decryption. For the time
analysis of encryption and decryption, different XML files are taken with
different numbers of elements in the input file. The time for the corresponding
operations as the number of elements in the input XML file increases is shown
by a graph. Generally, time is proportional to the number of elements in the
input XML file. Here a DOM parser is used. The running time depends on the
current CPU utilization and the number of applications running at that time.
So, to obtain a reliable time, the code concerned is executed 10 times and the
average is calculated. Similarly, time is calculated for different XML files by
increasing the number of elements in the input file.
Figure 4.5 shows the time analysis for encryption and decryption.
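The averaging procedure described above can be sketched as follows. This is an illustration under stated assumptions rather than the measurement code used for Figure 4.5: the synthetic documents, the element counts and the function names are hypothetical, only the symmetric cipher step is timed (parsing and RSA key encryption are left out), and 3DES is included only if the local Node.js build offers it.

```javascript
const crypto = require('crypto');

// Key and IV sizes in bytes for the ciphers considered in this chapter.
const ALGORITHMS = {
  'aes-128-cbc': { keyBytes: 16, ivBytes: 16 },
  'aes-256-cbc': { keyBytes: 32, ivBytes: 16 },
  'des-ede3-cbc': { keyBytes: 24, ivBytes: 8 },
};

// Build a synthetic XML document with the given number of Item elements.
function buildXml(elementCount) {
  let xml = '<Items>';
  for (let i = 0; i < elementCount; i++) {
    xml += `<Item id="${i}">value-${i}</Item>`;
  }
  return xml + '</Items>';
}

// Encrypt the document `runs` times and return the mean time in milliseconds.
function averageEncryptMs(xml, algorithm, runs = 10) {
  const { keyBytes, ivBytes } = ALGORITHMS[algorithm];
  const key = crypto.randomBytes(keyBytes);
  let totalNs = 0n;
  for (let i = 0; i < runs; i++) {
    const iv = crypto.randomBytes(ivBytes);
    const start = process.hrtime.bigint();
    const cipher = crypto.createCipheriv(algorithm, key, iv);
    Buffer.concat([cipher.update(xml, 'utf8'), cipher.final()]);
    totalNs += process.hrtime.bigint() - start;
  }
  return Number(totalNs) / runs / 1e6;
}

// Use only the ciphers that this Node.js build actually provides.
const available = Object.keys(ALGORITHMS)
  .filter((name) => crypto.getCiphers().includes(name));

const results = [];
for (const count of [100, 1000, 10000]) {
  const xml = buildXml(count);
  for (const algorithm of available) {
    results.push({ count, algorithm, ms: averageEncryptMs(xml, algorithm) });
  }
}
console.table(results);
```

Averaging over several runs smooths out the effect of other processes competing for the CPU, which is the reason given above for repeating each measurement 10 times.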
Chapter 5
In this thesis we have applied the AES and 3DES algorithms in CBC mode. The
algorithm to be used can be specified in the code. In future, the library can
be extended to support customized algorithms. The conclusion is therefore that
the W3C standard can be extended with customized or user-defined algorithms
using this library.
We also conclude that, as the number of elements in the input XML file
increases, the execution time also increases. That is, execution time is
proportional to the number of elements in the input XML file.
Future work may include implementing XML Signature and XML Encryption with
other algorithms that are not defined in the W3C standard. Furthermore, memory
consumption for the same can be measured.
Bibliography
[2] A. C. Weaver, “Secure sockets layer,” Computer, vol. 39, no. 4, pp. 88–90,
2006.
[8] G.-H. Hwang and T.-K. Chang, “An operational model and language support
for securing XML documents,” Computers & Security, vol. 23, no. 6,
pp. 498–529, 2004.
[9] T.-K. Chang and G.-H. Hwang, “The design and implementation of an
application program interface for securing XML documents,” Journal of Systems
and Software, vol. 80, no. 8, pp. 1362–1374, 2007.
[10] D. C. Yen, S.-M. Huang, and C.-Y. Ku, “The impact and implementation of
XML on business-to-business commerce,” Computer Standards & Interfaces,
vol. 24, no. 4, pp. 347–362, 2002.
[13] P. Kearney, “Message level security for web services,” Information Security
Technical Report, vol. 10, no. 1, pp. 41–50, 2005.
[14] H.-K. Ko, M.-J. Kim, and S. Lee, “On the efficiency of secure XML
broadcasting,” Information Sciences, vol. 177, no. 24, pp. 5505–5521, 2007.
[16] E. J.-L. Lu and R.-F. Chen, “An XML multisignature scheme,” Applied
Mathematics and Computation, vol. 149, no. 1, pp. 1–14, 2004.
[17] F. Song and Z. Cui, “Electronic voting scheme about ElGamal blind-signatures
based on XML,” Procedia Engineering, vol. 29, pp. 2721–2725, 2012.
[18] P. Buneman, S. Davidson, W. Fan, C. Hara, and W.-C. Tan, “Keys for XML,”
Computer Networks, vol. 39, no. 5, pp. 473–487, 2002.
[19] E. J.-L. Lu, R.-H. Tsai, and S. Chou, “An empirical study of XML/EDI,”
Journal of Systems and Software, vol. 58, no. 3, pp. 271–279, 2001.
[20] A. Blyth, “An XML-based architecture to perform data integration and data
unification in vulnerability assessments,” Information Security Technical
Report, vol. 8, no. 4, pp. 14–25, 2003.
[22] P. Michalek, “Dissecting application security XML schemas: AVDL, WAS, OVAL–
state of the XML security standards report,” Information Security Technical
Report, vol. 9, no. 3, pp. 66–76, 2004.
[23] D. Trček, “An integral framework for information systems security
management,” Computers & Security, vol. 22, no. 4, pp. 337–360, 2003.
[24] A. Cuzzocrea and E. Bertino, “Privacy preserving OLAP over distributed XML
data: A theoretically-sound secure-multiparty-computation approach,” Journal
of Computer and System Sciences, vol. 77, no. 6, pp. 965–987, 2011.
[25] J.-G. Lee and K.-Y. Whang, “Secure query processing against encrypted XML
data using query-aware decryption,” Information Sciences, vol. 176, no. 13,
pp. 1928–1947, 2006.
[26] R. Iannella, “The open digital rights language: XML for digital rights
management,” Information Security Technical Report, vol. 9, no. 3, pp. 47–55,
2004.