Unit - 1: Cryptography & Network Security
The art and science of concealing the messages to introduce secrecy in information security is
recognized as cryptography. The process of disguising a message in such a way as to hide its
substance is encryption. An encrypted message is cipher text. The process of turning cipher
text back into plaintext is decryption.
Data that can be read and understood without any special measures is called plaintext or clear
text. The method of disguising plaintext in such a way as to hide its substance is
called encryption. Encrypting plaintext results in unreadable gibberish called cipher text. You
use encryption to ensure that information is hidden from anyone for whom it is not intended,
even those who can see the encrypted data. The process of reverting cipher text to its original
plaintext is called decryption.
What is cryptography?
Cryptography is the science of using mathematics to encrypt and decrypt data. Cryptography
enables you to store sensitive information or transmit it across insecure networks (like the
Internet) so that it cannot be read by anyone except the intended recipient. While cryptography
is the science of securing data, cryptanalysis is the science of analyzing and breaking secure
communication. Classical cryptanalysis involves an interesting combination of analytical
reasoning, application of mathematical tools, pattern finding, patience, determination, and
luck. Cryptanalysts are also called attackers. Cryptology embraces both cryptography and
cryptanalysis.
How does cryptography work?
A cryptographic algorithm, or cipher, is a mathematical function used in the encryption and
decryption process. A cryptographic algorithm works in combination with a key — a word,
number, or phrase — to encrypt the plaintext. The same plaintext encrypts to different cipher
text with different keys. The security of encrypted data is entirely dependent on two things: the
strength of the cryptographic algorithm and the secrecy of the key. A cryptographic algorithm,
plus all possible keys and all the protocols that make it work comprise a cryptosystem.
Conventional cryptography
Caesar's Cipher
For example, if we encode the word "SECRET" using Caesar's key value of 3, we offset the
alphabet so that the 3rd letter down (D) begins the alphabet.
So starting with
ABCDEFGHIJKLMNOPQRSTUVWXYZ
and sliding everything up by 3, you get
DEFGHIJKLMNOPQRSTUVWXYZABC
where D stands for A, E for B, F for C, and so on.
Using this scheme, the plaintext, "SECRET" encrypts as "VHFUHW." To allow someone else to
read the cipher text, you tell them that the key is 3.
Obviously, this is exceedingly weak cryptography by today's standards, but hey, it worked for
Caesar, and it illustrates how conventional cryptography works.
Conventional encryption has benefits. It is very fast. It is especially useful for encrypting data
that is not going anywhere. However, conventional encryption alone as a means for
transmitting secure data can be quite expensive simply due to the difficulty of secure key
distribution.
Recall a character from your favorite spy movie: the person with a locked briefcase handcuffed
to his or her wrist. What is in the briefcase, anyway? It's probably not the missile launch code/
bio toxin formula/ invasion plan itself. It's the key that will decrypt the secret data.
For a sender and recipient to communicate securely using conventional encryption, they must
agree upon a key and keep it secret between themselves. If they are in different physical
locations, they must trust a courier, the Bat Phone, or some other secure communication
medium to prevent the disclosure of the secret key during transmission. Anyone who overhears
or intercepts the key in transit can later read, modify, and forge all information encrypted or
authenticated with that key. From DES to Captain Midnight's Secret Decoder Ring, the
persistent problem with conventional encryption is key distribution: how do you get the key to
the recipient without someone intercepting it?
The problems of key distribution are solved by public key cryptography, the concept of which
was introduced by Whitfield Diffie and Martin Hellman in 1975. (There is now evidence that the
British Secret Service invented it a few years before Diffie and Hellman, but kept it a military
secret — and did nothing with it.)
Public key cryptography is an asymmetric scheme that uses
a pair of keys for encryption: a public key, which encrypts data, and a
corresponding private, or secret key for decryption. You publish your public key to the world
while keeping your private key secret. Anyone with a copy of your public key can then encrypt
information that only you can read. Even people you have never met.
It is computationally infeasible to deduce the private key from the public key. Anyone who has
a public key can encrypt information but cannot decrypt it. Only the person who has the
corresponding private key can decrypt the information.
The primary benefit of public key cryptography is that it allows people who have no preexisting
security arrangement to exchange messages securely. The need for sender and receiver to
share secret keys via some secure channel is eliminated; all communications involve only
public keys, and no private key is ever transmitted or shared. Some examples of public-key
cryptosystems are Elgamal (named for its inventor, Taher Elgamal), RSA (named for its
inventors, Ron Rivest, Adi Shamir, and Leonard Adleman), Diffie-Hellman (named, you guessed
it, for its inventors), and DSA, the Digital Signature Algorithm (invented by David Kravitz).
Because conventional cryptography was once the only available means for relaying secret
information, the expense of secure channels and key distribution relegated its use only to those
who could afford it, such as governments and large banks (or small children with secret
decoder rings). Public key encryption is the technological revolution that provides strong
cryptography to the adult masses. Remember the courier with the locked briefcase handcuffed
to his wrist? Public-key encryption puts him out of business (probably to his relief).
Keys
A key is a value that works with a cryptographic algorithm to produce a specific cipher text.
Keys are basically really, really, really big numbers. Key size is measured in bits; the number
representing a 1024-bit key is darn huge. In public key cryptography, the bigger the key, the
more secure the cipher text.
However, public key size and conventional cryptography's secret key size are totally unrelated.
A conventional 80-bit key has the equivalent strength of a 1024-bit public key. A conventional
128-bit key is equivalent to a 3000-bit public key. Again, the bigger the key, the more secure,
but the algorithms used for each type of cryptography are very different and thus comparison
is like that of apples to oranges.
While the public and private keys are mathematically related, it's very difficult to derive the
private key given only the public key; however, deriving the private key is always possible given
enough time and computing power. This makes it very important to pick keys of the right size;
large enough to be secure, but small enough to be applied fairly quickly. Additionally, you need
to consider who might be trying to read your files, how determined they are, how much time
they have, and what their resources might be.
Larger keys will be cryptographically secure for a longer period of time. If what you want to
encrypt needs to be hidden for many years, you might want to use a very large key. Of course,
who knows how long it will take to determine your key using tomorrow's faster, more efficient
computers? There was a time when a 56-bit symmetric key was considered extremely safe.
Keys are stored in encrypted form. Open PGP stores the keys in two files on your hard disk;
one for public keys and one for private keys. These files are called keyrings. As you use Open
PGP, you will typically add the public keys of your recipients to your public keyring. Your
private keys are stored on your private keyring. If you lose your private keyring, you will be
unable to decrypt any information encrypted to keys on that ring.
Digital signatures
A major benefit of public key cryptography is that it provides a method for employing digital
signatures. Digital signatures enable the recipient of information to verify the authenticity of
the information's origin, and also verify that the information is intact. Thus, public key digital
signatures provide authentication and data integrity. A digital signature also provides non-
repudiation, which means that it prevents the sender from claiming that he or she did not
actually send the information. These features are every bit as fundamental to cryptography as
privacy, if not more.
The basic manner in which digital signatures are created is illustrated in Figure 1-6. Instead of
encrypting information using someone else's public key, you encrypt it with your private key. If
the information can be decrypted with your public key, then it must have originated with you.
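The sign-with-the-private-key, verify-with-the-public-key idea described above can be demonstrated with any modern signature scheme. The sketch below uses the third-party Python `cryptography` package and Ed25519 keys; the package, key type, and message are illustrative assumptions, not something specified in the text (modern libraries expose this as sign/verify rather than literal encryption with the private key).

```python
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Generate a key pair: the private key stays secret, the public key is published.
private_key = Ed25519PrivateKey.generate()
public_key = private_key.public_key()

message = b"the information originated with the key holder"
signature = private_key.sign(message)   # produced with the private key
public_key.verify(signature, message)   # raises InvalidSignature if message or signature is altered
```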
The OSI security architecture focuses on three aspects of information security:
i. Security attacks
ii. Security services
iii. Security mechanisms
A Model For Network Security
All of the techniques for providing security have two components:
• A security-related transformation on the information to be sent. Examples include the
encryption of the message, which scrambles the message so that it is unreadable by the
opponent, and the addition of a code based on the contents of the message, which can be used
to verify the identity of the sender.
• Some secret information shared by the two principals and, it is hoped, unknown to the
opponent. An example is an encryption key used in conjunction with the transformation to
scramble the message before transmission and unscramble it on reception.
A trusted third party may be needed to achieve secure transmission. For example, a third party
may be responsible for distributing the secret information to the two principals while keeping it
from any opponent. Or a third party may be needed to arbitrate disputes between the two
principals concerning the authenticity of a message transmission.
This general model shows that there are four basic tasks in designing a particular security
service:
1. Design an algorithm for performing the security-related transformation. The algorithm
should be such that an opponent cannot defeat its purpose.
2. Generate the secret information to be used with the algorithm.
3. Develop methods for the distribution and sharing of the secret information.
4. Specify a protocol to be used by the two principals that makes use of the security algorithm
and the secret information to achieve a particular security service.
Parts One through Five of this book concentrate on the types of security mechanisms and
services that fit into the model shown in Figure 1.4. However, there are other security-related
situations of interest that do not neatly fit this model but are considered in this book. A general
model of these other situations is illustrated by Figure 1.5, which reflects a concern for
protecting an information system from unwanted access. Most readers are familiar with the
concerns caused by the existence of hackers, who attempt to penetrate systems that can be
accessed over a network. The hacker can be someone who, with no malign intent, simply gets
satisfaction from breaking and entering a computer system. The intruder can be a disgruntled
employee who wishes to do damage or a criminal who seeks to exploit computer assets for
financial gain.
• Information access threats: Intercept or modify data on behalf of users who should not
have access to that data.
• Service threats: Exploit service flaws in computers to inhibit use by legitimate users.
Viruses and worms are two examples of software attacks. Such attacks can be introduced into
a system by means of a disk that contains the unwanted logic concealed in otherwise useful
software. They can also be inserted into a system across a network; this latter mechanism is of
more concern in network security.
The security mechanisms needed to cope with unwanted access fall into two broad categories
(see Figure 1.5). The first category might be termed a gatekeeper function. It includes
password-based login procedures that are designed to deny access to all but authorized users
and screening logic that is designed to detect and reject worms, viruses, and other similar
attacks. Once either an unwanted user or unwanted software gains access, the second line of
defense consists of a variety of internal controls that monitor activity and analyze stored
information in an attempt to detect the presence of unwanted intruders. These issues are
explored in Part Six.
Cryptography
Cryptanalysis and Brute-Force Attack
Substitution Techniques
Caesar Cipher
Monoalphabetic Ciphers
Playfair Cipher
Hill Cipher
Polyalphabetic Ciphers
One-Time Pad
Transposition Techniques
Rotor Machines
Steganography
◆ Symmetric encryption transforms plaintext into cipher text using a secret key
and an encryption algorithm. Using the same key and a decryption algorithm, the
plaintext is recovered from the cipher text.
◆ Rotor machines are sophisticated pre computer hardware devices that use
substitution techniques.
• Secret key: The secret key is also input to the encryption algorithm. The
key is a value independent of the plaintext and of the algorithm. The algorithm will produce a
different output depending on the specific key being used at the time. The exact substitutions
and transformations performed by the algorithm depend on the key.
There are two requirements for secure use of conventional encryption:
1. We need a strong encryption algorithm. At a minimum, the algorithm should be such that an
opponent who knows the algorithm and has access to one or more ciphertexts would be unable
to decipher the ciphertext or figure out the key. The opponent should be unable to decrypt
ciphertext or discover the key even if he or she is in possession of a number of ciphertexts
together with the plaintext that produced each ciphertext.
2. Sender and receiver must have obtained copies of the secret key in a
secure fashion and must keep the key secure. If someone can discover the key and knows the
algorithm, all communication using this key is readable.
Let us take a closer look at the essential elements of a symmetric encryption scheme, using
Figure 2.2. A source produces a message in plaintext, X = [X1, X2, ..., XM]. The M elements
of X are letters in some finite alphabet. Traditionally, the alphabet usually consisted of the 26
capital letters. Nowadays, the binary alphabet {0, 1} is typically used. For encryption, a key of
the form K = [K1, K2, ..., KJ] is generated. If the key is generated at the message source, then
it must also be provided to the destination by means of some secure channel. Alternatively, a
third party could generate the key and securely deliver it to both source and destination.
With the message X and the encryption key K as input, the encryption algorithm forms the
ciphertext Y = [Y1, Y2, ..., YN]. We can write this as
Y = E(K, X)
The intended receiver, in possession of the key, is able to invert the transformation:
X = D(K, Y)
An opponent, observing Y but not having access to K or X, may attempt to recover X or K or
both X and K. It is assumed that the opponent knows the encryption (E) and decryption (D)
algorithms. If the opponent is interested in only this particular message, then the focus of the
effort is to recover X by generating a plaintext estimate X̂. Often, however, the opponent is
interested in being able to read future messages as well, in which case an attempt is made to
recover K by generating an estimate K̂.
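The relationship Y = E(K, X) and X = D(K, Y) can be demonstrated with any off-the-shelf symmetric cipher. Below is a minimal sketch using the third-party Python `cryptography` package's Fernet recipe; the package and the sample message are illustrative assumptions, not part of the model above.

```python
from cryptography.fernet import Fernet

key = Fernet.generate_key()      # K: the shared secret key
f = Fernet(key)

X = b"attack at dawn"            # plaintext message
Y = f.encrypt(X)                 # Y = E(K, X): unreadable without the key
assert f.decrypt(Y) == X         # X = D(K, Y): the intended receiver inverts the transformation
```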
Cryptography:
Cryptographic systems are characterized along three independent dimensions:
1. The type of operations used for transforming plaintext to ciphertext. All encryption
algorithms are based on two general principles: substitution, in which each element in the
plaintext (bit, letter, group of bits or letters) is mapped into another element, and
transposition, in which elements in the plaintext are rearranged. The fundamental
requirement is that no information be lost (that is, that all operations are reversible). Most
systems, referred to as product systems, involve multiple stages of substitutions and
transpositions.
2. The number of keys used. If both sender and receiver use the same key, the system is
referred to as symmetric, single-key, secret-key, or conventional encryption. If the sender and
receiver use different keys, the system is referred to as asymmetric, two-key, or public-key
encryption.
3. The way in which the plaintext is processed. A block cipher processes the input one
block of elements at a time, producing an output block for each input block. A stream
cipher processes the input elements continuously, producing output one element at a time, as
it goes along.
Cryptanalysis and Brute-Force Attack:
Typically, the objective of attacking an encryption system is to recover the key in use rather
than simply to recover the plaintext of a single ciphertext. There are two general approaches
to attacking a conventional encryption scheme:
Cryptanalysis: Cryptanalytic attacks rely on the nature of the algorithm plus perhaps some
knowledge of the general characteristics of the plaintext or even some sample plaintext–
ciphertext pairs. This type of attack exploits the characteristics of the algorithm to attempt to
deduce a specific plaintext or to deduce the key being used.
Table 2.1 summarizes the various types of cryptanalytic attacks based on the amount of
information known to the cryptanalyst. The most difficult problem is presented when all that
is available is the ciphertext only. In some cases, not even the encryption algorithm is known,
but in general, we can assume that the opponent does know the algorithm used for
encryption. One possible attack under these circumstances is the brute-force approach of
trying all possible keys. If the key space is very large, this becomes impractical. Thus, the
opponent must rely on an analysis of the ciphertext itself, generally applying various
statistical tests to it. To use this approach, the opponent must have some general idea of the
type of plaintext that is concealed, such as English or French text, an EXE file, a Java source
listing, an accounting file, and so on.
The ciphertext-only attack is the easiest to defend against because the opponent has the least
amount of information to work with. In many cases, however, the analyst has more
information. The analyst may be able to capture one or more plaintext messages as well as
their encryptions. Or the analyst may know that certain plaintext patterns will appear in a
message. For example, a file that is encoded in the Postscript format always begins with the
same pattern, or there may be a standardized header or banner to an electronic funds transfer
message, and so on. All these are examples of known plaintext. With this knowledge, the
analyst may be able to deduce the key on the basis of the way in which the known plaintext is
transformed.
If the analyst is able somehow to get the source system to insert into the system a message
chosen by the analyst, then a chosen-plaintext attack is possible. An example of this strategy
is differential cryptanalysis, explored in Chapter 3. In general, if the analyst is able to choose
the messages to encrypt, the analyst may deliberately pick patterns that can be expected to
reveal the structure of the key.
Table 2.1 lists two other types of attack: chosen ciphertext and chosen text. These are less
commonly employed as cryptanalytic techniques but are nevertheless possible avenues of
attack.
Two more definitions are worthy of note. An encryption scheme is unconditionally secure if
the ciphertext generated by the scheme does not contain enough information to determine
uniquely the corresponding plaintext, no matter how much ciphertext is available. That is, no
matter how much time an opponent has, it is impossible for him or her to decrypt the
ciphertext simply because the required information is not there. With the exception of a
scheme known as the one-time pad (described later in this chapter), there is no encryption
algorithm that is unconditionally secure. Therefore, all that the users of an encryption
algorithm can strive for is an algorithm that meets one or both of the following criteria:
• The cost of breaking the cipher exceeds the value of the encrypted information.
• The time required to break the cipher exceeds the useful lifetime of the information.
All forms of cryptanalysis for symmetric encryption schemes are designed to exploit the fact
that traces of structure or pattern in the plaintext may survive encryption and be discernible
in the ciphertext. This will become clear as we exam-ine various symmetric encryption
schemes in this chapter. We will see in Part Two that cryptanalysis for public-key schemes
proceeds from a fundamentally different premise, namely, that the mathematical properties of
the pair of keys may make it possible for one of the two keys to be deduced from the other.
A brute-force attack involves trying every possible key until an intelligible translation of the
ciphertext into plaintext is obtained. On average, half of all possible keys must be tried to
achieve success. Table 2.2 shows how much time is involved for various key spaces. Results
are shown for four binary key sizes. The 56-bit key size is used with the Data Encryption
Standard (DES) algorithm, and the 168-bit key size is used for triple DES. The minimum key
size specified for Advanced Encryption Standard (AES) is 128 bits. Results are also shown for
what are called substitution codes that use a 26-character key (discussed later), in which all
possible permutations of the 26 characters serve as keys. For each key size, the results are
shown assuming that it takes 1 μs to perform a single decryption, which is a reasonable
order of magnitude for today's machines. With the use of massively parallel organizations of
microprocessors, it may be possible to achieve processing rates many orders of magnitude
greater. The final column of Table 2.2 considers the results for a system that can process 1
million keys per microsecond. As you can see, at this performance level, DES can no longer be
considered computationally secure.
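The timing argument above can be reproduced with a few lines of arithmetic. The sketch below is a back-of-the-envelope calculation (not Table 2.2 itself), assuming one decryption per microsecond and that, on average, half the key space must be searched.

```python
# Average brute-force time: half the key space at a given trial rate.
def years_to_break(key_bits, decryptions_per_second=10**6):   # 1 decryption per microsecond
    average_trials = 2 ** (key_bits - 1)
    seconds = average_trials / decryptions_per_second
    return seconds / (3600 * 24 * 365)

for bits in (56, 128, 168):
    print(f"{bits}-bit key: about {years_to_break(bits):.2e} years")
# The 56-bit result (on the order of a thousand years at this rate) shows why massively
# parallel searching, a million keys per microsecond, makes DES no longer computationally secure.
```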
SUBSTITUTION TECHNIQUES:
In this section and the next, we examine a sampling of what might be called classical
encryption techniques. A study of these techniques enables us to illustrate the basic
approaches to symmetric encryption used today and the types of cryptanalytic attacks that
must be anticipated.
The two basic building blocks of all encryption techniques are substitution and transposition.
We examine these in the next two sections. Finally, we discuss a system that combines both
substitution and transposition.
A substitution technique is one in which the letters of plaintext are replaced by other letters or
by numbers or symbols. If the plaintext is viewed as a sequence of bits, then substitution
involves replacing plaintext bit patterns with ciphertext bit patterns.
Caesar Cipher:
The earliest known, and the simplest, use of a substitution cipher was by Julius Caesar. The
Caesar cipher involves replacing each letter of the alphabet with the letter standing three
places further down the alphabet. For example,
plain: meet me after the toga party
cipher: PHHW PH DIWHU WKH WRJD SDUWB
Note that the alphabet is wrapped around, so that the letter following Z is A. We can define the
transformation by listing all possibilities, as follows:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
Let us assign a numerical equivalent to each letter (a = 0, b = 1, ..., z = 25). Then the algorithm
can be expressed as follows. For each plaintext letter p, substitute the ciphertext letter C:
C = E(3, p) = (p + 3) mod 26
A shift may be of any amount, so that the general Caesar algorithm is
C = E(k, p) = (p + k) mod 26
where k takes on a value in the range 1 to 25. The decryption algorithm is simply
p = D(k, C) = (C - k) mod 26
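A short Python sketch of these formulas follows (letters mapped to 0–25; the helper name is just for illustration):

```python
# Caesar cipher: C = (p + k) mod 26 for encryption, p = (C - k) mod 26 for decryption.
def caesar(text, k):
    result = []
    for ch in text.upper():
        if ch.isalpha():
            p = ord(ch) - ord('A')                     # numerical equivalent of the letter
            result.append(chr((p + k) % 26 + ord('A')))
        else:
            result.append(ch)                          # leave spaces and punctuation alone
    return ''.join(result)

print(caesar("SECRET", 3))    # VHFUHW  (encryption with key 3)
print(caesar("VHFUHW", -3))   # SECRET  (decryption is a shift of -k)
```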
Monoalphabetic Ciphers:
In general, there are n! permutations of a set of n elements, because the first element can be
chosen in one of n ways, the second in n - 1 ways, the third in n - 2 ways, and so on. Recall
the Caesar cipher mapping:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
If, instead, the "cipher" line can be any permutation of the 26 alphabetic characters, then
there are 26!, or greater than 4 × 10^26, possible keys. This is 10 orders of magnitude greater
than the key space for DES and would seem to eliminate brute-force techniques for
cryptanalysis. Such an approach is referred to as a monoalphabetic substitution cipher,
because a single cipher alphabet (mapping from plain alphabet to cipher alphabet) is used per
message.
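As a quick check of the 26! figure quoted above, two lines of Python confirm the order of magnitude:

```python
import math

print(math.factorial(26))           # 403291461126605635584000000 possible cipher alphabets
print(float(math.factorial(26)))    # ~4.03e+26, i.e. greater than 4 x 10^26
```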
There is, however, another line of attack. If the cryptanalyst knows the nature of the plaintext
(e.g., noncompressed English text), then the analyst can exploit the regularities of the
language. To see how such a cryptanalysis might proceed, we give a partial example here that
is adapted from one in [SINK66]. The ciphertext to be solved is
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
As a first step, the relative frequency of the letters can be determined and compared to a
standard frequency distribution for English, such as is shown in Figure 2.5 (based on
[LEWA00]). If the message were long enough, this technique alone might be sufficient, but
because this is a relatively short message, we cannot expect an exact match. In any case, the
relative frequencies of the letters in the ciphertext (in percentages) are as follows:
Comparing this breakdown with Figure 2.5, it seems likely that cipher letters P and Z are the
equivalents of plain letters e and t, but it is not certain which is which. The letters S, U, O, M,
and H are all of relatively high frequency and probably correspond to plain letters from the set
{a, h, i, n, o, r, s}. The letters with the lowest frequencies (namely, A, B, G, Y, I, J) are likely
included in the set {b, j, k, q, v, x, z}.
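The frequency-counting step itself is easy to automate. The sketch below simply tallies the letters of the ciphertext given above and prints their relative frequencies, ready to be compared against a standard English distribution:

```python
from collections import Counter

# The ciphertext quoted above, with the three lines concatenated.
ciphertext = ("UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
              "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
              "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ")

counts = Counter(ciphertext)
total = sum(counts.values())
for letter, n in counts.most_common():
    print(f"{letter}: {100 * n / total:.2f}%")   # relative frequency in percent
```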
There are a number of ways to proceed at this point. We could make some tentative
assignments and start to fill in the plaintext to see if it looks like a reasonable "skeleton" of a
message. A more systematic approach is to look for other regularities. For example, certain
words may be known to be in the text. Or we could look for repeating sequences of cipher
letters and try to deduce their plaintext equivalents.
A powerful next step is to look at the frequency of two-letter combinations (digrams). The most
common digram in English is th; in this ciphertext the most common digram is ZW, so we make
the correspondence of Z with t and W with h, and by our earlier hypothesis P with e. Next,
notice the sequence ZWSZ in the first line. We do not know that these four letters form a
complete word, but if they do, it is of the form th_t. If so, S equates with a.
Only four letters have been identified, but already we have quite a bit of the message.
Continued analysis of frequencies plus trial and error should easily yield a solution from this
point. The complete plaintext, with spaces added between words, follows:
it was disclosed yesterday that several informal but direct contacts have been made with
political representatives of the vietcong in moscow.
Monoalphabetic ciphers are easy to break because they reflect the frequency data of the
original alphabet. A countermeasure is to provide multiple substitutes, known as
homophones, for a single letter. For example, the letter e could be assigned a number of
different cipher symbols, such as 16, 74, 35, and 21, with each homophone assigned to a
letter in rotation or randomly. If the number of symbols assigned to each letter is proportional
to the relative frequency of that letter, then single-letter frequency information is completely
obliterated. The great mathematician Carl Friedrich Gauss believed that he had devised an
unbreakable cipher using homophones. However, even with homophones, each element of
plaintext affects only one element of ciphertext, and multiple-letter patterns (e.g., digram
frequencies) still survive in the ciphertext, making cryptanalysis relatively straightforward.
Two principal methods are used in substitution ciphers to lessen the extent to which the
structure of the plaintext survives in the ciphertext: One approach is to encrypt multiple
letters of plaintext, and the other is to use multiple cipher alphabets. We briefly examine each.
Playfair Cipher:
The best-known multiple-letter encryption cipher is the Playfair, which treats digrams in the
plaintext as single units and translates these units into ciphertext digrams.
The Playfair algorithm is based on the use of a 5 x 5 matrix of letters constructed using a
keyword. Here is an example, solved by Lord Peter Wimsey in Dorothy Sayers's Have His
Carcase:
M O N A R
C H Y B D
E F G I/J K
L P Q S T
U V W X Z
In this case, the keyword is monarchy. The matrix is constructed by filling in the letters of the
keyword (minus duplicates) from left to right and from top to bottom, and then filling in the
remainder of the matrix with the remaining letters in alphabetic order. The letters I and J
count as one letter. Plaintext is encrypted two letters at a time, according to the following
rules:
1. Repeating plaintext letters that are in the same pair are separated with a
filler letter, such as x, so that balloon would be treated as ba lx lo on.
2. Two plaintext letters that fall in the same row of the matrix are each
replaced by the letter to the right, with the first element of the row circularly following the last.
For example, ar is encrypted as RM.
3. Two plaintext letters that fall in the same column are each replaced by
the letter beneath, with the top element of the column circularly following the last. For
example, mu is encrypted as CM.
4. Otherwise, each plaintext letter in a pair is replaced by the letter that
lies in its own row and the column occupied by the other plaintext letter. Thus, hs becomes BP
and ea becomes IM (or JM, as the encipherer wishes).
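The matrix construction and the same-row, same-column, and rectangle rules above can be sketched in a few lines of Python. The function names are illustrative only; the expected outputs are the digrams quoted above (ar becomes RM, mu becomes CM, hs becomes BP).

```python
# Build the 5x5 Playfair matrix from a keyword (I and J share one cell).
def playfair_matrix(keyword):
    seen = []
    for ch in (keyword + "abcdefghiklmnopqrstuvwxyz").lower():
        ch = 'i' if ch == 'j' else ch
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return [seen[r * 5:(r + 1) * 5] for r in range(5)]

def find(matrix, ch):
    ch = 'i' if ch == 'j' else ch
    for r in range(5):
        for c in range(5):
            if matrix[r][c] == ch:
                return r, c

def encrypt_digram(matrix, a, b):
    ra, ca = find(matrix, a)
    rb, cb = find(matrix, b)
    if ra == rb:                                        # rule 2: same row, take letter to the right
        return matrix[ra][(ca + 1) % 5] + matrix[rb][(cb + 1) % 5]
    if ca == cb:                                        # rule 3: same column, take letter beneath
        return matrix[(ra + 1) % 5][ca] + matrix[(rb + 1) % 5][cb]
    return matrix[ra][cb] + matrix[rb][ca]              # rule 4: rectangle rule

m = playfair_matrix("monarchy")
print(encrypt_digram(m, 'a', 'r'))   # rm
print(encrypt_digram(m, 'm', 'u'))   # cm
print(encrypt_digram(m, 'h', 's'))   # bp
```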
The Playfair cipher is a great advance over simple monoalphabetic ciphers. For one thing,
whereas there are only 26 letters, there are 26 x 26 = 676 digrams, so that identification of
individual digrams is more difficult. Furthermore, the relative frequencies
of individual letters exhibit a much greater range than that of digrams, making frequency
analysis much more difficult. For these reasons, the Playfair cipher was for a long time
considered unbreakable. It was used as the standard field system by the British Army in
World War I and still enjoyed considerable use by the U.S. Army and other Allied forces during
World War II.
Despite this level of confidence in its security, the Playfair cipher is relatively easy to break,
because it still leaves much of the structure of the plaintext language intact. A few hundred
letters of ciphertext are generally sufficient.
One way of revealing the effectiveness of the Playfair and other ciphers is shown in Figure 2.6,
based on [SIMM93]. The line labeled plaintext plots the frequency distribution of the more
than 70,000 alphabetic characters in the Encyclopaedia Britannica article on cryptology. This
is also the frequency distribution of any monoalphabetic substitution cipher, because the
frequency values for individual letters are the same, just with different letters substituted for
the original letters. The plot was developed in the following way: The number of occurrences of
each letter in the text was counted and divided by the number of occurrences of the letter e
(the most frequently used letter). As a result, e has a relative frequency of 1, t of about 0.76,
and so on. The points on the horizontal axis correspond to the letters in order of decreasing
frequency.
Figure 2.6 also shows the frequency distribution that results when the text is encrypted using
the Playfair cipher. To normalize the plot, the number of occurrences of each letter in the
ciphertext was again divided by the number of occurrences of e
in the plaintext. The resulting plot therefore shows the extent to which the frequency
distribution of letters, which makes it trivial to solve substitution ciphers, is masked by
encryption. If the frequency distribution information were totally concealed in the encryption
process, the ciphertext plot of frequencies would be flat, and cryptanalysis using ciphertext
only would be effectively impossible. As the figure shows, the Playfair cipher has a flatter
distribution than does plaintext, but nevertheless, it reveals plenty of structure for a
cryptanalyst to work with.
Transposition Techniques:
All the techniques examined so far involve the substitution of a ciphertext symbol for a
plaintext symbol. A very different kind of mapping is achieved by performing some sort of
permutation on the plaintext letters. This technique is referred to as a transposition cipher.
The simplest such cipher is the rail fence technique, in which the plaintext is written down as
a sequence of diagonals and then read off as a sequence of rows. For example, to encipher the
message "meet me after the toga party" with a rail fence of depth 2, we write the following:
m e m a t r h t g p r y
 e t e f e t e o a a t
The encrypted message is
MEMATRHTGPRYETEFETEOAAT
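A depth-2 rail fence is simply "every other letter" on each rail, which the following small sketch reproduces (the function name is illustrative):

```python
# Rail fence of depth 2: rail 1 holds letters 1, 3, 5, ...; rail 2 holds letters 2, 4, 6, ...
def rail_fence_2(plaintext):
    letters = [c for c in plaintext if c.isalpha()]
    rail1 = letters[0::2]
    rail2 = letters[1::2]
    return ''.join(rail1 + rail2).upper()

print(rail_fence_2("meet me after the toga party"))
# MEMATRHTGPRYETEFETEOAAT
```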
This sort of thing would be trivial to cryptanalyze. A more complex scheme is to write the
message in a rectangle, row by row, and read the message off, column by column, but
permute the order of the columns. The order of the columns then becomes the key to the
algorithm. For example,
Key:        4 3 1 2 5 6 7
Plaintext:  a t t a c k p
            o s t p o n e
            d u n t i l t
            w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
Thus, in this example, the key is 4312567. To encrypt, start with the column that is labeled 1,
in this case column 3. Write down all the letters in that column. Proceed to column 4, which
is labeled 2, then column 2, then column 1, then columns 5, 6, and 7.
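A short sketch of this columnar transposition follows, using the example key and plaintext shown above (the function name is illustrative):

```python
# Write the message row by row under the key, then read the columns in key order.
def columnar_encrypt(plaintext, key):
    ncols = len(key)
    rows = [plaintext[i:i + ncols] for i in range(0, len(plaintext), ncols)]
    order = sorted(range(ncols), key=lambda c: key[c])   # column labeled 1 first, then 2, ...
    return ''.join(row[c] for c in order for row in rows if c < len(row)).upper()

ct1 = columnar_encrypt("attackpostponeduntiltwoamxyz", "4312567")
print(ct1)                                        # TTNAAPTMTSUOAODWCOIXKNLYPETZ
print(columnar_encrypt(ct1.lower(), "4312567"))   # NSCYAUOPTTWLTMDNAOIEPAXTTOKZ (double transposition)
```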
A pure transposition cipher is easily recognized because it has the same letter frequencies as
the original plaintext. For the type of columnar transposition just shown, cryptanalysis is
fairly straightforward and involves laying out the ciphertext in a matrix and playing around
with column positions. Digram and trigram frequency tables can be useful.
The transposition cipher can be made significantly more secure by performing more than one
stage of transposition. The result is a more complex permutation that is not easily
reconstructed. Thus, if the foregoing message is reencrypted using the same algorithm, the
result is NSCYAUOPTTWLTMDNAOIEPAXTTOKZ.
To visualize the result of this double transposition, designate the letters in the original
plaintext message by the numbers designating their position. Thus, with 28 letters in the
message, the original sequence of letters is
01 02 03 04 05 06 07 08 09 10 11 12 13 14
15 16 17 18 19 20 21 22 23 24 25 26 27 28
After the first transposition, we have
03 10 17 24 04 11 18 25 02 09 16 23 01 08
15 22 05 12 19 26 06 13 20 27 07 14 21 28
which has a somewhat regular structure. But after the second transposition, we have
17 09 05 27 24 16 12 07 10 02 22 20 03 25
15 13 04 23 19 14 11 01 26 21 18 08 06 28
STEGANOGRAPHY:
A plaintext message may be hidden in one of two ways. The methods
of steganography conceal the existence of the message, whereas the methods of cryptography
render the message unintelligible to outsiders by various transformations of the
text.
A simple form of steganography, but one that is time-consuming to construct, is one in which
an arrangement of words or letters within an apparently innocuous text spells out the real
message. For example, the sequence of first letters of each word of the overall message spells
out the hidden message. Figure 2.9 shows an example in which a subset of the words of the
overall message is used to convey the hidden message. See if you can decipher this; it's not
too hard.
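The "first letter of each word" scheme mentioned above is easy to illustrate; the cover sentence below is made up purely for the example:

```python
# Hide a message in the first letter of each word of an innocuous-looking sentence.
cover = "Send each cash reserve early Tuesday"
hidden = ''.join(word[0] for word in cover.split()).upper()
print(hidden)   # SECRET
```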
Block cipher vs Stream cipher:
The main difference between a block cipher and a stream cipher is that a block cipher converts
the plain text into cipher text by taking one block of plain text at a time, while a stream
cipher converts the plain text into cipher text by taking 1 byte (or bit) of plain text at a time.
A block cipher is an encryption algorithm that encrypts a fixed size of n-bits of data - known as
a block - at one time. The usual sizes of each block are 64 bits, 128 bits, and 256 bits. So for
example, a 64-bit block cipher will take in 64 bits of plaintext and encrypt it into 64 bits of
ciphertext. In cases where the plaintext is shorter than the block size, padding schemes are
called into play. The majority of the symmetric ciphers used today are actually block ciphers. DES,
Triple DES, AES, IDEA, and Blowfish are some of the commonly used encryption algorithms
that fall under this group.
DES - DES, which stands for Data Encryption Standard, used to be the most popular block
cipher in the world and was used in several industries. It's still popular today, but only
because it's usually included in historical discussions of encryption algorithms. The DES
algorithm became a standard in the US in 1977. However, it's already been proven to be
vulnerable to brute force attacks and other cryptanalytic methods. DES is a 64-bit cipher that
works with a 64-bit key. Actually, 8 of the 64 bits in the key are parity bits, so the key size is
technically 56 bits long.
3DES - As its name implies, 3DES is a cipher based on DES. It's practically DES that's run
three times. Each DES operation can use a different key, with each key being 56 bits long. Like
DES, 3DES has a block size of 64 bits. Although 3DES is many times stronger than DES, it is
also much slower (about 3x slower). Because many organizations found 3DES to be too slow for
many applications, it never became the ultimate successor of DES. That distinction is reserved
for the next cipher in our list - AES.
AES - A US Federal Government standard since 2002, AES or Advanced Encryption Standard
is arguably the most widely used block cipher in the world. It has a block size of 128 bits and
supports three possible key sizes - 128, 192, and 256 bits. The longer the key size, the stronger
the encryption. However, longer keys also result in longer processes of encryption. For a
discussion on encryption key lengths, read Choosing Key Lengths for Encrypted File Transfers.
Blowfish - This is another popular block cipher (although not as widely used as AES). It has a
block size of 64 bits and supports a variable-length key that can range from 32 to 448 bits.
One thing that makes blowfish so appealing is that Blowfish is unpatented and royalty-free.
Twofish - Yes, this cipher is related to Blowfish but it's not as popular (yet). It's a 128-bit block
cipher that supports key sizes up to 256 bits long.
A stream cipher is an encryption algorithm that encrypts 1 bit or byte of plaintext at a time. It
uses an infinite stream of pseudorandom bits as the key. For a stream cipher implementation
to remain secure, its pseudorandom generator should be unpredictable and the key should
never be reused. Stream ciphers are designed to approximate an idealized cipher, known as the
One-Time Pad.
The One-Time Pad, which is supposed to employ a purely random key, can potentially achieve
"perfect secrecy". That is, it's supposed to be fully immune to brute force attacks. The problem
with the one-time pad is that, in order to create such a cipher, its key should be as long or
even longer than the plaintext. In other words, if you have 500 MegaByte video file that you
would like to encrypt, you would need a key that's at least 4 Gigabits long.
Clearly, while Top Secret information or matters of national security may warrant the use of a
one-time pad, such a cipher would just be too impractical for day-to-day public use. The key of
a stream cipher is no longer as long as the original message. Hence, it can no longer guarantee
"perfect secrecy". However, it can still achieve a strong level of security.
RC4 - RC4, which stands for Rivest Cipher 4, is the most widely used of all stream ciphers,
particularly in software. It's also known as ARCFOUR or ARC4. RC4 stream ciphers have been
used in various protocols like WEP and WPA (both security protocols for wireless networks) as
well as in TLS. Unfortunately, recent studies have revealed vulnerabilities in RC4, prompting
Mozilla and Microsoft to recommend that it be disabled where possible. In fact, RFC
7465 prohibits the use of RC4 in all versions of TLS.
These recent findings will surely allow other stream ciphers (e.g. SALSA, SOSEMANUK,
PANAMA, and many others, which already exist but never gained the same popularity as RC4)
to emerge and possibly take its place.
Encryption Process
The encryption process uses the Feistel structure, consisting of multiple rounds of processing
of the plaintext, each round consisting of a "substitution" step followed by a permutation step.
The Feistel structure is shown in the following illustration −
The input block to each round is divided into two halves that can be denoted as L and
R for the left half and the right half.
In each round, the right half of the block, R, goes through unchanged. But the left half,
L, goes through an operation that depends on R and the encryption key. First, we
apply an encrypting function 'f' that takes two inputs − the key K and R. The function
produces the output f(R,K). Then, we XOR the output of the mathematical function
with L.
In a real implementation of the Feistel Cipher, such as DES, instead of using the whole
encryption key during each round, a round-dependent key (a subkey) is derived from
the encryption key. This means that each round uses a different key, although all
these subkeys are related to the original key.
The permutation step at the end of each round swaps the modified L and unmodified R.
Therefore, the L for the next round would be R of the current round, and R for the next
round would be the modified L of the current round.
The above substitution and permutation steps form a 'round'. The number of rounds is
specified by the algorithm design.
Once the last round is completed, the two sub-blocks, 'R' and 'L', are concatenated in this
order to form the ciphertext block.
The difficult part of designing a Feistel Cipher is the selection of the round function 'f'. For the
scheme to be unbreakable, this function needs to have several important properties.
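The round structure just described (new L = old R, new R = L XOR f(R, K)) can be shown with a toy 32-bit sketch. The round function and subkeys below are made-up toy choices, not DES's f; the point is only that the same structure with reversed subkeys decrypts.

```python
# Toy Feistel network on a 32-bit block split into two 16-bit halves.
def f(right, subkey):
    return (right * 7 + subkey) & 0xFFFF          # toy round function, NOT DES's f

def feistel_encrypt(block32, subkeys):
    left, right = block32 >> 16, block32 & 0xFFFF
    for k in subkeys:
        left, right = right, left ^ f(right, k)   # one round: swap + XOR with f(R, K)
    right, left = left, right                     # final swap so the same routine inverts itself
    return (left << 16) | right

def feistel_decrypt(block32, subkeys):
    return feistel_encrypt(block32, list(reversed(subkeys)))   # same structure, subkeys reversed

keys = [0x1A2B, 0x3C4D, 0x5E6F, 0x7081]
ct = feistel_encrypt(0x12345678, keys)
assert feistel_decrypt(ct, keys) == 0x12345678
```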
Decryption Process
The process of decryption in Feistel cipher is almost similar. Instead of starting with a block of
plaintext, the ciphertext block is fed into the start of the Feistel structure and then the
process thereafter is exactly the same as described in the given illustration.
The process is said to be almost similar and not exactly the same. In the case of decryption,
the only difference is that the subkeys used in encryption are used in the reverse order.
The final swapping of 'L' and 'R' in the last step of the Feistel Cipher is essential. If these are
not swapped, then the resulting ciphertext cannot be decrypted using the same algorithm.
Number of Rounds
The number of rounds used in a Feistel Cipher depends on the desired security from the
system. A greater number of rounds provides a more secure system, but at the same time more
rounds mean slower encryption and decryption. The number of rounds thus depends upon the
efficiency–security tradeoff.
DES is an implementation of a Feistel Cipher. It uses a 16-round Feistel structure. The block
size is 64 bits. Though the key length is 64 bits, DES has an effective key length of 56 bits,
since 8 of the 64 bits of the key are not used by the encryption algorithm (they function as
check bits only).
General Structure of DES is depicted in the following illustration.
Since DES is based on the Feistel Cipher, all that is required to specify DES is −
Round function
Key schedule
Any additional processing − Initial and final permutation
The initial and final permutations are straight Permutation boxes (P-boxes) that are inverses of
each other. They have no cryptographic significance in DES. The initial and final permutations
are shown as follows −
Round Function
The heart of this cipher is the DES function, f. The DES function applies a 48-bit key to the
rightmost 32 bits to produce a 32-bit output.
Expansion Permutation Box − Since the right input is 32 bits and the round key is 48 bits, we
first need to expand the right input to 48 bits. The permutation logic is graphically depicted in
the following illustration −
The graphically depicted permutation logic is generally described as a table in the DES
specification, illustrated as shown −
XOR (Whitener). − After the expansion permutation, DES does XOR operation on the
expanded right section and the round key. The round key is used only in this
operation.
Substitution Boxes. − The S-boxes carry out the real mixing (confusion). DES uses 8
S-boxes, each with a 6-bit input and a 4-bit output. Refer to the following illustration −
There are a total of eight S-box tables. The output of all eight S-boxes is then combined
into a 32-bit section.
Straight Permutation − The 32-bit output of the S-boxes is then subjected to the straight
permutation with the rule shown in the following illustration:
Key Generation
The round-key generator creates sixteen 48-bit keys out of a 56-bit cipher key. The process of
key generation is depicted in the following illustration –
The logic for Parity drop, shifting, and Compression P-box is given in the DES description.
DES Analysis
DES satisfies both of the desired properties of a block cipher. These two properties make the
cipher very strong.
Avalanche effect − A small change in the plaintext results in a very large change in the
ciphertext.
Completeness − Each bit of ciphertext depends on many bits of plaintext.
During the last few years, cryptanalysts have found some weaknesses in DES when the keys
selected are weak keys. These keys should be avoided.
DES has proved to be a very well designed block cipher. There have been no significant
cryptanalytic attacks on DES other than exhaustive key search.
The speed of exhaustive key searches against DES after 1990 began to cause discomfort
amongst users of DES. However, users did not want to replace DES as it takes an enormous
amount of time and money to change encryption algorithms that are widely adopted and
embedded in large security architectures.
The pragmatic approach was not to abandon the DES completely, but to change the manner
in which DES is used. This led to the modified schemes of Triple DES (sometimes known as
3DES).
Incidentally, there are two variants of Triple DES known as 3-key Triple DES (3TDES) and 2-
key Triple DES (2TDES).
Before using 3TDES, the user first generates and distributes a 3TDES key K, which consists of
three different DES keys K1, K2 and K3. This means that the actual 3TDES key has length
3×56 = 168 bits. The encryption scheme is illustrated as follows −
The second variant of Triple DES (2TDES) is identical to 3TDES except that K3 is replaced by
K1. In other words, the user encrypts plaintext blocks with key K1, then decrypts with key K2,
and finally encrypts with K1 again. Therefore, 2TDES has a key length of 112 bits.
Triple DES systems are significantly more secure than single DES, but they are clearly a
much slower process than encryption using single DES.
The process of encrypting a plain text into an encrypted message with the use of S-DES has
been divided into multiple steps, which may help you to understand it as easily as possible.
1. It is a block cipher.
2. It has an 8-bit block size of plain text or cipher text.
3. It uses a 10-bit key for encryption.
4. It is a symmetric cipher.
5. It has Two Rounds.
First and foremost, we need to generate a key. With the help of this key we will encrypt the
message.
Now the interesting question is, how to generate the key, and where the key is to be used.
Step 1:
Just select a random key of 10 bits, which should be shared only between the two parties,
i.e., the sender and receiver.
Select key: 1010000010
You can select any random 10-bit number.
Step 2:
Put this key into P.10 Table and permute the bits.
P.10 Table:
Bit position:      1 2 3 4 5 6 7 8 9 10
Output should be:  3 5 2 7 4 10 1 9 8 6
Putting the key into the P.10 Table:
Input:   1 0 1 0 0 0 0 0 1 0
Output:  1 0 0 0 0 0 1 1 0 0
Now the output will be:
Key: 1000001100
Step 3:
Divide the key into two halves, left half and right half;
{1 0 0 0 0} | {0 1 1 0 0}
Step 4:
Apply a one-bit circular left shift to each half:
Left half:  1 0 0 0 0 becomes 0 0 0 0 1
Right half: 0 1 1 0 0 becomes 1 1 0 0 0
Step 5:
Now combine both halves of the shifted bits, left and right, and put them into the P8 table.
What you get will be K1, the first key.
Combine: 0 0 0 0 1 1 1 0 0 0
Permute through the 8-bit table:
P8 Table
Bit position:      1 2 3 4 5 6 7 8 9 10
Output should be:  6 3 7 4 8 5 10 9
Combined bits:     0 0 0 0 1 1 1 0 0 0
Output bits (K1):  1 0 1 0 0 1 0 0
See the table: bits 1 and 2 are dropped and the others are permuted, with bit 6 in place of
position 1, bit 9 in place of position 8, and so on.
Step 6:
As we know, S-DES has two rounds, and for that we also need two keys. One key we generated
in the above steps (step 1 to step 5). Now we need to generate a second key, and after that we
will move on to encrypting the plain text or message.
It is simple to generate the second key. Simply go back to step 4 and copy both halves, each
consisting of 5 bits. But be careful about which bits you take: select the halves which are the
output of the first round shift, not the unshifted bits. In simple words, take the output of the
first round shift in step 4 above.
Step 7:
Now apply a two-bit circular left shift to each half, which means changing the position of two
bits in each half.
Left half:  00001
Right half: 11000
After the two-bit shift, the output of each half will be:
Left half:  00100
Right half: 00011
Combine both together: 0 0 1 0 0 – 0 0 0 1 1
Step 8:
Now put the bits into the P8 table; what you get will be your second key. The table is the same
one given in step 5, but here the combination of bits has changed because of the two-bit left
shift in step 7.
Combined bits: 0 0 1 0 0 0 0 0 1 1
P8 Table
Bit position:      1 2 3 4 5 6 7 8 9 10
Output should be:  6 3 7 4 8 5 10 9
Combined bits:     0 0 1 0 0 0 0 0 1 1
Output bits (K2):  0 1 0 0 0 0 1 1
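The whole key-generation procedure above (P10, split, circular left shifts, P8) fits in a few lines of Python. This is a sketch of the steps just described; the function names are illustrative only.

```python
# S-DES key schedule: K1 = P8(LS-1(P10(key))), K2 = P8(LS-3(P10(key))).
P10 = (3, 5, 2, 7, 4, 10, 1, 9, 8, 6)
P8  = (6, 3, 7, 4, 8, 5, 10, 9)

def permute(bits, table):
    # table entries are 1-based positions into `bits`
    return [bits[i - 1] for i in table]

def left_shift(half, n):
    return half[n:] + half[:n]            # circular left shift by n

def sdes_keys(key10):
    bits = [int(b) for b in key10]
    p10 = permute(bits, P10)
    left, right = p10[:5], p10[5:]
    left, right = left_shift(left, 1), left_shift(right, 1)   # step 4: one-bit shift
    k1 = permute(left + right, P8)                            # step 5: P8 -> K1
    left, right = left_shift(left, 2), left_shift(right, 2)   # step 7: two-bit shift
    k2 = permute(left + right, P8)                            # step 8: P8 -> K2
    return ''.join(map(str, k1)), ''.join(map(str, k2))

print(sdes_keys("1010000010"))   # ('10100100', '01000011'), matching K1 and K2 above
```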
How To Encrypt the Plain Text into Cipher Text in S-DES After Generating Keys
Note: the size of the input text is 8 bits and the output will also be 8 bits; the block size is
always 8 bits (one byte).
Step 1:
Take an 8-bit block of plain text. Here we will encrypt the plain text 0 1 1 1 0 0 1 0.
Step 2:
Put the plain text into IP-8(initial permutation) table and permute the bits.
IP-8 Table
Bit position:         1 2 3 4 5 6 7 8
Bits to be permuted:  0 1 1 1 0 0 1 0
Permutation order:    2 6 3 1 4 8 5 7
Permuted bits:        1 0 1 0 1 0 0 1
Output: 1 0 1 0 1 0 0 1
Step 3:
Now break the bits into two halves; each half will consist of 4 bits.
Two halves of the bits:
Left half:  1 0 1 0
Right half: 1 0 0 1
Step 4:
Take the right 4 bits and put them into the E.P (expand and permute) table.
Bits of right half: 1001
E.P Table
The output of the right four bits will be 8 bits, after their expansion with the help of the E.P
table.
Output: 0 1 0 0 0 0 0 1
Step 5: Now just take the output and XOR it with the first key, K1 (which we created in the
key-generation steps above).
Output: 0 1 0 0 0 0 0 1
⊕
K1:     1 0 1 0 0 1 0 0
Output of XOR: 1 1 1 0 0 1 0 1
Step 6:
Once again split the output of the XOR into two halves; each half will consist of 4 bits.
Left half:  1 1 1 0
Right half: 0 1 0 1
S-0
Row\Col   0    1    2    3
0         01   00   11   10
1         11   10   01   00
2         00   10   01   11
3         11   01   11   10
S-1
Row\Col   0    1    2    3
0         00   01   10   11
1         10   00   01   11
2         11   00   01   00
3         10   01   00   11
Note: put the left half into the S-0 box and put the right half into the S-1 box.
Take either half (but don't forget the above-mentioned note).
The first and last bits are taken as the row number, and the remaining two bits (bits 2 and 3)
are taken as the column number.
Here I am taking the left half, which is 1 1 1 0.
The first and last bits are 1 and 0; these give the row.
The 2nd and 3rd bits are 1 and 1; these give the column.
1 0 means row 2
1 1 means column 3
(Remember binary place values: 1 0 = 1×2 + 0 = 2, and 1 1 = 1×2 + 1 = 3.)
For the left half we check S-0, at row 2 and column 3.
For the right half, the row and column are found in the same way and looked up in S-1.
The output will be 11 for the left half.
Step 7:
Combine the outputs of the two S-boxes into a single 4-bit value.
Step 8: Now take these 4 bits and put them in the P-4 (permutation 4) table and get the result.
P-4 Table
Bit position:      1 2 3 4
Input:             0 0 1 1
Output should be:  2 4 3 1
Output:            0 1 1 0
Step 9:
Now XOR the output with the left 4 bits of the initial permutation. The left bits of the initial
permutation are in step 3, which are 1 0 1 0.
Let them be XORed:
0 1 1 0
⊕
1 0 1 0
= 1 1 0 0
Step 10:
Now get the right half of the initial permutation, which is in step 3 (1 0 0 1), and combine it
with this output: 1 1 0 0 1 0 0 1
Step 11: Now once again break the output into two halves, left and right;
Left: {1 1 0 0} right: {1 0 0 1}
Step 12:
Now swap both halves, which means put the left half in place of right and vice versa.
Result:
Left half: {1 0 0 1} right half: {1 1 0 0}
Step 13:
Now take these halves and once again run the same round procedure (the E.P expansion, XOR,
S-boxes, and P-4), BUT be careful about the key: in this stage we use the second key, K2 (not
K1). Then put the result into the IP-1 (IP inverse) table. What you get will be your final cipher
text.
Ø Now take the right 4 bits and put them into the E.P table, and get the result of 8 bits.
It will be: 0 1 0 1 0 1 0 1
Ø Now XOR the output of the E.P table with K2 (0 1 0 0 0 0 1 1).
Output of E.P: 0 1 0 1 0 1 0 1
Output of XOR: 0 0 0 1 0 1 1 0
Ø Once again split the output of the XOR into two halves:
Left: {0 0 0 1} right: {0 1 1 0}
Ø Now put each half in S-Boxes, which are S-0 and S-1:
Note: put the left half into S-0 box and put the right half into S-1 Box.
Left: 0 0 0 1
Row: 0 1 = 1
Col: 0 0 = 0
Let's find the row and column of the left half in S-0: the value at row 1, column 0.
It will be: 1 1
Ø Now check the right half: 0 1 1 0
Row: 0 0 = 0
Col: 1 1 = 3
Let's find the row and column of the right half in S-1: the value at row 0, column 3.
It will be: 1 1
Ø Now combine both halves together.
It will be: 1 1 1 1
Ø Now take these 4 bits and put them in the P-4 (permutation 4) table and get the result.
Output: 1 0 1 0
Ø Now get the right half of the initial permutation and combine that with this output:
1 0 1 0 – 0 1 1 0
Ø Now once again break it into two halves, left and right:
Left: {1 0 1 0} right: {0 1 1 0}
Ø Now swap both halves, which means put the left half in place of right and vice versa.
01101010
IP-1 Table
Bit position:       1 2 3 4 5 6 7 8
Input:              0 1 1 0 1 0 1 0
Output should be:   2 6 3 1 4 8 5 7
Output:             1 0 1 0 0 0 1 1
Finally, we have successfully encrypted our plain text 01110010 into the cipher text 10100011.
The Data Encryption Standard (DES) is a symmetric key block cipher which takes 64-bit plaintext
and 56-bit key as an input and produces 64-bit cipher text as output. The DES function is made
up of P and S-boxes. P-boxes transpose bits and S-boxes substitute bits to generate a cipher.
Advantages of CBC –
CBC works well for input greater than b bits.
CBC is a good authentication mechanism.
Better resistive nature towards cryptanalysis than ECB.
Disadvantages of CBC –
Parallel encryption is not possible, since every encryption requires the previous cipher block.
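The CBC chaining just described is what the "modes" layer of a crypto library implements. Below is a hedged sketch of AES in CBC mode using the third-party Python `cryptography` package; the key, IV, and one-block message are throwaway values for illustration, and real data would also need a padding scheme.

```python
import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

key = os.urandom(16)                 # 128-bit AES key (illustrative only)
iv = os.urandom(16)                  # CBC needs a fresh one-block IV per message
cipher = Cipher(algorithms.AES(key), modes.CBC(iv))

plaintext = b"exactly 16 bytes"      # one full block; shorter data would require padding
encryptor = cipher.encryptor()
ciphertext = encryptor.update(plaintext) + encryptor.finalize()

decryptor = cipher.decryptor()
assert decryptor.update(ciphertext) + decryptor.finalize() == plaintext
```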
Advantages of CFB –
Since there is some data loss due to the use of the shift register, it is difficult to apply
cryptanalysis.
Galois/Counter Mode (GCM)
Like in normal counter mode, blocks are numbered sequentially, and then this block number is
combined with an initialization vector (IV) and encrypted with a block cipher E, usually AES.
The result of this encryption is then XORed with the plaintext to produce the ciphertext. Like
all counter modes, this is essentially a stream cipher, and so it is essential that a different IV is
used for each stream that is encrypted.
The ciphertext blocks are considered coefficients of a polynomial which is then evaluated at a
key-dependent point H, using finite field arithmetic. The result is then encrypted, producing an
authentication tag that can be used to verify the integrity of the data. The encrypted text then
contains the IV, ciphertext, and authentication tag.
GCM requires one block cipher operation and one 128-bit multiplication in the Galois field per
each block (128 bit) of encrypted and authenticated data. The block cipher operations are
easily pipelined or parallelized; the multiplication operations are easily pipelined and can be
parallelized with some modest effort (either by parallelizing the actual operation, by
adapting Horner's method per the original NIST submission, or both).
GCM is proven secure in the concrete security model.[19] It is secure when it is used with a
block cipher that is indistinguishable from a random permutation; however, security depends
on choosing a unique initialization vector for every encryption performed with the same key
(see stream cipher attack). For any given key and initialization vector combination, GCM is
limited to encrypting 2^39 − 256 bits of plain text (64 GiB). NIST Special Publication 800-
38D[3] includes guidelines for initialization vector selection.
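In practice GCM is used through an authenticated-encryption API. The sketch below uses the third-party Python `cryptography` package's AESGCM helper; the key, nonce, and messages are illustrative assumptions, and the unique-nonce requirement discussed above still applies.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=128)
nonce = os.urandom(12)                        # 96-bit IV; must be unique per message under this key
aad = b"header"                               # authenticated but not encrypted (associated data)

ct = AESGCM(key).encrypt(nonce, b"secret message", aad)   # returns ciphertext || 16-byte tag
pt = AESGCM(key).decrypt(nonce, ct, aad)                   # raises InvalidTag if anything was tampered with
assert pt == b"secret message"
```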
AES CIPHER
The more popular and widely adopted symmetric encryption algorithm likely to be
encountered nowadays is the Advanced Encryption Standard (AES). It is found to be at least six
times faster than triple DES.
A replacement for DES was needed as its key size was too small. With increasing computing
power, it was considered vulnerable against exhaustive key search attack. Triple DES was
designed to overcome this drawback, but it was found to be slow.
Operation of AES
Interestingly, AES performs all its computations on bytes rather than bits. Hence, AES treats
the 128 bits of a plaintext block as 16 bytes. These 16 bytes are arranged in four columns and
four rows for processing as a matrix −
Unlike DES, the number of rounds in AES is variable and depends on the length of the key.
AES uses 10 rounds for 128-bit keys, 12 rounds for 192-bit keys and 14 rounds for 256-bit
keys. Each of these rounds uses a different 128-bit round key, which is calculated from the
original AES key.
Encryption Process
Here, we restrict ourselves to a description of a typical round of AES encryption. Each round
comprises four sub-processes. The first round process is depicted below −
The 16 input bytes are substituted by looking up a fixed table (S-box) given in design. The
result is in a matrix of four rows and four columns.
Shiftrows
Each of the four rows of the matrix is shifted to the left. Any entries that 'fall off' are re-
inserted on the right side of the row. The shift is carried out as follows −
First row is not shifted.
Second row is shifted one (byte) position to the left.
Third row is shifted two positions to the left.
Fourth row is shifted three positions to the left.
The result is a new matrix consisting of the same 16 bytes but shifted with respect to
each other.
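A minimal sketch of the ShiftRows step on such a 4x4 state (the sample byte values are arbitrary):

# ShiftRows: row r of the state is rotated r byte positions to the left.
def shift_rows(state):
    return [row[r:] + row[:r] for r, row in enumerate(state)]

state = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
for row in shift_rows(state):
    print(row)
# Row 0 is unchanged, row 1 rotated by one, row 2 by two, row 3 by three positions.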
MixColumns
Each column of four bytes is now transformed using a special mathematical function. This
function takes as input the four bytes of one column and outputs four completely new bytes,
which replace the original column. The result is another new matrix consisting of 16 new
bytes. It should be noted that this step is not performed in the last round.
Addroundkey
The 16 bytes of the matrix are now considered as 128 bits and are XORed to the 128 bits of
the round key. If this is the last round then the output is the ciphertext. Otherwise, the
resulting 128 bits are interpreted as 16 bytes and we begin another similar round.
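AddRoundKey itself is just a byte-wise XOR, as the following sketch shows (state and round-key values are arbitrary):

# AddRoundKey: the 16 state bytes are XORed byte-by-byte with the 128-bit round key.
def add_round_key(state_bytes, round_key):
    return bytes(s ^ k for s, k in zip(state_bytes, round_key))

state = bytes(range(16))             # arbitrary state bytes
round_key = bytes([0xAA] * 16)       # an arbitrary 128-bit round key
print(add_round_key(state, round_key).hex())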
Decryption Process
The process of decryption of an AES ciphertext is similar to the encryption process in the
reverse order. Each round consists of the four processes conducted in the reverse order −
AES Analysis
In present day cryptography, AES is widely adopted and supported in both hardware and
software. To date, no practical cryptanalytic attack against AES has been discovered.
Additionally, AES has built-in flexibility of key length, which allows a degree of 'future-
proofing' against progress in the ability to perform exhaustive key searches.
However, just as for DES, the AES security is assured only if it is correctly implemented and
good key management is employed.
UNIT-II
Public-key cryptography is a set of well-established techniques and standards for protecting
communications from eavesdropping, tampering, and impersonation attacks. Encryption and
decryption allow two communicating parties to disguise information they send to each
other. Public-key encryption (also called asymmetric encryption) involves a pair of keys, a
public key and a private key, associated with an entity. Each public key is published, and the
corresponding private key is kept secret. Data encrypted with a public key can be decrypted
only with the corresponding private key.
Breaking an encryption algorithm essentially means finding the key needed to recover the encrypted data
as plaintext. For symmetric algorithms, breaking the algorithm usually means trying to
determine the key used to encrypt the text. For a public-key algorithm, breaking the algorithm
usually means recovering the shared secret information exchanged between the two parties.
One method of breaking a symmetric algorithm is simply to try every possible key with the full
algorithm until the right key is found. For public-key algorithms, since half of the key pair is
publicly known, an attacker can try to derive the other half (the private key) from it using
published, though complex, mathematical calculations. Exhaustively trying every key until the
right one is found is called a brute force attack.
For symmetric keys, encryption strength is often described in terms of the size or length of the
keys used to perform the encryption: longer keys generally provide stronger encryption. Key
length is measured in bits. For example, 128-bit keys with the RC4 symmetric-key cipher
supported by SSL provide significantly better cryptographic protection than 40-bit keys used
with the same cipher. The 128-bit RC4 encryption is 3 × 10^26 times stronger than 40-bit RC4
encryption.
An encryption key is considered full strength if the best known attack to break the key is no
faster than a brute force attempt to test every key possibility.
Different types of algorithms — particularly public key algorithms — may require different key
lengths to achieve the same level of encryption strength as a symmetric-key cipher. The RSA
cipher can use only a subset of all possible values for a key of a given length, due to the nature
of the mathematical problem on which it is based. Other ciphers, such as those used for
symmetric-key encryption, can use all possible values for a key of a given length. More possible
key values generally means more security.
Because an RSA key of a given length is much easier to break than a symmetric key of the same
length, an RSA public-key encryption cipher must have a very long key, at least 1024 bits, to be
considered cryptographically strong. On the
other hand, symmetric-key ciphers are reckoned to be equivalently strong using a much
shorter key length, as little as 80 bits for most algorithms.
The algorithms for setting up a PKC system are such that even if the public key is known to a
third party, it is nearly impossible to get the private key. Thus, because of this, the public key
of the recipient can be disseminated widely and yet the whole process remains secure. It is
important to note that each set of public key and private key is unique and differs from sender
to sender.
In a symmetric system of n users, whenever a new user joins the system, he/she must establish
a common key with each of the previous users. Thus the number of keys amounts to n(n−1)/2.
Whereas in an asymmetrical system of n users, whenever a new user joins the system, he/she
must issue a separate private key and public key. The number of keys amounts to 2n keys
only.
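As a quick illustration of the difference, the following short sketch computes both counts for a few sample values of n (the sample values are arbitrary):

# Number of keys needed for n users: pairwise secret keys versus one key pair per user.
def symmetric_keys(n):
    return n * (n - 1) // 2          # n(n-1)/2 shared secret keys

def asymmetric_keys(n):
    return 2 * n                     # one public and one private key per user

for n in (10, 100, 1000):
    print(n, symmetric_keys(n), asymmetric_keys(n))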
Both symmetric and asymmetric systems depend on careful key management. A major part of key
management is to ensure the safety and security of the whole encryption and decryption process.
In symmetric forms of encryption, the same key is used for encryption as well as decryption,
whereas in asymmetric encryption there are separate keys for encryption and decryption.
Public key cryptography systems are mainly based on three mathematical models: integer
factorization, discrete logarithms, and elliptic curve problems.
The RSA system was created by Ron Rivest, Adi Shamir, and Leonard Adleman, from whose
initials its name was coined in 1977. It is based on an integer factorization model and on the
impracticality of factoring large numbers. For example, if n = p*q, then it is easy to find n
knowing p and q, but the reverse is difficult.
The RSA system has three main parts to it — key generation, encryption, and decryption.
The GCD of e and (p-1)*(q-1) should be 1, that is, they should be co-prime numbers.
Form the public key: The public key is represented by (n,e). Although an external
party who has access to the public key will know n, finding p and q from it remains
infeasible.
The private key is represented by (n,d) where d is the multiplicative inverse of e modulo
(p-1)*(q-1).
This is represented as: d*e ≡ 1 (mod (p-1)*(q-1))
Example:
For convenience, we have taken small numbers as examples but in reality, larger numbers are
taken (typically from the range of 1024 to 4096 bits).
n = 61*53 = 3233
λ(n) = lcm(60, 52) = 780 (the product (p-1)*(q-1) = 3120 may be used instead)
Let e = 17, as e lies between 1 and 780 and is coprime with 780.
Encryption:
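The encryption step itself is not worked out in the text, so here is a minimal Python sketch with the small numbers above; the message m = 65 is a hypothetical value, and the three-argument pow with exponent -1 needs Python 3.8 or later.

# RSA key generation, encryption and decryption with the small example numbers.
p, q, e = 61, 53, 17
n = p * q                            # 3233; the public key is (n, e)
d = pow(e, -1, 780)                  # private exponent: inverse of e modulo lcm(60, 52) = 780

m = 65                               # plaintext, 0 <= m < n (hypothetical value)
c = pow(m, e, n)                     # encryption: c = m^e mod n
assert pow(c, d, n) == m             # decryption: m = c^d mod n
print(c, d)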
Double security:
RSA also offers a double security by enabling Casey to imprint a digital signature with the
encrypted message to Hannah. Sometimes a conundrum can arise where Hannah receives an
encrypted message but is not sure whether it is from Casey; that's when the digital signature comes
in handy.
RSA Analysis:
The RSA system makes use of the fact that it is based on an integer factorization
model and on the impracticality of factoring large numbers. For example, if n=p*q
then it is easy to find n on knowing p and q but the reverse is difficult.
It is the most popular encryption system today, although programs are emerging that can
feasibly factor large numbers, which could make the security of RSA encryption unreliable.
However, the largest RSA modulus publicly cracked so far was 768 bits long, while typical
keys are about 1024 to 4096 bits long. Hence, it is not considered unreliable yet.
Diffie-Hellman algorithm :
The Diffie-Hellman algorithm was developed by Whitfield Diffie and Martin Hellman in 1976.
This algorithm was devised not to encrypt data but to generate the same secret cryptographic
key at both ends, so that there is no need to transfer this key from one communication end to
the other.
Symmetric encryption of data requires the transfer of a secret cryptographic key. The most
challenging part of this type of encryption is transferring the encryption key from sender to
receiver without anyone intercepting it in between. This transfer, or rather the generation of the
same cryptographic key at both sides secretly, is what the Diffie-Hellman algorithm makes
possible.
This algorithm uses modular arithmetic as the basis of its calculation. Suppose Alice and Bob
follow this key exchange procedure with Eve acting as a man-in-the-middle interceptor (or the bad
guy).
Here are the calculation steps followed in this algorithm that make sure that Eve never gets to
know the final keys through which the actual encryption of data takes place.
First, both Alice and Bob agree upon a prime number and another number (a generator
modulo that prime). Let's call the prime number p and the other number g. Note
that g is also known as the generator and p is known as the prime modulus.
Now, since Eve is sitting in between and listening to this communication, Eve also gets
to know p and g.
Now, modular arithmetic says that r = g^x mod p. So r will always be an integer
between 0 and p−1.
The first trick here is that, given x (with g and p known), it is very easy to find r. But
given r (with g and p known), it is difficult to deduce x.
One may argue that this is not that difficult to crack, but what if the value of p is a very
large prime number? Well, in that case deducing x (if r is given) becomes next to
impossible, as it would take thousands of years to crack even with
supercomputers.
This is also called the discrete logarithmic problem.
Coming back to the communication, all three (Alice, Bob, and Eve) now know g and p.
Now, Alice selects a random private number xa and calculates g^xa mod p = ra.
This resultant ra is sent over the communication channel to Bob.
Intercepting in between, Eve also comes to know ra.
Similarly, Bob selects his own random private number xb, calculates g^xb mod p = rb,
and sends this rb to Alice through the same communication channel.
Obviously Eve also comes to know about rb.
So Eve now has information about g, p, ra and rb.
Now comes the heart of this algorithm. Alice calculates rb^xa mod p = Final key,
which is equivalent to g^(xa*xb) mod p.
Similarly Bob calculates ra^xb mod p = Final key, which is again
equivalent to g^(xb*xa) mod p.
So both Alice and Bob were able to calculate a common Final key without sharing each
other's private random numbers, and Eve, sitting in between, will not be able to determine
the Final key as the private numbers were never transferred.
Cryptographic explanation
The simplest and the original implementation [2] of the protocol uses the multiplicative group of
integers modulo p, where p is prime, and g is a primitive root modulo p. These two values are
chosen in this way to ensure that the resulting shared secret can take on any value from 1
to p−1. Here is an example of the protocol.
1. Alice and Bob publicly agree to use a modulus p = 23 and base g = 5 (which is a
primitive root modulo 23).
2. Alice chooses a secret integer a = 4, then sends Bob A = g^a mod p
o A = 5^4 mod 23 = 4
3. Bob chooses a secret integer b = 3, then sends Alice B = g^b mod p
o B = 5^3 mod 23 = 10
4. Alice computes s = B^a mod p
o s = 10^4 mod 23 = 18
5. Bob computes s = A^b mod p
o s = 4^3 mod 23 = 18
6. Alice and Bob now share a secret (the number 18).
Both Alice and Bob have arrived at the same value s, because, under mod p,
(g^a)^b = (g^b)^a. More specifically, s = (g^a)^b mod p = (g^b)^a mod p = g^(ab) mod p.
Note that only a, b, and the shared secret g^(ab) mod p = g^(ba) mod p are kept secret. All the other values –
p, g, g^a mod p, and g^b mod p – are sent in the clear. Once Alice and Bob compute the
shared secret they can use it as an encryption key, known only to them, for sending
messages across the same open communications channel.
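The toy exchange above can be reproduced in a few lines of Python; the built-in pow(base, exp, mod) performs the modular exponentiation.

# Reproducing the small Diffie-Hellman exchange (p = 23, g = 5, a = 4, b = 3).
p, g = 23, 5
a, b = 4, 3                          # Alice's and Bob's secret integers

A = pow(g, a, p)                     # Alice sends A = g^a mod p  (here 4)
B = pow(g, b, p)                     # Bob sends   B = g^b mod p  (here 10)

s_alice = pow(B, a, p)               # Alice computes B^a mod p
s_bob = pow(A, b, p)                 # Bob computes   A^b mod p
assert s_alice == s_bob == 18        # both arrive at the shared secret 18
print(s_alice)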
Of course, much larger values of a, b, and p would be needed to make this example
secure, since there are only 23 possible results of n mod 23. However, if p is a prime of
at least 600 digits, then even the fastest modern computers cannot find a given
only g, p and g^a mod p. Such a problem is called the discrete logarithm problem.[3] The
computation of g^a mod p is known as modular exponentiation and can be done
efficiently even for large numbers. Note that g need not be large at all, and in practice is
usually a small integer (like 2, 3, ...).
Secrecy chart
The chart below depicts who knows what. Here Eve is an eavesdropper – she watches what is
sent between Alice and Bob, but she does not alter the contents of their communications.
Both Alice and Bob are now in possession of the group element g^(ab), which can serve as the
shared secret key. The group G satisfies the requisite condition for secure communication if
there is not an efficient algorithm for determining g^(ab) given g, g^a, and g^b.
For example, the elliptic curve Diffie–Hellman protocol is a variant that uses elliptic curves
instead of the multiplicative group of integers modulo p. Variants using hyperelliptic
curves have also been proposed. The supersingular isogeny key exchange is a Diffie–Hellman
variant that has been designed to be secure against quantum computers.
An eavesdropper has been able to see g^a, g^b, g^c, g^(ab), g^(ac), and g^(bc), but cannot use any
combination of these to efficiently reproduce g^(abc).
To extend this mechanism to larger groups, two basic principles must be followed:
Starting with an "empty" key consisting only of g, the secret is made by raising the current
value to every participant's private exponent once, in any order (the first such
exponentiation yields the participant's own public key).
Any intermediate value (having up to N-1 exponents applied, where N is the number of
participants in the group) may be revealed publicly, but the final value (having had
all N exponents applied) constitutes the shared secret and hence must never be revealed
publicly. Thus, each user must obtain their copy of the secret by applying their own private
key last (otherwise there would be no way for the last contributor to communicate the final
key to its recipient, as that last contributor would have turned the key into the very secret
the group wished to protect).
These principles leave open various options for choosing in which order participants contribute
to keys. The simplest and most obvious solution is to arrange the N participants in a circle and
have N keys rotate around the circle, until eventually every key has been contributed to by
all N participants (ending with its owner) and each participant has contributed to N keys
(ending with their own). However, this requires that every participant perform N modular
exponentiations.
By choosing a more optimal order, and relying on the fact that keys can be duplicated, it is
possible to reduce the number of modular exponentiations performed by each participant
to log2(N) + 1 using a divide-and-conquer-style approach, given here for eight participants:
1. Participants A, B, C, and D each perform one exponentiation, yielding g^(abcd); this value is
sent to E, F, G, and H. In return, participants A, B, C, and D receive g^(efgh).
2. Participants A and B each perform one exponentiation, yielding g^(efghab), which they send
to C and D, while C and D do the same, yielding g^(efghcd), which they send to A and B.
3. Participant A performs an exponentiation, yielding g^(efghcda), which it sends to B; similarly,
B sends g^(efghcdb) to A. C and D do similarly.
4. Participant A performs one final exponentiation, yielding the secret g^(efghcdba) = g^(abcdefgh),
while B does the same to get g^(efghcdab) = g^(abcdefgh); again, C and D do similarly.
5. Participants E through H simultaneously perform the same operations using g^(abcd) as
their starting point.
Once this operation has been completed all participants will possess the secret g^(abcdefgh), but
each participant will have performed only four modular exponentiations, rather than the eight
implied by a simple circular arrangement.
Security
The protocol is considered secure against eavesdroppers if G and g are chosen properly. In
particular, the order of the group G must be large, particularly if the same group is used for
large amounts of traffic. The eavesdropper ("Eve") has to solve the Diffie–Hellman problem to
obtain g^(ab). This is currently considered difficult for groups whose order is large enough. An
efficient algorithm to solve the discrete logarithm problem would make it easy to
compute a or b and solve the Diffie–Hellman problem, making this and many other public key
cryptosystems insecure. Fields of small characteristic may be less secure.
Elliptic curves are a very important new area of mathematics which has been greatly explored
over the past few decades. They have shown tremendous potential as a tool for solving
complicated number problems and also for use in cryptography.
Elliptic curve cryptography is based on the difficulty of solving number problems involving
elliptic curves. On a simple level, these can be regarded as curves given by equations of the
form y^2 = x^3 + ax + b,
where a and b are constants. Below are some examples. In each case the graph shows all the
points with coordinates (x, y), where x and y satisfy an equation of the form shown above.
For the sake of accuracy we need to say a couple of words about the constants a and b. For an
equation of the form given above to qualify as an elliptic curve, we need
that 4a^3 + 27b^2 ≠ 0. This ensures that the curve has no singular points. Informally, it means
that the curve is nice and smooth everywhere and doesn't contain any sharp points or cusps.
In the examples above the constants a and b were chosen to be small whole numbers, but in
general they can also take on other values. (For uses in cryptography, a and b are required to
come from special sets of numbers called finite fields.)
Adding points
Given an elliptic curve, we can define the addition of two points on it as in the following
example.
Suppose we want to add two points P and Q on the curve; we would also like the answer P + Q
to lie on the elliptic curve. We join up the points P and Q with a straight line. This line generally
intersects the curve in one more place, R. We then reflect the point R in the x-axis.
We also need a definition for the sum when P = Q, to understand what we mean
by P + P = 2P. In this case we take the tangent to the curve at the point P, and then as
before find the intersection of this tangent line and the curve, before reflecting the point. This is
probably easier to understand with another graph:
The Elliptic Curve Cryptography (ECC) is modern family of public-key cryptosystems,
which is based on the algebraic structures of the elliptic curves over finite fields and on the
difficulty of the Elliptic Curve Discrete Logarithm Problem (ECDLP).
The ECC cryptography is considered a natural modern successor of the RSA cryptosystem,
because ECC uses smaller keys and signatures than RSA for the same level of security and
provides very fast key generation, fast key agreement and fast signatures.
ECC Keys
The private keys in the ECC are integers (in the range of the curve's field size, typically 256-
bit integers). Example of 256-bit ECC private key (hex encoded, 32 bytes, 64 hex digits) is:
0x51897b64e85c3f714bba707e867914295a1377a7463a9dae8ea6a8b914246319.
The key generation in the ECC cryptography is as simple as securely generating a random
integer in certain range, so it is extremely fast. Any number within the range is valid ECC
private key.
The public keys in the ECC are EC points - pairs of integer coordinates {x, y}, lying on the
curve. Due to their special properties, EC points can be compressed to just one coordinate + 1
bit (odd or even). Thus the compressed public key, corresponding to a 256-bit ECC private
key, is a 257-bit integer. Example of ECC public key (corresponding to the above private key,
encoded in the Ethereum format, as hex with prefix 02 or 03) is:
0x02f54ba86dc1ccb5bed0224d23f01ed87e4a443c47fc690d7797a13d41d2340e1a. In this
format the public key actually takes 33 bytes (66 hex digits), which can be optimized to exactly
257 bits.
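As an illustration (assuming the third-party cryptography package; not part of the text), key generation and compressed-point encoding look roughly like this:

# Generating an ECC key pair on secp256k1 and encoding the public key as a compressed point.
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import ec

private_key = ec.generate_private_key(ec.SECP256K1())    # a random integer in the curve's range
d = private_key.private_numbers().private_value          # the 256-bit private scalar
compressed = private_key.public_key().public_bytes(
    serialization.Encoding.X962,
    serialization.PublicFormat.CompressedPoint,          # 33 bytes: 02/03 prefix + x coordinate
)
print(hex(d))
print(compressed.hex())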
ECC crypto algorithms can use different underlying elliptic curves. Different curves provide
different level of security (cryptographic strength), different performance (speed) and different
key length, and also may involve different algorithms.
ECC Algorithms
Elliptic-curve cryptography (ECC) provides several groups of algorithms, based on the math
of the elliptic curves over finite fields:
ECC digital signature algorithms like ECDSA (for classical curves) and EdDSA (for
twisted Edwards curves).
ECC encryption algorithms and hybrid encryption schemes like the ECIES integrated
encryption scheme and EEECC (EC-based ElGamal).
ECC key agreement algorithms like ECDH, X25519 and FHMQV.
All these algorithms use a curve behind (like secp256k1, curve25519 or p521) for the
calculations and rely on the difficulty of the ECDLP (elliptic curve discrete logarithm
problem). All these algorithms use public / private key pairs, where the private key is
an integer and the public key is a point on the elliptic curve (EC point). Let's get into
details about the elliptic curves over finite fields.
Advantages of ECC:
ECC employs a relatively short encryption key -- a value that must be fed into the encryption
algorithm to decode an encrypted message. This short key is faster and requires less
computing power than other first-generation encryption public key algorithms. For example, a
160-bit ECC encryption key provides the same security as a 1024-bit RSA encryption key and
can be up to 15 times faster, depending on the platform on which it is implemented. RSA is a
first-generation public-key cryptography technique invented by Ronald Rivest, Adi Shamir and
Leonard Adleman in the late 70s. Both RSA and ECC are in widespread use. The advantages of
ECC over RSA are particularly important in wireless devices, where computing power, memory
and battery life are limited.
Disadvantages of ECC
One of the main disadvantages of ECC is that it increases the size of the encrypted
message significantly more than RSA encryption. Furthermore, the ECC algorithm is
more complex and more difficult to implement than RSA, which increases the likelihood
of implementation errors, thereby reducing the security of the algorithm.
Elliptic curve cryptography is a branch of mathematics that deals with curves or functions that
take the form y^2 = x^3 + ax + b.
These curves have some properties that are of interest and use in cryptography – where we
define the addition of points as the reflection in the x axis of the third point that intersects the
curve.
Elliptical curve cryptography (ECC) is a public key encryption technique based on elliptic curve
theory that can be used to create faster, smaller, and more efficient cryptographic keys. ECC
generates keys through the properties of the elliptic curve equation instead of the traditional
method of generation as the product of very large prime numbers. The technology can be used
in conjunction with most public key encryption methods, such as RSA, and Diffie-Hellman.
According to some researchers, ECC can yield a level of security with a 164-bit key that other
systems require a 1,024-bit key to achieve. Because ECC helps to establish equivalent security
with lower computing power and battery resource usage, it is becoming widely used for mobile
applications.
You might have heard of ECC, ECDH or ECDSA. The first is an acronym for Elliptic Curve
Cryptography, the others are names for algorithms based on it. Today, we can find elliptic
curves cryptosystems in TLS, PGP and SSH, which are just three of the main technologies on
which the modern web and IT world are based. Not to mention Bitcoin and other
cryptocurrencies.
Before ECC became popular, almost all public-key algorithms were based on RSA, DSA, and
DH, alternative cryptosystems based on modular arithmetic. RSA and friends are still very
important today, and often are used alongside ECC. However, while the magic behind RSA and
friends can be easily explained, is widely understood, and rough implementations can be
written quite easily, the foundations of ECC are still a mystery to most.
Elliptic Curves:
An elliptic curve will simply be the set of points described by the equation y^2 = x^3 + ax + b.
(Figure: different shapes for different elliptic curves, with b = 1 and a varying from 2 to -3.)
Depending on the value of a and b , elliptic curves may assume different shapes on the
plane. As it can be easily seen and verified, elliptic curves are symmetric about the x -axis.
For our aims, we will also need a point at infinity (also known as ideal point) to be part of
our curve. From now on, we will denote our point at infinity with the symbol 0 (zero).
If we want to explicitly take into account the point at infinity, we can refine our definition of an
elliptic curve as follows:
{ (x, y) ∈ R^2 | y^2 = x^3 + ax + b, 4a^3 + 27b^2 ≠ 0 } ∪ { 0 }
Since the discovery of RSA (and El-Gamal) their ability to withstand attacks has meant that
these two cryptographic systems have become widespread in use. They are being used every
day both for authentication purposes as well as encryption/decryption. Both systems cover the
current security standards–so why invent a new system? Even though ECC is relatively new,
the use of elliptic curves as a base for a cryptographic system was independently proposed
by Victor Miller and Neal Koblitz. What makes it stand apart from RSA and El-Gamal is its
ability to be more efficient than those two. The reason why this is important is the
developments in information technology–most importantly hand held, mobile devices, sensor
networks, etc. Somehow, there must be a way to secure communications generated by these
devices, however their computing power and memory are not nearly as abundant as on their
desktop and laptop counterparts. A contemporary desktop or laptop system has no problems
working with 2048 bit keys and higher, but these small embedded devices do since we do not
want to spend a lot of their resources and bandwidth securing traffic.
The security of ECC depends on the difficulty of the Elliptic Curve Discrete Logarithm Problem.
This problem is defined as follows: let P and Q be two points on an elliptic curve such
that Q = kP, where k is a scalar. Given P and Q, it is computationally infeasible to
obtain k, if k is sufficiently large. Hence, k is the discrete logarithm of Q to the base P. We can see
that the main operation involved in ECC is point multiplication, namely, multiplication of a
scalar with any point on the curve to obtain another point on the curve.
This is also the reason an ECC key of 160 bits provides the equivalent protection of a symmetric
key of 80 bits, namely because of the methods used to crack k. If one knows P and Q,
one must on average try about the square root of the number of points on the curve to find k. So if the
field size is about 2^160, one must try about 2^80 points. With an 80-bit symmetric key, it
takes about 2^79 guesses on average to crack it. The table below gives a comparison of equivalent key
sizes.
Each curve has a specially designated point G called the base point, chosen such that a large
fraction of the elliptic curve points are multiples of it. To generate a key pair, one selects a
random integer k which serves as the private key, and computes kG, which serves as the
corresponding public key. For cryptographic applications the order of G, that is the smallest
positive integer n such that nG = O (the point at infinity), must be prime.
Point multiplication
In point multiplication a point P on the elliptic curve is multiplied with a scalar k using the
elliptic curve equation to obtain another point Q on the same elliptic curve, giving Q = kP.
Point multiplication can be achieved by two basic elliptic curve operations, namely point
addition and point doubling. Point addition is defined as adding two points P and Q to obtain
another point R, written as R = P + Q. Point doubling is defined as adding a point P to itself
to obtain another point R so that R = 2P.
Point multiplication is hence achieved as follows: let P be a point on an elliptic curve. Let k be
a scalar that is multiplied with the point P to obtain another point Q on the curve so
that Q = kP. If k = 23 then kP = 2(2(2(2P) + P) + P) + P.
Thus point multiplication uses point addition and point doubling repeatedly to find the result.
The above method is called the 'double and add' method for point multiplication. There are
other, more efficient methods for point multiplication.
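A minimal sketch of the double-and-add method follows. The curve y^2 = x^3 + 2x + 2 over GF(17) with base point G = (5, 1) is a standard textbook example, assumed here only for illustration; pow(x, -1, p) needs Python 3.8 or later.

# Toy elliptic curve arithmetic and double-and-add point multiplication over GF(17).
p, a, b = 17, 2, 2
G = (5, 1)
O = None                                         # the point at infinity

def inv(x):
    return pow(x % p, -1, p)                     # modular inverse

def point_add(P, Q):
    if P is O: return Q
    if Q is O: return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return O                                 # P + (-P) = O
    if P == Q:
        s = (3 * x1 * x1 + a) * inv(2 * y1)      # tangent slope (doubling)
    else:
        s = (y2 - y1) * inv(x2 - x1)             # chord slope (addition)
    x3 = (s * s - x1 - x2) % p
    y3 = (s * (x1 - x3) - y1) % p
    return (x3, y3)

def scalar_mult(k, P):
    # double-and-add: scan the bits of k from the most significant bit downwards
    R = O
    for bit in bin(k)[2:]:
        R = point_add(R, R)                      # double
        if bit == "1":
            R = point_add(R, P)                  # add
    return R

print(scalar_mult(2, G))                         # point doubling: 2G
print(scalar_mult(23, G))                        # 23G with only a handful of group operations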
Point addition
Point addition is the addition of two points P and Q on an elliptic curve to obtain another
point R on the same elliptic curve. This is demonstrated geometrically in the figure below for
the condition that Q ≠ -P.
Elliptic curves provide security equivalent to classical systems (like RSA), but use fewer
bits. Implementation of elliptic curves in cryptography requires smaller chip size, less
power consumption, an increase in speed, etc.
An elliptic curve is not an ellipse (oval shape), but is represented as a looping line
intersecting two axes (lines on a graph used to indicate the position of a point).
Elliptic curve cryptography is probably better for most purposes, but not for
everything. ECC's main advantage is that you can use smaller keys for the same level of
security, especially at high levels of security (AES-256 ~ ECC-512 ~ RSA-15424). ...
Advantages of ECC: Smaller keys, ciphertexts and signatures.
ECC as the Answer for High Security and for the Future
Consider these three facets of the problem, now: (i) the fact that the security and
practicality of a given asymmetric cryptosystems relies upon the difference in
difficulty between doing a given operation and its inverse. (ii) the fact that the
difference in difficulty between the forward and the inverse operation in a given
system is a function of the key length in use, due to the fact that the difficulty of
the forward and the inverse operations increase as very different functions of the
key length; the inverse operations get harder faster. (iii) Third, the fact that as
you are forced to use longer key lengths to adjust to the greater processing
power now available to attack the cryptosystem, even the 'legitimate' forward
operations get harder, and require greater resources (chip space and/or
processor time), though by a lesser degree than do the inverse operations.
ECC's advantage is this: its inverse operation gets harder, faster, against
increasing key length than do the inverse operations in DH and RSA. What this
means is: as security requirements become more stringent, and as processing
power gets cheaper and more available, ECC becomes the more practical system
for use. And as security requirements become more demanding, and processors
become more powerful, considerably more modest increases in key length are
necessary, if you're using the ECC cryptosystem, to address the threat. This
keeps ECC implementations smaller and more efficient than other
implementations. ECC can use a considerably shorter key and offer the same
level of security as other asymmetric algorithms using much larger ones.
Moreover, the gulf between ECC and its competitors in terms of key size required
for a given level of security becomes dramatically more pronounced, at higher
levels of security.
What you need for a public key cryptographic system to work is a set of
algorithms that is easy to process in one direction, but difficult to undo. In the
case of RSA, the easy algorithm multiplies two prime numbers. If multiplication
is the easy algorithm, its difficult pair algorithm is factoring the product of the
multiplication into its two component primes. Algorithms that have this
characteristic (easy in one direction, hard in the other) are known as Trapdoor
Functions. Finding a good Trapdoor Function is critical to making a secure
public key cryptographic system. Simplistically: the bigger the spread between
the difficulty of going one direction in a Trapdoor Function and going the other,
the more secure a cryptographic system based on it will be.
In general elliptic curves (ec) combine number theory and algebraic geometry.
These curves can be defined over any field of numbers (i.e., real, integer,
complex and even Fp). An elliptic curve consists of the set of numbers (x, y), also
known as points on that curve, that satisfy the equation y^2 = x^3 + ax + b.
Like the prime factorization problem in RSA, elliptic curves can be used to define
a "hard" to solve problem: given two points, P and Q, on an elliptic curve, find
the integer k, if it exists, such that P = kQ.
In short, ECC is "simply" based on the difficulty of solving the Elliptic Curve
Discrete Logarithm Problem (ECDLP). ECC was independently formulated in
1985 by the researchers Victor Miller (IBM) and Neal Koblitz (University of
Washington).
Trapdoor function:
In the example above the public key is a very large number, and the private key
is the two prime factors of the public key. This is a good example of a Trapdoor
Function because it is very easy to multiply the numbers in the private key
together to get the public key, but if all you have is the public key it will take a
very long time using a computer to re-create the private key.
In real cryptography the private key would need to be 200+ digits long to be
considered secure.
Message Authentication
Another type of threat that exists for data is the lack of message authentication. In this threat,
the user is not sure about the originator of the message. Message authentication can be
provided using the cryptographic techniques that use secret keys as done in case of encryption.
Authentication Requirements
In the context of communications across a network, the following attacks can be identified:
1. Disclosure: Release of message contents to any person or process not possessing the
appropriate cryptographic key.
2. Traffic analysis: Discovery of the pattern of traffic between parties. In a connection oriented
application, the frequency and duration of connections could be determined. In either a
connection-oriented or connectionless environment, the number and length of messages
between parties could be determined.
3. Masquerade: Insertion of messages into the network from a fraudulent source. This includes
the creation of messages by an opponent that are purported to come from an authorized entity.
Also included are fraudulent acknowledgments of message receipt or non receipt by someone
other than the message recipient.
Any message authentication or digital signature mechanism has two levels of functionality. At
the lower level, there must be some sort of function that produces an authenticator: a value to
be used to authenticate a message. This lower-level function is then used as a primitive in a
higher-level authentication protocol that enables a receiver to verify the authenticity of a message.
This section is concerned with the types of functions that may be used to produce an
authenticator. These may be grouped into three classes.
Hash function: A function that maps a message of any length into a fixed-length hash value,
which serves as the authenticator
Message encryption: The ciphertext of the entire message serves as its authenticator
An alternative authentication technique involves the use of a secret key to generate a small
fixed-size block of data, known as a cryptographic checksum or MAC, that is appended to the
message. This technique assumes that two communicating parties, say A and B, share a
common secret key K. When A has a message to send to B, it calculates the MAC as a function
of the message and the key:
MAC = C(K, M)
where
M = input message
C = MAC function
K = shared secret key
MAC = message authentication code
The message plus MAC are transmitted to the intended recipient. The recipient performs the
same calculation on the received message, using the same secret key, to generate a new MAC.
The received MAC is compared to the calculated MAC (Figure 12.4a). If we assume that only
the receiver and the sender know the identity of the secret key, and if the received MAC matches
the calculated MAC, then:
1. The receiver is assured that the message has not been altered. If an attacker alters the
message but does not alter the MAC, then the receiver's calculation of the MAC will differ
from the received MAC. Because the attacker is assumed not to know the secret key, the
attacker cannot alter the MAC to correspond to the alterations in the message.
2. The receiver is assured that the message is from the alleged sender. Because no one else
knows the secret key, no one else could prepare a message with a proper MAC.
3. If the message includes a sequence number (such as is used with HDLC, X.25, and TCP),
then the receiver can be assured of the proper sequence because an attacker cannot
successfully alter the sequence number.
For example, suppose that we are using 100-bit messages and a 10-bit MAC. Then, there are a
total of 2^100 different messages but only 2^10 different MACs. So, on average, each MAC value
is generated by a total of 2^100/2^10 = 2^90 different messages. If a 5-bit key is used, then
there are 2^5 = 32 different mappings from the set of messages to the set of MAC values.
The process depicted in Figure 12.4a provides authentication but not confidentiality, because
the message as a whole is transmitted in the clear. Confidentiality can be provided by
performing message encryption either after (Figure 12.4b) or before (Figure 12.4c) the MAC
algorithm. In both these cases, two separate keys are needed, each of which is shared by the
sender and the receiver. In the first case, the MAC is calculated with the message as input and
is then concatenated to the message. The entire block is then encrypted. In the second case,
the message is encrypted first. Then the MAC is calculated using the resulting ciphertext and is
concatenated to the ciphertext to form the transmitted block. Typically, it is preferable to tie
the authentication directly to the plaintext, so the method of Figure 12.4b is used.
A message authentication code (often called MAC) is a block of a few bytes that is used
to authenticate a message. The receiver can check this block and be sure that the message
hasn't been modified by the third party.
The abbreviation MAC can also be used for describing algorithms that can create
an authentication code and verify its correctness.
The simplest way to mark the authenticity of the message is to compute its checksum, for
example using the CRC algorithm. One can attach the result to the transmitted message.
The primary disadvantage of this method is the lack of protection against intentional
modifications in the message content. The intruder can change the message, then calculate
a new checksum, and eventually replace the original checksum by the new value. An ordinary
CRC algorithm only allows the detection of randomly damaged parts of messages (but not
intentional changes made by the attacker).
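A keyed MAC such as HMAC avoids this problem, because without the secret key the intruder cannot recompute a valid tag. A minimal sketch with Python's standard hmac and hashlib modules (the key and message values are hypothetical):

# Computing and verifying an HMAC-SHA256 tag.
import hmac, hashlib

key = b"shared secret key"                       # K, known only to sender and receiver
message = b"transfer 100 to account 42"          # M

tag = hmac.new(key, message, hashlib.sha256).hexdigest()    # MAC = C(K, M)

# The receiver recomputes the MAC over the received message and compares in constant time.
recomputed = hmac.new(key, message, hashlib.sha256).hexdigest()
print(hmac.compare_digest(recomputed, tag))      # True only if message and tag are intact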
A cryptographic hash function (CHF) is a hash function that is suitable for use
in cryptography. It is a mathematical algorithm that maps data of arbitrary size (often called
the "message") to a bit string of a fixed size (the "hash value", "hash", or "message digest") and
is a one-way function, that is, a function which is practically infeasible to invert. [1] Ideally, the
only way to find a message that produces a given hash is to attempt a brute-force search of
possible inputs to see if they produce a match, or use a rainbow table of matched hashes.
Cryptographic hash functions are a basic tool of modern cryptography.
The ideal cryptographic hash function has the following main properties:
it is deterministic, meaning that the same message always results in the same hash
it is quick to compute the hash value for any given message
it is infeasible to generate a message that yields a given hash value
it is infeasible to find two different messages with the same hash value
a small change to a message should change the hash value so extensively that the new
hash value appears uncorrelated with the old hash value
Cryptographic hash functions have many information-security applications, notably in digital
signatures, message authentication codes (MACs), and other forms of authentication. They can
also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to
detect duplicate data or uniquely identify files, and as checksums to detect accidental data
corruption. Indeed, in information-security contexts, cryptographic hash values are sometimes
called (digital) fingerprints, checksums, or just hash values, even though all these terms stand
for more general functions with rather different properties and purposes.
Most cryptographic hash functions are designed to take a string of any length as input and
produce a fixed-length hash value.A cryptographic hash function must be able to withstand all
known types of cryptanalytic attack. In theoretical cryptography, the security level of a
cryptographic hash function has been defined using the following properties:
Pre-image resistance
Given a hash value h it should be difficult to find any message m such that h = hash(m).
This concept is related to that of a one-way function. Functions that lack this property
are vulnerable to preimage attacks.
Second pre-image resistance
Given an input m1, it should be difficult to find a different input m2 such that hash(m1)
= hash(m2). This property is sometimes referred to as weak collision resistance.
Functions that lack this property are vulnerable to second-preimage attacks.
Collision resistance
It should be difficult to find two different messages m1 and m2 such that hash(m1) =
hash(m2). Such a pair is called a cryptographic hash collision. This property is
sometimes referred to as strong collision resistance. It requires a hash value at least
twice as long as that required for pre-image resistance; otherwise collisions may be
found by a birthday attack.
Collision resistance implies second pre-image resistance, but does not imply pre-image
resistance.[5] The weaker assumption is always preferred in theoretical cryptography, but in
practice, a hash-function which is only second pre-image resistant is considered insecure and
is therefore not recommended for real applications.
Informally, these properties mean that a malicious adversary cannot replace or modify the
input data without changing its digest. Thus, if two strings have the same digest, one can be
very confident that they are identical. Second pre-image resistance prevents an attacker from
crafting a document with the same hash as a document the attacker cannot control. Collision
resistance prevents an attacker from creating two distinct documents with the same hash.
A function meeting these criteria may still have undesirable properties. Currently popular
cryptographic hash functions are vulnerable to length-extension attacks:
given hash(m) and len(m) but not m, by choosing a suitable m′ an attacker can
calculate hash(m ∥ m′), where ∥ denotes concatenation. This property can be used to break
naive authentication schemes based on hash functions. The HMAC construction works around
these problems.
In practice, collision resistance is insufficient for many practical uses. In addition to collision
resistance, it should be impossible for an adversary to find two messages with substantially
similar digests; or to infer any useful information about the data, given only its digest. In
particular, a hash function should behave as much as possible like a random function (often
called a random oracle in proofs of security) while still being deterministic and efficiently
computable. This rules out functions like the SWIFFT function, which can be rigorously proven
to be collision resistant assuming that certain problems on ideal lattices are computationally
difficult, but as a linear function, does not satisfy these additional properties.[7]
Checksum algorithms, such as CRC32 and other cyclic redundancy checks, are designed to
meet much weaker requirements, and are generally unsuitable as cryptographic hash
functions. For example, a CRC was used for message integrity in the WEP encryption standard,
but an attack was readily discovered which exploited the linearity of the checksum.
Degree of difficulty
In cryptographic practice, "difficult" generally means "almost certainly beyond the reach of any
adversary who must be prevented from breaking the system for as long as the security of the
system is deemed important". The meaning of the term is therefore somewhat dependent on the
application since the effort that a malicious agent may put into the task is usually proportional
to his expected gain. However, since the needed effort usually multiplies with the digest length,
even a thousand-fold advantage in processing power can be neutralized by adding a few dozen
bits to the latter.
For messages selected from a limited set of messages, for example passwords or other short
messages, it can be feasible to invert a hash by trying all possible messages in the set. Because
cryptographic hash functions are typically designed to be computed quickly, special key
derivation functions that require greater computing resources have been developed that make
such brute force attacks more difficult.
In some theoretical analyses "difficult" has a specific mathematical meaning, such as "not
solvable in asymptotic polynomial time". Such interpretations of difficulty are important in the
study of provably secure cryptographic hash functions but do not usually have a strong
connection to practical security. For example, an exponential time algorithm can sometimes
still be fast enough to make a feasible attack. Conversely, a polynomial time algorithm (e.g.,
one that requires n^20 steps for n-digit keys) may be too slow for any practical use.
A cryptographic hash function is a mathematical equation that enables many everyday forms of
encryption. This includes everything from the HTTPS protocol to payments made on e-
commerce websites. Cryptographic hash functions are also used extensively in blockchain
technology.
While the term itself may seem intimidating, cryptographic hash functions are relatively easy to
understand. In this article, Komodo will explain exactly how a cryptographic hash function
works.
A cryptographic hash function is more or less the same thing. It's a formula with a set of
specific properties that makes it extremely useful for encryption. Let's learn more about these
properties now.
At the heart of hashing is a mathematical function that operates on two fixed-size blocks of
data to create a hash code. This hash function forms the part of the hashing algorithm.
The size of each data block varies depending on the algorithm. Typically the block sizes are
from 128 bits to 512 bits.
A hashing algorithm involves rounds of the above hash function, like a block cipher. Each round
takes an input of a fixed size, typically a combination of the most recent message block and the
output of the last round.
This process is repeated for as many rounds as are required to hash the entire message.
Schematic of hashing algorithm is depicted in the following illustration
The hash value of the first message block becomes an input to the second hash operation,
the output of which alters the result of the third operation, and so on. This effect is known as
the avalanche effect of hashing.
Avalanche effect results in substantially different hash values for two messages that differ by
even a single bit of data.
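For example, using SHA-256 from Python's standard hashlib, two inputs differing in a single character produce completely unrelated digests:

# Observing the avalanche effect with SHA-256.
import hashlib

print(hashlib.sha256(b"hello world").hexdigest())
print(hashlib.sha256(b"hello worle").hexdigest())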
It is important to understand the difference between the hash function and the hashing
algorithm. The hash function generates a hash code by operating on two blocks of fixed-length binary data.
Hashing algorithm is a process for using the hash function, specifying how the message will
be broken up and how the results from previous message blocks are chained together.
MD5 was the most popular and widely used hash function for quite a few years.
The MD family comprises the hash functions MD2, MD4, MD5 and MD6. MD5 is specified
in RFC 1321. It is a 128-bit hash function.
MD5 digests have been widely used in the software world to provide assurance about the
integrity of a transferred file. For example, file servers often provide a pre-computed MD5
checksum for the files, so that a user can compare the checksum of the downloaded
file to it.
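A sketch of such a check using Python's hashlib; the file name and published checksum are placeholders.

# Computing an MD5 checksum and comparing it with a published value.
import hashlib

def md5_of_file(path):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

# The user would compare the computed value with the checksum advertised by the server, e.g.:
# md5_of_file("download.iso") == "<published MD5 checksum>"
print(hashlib.md5(b"The quick brown fox jumps over the lazy dog").hexdigest())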
The SHA family comprises four SHA algorithms: SHA-0, SHA-1, SHA-2, and SHA-3. Though
from the same family, they are structurally different.
The original version, SHA-0, a 160-bit hash function, was published by the National
Institute of Standards and Technology (NIST) in 1993. It had a few weaknesses and did
not become very popular. Later, in 1995, SHA-1 was designed to correct alleged
weaknesses of SHA-0.
SHA-1 is the most widely used of the existing SHA hash functions. It is employed in
several widely used applications and protocols including Secure Socket Layer (SSL)
security.
In 2005, a method was found for uncovering collisions for SHA-1 within a practical time
frame, making the long-term employability of SHA-1 doubtful.
The SHA-2 family has four further SHA variants, SHA-224, SHA-256, SHA-384, and SHA-
512, depending on the number of bits in their hash value. No successful attacks have
yet been reported on the SHA-2 hash functions.
SHA-2 is a strong hash function. Though significantly different, its basic
design still follows the design of SHA-1. Hence, NIST called for new competitive hash
function designs.
In October 2012, the NIST chose the Keccak algorithm as the new SHA-3 standard.
Keccak offers many benefits, such as efficient performance and good resistance to
attacks.
RIPEMD
RIPEMD is an acronym for RACE Integrity Primitives Evaluation Message Digest. This set
of hash functions was designed by the open research community and is generally known as a
family of European hash functions.
The set includes RIPEMD, RIPEMD-128, and RIPEMD-160. There also exist 256- and
320-bit versions of this algorithm.
The original RIPEMD (128 bit) is based upon the design principles used in MD4 and was
found to provide questionable security. The RIPEMD 128-bit version came as a quick-fix
replacement to overcome vulnerabilities in the original RIPEMD.
RIPEMD-160 is an improved version and the most widely used version in the family.
The 256 and 320-bit versions reduce the chance of accidental collision, but do not
have higher levels of security as compared to RIPEMD-128 and RIPEMD-160
respectively.
Whirlpool
Whirlpool is a 512-bit hash function derived from a modified version of the Advanced
Encryption Standard (AES). Three versions have been released: WHIRLPOOL-0, WHIRLPOOL-T,
and WHIRLPOOL.
There are two direct applications of hash function based on its cryptographic properties.
Password Storage
An intruder can only see the hashes of passwords, even if he accesses the password file. He can
neither log on using a hash nor derive the password from the hash value, since the hash
function possesses the property of pre-image resistance.
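In practice, passwords are usually stored using a deliberately slow, salted key derivation rather than a plain hash. A minimal sketch with the standard library's PBKDF2; the password, salt size and iteration count below are illustrative values only.

# Storing and verifying a salted, slow password hash with PBKDF2-HMAC-SHA256.
import hashlib, os

password = b"correct horse battery staple"      # hypothetical user password
salt = os.urandom(16)
stored = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)

# At login the same derivation is repeated and compared; the password itself is never stored.
attempt = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)
print(attempt == stored)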
Data integrity check is a most common application of the hash functions. It is used to generate
the checksums on data files. This application provides assurance to the user about correctness
of the data. The process is depicted in the following illustration −
The integrity check helps the user to detect any changes made to the original file. It does not,
however, provide any assurance about originality. The attacker, instead of modifying the file
data, can change the entire file, compute an altogether new hash, and send it to the receiver.
This integrity check application is useful only if the user is sure about the originality of the file.
This property is probably somewhat obvious. If an ordinary computer needed several minutes
to process a cryptographic hash function and receive the output, it would not be very practical.
To be useful, hash functions must be computationally efficient.
In reality, this is not as large of a concern as it was 40 or 50 years ago. Nowadays, an average
home computer can process an advanced hash function in just a small fraction of a second.
This may also be rather obvious. If a cryptographic hash function were to produce different
outputs each time the same input was entered, the hash function would be random and
therefore useless. It would be impossible to verify a specific input, which is the whole point of
hash functions— to be able to verify that a private digital signature is authentic without ever
having access to the private key.
It‘s important to note that cryptographic hashing algorithms can receive any kind of input. The
input can be numbers, letters, words, or punctuation marks. It can be a single character, a
sentence from a book, a page from a book, or an entire book.
However, a hash function will always produce a fixed-length output. Regardless of what the
input is, the output will be an alphanumeric code of fixed length.
Consider why this is so important: if a longer input produced a longer output, then attackers
would already have a seriously helpful clue when trying to discover someone's private input.
For example, if an input always produced an output 1.5 times its length, then the hash
function would be giving away valuable information to hackers. If hackers saw an output of,
say, 36 characters, they would immediately know that the input was 24 characters.
Instead, a useful hash function must conceal any clues about what the input may have looked
like. It needs to be impossible to determine whether the input was long or short, numbers or
letters, even or odd, random characters or a string of recognizable words. In addition, changing
one character in a long string of text must result in a radically different digest.
However, outputs are of a fixed length. This means that there are a finite number— albeit an
extremely large number— of outputs that a hash function can produce. A fixed-length means a
fixed number of possibilities.
Since the number of inputs are essentially infinite, but the outputs are limited to a specific
number, it is a mathematical certainty that more than one input will produce the same output.
The goal is to make finding two inputs that produce the same output so astronomically
improbable that the possibility can be practically dismissed outright. It should not pose a risk.
At this point you might be wondering what kind of incredible equations possess all four of
these properties. The answer is probably far simpler than you think.
The best way to demonstrate a one-way function is with a simple modular function, also called
modular arithmetic or clock arithmetic. Modular functions are mathematical functions that,
put simply, produce the remainder of a division problem.
So, for example, 10 mod 3 = 1. This is true because 10 divided by 3 is 3 with a remainder of 1.
We ignore the number of times 3 goes into 10 (which is 3 in this case) and the only output is
the remainder: 1.
Let's use the equation X mod 5 = Y as our function. Here's a table to help get the point across:
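The table itself is not reproduced here, but the following few lines of Python generate the same mapping (the input range is arbitrary):

# Every input collapses to one of only five outputs under X mod 5 = Y.
for x in range(12):
    print(x, x % 5)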
You can probably spot the pattern. There are only five possible outputs for this function. They
rotate in this order to infinity.
This is significant because both the hash function and the output can be made public but no
one will ever be able to learn your input. As long as you keep the number you chose to use as X
a secret, it's impossible for an attacker to figure it out.
Let's say that your input is 27. This gives an output of 2. Now, imagine that you announce to
the world that you're using the hash function X mod 5 = Y and that your personal output is 2.
Would anyone be able to guess your input?
Obviously not. There are literally an infinite number of possible inputs that you could have
used to get a result of 2. For instance, your number could be 7, 52, 3492, or 23390787. Or, it
could be any of the other infinite number of possible inputs.
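The point about infinitely many candidate inputs is easy to check mechanically. The small Python sketch below prints every number up to 100 whose remainder mod 5 is 2; knowing only the function and the output 2 tells an attacker nothing about which of these (or of the infinitely many larger values) was the secret input.

# Every X with X mod 5 == 2 is a valid pre-image of the output 2.
candidates = [x for x in range(101) if x % 5 == 2]
print(candidates)        # 2, 7, 12, 17, ..., 97

# The secret input 27 is just one of infinitely many possibilities.
print(27 % 5)            # 2
print(23390787 % 5)      # also 2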
The important point to understand here is that one-way hash functions are just that: one-way.
They cannot be reversed.
When these same principles are applied to a much more sophisticated hash function, and
much, much bigger numbers, it becomes impossible to determine the inputs. This is what
makes a cryptographic hash function so secure and useful.
Cryptographic hash functions are grouped into families such as MD5, SHA-1, SHA-2, SHA-3,
and BLAKE2. Each of these families may contain several different algorithms. For example,
SHA-2 is a family of hash functions that includes SHA-224, SHA-256, SHA-384, SHA-512,
SHA-512/224, and SHA-512/256.
While all of these hash functions are similar, they differ slightly in the way the algorithm
creates a digest, or output, from a given input. They also differ in the fixed length of the digest
they produce.
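The differing digest lengths within the SHA-2 family can be verified with Python's standard hashlib module, which implements all of the variants listed above:

import hashlib

message = b"SECRET"
for name in ("sha224", "sha256", "sha384", "sha512"):
    digest = hashlib.new(name, message).hexdigest()
    # Each hex character encodes 4 bits, so the bit length is len(digest) * 4.
    print(f"{name}: {len(digest) * 4}-bit digest -> {digest[:16]}...")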
SHA-256 is perhaps the most famous of all cryptographic hash functions because it's used
extensively in blockchain technology. It is used in Satoshi Nakamoto's original Bitcoin protocol.
Just as with symmetric and public-key encryption, we can group attacks on hash functions
and MACs into two categories: brute-force attacks and cryptanalysis.
Brute-Force Attacks
The nature of brute-force attacks differs somewhat for hash functions and MACs.
The strength of a hash function against brute-force attacks depends solely on the length of the
hash code produced by the algorithm.
One-way: For any given code h, it is computationally infeasible to find x such that H(x) = h.
Weak collision resistance: For any given block x, it is computationally infeasible to find y ≠ x
with H(y) = H(x).
Strong collision resistance: It is computationally infeasible to find any pair (x, y) such that H(x)
= H(y).
The cryptographic hash functions can be divided into two main categories. In the first category
are those functions whose designs are based on mathematical problems, and whose security
thus follows from rigorous mathematical proofs, complexity theory and formal reduction. These
functions are called Provably Secure Cryptographic Hash Functions. To construct these is very
difficult, and few examples have been introduced. Their practical use is limited.
In the second category are functions which are not based on mathematical problems, but on
ad-hoc constructions, in which the bits of the message are mixed to produce the hash. These
are then believed to be hard to break, but no formal proof is given. Almost all hash functions in
widespread use reside in this category. Some of these functions are already broken, and are no
longer in use.
Pre-image resistance: given a hash value h, it should be hard to find any message m such
that H(m) = h. This concept is related to that of the one-way function. Functions that lack this
property are vulnerable to pre-image attacks.
Second pre-image resistance: given an input m1, it should be hard to find a different input m2
such that H(m1) = H(m2). Functions that lack this property are vulnerable to second pre-image
attacks.
Collision resistance: it should be hard to find two different messages m1 and m2 such
that H(m1) = H(m2). Such a pair is called a (cryptographic) hash collision. This property is
sometimes referred to as strong collision resistance. It requires a hash value at least twice as
long as what is required for pre-image resistance, otherwise collisions may be found by a
birthday attack.
Pseudo-randomness: it should be hard to distinguish a pseudo-random number generator
based on the hash function from a true random number generator, e.g., it passes usual
randomness tests.
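To illustrate why the digest must be long, the following Python sketch searches for a collision on a deliberately truncated digest (only the first 3 bytes, i.e. 24 bits, of SHA-256). By the birthday bound a collision is expected after roughly 2^12 (a few thousand) random inputs, which a desktop machine finds almost instantly; the full 256-bit digest offers no such shortcut. The truncation is purely for demonstration, not something a real system should do.

import hashlib
import os

def truncated_digest(data: bytes, nbytes: int = 3) -> bytes:
    # Deliberately weakened hash: keep only the first nbytes of SHA-256.
    return hashlib.sha256(data).digest()[:nbytes]

def find_collision(nbytes: int = 3):
    seen = {}        # maps truncated digest -> an input that produced it
    attempts = 0
    while True:
        candidate = os.urandom(8)              # random 8-byte input
        attempts += 1
        d = truncated_digest(candidate, nbytes)
        if d in seen and seen[d] != candidate:
            return seen[d], candidate, d, attempts
        seen[d] = candidate

a, b, d, attempts = find_collision()
print(f"collision after {attempts} attempts: {a.hex()} and {b.hex()} -> {d.hex()}")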
The second approach is theoretical and is based on computational complexity theory. Here, the
security of the hash function follows from a formal security reduction: breaking the hash
function would require solving a problem which is widely considered unsolvable in polynomial
time, such as the integer factorization problem or the discrete logarithm problem.
However, non-existence of a polynomial time algorithm does not automatically ensure that the
system is secure. The difficulty of a problem also depends on its size. For example, RSA public
key cryptography relies on the difficulty of integer factorization. However, it is considered
secure only with keys that are at least 2048 bits large.
Digital Signature
A digital signature is a mathematical scheme for verifying the authenticity of digital messages
or documents. A valid digital signature, where the prerequisites are satisfied, gives a recipient
very strong reason to believe that the message was created by a known sender (authentication),
and that the message was not altered in transit (integrity).
Digital signatures can provide the added assurances of evidence of origin, identity and status of
an electronic document, transaction or message and can acknowledge informed consent by the
signer.
Digital signatures are based on public key cryptography, also known as asymmetric
cryptography. Using a public key algorithm, such as RSA, one can generate two keys that are
mathematically linked: one private and one public.
Digital signatures work because public key cryptography depends on two mutually
authenticating cryptographic keys. The individual who is creating the digital signature uses
their own private key to encrypt signature-related data; the only way to decrypt that data is
with the signer's public key. This is how digital signatures are authenticated.
Digital signature technology requires all the parties to trust that the individual creating the
signature has been able to keep their own private key secret. If someone else has access to the
signer's private key, that party could create fraudulent digital signatures in the name of the
private key holder.
Digital signatures are a standard element of most cryptographic protocol suites, and are
commonly used for software distribution, financial transactions, contract management
software, and in other cases where it is important to detect forgery or tampering.
Digital signatures are often used to implement electronic signatures, which includes any
electronic data that carries the intent of a signature, but not all electronic signatures use
digital signatures.
Digital signatures employ asymmetric cryptography. In many instances they provide a layer of
validation and security to messages sent through a non-secure channel: Properly implemented,
a digital signature gives the receiver reason to believe the message was sent by the claimed
sender. Digital seals and signatures are equivalent to handwritten signatures and stamped
seals.[12] Digital signatures are equivalent to traditional handwritten signatures in many
respects, but properly implemented digital signatures are more difficult to forge than the
handwritten type. Digital signature schemes, in the sense used here, are cryptographically
based, and must be implemented properly to be effective. Digital signatures can also
provide non-repudiation, meaning that the signer cannot successfully claim they did not sign a
message, while also claiming their private key remains secret. Further, some non-repudiation
schemes offer a time stamp for the digital signature, so that even if the private key is exposed,
the signature is valid. Digitally signed messages may be anything representable as a bitstring:
examples include electronic mail, contracts, or a message sent via some other cryptographic
protocol.
A digital signature scheme typically consists of three algorithms:
A key generation algorithm that selects a private key uniformly at random from a set of
possible private keys. The algorithm outputs the private key and a corresponding public
key.
A signing algorithm that, given a message and a private key, produces a signature.
A signature verifying algorithm that, given the message, public key and signature, either
accepts or rejects the message's claim to authenticity.
Two main properties are required. First, the authenticity of a signature generated from a
fixed message and fixed private key can be verified by using the corresponding public key.
Secondly, it should be computationally infeasible to generate a valid signature for a party
without knowing that party's private key. A digital signature is an authentication
mechanism that enables the creator of the message to attach a code that acts as a
signature. The Digital Signature Algorithm (DSA), developed by the National Institute of
Standards and Technology, is one of many examples of a signing algorithm.
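As a concrete, if simplified, illustration of these three algorithms, the sketch below uses the third-party Python package cryptography (assumed to be installed) to generate an RSA key pair, sign a message, and verify the signature with the public key. This is only a minimal sketch of the generate/sign/verify pattern, not a prescription for production use.

from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.exceptions import InvalidSignature

# Key generation: a private key and its mathematically linked public key.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

message = b"Order: 10 units, deliver to Bob"

# Signing: done with the private key (the library hashes the message internally).
signature = private_key.sign(
    message,
    padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
    hashes.SHA256(),
)

# Verification: anyone holding the public key can check the signature.
try:
    public_key.verify(
        signature,
        message,
        padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH),
        hashes.SHA256(),
    )
    print("signature accepted")
except InvalidSignature:
    print("signature rejected: message altered or wrong key")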
The reason for encrypting the hash instead of the entire message or document is that a hash
function can convert an arbitrary input into a fixed length value, which is usually much
shorter. This saves time as hashing is much faster than signing.
The value of a hash is unique to the hashed data. Any change in the data, even a change in a
single character, will result in a different value. This attribute enables others to validate the
integrity of the data by using the signer's public key to decrypt the hash.
If the decrypted hash matches a second computed hash of the same data, it proves that the
data hasn't changed since it was signed. If the two hashes don't match, the data has either
been tampered with in some way -- integrity -- or the signature was created with a private key
that doesn't correspond to the public key presented by the signer -- authentication.
A digital signature can be used with any kind of message -- whether it is encrypted or not --
simply so the receiver can be sure of the sender's identity and that the message arrived intact.
Digital signatures make it difficult for the signer to deny having signed something -- assuming
their private key has not been compromised -- as the digital signature is unique to both the
document and the signer and it binds them together. This property is called non repudiation.
There are three classes of digital signature certificates:
Class 1: Cannot be used for legal business documents as they are validated
based only on an email ID and username. Class 1 signatures provide a basic level of
security and are used in environments with a low risk of data compromise.
Class 2: Often used for e-filing of tax documents, including income tax returns and Goods
and Services Tax (GST) returns. Class 2 digital signatures authenticate a signee's identity
against a pre-verified database. Class 2 digital signatures are used in environments where
the risks and consequences of data compromise are moderate.
Class 3: The highest level of digital signatures. Class 3 signatures require a person or
organization to appear before a certifying authority to prove their identity before
signing. Class 3 digital signatures are used for e-auctions, e-tendering, e-ticketing, court
filings and in other environments where threats to data or the consequences of a security
failure are high.
PINs, passwords and codes: Used to authenticate and verify a signee's identity and
approve their signature. Email, username and password are the most common.
Time stamping: Provides the date and time of a signature. Time stamping is useful when
the timing of a digital signature is critical, such as stock trades, lottery ticket issuance and
legal proceedings.
Requirements of Digital Signatures
For a digitally signed document to be trusted, the author must sign the contents according to
the following criteria:
The digital signature must be valid.
The digital certificate on which the digital signature is based must be signed by a certificate
authority that is trusted by the operating system.
The signing publisher must have obtained the certificate associated with the signature from a
reputable certification authority.
Authentication Protocols
With an increasing amount of trustworthy information becoming accessible over the network,
the need to keep unauthorized persons from accessing this data emerged. Stealing someone's
identity is easy in the computing world, so special verification methods had to be invented to
find out whether the person or computer requesting data really is who it claims to be. The task
of an authentication protocol is to specify the exact series of steps needed to carry out the
authentication. It has to comply with the main protocol principles:
1. A Protocol has to involve two or more parties and everyone involved in the protocol must
know the protocol in advance.
2. All the included parties have to follow the protocol.
3. A protocol has to be unambiguous - each step must be defined precisely.
4. A protocol must be complete - must include a specified action for every possible
situation.
PAP - Password Authentication Protocol
Alice (an entity wishing to be verified) and Bob (an entity verifying Alice's identity) are both
aware of the protocol they agreed on using. Bob has Alice's password stored in a database for
comparison.
1. Alice sends Bob her password in a packet complying with the protocol rules.
2. Bob checks the received password against the one stored in his database. Then he
sends a packet saying "Authentication successful" or "Authentication failed" based on
the result.
CHAP - Challenge-Handshake Authentication Protocol
The authentication process in this protocol is always initialized by the server/host and can be
performed anytime during the session, even repeatedly. The server sends a random string
(usually 128 bytes long). The client uses the password and the string received as parameters
for the MD5 hash function and then sends the result together with the username in plain text.
The server uses the username to apply the same function and compares the calculated and
received hashes. The authentication either succeeds or fails.
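A minimal sketch of this challenge-response exchange, using Python's standard hashlib and secrets modules, might look as follows. The framing of a real CHAP packet is omitted; only the hashing logic described above is shown, and the password value is a placeholder.

import hashlib
import secrets

# Shared secret known to both sides; in practice stored in the server's database.
PASSWORD = b"alice-password"

def server_issue_challenge() -> bytes:
    # The server sends a random string (the challenge) to the client.
    return secrets.token_bytes(128)

def client_response(password: bytes, challenge: bytes) -> str:
    # The client hashes its password together with the received challenge.
    return hashlib.md5(password + challenge).hexdigest()

def server_verify(stored_password: bytes, challenge: bytes, response: str) -> bool:
    expected = hashlib.md5(stored_password + challenge).hexdigest()
    return secrets.compare_digest(expected, response)

challenge = server_issue_challenge()
response = client_response(PASSWORD, challenge)
print("authentication successful" if server_verify(PASSWORD, challenge, response)
      else "authentication failed")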
EAP - Extensible Authentication Protocol
EAP was originally developed for PPP (Point-to-Point Protocol) but today is widely used in IEEE
802.3, IEEE 802.11 (WiFi) or IEEE 802.16 as a part of the IEEE 802.1x authentication framework.
The latest version is standardized in RFC 5247. The advantage of EAP is that it is only a
general authentication framework for client-server authentication - the specific way of
authentication is defined in its many versions called EAP-methods. More than 40 EAP-methods
exist, the most common are:
EAP-MD5
EAP-TLS
EAP-TTLS
EAP-FAST
EAP-PEAP
RADIUS
Remote Authentication Dial-In User Service (RADIUS) is a full AAA (authentication,
authorization, and accounting) protocol commonly used by ISPs. Credentials are mostly based
on a username-password combination; it uses a NAS (Network Access Server) and the UDP
protocol for transport.
DIAMETER
Diameter (protocol) evolved from RADIUS and involves many improvements such as usage of
more reliable TCP or SCTP transport protocol and higher security thanks to TLS.
Kerberos (protocol)
Kerberos is a centralized network authentication system developed at MIT and available as a
free implementation from MIT but also in many commercial products. It is the default
authentication method in Windows 2000 and later. The authentication process itself is much
more complicated than in the previous protocols - Kerberos uses symmetric key cryptography,
requires a trusted third party and can use public-key cryptography during certain phases of
authentication if need be.
Digital Signature Standard (DSS)
The Digital Signature Standard defines the Digital Signature Algorithm, contains a definition of RSA signatures based on the
definitions contained within PKCS #1 version 2.1 and in American National Standard X9.31
with some additional requirements, and contains a definition of the Elliptic Curve Digital
Signature Algorithm based on the definition provided by American National Standard X9.62
with some additional requirements and some recommended elliptic curves. It also approves the
use of all three algorithms.
The Digital Signature Standard is intended to be used in electronic funds transfer, software
distribution, electronic mail, data storage and applications which require high data integrity
assurance. The Digital Signature Standard can be implemented in software, hardware or
firmware.
The algorithm used behind the Digital Signature Standard is known as the Digital Signature
Algorithm. The algorithm makes use of two large numbers which are calculated based on a
unique algorithm which also considers parameters that determine the authenticity of the
signature. This indirectly also helps in verifying the integrity of the data attached to the
signature. The digital signatures can be generated only by the authorized person using their
private keys and the users or public can verify the signature with the help of the public keys
provided to them. However, one key difference between encryption and signature operation in
the Digital Signature Standard is that encryption is reversible, whereas the digital signature
operation is not. Another fact about the digital signature standard is that it does not provide
any capability with regards to key distribution or exchange of keys. In other words, security of
the digital signature standard largely depends on the secrecy of the private keys of the
signatory.
The Digital Signature Standard ensures that the digital signature can be authenticated and the
electronic documents carrying the digital signatures are secure. The standard also ensures
non-repudiation with regards to the signatures and provides all safeguards for imposter
prevention. The standard also ensures that digitally signed documents can be tracked.
Authentication Applications
KERBEROS
Secure: A network eavesdropper should not be able to obtain the necessary information
to impersonate a user. More generally, Kerberos should be strong enough that a potential
opponent does not find it to be the weak link.
Reliable: For all services that rely on Kerberos for access control, lack of availability of
the Kerberos service means lack of availability of the supported services. Hence, Kerberos
should be highly reliable and should employ a distributed server architecture, with one system
able to back up another.
Transparent: Ideally, the user should not be aware that authentication is taking place,
beyond the requirement to enter a password.
Scalable: The system should be capable of supporting large numbers of clients and
servers. This suggests a modular, distributed architecture.
To support these requirements, the overall scheme of Kerberos is that of a trusted third-
party authentication service that uses a protocol based on that proposed by Needham
and Schroeder [NEED78]. It is trusted in the sense that clients and servers trust
Kerberos to mediate their mutual authentication. Assuming the Kerberos protocol is
well designed, then the authentication service is secure if the Kerberos server itself is
secure.
Kerberos Encryption Technique
Kerberos can use a variety of cipher algorithms to protect data. A Kerberos encryption type
(also known as an enctype) is a specific combination of a cipher algorithm with an
integrity algorithm to provide both confidentiality and integrity to data.
Under Kerberos, a client (generally either a user or a service) sends a request for a ticket to the
Key Distribution Center (KDC). The KDC creates a ticket-granting ticket (TGT) for the client,
encrypts it using the client's password as the key, and sends the encrypted TGT back to the
client.
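The idea that the TGT is encrypted using a key derived from the client's password can be sketched in a few lines. The example below is not the real Kerberos wire protocol; it simply derives a symmetric key from a password with PBKDF2 and uses it to encrypt and decrypt a stand-in ticket, using the Fernet construction from the third-party cryptography package (an assumption made purely for illustration).

import base64
import hashlib
from cryptography.fernet import Fernet

def key_from_password(password: str, salt: bytes) -> bytes:
    # Simplified stand-in for the string-to-key function Kerberos actually uses.
    raw = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return base64.urlsafe_b64encode(raw)     # Fernet expects a base64-encoded 32-byte key

salt = b"client@REALM"                        # hypothetical value, fixed for the demo
kdc_side = Fernet(key_from_password("client-password", salt))
tgt = kdc_side.encrypt(b"ticket-granting ticket for client@REALM")

# Only a client who knows the password can derive the same key and read the TGT.
client_side = Fernet(key_from_password("client-password", salt))
print(client_side.decrypt(tgt))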
Unit – IV
Network security entails securing data against attacks while it is in transit on a network. To
achieve this goal, many real-time security protocols have been designed. There are popular
standards for real-time network security protocols such as S/MIME, SSL/TLS, SSH, and
IPsec. As mentioned earlier, these protocols work at different layers of networking model.
In the last chapter, we discussed some popular protocols that are designed to provide
application layer security. In this chapter, we will discuss the process of achieving network
security at Transport Layer and associated security protocols.
For a TCP/IP based network, the physical and data link layers are typically implemented in the
user terminal and network card hardware. The TCP and IP layers are implemented in the
operating system. Anything above TCP/IP is implemented as a user process.
Consider a typical transaction: Bob visits Alice's website, which sells goods. In a form on the
website, Bob enters the type of goods and quantity desired, his address, and payment card
details. Bob clicks on Submit and waits for delivery of the goods, with the price debited from
his account. All this sounds good, but in the absence of network security, Bob could be in for a
few surprises.
If transactions did not use confidentiality (encryption), an attacker could obtain his
payment card information. The attacker can then make purchases at Bob's expense.
If no data integrity measure is used, an attacker could modify Bob's order in terms of
type or quantity of goods.
Lastly, if no server authentication is used, a server could display Alice's famous logo
but the site could be a malicious site maintained by an attacker, who is masquerading
as Alice. After receiving Bob's order, he could take Bob's money and flee. Or he could
carry out an identity theft by collecting Bob's name and credit card details.
Transport layer security schemes can address these problems by enhancing TCP/IP based
network communication with confidentiality, data integrity, server authentication, and client
authentication.
The security at this layer is mostly used to secure HTTP based web transactions on a network.
However, it can be employed by any application running over TCP.
Philosophy of TLS Design
Transport Layer Security (TLS) protocols operate above the TCP layer. The design of these
protocols uses the popular Application Program Interface (API) to TCP, called "sockets", for
interfacing with the TCP layer.
Applications are now interfaced to the transport security layer instead of TCP directly. The
transport security layer provides a simple API with sockets, which is similar and analogous to
TCP's API.
Although TLS technically resides between the application and transport layers, from the
common perspective it is a transport protocol that acts as a TCP layer enhanced with security
services.
TLS is designed to operate over TCP, the reliable layer 4 protocol (not over UDP), which makes
the design of TLS much simpler, because it does not have to worry about timing out and
retransmitting lost data. The TCP layer continues doing that as usual, which serves the need
of TLS.
The reason for the popularity of providing security at the transport layer is simplicity. Design
and deployment of security at this layer does not require any change in the TCP/IP protocols
that are implemented in an operating system. Only user processes and applications need to be
designed or modified, which is less complex.
In this section, we discuss the family of protocols designed for TLS. The family includes SSL
versions 2 and 3 and the TLS protocol. SSLv2 has now been replaced by SSLv3, so we will
focus on SSLv3 and TLS.
In 1995, Netscape developed SSLv2 and used it in Netscape Navigator 1.1. SSL version 1 was
never published or used. Later, Microsoft improved upon SSLv2 and introduced another
similar protocol named Private Communications Technology (PCT).
Netscape substantially improved SSLv2 on various security issues and deployed SSLv3 in
1996. The Internet Engineering Task Force (IETF) subsequently introduced a similar TLS
(Transport Layer Security) protocol as an open standard in 1999. The TLS protocol is not
interoperable with SSLv3.
TLS modified the cryptographic algorithms for key expansion and authentication. Also, TLS
suggested use of open crypto Diffie-Hellman (DH) and Digital Signature Standard (DSS) in
place of patented RSA crypto used in SSL. But due to expiry of RSA patent in 2000, there
existed no strong reasons for users to shift away from the widely deployed SSLv3 to TLS.
SSL is specific to TCP and it does not work with UDP. SSL provides Application Programming
Interface (API) to applications. C and Java SSL libraries/classes are readily available.
The SSL protocol is designed to interwork between the application and transport layers.
SSL itself is not a single-layer protocol; in fact it is composed of two sub-layers.
The lower sub-layer comprises one component of the SSL protocol, called the SSL Record
Protocol. This component provides integrity and confidentiality services.
The upper sub-layer comprises three components −
o Handshake Protocol.
o ChangeCipherSpec Protocol.
o Alert Protocol.
These three protocols manage all of the SSL message exchanges and are discussed later in
this section.
Functions of SSL Protocol Components
The four sub-components of the SSL protocol handle various tasks for secure communication
between the client machine and the server.
Record Protocol
o It fragments the data into manageable blocks (max length 16 KB). It optionally
compresses the data.
o Provides a header for each message and a hash (Message Authentication Code
(MAC)) at the end.
Handshake Protocol
o It is the most complex part of SSL. It is invoked before any application data is
transmitted. It creates SSL sessions between the client and the server.
o Multiple secure TCP connections between a client and a server can share the
same session.
o The Handshake protocol works through four phases. These are discussed in the
next section.
ChangeCipherSpec Protocol
o As each entity sends the ChangeCipherSpec message, it changes its side of the
connection into the secure state as agreed upon.
o The cipher parameters pending state is copied into the current state.
o Exchange of this Message indicates all future data exchanges are encrypted and
integrity is protected.
Alert Protocol
o This protocol is used to report errors and other events – such as notifying closure of
the TCP connection, notifying receipt of a bad or unknown certificate, etc.
As discussed above, there are four phases of SSL session establishment. These are mainly
handled by SSL Handshake protocol.
Phase 3 − The client sends the Pre-master Secret (PMS) encrypted with the server's public key.
The client also sends a Certificate_verify message if a certificate was sent by him, to prove that
he has the private key associated with this certificate. Basically, the client signs a hash of the
previous messages.
Phase 4 − Finish.
Client and server send Change_cipher_spec messages to each other to cause the
pending cipher state to be copied into the current state.
The "Finished" message from each end verifies that the key exchange and authentication
processes were successful.
All four phases, discussed above, happen within the establishment of the TCP session. SSL
session establishment starts after TCP SYN/SYN-ACK and finishes before TCP FIN.
An established session can be reused for subsequent connections; this avoids recalculating
the session cipher parameters and saves computing at the server and the client end.
SSL Session Keys
We have seen that during Phase 3 of SSL session establishment, a pre-master secret is sent
by the client to the server encrypted using the server's public key. The master secret and
various session keys are generated as follows −
The master secret is generated (via a pseudo-random function) using −
o The pre-master secret.
o Two nonces (RA and RB) exchanged in the client_hello and server_hello
messages.
Six secret values are then derived from this master secret as −
o Secret key used with MAC (for data sent by the server).
o Secret key used with MAC (for data sent by the client).
o Secret key and IV used for encryption (by the server).
o Secret key and IV used for encryption (by the client).
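The following Python sketch illustrates the idea of expanding one master secret plus the two nonces into several independent keys. It deliberately uses a generic HMAC-SHA256 construction with made-up labels; it is not the actual SSL/TLS pseudo-random function, which is defined differently in the respective specifications.

import hashlib
import hmac
import os

master_secret = os.urandom(48)   # stand-in for the negotiated master secret
client_nonce = os.urandom(32)    # RA from client_hello
server_nonce = os.urandom(32)    # RB from server_hello

def derive(label: str, length: int = 16) -> bytes:
    # Generic key expansion: HMAC over a label and both nonces (illustrative only).
    mac = hmac.new(master_secret,
                   label.encode() + client_nonce + server_nonce,
                   hashlib.sha256)
    return mac.digest()[:length]

for name in ("client MAC key", "server MAC key",
             "client write key", "server write key",
             "client IV", "server IV"):
    print(f"{name}: {derive(name).hex()}")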
TLS Protocol
In order to provide an open Internet standard of SSL, the IETF released the Transport Layer
Security (TLS) protocol in January 1999. TLS was originally defined in RFC 2246; TLS 1.2 is
defined in RFC 5246.
Salient Features
TLS protocol sits above the reliable connection-oriented transport TCP layer in the
networking layers stack.
The architecture of TLS protocol is similar to SSLv3 protocol. It has two sub protocols:
the TLS Record protocol and the TLS Handshake protocol.
Though SSLv3 and TLS protocol have similar architecture, several changes were made
in architecture and functioning particularly for the handshake protocol.
Comparison of TLS and SSL Protocols
There are eight main differences between the TLS and SSLv3 protocols. These are as follows −
Protocol Version − The header of a TLS protocol segment carries the version number 3.1
to differentiate it from the version number 3.0 carried in the SSL protocol segment header.
Session Key Generation − There are two differences between TLS and SSL protocol for
generation of key material.
o The algorithm for computing session keys and initialization values (IVs) is different
in TLS than in the SSL protocol.
Alert Protocol Messages −
o The TLS protocol supports all the messages used by the Alert protocol of SSL,
except the No_certificate alert message, which is made redundant: the client sends
an empty certificate in case client authentication is not required.
o Many additional Alert messages are included in the TLS protocol for other error
conditions such as record_overflow, decode_error, etc.
Supported Cipher Suites − SSL supports RSA, Diffie-Hellman and Fortezza cipher
suites. The TLS protocol supports all of these suites except Fortezza.
In this section, we will discuss the use of SSL/TLS protocol for performing secure web
browsing.
HTTPS Defined
The Hyper Text Transfer Protocol (HTTP) is used for web browsing. The function of HTTPS
is similar to HTTP; the only difference is that HTTPS provides "secure" web browsing. HTTPS
stands for HTTP over SSL. This protocol is used to provide an encrypted and authenticated
connection between the client web browser and the website server.
Secure browsing through HTTPS ensures that the following content is encrypted −
o URL of the requested document.
o Contents of the document.
o Contents of browser forms (filled in by the browser user).
o Cookies sent from browser to server and from server to browser.
o Contents of the HTTP header.
Working of HTTPS
HTTPS application protocol typically uses one of two popular transport layer security protocols
- SSL or TLS. The process of secure browsing is described in the following points.
Web browser initiates a connection to the web server. Use of https invokes the use of
SSL protocol.
An application, browser in this case, uses the system port 443 instead of port 80 (used
in case of http).
The SSL protocol goes through a handshake protocol for establishing a secure session
as discussed in earlier sections.
The website initially sends its SSL Digital certificate to your browser. On verification of
certificate, the SSL handshake progresses to exchange the shared secrets for the
session.
When a trusted SSL Digital Certificate is used by the server, users get to see a padlock
icon in the browser address bar. When an Extended Validation Certificate is installed
on a website, the address bar turns green.
Once established, this session consists of many secure connections between the web
server and the browser.
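The handshake and certificate check described above can be observed directly with Python's standard ssl and socket modules. The sketch below opens a TLS connection to a host (example.com is used purely as a placeholder), prints the negotiated protocol version and cipher suite, and then sends a minimal HTTP request over the encrypted channel.

import socket
import ssl

HOST = "example.com"                              # placeholder HTTPS-enabled host

context = ssl.create_default_context()            # verifies the server certificate
with socket.create_connection((HOST, 443)) as raw_sock:
    with context.wrap_socket(raw_sock, server_hostname=HOST) as tls_sock:
        # The TLS handshake has completed by this point.
        print("protocol:", tls_sock.version())    # e.g. TLSv1.3
        print("cipher  :", tls_sock.cipher())     # negotiated cipher suite
        print("subject :", tls_sock.getpeercert().get("subject"))

        # The HTTP request and response travel inside the encrypted connection.
        request = b"GET / HTTP/1.1\r\nHost: " + HOST.encode() + b"\r\nConnection: close\r\n\r\n"
        tls_sock.sendall(request)
        print(tls_sock.recv(200))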
Use of HTTPS
Prevents eavesdropping on data and protects against identity theft, which are common attacks
on HTTP.
Present day web browsers and web servers are equipped with HTTPS support. The use of
HTTPS over HTTP, however, requires more computing power at the client and the server end
to carry out encryption and SSL handshake.
SSH is a network protocol that runs on top of the TCP/IP layer. It is designed to replace
TELNET, which provided an unsecured means of remote logon.
SSH provides a secure client/server communication and can be used for tasks such as
file transfer and e-mail.
Transport Layer Protocol − This part of SSH protocol provides data confidentiality,
server (host) authentication, and data integrity. It may optionally provide data
compression as well.
o Session Key Establishment − After authentication, the server and the client
agree upon cipher to be used. Session keys are generated by both the client
and the server. Session keys are generated before user authentication so that
usernames and passwords can be sent encrypted. These keys are generally
replaced at regular intervals (say, every hour) during the session and are
destroyed immediately after use.
User Authentication Protocol − This part of SSH authenticates the user to the server.
The server verifies that access is given to intended users only. Many authentication
methods are currently used such as, typed passwords, Kerberos, public-key
authentication, etc.
SSH provides three main services that enable provision of many secure solutions. These
services are briefly described as follows −
Secure Command-Shell (Remote Logon) − It allows the user to edit files, view the
contents of directories, and access applications on the connected device. System
administrators can remotely start/view/stop services and processes, create user
accounts, change file and directory permissions, and so on. All tasks that are feasible
at a machine's command prompt can now be performed securely from a remote
machine using secure remote logon.
Secure File Transfer − SSH File Transfer Protocol (SFTP) is designed as an extension
for SSH-2 for secure file transfer. In essence, it is a separate protocol layered over the
Secure Shell protocol to handle file transfers. SFTP encrypts both the
username/password and the file data being transferred. It uses the same port as the
Secure Shell server, i.e. system port no 22.
The benefits and limitations of employing communication security at transport layer are as
follows −
Benefits
o Server is authenticated.
Limitations
o Suitable for direct communication between the client and the server. Does not
cater for secure applications using chain of servers (e.g. email)
Transport Layer Security (TLS), and its now-deprecated predecessor, Secure Sockets
Layer (SSL), are cryptographic protocols designed to provide communications security over
a computer network. Several versions of the protocols find widespread use in applications such
as web browsing, email, instant messaging, and voice over IP (VoIP). Websites can use TLS to
secure all communications between their servers and web browsers.
The TLS protocol aims primarily to provide privacy and data integrity between two or more
communicating computer applications. When secured by TLS, connections between a client
(e.g., a web browser) and a server (e.g., wikipedia.org) should have one or more of the following
properties:
The connection is private (or secure) because symmetric cryptography is used
to encrypt the data transmitted. The keys for this symmetric encryption are generated
uniquely for each connection and are based on a shared secret that was negotiated at the
start of the session. The server and client negotiate the details of which encryption
algorithm and cryptographic keys to use before the first byte of data is transmitted. The
negotiation of a shared secret is both secure (the negotiated secret is unavailable
to eavesdroppers and cannot be obtained, even by an attacker who places themselves in
the middle of the connection) and reliable (no attacker can modify the communications
during the negotiation without being detected).
The identity of the communicating parties can be authenticated using public-key
cryptography. This authentication can be made optional, but is generally required for at
least one of the parties (typically the server).
The connection is reliable because each message transmitted includes a message integrity
check using a message authentication code to prevent undetected loss or alteration of the
data during transmission.
In addition to the properties above, careful configuration of TLS can provide additional
privacy-related properties such as forward secrecy, ensuring that any future disclosure of
encryption keys cannot be used to decrypt any TLS communications recorded in the past.
TLS supports many different methods for exchanging keys, encrypting data, and
authenticating message integrity. As a result, secure configuration of TLS involves many
configurable parameters, and not all choices provide all of the privacy-related properties
described in the list above.
Attempts have been made to subvert aspects of the communications security that TLS seeks to
provide, and the protocol has been revised several times to address these security threats.
Developers of web browsers have also revised their products to defend against potential
security weaknesses after these were discovered.
The TLS protocol comprises two layers: the TLS record and the TLS handshake protocols.
TLS is a proposed Internet Engineering Task Force (IETF) standard, first defined in 1999, and
the current version is TLS 1.3 defined in RFC 8446 (August 2018). TLS builds on the earlier
SSL specifications (1994, 1995, 1996) developed by Netscape Communications for adding
the HTTPS protocol to their Navigator web browser.
TLS is an upgraded version of SSL and provides secure communications between the client and
server. Because TLS fixes a number of weaknesses found in SSL and supports stronger
cryptographic algorithms, data transfer using TLS is more secure than data transfer using SSL.
Transport layer security (TLS) is a protocol that provides communication security between
client/server applications that communicate with each other over the Internet. It enables
privacy, integrity and protection for the data that's transmitted between different nodes on the
Internet. TLS is a successor to the secure socket layer (SSL) protocol.
TLS primarily enables secure Web browsing, applications access, data transfer and most
Internet-based communication. It prevents the transmitted/transported data from being
eavesdropped or tampered. TLS is used to secure Web browsers, Web servers, VPNs, database
servers and more. TLS protocol consists of two different layers of sub-protocols:
TLS Handshake Protocol: Enables the client and server to authenticate each other and
select an encryption algorithm prior to sending the data.
TLS Record Protocol: It works on top of the standard TCP protocol to ensure that the
created connection is secure and reliable. It also provides data encapsulation and data
encryption services.
The TLS protocol specification defines two layers. The TLS record protocol provides connection
security, and the TLS handshake protocol enables the client and server to authenticate each
other and to negotiate security keys before any data is transmitted.
The TLS handshake is a multi-step process. A basic TLS handshake involves the client and
server sending "hello" messages, followed by the exchange of keys, a change-cipher-spec
message, and a finished message. The multi-step process is what makes TLS flexible enough to
use in different applications, because the format and order of exchange can be modified.
TLS flaws and breaches
Flaws in protocols and implementations constantly cause problems with security tools and
technology, and TLS has certainly had its share of breaches. Some of the more significant
attacks on TLS/SSL:
BEAST (2011): The Browser Exploit Against SSL/TLS is a browser exploit that took
advantage of a weakness in cipher block chaining (CBC) to extract the unencrypted
plaintext from an encrypted session.
CRIME and BREACH (2012 and 2013): The creators of BEAST authored the security
exploit Compression Ratio Info-leak Made Easy (CRIME), which enables a hacker to
retrieve the content of Web cookies even when compression and TLS are used. One
nefarious use case for this is recovering the authentication cookies so attackers can
hijack authenticated web sessions. Browser Reconnaissance and Exfiltration via Adaptive
Compression of Hypertext, or BREACH, builds on CRIME and extracts login tokens, e-
mail addresses and other information.
Heartbleed (2014): Heartbleed allows attackers to steal private keys from what should be
secure servers. Affected servers were left wide open, letting anyone on the Internet read
the memory of systems protected by a vulnerable version of OpenSSL. The flaw let threat
actors steal data from servers, listen in on conversations, or even spoof services and
other users.
TLS 1.3
In addition to making a major revision, the IETF set out to make what it called "major
improvements in the areas of security, performance and privacy". The biggest change is that
TLS 1.3 makes it significantly more difficult for attackers to decrypt HTTPS-encrypted traffic
and therefore better protects privacy.
Version 1.3 also makes the handshake process faster by speeding up the encryption process.
This has a security benefit, but it should also improve performance of secure web applications.
With TLS 1.2, the handshake process involved several round trips. With 1.3 only one round is
required, and all the information is passed at that time.
Implementing TLS 1.3 should be simple, as it's designed to seamlessly replace TLS 1.2 and uses
the same certificates and keys. Also, clients and servers can automatically negotiate a
connection if it's supported on both sides.
Early in its development cycle there were a few issues, the most notable of which came to light
at a school system in Maryland where about 20,000 Chromebooks bricked when upgraded to
TLS 1.3. Also, financial-services organizations were vehemently opposed because the
encryption made them blind to what was happening on their own networks. The IETF made a
few enhancements to allow the protocol to work with monitoring tools if implemented correctly.
In addition to security improvements, TLS 1.3 eliminated a number of older algorithms that did
nothing other than create vulnerabilities. These include, among others, RC4, DES, 3DES, MD5,
SHA-1 and the static RSA and Diffie-Hellman key exchanges.
Also, the updated protocol added a function called ―0-RTT resumption‖ that enables the client
and server to remember if they have communicated before. If prior communications exist, the
previous keys can be used, security checks skipped and the client and server can begin
communicating immediately. It is believed that some of the bigger tech companies pushed for
0-RTT because they benefit from the faster connections, but there is some concern from
security professionals.
The security benefits alone should justify TLS 1.3, but there are network reasons as well. In
addition to the security improvements, TLS 1.3 is lighter weight than its predecessor and uses
fewer resources. This means it's more efficient, consumes fewer CPU cycles and reduces
latency, which leads to better performance.
HTTPS
HyperText Transfer Protocol Secure (HTTPS) is an extension of the Hypertext Transfer
Protocol (HTTP). It is used for secure communication over a computer network, and is widely
used on the Internet. In HTTPS, the communication protocol is encrypted using Transport
Layer Security (TLS) or, formerly, its predecessor, Secure Sockets Layer (SSL). The protocol is
therefore also often referred to as HTTP over TLS, or HTTP over SSL.
The principal motivations for HTTPS are authentication of the accessed website, protection of
the privacy and integrity of the exchanged data while in transit. It protects against man-in-the-
middle attacks. The bidirectional encryption of communications between a client and server
protects against eavesdropping and tampering of the communication. In practice, this provides
a reasonable assurance that one is communicating without interference by attackers with the
website that one intended to communicate with, as opposed to an impostor.
The authentication aspect of HTTPS requires a trusted third party to sign server-side digital
certificates, which historically was expensive. Thus, fully authenticated HTTPS connections
were more commonly found only on secured payment transaction services and other secured
corporate information systems on the World Wide Web. In 2016, the non-profit organisation
Let's Encrypt began to offer free server certificates to all, and a campaign by the Electronic
Frontier Foundation and support from web browser developers led the protocol to become more
prevalent[6]. HTTPS is now used more often by web users than the original non-secure HTTP,
primarily to protect page authenticity on all types of websites; secure accounts; and to keep
user communications, identity, and web browsing private.
HTTPS stands for Hyper Text Transfer Protocol Secure. It is a protocol for securing the
communication between two systems e.g. the browser and the web server.
The key difference between communication over http and https is this: http transfers data
between the browser and the web server in hypertext (plain, readable) format, whereas https
transfers data in encrypted format. Thus, https
prevents hackers from reading and modifying the data during the transfer between the browser
and the web server. Even if hackers manage to intercept the communication, they will not be
able to use it because the message is encrypted.
HTTPS establishes an encrypted link between the browser and the web server using the Secure
Socket Layer (SSL) or Transport Layer Security (TLS) protocol. TLS is the newer version of SSL.
SSL is the standard security technology for establishing an encrypted link between the two
systems. These can be browser to server, server to server or client to server. Basically, SSL
ensures that the data transfer between the two systems remains encrypted and private.
https is essentially http over SSL. SSL establishes an encrypted link using an SSL
certificate, which is also known as a digital certificate.
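A server's SSL/TLS certificate can be retrieved and inspected with Python's standard ssl module; the host name below is only a placeholder.

import ssl

HOST = "example.com"   # placeholder host

# Fetch the server's certificate in PEM format over a TLS connection.
pem_cert = ssl.get_server_certificate((HOST, 443))
print(pem_cert[:200], "...")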
http vs https
http − Transfers data in hypertext (structured text) format; uses port 80 by default; the traffic
is not encrypted.
https − Transfers data in encrypted format; uses port 443 by default; requires an SSL
certificate on the server.
Advantage of https
As described above, https protects the privacy and integrity of data in transit, authenticates
the website being visited, and guards users against eavesdropping and man-in-the-middle
attacks.
Secure Shell (SSH) is a cryptographic network protocol for operating network services securely
over an unsecured network. Typical applications include remote command-line login and
remote command execution, but any network service can be secured with SSH.
SSH was designed as a replacement for Telnet and for unsecured remote shell protocols such
as the Berkeley rlogin, rsh, and rexec protocols. Those protocols send information,
notably passwords, in plaintext, rendering them susceptible to interception and disclosure
using packet analysis.[4] The encryption used by SSH is intended to provide confidentiality and
integrity of data over an unsecured network, such as the Internet, although files leaked
by Edward Snowden indicate that the National Security Agency can sometimes decrypt SSH,
allowing them to read the contents of SSH sessions.
SSH uses public-key cryptography to authenticate the remote computer and allow it to
authenticate the user, if necessary.[2] There are several ways to use SSH; one is to use
automatically generated public-private key pairs to simply encrypt a network connection, and
then use password authentication to log on.
Another is to use a manually generated public-private key pair to perform the authentication,
allowing users or programs to log in without having to specify a password. In this scenario,
anyone can produce a matching pair of different keys (public and private). The public key is
placed on all computers that must allow access to the owner of the matching private key (the
owner keeps the private key secret). While authentication is based on the private key, the key
itself is never transferred through the network during authentication. SSH only verifies
whether the same person offering the public key also owns the matching private key. In all
versions of SSH it is important to verify unknown public keys, i.e. associate the public keys
with identities, before accepting them as valid. Accepting an attacker's public key without
validation will authorize an unauthorized attacker as a valid user.
Authentication: OpenSSH Key management
On Unix-like systems, the list of authorized public keys is typically stored in the home
directory of the user that is allowed to log in remotely, in the file ~/.ssh/authorized_keys. This
file is respected by SSH only if it is not writable by anything apart from the owner and root.
When the public key is present on the remote end and the matching private key is present on
the local end, typing in the password is no longer required (some software like Message Passing
Interface (MPI) stack may need this password-less access to run properly). However, for
additional security the private key itself can be locked with a passphrase.
The private key can also be looked for in standard places, and its full path can be specified as a
command line setting (the option -i for ssh). The ssh-keygen utility produces the public and
private keys, always in pairs.
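Key pairs of the kind ssh-keygen produces can also be generated programmatically. The sketch below uses the third-party paramiko package (an assumption; the ssh-keygen tool itself would do just as well) to create an RSA key pair, save the private key, and print the single line that would be appended to ~/.ssh/authorized_keys on the server. File names and the passphrase are placeholders.

import paramiko

# Generate a 2048-bit RSA key pair (the counterpart of `ssh-keygen -t rsa -b 2048`).
key = paramiko.RSAKey.generate(bits=2048)

# The private key stays with the owner; it can be protected with a passphrase.
key.write_private_key_file("id_rsa_demo", password="demo-passphrase")

# The public half, formatted as one authorized_keys line for the remote server.
print(f"{key.get_name()} {key.get_base64()} demo-user@example")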
SSH is typically used to log into a remote machine and execute commands, but it also
supports tunneling, forwarding TCP ports and X11 connections; it can transfer files using the
associated SSH file transfer (SFTP) or secure copy (SCP) protocols.[2] SSH uses the client-
server model.
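Logging in, running a command and transferring a file can be sketched with the same third-party paramiko package (again an assumption, with a placeholder host, user and key file):

import paramiko

client = paramiko.SSHClient()
client.load_system_host_keys()                       # trust existing known_hosts entries
client.connect("server.example", username="demo",
               key_filename="id_rsa_demo")           # public-key authentication

# Remote command execution over the encrypted channel.
stdin, stdout, stderr = client.exec_command("uname -a")
print(stdout.read().decode())

# SFTP: file transfer layered over the same Secure Shell connection.
sftp = client.open_sftp()
sftp.put("report.txt", "/tmp/report.txt")
sftp.close()
client.close()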
The standard TCP port 22 has been assigned for contacting SSH servers.
SSH is important in cloud computing to solve connectivity problems, avoiding the security
issues of exposing a cloud-based virtual machine directly on the Internet. An SSH tunnel can
provide a secure path over the Internet, through a firewall to a virtual machine.
The protocol works in the client-server model, which means that the connection is established
by the SSH client connecting to the SSH server. The SSH client drives the connection setup
process and uses public key cryptography to verify the identity of the SSH server. After the
setup phase the SSH protocol uses strong symmetric encryption and hashing algorithms to
ensure the privacy and integrity of the data that is exchanged between the client and server.
There are several options that can be used for user authentication. The most common ones are
passwords and public key authentication. The public key authentication method is primarily
used for automation and sometimes by system administrators for single sign-on. It has turned
out to be much more widely used than we ever anticipated. The idea is to have a cryptographic
key pair - public key and private key - and configure the public key on a server to authorize
access and grant anyone who has a copy of the private key access to the server. The keys used
for authentication are called SSH keys. Public key authentication is also used with smartcards,
such as the CAC and PIV cards used by the US government.
The main use of key-based authentication is to enable secure automation. Automated secure
shell file transfers are used to seamlessly integrate applications and also for automated
systems & configuration management.
We have found that large organizations have far more SSH keys than they imagine,
and managing SSH keys has become very important. SSH keys grant access just as user names
and passwords do. They require similar provisioning and termination processes.
In some cases we have found several million SSH keys authorizing access into production
servers in customer environments, with 90% of the keys actually being unused and
representing access that was provisioned but never terminated. Ensuring proper policies,
processes, and audits for SSH usage is also critical for proper identity and access management.
Traditional identity management projects have overlooked as much as 90% of all credentials by
ignoring SSH keys.
Once a connection has been established between the SSH client and server, the data that is
transmitted is encrypted according to the parameters negotiated in the setup. During the
negotiation the client and server agree on the symmetric encryption algorithm to be used and
generate the encryption key that will be used. The traffic between the communicating parties is
protected with industry-standard strong encryption algorithms (such as AES, the Advanced
Encryption Standard), and the SSH protocol also includes a mechanism that ensures the
integrity of the transmitted data by using standard hash algorithms (such as SHA-2, the Secure
Hash Algorithm family).
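Integrity protection of this kind is usually built from a keyed hash (an HMAC). The sketch below shows the general principle with Python's standard hmac module and SHA-256: the sender appends a tag computed with a shared session key, and the receiver recomputes the tag and rejects any packet that does not match. Real SSH cipher suites define the exact construction; this is only an illustration.

import hashlib
import hmac
import os

session_key = os.urandom(32)            # integrity key agreed during key exchange

def protect(payload: bytes):
    tag = hmac.new(session_key, payload, hashlib.sha256).digest()
    return payload, tag

def verify(payload: bytes, tag: bytes) -> bool:
    expected = hmac.new(session_key, payload, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

packet, tag = protect(b"ls -l /home/demo")
print(verify(packet, tag))                       # True: untampered
print(verify(packet + b"; rm -rf /", tag))       # False: modification detected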
Many laptop computers have wireless cards pre-installed. The ability to enter a network while
mobile has great benefits. However, wireless networking is prone to some security
issues. Hackers have found wireless networks relatively easy to break into, and even use
wireless technology to hack into wired networks.[2] As a result, it is very important that
enterprises define effective wireless security policies that guard against unauthorized access to
important resources.[3] Wireless Intrusion Prevention Systems (WIPS) or Wireless Intrusion
Detection Systems (WIDS) are commonly used to enforce wireless security policies.
The risks to users of wireless technology have increased as the service has become more
popular. There were relatively few dangers when wireless technology was first introduced.
Hackers had not yet had time to latch on to the new technology, and wireless networks were
not commonly found in the work place. However, there are many security risks associated with
the current wireless protocols and encryption methods, and in the carelessness and ignorance
that exists at the user and corporate IT level.[4] Hacking methods have become much more
sophisticated and innovative with wireless access. Hacking has also become much easier and
more accessible with easy-to-use Windows- or Linux-based tools being made available on the
web at no charge.
Some organizations that have no wireless access points installed do not feel that they need to
address wireless security concerns. In-Stat MDR and META Group have estimated that 95% of
all corporate laptop computers that were planned to be purchased in 2005 were equipped with
wireless cards. Issues can arise in a supposedly non-wireless organization when a wireless
laptop is plugged into the corporate network. A hacker could sit out in the parking lot and
gather information from it through laptops and/or other devices, or even break in through this
wireless card–equipped laptop and gain access to the wired network.
Anyone within the geographical network range of an open, unencrypted wireless network can
"sniff", or capture and record, the traffic, gain unauthorized access to internal network
resources as well as to the internet, and then use the information and resources to perform
disruptive or illegal acts. Such security breaches have become important concerns for both
enterprise and home networks.
If router security is not activated or if the owner deactivates it for convenience, it creates a
free hotspot. Since most 21st-century laptop PCs have wireless networking built in (see Intel
"Centrino" technology), they don't need a third-party adapter such as a PCMCIA
Card or USB dongle. Built-in wireless networking might be enabled by default, without the
owner realizing it, thus broadcasting the laptop's accessibility to any computer nearby.
Modern operating systems such as Linux, macOS, or Microsoft Windows make it fairly easy to
set up a PC as a wireless LAN "base station" using Internet Connection Sharing, thus allowing
all the PCs in the home to access the Internet through the "base" PC. However, lack of
knowledge among users about the security issues inherent in setting up such systems often
may allow others nearby access to the connection. Such "piggybacking" is usually achieved
without the wireless network operator's knowledge; it may even be without the knowledge of
the intruding user if their computer automatically selects a nearby unsecured wireless network
to use as an access point.
If an employee (trusted entity) brings in a wireless router and plugs it into an unsecured
switchport, the entire network can be exposed to anyone within range of the signals. Similarly,
if an employee adds a wireless interface to a networked computer using an open USB port, they
may create a breach in network security that would allow access to confidential materials.
However, there are effective countermeasures (like disabling open switchports during switch
configuration and VLAN configuration to limit network access) that are available to protect both
the network and the information it contains, but such countermeasures must be applied
uniformly to all network devices.
IEEE 802.11
IEEE 802.11 is part of the IEEE 802 set of LAN protocols, and specifies the set of media access
control (MAC) and physical layer (PHY) protocols for implementing wireless local area
network (WLAN) Wi-Fi computer communication in various frequencies, including but not
limited to 2.4 GHz, 5 GHz, and 60 GHz frequency bands.
They are the world's most widely used wireless computer networking standards, used in most
home and office networks to allow laptops, printers, and smartphones to talk to each other and
access the Internet without connecting wires. They are created and maintained by the Institute
of Electrical and Electronics Engineers (IEEE) LAN/MAN Standards Committee (IEEE 802). The
base version of the standard was released in 1997, and has had subsequent amendments. The
standard and amendments provide the basis for wireless network products using the Wi-
Fi brand. While each amendment is officially revoked when it is incorporated in the latest
version of the standard, the corporate world tends to market to the revisions because they
concisely denote capabilities of their products. As a result, in the marketplace, each revision
tends to become its own standard.
The protocols are typically used in conjunction with IEEE 802.2, and are designed to interwork
seamlessly with Ethernet, and are very often used to carry Internet Protocol traffic.
Although the IEEE 802.11 specifications list channels that might be used, the radio
frequency spectrum availability allowed varies significantly by regulatory domain.
General description
The 802.11 family consists of a series of half-duplex over-the-air modulation techniques that
use the same basic protocol. The 802.11 protocol family employs carrier-sense multiple access
with collision avoidance (CSMA/CA), whereby equipment listens to a channel for other users
(including non-802.11 users) before transmitting each packet.
802.11-1997 was the first wireless networking standard in the family, but 802.11b was the
first widely accepted one, followed by 802.11a, 802.11g, 802.11n, and 802.11ac. Other
standards in the family (c–f, h, j) are service amendments that are used to extend the current
scope of the existing standard, which may also include corrections to a previous specification. [1]
802.11b and 802.11g use the 2.4 GHz ISM band, operating in the United States under Part
15 of the U.S. Federal Communications Commission Rules and Regulations; 802.11n can also
use that band. Because of this choice of frequency band, 802.11b/g/n equipment may
occasionally suffer interference in the 2.4 GHz band from microwave ovens, cordless
telephones, Bluetooth devices, and other appliances. 802.11b and 802.11g control their interference and
susceptibility to interference by using direct-sequence spread spectrum (DSSS) and orthogonal
frequency-division multiplexing (OFDM) signaling methods, respectively.
802.11a uses the 5 GHz U-NII band, which, for much of the world, offers at least 23 non-
overlapping 20 MHz-wide channels rather than the 2.4 GHz ISM frequency band offering only
three non-overlapping 20 MHz-wide channels, where other adjacent channels overlap—see list
of WLAN channels. Better or worse performance with higher or lower frequencies (channels)
may be realized, depending on the environment. 802.11n can use either the 2.4 GHz or 5 GHz
band; 802.11ac uses only the 5 GHz band.
The segment of the radio frequency spectrum used by 802.11 varies between countries. In the
US, 802.11a and 802.11g devices may be operated without a license, as allowed in Part 15 of
the FCC Rules and Regulations. Frequencies used by channels one through six of 802.11b and
802.11g fall within the 2.4 GHz amateur radio band. Licensed amateur radio operators may
operate 802.11b/g devices under Part 97 of the FCC Rules and Regulations, allowing increased
power output but not commercial content or encryption. [2]
In 2018, the Wi-Fi Alliance began using a consumer-friendly generation numbering scheme for
the publicly used 802.11 protocols. Wi-Fi generations 1–6 refer to the 802.11b, 802.11a,
802.11g, 802.11n, 802.11ac, and 802.11ax protocols, in that order. [3][4]
History
802.11 technology has its origins in a 1985 ruling by the U.S. Federal Communications
Commission that released the ISM band for unlicensed use.
In 1991 NCR Corporation/AT&T (now Nokia Labs and LSI Corporation) invented a precursor to
802.11 in Nieuwegein, the Netherlands. The inventors initially intended to use the technology
for cashier systems. The first wireless products were brought to the market under the
name WaveLAN with raw data rates of 1 Mbit/s and 2 Mbit/s.
Vic Hayes, who held the chair of IEEE 802.11 for 10 years, and has been called the "father of
Wi-Fi", was involved in designing the initial 802.11b and 802.11a standards within the IEEE.
In 1999, the Wi-Fi Alliance was formed as a trade association to hold the Wi-Fi trademark
under which most products are sold.
The major commercial breakthrough came with Apple Inc. adopting Wi-Fi for their iBook series
of laptops in 1999. It was the first mass consumer product to offer Wi-Fi network connectivity,
which was then branded by Apple as AirPort. One year later IBM followed with
its ThinkPad 1300 series in 2000.
Legacy 802.11 with direct-sequence spread spectrum was rapidly supplanted and popularized
by 802.11b.
Since the 2.4 GHz band is heavily used to the point of being crowded, using the relatively
unused 5 GHz band gives 802.11a a significant advantage. However, this high carrier
frequency also brings a disadvantage: the effective overall range of 802.11a is less than that of
802.11b/g. In theory, 802.11a signals are absorbed more readily by walls and other solid
objects in their path due to their smaller wavelength, and, as a result, cannot penetrate as far
as those of 802.11b. In practice, 802.11b typically has a higher range at low speeds (802.11b
will reduce speed to 5.5 Mbit/s or even 1 Mbit/s at low signal strengths). 802.11a also suffers
from interference,[33] but locally there may be fewer signals to interfere with, resulting in less
interference and better throughput.
802.11b
The 802.11b standard has a maximum raw data rate of 11 Mbit/s (Megabits per second), and
uses the same media access method defined in the original standard. 802.11b products
appeared on the market in early 2000, since 802.11b is a direct extension of the modulation
technique defined in the original standard. The dramatic increase in throughput of 802.11b
(compared to the original standard) along with simultaneous substantial price reductions led to
the rapid acceptance of 802.11b as the definitive wireless LAN technology.
Devices using 802.11b experience interference from other products operating in the 2.4 GHz
band. Devices operating in the 2.4 GHz range include microwave ovens, Bluetooth devices,
baby monitors, cordless telephones, and some amateur radio equipment. As unlicensed
intentional radiators in this ISM band, they must not interfere with and must tolerate
interference from primary or secondary allocations (users) of this band, such as amateur radio.
802.11g
In June 2003, a third modulation standard was ratified: 802.11g. This works in the 2.4 GHz
band (like 802.11b), but uses the same OFDM based transmission scheme as 802.11a. It
operates at a maximum physical layer bit rate of 54 Mbit/s exclusive of forward error
correction codes, or about 22 Mbit/s average throughput. 802.11g hardware is fully backward
compatible with 802.11b hardware, and therefore is encumbered with legacy issues that
reduce throughput by ~21% when compared to 802.11a.
The then-proposed 802.11g standard was rapidly adopted in the market starting in January
2003, well before ratification, due to the desire for higher data rates as well as reductions in
manufacturing costs. By summer 2003, most dual-band 802.11a/b products became dual-
band/tri-mode, supporting a and b/g in a single mobile adapter card or access point. Details of
making b and g work well together occupied much of the lingering technical process; in an
802.11g network, however, activity of an 802.11b participant will reduce the data rate of the
overall 802.11g network.
Like 802.11b, 802.11g devices also suffer interference from other products operating in the
2.4 GHz band, for example wireless keyboards.
802.11-2007
In 2003, task group TGma was authorized to "roll up" many of the amendments to the 1999
version of the 802.11 standard. REVma or 802.11ma, as it was called, created a single
document that merged 8 amendments (802.11a, b, d, e, g, h, i, j) with the base standard. Upon
approval on March 8, 2007, 802.11REVma was renamed to the then-current base
standard IEEE 802.11-2007.
802.11n
802.11n is an amendment that improves upon the previous 802.11 standards by
adding multiple-input multiple-output antennas (MIMO). 802.11n operates on both the
2.4 GHz and the 5 GHz bands. Support for 5 GHz bands is optional. Its net data rate ranges
from 54 Mbit/s to 600 Mbit/s. The IEEE has approved the amendment, and it was published
in October 2009. Prior to the final ratification, enterprises were already migrating to 802.11n
networks based on the Wi-Fi Alliance's certification of products conforming to a 2007 draft of
the 802.11n proposal.
802.11-2012
In May 2007, task group TGmb was authorized to "roll up" many of the amendments to the
2007 version of the 802.11 standard.[38] REVmb or 802.11mb, as it was called, created a single
document that merged ten amendments (802.11k, r, y, n, w, p, z, v, u, s) with the 2007 base
standard. In addition much cleanup was done, including a reordering of many of the
clauses.[39] Upon publication on March 29, 2012, the new standard was referred to as IEEE
802.11-2012.
802.11ac
IEEE 802.11ac-2013 is an amendment to IEEE 802.11, published in December 2013, that
builds on 802.11n.[40] Changes compared to 802.11n include wider channels (80 or 160 MHz
versus 40 MHz) in the 5 GHz band, more spatial streams (up to eight versus four), higher-order
modulation (up to 256-QAM vs. 64-QAM), and the addition of Multi-user MIMO (MU-MIMO). As
of October 2013, high-end implementations support 80 MHz channels, three spatial streams,
and 256-QAM, yielding a data rate of up to 433.3 Mbit/s per spatial stream, 1300 Mbit/s total,
in 80 MHz channels in the 5 GHz band.[41] Vendors have announced plans to release so-called
"Wave 2" devices with support for 160 MHz channels, four spatial streams, and MU-MIMO in
2014 and 2015.
802.11ad
IEEE 802.11ad is an amendment that defines a new physical layer for 802.11 networks to
operate in the 60 GHz millimeter wave spectrum. This frequency band has significantly
different propagation characteristics than the 2.4 GHz and 5 GHz bands where Wi-Fi networks
operate. Products implementing the 802.11ad standard are being brought to market under
the WiGig brand name. The certification program is now being developed by the Wi-Fi
Alliance instead of the now defunct Wireless Gigabit Alliance.[45] The peak transmission rate of
802.11ad is 7 Gbit/s.
IEEE 802.11ad is a protocol used for very high data rates (about 8 Gbit/s) and for short range
communication (about 1–10 meters).
TP-Link announced the world's first 802.11ad router in January 2016.
The WiGig standard is not widely known, although it was announced in 2009 and added to
the IEEE 802.11 family in December 2012.
802.11af
IEEE 802.11af, also referred to as "White-Fi" and "Super Wi-Fi",[49] is an amendment, approved
in February 2014, that allows WLAN operation in TV white space spectrum in
the VHF and UHF bands between 54 and 790 MHz. It uses cognitive radio technology to
transmit on unused TV channels, with the standard taking measures to limit interference for
primary users, such as analog TV, digital TV, and wireless microphones. [51] Access points and
stations determine their position using a satellite positioning system such as GPS, and use the
Internet to query a geolocation database (GDB) provided by a regional regulatory agency to
discover what frequency channels are available for use at a given time and position. [51] The
physical layer uses OFDM and is based on 802.11ac.[52] The propagation path loss as well as
the attenuation by materials such as brick and concrete is lower in the UHF and VHF bands
than in the 2.4 GHz and 5 GHz bands, which increases the possible range. [51] The frequency
channels are 6 to 8 MHz wide, depending on the regulatory domain. [51] Up to four channels
may be bonded in either one or two contiguous blocks. MIMO operation is possible with up to
four streams used for either space–time block code (STBC) or multi-user (MU) operation.[51] The
achievable data rate per spatial stream is 26.7 Mbit/s for 6 and 7 MHz channels, and
35.6 Mbit/s for 8 MHz channels.[30] With four spatial streams and four bonded channels, the
maximum data rate is 426.7 Mbit/s for 6 and 7 MHz channels and 568.9 Mbit/s for 8 MHz
channels.
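The quoted maximum rates follow directly from multiplying the per-stream rate by the number of
spatial streams and bonded channels. The short Python sketch below is purely illustrative (the
function name is our own); the small difference from the quoted 426.7 Mbit/s comes from
rounding 26.67 Mbit/s to 26.7.

def aggregate_rate_mbps(rate_per_stream_mbps, spatial_streams, bonded_channels):
    # Aggregate PHY rate scales linearly with streams and bonded channels.
    return rate_per_stream_mbps * spatial_streams * bonded_channels

# Four spatial streams over four bonded 6/7 MHz channels at 26.7 Mbit/s per stream
print(aggregate_rate_mbps(26.7, 4, 4))   # 427.2, close to the 426.7 Mbit/s quoted above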
802.11-2016
IEEE 802.11-2016 which was known as IEEE 802.11 REVmc, is a revision based on IEEE
802.11-2012, incorporating 5 amendments (11ae, 11aa, 11ad, 11ac, 11af). In addition, existing
MAC and PHY functions have been enhanced and obsolete features were removed or marked
for removal. Some clauses and annexes have been renumbered.
802.11ah
IEEE 802.11ah, published in 2017, defines a WLAN system operating at sub-1 GHz license-
exempt bands. Due to the favorable propagation characteristics of the low frequency spectra,
802.11ah can provide improved transmission range compared with the conventional 802.11
WLANs operating in the 2.4 GHz and 5 GHz bands. 802.11ah can be used for various purposes
including large scale sensor networks,[56] extended range hotspot, and outdoor Wi-Fi for
cellular traffic offloading, whereas the available bandwidth is relatively narrow. The protocol
is intended to have power consumption competitive with low-power Bluetooth while providing a much wider range.
802.11ai
IEEE 802.11ai is an amendment to the 802.11 standard that added new mechanisms for a
faster initial link setup time.
802.11aj
IEEE 802.11aj is a rebanding of 802.11ad for use in the 45 GHz unlicensed spectrum available
in some regions of the world (specifically China). Alternatively known as China Milli-Meter
Wave (CMMW).
802.11aq
IEEE 802.11aq is an amendment to the 802.11 standard that will enable pre-association discovery of
services. This extends some of the mechanisms in 802.11u that enabled device discovery to further
discover the services running on a device, or provided by a network.
802.11ax
IEEE 802.11ax (marketed as "Wi-Fi 6" by the Wi-Fi Alliance) is the successor to 802.11ac, and
will increase the efficiency of WLAN networks. Currently in development, this project has the
goal of providing 4x the throughput of 802.11ac at the user layer, [59] having just 37% higher
nominal data rates at the PHY layer.[60] The previous amendment (802.11ac) introduced
Multi-User MIMO, a spatial multiplexing technique. MU-MIMO allows the access point to form
beams towards each client while transmitting information simultaneously. By doing so, the
interference between clients is reduced and the overall throughput is increased, since multiple
clients can receive data at the same time. With 802.11ax, a similar multiplexing is introduced
in the frequency domain, namely OFDMA. With this technique, multiple clients are assigned
different Resource Units in the available spectrum. An 80 MHz channel can thus be split into
multiple Resource Units, so that multiple clients receive different types of data over the same
spectrum simultaneously. To provide enough subcarriers to support the requirements of
OFDMA, the number of subcarriers is increased by a factor of 4 compared to the 802.11ac
standard. In other words, for 20, 40, 80 and 160 MHz channels, there are 64, 128, 256 and 512
subcarriers in the 802.11ac standard, while there are 256, 512, 1024 and 2048 subcarriers in
the 802.11ax standard. Since the available bandwidths have not changed and the number of
subcarriers is increased by a factor of 4, the subcarrier spacing is reduced by a factor of 4 as
well, which makes the OFDM symbols 4 times longer. For example, the duration of an OFDM
symbol is 3.2 microseconds in 802.11ac and 12.8 microseconds in 802.11ax (both without
guard intervals).
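The relationship between channel width, subcarrier count, spacing, and symbol duration
described above can be verified with a few lines of arithmetic. The sketch below is illustrative
only (the function name is assumed); it reproduces the 3.2 µs and 12.8 µs figures for a 20 MHz
channel.

def ofdm_symbol(channel_mhz, n_subcarriers):
    # Subcarrier spacing = channel width / number of subcarriers; symbol duration = 1 / spacing
    spacing_khz = channel_mhz * 1000.0 / n_subcarriers
    symbol_us = 1000.0 / spacing_khz          # excludes the guard interval
    return spacing_khz, symbol_us

print(ofdm_symbol(20, 64))    # 802.11ac: (312.5 kHz, 3.2 us)
print(ofdm_symbol(20, 256))   # 802.11ax: (78.125 kHz, 12.8 us)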
802.11ay
IEEE 802.11ay is a standard that is being developed. It is an amendment that defines a
new physical layer for 802.11 networks to operate in the 60 GHz millimeter wave spectrum. It
will be an extension of the existing 11ad, aimed at extending the throughput, range, and use
cases. The main use cases include indoor operation, outdoor backhaul, and short-range
communications. The peak transmission rate of 802.11ay is 20 Gbit/s. The main extensions
include channel bonding (2, 3 and 4 channels), MIMO, and higher modulation schemes.
802.11be
IEEE 802.11be Extremely High Throughput (EHT) is the potential next amendment of the
IEEE 802.11 standard. It will build upon 802.11ax, focusing on WLAN indoor and outdoor
operation with stationary and pedestrian speeds in the 2.4 GHz, 5 GHz, and 6 GHz frequency
bands. Being the potential successor of Wi-Fi 6, the Wi-Fi Alliance will most likely certify it as
Wi-Fi 7.
Figure: Wi-Fi application-specific (UDP) performance envelope in the 2.4 GHz band, 802.11n
with 40 MHz channels.
In a typical deployment, data frames pass over an 802.11 (WLAN) medium and are converted to
802.3 (Ethernet) frames, or vice versa. Due to the difference in the frame (header) lengths
of these two media, the application's packet size determines the speed of the data transfer. This
means applications that use small packets (e.g., VoIP) create dataflows with high-overhead
traffic (i.e., a low goodput). Other factors that contribute to the overall application data rate are
the speed with which the application transmits the packets (i.e., the data rate) and, of course,
the energy with which the wireless signal is received. The latter is determined by distance and
by the configured output power of the communicating devices. [63][64]
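The effect of small packets on goodput can be illustrated with a simplified model: each packet
carries a fixed amount of header overhead, so the fraction of the link rate available to
application data shrinks as packets get smaller. The sketch below is a rough illustration only;
the overhead figure is an assumption, and contention, ACKs and preambles are ignored.

def goodput_mbps(link_rate_mbps, payload_bytes, overhead_bytes=100):
    # Fraction of each transmission that is application payload rather than headers
    return link_rate_mbps * payload_bytes / (payload_bytes + overhead_bytes)

print(goodput_mbps(54, 1460))   # large packets: most of the link rate is usable
print(goodput_mbps(54, 160))    # small VoIP-sized packets: much lower goodput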
The same references apply to the attached graphs that show measurements
of UDP throughput. Each represents an average (UDP) throughput (please note that the error
bars are there, but barely visible due to the small variation) of 25 measurements. Each is with
a specific packet size (small or large) and with a specific data rate (10 kbit/s – 100 Mbit/s).
Markers for traffic profiles of common applications are included as well. These figures assume
there are no packet errors, which if occurring will lower transmission rate further.
802.11b, 802.11g, and 802.11n-2.4 utilize the 2.400–2.500 GHz spectrum, one of the ISM
bands. 802.11a, 802.11n and 802.11ac use the more heavily regulated 4.915–5.825 GHz band.
These are commonly referred to as the "2.4 GHz and 5 GHz bands" in most sales literature.
Each spectrum is sub-divided into channels with a center frequency and bandwidth, analogous
to the way radio and TV broadcast bands are sub-divided.
The 2.4 GHz band is divided into 14 channels spaced 5 MHz apart, beginning with channel 1,
which is centered on 2.412 GHz. The latter channels have additional restrictions or are
unavailable for use in some regulatory domains.
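Because the 2.4 GHz channels sit on a regular 5 MHz grid starting at 2.412 GHz, a channel's
center frequency can be computed directly from its number (channel 14 is a special case at
2.484 GHz). The helper below is an illustrative sketch, not part of any standard API.

def channel_center_mhz(channel):
    # Channels 1-13 sit on a 5 MHz grid; channel 14 is offset
    if channel == 14:
        return 2484
    return 2407 + 5 * channel

print(channel_center_mhz(1))    # 2412 MHz, as stated above
print(channel_center_mhz(6))    # 2437 MHz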
The channel numbering of the 5.725–5.875 GHz spectrum is less intuitive due to the differences in
regulations between countries. These are discussed in greater detail on the list of WLAN channels.
Channel spacing within the 2.4 GHz band
In addition to specifying the channel center frequency, 802.11 also specifies (in Clause 17)
a spectral mask defining the permitted power distribution across each channel. The mask
requires the signal be attenuated a minimum of 20 dB from its peak amplitude at ±11 MHz
from the centre frequency, the point at which a channel is effectively 22 MHz wide. One
consequence is that stations can use only every fourth or fifth channel without overlap.
Confusion often arises over the amount of channel separation required between transmitting
devices. 802.11b was based on direct-sequence spread spectrum (DSSS) modulation and
utilized a channel bandwidth of 22 MHz, resulting in three "non-overlapping" channels (1, 6,
and 11). 802.11g was based on OFDM modulation and utilized a channel bandwidth of
20 MHz. This occasionally leads to the belief that four "non-overlapping" channels (1, 5, 9, and
13) exist under 802.11g, although this is not the case as per 17.4.6.3 Channel Numbering of
operating channels of the IEEE Std 802.11 (2012), which states "In a multiple cell network
topology, overlapping and/or adjacent cells using different channels can operate
simultaneously without interference if the distance between the center frequencies is at least
25 MHz." This does not mean that the technical overlap of the channels implies that overlapping
channels should not be used. The amount of inter-channel interference seen on a configuration
using channels 1, 5, 9, and 13 (which is permitted in Europe, but not in North America) is
barely different from a three-channel configuration, but with an entire extra channel. [68][69]
Figure: 802.11 non-overlapping channels for 2.4 GHz (covers 802.11b/g/n).
However, overlap between channels with more narrow spacing (e.g. 1, 4, 7, 11 in North
America) may cause unacceptable degradation of signal quality and throughput, particularly
when users transmit near the boundaries of AP cells.
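Using the center frequencies computed earlier, the 25 MHz separation rule quoted from the
standard reduces to a one-line check. The sketch below is illustrative only; it confirms that
channels 1, 6 and 11 satisfy the rule while adjacent pairs such as 1 and 5 do not.

def center_mhz(channel):
    return 2484 if channel == 14 else 2407 + 5 * channel

def interference_free(ch_a, ch_b):
    # IEEE 802.11-2012, 17.4.6.3: center frequencies must be at least 25 MHz apart
    return abs(center_mhz(ch_a) - center_mhz(ch_b)) >= 25

print(interference_free(1, 6))    # True: centers are 25 MHz apart
print(interference_free(1, 5))    # False: centers are only 20 MHz apart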
IEEE uses the phrase regdomain to refer to a legal regulatory region. Different countries define
different levels of allowable transmitter power, time that a channel can be occupied, and
different available channels.[71] Domain codes are specified for the United States, Canada, ETSI
(Europe), Spain, France, Japan, and China.
Most Wi-Fi certified devices default to regdomain 0, which means least common
denominator settings, i.e., the device will not transmit at a power above the allowable power in
any nation, nor will it use frequencies that are not permitted in any nation.
The regdomain setting is often made difficult or impossible to change so that the end users do
not conflict with local regulatory agencies such as the United States' Federal Communications
Commission.
Layer 2 – Datagrams
The datagrams are called frames. Current 802.11 standards specify frame types for use in
transmission of data as well as management and control of wireless links.
Frames are divided into very specific and standardized sections. Each frame consists of a MAC
header, payload, and frame check sequence (FCS). Some frames may not have a payload.
The standard 802.11 frame layout, with field lengths in bytes, is: Frame control (2),
Duration/id (2), Address 1 (6), Address 2 (6), Address 3 (6), Sequence control (0 or 2),
Address 4 (6), QoS control (0 or 2), HT control (0 or 4), Frame body (variable), and Frame
check sequence (4).
The first two bytes of the MAC header form a frame control field specifying the form and
function of the frame. This frame control field is subdivided into the following sub-fields (a
small parsing sketch follows the list):
Protocol Version: Two bits representing the protocol version. Currently used protocol
version is zero. Other values are reserved for future use.
Type: Two bits identifying the type of WLAN frame. Control, Data, and Management are
various frame types defined in IEEE 802.11.
Subtype: Four bits providing additional discrimination between frames. Type and Subtype
are used together to identify the exact frame.
ToDS and FromDS: Each is one bit in size. They indicate whether a data frame is headed
for a distribution system. Control and management frames set these values to zero. All the
data frames will have one of these bits set. However, communication within an independent
basic service set (IBSS) network always sets these bits to zero.
More Fragments: The More Fragments bit is set when a packet is divided into multiple
frames for transmission. Every frame except the last frame of a packet will have this bit set.
Retry: Sometimes frames require retransmission, and for this there is a Retry bit that is set
to one when a frame is resent. This aids in the elimination of duplicate frames.
Power Management: This bit indicates the power management state of the sender after the
completion of a frame exchange. Access points are required to manage the connection, and
will never set the power-saver bit.
More Data: The More Data bit is used to buffer frames received in a distributed system. The
access point uses this bit to facilitate stations in power-saver mode. It indicates that at
least one frame is available, and addresses all stations connected.
Protected Frame: The Protected Frame bit is set to one if the frame body is encrypted by a
protection mechanism such as Wired Equivalent Privacy (WEP), Wi-Fi Protected
Access (WPA), or Wi-Fi Protected Access II (WPA2).
Order: This bit is set only when the "strict ordering" delivery method is employed. Frames
and fragments are not always sent in order as it causes a transmission performance
penalty.
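As a concrete illustration of the bit layout listed above, the sketch below unpacks a 16-bit
frame control value into its sub-fields. It is an illustrative helper only; the function name and
the assumption that the field is supplied as a little-endian integer are ours, not part of the
standard.

import struct

def parse_frame_control(fc):
    # fc is the 16-bit frame control field as an integer (bytes read little-endian)
    return {
        "protocol_version": fc & 0x3,
        "type": (fc >> 2) & 0x3,          # 0 = management, 1 = control, 2 = data
        "subtype": (fc >> 4) & 0xF,
        "to_ds": (fc >> 8) & 0x1,
        "from_ds": (fc >> 9) & 0x1,
        "more_fragments": (fc >> 10) & 0x1,
        "retry": (fc >> 11) & 0x1,
        "power_management": (fc >> 12) & 0x1,
        "more_data": (fc >> 13) & 0x1,
        "protected_frame": (fc >> 14) & 0x1,
        "order": (fc >> 15) & 0x1,
    }

fc_bytes = b"\x08\x41"   # example: a data frame with the ToDS and Protected Frame bits set
print(parse_frame_control(struct.unpack("<H", fc_bytes)[0]))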
The next two bytes are reserved for the Duration ID field. This field can take one of three forms:
Duration, Contention-Free Period (CFP), and Association ID (AID).
An 802.11 frame can have up to four address fields. Each field can carry a MAC address.
Address 1 is the receiver, Address 2 is the transmitter, Address 3 is used for filtering purposes
by the receiver. Address 4 is only present in data frames transmitted between
access points in an Extended Service Set or between intermediate nodes in a mesh network.
The Sequence Control field is a two-byte section used for identifying message order as well
as eliminating duplicate frames. The first 4 bits are used for the fragmentation number,
and the last 12 bits are the sequence number.
An optional two-byte Quality of Service control field, present in QoS Data frames; it was
added with 802.11e.
The payload or frame body field is variable in size, from 0 to 2304 bytes plus any overhead from
security encapsulation, and contains information from higher layers.
The Frame Check Sequence (FCS) is the last four bytes in the standard 802.11 frame. Often
referred to as the Cyclic Redundancy Check (CRC), it allows for integrity check of retrieved
frames. As frames are about to be sent, the FCS is calculated and appended. When a station
receives a frame, it can calculate the FCS of the frame and compare it to the one received. If
they match, it is assumed that the frame was not distorted during transmission.
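The FCS computation can be sketched with the standard CRC-32 polynomial, which is the same
one used by Ethernet and by Python's zlib module. The example below is illustrative only: it
ignores the bit-transmission ordering details of the standard and simply shows the
append-then-verify pattern described above.

import struct
import zlib

def append_fcs(frame_without_fcs):
    # Compute CRC-32 over header + body and append it as the last four bytes
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + struct.pack("<I", fcs)

def fcs_ok(frame_with_fcs):
    # Recompute the CRC over everything except the trailing FCS and compare
    body, received = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) == received

frame = append_fcs(b"\x08\x41" + b"example header and payload")
print(fcs_ok(frame))                                            # True
corrupted = frame[:10] + bytes([frame[10] ^ 0x01]) + frame[11:]
print(fcs_ok(corrupted))                                        # False: frame was distorted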
Management frames
Management frames are not always authenticated, and allow for the maintenance, or
discontinuance, of communication. Some common 802.11 subtypes include:
Authentication frame: 802.11 authentication begins with the wireless network interface
card (WNIC) sending an authentication frame to the access point containing its identity. With an
open system authentication, the WNIC sends only a single authentication frame, and the
access point responds with an authentication frame of its own indicating acceptance or
rejection. With shared key authentication, after the WNIC sends its initial authentication
request it will receive an authentication frame from the access point containing challenge
text. The WNIC sends an authentication frame containing the encrypted version of the
challenge text to the access point. The access point ensures the text was encrypted with
the correct key by decrypting it with its own key. The result of this process determines the
WNIC's authentication status.
Association request frame: Sent from a station, it enables the access point to allocate
resources and synchronize. The frame carries information about the WNIC, including
supported data rates and the SSID of the network the station wishes to associate with. If
the request is accepted, the access point reserves memory and establishes an association
ID for the WNIC.
Association response frame: Sent from an access point to a station containing the
acceptance or rejection to an association request. If it is an acceptance, the frame will
contain information such as an association ID and supported data rates.
Beacon frame: Sent periodically from an access point to announce its presence and provide
the SSID, and other parameters for WNICs within range.
Deauthentication frame: Sent from a station wishing to terminate connection from another
station.
Disassociation frame: Sent from a station wishing to terminate connection. It's an elegant
way to allow the access point to relinquish memory allocation and remove the WNIC from
the association table.
Probe request frame: Sent from a station when it requires information from another station.
Probe response frame: Sent from an access point containing capability information,
supported data rates, etc., after receiving a probe request frame.
Reassociation request frame: A WNIC sends a reassociation request when it drops from
range of the currently associated access point and finds another access point with a
stronger signal. The new access point coordinates the forwarding of any information that
may still be contained in the buffer of the previous access point.
Reassociation response frame: Sent from an access point containing the acceptance or
rejection to a WNIC reassociation request frame. The frame includes information required
for association such as the association ID and supported data rates.
Action frame: extends the management frame to control a certain action. Some action
categories are Block Ack, Radio Measurement, Fast BSS Transition, etc. These frames are
sent by a station when it needs to tell its peer that a certain action should be taken. For example, a
station can tell another station to set up a block acknowledgement by sending an ADDBA
Request action frame. The other station would then respond with an ADDBA
Response action frame.
Control frames
Control frames facilitate the exchange of data frames between stations. Some common
802.11 control frames include:
Acknowledgement (ACK) frame: After receiving a data frame, the receiving station will send
an ACK frame to the sending station if no errors are found. If the sending station doesn't
receive an ACK frame within a predetermined period of time, the sending station will resend
the frame.
Request to Send (RTS) frame: The RTS and CTS frames provide an optional collision
reduction scheme for access points with hidden stations. A station sends an RTS frame as
the first step in a two-way handshake required before sending data frames.
Clear to Send (CTS) frame: A station responds to an RTS frame with a CTS frame. It
provides clearance for the requesting station to send a data frame. The CTS provides
collision control management by including a time value for which all other stations are to
hold off transmission while the requesting station transmits.
Data frames
Data frames carry packets from web pages, files, etc. within the body. [73] The body begins with
an IEEE 802.2 header, with the Destination Service Access Point (DSAP) specifying the
protocol, followed by a Subnetwork Access Protocol (SNAP) header if the DSAP is hex AA, with
the organizationally unique identifier (OUI) and protocol ID (PID) fields specifying the protocol.
If the OUI is all zeroes, the protocol ID field is an EtherType value.[74] Almost all 802.11 data
frames use 802.2 and SNAP headers, and most use an OUI of 00:00:00 and an EtherType
value.
Similar to TCP congestion control on the internet, frame loss is built into the operation of
802.11. To select the correct transmission speed or Modulation and Coding Scheme, a rate
control algorithm may test different speeds. The actual packet loss rate of access points
varies widely for different link conditions. There are variations in the loss rate experienced on
production access points, between 10% and 80%, with 30% being a common average. [75] It is
important to be aware that the link layer should recover these lost frames. If the sender does
not receive an Acknowledgement (ACK) frame, the frame will be resent.
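The recovery behaviour just described (resend until an ACK arrives) can be sketched as a simple
retry loop. Everything here is illustrative: the callbacks, the retry limit, and the simulated
lossy channel are assumptions, not the standard's actual state machine.

import random

def send_with_retries(transmit, ack_received, max_retries=7):
    # Resend the frame until an ACK arrives or the retry limit is exhausted
    for attempt in range(max_retries + 1):
        transmit(retry=attempt > 0)     # the Retry bit is set on every resend
        if ack_received():              # ACK seen within the timeout window
            return True
    return False                        # give up; higher layers must recover

# Toy usage: a channel that loses 30% of frames (a common average loss rate)
random.seed(1)
delivered = send_with_retries(
    transmit=lambda retry: None,
    ack_received=lambda: random.random() > 0.30,
)
print(delivered)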
WLAN security
Wireless local area network security (WLAN security) is a security system designed to protect
networks from the security breaches to which wireless transmissions are susceptible. This type
of security is necessary because WLAN signals have no physical boundary limitations, and are
prone to illegitimate access over network resources, resulting in the vulnerability of private and
confidential data. Network operations and availability can also be compromised in case of a
WLAN security breach. To address these issues, various authentication, encryption, invisibility
and other administrative controlling techniques are used in WLANs. Business and corporate
WLANs in particular require adequate security measures to detect, prevent and block
piggybackers, eavesdroppers and other intruders.
The figure below depicts the WLAN part integrated with LAN system components. As shown, a WLAN
system is composed of stations and access points. The access point interfaces wireless local area
network devices with fixed LAN devices. A wireless network is more vulnerable to hacking
compared to a wired one; hence more security is applied to the wireless portion than to the
wired one. This Wi-Fi security is essential for users accessing Wi-Fi networks at
public places such as restaurants, airports, and railway stations.
WLAN has many benefits such as user mobility, rapid installation, flexibility and scalability.
The WLAN security services are provided by the WEP protocol (Wired Equivalent Privacy protocol).
This protocol protects link-level data during wireless transmissions between the stations (i.e.,
STAs/clients) and the APs (Access Points). The WEP protocol takes care of security only in the
wireless part and not in the wired part.
There are three security services specified by IEEE for WLAN network viz. authentication,
confidentiality and integrity.
Authentication denies access to stations that do not authenticate with the APs.
Confidentiality prevents the transmitted information from being disclosed to unauthorized parties.
Integrity makes sure that messages in transit are not altered.
There are two authentication types defined in the WLAN system (IEEE 802.11): open system
authentication and shared key authentication. In open system authentication, stations are
allowed to join the WLAN network without any identity verification; it does not use any
cryptography or encryption algorithm. The other type, shared key authentication, uses the RC4
cryptographic algorithm, as sketched below.
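The following sketch reconstructs the idea behind RC4-based shared key authentication: the AP
sends challenge text, and the station proves possession of the shared key by returning the
challenge encrypted with an RC4 keystream. It is a conceptual illustration only; real WEP
additionally prepends a per-frame initialization vector to the key and appends an integrity
check value, and the key shown here is a made-up example.

def rc4(key, data):
    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # Pseudo-random generation algorithm (PRGA): XOR the keystream with the data
    out = bytearray()
    i = j = 0
    for byte in data:
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

shared_key = bytes.fromhex("0102030405")       # hypothetical 40-bit WEP key
challenge = b"challenge text sent by the access point"
response = rc4(shared_key, challenge)          # station encrypts the challenge
print(rc4(shared_key, response) == challenge)  # AP decrypts and compares: True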
The Wireless Application Protocol (WAP) is a worldwide standard for the delivery and
presentation of wireless information to mobile phones and other wireless devices. The idea
behind WAP is simple: simplify the delivery of Internet content to wireless devices by delivering
a comprehensive, Internet-based, wireless specification. The WAP Forum released the first
version of WAP in 1998. Since then, it has been widely adopted by wireless phone
manufacturers, wireless carriers, and application developers worldwide. Many industry
analysts estimate that 90 percent of mobile phones sold over the next few years will be WAP-
enabled.
The driving force behind WAP is the WAP Forum component of the Open Mobile Alliance. The WAP
Forum was founded in 1997 by Ericsson, Motorola, Nokia, and Openwave Systems (the latter known
as Unwired Planet at the time) with the goal of making wireless Internet applications more
mainstream by delivering a development specification and framework to accelerate the delivery
of wireless applications. Since then, more than 300 corporations have joined the forum,
making WAP the de facto standard for wireless Internet applications. In June 2002, the WAP
Forum, the Location Interoperability Forum, SyncML Initiative, MMS Interoperability Group,
and Wireless Village consolidated under the name Open Mobile Alliance to create a governing
body that will be at the center of all mobile application standardization work.
The WAP architecture is composed of various protocols and an XML-based markup language
called the Wireless Markup Language (WML), which is the successor to the Handheld Device
Markup Language (HDML) as defined by Openwave Systems. WAP 2.x contains a new version
of WML, commonly referred to as WML2; it is based on the eXtensible HyperText Markup
Language (XHTML), signaling part of WAP's move toward using common Internet specifications
such as HTTP and TCP/IP.
In the remainder of this section we will take a look at the WAP programming model and the
various components that comprise the WAP architecture. Where it is applicable, we will supply
information on both the WAP 1.x and 2.x specifications.
The WAP programming model is very similar to the Internet programming model. It typically
uses the pull approach for requesting content, meaning the client makes the request for
content from the server. However, WAP also supports the ability to push content from the
server to the client using the Wireless Telephony Application Specification (WTA), which
provides the ability to access telephony functions on the client device.
Content can be delivered to a wireless device using WAP in two ways: with or without a WAP
gateway. Whether a gateway is used depends on the features required and the version of WAP
being implemented. WAP 1.x requires the use of a WAP gateway as an intermediary between
the client and the wireless application server, as depicted in Figure 11.6. This gateway is
responsible for the following:
Translating requests from the WAP protocol to the protocols used over the World Wide
Web, such as HTTP and TCP/IP.
Encoding and decoding regular Web content into compact formats that are more
appropriate for wireless communication.
Allowing use of standard HTTP-based Web servers for the generation and delivery of
wireless content. This may involve transforming the content to make it appropriate for
wireless consumption.
When developing WAP 2.x applications, you no longer are required to use a WAP gateway. WAP
2.x allows HTTP communication between the client and the origin server, so there is no need
for conversion. This is not to say, however, that a WAP gateway is not beneficial. Using a WAP
gateway will allow you to optimize the communication process and facilitate other wireless
service features such as location, privacy, and WAP Push. Figure 11.7 shows the WAP
programming model without a WAP gateway: Note that removing it makes the wireless Internet
application architecture nearly identical to that used for standard Web applications.
The WAP architecture comprises several components, each serving a specific function. These
components include a wireless application environment, session and transaction support,
security, and data transfer. The exact protocols used depend on which version of WAP you are
implementing. WAP 2.x is based mainly on common Internet protocols such as HTTP and
TCP/IP, while WAP 1.x uses proprietary protocols developed as part of the WAP specification.
We will investigate each component and its related function.
To begin, we will look at how WAP conforms to the Open Systems Interconnection (OSI) model
as defined by the International Organization for Standardization (ISO). The OSI model consists of seven
distinct layers, six of which are depicted in Figure 11.8 as they relate to the WAP architecture.
The physical layer is not shown; it sits below the network layer and defines the physical
aspects such as the hardware and the raw bit-stream. For each of the other six layers, WAP
has a corresponding layer, which will now be described in more depth.
Figure 11.8: WAP architecture and its relationship to the OSI model.
Wireless Application Environment (WAE)
The Wireless Application Environment (WAE) corresponds to the application layer of the OSI model. It
provides the required elements for interaction between Web applications and wireless clients
using a WAP microbrowser. These elements are as follows:
A specification for a microbrowser that controls the user interface and interprets WML
and WMLScript.
The foundation for the microbrowser in the form of the Wireless Markup Language
(WML). WML has been designed to accommodate the unique characteristics of wireless
devices, by incorporating a user interface model that is suitable for small form-factor
devices that do not have a QWERTY keyboard.
A complete scripting language called WMLScript that extends the functionality of WML,
enabling more capabilities on the client for business and presentation logic.
Support for other content types such as wireless bitmap images (WBMP), vCard, and
vCalendar.
Note: The WAP 2.x WAE is backward compatible with WML1. This is accomplished either via
built-in support for both languages or by translating WML1 into WML2 using eXtensible
Stylesheet Language Transformation (XSLT). The method used depends on the
implementation by the device manufacturer.
The WAP protocol stack has undergone significant change from WAP 1.x to WAP 2.x. The basis
for the change is the support for Internet Protocols (IPs) when IP connectivity is supported by
the mobile device and network. As with other parts of WAP, the WAP 2.x protocol stack is
backward-compatible. Support for the legacy WAP 1.x stack has been maintained for non-IP
and low-bandwidth IP networks that can benefit from the optimizations in the WAP 1.x protocol
stack.
We will take a look at both WAP 1.x and WAP 2.x, with a focus on the technologies used in
each version of the specification.
WAP 1.x
The protocols in the WAP 1.x protocol stack have been optimized for low-bandwidth, high-
latency networks, which are prevalent in pre-3G wireless networks. The protocols are as
follows:
Wireless Session Protocol (WSP). WSP provides capabilities similar to HTTP/1.1 while
incorporating features designed for low-bandwidth, high-latency wireless networks such
as long-lived sessions and session suspend/resume. This is particularly important, as it
makes it possible to suspend a session while not in use, to free up network resources or
preserve battery power. The communication from a WAP gateway to the microbrowser
client is over WSP.
Wireless Transaction Protocol (WTP). WTP provides a reliable transport mechanism
for the WAP datagram service. It offers similar reliability as Transmission Control
Protocol/Internet Protocol (TCP/IP), but it removes characteristics that make TCP/IP
unsuitable for wireless communication, such as the extra handshakes and additional
information for handling out-of-order packets. Since the communication is directly from
a handset to a server, this information is not required. The result is that WTP requires
less than half of the number of packets of a standard HTTP-TCP/IP request. In addition,
using WTP means that a TCP stack is not required on the wireless device, reducing the
processing power and memory required.
Wireless Transport Layer Security (WTLS). WTLS is the wireless version of
Transport Layer Security (TLS), which was formerly known as Secure Sockets Layer
(SSL). It provides privacy, data integrity, and authentication between the client and the
wireless server. Using WTLS, WAP gateways can automatically provide wireless security
for Web applications that use TLS. In addition, like the other wireless protocols, WTLS
incorporates features designed for wireless networks, such as datagram support,
optimized handshakes, and dynamic key refreshing.
Wireless Datagram Protocol (WDP). WDP is a datagram service that brings a common
interface to wireless transportation bearers. It can provide this consistent layer by using
a set of adapters designed for specific features of these bearers. It supports CDPD,
GSM, CDMA, TDMA, SMS, FLEX (a wireless technology developed by Motorola), and
Integrated Digital Enhanced Network (iDEN) protocols.
WAP 2.x
One of the main new features in WAP 2.x is the use of Internet protocols in the WAP protocol
stack. This change was precipitated by the rollout of 2.5G and 3G networks that provide IP
support directly to wireless devices. To accommodate this change, WAP 2.x has the following
new protocol layers:
Wireless Profiled HTTP (WP-HTTP). WP-HTTP is a profile of HTTP designed for the
wireless environment. It is fully interoperable with HTTP/1.1 and allows the usage of
the HTTP request/response model for interaction between the wireless device and the
wireless server.
Transport Layer Security (TLS). WAP 2.0 includes a wireless profile of TLS, which
allows secure transactions. The TLS profile includes cipher suites, certificate formats,
signing algorithms, and the use of session resume, providing robust wireless security.
There is also support for TLS tunneling, providing end-to-end security at the transport
level. The support for TLS removes the WAP security gap that was present in WAP 1.x.
Wireless Profiled TCP (WP-TCP). WP-TCP is fully interoperable with standard Internet-
based TCP implementations, while being optimized for wireless environments. These
optimizations result in lower overhead for the communication stream.
Note Wireless devices can support both the WAP 1.x and WAP 2.x protocol stacks. In this
scenario, they would need to operate independently of each other, since WAP 2.x provides
support for both stacks.
In addition to a new protocol stack, WAP 2.x introduced many other new features and services.
These new features expand the capabilities of wireless devices and allow developers to create
more useful applications and services. The following is a summary of the features of interest:
WAP Push. WAP Push enables enterprises to initiate the sending of information on the
server using a push proxy. This capability was introduced in WAP 1.2, but has been
enhanced in WAP 2.x. Applications that require updates based on external information
are particularly suited for using WAP Push. Examples include various forms of
messaging applications, stock updates, airline departure and arrival updates, and
traffic information. Before WAP Push was introduced, the wireless user was required to
poll the server for updated information, wasting both time and bandwidth.
User Agent Profile (UAProf). The UAProf enables a server to obtain information about
the client making the request. In WAP 2.x, it is based on the Composite
Capabilities/Preference Profiles (CC/PP) specification as defined by the W3C. It works
by sending information in the request object, allowing wireless servers to adapt the
information being sent according to the client device making the request.
External Functionality Interface (EFI). This allows the WAP applications within the
WAE to communicate with external applications, enabling other applications to extend
the capabilities of WAP applications, similar to plug-ins for desktop browsers.
Wireless Telephony Application (WTA). The WTA allows WAP applications to control
various telephony applications, such as making calls, answering calls, putting calls on
hold, or forwarding them. It allows WAP WTA-enabled cell phones to have integrated
voice and data services.
Persistent storage interface. WAP 2.x introduces a new storage service with a well-
defined interface to store data locally on the device. The interface defines ways to
organize, access, store, and retrieve data.
Data synchronization. For data synchronization, WAP 2.x has adopted the SyncML
solution. As outlined in Chapter 10, "Enterprise Integration through Synchronization,"
SyncML provides an XML-based protocol for synchronizing data over both WSP and
HTTP.
WAP Benefits
The WAP specification is continually changing to meet the growing demands of wireless
applications. The majority of wireless carriers and handset manufacturers support WAP and
continue to invest in the new capabilities it offers. Over the years WAP has evolved from using
proprietary protocols in WAP 1.x to using standard Internet protocols in WAP 2.x, making it
more approachable for Web developers. The following are some of the key benefits that WAP
provides:
WAP supports legacy WAP 1.x protocols that encode and optimize content for low-
bandwidth, high-latency networks while communicating with the enterprise servers
using HTTP.
WAP supports wireless profiles of Internet protocols for interoperability with Internet
applications. This allows WAP clients to communicate with enterprise servers, without
requiring a WAP gateway.
WAP allows end users to access a broad range of content over multiple wireless
networks using a common user interface, the WAP browser. Because the WAP
specification defines the markup language and microbrowser, users can be assured that
wireless content will be suitable for their WAP-enabled device.
WAP uses XML as the base language for both WML and WML2 (which uses XHTML),
making it easy for application developers to learn and build wireless Internet
applications. It also makes content transformation easier by incorporating support for
XSL stylesheets to transform XML content. Once an application is developed using WML
or WML2, any device that is WAP-compliant can access it.
WAP has support for WTA. This allows applications to communicate with the device and
network telephony functions. This permits the development of truly integrated voice and
data applications.
Using UAProf, the information delivered to each device can be highly customized.
(Chapter 13 provides more details on how this information can be used to deliver user-
specific content.)
WAP works with all of the main wireless bearers, including CDPD, GSM, CDMA, TDMA,
FLEX, and iDEN protocols. This interoperability allows developers to focus on creating
their applications, without having to worry about the underlying network that will be
used.
At present, all major wireless carriers support the WAP specification. This universal support is
expected to continue as WAP evolves, providing a robust, intuitive way to extend Web content
to wireless devices.
Wireless Transport Layer Security
Wireless Transport Layer Security (WTLS) is a security protocol, part of the Wireless
Application Protocol (WAP) stack. It sits between the WTP and WDP layers in the WAP
communications stack.
Overview
WTLS is derived from TLS. WTLS uses similar semantics adapted for a low bandwidth mobile
device. The main changes are:
Compressed data structures — Where possible, packet sizes are reduced by using bit-fields,
discarding redundancy and truncating some cryptographic elements.
New certificate format — WTLS defines a compressed certificate format. This broadly
follows the X.509 v3 certificate structure, but uses smaller data structures.
Packet based design — TLS is designed for use over a data stream. WTLS adapts that
design to be more appropriate on a packet based network. A significant amount of the
design is based on a requirement that it be possible to use a packet network such
as SMS as a data transport.
WTLS has been superseded in the WAP Wireless Application Protocol 2.0 standard by the End-
to-end Transport Layer Security Specification.
Security
WTLS uses cryptographic algorithms and in common with TLS allows negotiation of
cryptographic suites between client and server.
Algorithms
WTLS supports a range of key-exchange, encryption, and message-authentication algorithms;
the particular algorithms used in a session are negotiated between client and server, as
described below.
Interoperability
As mentioned above, the client and server negotiate the cryptographic suite. This happens when
the session is started: the client sends a list of supported algorithms and the server
chooses a suite, or refuses the connection. The standard does not mandate support of any
algorithm. An endpoint (either client or server) that needs to be interoperable with any other
endpoint may need to implement every algorithm (including some covered by intellectual
property rights).
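The negotiation described above amounts to the server scanning the client's proposal list for a
suite it also implements. The sketch below is purely illustrative; the suite names are invented
placeholders and no real WTLS identifiers are implied.

def negotiate_suite(client_suites, server_suites):
    # Pick the first client-proposed suite the server also implements
    for suite in client_suites:
        if suite in server_suites:
            return suite
    return None                                   # no common suite: refuse the connection

client = ["SUITE_A", "SUITE_B", "SUITE_C"]        # placeholder names, strongest first
server = {"SUITE_B", "SUITE_C"}
print(negotiate_suite(client, server))            # SUITE_B
print(negotiate_suite(["SUITE_X"], server))       # None -> connection refused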
In the WAP 1.x protocol architecture, the mobile device establishes a secure WTLS session with
the WAP gateway. The WAP gateway, in turn, establishes a secure SSL or TLS session with the
Web server. Within the gateway, data are not encrypted during the translation process. The
gateway is thus a point at which the data may be compromised.
There are a number of approaches to providing end-to-end security between the mobile client
and the Web server. In the WAP version 2 (known as WAP2) architecture document, the WAP
Forum defines several protocol arrangements that allow for end-to-end security.
Version 1 of WAP assumed a simplified set of protocols over the wireless network and assumed
that the wireless network did not support IP. WAP2 provides the option for the mobile device to
implement full TCP/IP-based protocols and operate over an IP-capable wireless network.
The figure below shows two ways in which this IP capability can be exploited to provide end-to-end
security. In both approaches, the mobile client implements TCP/IP and HTTP.
The first approach (Figure a) is to make use of TLS between client and server. A secure TLS
session is set up between the endpoints. The WAP gateway
acts as a TCP-level gateway and splices together two TCP connections to carry the traffic
between the endpoints. However, the TCP user data field (TLS records) remains encrypted as
it passes through the gateway, so end-to-end security is maintained.
Yet another, somewhat more complicated, approach has been defined in more specific terms by
the WAP Forum in a specification entitled "WAP Transport Layer End-to-End Security." This
approach is illustrated in Figure 17.21, which is based on a figure in [ASHL01]. In this
scenario, the WAP client connects to its usual WAP gateway and attempts to send a request
through the gateway to a secure domain. The secure content server determines the need for
security, which requires that the mobile client connect to its local WAP gateway rather than
its default WAP gateway. The Web server responds to the initial client request with an HTTP
redirect message that redirects the client to a WAP gateway that is part of the enterprise
network. This message passes back through the default gateway, which validates the redirect
and sends it to the client. The client caches the redirect information and establishes a secure
session with the enterprise WAP gateway using WTLS. After the connection is terminated, the
default gateway is reselected and used for subsequent communication to other Web servers.
Note that this approach requires that the enterprise maintain a WAP gateway on the wireless
network that the client is using.
Most people do not realize how trivial it is for any person on the Internet to forge an e-mail by
simply changing the identity profile of their own e-mail program. This makes it possible for
anyone to send you an e-mail from some known e-mail address, pretending to be someone else.
This can be compared with normal mail; you can write anything on the envelope as the return
address, and it will still get delivered to the recipient (given that the destination address is
correct). We will describe a method for signing e-mail messages, which prevents the possibility
of forgery. Signing e-mail messages will be explained in the chapter about PGP (Pretty Good
Privacy).
An e-mail message travels across many Internet servers before it reaches its final recipient.
Every one of these servers can look into the content of messages, including subject, text and
attachments. Even if these servers are run by trusted infrastructure providers, they may have
been compromised by hackers or by a rogue employee, or a government agency may
seize equipment and retrieve your personal communication.
There are two levels of security that protect against such e-mail interception. The first one is
making sure the connection to your e-mail server is secured by an encryption mechanism. The
second is by encrypting the message itself, to prevent anyone other than the recipient from
understanding the content. Connection security is covered extensively in this section and in
the sections about VPN. E-mail encryption is also covered in detail in the chapters about using
PGP.
More than 80% of all the traffic coming through a typical e-mail server on the Internet contains
either spam messages, viruses or attachments intended to harm your computer. Protection against
such hostile e-mails requires keeping your software up-to-date and having an attitude of distrust toward
any e-mail that cannot be properly authenticated. In the final chapter of this section, we will describe
some ways to protect against hostile e-mail.
Phishing attacks can come from a wide variety of sources. You may receive mails from an
organization or an individual who offers to assist you with some problem or provide you with
some service. For example, you might receive an e-mail that looks like it is from the company
that makes the anti-virus program you use. The message says that there is an important
update to their software. They have conveniently attached a handy executable file that will
automatically fix your software.
Because the sender of the message cannot be verified, such messages should be immediately
discarded, as the attached file almost certainly contains a virus or hostile program.
You may receive a message from a friend that contains an attachment. In the message, your
friend might say that the attachment is a great game, or a handy utility, or anything else.
Computer systems infected with viruses can "hijack" email accounts and send these kinds of
messages to everyone in a person's address book. The message is not from your friend - it is
from a virus that has infected your friend's computer system.
Only open attachments when you have verified the sender's address. This applies to
attachments of any type, not just executable files. Viruses can be contained in almost any type
of file: videos, images, audio, office documents. Running an anti-virus program or a spam filter
provides some protection against these hostile mails, as they will warn you whenever you
download an infected file or a trojan. However, you should not count entirely on your anti-virus
programs or spam filters, because they are only effective against threats that they know about.
They cannot protect you from threats that have not yet been included in their definition files.
(That is why it is important to keep your anti-virus and anti-spam definition files up to date.)
The safest approach regarding email attachments is to never open an attachment unless you
are completely certain that it originates from a known, trusted source.
Compromise by malware
Even if you have verified all your email and have only opened those attachments that you have
deemed safe, your computer may still be infected by a virus. For example, your friend may have
inadvertently sent you a document that contains a virus. Malware detection can be difficult,
although it is usually detected by anti-virus programs (assuming that the definition files are
current, as described above). Signs of active malware can include unexpected system slowdowns,
programs starting or crashing on their own, and unusual network activity. If this happens to
you, ensure your anti-virus program is up-to-date and then thoroughly scan your system.
Pretty Good Privacy (PGP)
PGP and similar software follow the OpenPGP standard (RFC 4880), an open standard for PGP
encryption software, for encrypting and decrypting data.
PGP fingerprint
A public key fingerprint is a shorter version of a public key. From a fingerprint, someone can
get the right corresponding public key. A fingerprint like C3A6 5E46 7B54 77DF 3C4C 9790
4D22 B3CA 5B32 FF66 can be printed on a business card.
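Conceptually, a fingerprint is a fixed-length hash of the public key material, formatted for
easy comparison. The sketch below is illustrative only: real OpenPGP v4 fingerprints are
computed as SHA-1 over a specific encoding of the key packet, not over raw key bytes as done
here, and the grouping into blocks of four hex digits simply mimics how PGP tools print them.

import hashlib

def fingerprint(public_key_material):
    # Hash the key material and group the hex digits the way PGP tools print them
    digest = hashlib.sha1(public_key_material).hexdigest().upper()
    return " ".join(digest[i:i + 4] for i in range(0, len(digest), 4))

print(fingerprint(b"example public key material"))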
Compatibility
As PGP evolves, versions that support newer features and algorithms are able to create
encrypted messages that older PGP systems cannot decrypt, even with a valid private key.
Therefore, it is essential that partners in PGP communication understand each other's
capabilities or at least agree on PGP settings.
Confidentiality
PGP can be used to send messages confidentially. For this, PGP uses hybrid cryptosystem by
combining symmetric-key encryption and public-key encryption. The message is encrypted
using a symmetric encryption algorithm, which requires a symmetric key generated by the
sender. The symmetric key is used only once and is also called a session key. The message and
its session key are sent to the receiver. The session key must be sent to the receiver so they
know how to decrypt the message, but to protect it during transmission it is encrypted with the
receiver's public key. Only the private key belonging to the receiver can decrypt the session key,
and use it to symmetrically decrypt the message.
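PGP's own message format is more involved, but the hybrid idea can be sketched in a few lines of
Python using the third-party cryptography package; here Fernet stands in for the symmetric session
cipher and RSA-OAEP for the public-key step (the key sizes, names, and message are illustrative only,
not PGP's actual algorithms or wire format).

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes
    from cryptography.fernet import Fernet

    # Receiver's key pair (in practice this already exists and only the public key is shared).
    receiver_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    receiver_public = receiver_private.public_key()

    # Sender: one-time session key, used to symmetrically encrypt the message.
    session_key = Fernet.generate_key()
    ciphertext = Fernet(session_key).encrypt(b"Meet at dawn.")

    # Sender: the session key itself is encrypted with the receiver's public key.
    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)
    wrapped_key = receiver_public.encrypt(session_key, oaep)

    # Receiver: recover the session key with the private key, then decrypt the message.
    recovered = receiver_private.decrypt(wrapped_key, oaep)
    assert Fernet(recovered).decrypt(ciphertext) == b"Meet at dawn."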
Digital signatures
PGP supports message authentication and integrity checking. The latter is used to detect
whether a message has been altered since it was completed (the message integrity property)
and the former, to determine whether it was actually sent by the person or entity claimed to be
the sender (a digital signature). Because the content is encrypted, any changes in the message
will result in failure of the decryption with the appropriate key. The sender uses PGP to create
a digital signature for the message with either the RSA or DSA algorithms. To do so, PGP
computes a hash (also called a message digest) from the plaintext and then creates the digital
signature from that hash using the sender's private key.
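As a rough illustration of the sign-then-verify flow (again using the Python cryptography package,
not PGP's actual message format; the message and key size are placeholders):

    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.hazmat.primitives import hashes

    sender_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    message = b"I agree to the terms."

    # Sign: the library hashes the message (SHA-256 digest) and signs the digest with the private key.
    signature = sender_private.sign(message, padding.PKCS1v15(), hashes.SHA256())

    # Verify: anyone holding the public key can check the signature; verify() raises
    # InvalidSignature if either the message or the signature has been altered.
    sender_private.public_key().verify(signature, message, padding.PKCS1v15(), hashes.SHA256())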
Web of trust
Both when encrypting messages and when verifying signatures, it is critical that the public key
used to send messages to someone or some entity actually does 'belong' to the intended
recipient. Simply downloading a public key from somewhere is not a reliable assurance of that
association; deliberate (or accidental) impersonation is possible. From its first version, PGP has
always included provisions for distributing users' public keys in an 'identity certification',
which is also constructed cryptographically so that any tampering (or accidental garble) is
readily detectable. However, merely making a certificate which is impossible to modify without
being detected is insufficient; this can prevent corruption only after the certificate has been
created, not before. Users must also ensure by some means that the public key in a certificate
actually does belong to the person or entity claiming it. A given public key (or more specifically,
information binding a user name to a key) may be digitally signed by a third party user to
attest to the association between someone (actually a user name) and the key. There are
several levels of confidence which can be included in such signatures. Although many
programs read and write this information, few (if any) include this level of certification when
calculating whether to trust a key.
The web of trust protocol was first described by Phil Zimmermann in 1992, in the manual for
PGP version 2.0:
As time goes on, you will accumulate keys from other people that you may want to designate as
trusted introducers. Everyone else will each choose their own trusted introducers. And
everyone will gradually accumulate and distribute with their key a collection of certifying
signatures from other people, with the expectation that anyone receiving it will trust at least
one or two of the signatures. This will cause the emergence of a decentralized fault-tolerant
web of confidence for all public keys.
The web of trust mechanism has advantages over a centrally managed public key
infrastructure scheme such as that used by S/MIME but has not been universally used. Users
have to be willing to accept certificates and check their validity manually or have to simply
accept them. No satisfactory solution has been found for the underlying problem.
Certificates
In the (more recent) OpenPGP specification, trust signatures can be used to support creation
of certificate authorities. A trust signature indicates both that the key belongs to its claimed
owner and that the owner of the key is trustworthy to sign other keys at one level below their
own. A level 0 signature is comparable to a web of trust signature since only the validity of the
key is certified. A level 1 signature is similar to the trust one has in a certificate authority
because a key signed to level 1 is able to issue an unlimited number of level 0 signatures. A
level 2 signature is highly analogous to the trust assumption users must rely on whenever they
use the default certificate authority list (like those included in web browsers); it allows the
owner of the key to make other keys certificate authorities.
PGP versions have always included a way to cancel ('revoke') identity certificates. A lost or
compromised private key will require this if communication security is to be retained by that
user. This is, more or less, equivalent to the certificate revocation lists of centralised PKI
schemes. Recent PGP versions have also supported certificate expiration dates.
The problem of correctly identifying a public key as belonging to a particular user is not unique
to PGP. All public key/private key cryptosystems have the same problem, even if in slightly
different guises, and no fully satisfactory solution is known. PGP's original scheme at least
leaves the decision as to whether or not to use its endorsement/vetting system to the user,
while most other PKI schemes do not, requiring instead that every certificate attested to by a
central certificate authority be accepted as correct.
S/MIME (Secure/Multipurpose Internet Mail Extensions)
Function
S/MIME provides the following cryptographic security services for electronic messaging
applications:
Authentication
Message integrity
Non-repudiation of origin (using digital signatures)
Privacy
Data security (using encryption)
S/MIME specifies the MIME type application/pkcs7-mime (smime-type "enveloped-data") for data
enveloping (encrypting), where the whole (prepared) MIME entity to be enveloped is encrypted
and packed into an object which subsequently is inserted into an application/pkcs7-mime MIME
entity.
S/MIME certificates
Before S/MIME can be used in any of the above applications, one must obtain and install an
individual key/certificate either from one's in-house certificate authority (CA) or from a public
CA. The accepted best practice is to use separate private keys (and associated certificates) for
signature and for encryption, as this permits escrow of the encryption key without compromise
to the non-repudiation property of the signature key. Encryption requires having the
destination party's certificate on store (which is typically automatic upon receiving a message
from the party with a valid signing certificate). While it is technically possible to send a
message encrypted (using the destination party certificate) without having one's own certificate
to digitally sign, in practice, the S/MIME clients will require the user to install their own
certificate before they allow encrypting to others. This is necessary so the message can be
encrypted for both the recipient and the sender, and a copy of the message can be kept (in the sent
folder) and be readable by the sender.
A typical basic ("class 1") personal certificate verifies the owner's "identity" only insofar as it
declares that the sender is the owner of the "From:" email address in the sense that the sender
can receive email sent to that address, and so merely proves that an email received really did
come from the "From:" address given. It does not verify the person's name or business name. If
a sender wishes to enable email recipients to verify the sender's identity in the sense that a
received certificate name carries the sender's name or an organization's name, the sender
needs to obtain a certificate ("class 2") from a CA who carries out a more in-depth identity
verification process, and this involves making inquiries about the would-be certificate holder.
For more detail on authentication, see digital signature.
Depending on the policy of the CA, the certificate and all its contents may be posted publicly
for reference and verification. This makes the name and email address available for all to see
and possibly search for. Other CAs only post serial numbers and revocation status, which does
not include any of the personal information. The latter, at a minimum, is mandatory to uphold
the integrity of the public key infrastructure.
S/MIME is sometimes considered not properly suited for use via webmail clients. Though
support can be hacked into a browser, some security practices require the private key to be
kept accessible to the user but inaccessible from the webmail server, complicating the key
advantage of webmail: providing ubiquitous accessibility. This issue is not fully specific to
S/MIME: other secure methods of signing webmail may also require a browser to execute
code to produce the signature; exceptions are PGP Desktop and versions of GnuPG, which
will grab the data out of the webmail, sign it by means of a clipboard, and put the signed
data back into the webmail page. Seen from the view of security this is a more secure
solution.
S/MIME is tailored for end-to-end security. Logically it is not possible to have a third party
inspecting email for malware and also have secure end-to-end communications. Encryption
will not only encrypt the messages, but also the malware. Thus if mail is not scanned for
malware anywhere but at the end points, such as a company's gateway, encryption will
defeat the detector and successfully deliver the malware. The only solution to this is to
perform malware scanning on end user stations after decryption. Other solutions do not
provide end-to-end trust as they require keys to be shared by a third party for the purpose
of detecting malware. Examples of this type of compromise are:
o Solutions which store private keys on the gateway server so decryption can occur prior
to the gateway malware scan. These unencrypted messages are then delivered to end
users.
o Solutions which store private keys on malware scanners so that they can inspect
message content; the encrypted message is then relayed to its destination.
Due to the requirement of a certificate for implementation, not all users can take advantage
of S/MIME, as some may wish to encrypt a message, with a public/private key pair for
example, without the involvement or administrative overhead of certificates.
Any message that an S/MIME email client stores encrypted cannot be decrypted if the
applicable key pair's private key is unavailable or otherwise unusable (e.g., the certificate has
been deleted or lost or the private key's password has been forgotten). However, an expired,
revoked, or untrusted certificate will remain usable for cryptographic purposes. Indexing of
encrypted messages' clear text may not be possible with all email clients. Neither of these
potential dilemmas is specific to S/MIME; they apply to cipher text in general and do not apply to
S/MIME messages that are only signed and not encrypted.
S/MIME signatures are usually "detached signatures": the signature information is separate
from the text being signed. The MIME type for this is multipart/signed with the second part
having a MIME subtype of application/(x-)pkcs7-signature. Mailing list software is notorious
for changing the textual part of a message and thereby invalidating the signature; however, this
problem is not specific to S/MIME, and a digital signature only reveals that the signed content
has been changed.
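To make the layout concrete, here is a hedged sketch of a multipart/signed message skeleton built
with Python's standard email package; the signature part carries placeholder bytes rather than a real
PKCS#7 structure, and the micalg value is only an example.

    from email.mime.multipart import MIMEMultipart
    from email.mime.text import MIMEText
    from email.mime.application import MIMEApplication

    # Outer container: multipart/signed, declaring the S/MIME signature protocol.
    outer = MIMEMultipart("signed", protocol="application/pkcs7-signature", micalg="sha-256")
    outer.attach(MIMEText("This is the text that gets signed."))

    # Second part: the detached signature (placeholder bytes here; normally smime.p7s).
    sig = MIMEApplication(b"<PKCS#7 signature bytes>", "pkcs7-signature", name="smime.p7s")
    outer.attach(sig)

    print(outer.as_string()[:300])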
Security issues
On May 13, 2018, the Electronic Frontier Foundation (EFF) announced critical vulnerabilities
in S/MIME, together with an obsolete form of PGP that is still used, in many email
clients.[3] Dubbed EFAIL, this is a particularly critical hit to S/MIME that will require
significant coordinated effort by many email client vendors to fix.
ZIP File Format
ZIP is an archive file format that supports lossless data compression. A ZIP file may contain
one or more files or directories that may have been compressed. The ZIP file format permits a
number of compression algorithms, though DEFLATE is the most common. This format was
originally created in 1989 and released to the public domain on February 14, 1989 by Phil
Katz, and was first implemented in PKWARE, Inc.'s PKZIP utility,[2] as a replacement for the
previous ARC compression format by Thom Henderson. The ZIP format was then quickly
supported by many software utilities other than PKZIP. Microsoft has included built-in ZIP
support (under the name "compressed folders") in versions of Microsoft Windows since 1998.
Apple has included built-in[3] ZIP support in Mac OS X 10.3 (via BOMArchiveHelper,
now Archive Utility) and later. Most free operating systems have built in support for ZIP in
similar manners to Windows and Mac OS X.
ZIP files generally use the file extensions .zip or .ZIP and the MIME media
type application/zip.[1] ZIP is used as a base file format by many programs, usually under a
different name. When navigating a file system via a user interface, graphical icons representing
ZIP files often appear as a document or other object prominently featuring a zipper.
Standardization
In April 2010, ISO/IEC JTC 1 initiated a ballot to determine whether a project should be
initiated to create an ISO/IEC International Standard format compatible with ZIP. [25] The
proposed project, entitled Document Packaging, envisaged a ZIP-compatible 'minimal
compressed archive format' suitable for use with a number of existing standards
including OpenDocument, Office Open XML and EPUB.
In 2015, ISO/IEC 21320-1 "Document Container File — Part 1: Core" was published which
states that "Document container files are conforming Zip files". [26]
ISO/IEC 21320-1:2015 requires the following main restrictions of the ZIP file format:
Files in ZIP archives may only be stored uncompressed, or using the "deflate" compression
(i.e. compression method may contain the value "0" - stored or "8" - deflated).
The encryption features are prohibited.
The digital signature features are prohibited.
The "patched data" features are prohibited.
Archives may not span multiple volumes or be segmented.
Design
.ZIP files are archives that store multiple files. ZIP allows contained files to be compressed
using many different methods, as well as simply storing a file without compressing it. Each file
is stored separately, allowing different files in the same archive to be compressed using
different methods. Because the files in a ZIP archive are compressed individually it is possible
to extract them, or add new ones, without applying compression or decompression to the entire
archive. This contrasts with the format of compressed tar files, for which such random-access
processing is not easily possible.
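This per-file independence is what lets ZIP tools read or extract a single member without
decompressing the rest. A small illustration using Python's standard zipfile module (the archive
and member names below are hypothetical):

    import zipfile

    # Listing the contents only requires reading the central directory at the end of the file.
    with zipfile.ZipFile("archive.zip") as zf:        # "archive.zip" is a placeholder
        for info in zf.infolist():
            print(info.filename, info.compress_size, info.file_size)
        # Extract one member; nothing else in the archive is decompressed.
        data = zf.read("notes.txt")                    # placeholder member name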
A directory is placed at the end of a ZIP file. This identifies what files are in the ZIP and
identifies where in the ZIP that file is located. This allows ZIP readers to load the list of files
without reading the entire ZIP archive. ZIP archives can also include extra data that is not
related to the ZIP archive. This allows for a ZIP archive to be made into a self-extracting archive
(application that decompresses its contained data), by prepending the program code to a ZIP
archive and marking the file as executable. Storing the catalog at the end also makes possible
hiding a zipped file by appending it to an innocuous file, such as a GIF image file.
The .ZIP format uses a 32-bit CRC algorithm and includes two copies of the directory structure
of the archive to provide greater protection against data loss.
Structure
A ZIP file is correctly identified by the presence of an end of central directory record which is
located at the end of the archive structure in order to allow the easy appending of new files. If
the end of central directory record indicates a non-empty archive, the name of each file or
directory within the archive should be specified in a central directory entry, along with other
metadata about the entry, and an offset into the ZIP file, pointing to the actual entry data. This
allows a file listing of the archive to be performed relatively quickly, as the entire archive does
not have to be read to see the list of files. The entries within the ZIP file also include this
information, for redundancy, in a local file header. Because ZIP files may be appended to, only
files specified in the central directory at the end of the file are valid. Scanning a ZIP file for local
file headers is invalid (except in the case of corrupted archives), as the central directory may
declare that some files have been deleted and other files have been updated.
For example, we may start with a ZIP file that contains files A, B and C. File B is then deleted
and C updated. This may be achieved by just appending a new file C to the end of the original
ZIP file and adding a new central directory that only lists file A and the new file C. When ZIP
was first designed, transferring files by floppy disk was common, yet writing to disks was very
time consuming. If you had a large zip file, possibly spanning multiple disks, and only needed
to update a few files, rather than reading and re-writing all the files, it would be substantially
faster to just read the old central directory, append the new files then append an updated
central directory.
The order of the file entries in the central directory need not coincide with the order of file
entries in the archive.
Each entry stored in a ZIP archive is introduced by a local file header with information about
the file such as the comment, file size and file name, followed by optional "extra" data fields,
and then the possibly compressed, possibly encrypted file data. The "Extra" data fields are the
key to the extensibility of the ZIP format. "Extra" fields are exploited to support the ZIP64
format, WinZip-compatible AES encryption, file attributes, and higher-resolution NTFS or Unix
file timestamps. Other extensions are possible via the "Extra" field. ZIP tools are required by
the specification to ignore Extra fields they do not recognize.
The ZIP format uses specific 4-byte "signatures" to denote the various structures in the file.
Each file entry is marked by a specific signature. The end of central directory record is
indicated with its specific signature, and each entry in the central directory starts with the 4-
byte central file header signature.
There is no BOF or EOF marker in the ZIP specification. Conventionally the first thing in a ZIP
file is a ZIP entry, which can be identified easily by its local file header signature. However, this
is not necessarily the case, as this is not required by the ZIP specification - most notably, a self-
extracting archive will begin with an executable file header.
Tools that correctly read ZIP archives must scan for the end of central directory record
signature, and then, as appropriate, the other, indicated, central directory records. They must
not scan for entries from the top of the ZIP file, because (as previously mentioned in this
section) only the central directory specifies where a file chunk starts and that it has not been
deleted. Scanning could lead to false positives, as the format does not forbid other data to be
between chunks, nor file data streams from containing such signatures. However, tools that
attempt to recover data from damaged ZIP archives will most likely scan the archive for local
file header signatures; this is made more difficult by the fact that the compressed size of a file
chunk may be stored after the file chunk, making sequential processing difficult.
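As a rough sketch of that rule, the following Python snippet locates the end of central directory
record by searching backwards for its signature and then reads two of its fields; real tools also
handle the archive comment and ZIP64, which this ignores (the file name is a placeholder).

    import struct

    EOCD_SIG = b"PK\x05\x06"      # end of central directory record signature

    with open("example.zip", "rb") as f:   # placeholder file name
        data = f.read()

    eocd = data.rfind(EOCD_SIG)            # scan from the end, never from the top
    if eocd == -1:
        raise ValueError("end of central directory record not found")

    # Within the EOCD record: total number of entries at offset 10 (2 bytes),
    # offset of the start of the central directory at offset 16 (4 bytes).
    total_entries = struct.unpack_from("<H", data, eocd + 10)[0]
    cd_offset = struct.unpack_from("<I", data, eocd + 16)[0]
    print(total_entries, cd_offset)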
Most of the signatures end with the short integer 0x4b50, which is stored in little-
endian ordering. Viewed as an ASCII string this reads "PK", the initials of the inventor Phil
Katz. Thus, when a ZIP file is viewed in a text editor the first two bytes of the file are usually
"PK". (DOS, OS/2 and Windows self-extracting ZIPs have an EXE before the ZIP so start with
"MZ"; self-extracting ZIPs for other operating systems may similarly be preceded by executable
code for extracting the archive's content on that platform.)
The .ZIP specification also supports spreading archives across multiple file-system files.
Originally intended for storage of large ZIP files across multiple floppy disks, this feature is now
used for sending ZIP archives in parts over email, or over other transports or removable media.
The FAT filesystem of DOS has a timestamp resolution of only two seconds; ZIP file records
mimic this. As a result, the built-in timestamp resolution of files in a ZIP archive is only two
seconds, though extra fields can be used to store more precise timestamps. The ZIP format has
no notion of time zone, so timestamps are only meaningful if it is known what time zone they
were created in.
In September 2007, PKWARE released a revision of the ZIP specification providing for the
storage of file names using UTF-8, finally adding Unicode compatibility to ZIP.
File headers
All multi-byte values in the header are stored in little-endian byte order. All length fields count
the length in bytes.
Local file header (selected fields):
Offset  Bytes  Description
8       2      Compression method
14      4      CRC-32
18      4      Compressed size
22      4      Uncompressed size
30      n      File name
The extra field contains a variety of optional data such as OS-specific attributes. It is divided
into chunks, each with a 16-bit ID code and a 16-bit length.
Data descriptor
If the bit at offset 3 (0x08) of the general-purpose flags field is set, then the CRC-32 and file
sizes are not known when the header is written. The fields in the local header are filled with
zero, and the CRC-32 and size are appended in a 12-byte structure (optionally preceded by a 4-
byte signature) immediately after the compressed data:
Data descriptor:
Offset  Bytes  Description
0/4     4      CRC-32
(the compressed and uncompressed sizes follow the CRC-32 in the 12-byte structure described above)

Central directory file header (selected fields):
Offset  Bytes  Description
4       2      Version made by
10      2      Compression method
16      4      CRC-32
20      4      Compressed size
24      4      Uncompressed size
42      4      Relative offset of local file header. This is the number of bytes between the start
               of the first disk on which the file occurs and the start of the local file header.
               This allows software reading the central directory to locate the position of the
               file inside the ZIP file.
46      n      File name
This ordering allows a ZIP file to be created in one pass, but the central directory is also placed
at the end of the file in order to facilitate easy removal of files from multiple-part (e.g. "multiple
floppy-disk") archives, as previously discussed.
Compression methods
The .ZIP File Format Specification documents the following compression methods: Store (no
compression), Shrink, Reduce (levels 1-4), Implode, Deflate,
Deflate64, bzip2, LZMA (EFS), WavPack, and PPMd.[28] The most commonly used compression
method is DEFLATE, which is described in IETF RFC 1951.
Compression methods mentioned, but not documented in detail in the specification include:
PKWARE Data Compression Library (DCL) Implode, IBM TERSE, and IBM LZ77 z Architecture
(PFS). A "Tokenize" method was reserved for a third party, but support was never added.
Encryption
New features including new compression and encryption (e.g. AES) methods have been
documented in the ZIP File Format Specification since version 5.2. A WinZip-developed AES-
based standard is used also by 7-Zip and Xceed, but some vendors use other
formats.[29] PKWARE SecureZIP also supports RC2, RC4, DES, Triple DES encryption methods,
Digital Certificate-based encryption and authentication (X.509), and archive header
encryption.[30]
File name encryption is introduced in .ZIP File Format Specification 6.2, which encrypts
metadata stored in Central Directory portion of an archive, but Local Header sections remain
unencrypted. A compliant archiver can falsify the Local Header data when using Central
Directory Encryption. As of version 6.2 of the specification, the Compression Method and
Compressed Size fields within Local Header are not yet masked.
ZIP64
The original .ZIP format had a 4 GiB (2^32 bytes) limit on various things (uncompressed size of a
file, compressed size of a file, and total size of the archive), as well as a limit of 65,535 (2^16−1)
entries in a ZIP archive. In version 4.5 of the specification (which is not the same as v4.5 of any
particular tool), PKWARE introduced the "ZIP64" format extensions to get around these
limitations, increasing the limits to 16 EiB (2^64 bytes). In essence, it uses a "normal" central
directory entry for a file, followed by an optional "zip64" directory entry, which has the larger
fields.[31]
The File Explorer in Windows XP does not support ZIP64, but the Explorer in Windows Vista
and later do. Likewise, some extension libraries support ZIP64, such as DotNetZip,
QuaZIP[32] and IO::Compress::Zip in Perl. Python's built-in zipfile supports it since 2.5 and
defaults to it since 3.4.[33] OpenJDK's built-in java.util.zip supports ZIP64 from version Java
7.[34] The Android Java API supports ZIP64 since Android 6.0.[35] Mac OS Sierra's Archive Utility
notably does not support ZIP64, and can create corrupt archives when ZIP64 would be
required.[36] However, the ditto command shipped with Mac OS will unzip ZIP64 files. [37] More
recent versions of Mac OS ship with the zip and unzip command line tools which do support
Zip64: to verify run zip -v and look for "ZIP64_SUPPORT".
The .ZIP file format allows for a comment containing up to 65,535 (2^16−1) bytes of data to occur
at the end of the file after the central directory.[27] Also, because the central directory specifies
the offset of each file in the archive with respect to the start, it is possible for the first file entry
to start at an offset other than zero, although some tools, for example gzip, will not process
archive files that do not start with a file entry at offset zero.
This allows arbitrary data to occur in the file both before and after the ZIP archive data, and for
the archive to still be read by a ZIP application. A side-effect of this is that it is possible to
author a file that is both a working ZIP archive and another format, provided that the other
format tolerates arbitrary data at its end, beginning, or middle. Self-extracting archives (SFX),
of the form supported by WinZip, take advantage of this, in that they are executable (.exe) files
that conform to the PKZIP AppNote.txt specification, and can be read by compliant zip tools or
libraries.
This property of the .ZIP format, and of the JAR format which is a variant of ZIP, can be
exploited to hide rogue content (such as harmful Java classes) inside a seemingly harmless file,
such as a GIF image uploaded to the web. This so-called GIFAR exploit has been demonstrated
as an effective attack against web applications such as Facebook. [38]
Limits
The minimum size of a .ZIP file is 22 bytes. Such an empty zip file contains only an End of
Central Directory Record (EOCD):
[0x50, 0x4B, 0x05, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
The maximum size for both the archive file and the individual files inside it is
4,294,967,295 bytes (2^32−1 bytes, or 4 GiB minus 1 byte) for standard ZIP. For ZIP64, the
maximum size is 18,446,744,073,709,551,615 bytes (2^64−1 bytes, or 16 EiB minus 1 byte).
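The 22-byte minimum is easy to confirm: an empty archive written by Python's standard zipfile
module consists of nothing but that EOCD record.

    import io, zipfile

    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w"):
        pass                                  # write an empty archive

    data = buf.getvalue()
    print(len(data))                          # 22
    print(data[:4] == b"PK\x05\x06")          # True: the EOCD signature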
Proprietary extensions
Extra field
.ZIP file format includes an extra field facility within file headers, which can be used to store
extra data not defined by existing ZIP specifications, and which allow compliant archivers that
do not recognize the fields to safely skip them. Header IDs 0–31 are reserved for use by
PKWARE. The remaining IDs can be used by third-party vendors for proprietary usage.
In a controversial move, PKWARE applied for a patent on 16 July 2003 describing a
method for combining ZIP and strong encryption to create a secure file. [41]
In the end, PKWARE and WinZip agreed to support each other's products. On 21 January
2004, PKWARE announced the support of WinZip-based AES compression format.[42] In a later
version of WinZip beta, it was able to support SES-based ZIP files.[43] PKWARE eventually
released version 5.2 of the .ZIP File Format Specification to the public, which documented SES.
The Free Software project 7-Zip also supports AES in ZIP files (as does its POSIX port p7zip).
When using AES encryption under WinZip, the compression method is always set to 99, with
the actual compression method stored in an AES extra data field. [44] In contrast, Strong
Encryption Specification stores the compression method in the basic file header segment of
Local Header and Central Directory, unless Central Directory Encryption is used to
mask/encrypt metadata.
Implementation
There are numerous .ZIP tools available, and numerous .ZIP libraries for various programming
environments; licenses used include proprietary and free software. WinZip, WinRAR, Info-
ZIP, 7-Zip, PeaZip and B1 Free Archiver are well-known .ZIP tools, available on various
platforms. Some of those tools have library or programmatic interfaces.
Some development libraries licensed under open source agreement are libzip and Info-ZIP. For
Java: Java Platform, Standard Edition contains the package "java.util.zip" to handle standard
.ZIP files; the Zip64File library specifically supports large files (larger than 4 GB) and treats
.ZIP files using random access; and the Apache Ant tool contains a more complete
implementation released under the Apache Software License.
The Info-ZIP implementations of the .ZIP format add support for Unix filesystem features,
such as user and group IDs, file permissions, and support for symbolic links. The Apache
Ant implementation is aware of these to the extent that it can create files with predefined Unix
permissions. The Info-ZIP implementations also know how to use the error correction
capabilities built into the .ZIP compression format. Some programs do not, and will fail on a file
that has errors.
The Info-ZIP Windows tools also support NTFS filesystem permissions, and will make an
attempt to translate from NTFS permissions to Unix permissions or vice versa when extracting
files. This can result in potentially unintended combinations, e.g. .exe files being created on
NTFS volumes with executable permission denied.
Versions of Microsoft Windows have included support for .ZIP compression in Explorer since
the Microsoft Plus! pack was released for Windows 98. Microsoft calls this feature "Compressed
Folders".[citation needed] Not all .ZIP features are supported by the Windows Compressed Folders
capability. For example, Unicode entry encoding is not supported until Windows 7, while split
and spanned archives are not readable or writable by the Compressed Folders feature, nor is
AES Encryption supported.[45]
Microsoft Office started using the zip archive format in 2006 for their Office Open XML .docx,
.xlsx, .pptx, etc. files, which became the default file format with Microsoft Office 2007.
Legacy
There are numerous other standards and formats using "zip" as part of their name. For
example, zip is distinct from gzip, and the latter is defined in an IETF RFC (RFC 1952). Both zip
and gzip primarily use the DEFLATE algorithm for compression. Likewise, the ZLIB format
(IETF RFC 1950) also uses the DEFLATE compression algorithm, but specifies different
headers for error and consistency checking. Other common, similarly named formats and
programs with different native formats include 7-Zip, bzip2, and rzip.
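The shared algorithm is easy to see from Python's standard library, where zlib and gzip wrap the
same DEFLATE stream in different containers (the sample data below is arbitrary):

    import gzip, zlib

    data = b"the quick brown fox jumps over the lazy dog " * 20
    zlib_stream = zlib.compress(data)   # DEFLATE in the zlib container (RFC 1950)
    gzip_stream = gzip.compress(data)   # the same DEFLATE algorithm in the gzip container (RFC 1952)
    print(len(data), len(zlib_stream), len(gzip_stream))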
Concerns
The theoretical maximum compression factor for a raw DEFLATE stream is about 1032 to
one,[46] but by exploiting the ZIP format in unintended ways, ZIP archives with compression
ratios of billions to one can be constructed. These "zip bombs" unzip to extremely large sizes,
overwhelming the capacity of the computer they are run on.
Radix-64 Conversion
To decode radix 64 encoded text, typically four characters are converted to three bytes. If the
string contains a single padding character (i.e. '=') the last four characters (including the
padding character) will decode to only two bytes, while '==' indicates that the four characters
will decode to only a single byte.
Base64 encoding is commonly used when there is a need to transmit binary data over media
that do not correctly handle binary data and are designed to deal only with textual data in the
7-bit US-ASCII charset.
One example of such a system is Email (SMTP), which was traditionally designed to work with
plain text data in the 7-bit US-ASCII character set. Although it was later extended to support
non-US-ASCII text messages as well as non-text messages such as audio and images, it is still
recommended to encode the data to the ASCII charset for backward compatibility.
Base64 encoding encodes any binary data or non-ASCII text data to printable ASCII format so
that it can be safely transmitted over any communication channel. For example, when you
send an email containing an image to your friend, your email software Base64 encodes the
image and inserts the equivalent text into the message.
Base64 encoding works with a 65-character subset of the US-ASCII charset. The first 64
characters out of the 65-character subset are mapped to an equivalent 6-bit binary sequence
( 26 = 64 ). The extra 65th character ( = ) is used for padding.
Each of the 6-bit binary sequences from 0 to 63 are assigned a Base64 alphabet. This mapping
between the 6-bit binary sequence and the corresponding Base64 alphabet is used during the
encoding process; the full mapping is known as the Base64 index or alphabet table (the URL-
and filename-safe variant of this table is shown further below).
The Base64 encoding algorithm receives an input stream of 8-bit bytes. It processes the input
from left to right and organizes the input into 24-bit groups by concatenating three 8-bit bytes.
These 24-bit groups are then treated as 4 concatenated 6-bit groups. Finally, each 6-bit group
is converted to a single character in the Base64 alphabet by consulting the above Base64
alphabet table.
When the input has fewer than 24 bits at the end, zero bits are added (on the right) to form an
integral number of 6-bit groups. Then, one or two pad ( = ) characters are output depending on
how many bits the last chunk of input contained:
The last chunk of input contains exactly 8 bits: Four zero bits are added to form two 6-
bit groups. Each 6-bit group is converted to the resulting Base64 encoded character using
the Base64 index table. After that two pad ( = ) characters are appended to the output.
The last chunk of input contains exactly 16 bits: Two zero bits are added to form three
6-bit groups. Each of the three 6-bit groups is converted to the corresponding Base64
alphabet. Finally a single pad ( = ) character is appended to the output.
Input: ab@yz
Step 1: Organize the input into 24-bit groups (having four 6-bit groups each). Pad with zero
bits at the end to form an integral number of 6-bit groups.
011000 010110 001001 000000 011110 010111 101000 # (padded with two zeros at the end)
Step 2: Convert the 6-bit sequences to Base64 alphabets by indexing into the Base64 index
table. Add pad character if zero bits are added at the end of the input.
24 22 9 0 30 23 40
Indexing into the Base64 alphabet table gives the following output:
YWJAeXo= # (padded with `=` to account for extra bits added)
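The result of this worked example can be checked against Python's standard base64 module:

    import base64

    print(base64.b64encode(b"ab@yz"))     # b'YWJAeXo='
    print(base64.b64decode("YWJAeXo="))   # b'ab@yz'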
RFC 4648 describes a Base64 encoding variant which is URL and Filename Safe. That means,
the output produced by this Base64 encoding variant can be safely transmitted in URLs and
used in filenames.
This variant has a simple change to the Base64 alphabet. Since + and / characters have
special meaning within URLs and filenames, they are replaced with hyphen ( - ) and underscore
(_)
The URL- and filename-safe Base64 alphabet:

Index Char   Index Char   Index Char   Index Char
  0   A       17   R       34   i       51   z
  1   B       18   S       35   j       52   0
  2   C       19   T       36   k       53   1
  3   D       20   U       37   l       54   2
  4   E       21   V       38   m       55   3
  5   F       22   W       39   n       56   4
  6   G       23   X       40   o       57   5
  7   H       24   Y       41   p       58   6
  8   I       25   Z       42   q       59   7
  9   J       26   a       43   r       60   8
 10   K       27   b       44   s       61   9
 11   L       28   c       45   t       62   - (hyphen)
 12   M       29   d       46   u       63   _ (underscore)
 13   N       30   e       47   v
 14   O       31   f       48   w     (pad)  =
 15   P       32   g       49   x
 16   Q       33   h       50   y
PGP needs (pseudo-)random numbers, among other things to generate temporary keys. To
generate these numbers, PGP uses two so-called pseudo-random number generators or PRNGs.
The first is the ANSI X9.17 generator. The second is a function which measures the entropy
from the latency in a user's keystrokes. The random pool (which is the randseed.bin file) is
used to seed the ANSI X9.17 PRNG (which uses IDEA, not 3DES). Randseed.bin is initially
generated from trueRand which is the keystroke timer. The X9.17 generator is pre-washed with
an MD5 hash of the plaintext and postwashed with some random data which is used to
generate the next randseed.bin file.
IP security (IPSec)
IP security (IPSec) is an Internet Engineering Task Force (IETF) standard suite of protocols,
used between two communication points across an IP network, that provides data
authentication, integrity, and confidentiality. It also defines how packets are encrypted,
decrypted and authenticated. The protocols needed for secure key exchange and key
management are defined in it.
Uses of IP Security –
IPsec can be used to do the following things:
To encrypt application layer data.
To provide security for routers sending routing data across the public internet.
To provide authentication without encryption, like to authenticate that the data originates
from a known sender.
To protect network data by setting up circuits using IPsec tunneling, in which all data
being sent between the two endpoints is encrypted, as with a Virtual Private
Network (VPN) connection.
Components of IP Security –
It has the following components: Encapsulating Security Payload (ESP), Authentication Header
(AH), and Internet Key Exchange (IKE).
Working of IP Security –
1. The host checks if the packet should be transmitted using IPsec or not. Such packet
traffic triggers the security policy for itself. This is done when the system sending
the packet applies the appropriate encryption. Incoming packets are also checked by
the host to verify that they are encrypted properly.
2. Then IKE Phase 1 starts, in which the 2 hosts (using IPsec) authenticate themselves
to each other to start a secure channel. It has 2 modes: the Main mode, which provides
greater security, and the Aggressive mode, which enables the hosts to establish an
IPsec circuit more quickly.
3. The channel created in the last step is then used to securely negotiate the way the IP
circuit will encrypt data across the IP circuit.
4. Now IKE Phase 2 is conducted over the secure channel, in which the two hosts
negotiate the type of cryptographic algorithms to use for the session and agree on
the secret keying material to be used with those algorithms.
5. Then the data is exchanged across the newly created IPsec encrypted tunnel. These
packets are encrypted and decrypted by the hosts using IPsec SAs.
6. When the communication between the hosts is completed or the session times out,
the IPsec tunnel is terminated by both hosts discarding the keys.
IPSec Architecture
IPSec (IP Security) architecture uses two protocols to secure the traffic or data flow. These
protocols are ESP (Encapsulation Security Payload) and AH (Authentication Header). IPSec
architecture includes protocols, algorithms, DOI, and Key Management. All these components
are very important in order to provide the three main services: confidentiality, authentication,
and integrity.
1. Architecture:
Architecture or IP Security Architecture covers the general concepts, definitions, protocols,
algorithms and security requirements of IP Security technology.
2. ESP Protocol:
ESP(Encapsulation Security Payload) provide the confidentiality service. Encapsulation
Security Payload is implemented in either two ways:
ESP with optional Authentication.
ESP with Authentication.
Packet Format:
3. Encryption Algorithm:
Encryption Algorithm is the document that describes the various encryption algorithms used
for Encapsulation Security Payload.
4. AH Protocol:
Authentication Header covers the packet format and general issues related to the use of AH for
packet authentication and integrity.
5. Authentication Algorithm:
Authentication Algorithm contains the set of the documents that describe authentication
algorithm used for AH and for the authentication option of ESP.
6. DOI (Domain of Interpretation):
DOI is the identifier which supports both the AH and ESP protocols. It contains values needed
for documentation related to each other.
7. Key Management:
Key Management contains the document that describes how the keys are exchanged between
sender and receiver.
Authentication Header
Next Header: The Next Header field points to the next protocol header that follows the AH header.
It can be an Encapsulating Security Payload (ESP) header, a TCP header or a UDP header
(depending on the network application).
Payload Length: specifies the length of AH in 32-bit words (4-byte units), minus 2.
Security Parameter Index (SPI): The Security Parameter Index (SPI) field contains the Security
Parameter Index, which is used to identify the security association used to authenticate this packet.
Sequence Number: Sequence Number field is the number of messages sent from the sender to
the receiver using the current SA. The initial value of the counter is 1. The function of this field
is to enable replay protection, if required.
Authentication Data: The Authentication Data field contains the result of the Integrity Check
Value calculation, that can be used by the receiver to check the authentication and integrity of
the packet. This field is padded so that the total length of the AH is an exact number of 32-bit
words. RFC 2402 requires that all AH implementations support at least HMAC-MD5-96 and
HMAC-SHA1-96.
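As an illustration of the truncated-HMAC idea (not an actual AH implementation), HMAC-SHA1-96
simply keeps the first 96 bits (12 bytes) of the HMAC-SHA1 output; the key and packet bytes below
are placeholders.

    import hmac, hashlib

    key = b"0123456789abcdef0123"                            # hypothetical authentication key
    packet = b"<immutable IP header fields + AH + payload>"  # placeholder input

    icv = hmac.new(key, packet, hashlib.sha1).digest()[:12]  # truncate 160 bits to 96
    print(icv.hex())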
Encapsulating Security Protocol (ESP)
Messages, documents, and files sent via the internet are transmitted in the form of data
packets using one or more transfer mechanisms or protocols such as TCP/IP. But how can we
ensure that the information received is the authentic material which the originator of the
message claims to have sent? That its confidentiality has been preserved along the way? And
that it retains the integrity and authenticity of the source material?
One way to ensure all that is through Encapsulating Security Payload or ESP.
Encapsulating Security Payload (or ESP) is an IPSec security protocol designed to function with
both the IPv4 and IPv6 protocols. It takes the form of a header inserted after the
Internet Protocol or IP header, before an upper layer protocol like TCP, UDP, or ICMP, and
before any other IPSec headers that have already been put in place.
ESP gives protection to upper layer protocols, with a Signed area indicating where a protected
data packet has been signed for integrity, and an Encrypted area which indicates the
information that is protected with confidentiality. Unless a data packet is being tunneled, ESP
protects only the IP data payload (hence the name), and not the IP header.
ESP may be used to ensure confidentiality, the authentication of data origins, connectionless
integrity, some degree of traffic-level confidentiality, and an anti-replay service (a form of partial
sequence integrity which guards against the use of commands or credentials which have been
captured through password sniffing or similar attacks).
The set of services provided by ESP depends on the options selected when a Security
Association (or SA) was established, and also on the location of the service's deployment within
the network configuration.
In practice, the ESP header is placed after the IP header and before the next layer protocol
header when used in transport mode (see below), or before an encapsulated IP header in tunnel
mode. The ESP header itself consists of two parts: a Security Parameters Index, and a
sequence number.
Sequence Number
The 32-bit sequence number is also mandatory, and always present – even if a receiver elects not
to enable the anti-replay service for a given Security Association. The sender is obliged to
always transmit this field of the ESP header, whose processing (or not) is left to the discretion
of the recipient.
When an SA is set up, the counters at both the sender's and receiver's end are initialized to
zero. The first packet sent using a given SA will have a value of 1. Intended as a mechanism for
anti-replay protection, the sequence number for subsequent transmissions increases in single
steps from 1, and is never allowed to cycle.
The receiver checks this field to verify that a packet for a Security Association bearing this
number has not been received already. The packet is rejected if one has been received.
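A toy version of that check (RFC 4303 actually specifies a sliding window over the highest sequence
numbers received, which this simplification ignores) might look like:

    # Simplified anti-replay check: reject any sequence number already seen on this SA.
    seen = set()

    def accept(seq: int) -> bool:
        if seq in seen:
            return False       # duplicate: treat as a replayed packet
        seen.add(seq)
        return True

    print(accept(1), accept(2), accept(1))   # True True False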
The Payload
This is a variable-length data field containing the information described by the Next header
field. This field is mandatory, and its length must be an integral number of bytes. Any algorithm
requiring explicit, per-packet synchronization data to be used in encrypting the payload must
indicate the payload data length, any structure for such data, and the location of this
information as part of an RFC specification on how to use the algorithm with ESP.
Padding
To ensure that the ciphertext resulting from data packet encryption terminates on a 4 byte
boundary (and regardless of any other requirements laid down by the encryption algorithm or
block cipher), some padding in the 0 to 255 bytes range is used for 32-bit alignment. The 4
byte boundary condition is necessary to ensure the correct positioning of the Authentication
data field, if present.
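A small sketch of that alignment rule: the payload, plus padding, plus the two trailer bytes (Pad
Length and Next Header) must end on a 4-byte boundary.

    def esp_pad_len(payload_len: int, block: int = 4) -> int:
        # Bytes of padding needed so payload + padding + 2 trailer bytes is a multiple of 'block'.
        return (-(payload_len + 2)) % block

    print(esp_pad_len(13))   # 1 -> 13 + 1 + 2 = 16 bytes, a multiple of 4
    print(esp_pad_len(14))   # 0 -> 14 + 0 + 2 = 16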
Padding Length
This is an 8-bit field which specifies the length of the Padding field in bytes. The Padding
Length is used by the data recipient to determine how many padding bytes to discard from the
received packet.
Next Header
This field indicates the nature of the payload (e.g., TCP or UDP). The Next Header is an IPv4 or
IPv6 protocol number describing the format of the Payload data field.
Transport mode
When used in transport mode, the ESP header follows the IP header of the original IP
datagram. If the datagram carries an IPSec header, then the ESP header goes before this. The
ESP trailer and optional authentication data are inserted after the payload.
Transport mode doesn't authenticate or encrypt the IP header, which can potentially expose the
addressing information to attackers while the packet is in transit. But although it doesn't
provide as much security protection as tunnel mode, hosts typically use ESP in transport
mode, as this requires less processing power.
Tunnel mode
Encapsulation or protective coverage occurs more extensively in tunnel mode, which creates
and uses a new IP header as the outermost IP header of a datagram. This is followed by the
ESP header, then the original datagram (which includes both the IP header and the original
payload). As in transport mode, the ESP trailer and optional authentication data are appended
to the payload.
In tunnel mode, ESP completely protects the original datagram, which now forms the payload
data for the newly formed ESP data packet. Again, though, ESP does not protect the new IP
header. Gateways are required to use ESP in tunnel mode.
The combined use of encryption and authentication under ESP reduces processor overhead,
and reduces a system's vulnerability to denial-of-service (DoS) attacks.
Security Associations.
Once the services are selected and the algorithms chosen to implement those services, the two
peers must exchange or implement session keys required by the algorithms. Is this beginning
to sound complicated? How can you keep track of all these choices and decisions? The security
association is the mechanism IPSec uses to manage these decisions and choices for each IPSec
communication session. A basic component of configuring IPSec services on a client, router,
firewall, or VPN concentrator is defining SA parameters.
Description                                               Example
Authentication method used                                MD5
Encryption and hash algorithm                             3DES
DH group used                                             2
Lifetime of the IKE SA in seconds or kilobytes            86,400
Shared secret key values for the encryption algorithms    Preshared
At the IPSec level, SAs are unidirectional—one for each direction. A separate IPSec SA is
established for each direction of a communication session. Each IPSec peer is configured with
one or more SAs, defining the security policy parameters to use during an IPSec session. To
establish an IPSec session, peer 1 sends peer 2 a policy. If peer 2 can accept this policy, it
sends the policy back to peer 1. This establishes the two one-way SAs between the peers.
Key management is an important aspect of IPSec or any encrypted communication that
uses keys to provide information confidentiality and integrity. Key management and the
protocols utilized are implemented to set up, maintain, and control secure relationships and
ultimately the VPN between systems.