Encryption Notes

The document discusses secret-key encryption techniques including common ciphers, attack models, and encryption modes. It also covers one-way hash functions, their properties, and applications including integrity verification, password verification, and message authentication codes.

Uploaded by

simon sylvester

Secret-key Encryption

 Common ciphers
o Monoalphabetic substitution cipher (can be broken w/ frequency analysis)
o Polyalphabetic substitution cipher
o DES (key size=56 bits, block size=64 bits), AES (key size=128, 192, or 256 bits, block
size=128 bits)
 Attack models
o Ciphertext-only attack: the attacker knows only the ciphertext. If this does not lead to
leakage of further information, the encryption is considered secure under this model.
o Known-plaintext attack: the attacker knows some plaintext together with its ciphertext. If
this does not lead to leakage of further information, the encryption is considered secure
under this model.
o Chosen-plaintext attack: the attacker can choose a specific plaintext and obtain its
corresponding ciphertext. If this does not lead to leakage of further information,
the encryption is considered secure under this model.
 Encryption modes
o Electronic Codebook Mode (ECB)
The problem with this encryption mode is the lack of diffusion: identical plaintext blocks
always produce identical ciphertext blocks, which means the structure of the plaintext can
be leaked.
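The leak can be reproduced with a small sketch. Python's standard library has no DES or AES, so the block cipher below is a hypothetical 4-round Feistel network built from SHA-256; it is a toy stand-in and not secure. The 16-byte block size and the names toy_encrypt_block and ecb_encrypt are assumptions made for this illustration.

```python
import hashlib

BLOCK = 16  # toy block size in bytes (assumption; same width as an AES block)

def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Round function: truncated SHA-256 of key, round index and half block.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """4-round Feistel network: a toy stand-in for DES/AES, NOT secure."""
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        mixed = bytes(a ^ b for a, b in zip(left, _round(key, i, right)))
        left, right = right, mixed
    return left + right

def ecb_encrypt(key: bytes, plaintext: bytes) -> list:
    # ECB: every block is encrypted independently with the same key.
    assert len(plaintext) % BLOCK == 0
    return [toy_encrypt_block(key, plaintext[i:i + BLOCK])
            for i in range(0, len(plaintext), BLOCK)]

key = b"demo key"
plaintext = b"ATTACK AT DAWN!!" * 2 + b"RETREAT AT DUSK!"
blocks = ecb_encrypt(key, plaintext)

print(blocks[0] == blocks[1])  # repeated plaintext block, repeated ciphertext block
print(blocks[0] == blocks[2])  # different plaintext block, different ciphertext block
```

An eavesdropper who sees only the ciphertext blocks can already tell that the first two plaintext blocks are equal.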

o Cipher Block Chaining (CBC)


In this mode, decryption can be done in parallel if all ciphertext blocks are
available. However, encryption cannot be done in parallel because the output of a
ciphertext block depends on the previous one.
The initialization vector (IV) ensures that encrypting the same plaintext twice does not
produce the same ciphertext. The IV is not considered a secret: it can be released without damage.
 Question: Suppose that during transmission the fifth bit of the second
ciphertext block is corrupted. How much data will we lose? Answer:
the entire second plaintext block is garbled and the fifth bit of the third
plaintext block is flipped; all other blocks decrypt correctly.
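The answer can be checked with a small sketch. The block cipher is again a hypothetical toy Feistel network built from SHA-256 (not secure), and "fifth bit" is assumed to mean the fifth bit counted from the most significant bit of the first byte.

```python
import hashlib

BLOCK = 16  # toy block size in bytes (assumption)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]

def enc_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel network, a stand-in for a real block cipher.
    l, r = block[:8], block[8:]
    for i in range(4):
        l, r = r, xor(l, _round(key, i, r))
    return l + r

def dec_block(key: bytes, block: bytes) -> bytes:
    # Inverse of enc_block: undo the rounds in reverse order.
    l, r = block[:8], block[8:]
    for i in reversed(range(4)):
        l, r = xor(r, _round(key, i, l)), l
    return l + r

def cbc_encrypt(key, iv, blocks):
    out, prev = [], iv
    for p in blocks:
        prev = enc_block(key, xor(p, prev))  # C_i = E(P_i xor C_{i-1})
        out.append(prev)
    return out

def cbc_decrypt(key, iv, blocks):
    out, prev = [], iv
    for c in blocks:
        out.append(xor(dec_block(key, c), prev))  # P_i = D(C_i) xor C_{i-1}
        prev = c
    return out

key, iv = b"demo key", bytes(range(16))
plain = [b"block number 01.", b"block number 02.",
         b"block number 03.", b"block number 04."]
cipher = cbc_encrypt(key, iv, plain)

# Corrupt the fifth bit (mask 0x08 of byte 0) of the second ciphertext block.
corrupted = list(cipher)
corrupted[1] = bytes([cipher[1][0] ^ 0x08]) + cipher[1][1:]
result = cbc_decrypt(key, iv, corrupted)

for n, (p, q) in enumerate(zip(plain, result), start=1):
    diff_bits = sum(bin(a ^ b).count("1") for a, b in zip(p, q))
    print(f"block {n}: {diff_bits} bit(s) differ")
```

Blocks 1 and 4 are unaffected, block 2 is scrambled, and block 3 differs from the original in exactly the corrupted bit position.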
o Cipher Feedback (CFB)
This is similar to CBC. One important property of CFB is that we have turned a
block cipher into a stream cipher, that is, we can do encryption and transmission
bit by bit.
o Output Feedback (OFB)
This mode is similar to CFB, except that the block cipher output is fed into the next block
before the XOR with the plaintext. The output stream therefore does not depend on the
plaintext and can be computed in advance, without waiting for the plaintext. OFB also has
the stream cipher property. It essentially uses a block cipher as a pseudorandom number generator.
o Counter Mode (CTR)
Counter mode works with a nonce and a block counter. Since each block depends only on the
nonce and its own counter value, both encryption and decryption can be done in parallel.
Like the IV, the nonce does not necessarily need to be kept secret.
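A minimal sketch of counter mode follows. CTR only uses the cipher in the forward direction, so HMAC-SHA256 is used here as a stand-in for the block cipher E_K; that substitution, the 8-byte counter encoding, and the names keystream_block and ctr_xor are assumptions for illustration.

```python
import hashlib
import hmac

STREAM_BLOCK = 32  # bytes produced per counter value (SHA-256 output size)

def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    # Stand-in for E_K(nonce || counter): HMAC-SHA256 used as a keyed function.
    return hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()

def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts; the two operations are identical in CTR mode."""
    out = bytearray()
    for counter, start in enumerate(range(0, len(data), STREAM_BLOCK)):
        # Each keystream block depends only on (key, nonce, counter),
        # so the iterations could run in parallel.
        block = keystream_block(key, nonce, counter)
        chunk = data[start:start + STREAM_BLOCK]
        out += bytes(a ^ b for a, b in zip(chunk, block))
    return bytes(out)

key, nonce = b"demo key", b"unique-nonce"
message = b"Counter mode needs no padding, whatever the message length is."
ciphertext = ctr_xor(key, nonce, message)

print(len(ciphertext) == len(message))            # no padding was added
print(ctr_xor(key, nonce, ciphertext) == message)
```

The last, partial chunk is simply XORed with a truncated keystream block, which is why no padding is required.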
 Modes that do not require padding: CFB, OFB, CTR (effectively stream ciphers)
 Initialization Vectors and common mistakes
IVs are not necessarily secrets. However, this does not mean that we can select them at
will. If we do not follow certain rules, this will result in severe security flaws. We discuss
the following scenarios provided that the encryption key stays the same.
o Common mistake: using the same IV
A basic requirement for IV is uniqueness, which means no IV should be reused
under the same key. For some cipher modes, reusing an IV can be catastrophic. In
OFB, a static IV makes the encryption scheme vulnerable to a known-plaintext attack. If
the attacker knows both the plaintext and the ciphertext of one message, he can decrypt
subsequent ciphertexts encrypted with the same IV (up to the length of the known
plaintext). That is because the output stream of OFB is always the same if the key and IV
are identical: XORing the known plaintext with its ciphertext reveals that stream.
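The attack on a reused IV can be sketched as follows. OFB only uses the cipher in the forward direction, so HMAC-SHA256 stands in for the block cipher here; the keys, messages and function names are made up for this illustration.

```python
import hashlib
import hmac

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ofb_keystream(key: bytes, iv: bytes, length: int) -> bytes:
    # OFB: O_0 = IV, O_i = E_K(O_{i-1}); HMAC-SHA256 stands in for E_K.
    stream, state = b"", iv
    while len(stream) < length:
        state = hmac.new(key, state, hashlib.sha256).digest()
        stream += state
    return stream[:length]

def ofb_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    return xor(plaintext, ofb_keystream(key, iv, len(plaintext)))

# Victim: encrypts two messages with the same key AND the same IV.
key, iv = b"victim key", b"static IV"
known_plain = b"Order #1: transfer $100 to Alice"
secret_plain = b"Order #2: transfer $999 to Bob"
c1 = ofb_encrypt(key, iv, known_plain)
c2 = ofb_encrypt(key, iv, secret_plain)

# Attacker: knows known_plain, c1 and c2, but not the key.
recovered_stream = xor(known_plain, c1)   # plaintext XOR ciphertext = output stream
recovered = xor(c2, recovered_stream)
print(recovered)
```

The attacker never touches the key; the reused output stream alone is enough to decrypt the second message.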
o Common mistake: using predictable IV
If the IV is predictable, it creates a security flaw in some encryption modes, e.g.
CBC. Using a predictable IV makes CBC susceptible to a chosen-plaintext attack.
There are three assumptions under this scenario:
 The IV used for next message is predictable.
 CBC is used.
 The victim will encrypt any plaintext the attacker provides.
Then the attacker can test a guess about a message the victim sent earlier. He XORs his
guessed plaintext with the IV used for that earlier message and with the predicted next
IV, and asks the victim to encrypt the result. If the resulting ciphertext block equals
the earlier ciphertext block, the guess is correct.
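A minimal sketch of this attack, under the three assumptions listed: the victim below uses a simple counter as its IV (our choice of a predictable IV), messages are a single block long, and the block cipher is a hypothetical toy Feistel network built from SHA-256 (not secure).

```python
import hashlib

BLOCK = 16  # toy block size in bytes (assumption)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def enc_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel network, a stand-in for a real block cipher.
    l, r = block[:8], block[8:]
    for i in range(4):
        f = hashlib.sha256(key + bytes([i]) + r).digest()[:8]
        l, r = r, xor(l, f)
    return l + r

class Victim:
    """Encrypts single-block messages in CBC mode with a counter as IV."""
    def __init__(self, key: bytes):
        self._key, self._counter = key, 0

    def next_iv(self) -> bytes:
        # Predictable: anyone who saw earlier IVs can compute this value.
        return self._counter.to_bytes(BLOCK, "big")

    def encrypt(self, block: bytes):
        iv = self.next_iv()
        self._counter += 1
        return iv, enc_block(self._key, xor(block, iv))  # C = E(P xor IV)

victim = Victim(b"victim key")
iv_prev, c_prev = victim.encrypt(b"Yes".ljust(BLOCK))  # secret message, observed on the wire

def guess_is_correct(guess: bytes) -> bool:
    iv_next = victim.next_iv()                    # attacker predicts the next IV
    crafted = xor(xor(guess, iv_prev), iv_next)   # guess XOR previous IV XOR next IV
    _, c = victim.encrypt(crafted)                # chosen-plaintext query
    return c == c_prev

print(guess_is_correct(b"No".ljust(BLOCK)))
print(guess_is_correct(b"Yes".ljust(BLOCK)))
```

The victim XORs the crafted block with the next IV, which cancels out and leaves guess XOR previous IV as the cipher input, exactly the input that produced the earlier ciphertext when the guess is right.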
One-way Hash Function
 What hash functions do: they generate a fixed length digest for a message of arbitrary
length.
 Properties of a cryptographic hash function
Denote the hash function by h.
 One-way: Given a hash value v, it should be difficult to find a message m such that h(m)=v.
 Collision resistant: It should be difficult to find two different messages m1 and m2 such that
h(m1)=h(m2).
 Case study: the number game
A and B both come up with a number. If the sum is even, then A wins; otherwise B wins.
 The dilemma: anyone that releases his number first loses.
 Resolving the dilemma with a hash function
1. A chooses his number and sends the hash value of his number to B.
2. Once B has received A’s hash value, B discloses his number to A.
3. After receiving B’s number, A reveals his number. B can verify that A indeed chose
this number by comparing its hash with the hash value received in step 1.
Why this is fair to both sides:
1. It is fair for A because of the one-way property: given the hash value, it is
difficult for B to find out which number A chose.
2. It is fair for B because of the collision-resistant property: it is difficult for A to
find multiple values that hash to the same value, so the value revealed in the third
step is indeed the value A chose.
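The three steps can be sketched with SHA-256. One detail is our own addition and not part of the notes above: A hashes his number together with a random nonce, because a small number on its own could be found by simply hashing every candidate. The function names and the example numbers are hypothetical.

```python
import hashlib
import secrets

def commit(number: int):
    """Return (commitment, nonce). The random nonce is an added assumption."""
    nonce = secrets.token_bytes(16)
    digest = hashlib.sha256(nonce + str(number).encode()).digest()
    return digest, nonce

def verify(commitment: bytes, nonce: bytes, number: int) -> bool:
    return hashlib.sha256(nonce + str(number).encode()).digest() == commitment

# Step 1: A chooses his number and sends only the hash value to B.
a_number = 7
commitment, a_nonce = commit(a_number)

# Step 2: B, holding the commitment, discloses his number.
b_number = 4

# Step 3: A reveals his number (and nonce); B checks it against the commitment.
honest = verify(commitment, a_nonce, a_number)
cheating = verify(commitment, a_nonce, a_number + 1)  # A tries to switch numbers
winner = "A" if (a_number + b_number) % 2 == 0 else "B"
print(honest, cheating, winner)
```

Switching to a different number in step 3 fails the check, which is the collision-resistance argument in action.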
 Common hash functions
 The Message Digest (MD) series: MD2, MD4, MD5
MD2 and MD4 are severely flawed and should not be used. The collision resistance of MD5
is broken, although no practical attack on its one-way property is known.
 The Secure Hash Algorithm (SHA) series: SHA-1, SHA-2, SHA-3
 How hash function works
Many widely used hash functions (e.g. MD5, SHA-1, SHA-2) share a structure called the
Merkle–Damgård construction. Input data is broken into blocks of fixed size, with padding added
to the last block. Each block and the output of the previous iteration are fed into a compression
function; the first iteration uses a fixed value called the IV as one of its inputs.
Note that SHA-3 does not use this structure; it is based on a sponge construction.
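The construction can be sketched in a few lines. This is a toy model, not any real hash function: the compression function is borrowed from SHA-256, the IV is assumed to be all zeros, and the padding (a 0x80 byte, zeros, then the message length in bits) is modelled on the style used by MD5 and SHA-2.

```python
import hashlib

BLOCK = 64          # block size in bytes
IV = bytes(32)      # fixed initial chaining value (assumption: all zeros)

def compress(state: bytes, block: bytes) -> bytes:
    # Toy compression function: (32-byte state, 64-byte block) -> 32-byte state.
    return hashlib.sha256(state + block).digest()

def md_pad(length: int) -> bytes:
    # 0x80, then zeros, then the message length in bits as 8 bytes,
    # so that message + padding is a whole number of blocks.
    zeros = (-(length + 9)) % BLOCK
    return b"\x80" + bytes(zeros) + (length * 8).to_bytes(8, "big")

def toy_md_hash(message: bytes) -> bytes:
    data = message + md_pad(len(message))
    state = IV                                       # first iteration uses the IV
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])   # chain block by block
    return state                                     # final state is the digest

print(toy_md_hash(b"hello").hex())
print(len(b"x" * 100 + md_pad(100)) // BLOCK, "blocks for a 100-byte message")
```

The digest is simply the last chaining value, a detail that matters for the length extension attack discussed later in these notes.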
 Applications of hash functions
 Integrity verification
If we change a bit in the message, its hash value would be completely different.
Therefore, we can use the hash value to determine if a document/file has been modified
or not.
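A quick way to see this effect, using SHA-256 from Python's hashlib (the message is made up):

```python
import hashlib

original = b"Pay Bob 100 dollars"
tampered = bytearray(original)
tampered[0] ^= 0x01            # flip a single bit of the first byte

h1 = hashlib.sha256(original).digest()
h2 = hashlib.sha256(bytes(tampered)).digest()
changed_bits = sum(bin(a ^ b).count("1") for a, b in zip(h1, h2))

print(h1.hex())
print(h2.hex())
print(f"{changed_bits} of 256 digest bits differ")
```

For a good hash function, roughly half of the digest bits are expected to change.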
 Committing a secret without telling it
One can commit to a secret without revealing it: he simply hashes the secret and discloses
the hash value. The one-way property makes it almost impossible for others to recover the
secret from the hash value. The collision-resistant property makes it almost impossible to
change the secret afterwards without being noticed.
 Password verification
It’s unwise to save passwords as plaintext because every user will be compromised if the
password database is stolen. If we store the hashed value of password instead, due to the
one-way property, it is difficult for the attacker to get user’s password.
o The use of salt
If multiple users have the same password, their hash values will be the same. To
avoid this situation, we usually hash the password concatenated with a random
string called a salt. With distinct salts, the hash values differ even if
two users have identical passwords.
o On Linux, the stored password hash is obtained by hashing the password-salt mixture
5000 times (the default number of rounds). This slows down hashing by a factor of
about 5000, which effectively slows down brute-force attacks.
o None of the IV, nonce, and salt is necessarily confidential.
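A minimal sketch of salted, iterated password hashing. It uses PBKDF2 from Python's hashlib as a stand-in; this is not the scheme Linux itself uses, and the iteration count simply mirrors the figure quoted above. The function names and the example password are hypothetical.

```python
import hashlib
import hmac
import os

ITERATIONS = 5000  # mirrors the figure in the notes; an assumption for this sketch

def hash_password(password: str, salt: bytes = b""):
    # A fresh random salt is generated unless one is supplied (for verification).
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest

def check_password(password: str, salt: bytes, expected: bytes) -> bool:
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)

# Two users pick the same password but get different salts.
salt_a, hash_a = hash_password("hunter2")
salt_b, hash_b = hash_password("hunter2")

print(hash_a != hash_b)                             # stored values differ
print(check_password("hunter2", salt_a, hash_a))
print(check_password("letmein", salt_a, hash_a))
```

The salt is stored next to the hash in the clear; it only needs to be unique, not secret.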
 Trusted timestamping
Sometimes we would like to prove we have the copyright of a digital document without
publishing it. To do so, we can use a service called trusted timestamping. Basically,
instead of publishing the entire digital content, one only publishes the hashed value of the
content. He publishes the hash value in printed media or submits it to a Time Stamping
Authority (TSA). The TSA signs the hash, together with a timestamp, using its private key
to certify that the hash existed at that time.
 Message Authentication Code (MAC)
A MAC is used to detect whether a message has been modified during
transmission. We can use one-way hash functions to implement a MAC. Obviously, we
cannot just use the plain hash of a message as the MAC, because anyone can compute it
and therefore forge the MAC. We need to mix a secret key with the message before
hashing. As it turns out, whether the key is placed before or after the message
significantly affects the security of the resulting MAC.
 Length extension attack
Denote the secret key by K and the message by M. If one computes the MAC as Hash(K∥M), it
leads to a security loophole. Computing Hash(M∥K) instead avoids this particular attack,
but the recommended approach is HMAC, described below.
Recall the Merkle–Damgård construction. If we compute the MAC as Hash(K∥M), it is possible to
extend M and generate a correct MAC for the extended message without knowing the key! More
concretely, the attacker needs to know the padding P that was appended to K∥M (it depends only
on the length of K∥M); then, for any message T, he can obtain the MAC Hash(K∥M∥P∥T) from
Hash(K∥M) alone. This is because the Merkle–Damgård construction breaks the message into blocks
and chains the compression function over them, and the published MAC is exactly the chaining
value after K∥M∥P. All the attacker needs to do is continue the chain with T, as if the hash
computation had never stopped.

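The attack can be demonstrated on a toy Merkle–Damgård hash. Everything in this sketch is an assumption made for illustration: the compression function is borrowed from SHA-256, the IV is all zeros, and the padding follows the MD5/SHA-2 style. The attacker is assumed to know the key length but not the key.

```python
import hashlib

BLOCK, IV = 64, bytes(32)

def compress(state: bytes, block: bytes) -> bytes:
    return hashlib.sha256(state + block).digest()   # toy compression function

def md_pad(length: int) -> bytes:
    return b"\x80" + bytes((-(length + 9)) % BLOCK) + (length * 8).to_bytes(8, "big")

def toy_md_hash(message: bytes, state: bytes = IV, prior_len: int = 0) -> bytes:
    # prior_len: number of bytes already absorbed into `state` (0 for a fresh hash).
    data = message + md_pad(prior_len + len(message))
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def mac(key: bytes, message: bytes) -> bytes:
    return toy_md_hash(key + message)               # the insecure Hash(K || M)

# Victim side.
key = b"secret-key-16byt"
message = b"amount=100&to=alice"
tag = mac(key, message)

# Attacker side: knows message, tag and the key length, but not the key.
key_len = len(key)
padding = md_pad(key_len + len(message))            # P
extension = b"&to=mallory"                          # T
absorbed = key_len + len(message) + len(padding)    # a multiple of BLOCK
forged_tag = toy_md_hash(extension, state=tag, prior_len=absorbed)
forged_message = message + padding + extension      # M || P || T

# Victim side: the forged pair passes verification.
print(mac(key, forged_message) == forged_tag)
```

The attacker resumes the chain from the published tag, so the key is never needed.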
 Keyed-Hash MAC Algorithm (HMAC)
It is really important to avoid reinventing the wheel in cryptography, since the tiniest error can
lead to severe security flaws. Standard algorithms and well-maintained libraries have been
carefully designed and reviewed with security in mind. There is a well-known standard algorithm
for generating a MAC from a key and a message: we should simply call HMAC(K,M).
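In Python, for instance, HMAC is available in the standard library's hmac module. The key and message below are made up; SHA-256 is our choice of underlying hash.

```python
import hashlib
import hmac

key = b"shared secret key"
message = b"amount=100&to=alice"

# Sender: compute HMAC(K, M) with SHA-256 as the underlying hash.
tag = hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, tag)

print(tag)
print(verify(key, message, tag))
print(verify(key, message + b"&to=mallory", tag))
```

A modified message no longer matches the tag, and without the key an attacker cannot compute a new one.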
 Hash Collision Attacks
 Forging fake public-key certificates
Suppose an attacker can find two certificates that share the same hash value but have
different common names (CN). For example, the first one’s CN is example.com, and the
second one’s CN is the attacker’s own attacker32.com. He can then have the CA sign the
second version. Because the CA’s signature is computed over the hash, the same signature is
also valid for the first version, so he effectively holds a valid certificate for example.com.
This idea can be extended to forging fake signed programs, PDF documents and so on.
 Generating two different files with the same MD5 hash
The tool developed by Marc Stevens can generate two files that share the same MD5
value. The two files have the same prefix. For example, messages one and two are shown
below.
$ cat message1.bin | xxd
00000000: 4dc9 68ff 0ee3 5c20 9572 d477 7b72 1587 M.h...\ .r.w{r..
00000010: d36f a7b2 1bdc 56b7 4a3d c078 3e7b 9518 .o....V.J=.x>{..
00000020: afbf a200 a828 4bf3 6e8e 4b55 b35f 4275 .....(K.n.KU._Bu
00000030: 93d8 4967 6da0 d155 5d83 60fb 5f07 fea2 ..Igm..U].`._...

$ cat message2.bin | xxd
00000000: 4dc9 68ff 0ee3 5c20 9572 d477 7b72 1587 M.h...\ .r.w{r..
00000010: d36f a7b2 1bdc 56b7 4a3d c078 3e7b 9518 .o....V.J=.x>{..
00000020: afbf a202 a828 4bf3 6e8e 4b55 b35f 4275 .....(K.n.KU._Bu
00000030: 93d8 4967 6da0 d1d5 5d83 60fb 5f07 fea2 ..Igm...].`._...
The MD5 and SHA-1 sums of these two messages are shown below.
$ md5sum message1.bin message2.bin
008ee33a9d58b51cfeb425b0959121c9 message1.bin
008ee33a9d58b51cfeb425b0959121c9 message2.bin

$ sha1sum message1.bin message2.bin
c6b384c4968b28812b676b49d40c09f8af4ed4cc message1.bin
c728d8d93091e9c7b87b43d9e33829379231d7ca message2.bin
If the hash function happens to use the Merkle–Damgård construction (as MD5 does), then
appending a common suffix to both message1.bin and message2.bin keeps the two hash values
identical. This works for the same reason as the length extension technique: after the
colliding blocks, both computations continue from the same chaining value.
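This property can be checked on a toy Merkle–Damgård hash. Finding real MD5 collisions requires a dedicated tool, so the sketch below uses a deliberately weak hash with a 16-bit chaining value, where a brute-force search finds a collision quickly. The hash, its block size and all names are assumptions made for illustration.

```python
import hashlib
from itertools import count

BLOCK = 8          # toy block size in bytes
IV = bytes(2)      # toy 16-bit initial chaining value

def compress(state: bytes, block: bytes) -> bytes:
    # Deliberately tiny 16-bit chaining value so that collisions are cheap to find.
    return hashlib.sha256(state + block).digest()[:2]

def toy_md_hash(message: bytes) -> bytes:
    length = len(message)
    padding = b"\x80" + bytes((-(length + 3)) % BLOCK) + length.to_bytes(2, "big")
    data = message + padding
    state = IV
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def find_collision():
    # Search one-block messages for two with the same chaining value.
    # Terminates by the pigeonhole principle within 2**16 + 1 attempts.
    seen = {}
    for i in count():
        block = i.to_bytes(BLOCK, "big")
        state = compress(IV, block)
        if state in seen:
            return seen[state], block
        seen[state] = block

m1, m2 = find_collision()
suffix = b"any common suffix"

print(m1 != m2, toy_md_hash(m1) == toy_md_hash(m2))
print(toy_md_hash(m1 + suffix) == toy_md_hash(m2 + suffix))
```

Both messages are exactly one block long, so after that block the two computations are in the same state and any common suffix is processed identically.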
 Generating two programs with the same MD5 hash
We can use the same idea above to generate two programs with the same MD5 hash.
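Suppose the program is given as follows. Assume the xyz array is filled with 200 'A's.
#include <stdio.h>

/* Placeholder initializer: in the experiment the array holds the actual content. */
unsigned char xyz[200] = {"..."};

int main(void)
{
    int i;
    /* Print every byte of the array in hexadecimal. */
    for (i = 0; i < 200; ++i) {
        printf("%x ", xyz[i]);
    }
    printf("\n");
    return 0;
}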
Now, we can locate the xyz array inside the program’s binary and divide the
program into three parts:
1. The prefix (whose length must be a multiple of 64 bytes)
2. The center (whose length must be 128 bytes)
3. The suffix
The center must lie completely inside array xyz, since it will be filled with arbitrary
content and this must not affect the program’s control logic. We run the MD5 collision
generator so that the two generated messages share the same prefix and differ only in
the 128-byte center. As a result, we obtain two versions of the start of this program:
o Version 1: prefix+Q
o Version 2: prefix+P
where P and Q are different, but both versions have the same hash value. The next step
is to use the length extension technique to append the suffix to both versions.
As a result, we have created two programs:
o Program 1: prefix+Q+suffix
o Program 2: prefix+P+suffix
which have identical hash values but different data stored in the xyz array.
To alter the control logic of program 1 and program 2 in the attacker’s favor, one can
check if xyz is still filled with all A’s in later code sections. If xyz is not all A’s, then the
program can start to execute some malicious code. That is to say, we can create two
programs that have the same hash value, but one is benign and the other one is malicious.
