Lecture 15
Goals:
• The birthday paradox and the birthday attack
• Structure of cryptographically secure hash functions
• SHA series of hash functions
• Compact Python and Perl implementations for SHA-1 using
BitVector [Although SHA-1 is now considered to be fully broken (see Section 15.7.1), programming
it is still a good exercise if you are learning how to code Merkle type hash functions.]
(As to what is meant by an “associative array”, think of a telephone directory that consists of
<name,number> pairs.) Those types of hash functions also play a central role in many modern
big-data processing algorithms. For example, in the MapReduce framework used in Hadoop, a hash
function is applied to the “keys” related to the Map tasks in order to determine their bucket
addresses, with each bucket constituting a Reduce task. In this lecture, the notion of a hash function
Computer and Network Security by Avi Kak Lecture 15
Message: "The quick brown fox jumps over the lazy dog"
SHA1 hashcode: 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12
Message: "The quick brown fox jumps over the lazy dog"
SHA1 hashcode: 8de49570b9d941fb26045fa1f5595005eb5f3cf2
• The two hashcodes (or, message digests, if you would rather call
them that) shown above were produced by the following
interactive session with Python:
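A session of the following sort reproduces the first of the two hashcodes. This is only a minimal sketch that assumes Python 3 and its standard hashlib module; it is not necessarily the session that produced the listing above. The second string in the sketch is a hypothetical one-letter variation (it is not one of the two messages listed above) that is included only to show how drastically the hashcode changes with a small change in the message.

```python
import hashlib

message = "The quick brown fox jumps over the lazy dog"
digest = hashlib.sha1(message.encode("ascii")).hexdigest()
print(digest)          # 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12

# Hypothetical variation of the message, used only for illustration:
variant = message.replace("dog", "cog")
variant_digest = hashlib.sha1(variant.encode("ascii")).hexdigest()
print(variant_digest)

# Count the hex positions at which the two 40-character hashcodes differ:
differing = sum(x != y for x, y in zip(digest, variant_digest))
print("hex digits that differ:", differing)
```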
for their libraries. Confidentiality is not so critical here, but you want to make sure that the
library you are downloading is authentic by matching its hashcode with the one published by
the developer. ]
[Figure: three ways of using a hashcode for message authentication between Party A and Party B. In each panel, A calculates the hash of the message and B recalculates it and compares. (a) The message and its hash are protected with the shared key K held by both parties. (b) Only the hash is encrypted with the shared key K; the encrypted hash is concatenated with the message. (c) The encrypted hash is concatenated with the message, and B recovers the hash with A's public key.]
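[Figure: three more ways of using a hashcode for message authentication between Party A and Party B. (a) The hash is verified with A's public key, and the two parties additionally use the shared key K. (b) Each party concatenates the message with a shared secret before calculating the hash; only the hash is compared. (c) A variation on (b) that also uses concatenation, hash calculation, and comparison.]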
hold true trivially. However, note that a hash function must possess this property regardless of the
length of the messages. In other words, it should be just as difficult to recover from its hashcode a
message that is as short as, say, a single byte as a message that consists of millions of bytes.]
• Hash functions that are not collision resistant can fall prey to
a birthday attack. More on that later.
hashcode value. ]
$$\Delta(M) = X_1 \oplus X_2 \oplus \cdots \oplus X_m$$
$$Y_m = Y_1 \oplus Y_2 \oplus \cdots \oplus Y_{m-1} \oplus \Delta(M)$$
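The two formulas above can be tried out with a few lines of Python. The sketch below is a toy illustration and not code from these notes: the 4-byte block size, the zero-byte padding, and the names xor_hash and forge are all assumptions made for the example. The function forge picks the final block of a counterfeit message exactly as in the second formula, so that the counterfeit has the same XOR hash as the original.

```python
def xor_blocks(x: bytes, y: bytes) -> bytes:
    """Bitwise XOR of two equal-length blocks."""
    return bytes(p ^ q for p, q in zip(x, y))

def xor_hash(message: bytes, block_size: int = 4) -> bytes:
    """Delta(M): the XOR of all the blocks of the message.
    Zero-byte padding to a multiple of block_size is an assumption of this sketch."""
    if len(message) % block_size:
        message += bytes(block_size - len(message) % block_size)
    digest = bytes(block_size)
    for i in range(0, len(message), block_size):
        digest = xor_blocks(digest, message[i:i + block_size])
    return digest

def forge(target_digest: bytes, fake_blocks: bytes, block_size: int = 4) -> bytes:
    """Append one block Y_m = Y_1 ^ ... ^ Y_{m-1} ^ Delta(M) to the
    attacker-chosen blocks so that the result hashes to target_digest."""
    assert len(fake_blocks) % block_size == 0
    last_block = xor_blocks(xor_hash(fake_blocks, block_size), target_digest)
    return fake_blocks + last_block

original = b"PAY BOB 100 USD!"            # 16 bytes = four 4-byte blocks
counterfeit = forge(xor_hash(original), b"PAY EVE 999 USD!")
print(xor_hash(original).hex(), xor_hash(counterfeit).hex())
```

The counterfeit differs from the original in its content and carries one extra block, yet the two hashcodes printed on the last line are identical.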
• When you are hashing regular text and the character encoding
is based on ASCII (or its variants), the collision resistance
property of the XOR algorithm suffers even more because the
highest bit in every byte will be zero. Ideally, one would hope
that, with an N-bit hashcode, any particular message would
result in a given hashcode value with a probability of $\frac{1}{2^N}$. But
when the highest bit in each byte for each character is always 0,
some of the N bits in the hashcode will predictably be 0 with
Message = "abc"
length L = 24 bits
to answer our overall question stated in red on the previous page on account of the phrase “at least one”
in it. Also see the note in blue at the end of this section. ]
$$1 - \left(1 - \frac{1}{N}\right)^{k} \qquad\qquad (1)$$
$$\approx\; 1 - \left(1 - \frac{k}{N}\right) \;=\; \frac{k}{N} \qquad\qquad (2)$$
its hashcode equal to a particular value h is obviously 1/N . Now consider a pool of just 2 messages. Speaking
colloquially (that is, without worrying about violating the rules of logic), as you might over a glass of wine at
a late-night party, the event that this pool has at least one message whose hashcode is h is made up of the
event that the first of the two messages has its hashcode equal to h or the event that the second of the two
messages has its hashcode equal to h. Since the two events are disjunctive, the probability that a pool of two
messages has at least one message whose hashcode is h is a sum of the individual probabilities in the
disjunction, which gives us a probability of 2/N. Generalizing this argument to a pool of k messages, we get
for the desired probability a value of k/N that was shown in Equation (2). But this formula, if
considered as a precise formula for the probability we are
looking for, couldn’t possibly be correct. As you can see, this
formula gives us absurd values for the probability when k
exceeds N.
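A quick numerical comparison of Equations (1) and (2) makes the point concrete. The sketch below is only an illustration; the function names and the choice N = 365 are assumptions made for the example, and the exact expression is evaluated with log1p and expm1 merely to keep the floating-point arithmetic well behaved.

```python
import math

def prob_exact(k: int, N: int) -> float:
    """Equation (1): 1 - (1 - 1/N)^k, evaluated in a numerically stable way."""
    return -math.expm1(k * math.log1p(-1.0 / N))

def prob_approx(k: int, N: int) -> float:
    """Equation (2): the approximation k/N."""
    return k / N

N = 365
for k in (1, 10, 100, 365, 1000):
    print("k = %4d   exact = %.4f   k/N = %.4f" % (k, prob_exact(k, N), prob_approx(k, N)))
```

For small k the two columns agree closely. Once k exceeds N, the k/N column exceeds 1 while the exact value stays below 1, as any probability must.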
possible pairs from a group of 20 people. Since this number, 190, is rather comparable
to 365, the total number of different birthdays, the conclusion is not
surprising.]
The birthday paradox states that given a group of 23
or more randomly chosen people, the probability that at least
two of them will have the same birthday is more than 50%. And
if we randomly choose 60 or more people, this probability is
greater than 90%. (These statements are based on the more
precise formulas shown in this section.) [A man on the street would
certainly think that it would take many more than 60 people for any two of them to have the same
birthday with near certainty. That’s why we refer to this as a ‘paradox.’ Note, however, it is NOT a
$$1 - \frac{N!}{(N-k)!\,N^{k}} \qquad\qquad (3)$$
$$M_1 = N \times (N-1) \times \cdots \times (N-k+1) = \frac{N!}{(N-k)!} \qquad\qquad (4)$$
– Let’s now try to figure out the total number of ways, M2, in
which we can construct a pool of k messages without
worrying at all about duplicate hashcodes. Reasoning as
before, there are N ways to choose the first message. For
selecting the second message, we pay no attention to the
hashcode value of the first message. There are still N ways
to select the second message; and so on. Therefore, the total
number of ways we can construct a pool of k messages
without worrying about hashcode duplication is
$$M_2 = N \times N \times \cdots \times N = N^{k} \qquad\qquad (5)$$
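The ratio of the two counts in Equations (4) and (5) is the probability that a pool of k messages has no duplicate hashcodes, and subtracting that ratio from 1 gives the expression in Equation (3). The following sketch (the function name is an assumption made for this example) evaluates the expression as a running product so that no large factorials are needed, and checks it against the birthday numbers quoted earlier in this section with N = 365.

```python
def prob_duplicate(k: int, N: int) -> float:
    """1 - M1/M2 = 1 - N!/((N-k)! N^k), computed as a running product."""
    if k > N:
        return 1.0                      # more messages than hashcodes: a duplicate is certain
    prob_all_distinct = 1.0
    for i in range(k):
        prob_all_distinct *= (N - i) / N
    return 1.0 - prob_all_distinct

for k in (20, 22, 23, 60):
    print("k = %2d   probability of a shared birthday = %.4f" % (k, prob_duplicate(k, 365)))
```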
• Since $1 + 2 + 3 + \cdots + (k-1)$ is equal to $\frac{k(k-1)}{2}$, we can write
the following expression for the lower bound on the probability:
$$1 - e^{-\frac{k(k-1)}{2N}} \qquad\qquad (12)$$
• We will now use Equation (12) to estimate the size k of the pool
so that the pool contains at least one pair of messages with
equal hashcodes with a probability of 0.5. We need to solve
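$$1 - e^{-\frac{k(k-1)}{2N}} = \frac{1}{2}$$
Simplifying, we get
$$e^{\frac{k(k-1)}{2N}} = 2$$
Therefore,
$$\frac{k(k-1)}{2N} = \ln 2$$
which gives us
$$k(k-1) = (2\ln 2)N$$
Since $k(k-1) \approx k^2$ for large $k$, we can write
$$k^2 \approx (2\ln 2)N \qquad\qquad (14)$$
implying
$$k \approx \sqrt{(2\ln 2)N} \approx 1.18\sqrt{N} \approx \sqrt{N}$$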
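As a sanity check on the final approximation, the sketch below (the function name is an assumption made for the example) evaluates the pool size for N = 365, which should land close to the 23 people of the birthday paradox, and for a 64-bit hashcode.

```python
import math

def pool_size_for_even_odds(N: int) -> float:
    """k ~ sqrt((2 ln 2) N): pool size at which a duplicate hashcode has probability 0.5."""
    return math.sqrt(2.0 * math.log(2.0) * N)

print(pool_size_for_even_odds(365))            # roughly 22.5
print(pool_size_for_even_odds(2**64) / 2**32)  # roughly 1.18, meaning k is about 1.18 * 2^32
```

For a 64-bit hashcode, the pool therefore needs only on the order of $2^{32}$ messages, not $2^{64}$.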
• Now the question is: “What is the probability that the two
sets of contracts will have at least one contract each with
the same hashcode?”
• Since $1 - \frac{1}{N}$ is always less than $e^{-\frac{1}{N}}$, the above probability will
always be greater than
$$1 - \left(e^{-\frac{1}{N}}\right)^{k^2}$$
Setting this lower bound equal to 0.5 and solving for $k$ as before gives us
$$k = \sqrt{(\ln 2)N} = 0.83\sqrt{N} \approx \sqrt{N}$$
So if B is willing to generate $\sqrt{N}$ versions of both the
correct contract and the fraudulent contract, there is better
than an even chance that B will find a fraudulent version to
replace the correct version.
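The attack is easy to simulate when the hashcode is short. The sketch below is a toy demonstration under stated assumptions: the 16-bit hashcode is obtained by truncating SHA-256 (so N = 65536), the contract texts are made up, and the variants are produced by appending a version tag, whereas a real attacker would vary whitespace or wording invisibly. With N = 65536 the formula above calls for about $\sqrt{N}$ = 256 variants of each contract for better than even odds; the demonstration uses 4096 of each so that a match is all but certain.

```python
import hashlib

def truncated_hash(text: str, n_bits: int) -> int:
    """The leading n_bits of SHA-256, used here as a deliberately weak hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - n_bits)

def find_matching_pair(correct_base: str, fraud_base: str, k: int, n_bits: int):
    """Generate k variants of each contract and return a (correct, fraudulent)
    pair with equal hashcodes, or None if no such pair exists."""
    table = {}
    for i in range(k):
        variant = "%s [version %d]" % (correct_base, i)
        table.setdefault(truncated_hash(variant, n_bits), variant)
    for j in range(k):
        variant = "%s [version %d]" % (fraud_base, j)
        match = table.get(truncated_hash(variant, n_bits))
        if match is not None:
            return match, variant
    return None

pair = find_matching_pair("A owes B 100 dollars", "A owes B 1000000 dollars", 4096, 16)
print(pair)
```

Storing the hashcodes of one set in a dictionary keeps the search at about 2k hash computations instead of the $k^2$ comparisons that a pairwise search would need.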
15.7 STRUCTURE OF CRYPTOGRAPHICALLY SECURE HASH FUNCTIONS
• The final block also includes the total length of the message
whose hashcode is to be computed. This step enhances
the security of the hash function since it places an
additional constraint on the counterfeit messages.
• For the n-bit input, the first stage is supplied with a special
n-bit pattern called the Initialization Vector (IV).
• The function f that processes the two inputs, one n bits long
and the other b bits long, to produce an n bit output is usually
called the compression function. That is because, usually,
b > n, so the output of the f function is shorter than the length
of the input message segment.
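The chained structure described in the bullets above can be sketched in a few lines. Everything specific in the sketch is an assumption made for illustration: the sizes n = 64 bits and b = 128 bits, the all-zero Initialization Vector, the padding layout, and above all the compression function f, which simply truncates SHA-256 and is in no way a real design.

```python
import hashlib

N_BYTES = 8               # n = 64 bits: size of the chaining value and of the final hash
B_BYTES = 16              # b = 128 bits: size of each message block (note that b > n)
IV = bytes(N_BYTES)       # hypothetical Initialization Vector

def f(chaining_value: bytes, block: bytes) -> bytes:
    """Toy compression function: n + b bits in, n bits out."""
    return hashlib.sha256(chaining_value + block).digest()[:N_BYTES]

def pad(message: bytes) -> bytes:
    """Append a 1-bit, then 0-bits, then the message length (in bits) as the
    last 8 bytes, so that the total is a multiple of the block size."""
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % B_BYTES)
    return padded + (8 * len(message)).to_bytes(8, "big")

def merkle_hash(message: bytes) -> bytes:
    """Feed the blocks one at a time through f, starting from the IV."""
    state = IV
    padded = pad(message)
    for i in range(0, len(padded), B_BYTES):
        state = f(state, padded[i:i + B_BYTES])
    return state

print(merkle_hash(b"abc").hex())
print(merkle_hash(b"abd").hex())
```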
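[Figure: the n-bit Initialization Vector is fed to the first of a chain of f stages; each stage passes n bits to the next, and the n-bit output of the last stage is the hash.]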
Here is what the different columns of the above table stand for:
– The column heading Block Size is the size of each bit block
creating two different PDFs with the same SHA-1 hash value.
To compare SHAttered with the theoretical attack mentioned in
the previous bullet, the authors of SHAttered say their attack
took $2^{63}$ SHA-1 compressions. Note that document formats like
PDF that contain macros appear to be particularly vulnerable
to attacks like SHAttered. Such documents may lend themselves
to what is known as the chosen-prefix collision attack in which
given two different message prefixes p1 and p2, the goal is to
find two suffixes s1 and s2 so that the hash value for the
concatenation p1||s1 is the same as for the concatenation p2||s2.]
STEP 1: Pad the message so that its length is an integral multiple of 1024
bits, the block size. The only complication here is that the last 128 bits
of the last block must contain a value that is the length of the message.
• The last 128 bits of what gets hashed are reserved for the message
length value.
• Leaving aside the trailing 128 bit positions, the padding consists of
a single 1-bit followed by the required number of 0-bits.
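The padding rule of STEP 1 can be written out as follows. This is a minimal sketch that assumes the message is a whole number of bytes (so that the single 1-bit and the first seven 0-bits together form the byte 0x80); the function name is an assumption made for the example.

```python
def sha512_pad(message: bytes) -> bytes:
    """Pad to a multiple of 1024 bits (128 bytes): a single 1-bit, the required
    number of 0-bits, and the message length in bits in the last 128 bits."""
    length_in_bits = 8 * len(message)
    padded = message + b"\x80"                      # the 1-bit followed by seven 0-bits
    padded += bytes((-len(padded) - 16) % 128)      # remaining 0-bits
    return padded + length_in_bits.to_bytes(16, "big")

padded = sha512_pad(b"abc")
print(len(padded) * 8)                              # 1024
print(int.from_bytes(padded[-16:], "big"))          # 24
```

Note that a 112-byte message needs a second block: the 1-bit and the 128-bit length field no longer fit in the 16 bytes that remain in the first block.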
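[Figure: the message, extended with the padding and the length field, is divided into N blocks M1, M2, ..., MN. Starting from the Initialization Vector H0, each block Mi is processed by f together with the 512-bit H(i−1) to produce the 512-bit Hi; HN is the hash.]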
where
$$\sigma_0(x) = ROTR^{1}(x) \oplus ROTR^{8}(x) \oplus SHR^{7}(x)$$
$$\sigma_1(x) = ROTR^{19}(x) \oplus ROTR^{61}(x) \oplus SHR^{6}(x)$$
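With Python integers standing in for 64-bit words, the two functions above take one line each. The sketch below is an illustration, not the BitVector-based code used elsewhere in this lecture. The function expand_schedule uses the message schedule recurrence from the NIST standard (FIPS 180-4), namely $W_i = \sigma_1(W_{i-2}) + W_{i-7} + \sigma_0(W_{i-15}) + W_{i-16}$ modulo $2^{64}$; the function names are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1

def rotr(x: int, n: int) -> int:
    """Circular right shift of a 64-bit word by n positions."""
    return ((x >> n) | (x << (64 - n))) & MASK64

def shr(x: int, n: int) -> int:
    """Plain right shift by n positions with zeros shifted in from the left."""
    return x >> n

def sigma0(x: int) -> int:
    return rotr(x, 1) ^ rotr(x, 8) ^ shr(x, 7)

def sigma1(x: int) -> int:
    return rotr(x, 19) ^ rotr(x, 61) ^ shr(x, 6)

def expand_schedule(first16):
    """Expand the 16 words of a 1024-bit block into the 80-word message schedule."""
    w = list(first16)
    for i in range(16, 80):
        w.append((sigma1(w[i - 2]) + w[i - 7] + sigma0(w[i - 15]) + w[i - 16]) & MASK64)
    return w

print(hex(sigma0(1)), hex(sigma1(1)))
```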
• How the contents of the hash buffer are processed along with the
inputs Wi and Ki is referred to as implementing the round
function.
$$
\begin{aligned}
h &= g \\
g &= f \\
f &= e \\
e &= d +_{64} T_1 \\
d &= c \\
c &= b \\
b &= a \\
a &= T_1 +_{64} T_2 \\
T_1 &= h +_{64} Ch(e,f,g) +_{64} \textstyle\sum e +_{64} W_i +_{64} K_i \\
T_2 &= \textstyle\sum a +_{64} Maj(a,b,c) \\
Ch(e,f,g) &= (e \;AND\; f) \oplus (NOT\; e \;AND\; g) \\
Maj(a,b,c) &= (a \;AND\; b) \oplus (a \;AND\; c) \oplus (b \;AND\; c)
\end{aligned}
$$
• The output of the 80th round is added to the content of the hash
buffer at the beginning of the round-based processing. This
addition is performed separately on each 64-bit word of the
output of the 80th round, modulo $2^{64}$. In other words, the addition is
carried out separately for each of the eight registers of the hash
buffer modulo $2^{64}$.
Finally, ....: After all the N message blocks have been processed
(see Figure 4), the content of the hash buffer is the message
digest.
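One round of this processing, together with the final register-by-register addition, can be sketched with plain Python integers as shown below. This is an illustration and not the BitVector-based style used in the scripts of this lecture. The rotation amounts in big_sigma_a and big_sigma_e are the ones specified for SHA-512 in the NIST standard (FIPS 180-4); the function names are assumptions made for the example.

```python
MASK64 = (1 << 64) - 1

def rotr(x: int, n: int) -> int:
    return ((x >> n) | (x << (64 - n))) & MASK64

def big_sigma_a(a: int) -> int:
    return rotr(a, 28) ^ rotr(a, 34) ^ rotr(a, 39)

def big_sigma_e(e: int) -> int:
    return rotr(e, 14) ^ rotr(e, 18) ^ rotr(e, 41)

def ch(e: int, f: int, g: int) -> int:
    return ((e & f) ^ (~e & g)) & MASK64

def maj(a: int, b: int, c: int) -> int:
    return (a & b) ^ (a & c) ^ (b & c)

def sha512_round(state, w_i: int, k_i: int):
    """Apply the round function once to the eight registers (a, ..., h)."""
    a, b, c, d, e, f, g, h = state
    t1 = (h + ch(e, f, g) + big_sigma_e(e) + w_i + k_i) & MASK64
    t2 = (big_sigma_a(a) + maj(a, b, c)) & MASK64
    return ((t1 + t2) & MASK64, a, b, c, (d + t1) & MASK64, e, f, g)

def feed_forward(h_prev, state):
    """Add the output of the last round to the hash buffer contents at the
    start of the block, separately for each register and modulo 2^64."""
    return tuple((x + y) & MASK64 for x, y in zip(h_prev, state))

start = tuple(range(1, 9))                 # arbitrary register contents for the demo
after_one_round = sha512_round(start, w_i=0, k_i=0)
print([hex(x) for x in after_one_round])
```

Six of the eight registers are simply shifted copies of the previous state; only a and e receive newly computed values in each round.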
[Figure: the compression function f. The message block Mi feeds the message schedule, which supplies the words W0 through W79. The hash buffer H(i−1) is loaded into the eight 64-bit registers a, b, c, d, e, f, g, h of the 512-bit hash buffer. Round j, for j = 0 through 79, uses Wj and Kj. The output of Round 79 is added, register by register and modulo $2^{64}$, to H(i−1) to give Hi.]
• Despite its having been broken, SHA-1 can still serve as a useful
stepping stone if you are learning how to write code for Merkle
type hash functions. My goal in this section is to demonstrate
my Python and Perl implementations for SHA-1 in order to
help you do the same for SHA-512 in the second of the
programming homeworks at the end of this lecture.
• Even more specifically, my goal here is to show how you can use
my BitVector modules (Algorithm::BitVector in Perl and
BitVector in Python) to create compact implementations for
cryptographically secure hash algorithms. Typical
implementations of the SHA algorithms consist of several
hundred lines of code. With BitVector in Python and
Algorithm::BitVector in Perl, you can do the same with a
couple of dozen lines.
h0 = 67452301
h1 = efcdab89
h2 = 98badcfe
h3 = 10325476
h4 = c3d2e1f0
• The goal of the compression function for each block of 512 bits
of the message is to process a new 512-bit block along with the
160-bit hash code produced for the previous block to output the
160-bit hashcode for the new block. The final 160-bit hashcode
is the SHA-1 digest of the message.
And, for the fourth and the final 20 round sequence, we have
f = b ⊕ c ⊕ d
k = 0xca62c1d6
e = d
d = c
c = b << 30
b = a
a = T
rounds of processing, the value of b is used directly for the hashcode for the current 512-bit input block.]
sha1_from_command_line.pl string_whose_hash_you_want
#!/usr/bin/env python
## sha1_from_command_line.py
## by Avi Kak ([email protected])
## February 19, 2013
## Modified: March 2, 2016
## Call syntax:
##
## sha1_from_command_line.py your_message_string
import sys
import BitVector
if BitVector.__version__ < '3.2':
    sys.exit("You need BitVector module of version 3.2 or higher" )
from BitVector import *

if len(sys.argv) != 2:
    sys.stderr.write("Usage: %s <string to be hashed>\n" % sys.argv[0])
    sys.exit(1)

message = sys.argv[1]

# Initialize the five 32-bit words of the hash buffer with the h0 through h4 values listed earlier:
h0 = BitVector(hexstring='67452301')
h1 = BitVector(hexstring='efcdab89')
h2 = BitVector(hexstring='98badcfe')
h3 = BitVector(hexstring='10325476')
h4 = BitVector(hexstring='c3d2e1f0')

# Pad the message with a single 1-bit and then as many 0-bits as needed so that, after the
# 64-bit message length is appended, the total length is a multiple of 512 bits:
bv = BitVector(textstring = message)
length = bv.length()
bv1 = bv + BitVector(bitstring="1")
length1 = bv1.length()
howmanyzeros = (448 - length1) % 512
zerolist = [0] * howmanyzeros
bv2 = bv1 + BitVector(bitlist = zerolist)
bv3 = BitVector(intVal = length, size = 64)
bv4 = bv2 + bv3

words = [None] * 80

for n in range(0,bv4.length(),512):
    block = bv4[n:n+512]
    # The first 16 words of the message schedule come directly from the block:
    words[0:16] = [block[i:i+32] for i in range(0,512,32)]
    for i in range(16, 80):
        words[i] = words[i-3] ^ words[i-8] ^ words[i-14] ^ words[i-16]
        words[i] << 1                  # in-place circular left shift by one bit
    a,b,c,d,e = h0,h1,h2,h3,h4
    for i in range(80):
        if (0 <= i <= 19):
            f = (b & c) ^ ((~b) & d)
            k = 0x5a827999
        elif (20 <= i <= 39):
            f = b ^ c ^ d
            k = 0x6ed9eba1
        elif (40 <= i <= 59):
            f = (b & c) ^ (b & d) ^ (c & d)
            k = 0x8f1bbcdc
        elif (60 <= i <= 79):
            f = b ^ c ^ d
            k = 0xca62c1d6
        a_copy = a.deep_copy()
        T = BitVector( intVal = (int(a_copy << 5) + int(f) + int(e) + int(k) + \
                                 int(words[i])) & 0xFFFFFFFF, size=32 )
        e = d
        d = c
        b_copy = b.deep_copy()
        b_copy << 30                   # in-place circular left shift by 30 bits
        c = b_copy
        b = a
        a = T
    # Add the result of the 80 rounds to the hash buffer, modulo 2^32 for each word:
    h0 = BitVector( intVal = (int(h0) + int(a)) & 0xFFFFFFFF, size=32 )
    h1 = BitVector( intVal = (int(h1) + int(b)) & 0xFFFFFFFF, size=32 )
    h2 = BitVector( intVal = (int(h2) + int(c)) & 0xFFFFFFFF, size=32 )
    h3 = BitVector( intVal = (int(h3) + int(d)) & 0xFFFFFFFF, size=32 )
    h4 = BitVector( intVal = (int(h4) + int(e)) & 0xFFFFFFFF, size=32 )

message_hash = h0 + h1 + h2 + h3 + h4
hash_hex_string = message_hash.getHexStringFromBitVector()
sys.stdout.writelines((hash_hex_string, "\n"))
475f6511376a8cf1cc62fa56efb29c2ed582fe18
#!/usr/bin/env perl
## sha1_from_command_line.pl
## by Avi Kak ([email protected])
## March 2, 2016
## Call syntax:
##
## sha1_from_command_line.pl your_message_string
use strict;
use warnings;
use Algorithm::BitVector 1.25;
my $message = shift;
my ($a,$b,$c,$d,$e) = ($h0,$h1,$h2,$h3,$h4);
my ($f,$k);
foreach my $i (16 .. 79) {
    $words_bv[$i] = $words_bv[$i-3] ^ $words_bv[$i-8] ^ $words_bv[$i-14] ^ $words_bv[$i-16];
    $words_bv[$i] = $words_bv[$i] << 1;
}
foreach my $i (0 .. 79) {
    if (($i >= 0) && ($i <= 19)) {
        $f = ($b & $c) ^ ((~$b) & $d);
        $k = 0x5a827999;
    } elsif (($i >= 20) && ($i <= 39)) {
        $f = $b ^ $c ^ $d;
        $k = 0x6ed9eba1;
• As you would expect, this script produces the same hash values
as the Python version shown earlier in this section:
sha1_from_command_line.pl 0 => b6589fc6ab0dc82cf12099d1c2d40ab994e8410c
475f6511376a8cf1cc62fa56efb29c2ed582fe18
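If you would like an independent check on hashcodes such as those shown above, the following sketch implements the same SHA-1 steps with plain Python integers and compares its output with that of the standard hashlib module. It is an alternative to, and not a replacement for, the BitVector-based scripts of this section; the function names are assumptions made for the example.

```python
import hashlib
import struct

def rotl32(x: int, n: int) -> int:
    """Circular left shift of a 32-bit word by n positions."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def sha1_hex(message: bytes) -> str:
    h = [0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0]
    # Padding: a 1-bit, then 0-bits, then the 64-bit message length in bits
    padded = message + b"\x80"
    padded += bytes((-len(padded) - 8) % 64)
    padded += (8 * len(message)).to_bytes(8, "big")
    for n in range(0, len(padded), 64):
        # Message schedule: 16 words from the block, expanded to 80 words
        w = list(struct.unpack(">16I", padded[n:n + 64]))
        for i in range(16, 80):
            w.append(rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
        a, b, c, d, e = h
        for i in range(80):
            if i <= 19:
                f, k = (b & c) ^ (~b & d), 0x5a827999
            elif i <= 39:
                f, k = b ^ c ^ d, 0x6ed9eba1
            elif i <= 59:
                f, k = (b & c) ^ (b & d) ^ (c & d), 0x8f1bbcdc
            else:
                f, k = b ^ c ^ d, 0xca62c1d6
            t = (rotl32(a, 5) + (f & 0xFFFFFFFF) + e + k + w[i]) & 0xFFFFFFFF
            a, b, c, d, e = t, a, rotl32(b, 30), c, d
        # Add the result of the 80 rounds to the hash buffer, modulo 2^32
        h = [(x + y) & 0xFFFFFFFF for x, y in zip(h, (a, b, c, d, e))]
    return "".join("%08x" % x for x in h)

for text in (b"", b"abc", b"0", b"a" * 200):
    print(sha1_hex(text), sha1_hex(text) == hashlib.sha1(text).hexdigest())
```

The line printed for the input "0" can be compared directly with the hashcode shown above for the same input.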
sha256_from_command_line.py
The former is intended for hashing the contents of a text file that you
supply to the script as its command-line argument, and the latter is
intended to directly hash whatever string you place on the
command line after the name of the script.
#!/usr/bin/env python
## sha256_file_based.py
## by Avi Kak ([email protected])
## January 3, 2018
## Call syntax:
##
## sha256_file_based.py your_file_name
## The above command line applies the SHA256 algorithm implemented here to the contents of the
## named file and prints out the hash value for the file at its standard output. NOTE: IT
## ADDS A NEWLINE AT THE END OF THE OUTPUT TO SHOW THE HASHCODE IN A LINE BY ITSELF.
import sys
import BitVector
if BitVector.__version__ < '3.2':
    sys.exit("You need BitVector module of version 3.2 or higher" )
from BitVector import *

if len(sys.argv) != 2:
    sys.stderr.write("Usage: %s <the name of the file>\n" % sys.argv[0])
    sys.exit(1)

# The 8 32-bit words used for initializing the 256-bit hash buffer before we start scanning the
# input message block for its hashing. See page 13 (page 17 of the PDF) of the NIST standard.
# Note that the hash buffer consists of 8 32-bit words named h0, h1, h2, h3, h4, h5, h6, and h7.
h0 = BitVector(hexstring='6a09e667')
h1 = BitVector(hexstring='bb67ae85')
h2 = BitVector(hexstring='3c6ef372')
h3 = BitVector(hexstring='a54ff53a')
h4 = BitVector(hexstring='510e527f')
h5 = BitVector(hexstring='9b05688c')
h6 = BitVector(hexstring='1f83d9ab')
h7 = BitVector(hexstring='5be0cd19')
# The K constants (also referred to as the "round constants") are used in round-based processing of
# each 512-bit input message block. There is a 32-bit constant for each of the 64 rounds. These are
# as provided on page 10 (page 14 of the PDF) of the NIST standard. Note that these are ONLY USED
# in STEP 3 of the hashing algorithm where we take each 512-bit input message block through 64
# rounds of processing.
K = ["428a2f98", "71374491", "b5c0fbcf", "e9b5dba5", "3956c25b", "59f111f1", "923f82a4", "ab1c5ed5",
"d807aa98", "12835b01", "243185be", "550c7dc3", "72be5d74", "80deb1fe", "9bdc06a7", "c19bf174",
"e49b69c1", "efbe4786", "0fc19dc6", "240ca1cc", "2de92c6f", "4a7484aa", "5cb0a9dc", "76f988da",
"983e5152", "a831c66d", "b00327c8", "bf597fc7", "c6e00bf3", "d5a79147", "06ca6351", "14292967",
"27b70a85", "2e1b2138", "4d2c6dfc", "53380d13", "650a7354", "766a0abb", "81c2c92e", "92722c85",
"a2bfe8a1", "a81a664b", "c24b8b70", "c76c51a3", "d192e819", "d6990624", "f40e3585", "106aa070",
"19a4c116", "1e376c08", "2748774c", "34b0bcb5", "391c0cb3", "4ed8aa4a", "5b9cca4f", "682e6ff3",
"748f82ee", "78a5636f", "84c87814", "8cc70208", "90befffa", "a4506ceb", "bef9a3f7", "c67178f2"]
# The message to be hashed is the content of the file named on the command line:
with open(sys.argv[1], 'r') as inputfile:
    message = inputfile.read()

# STEP 1 OF THE HASHING ALGORITHM: Pad the input message so that its length is an integer multiple
#                                  of the block size which is 512 bits. This padding must account
#                                  for the fact that the last 64 bits of the padded input must store
#                                  the length of the input message:
bv = BitVector(textstring = message)
length = bv.length()
bv1 = bv + BitVector(bitstring="1")
length1 = bv1.length()
howmanyzeros = (448 - length1) % 512
zerolist = [0] * howmanyzeros
bv2 = bv1 + BitVector(bitlist = zerolist)
bv3 = BitVector(intVal = length, size = 64)
bv4 = bv2 + bv3
# Initialize the array of "words" for storing the message schedule for each block of the
# input message:
words = [None] * 64

for n in range(0,bv4.length(),512):
    block = bv4[n:n+512]
    # STEP 2 OF THE HASHING ALGORITHM: Now we need to create a message schedule for this 512-bit
    #                                  input block. The message schedule contains 64 words, each
    #                                  32-bits long. As shown below, the first 16 words of the
    #                                  message schedule are obtained directly from the 512-bit
    #                                  input block:
    words[0:16] = [block[i:i+32] for i in range(0,512,32)]
    # Now we need to expand the first 16 32-bit words of the message schedule into a full schedule
    # that contains 64 32-bit words. This involves using the functions sigma0 and sigma1 as shown
    # below:
    for i in range(16, 64):
        i_minus_2_word = words[i-2]
        i_minus_15_word = words[i-15]
        # The sigma1 function is applied to the i_minus_2_word and the sigma0 function is applied
        # to the i_minus_15_word:
        sigma0 = (i_minus_15_word.deep_copy() >> 7) ^ (i_minus_15_word.deep_copy() >> 18) ^ \
                 (i_minus_15_word.deep_copy().shift_right(3))
        sigma1 = (i_minus_2_word.deep_copy() >> 17) ^ (i_minus_2_word.deep_copy() >> 19) ^ \
                 (i_minus_2_word.deep_copy().shift_right(10))
        words[i] = BitVector(intVal=(int(words[i-16]) + int(sigma1) + int(words[i-7]) +
                                     int(sigma0)) & 0xFFFFFFFF, size=32)
    # Before we can start STEP 3, we need to store the hash buffer contents obtained from the
    # previous input message block in the variables a,b,c,d,e,f,g,h:
    a,b,c,d,e,f,g,h = h0,h1,h2,h3,h4,h5,h6,h7
    # STEP 3 OF THE HASHING ALGORITHM: In this step, we carry out a round-based processing of
    #                                  each 512-bit input message block. There are a total of
    #                                  64 rounds and the calculations carried out in each round
    #                                  are referred to as calculating a "round function". The
    #                                  round function for the i-th round consists of permuting
    #                                  the previously calculated contents of the hash buffer
    #                                  registers as stored in the temporary variables
    # STEP 4 OF THE HASHING ALGORITHM: The values in the temporary variables a,b,c,d,e,f,g,h
    #                                  AFTER 64 rounds of processing are now mixed with the
    #                                  contents of the hash buffer as calculated for the previous
    #                                  block of the input message:
    h0 = BitVector( intVal = (int(h0) + int(a)) & 0xFFFFFFFF, size=32 )
    h1 = BitVector( intVal = (int(h1) + int(b)) & 0xFFFFFFFF, size=32 )
    h2 = BitVector( intVal = (int(h2) + int(c)) & 0xFFFFFFFF, size=32 )
    h3 = BitVector( intVal = (int(h3) + int(d)) & 0xFFFFFFFF, size=32 )
    h4 = BitVector( intVal = (int(h4) + int(e)) & 0xFFFFFFFF, size=32 )
    h5 = BitVector( intVal = (int(h5) + int(f)) & 0xFFFFFFFF, size=32 )
    h6 = BitVector( intVal = (int(h6) + int(g)) & 0xFFFFFFFF, size=32 )
    h7 = BitVector( intVal = (int(h7) + int(h)) & 0xFFFFFFFF, size=32 )

# Concatenate the contents of the hash buffer to obtain a 256-element BitVector object:
message_hash = h0 + h1 + h2 + h3 + h4 + h5 + h6 + h7
# Get the hex representation of the binary hash value:
hash_hex_string = message_hash.getHexStringFromBitVector()
sys.stdout.writelines((hash_hex_string, "\n"))
64 bits for the block length. We will also assume the key
length to be 56 bits.) Let’s say that an adversary can
observe {M, C(K, M )}.
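[Figure: the HMAC construction. The padded key K+ is XORed with ipad to give a b-bit block that is fed to the first HASH, which produces an n-bit hash. That n-bit hash is padded to b bits and, together with the b-bit block obtained by XORing K+ with opad, is fed to the second HASH, whose n-bit output is the HMAC.]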
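The construction in the figure can be sketched as follows. The sketch makes several assumptions that are not taken from these notes: SHA-256 is chosen as the underlying hash (so that b = 512 bits and n = 256 bits), the byte values 0x36 and 0x5c for ipad and opad are the ones specified in RFC 2104, and the function name is made up for the example. The result is compared with Python's standard hmac module.

```python
import hashlib
import hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    block_size = 64                                  # b = 512 bits for SHA-256
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()           # long keys are hashed first
    k_plus = key.ljust(block_size, b"\x00")          # K+ : the key padded to b bits
    inner_key = bytes(x ^ 0x36 for x in k_plus)      # K+ XOR ipad
    outer_key = bytes(x ^ 0x5c for x in k_plus)      # K+ XOR opad
    inner_hash = hashlib.sha256(inner_key + message).digest()
    return hashlib.sha256(outer_key + inner_hash).digest()

key, message = b"a shared secret key", b"The quick brown fox jumps over the lazy dog"
mac = hmac_sha256(key, message)
print(mac.hex())
print(mac == hmac.new(key, message, hashlib.sha256).digest())
```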
• I’ll explain these and other related concepts with the help of the
CeroCoinClient module in Python that I specifically created for
this educational exercise. So I’d encourage the reader to visit
the following webpage in order to become familiar with the
CeroCoin virtual currency:
https://engineering.purdue.edu/kak/distCC/CeroCoinClient-2.0.1.html
and other wealthy individuals. Only 65 of these eggs were made and, according to the Wikipedia, 57
• Let’s now switch over to the virtual world and pose the
questions:
– Is it possible to create rare things in cyberspace?
the interval $[0, 2^N - 1]$ and the fact that a good hash function should
distribute the output integers over the entire range uniformly,
the probability that you will discover a message whose hash is
upper bounded by t is given by $t/2^N$. This probability can be
made as small as you wish by making t sufficiently small for any
reasonable value for N. Consider, for example, t = 1000 and
N = 256. Now we are talking about a rareness probability of
$1000/2^{256}$, which is a number pretty close to zero.
• If you blow up the coin printout shown above, you will notice
the following entry in the last line:
POW_DIFFICULTY_LEVEL=252
MSG_HASH=0ce65c1874662ba9d64755a983db9056003a6065c80d068c4a1459df50489b6d
• The value of 252 for POW_DIFFICULTY_LEVEL would be far too
lenient for discovering coins that would command any value in
the marketplace of crypto currencies. However, this value is
convenient if you want to show coin production within the
time limits imposed by a classroom demonstration of CeroCoin.
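The search for such a rare hash can be sketched in a few lines. The sketch below is not the CeroCoinClient code. It assumes that a difficulty level of d means the 256-bit SHA-256 hash, read as an integer, must be smaller than $2^d$, an interpretation that is consistent with the MSG_HASH value shown above beginning with a zero hex digit. The message format and the function name are also assumptions made for the example.

```python
import hashlib
import itertools

def mine(message: str, difficulty_level: int):
    """Try successive nonces until the SHA-256 hash of message:nonce, read as a
    256-bit integer, falls below 2**difficulty_level."""
    target = 1 << difficulty_level
    for nonce in itertools.count():
        candidate = ("%s:%d" % (message, nonce)).encode("utf-8")
        digest = hashlib.sha256(candidate).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()

nonce, msg_hash = mine("a hypothetical coin message", 252)
print(nonce, msg_hash)
```

With a level of 252, each attempt succeeds with a probability of $2^{252}/2^{256} = 1/16$, so the search ends after about 16 attempts on average. Every reduction of the level by one doubles the expected amount of work.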
• After the miner has signed the coin, this is what it looks like:
CEROCOIN COIN_ID=e0051e79 CREATOR=e31645f56058f1e450ce17c86752bbc0fe74fa9a69f295323bf793ec7185a7ea
CREATOR_PUB_KEY=CEROCOIN-1.4,mod=cd0b1fa3615775745575d625b6cd2262c11cd50aedc2d89d5a3d3ed590e344a5d8bfcfed0581ddf7836ccda1eb1417b8ea2bc66047fe107c8643575c147c3577,e=10001
MESSAGE_STRING=e671b364d94e7f644c8c170bd6f2d86abee7b10961703d9d5bbcb251a346bbfca5b1ac01dbe476961b0aa70e8f95c4a2de38b7787ada49c5dd4bfd620c246b4316d91a1a8702825f01d1e34332a77a815a66a0e647
c1ef083d2e5292cfc73737b06ac15f41e0fe5a4077bafcb5cd48524cf88b0a7148771280da11d81c3e187a88b1cc2c8808e5e291795e603e092e65629cc2f25eba5cd26ad06aec1a69d850c2ef02e3aa064fdca0d8a16c5118bc9766b
3b31be07340dfb51beb1a9755f99b7b73d53e8f551d6c27af45075632d058aa6f34bf5e86b1639a5d203c94a78970fc16128b7c16774a16cc0744be23c039b48d89d059ca1a2a8196408c248734483e45a6de4c43d3f9c51fa2caf4ff
9929caaa6038088bbedeb1388b334831afea80621f4aba0c7221b8bc02182b1ac05316d6a80f2005bceca3859a3145ec72fa4c40853c7d3dc4459f5079afe2c9b5a22de314f0bc128463ee0689cca536109c1bdb3f93269a8d02f3dba188ffa8f72780
7a802bda71da177ca7b55ed35fdae33bda98a81cee6731251ef092da2e91d75809b3db3c1a3432c2a430ab61ebc5f297774a880842cf0e43f93464821fc3b2b8dcdc2fc96371bf9df31b83b42e1b5b57014cc1d64f9d6b
NONCE=3e45a6de4c43d3f9c51fa2caf4ff9929caaa6038088bbedeb1388b334831afea80621f4aba0c7221b8bc02182b1ac05316d6a80f2005bceca3859a3145ec72fa4c40853c7d3dc4459f5079afe2c9b5a22de314f0bc128463ee0689cca536109c
POW_DIFFICULTY_LEVEL=252 TIMESTAMP=1519751894.29 MSG_HASH=0ce65c1874662ba9d64755a983db9056003a6065c80d068c4a1459df50489b6d
SELLER_SIGNATURE=68e5afedaa0f438586cc4995e8f5e5cfafd7dc51fa56e38f5f58a25537e49ae71e5246a2ef9df4a3a44bdb22cc89662acfd4f4e0ba0157b64cc32e1aebe8f075
def find_buyer_for_new_coin_and_make_a_transaction(self):
    '''
    We now look for a remote client with whom the new coin can be traded in the form of a
    transaction. The remote client must first send over its public key in order to construct
    the transaction.
    '''
    mdebug = False
    while len(self.outgoing_client_sockets) == 0: time.sleep(2)
    while len(self.coins_currently_owned_digitally_signed) == 0: time.sleep(2)
    time.sleep(4)
    while True:
        while len(self.coins_currently_owned_digitally_signed) == 0: time.sleep(2)
        while self.blockmaker_flag or self.blockvalidator_flag: time.sleep(2)
        self.transactor_flag = True
        print("\n\nLOOKING FOR A CLIENT FOR MAKING A TRANSACTION\n\n")
        coin = self.coins_currently_owned_digitally_signed.pop()
        if coin is not None:
            print("\nNew outgoing coin: %s" % coin)
        buyer_sock = None
        new_transaction = None
        random.shuffle(self.outgoing_client_sockets)
        sock = self.outgoing_client_sockets[0]
        if sys.version_info[0] == 3:
            sock.send(b"Send pub key for a new transaction\n")
        else:
            sock.send("Send pub key for a new transaction\n")
        try:
            while True:
                while self.blockmaker_flag or self.blockvalidator_flag: time.sleep(2)
                message_line_from_remote = ""
                while True:
                    byte_from_remote = sock.recv(1)
                    if sys.version_info[0] == 3:
                        byte_from_remote = byte_from_remote.decode()
                    if byte_from_remote == '\n' or byte_from_remote == '\r':
                        break
                    else:
                        message_line_from_remote += byte_from_remote
                if mdebug:
                    print("\n:::::::::::::message received from remote: %s\n" % message_line_from_remote)
                if message_line_from_remote == "Do you have a coin to sell?":
                    if sys.version_info[0] == 3:
                        sock.send( b"I do. If you want one, send public key.\n" )
                    else:
                        sock.send( "I do. If you want one, send public key.\n" )
                elif message_line_from_remote.startswith("BUYER_PUB_KEY="):
                    while self.blockmaker_flag or self.blockvalidator_flag: time.sleep(2)
                    buyer_pub_key = message_line_from_remote
                    print("\nbuyer pub key: %s" % buyer_pub_key)
                    new_transaction = self.prepare_new_transaction(coin, buyer_pub_key)
                    self.old_transaction = self.transaction
                    self.transaction = new_transaction
                    self.transactions_generated.append(new_transaction)
                    print("\n\nNumber of tranx in 'self.transactions_generated': %d\n" % len(self.transactio
                    print("\n\nsending to buyer: %s\n" % new_transaction)
                    if sys.version_info[0] == 3:
                        tobesent = str(new_transaction) + "\n"
                        sock.send( tobesent.encode() )
                    else:
                        sock.send( str(new_transaction) + "\n" )
                    self.num_transactions_sent += 1
                    break
                else:
                    print("seller side: we should not be here")
        except:
            print("\n\n>>>Seller to buyer: Could not maintain socket link with remote for %s\n" % str(socket
        self.transactor_flag = False
        time.sleep(10)
                            buyer_pub_key = buyer_pub_key,
                            seller_pub_key = ",".join(self.pub_key_string.split()),
                            pow_difficulty = self.pow_difficulty,
                            timestamp = str(time.time()),
                          )
    digitally_signed_tranx = self.digitally_sign_transaction(new_tranx)
    return digitally_signed_tranx
def construct_a_new_block_and_broadcast_the_block_to_cerocoin_network(self):
    '''
    We pack the newly generated transactions in a new block for broadcast to the network
    '''
    mdebug = False
    time.sleep(10)
    while len(self.transactions_generated) < self.num_transactions_in_block: time.sleep(2)
    while True:
        while len(self.transactions_generated) < self.num_transactions_in_block: time.sleep(2)
        self.blockmaker_flag = True
        print("\n\n\nPACKING THE ACCUMULATED TRANSACTIONS INTO A NEW BLOCK\n\n")
        current_block_hash = min_pow_difficulty = None
        if self.block is None:
            current_block_hash = self.gen_rand_bits_with_set_bits(256)
            min_pow_difficulty = self.pow_difficulty
            self.blockchain_length = len(self.transactions_generated)
        else:
            current_block_hash = self.get_hash_of_block(self.block)
            min_pow_difficulty = 0
            for tranx in self.transactions_generated:
                tranx_pow = int(self.get_tranx_prop(self.block, 'POW_DIFFICULTY'))
                if tranx_pow > min_pow_difficulty:
                    min_pow_difficulty = tranx_pow
            self.blockchain_length += len(self.transactions_generated)
        new_block = CeroCoinBlock( block_id = self.gen_rand_bits_with_set_bits(32),
                                   block_creator = self.ID,
                                   transactions = str(self.transactions_generated),
                                   pow_difficulty = min_pow_difficulty,
                                   prev_block_hash = current_block_hash,
                                   blockchain_length = self.blockchain_length,
                                   timestamp = str(time.time()),
                                 )
        self.transactions_generated = []
        new_block_with_signature = self.digitally_sign_block( str(new_block) )
        print("\n\n\nWILL BROADCAST THIS SIGNED BLOCK: %s\n\n\n" % new_block_with_signature)
        self.block = new_block_with_signature
        for sock in self.outgoing_client_sockets:
            if sys.version_info[0] == 3:
                sock.send("Sending new block\n".encode())
            else:
                sock.send("Sending new block\n")
            try:
                while True:
                    message_line_from_remote = ""
                    while True:
                        byte_from_remote = sock.recv(1)
                        if sys.version_info[0] == 3:
                            byte_from_remote = byte_from_remote.decode()
                        if byte_from_remote == '\n' or byte_from_remote == '\r':
                            break
                        else:
                            message_line_from_remote += byte_from_remote
                    if mdebug:
                        print("\n::::::::::BLK: message received from remote: %s\n" % message_line_from_remo
                    if message_line_from_remote == "OK to new block":
                        if sys.version_info[0] == 3:
                            tobesent = self.block + "\n"
                            sock.send( tobesent.encode() )
                        else:
                            sock.send( self.block + "\n" )
                        break
                    else:
                        print("sender side for block upload: we should not be here")
            except:
                # Report the socket of the current loop iteration
                print("Block upload: Could not maintain socket link with remote for %s\n" % str(sock))
        self.blockmaker_flag = False
        time.sleep(10)
• The reasons mentioned above imply that, at the least, you must
execute the coin mining code and the code that is in charge of
receiving and validating incoming blocks in two separate
threads or processes. Since a client can construct a new block
using a mixture of transactions, some based on the coins
discovered by the client itself and others based on the coins
acquired from other clients, you must also run the block
construction code in a separate thread or process of its own.
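A minimal sketch of that three-way split, assuming Python's threading module. The worker names coin_miner, block_receiver, and block_maker, and the function run_client_threads, are hypothetical placeholders and not part of the CeroCoin code; each worker only records that it ran, whereas a real client would mine, validate incoming blocks, or pack transactions into a block.

```python
import threading
import time

def run_client_threads(run_for_seconds=0.2):
    """Run three placeholder activities concurrently and return what each one did."""
    stop = threading.Event()
    activity_log = []
    log_lock = threading.Lock()   # protects activity_log, which all threads share

    def worker(name, period):
        # Placeholder loop: a real client would mine coins, validate
        # incoming blocks, or construct new blocks here.
        while not stop.is_set():
            with log_lock:
                activity_log.append(name)
            stop.wait(period)     # sleep, but wake up early if asked to stop

    names = ("coin_miner", "block_receiver", "block_maker")
    threads = [threading.Thread(target=worker, args=(n, 0.01), daemon=True)
               for n in names]
    for t in threads:
        t.start()
    time.sleep(run_for_seconds)
    stop.set()
    for t in threads:
        t.join()
    return activity_log

if __name__ == "__main__":
    log = run_client_threads()
    print({name: log.count(name) for name in sorted(set(log))})
```

The shared list is guarded by a lock because all three threads write to it. The same concern applies to any state that the miner, the block receiver, and the block maker have in common.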
• While our focus so far in this Lecture has been on hashing for
message authentication, I’d be remiss if I did not touch even
briefly on the other extremely important use of hashing in
modern programming — efficient storage of associative arrays.
In general, the hash functions used in message authentication
are different from those used for efficient storage of information,
and it is educational to see why that is the case.
The goal of this section is to focus on this difference by
presenting examples of hash functions for efficient storage. I’ll
start with the concept of an associative array because that is
what is stored in the containers based on hash functions.
telephone operator responding to your query for the phone number for an individual had to linearly
scan through the entire directory to fetch that number? In a large metropolitan area with tens of
millions of people, a linear scan (or even binary search) through alphabetized sub-lists would take far
too long. ]
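To make the directory analogy concrete, here is a minimal sketch of an associative array stored in hash-addressed buckets. The class name ToyDirectory, its methods, and the byte-sum hash are all hypothetical choices made for illustration; the hash is deliberately simplistic and is not one of the hash functions discussed in this lecture.

```python
class ToyDirectory:
    """A toy <name,number> store that hashes each name to a bucket address."""

    def __init__(self, capacity=11):
        # One list per bucket; colliding names share a bucket.
        self.buckets = [[] for _ in range(capacity)]

    def _address(self, name):
        # Deliberately simple, non-cryptographic hash: byte sum modulo capacity.
        return sum(name.encode("utf-8")) % len(self.buckets)

    def put(self, name, number):
        bucket = self.buckets[self._address(name)]
        for i, (existing_name, _) in enumerate(bucket):
            if existing_name == name:
                bucket[i] = (name, number)   # overwrite an existing entry
                return
        bucket.append((name, number))

    def get(self, name):
        # Only one bucket is scanned, not the entire directory.
        for existing_name, number in self.buckets[self._address(name)]:
            if existing_name == name:
                return number
        raise KeyError(name)

if __name__ == "__main__":
    directory = ToyDirectory()
    directory.put("Alice", "555-0100")
    directory.put("Bob", "555-0199")
    print(directory.get("Bob"))
```

A lookup computes the bucket address from the name and scans only that bucket, so the cost depends on how evenly the hash function spreads the names and not on the total size of the directory.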
hash function for their hash based containers. Based on the idea
of prime numbers mentioned above, FNV is fast, in the sense
that it requires only two operations, one XOR and one multiply,
for each byte of a key. Here is a pseudocode description of the
FNV hash function:
hash = offset_basis
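As a runnable illustration of the one-XOR, one-multiply structure, here is a minimal Python sketch. It assumes the 32-bit FNV-1a variant (XOR first, then multiply) with the standard 32-bit offset basis and FNV prime; the variant actually meant by the pseudocode above may order the two operations differently. The helper bucket_address is a hypothetical name added for illustration.

```python
FNV32_OFFSET_BASIS = 0x811C9DC5   # standard 32-bit FNV offset basis
FNV32_PRIME = 0x01000193          # standard 32-bit FNV prime

def fnv1a_32(key: bytes) -> int:
    """32-bit FNV-1a: one XOR and one multiply for each byte of the key."""
    h = FNV32_OFFSET_BASIS
    for byte in key:
        h ^= byte                            # fold the byte into the low bits
        h = (h * FNV32_PRIME) & 0xFFFFFFFF   # multiply, keeping 32 bits
    return h

def bucket_address(key: str, capacity: int) -> int:
    """Hypothetical helper: map a string key to a bucket index."""
    return fnv1a_32(key.encode("utf-8")) % capacity

if __name__ == "__main__":
    print(hex(fnv1a_32(b"a")))
    print(bucket_address("hello", 31))
```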
large prime, you can run into high collision rates if the keys are
such that, when translated into integers, the bit patterns
associated with them occupy mostly the high-level bits. You
see, the modulo operation, by its definition, discards a certain
number of high-level bits from the keys. For illustration,
consider calculating key values modulo 256 and assume that all
the keys when translated into integers are multiples of 256,
meaning that their eight low-level bits are all zero. In this
case, since the remainders would all be zero, you
will have all the <key,value> pairs placed in the bucket with
address 0. Although such an extreme non-uniformity in the
distribution of the keys over the buckets does not happen when
the capacity is a prime, you may nonetheless end up with an
unacceptable level of collisions in certain buckets if the low-level
bits of the keys are mostly zeros.
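The bucket-0 pile-up is easy to reproduce. The sketch below uses made-up integer keys whose eight low-level bits are all zero and compares a capacity of 256 with the nearby prime 251; the function name bucket_counts and both capacities are choices made for this illustration.

```python
from collections import Counter

def bucket_counts(keys, capacity):
    """Count how many keys land in each bucket when address = key mod capacity."""
    return Counter(key % capacity for key in keys)

# 100 integer keys whose eight low-level bits are all zero (multiples of 256).
keys = [256 * k for k in range(1, 101)]

with_256 = bucket_counts(keys, 256)   # capacity is a power of 2
with_251 = bucket_counts(keys, 251)   # capacity is a prime

if __name__ == "__main__":
    print(len(with_256), len(with_251))   # prints: 1 100
```

With capacity 256 every key lands in bucket 0, whereas with capacity 251 these particular keys occupy 100 different buckets.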
1. The very first step in the SHA1 algorithm is to pad the message
so that it is a multiple of 512 bits. This padding occurs as
follows (from NIST FIPS 180-2): Suppose the length of the
message M is L bits. Append bit 1 to the end of the message,
followed by K zero bits where K is the smallest non-negative
solution to
L + 1 + K ≡ 448 (mod 512)
Next append a 64-bit block that is a binary representation of
the length integer L. For example,
Message = "abc"
length L = 24 bits
01100001 01100010 01100011 1 000.....000 00....011000
   a        b        c       <---423---> <---64----->
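The padding rule can be checked with a short Python sketch. It assumes the message is a whole number of bytes, so the appended 1 bit and the first seven zero bits fit into the single byte 0x80; the function names zero_bits_needed and sha1_pad are hypothetical and not taken from the implementations in this lecture.

```python
def zero_bits_needed(length_in_bits: int) -> int:
    """Smallest non-negative K with L + 1 + K congruent to 448 (mod 512)."""
    return (448 - (length_in_bits + 1)) % 512

def sha1_pad(message: bytes) -> bytes:
    """Pad a byte-aligned message to a multiple of 512 bits."""
    L = 8 * len(message)
    K = zero_bits_needed(L)
    # For byte-aligned input, K is always 7 more than a multiple of 8:
    # 0x80 supplies the 1 bit plus 7 zeros, the remaining zeros are whole bytes.
    padding = b"\x80" + b"\x00" * ((K - 7) // 8)
    # The final 64 bits hold L as a big-endian integer.
    return message + padding + L.to_bytes(8, "big")

if __name__ == "__main__":
    padded = sha1_pad(b"abc")
    print(zero_bits_needed(24), len(padded) * 8)   # prints: 423 512
```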
2. The fact that only the last 64 bits of the padded message are
used for representing the length of the message implies that
SHA1 should NOT be used for messages that are longer than
what?
8. Programming Assignment:
9. Programming Assignment: