Lecture#1 Merged

The document outlines the course CSE345/545 - Foundations of Computer Security, taught by Dr. Arun Balaji Buduru, covering various aspects of computer security including vulnerabilities, threats, and security properties. It emphasizes the importance of understanding both technical and human factors in security, and provides an overview of the course structure, topics, and evaluation methods. The course aims to equip students with practical knowledge and skills to enhance security in various systems.


CSE345/545 - Winter 2025

Foundations of Computer Security


Introduction

Dr. Arun Balaji Buduru


Founding Head, Usable Security Group (USG)
Associate Professor, Dept. of CSE | HCD, IIIT-Delhi, India
Visiting Faculty, Indiana University – Bloomington, USA
Background
1

 Name: Arun Balaji Buduru


 Education:
 B.E. in CSE, Anna University-Chennai [2011]
 Ph.D. in CS, Arizona State University, USA [2017]
◼ Specialization in Information Assurance under Prof. Stephen S. Yau
◼ Dissertation: User Centric Approach to Securing IoT Devices through
Probabilistic Human Behavior Learning
 Current Affiliation
 Founding Head, Usable Security Group (USG) @IIIT-Delhi
 Associate Professor, Dept. of CSE | HCD @IIIT-Delhi
 Visiting Faculty @Indiana University’s Luddy School of Informatics,
Computing and Engineering – Bloomington, USA
 Research Interests: Affective AI, Usable Security, Computational
Linguistics, and Continuous Authentication. More details at
https://faculty.iiitd.ac.in/~arunb/
Smart World Environment
2
Why study Computer Security
3

- Process: Increasing need to accommodate the emergence of multiple perimeters and moving parts in the network
- Technology: Ever-changing and increasingly advanced targeted threats
- People: Cyber system breaches, including those in Internet of Things (IoT) systems, have shown increasing sophistication in attacks
- Attackers are no longer limited by resources, including human and computing power
Why is Cyber Security so difficult?
4

 No Full Stack Development


 Dynamic Environments

 Higher ROI

 Lack of User Awareness


Recent Attacks
5
Threat Landscape
6
Racing Analogy for Security
7

- Information Security is like brakes in a race car. Wait, what?
- Adaptive application of the brakes is what allows the car to turn corners, maneuver around traffic, and ultimately win the race
What do we do?
8

- You need to think like a criminal to catch a criminal
  - Understand them better
  - They usually target weak points in tools/apps, users, and networks
  - They continuously learn and adapt
  - They are incredibly motivated and persistent
Weaknesses in the System
9

 Weakness 1: Issues with tools/apps/devices


 Weakness 2: User Behaviors – Lack of
Awareness
 Weakness 3: Complex Networks
Weakness 1: Issues with tools/apps/devices

- Jeremiah Grossman discovered how to hack users' DSL routers without their knowledge [12, 16]
- Use the router's default password in the attack, e.g., <img src="http://admin:password@192.168.1.1/">
- Then the attacker manipulates the router as desired
  - E.g., change DNS to an attacker-controlled server: <img src="http://192.168.1.1/changeDNS?newDNS=143.23.45.1" />
- CSRF lets an attacker control an organization's Intranet, bypassing firewalls [12]
  - Attacker can map the network and scan ports using JS
  - Capture unique files, error messages, etc. for each IP address to determine the software and servers running on the network
  - Change the network password, access all Web-enabled devices on the network
Weakness 1: Issues with tools/apps/devices
11

#include <cstdio>
#include <cstring>
#include <iostream>

const char *PASSWORD_FILE = "rictro";

int main()
{
    char input[8];
    char password[8];
    std::sscanf(PASSWORD_FILE, "%s", password);
    std::cout << "Enter password: ";
    std::cin >> input;
    // Debug prints:
    // std::cout << "Address of input: " << &input << "\n";
    // std::cout << "Address of password: " << &password << "\n";
    // std::cout << "Input: " << input << "\n";
    // std::cout << "Password: " << password << "\n";
    if (std::strncmp(password, input, 8) == 0)
        std::cout << "Access granted\n";
    else
        std::cout << "Access denied\n";
    return 0;
}

Shell Code Injection (machine-code payload that ends by invoking "/bin/ls"):

unsigned char cde[] =
    "\xeb\x1f\x5e\x89\x76\x08\x31\xc0"
    "\x88\x46\x07\x89\x46\x0c\xb0\x0b"
    "\x89\xf3\x8d\x4e\x08\x8d\x56\x0c"
    "\xcd\x80\x31\xdb\x89\xd8\x40\xcd"
    "\x80\xe8\xdc\xff\xff\xff/bin/ls";
Weakness 2: User Behaviors
12
Security solutions are
only as strong as their
weakest link(s)

A Stanford research study claims 88% of data breaches are caused by human error
Weakness 2: User Behaviors
13

- (Diagram) An agent deciding "what action next?" faces an environment that may be static vs. dynamic, observable vs. partially observable, and deterministic vs. stochastic, with perfect vs. imperfect perception, full vs. partial goal satisfaction, and instantaneous vs. durative actions.

*Taken from lecture slides of Prof. Subbarao Kambhampati
Weakness 3: Networks
14

- The Internet is very decentralized
- Spoofing is built into the system design
What would you do?
15
IA Areas related to CSE

- (Diagram) Information Assurance draws on several CSE areas: Computer Networks, Data Security, Operating Systems, Database & Big Data, Software Engineering, and Data Science. Their overlaps yield Computer Network Security, Computer System Security, Software Security, and User-Centric Security.
Why should you take this class?
18

- Overview of security topics
- To decide on further study
- Topic of interest
- To learn about vulnerabilities while using the Internet
- Required class for M.Tech. in IS
- Learning to protect your data from unauthorized access
- Desire/opportunity to break into systems
Course Philosophy
19

- Objectives
  - Introduce various security topics
  - Whirlwind tour of various security topics
- Learning
  - Lectures, questions, discussions
  - Guest lectures (if possible)
- Evaluation
  - Exams
  - Assignments/quizzes/homeworks
  - Course project (learning by doing / real-world issues)


Topics to be covered
20

 CIA Model
 Security protocols
 Identity management
 Cryptography
 Economics of Information Security
 Information hiding and watermarking techniques
 Privacy
 Buffer Overflow
 Blockchain/Digital Currency
Topics to be covered
21

 Network Security
 ACL Mechanism
 Threat Modeling
 Security Testing
 Critical infrastructure protection
 Ethical hacking
 …
Post conditions for the course
22

- Get introduced to various topics of Computer Security
- Through the project, students will be able to understand, implement, and test techniques covered in class
- Get hands-on experience with various security tools

Protocol inside the class
23
Attendance policy
24

 We will take attendance on random days through


surprise quizzes

 Your attendance has to be > 50% to pass the course

 It will be tough for you to understand the slides without


attending the class!
Grading
25

 Assignments: 15%
 Quizzes: 10%

 Exams: 35%

 Mid-term: 15%
 Final: 20%

 Project: 35%
 Participation/discussions: 5% ☺

 Don’t worry too much about the grades, it is all

about learning and having fun!


Project
26

 35% of the overall grade


 25% for Development
 10% for Testing

 Develop a web application (requirements will be


shared)
 Team of four members

 At least 50% of the project should be original

 Platform and Language independent

 Project will be graded based on peer testing


Plagiarism / Cheating
27

- Plagiarism: the practice of taking someone else’s work or ideas and passing them off as one’s own
- In assignments: the first instance reduces the grade by one letter; the second instance gets an F
- In an exam: at least an F (and maybe disciplinary action as well)
Online discussion
28

- Google Classroom: please register yourself ASAP
- We expect you to post at least one question, answer one question, or make a comment, etc. once a week
- Remember, there is some data to show that students who do well on these online discussion forums are those who understand the topics well
Administrivia
29

 Course Website:
 Google Classroom code: gi7roxd

 All materials will be shared or pointed out in the


class/notes
Office Hours
30

- Class Schedule:
  - Time: M Th 0930-1100 hrs
  - Class Venue: Old Acad C21
- Instructor: Dr. Arun Balaji Buduru
  - E-mail: [email protected]
  - Office hours: T Fr 1030-1130 hrs and by appointment
  - Office: B-504 (R&D Block)
  - Website: http://faculty.iiitd.ac.in/~arunb/

 Teaching Assistants:
 To be Announced
31

Thank you
CSE345/545 - Winter 2025
Foundations of Computer Security
Lecture 1: Security Components

Dr. Arun Balaji Buduru


Founding Head, Usable Security Group (USG)
Associate Professor, Dept. of CSE | HCD, IIIT-Delhi, India
Visiting Faculty, Indiana University – Bloomington, USA
What would you do?
1
Know the Characters
2

- Alice and Bob, the average users
- Trudy, the bad guy
- Dan, the admin
Alice’s view of Security
3

- I just want to finish my work
  - Financial transactions
  - Transferring files
- I don’t do much on the Internet, so I am safe
- I don’t have any PII (Personally Identifiable Information) on my machine
- I don’t want somebody to keep tabs on what I am doing
Dan’s view of Security
4

- How do I convince users that having a strong password helps?
- What technology, process, or people skills can I use to reduce
  - Attacks on my machines
  - Customer or complaint calls / emails
Trudy’s view of Security
5

- How can I guess his / her password?
- Can I exploit any weaknesses / loopholes in the systems?
- Can I exploit human behavior?
  - Social engineering
Security Properties
6

 Five main security properties:


 Confidentiality – No unauthorized information gathering
 Integrity – Data has not been (maliciously) altered

 Availability – Data/services can be accessed as desired

 Accountability – Actions traceable to those responsible

 Authentication – User or data origin accurately


identifiable
Confidentiality
7
Integrity
8
Authentication
9
Availability
10
Which Property is Violated?
11

- A hacker gets access to classified information on a machine
- You are not able to access the bank’s site for a transaction
- You break into an IIIT-D machine to change your grades
- The online session keeps expiring when you are trying to do a transaction on the bank’s website
Whole-System is Critical
12

 Securing a system involves a whole-system view


 Cryptography

 Implementation

 People

 Physical security
 Everything in between

 This is because “security is only as strong as the weakest


link,” and security can fail in many places
 No reason to attack the strongest part of a system if you can
walk right around it.
Analyzing the Security of a System
13

- First thing: Summarize the system as clearly and concisely as possible
  - Critical step. If you can’t summarize the system clearly and concisely, how can you analyze its security?
 Next steps:
 Identify the assets: What do you wish to protect?
 Identify the adversaries and threats

 Identify vulnerabilities: Weaknesses in the system

 Calculate the risks

 Evaluate controls / mitigation strategies, and iterate


Assets
14

 Need to know what you are protecting!


 Hardware: Laptops, servers, routers, PDAs, phones, ...
 Software: Applications, operating systems, database systems,
source code, object code, ...
 Data and information: Data for running and planning your
business, design documents, data about your customers, data
about your identity
 Reputation, brand name
 Responsiveness

 Assets should have an associated value (e.g., cost to replace


hardware, cost to reputation, how important to business
operation)
Adversaries
15

 National governments
 Terrorists
 Thieves
 Business competitors
 Your supplier
 Your consumer
 New York Times
 Your family members (parents, children)
 Your friends
 Your ex-friends
Vulnerabilities
16

 Weaknesses of a system that could be exploited to cause


damage
 Accounts with system privileges where the default password has
not been changed (Diebold: 1111)
 Programs with unnecessary privileges
 Programs with known flaws
 Known problems with cryptography
 Weak firewall configurations that allow access to vulnerable
services
 ...

 Sources for vulnerability updates: MITRE, CVSS, CERT, SANS,


Bugtraq, the news(?)
Threats
17

 Threats are actions by adversaries who try to exploit


vulnerabilities to damage assets
 Spoofing identities: Attacker pretends to be someone else
 Tampering with data: Change outcome of election

 Denial of service: Attacker makes voting machines


unavailable on election day
 Escalation of privilege: Regular voter becomes admin

 Specific threats depend on environmental conditions,


enforcement mechanisms, etc.
- You must have a clear, simple, accurate understanding of how the system works!
Threats
18

 Several ways to classify threats


 By damage done to the assets
◼ Confidentiality, Integrity, Availability
 By the source of attacks
◼ (Type of) insider
◼ (Type of) outsider
◼ Local attacker
◼ Remote attacker
◼ Attacker resources
 By the actions
◼ Interception
◼ Interruption
◼ Modification
◼ Fabrication
19

Authentication
Authentication
20

- Binding of an identity / entity to the subject
- One or more of the following:
  - What the entity knows (e.g., password)
  - What the entity has (e.g., badge, smart card)
  - What the entity is (e.g., fingerprints, retinal characteristics)
  - Where the entity is (e.g., in front of a particular terminal)

Authentication System
21

- An authentication system is a 5-tuple (A, C, F, L, S):
  - A: information that proves identity
  - C: information stored on the computer and used to validate authentication information
  - F: complementation functions; each f : A → C
  - L: functions that prove identity
  - S: functions enabling an entity to create or alter information in A or C
Passwords
22

 Sequence of characters
 Examples: 10 digits, a string of letters, etc.
 Generated randomly, by user, by computer with user input

 Sequence of words
 Examples: pass-phrases
 Algorithms
 Examples: challenge-response, one-time passwords
 Entropy vs. memorability
 The more complex a password the harder it is to guess ...
 ... and the harder it is to remember.
 Thus, we write them down.
Storage
23

 Store as cleartext
 If password file compromised, all passwords revealed
 Encipher file
 Need to have decipherment, encipherment keys in memory
 Reduces to previous problem

 Store one-way hash of password


 If file read, attacker must still guess passwords or invert the hash
Password Cracking
24

 Social Engineering
 Password Resetting – surprisingly large!

 Dictionary Attacks – John the Ripper

 Brute Force Attacks

 Key stroke Logging and Sniffing

 Hash chains and Rainbow Tables


One-Time Passwords
25

 Password that can be used exactly once


 After use, it is immediately invalidated
 Challenge-response mechanism
 Challenge is number of authentications; response is password for
that particular number
 Problems
 Synchronization of user and system
 Generation of good random passwords

 Password distribution problem


One-Time Passwords
26

- Generation mechanisms
  - Time-synchronization
    - Using a synchronized time between client and server
    - Example: let tx be the current synchronized time and f(tx) = px. The passwords in the order of use are p1, p2, …, px, …
One-Time Passwords (cont.)
27

- Challenge-response
  - Using a challenge from the server
  - Example: let cn be the current challenge from the server and f(cn) = pn. The passwords in the order of use are p1, p2, …, pn
- Hash chain
  - Using a chain of hash-function applications
  - Example: h is a one-way hash function, p is the OTP, and s is an initial seed:
    h(s) = p1, h(p1) = p2, …, h(pn-1) = pn
    The passwords in the order of use are pn, pn-1, …, p2, p1
Challenge-Response
28

- User and system share a secret function f
  - user → system: request to authenticate
  - system → user: random message r (the challenge)
  - user → system: f(r) (the response)
Hardware Support
29

 Token-based
 Used to compute response to challenge
◼ May encipher or hash challenge
◼ May require PIN from user

 Temporally-based
 Every minute (or so) different number shown
◼ Computer knows what number to expect when
 User enters number and fixed password
Biometrics
30

 Automated measurement of biological, behavioral


features that identify a person
 Fingerprints: optical or electrical techniques
◼ Maps fingerprint into a graph, then compares with database
◼ Measurements imprecise, so approximate matching algorithms used
 Voices: speaker verification or recognition
◼ Verification: uses statistical techniques to test hypothesis that speaker is who
is claimed (speaker dependent)
◼ Recognition: checks content of answers (speaker independent)
Other Characteristics
31

 Can use several other characteristics


 Eyes: patterns in irises unique
◼ Measure patterns, determine if differences are random; or correlate images
using statistical tests
 Faces: image, or specific characteristics like distance from nose to
chin
◼ Lighting, view of face, other noise can hinder this
 Keystroke dynamics: believed to be unique
◼ Keystroke intervals, pressure, duration of stroke, where key is struck
◼ Statistical tests used
Effectiveness of Biometrics
32

 Evaluated on three basic criteria


 False reject rate: Rate at which supplicants (authentic users) are
denied or prevented from accessing authorized areas due to
failure detected by biometric device (Type I error).
 False accept rate: Rate at which supplicants who are not
legitimate users are allowed access to systems or areas due to
failure detected by biometric device (Type II error).
 Crossover error rate: Level at which the number of false
rejections equals the number of false acceptances, (equal error
rate). This is the most common and important overall measure
of the accuracy of biometric systems.
Acceptability of Biometrics
33

- Balance between how acceptable the security system is to users and its effectiveness in maintaining security
  - Many biometric systems that are highly reliable and effective are invasive
  - Many information security professionals, in an effort to avoid confrontation and possible user boycott of biometric controls, do not use them
Authentication: Summary
34

 Authentication is not cryptography


 You have to consider system components
 Passwords are here to stay
 They provide a basis for most forms of authentication
 Protocols are important
 They can make masquerading harder
 Authentication methods can be combined
 Multi-factor
Authorization
35

- Authorization is the function of specifying access rights to resources
- E.g., Human Resources staff are normally authorized to access employee records
- Represented as ACLs
Access Control Matrix
36

- The access control matrix is the simplest framework for describing the rights of users over files in a matrix

          File 1      File 2   File 3      File 4
User 1    R, W, O     R        R, W, X, O  W
User 2    R           R, O     R           R, W, X, O
Access Control List
37

 A variant of the access control matrix


 Store each column with the object it represents

ACL(file 1) = {(user 1, RWO), (user 2, R)}


ACL(file 2) = {(user 1, R), (user 2, RO)}
ACL(file 3) = {(user 1, RWXO), (user 2, R)}
ACL(file 4) = {(user 1, W), (user 2, RWXO)}
Creation and Maintenance of
Access Control List
38

- Which subjects can modify an object’s ACL?
  - Possessors of the “own” right can modify the ACL
- Does the ACL support groups and wildcards?
  - Groups and wildcards are used to limit the size of the ACLs
- Conflicts?
  - When two ACL entries conflict, the conflict is resolved by rules defined in the system
- ACLs and default permissions?
  - If no appropriate ACL entry exists, the default permission is applied
Capabilities
39

 Another variant of the access control matrix


 Store each row with the subject it represents

CAP(user 1) = {(file 1, RWO), (file 2, R), (file 3, RWXO),


(file 4, W)}
CAP(user 2) = {(file 1, R), (file 2, RO), (file 3, R), (file 4,
RWXO)}
ACL vs. Capabilities
40

- Two different questions
  - Given an object, which subjects can access it, and how?
  - Given a subject, which objects can it access, and how?
- ACLs make the first question easy to answer
- Capabilities make the second question easy to answer
- Which question is more important?
ACL vs. Capabilities (cont.)
41

 Authentication
 Given a process that wishes to perform an operation on an object
◼ ACL needs to authenticate the process’s identity
◼ Capabilities do not require authentication, but require unforgeability
 Least Privilege
 Capabilities provide finer grained least privilege control
 Revocation
 ACL can remove a group of users from the list, and those users can
no longer gain access to the object
 Capabilities have no equivalent operation
TROJAN HORSES

 A Trojan Horse is rogue software installed,


perhaps unwittingly, by duly authorized users
 A Trojan Horse does what a user expects it to
do, but in addition exploits the user's
legitimate privileges to cause a security
breach
TROJAN HORSE EXAMPLE

- ACLs: File F: { A: r, w }; File G: { B: r; A: w }
- Principal B cannot read file F
- Principal A executes Program Goodies, which contains a Trojan Horse: the program reads file F (A has read access) and writes its contents to file G (A has write access)
- Principal B can now read the contents of file F copied to file G

Bell-LaPadula security model
45

The Bell-LaPadula (BLP) model is about information


confidentiality, and this model formally represents the long
tradition of attitudes about the flow of information
concerning national secrets.
Classifications and clearances
46

- Unclassified, confidential, secret, top secret
- Information whose disclosure could cost ‘lives’ is marked ‘secret’
- Information whose disclosure could cost ‘many lives’ is marked ‘top secret’
Bell – LaPadula - Details
 Earliest formal model
 Each user subject and information object
has a fixed security class – labels
 Use the notation ≤ to indicate dominance

 Simple Security (ss) property:

the no read-up property


A subject s has read access to an object iff the class of the
subject C(s) is greater than or equal to the class of the object
C(o)
 i.e. Subjects can read Objects iff C(o) ≤ C(s)
Access Control: Bell-LaPadula

- (Diagrams) Reads permitted by the ss-property: a Top Secret subject may read Top Secret objects, a Secret subject Secret objects, and an Unclassified subject Unclassified objects; in general, a subject may read any object at or below its own level.
Bell - LaPadula (2)

- * property (star): the no write-down (NWD) property
  - While a subject has read access to object O, the subject can only write to object P if C(O) ≤ C(P)
  - No process may write data to a lower level
Access Control: Bell-LaPadula

- (Diagrams) Writes permitted by the *-property: a Top Secret subject may write Top Secret objects, a Secret subject Secret objects, and an Unclassified subject Unclassified objects; in general, a subject may write only to objects at or above its own level.


Access Control Models
55
 Discretionary Access Control (DAC)
 Restricting access to objects based on identity of
subjects and/or groups to which they belong
 Mandatory Access Control (MAC)
 Restrict access to objects based on the sensitivity
(as represented by a label) of the information
contained in the objects and the formal
authorization (i.e. clearance) of subjects to access
information of such sensitivity
Access Control Models (cont.)
56

 Role based access control (RBAC)


 Began in 1970s

 To facilitate the security management in multi-user, multi-


application systems
 Minimum requirements:
◼ Associate roles with each individual.
◼ Each role defines a specific set of operations that the
individual acting in that role may perform.
◼ An individual needs to be authenticated, chooses a role
assigned to the individual, and accesses information
according to operations needed for the role.
RBAC
57

- Users: human beings
- Roles: job function (title)
- Permissions: approval of a mode of access
  - Always positive
  - Abstract representation
  - Can apply to a single object or to many

users (U) --User Assignment (UA)--> roles (R) --Permission Assignment (PA)--> permissions (P)
RBAC Family
58

- RBAC3 (consolidated model) combines RBAC1 (role hierarchy) and RBAC2 (constraints), both of which extend RBAC0 (base model)
RBAC Family (cont.)
59

 RBAC0: the base model indicating that it is the


minimum requirement for RBAC
 RBAC1: include RBAC0 and support of role hierarchy
 Inheritance among roles
 Inheritance of permission from junior role (bottom) to
senior role (top)
 RBAC2: include RBAC0 and support of constraints
 Enforces high-level organizational policies, such as
mutually exclusive roles
 RBAC3: combine RBAC1 and RBAC2
Situation-Aware Access Control
60

- The situation-aware access control model incorporates situation-awareness into RBAC
- For example, a user with the role of teacher can create a group discussion only when in the Smart Classroom during class time
CSE345/545 - Winter 2025
Foundations of Computer Security
Overview of Cryptography
Dr. Arun Balaji Buduru
Founding Head, Usable Security Group (USG)
Associate Professor, Dept. of CSE | HCD, IIIT-Delhi, India
Visiting Faculty, Indiana University – Bloomington, USA

Thanks to PK, Kohno, Kurose, Ross and others for sample slides and materials
Cryptography
1

- In Greek means “secret writing”
- An outsider (interceptor/intruder/adversary) can pose the following threats:
  - Block message (affecting availability)
  - Intercept message (affecting secrecy)
  - Modify message (affecting integrity)
  - Fabricate message (affecting integrity)
- The fundamental technique to counter these threats is cryptography


Cryptography (cont.)
2

 Cryptography: Study of mathematical techniques related to


certain aspects of information security, such as confidentiality,
data integrity, entity authentication, and data origin
authentication.
 The basic component of cryptography is a cryptosystem
 Cryptographer: Person working for legitimate sender or receiver.
A cryptographer will use cryptography to convert plaintext
into ciphertext.
 Cryptanalyst: Person working for unauthorized interceptor. A
cryptanalyst will use cryptanalysis to attempt to turn ciphertext
back into plaintext.
 Cryptology: Study of encryption and decryption, including
cryptography and cryptanalysis.
Cryptosystem
3

- A cryptosystem is a 5-tuple (E, D, M, K, C), where M is the set of plaintexts, K is the set of keys, C is the set of ciphertexts, E: M × K → C is the set of enciphering (encryption) functions, and D: C × K → M is the set of deciphering (decryption) functions
- Plaintext M: set of messages in original form
- Ciphertext C: set of messages in encrypted form
Classical Cryptography
4

 Basic techniques for classical ciphers


 Substitution:One letter is exchanged for another
 Transposition: The order of the letters is rearranged

 Classical ciphers
 Mono-alphabetic: Letters of the plaintext alphabet are
mapped into other unique letters
 Poly-alphabetic: Letters of the plaintext alphabet are
mapped into letters of the ciphertext space depending
on their positions in the text
Substitution
5

- Substitute each letter in the plaintext for another one
- Example (monoalphabetic substitution; the Caesar cipher is the special case of a uniform shift):

a b c d e f g h i j k l m n o p q r s t u v w x y z
q e r y u i o p a s d f g w h j k l z x c v b n m t

Plaintext: under attack we need help
Ciphertext: cwyul qxxqrd bu wuuy pufj

[from Stallings, Cryptography & Network Security]

6
Transposition
7

 Change the positions of the characters in the


plaintext
 Example:
message: meet me after the toga party
m e m a t r h t g p r y
e t e f e t e o a a t
Ciphertext: MEMATRHTGPRYETEFETEOAAT
Vigenere Cipher
7-8

- Idea: Uses Caesar’s cipher with various different shifts, in order to hide the distribution of the letters
- A key defines the shift used for each letter in the text
- A key word is repeated as many times as required to cover the whole text

Plain text: I a t t a c k
Key: 2342342 (key is “234”)
Cipher text: K d x v d g m
Problem of Vigenere Cipher
7-9

 Vigenere is easy to break (Kasiski, 1863):


 Assume we know the length of the key. We can organize the
ciphertext in rows with the same length of the key. Then, every
column can be seen as encrypted using Caesar's cipher.

 The length of the key can be found using several methods:


1. If short, try 1, 2, 3, . . . .
2. Find repeated strings in the ciphertext. Their distance is expected to be a
multiple of the length. Compute the gcd of (most) distances.
3. Use the index of coincidence.
Types of Cryptosystems
10

 Symmetric cryptosystems (also called single-key


cryptosystems) are classical cryptosystems:
M = D(K, E(K, M))
 The encryption key and decryption key are the same.
 Asymmetric cryptosystems:
M = D(Kd, E(Ke, M))
 Kd is the decryption key and Ke is the encryption key
 Kd ≠ Ke

 Hash Functions
 No keys
Symmetric Key Cryptography
11

AKA secret key cryptography


AKA conventional cryptography
Secure Key Distribution Strategies for Symmetric
Cryptosystems
12

 A key K can be selected by A to be shared with B, and


K needs to be physically delivered to B
 A third party can select the same key K and physically
deliver K to A and B
 If A and B have previously used a key K’, one party can
transmit the new key K to the other, encrypted using the
old key K’
 If A and B each has an encrypted connection to a third
party C, C can transmit the new key K on the encrypted
links to both A and B
 Any other means?
Symmetric Key Applications
13

 Transmission over insecure channel


 Shared secret (transmitter, receiver)
 Secure storage on insecure media
 Authentication
 Strong authentication: prove knowledge
without revealing key
A simple example
14

- KAB = +3 (Caesar cipher), known by Alice & Bob
- rA = “marco”; rA encrypted with KAB: “pdufr”
- rB = “polo”; rB encrypted with KAB: “sror”
- (“marco”, “pdufr”), (“polo”, “sror”)
Block Ciphers
15

- In a block cipher:
  - Plaintext and ciphertext have fixed block length b (e.g., 128 bits)
  - A plaintext of length n is partitioned into a sequence of m blocks, P[0], …, P[m−1], where n ≤ bm < n + b
- Each message is divided into a sequence of blocks and encrypted or decrypted in terms of its blocks
- (Figure) The plaintext is split into blocks; the last block requires padding with extra bits
Padding
16

 Block ciphers require the length n of the plaintext to be a multiple of the


block size b
 Padding the last block needs to be unambiguous (cannot just add zeroes)
 When the block size and plaintext length are a multiple of 8, a common
padding method (PKCS5) is a sequence of identical bytes, each indicating
the length (in bytes) of the padding
 Example for b = 128 (16 bytes)
 Plaintext: “Roberto” (7 bytes)
 Padded plaintext: “Roberto999999999” (16 bytes), where 9 denotes the
number and not the character
 We need to always pad the last block, which may consist only of padding
Block Ciphers in Practice
17
 Data Encryption Standard (DES)
 Developed by IBM and adopted by NIST in 1977
 64-bit blocks and 56-bit keys
 Small key space makes exhaustive search attack feasible since late 90s
 Triple DES (3DES)
 Nested application of DES with three different keys KA, KB, and KC
 Effective key length is 168 bits, making exhaustive search attacks infeasible
 C = EKC(DKB(EKA(P))); P = DKA(EKB(DKC(C)))
 Equivalent to DES when KA=KB=KC (backward compatible)
 Advanced Encryption Standard (AES)
 Selected by NIST in 2001 through open international competition and public
discussion
 128-bit blocks and several possible key lengths: 128, 192 and 256 bits
 Exhaustive search attack not currently possible
 AES-256 is the symmetric encryption algorithm of choice
Symmetric key crypto: DES
7-18

DES operation
initial permutation
16 identical “rounds” of
function application,
each using different 48
bits of key
final permutation
The Advanced Encryption Standard (AES)
19
 In 1997, the U.S. National Institute for Standards and Technology (NIST)
put out a public call for a replacement to DES.
 It narrowed down the list of submissions to five finalists, and ultimately
chose an algorithm that is now known as the Advanced Encryption
Standard (AES).
 AES is a block cipher that operates on 128-bit blocks. It is designed to
be used with keys that are 128, 192, or 256 bits long, yielding ciphers
known as AES-128, AES-192, and AES-256.
20

AES Round Structure


 The 128-bit version of the AES
encryption algorithm proceeds in
ten rounds.
 Each round performs an invertible
transformation on a 128-bit array,
called state.
 The initial state X0 is the XOR of the
plaintext P with the key K:
X0 = P XOR K.
 Round i (i = 1, …, 10) receives
state Xi-1 as input and produces
state Xi.
 The ciphertext C is the output of the
final round: C = X10.
AES Rounds
21

 Each round is built from four basic steps:


1. SubBytes step: an S-box substitution step
2. ShiftRows step: a permutation step
3. MixColumns step: a matrix multiplication step
4. AddRoundKey step: an XOR step with a round key
derived from the 128-bit encryption key
Block Cipher Modes
22  A block cipher mode describes the way a block cipher
encrypts and decrypts a sequence of message blocks.
 Electronic Code Book (ECB) Mode (is the simplest):
 Block P[i] encrypted into ciphertext block C[i] = EK(P[i])
 Block C[i] decrypted into plaintext block M[i] = DK(C[i])
Strengths and Weaknesses of ECB
23

 Strengths:
 Is very simple
 Allows for parallel encryption of the blocks of a plaintext
 Can tolerate the loss or damage of a block
 Weaknesses:
 Documents and images are not suitable for ECB encryption since patterns in the plaintext are repeated in the ciphertext
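The pattern leak is easy to demonstrate. The sketch below uses a toy "block cipher" (XOR with the key) as a stand-in for a real one such as AES; the point is only that ECB encrypts equal plaintext blocks to equal ciphertext blocks:

```python
def toy_encrypt_block(block: bytes, key: bytes) -> bytes:
    # Stand-in for a real block cipher EK (NOT secure; illustration only).
    return bytes(b ^ k for b, k in zip(block, key))

def ecb_encrypt(plaintext: bytes, key: bytes, b: int = 8) -> bytes:
    # ECB: each block is encrypted independently, C[i] = EK(P[i]).
    return b"".join(
        toy_encrypt_block(plaintext[i:i + b], key)
        for i in range(0, len(plaintext), b)
    )

ct = ecb_encrypt(b"ATTACK!!ATTACK!!", b"secret!!")
print(ct[:8] == ct[8:16])  # True: repeated plaintext blocks show up in the ciphertext
```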
Cipher Block Chaining (CBC) Mode
24
 In Cipher Block Chaining (CBC) Mode
 The previous ciphertext block is combined with the current plaintext block: C[i] = EK(C[i−1] ⊕ P[i])
 C[−1] = V, a random block separately transmitted encrypted (known as the initialization vector)
 Decryption: P[i] = C[i−1] ⊕ DK(C[i])
[Diagram: CBC encryption and decryption. Each plaintext block P[i] is XORed with the previous ciphertext block (V for the first block) before applying EK; decryption applies DK to each C[i] and XORs the result with the previous ciphertext block.]
Strengths and Weaknesses of CBC
25

 Strengths:
 Doesn’t show patterns in the plaintext
 Is the most common mode
 Is fast and relatively simple
 Weaknesses:
 CBC requires the reliable transmission of all the blocks sequentially
 CBC is not suitable for applications that allow packet losses (e.g., music and video streaming)
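The chaining rule C[i] = EK(C[i−1] ⊕ P[i]) can be sketched with a toy XOR "block cipher" standing in for EK and DK (XOR is its own inverse, which a real cipher is not); unlike ECB, equal plaintext blocks no longer produce equal ciphertext blocks:

```python
def toy_block(block: bytes, key: bytes) -> bytes:
    # Stand-in for EK / DK (NOT secure; illustration only).
    return bytes(b ^ k for b, k in zip(block, key))

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(pt: bytes, key: bytes, iv: bytes, b: int = 8) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(pt), b):
        prev = toy_block(xor(pt[i:i + b], prev), key)  # C[i] = EK(P[i] XOR C[i-1])
        out += prev
    return out

def cbc_decrypt(ct: bytes, key: bytes, iv: bytes, b: int = 8) -> bytes:
    prev, out = iv, b""
    for i in range(0, len(ct), b):
        out += xor(toy_block(ct[i:i + b], key), prev)  # P[i] = DK(C[i]) XOR C[i-1]
        prev = ct[i:i + b]
    return out

key, iv = b"secret!!", b"initvec!"
ct = cbc_encrypt(b"ATTACK!!ATTACK!!", key, iv)
print(ct[:8] != ct[8:16])        # True: the repeated block no longer repeats
print(cbc_decrypt(ct, key, iv))  # b'ATTACK!!ATTACK!!'
```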
Stream Cipher
26
 Key stream
 Pseudo-random sequence of bits S = S[0], S[1], S[2], …
 Can be generated on-line one bit (or byte) at the time
 Stream cipher
 XOR the plaintext with the key stream: C[i] = S[i] ⊕ P[i]
 Suitable for plaintext of arbitrary length generated on the fly, e.g., media
stream
 Synchronous stream cipher
 Key stream obtained only from the secret key K
◼ Independent of the plaintext and ciphertext
 Works for high-error channels if plaintext has packets with sequence numbers
 Sender and receiver must synchronize in using key stream
 If a digit is corrupted in transmission, only a single digit in the plaintext is
affected and the error does not propagate to other parts of the message.
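A sketch of a synchronous stream cipher: the keystream here comes from Python's (non-cryptographic) PRNG seeded with the key; a real design would use a cryptographic generator instead.

```python
import random

def keystream(key: int, n: int) -> bytes:
    # Keystream derived only from the secret key (synchronous stream cipher).
    rng = random.Random(key)
    return bytes(rng.randrange(256) for _ in range(n))

def stream_xor(data: bytes, key: int) -> bytes:
    # C[i] = S[i] XOR P[i]; applying it twice restores the plaintext.
    return bytes(d ^ s for d, s in zip(data, keystream(key, len(data))))

ct = stream_xor(b"streaming media", key=42)
print(stream_xor(ct, key=42))  # b'streaming media'
```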
Stream Cipher
27

 Self-synchronizing stream cipher


 Key stream obtained from the secret key and N previous
ciphertexts
 the receiver will automatically synchronize with the keystream
generator after receiving N ciphertext digits, making it easier to
recover if digits are dropped or added to the message stream.
 Lost packets cause a delay of N steps before decryption resumes

 Single-digit errors are limited in their effect, affecting only up to


N plaintext digits.
Key Stream Generation
28

 RC4
 Designed in 1987 by Ron Rivest for RSA Security
 Trade secret until 1994
 Uses keys with up to 2,048 bits
 Simple algorithm
 Block cipher in counter mode (CTR)
 Use a block cipher with block size b
 The secret key is a pair (K,t), where K is key and t (counter) is a b-
bit value
 The key stream is the concatenation of ciphertexts
EK (t), EK (t + 1), EK (t + 2), …
 Can use a shorter counter concatenated with a random value
 Synchronous stream cipher
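A sketch of CTR-mode keystream generation, with a toy 64-bit "block cipher" standing in for EK (illustration only, not a real cipher): the keystream is the concatenation EK(t), EK(t+1), EK(t+2), …

```python
def toy_ek(block: int, key: int) -> int:
    # Stand-in for a real 64-bit block cipher EK (NOT secure).
    return (block * 0x9E3779B97F4A7C15 ^ key) & 0xFFFFFFFFFFFFFFFF

def ctr_keystream(key: int, t: int, nbytes: int) -> bytes:
    # Concatenate EK(t), EK(t+1), ... and truncate to the needed length.
    out, ctr = b"", t
    while len(out) < nbytes:
        out += toy_ek(ctr, key).to_bytes(8, "big")
        ctr += 1
    return out[:nbytes]

def ctr_crypt(data: bytes, key: int, t: int) -> bytes:
    s = ctr_keystream(key, t, len(data))
    return bytes(d ^ k for d, k in zip(data, s))

ct = ctr_crypt(b"counter mode demo", key=1234, t=7)
print(ctr_crypt(ct, key=1234, t=7))  # b'counter mode demo'
```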
Hash Functions
29

 A hash function h maps a plaintext P to a fixed-length value x = h(P), called the hash value or digest of P
 Usually x is much smaller in size compared to P.
A collision is a pair of plaintexts P and Q that map to the
same hash value, h(P) = h(Q)
 Collisions are unavoidable
 For efficiency, the computation of the hash function should
take time proportional to the length of the input plaintext
Cryptographic Hash Functions
30

 A cryptographic hash function satisfies additional properties


 Preimage resistance (aka one-way)
◼ Given a hash value x, it is hard to find a plaintext P such that h(P) = x
 Second preimage resistance (aka weak collision resistance)
◼ Given a plaintext P, it is hard to find a plaintext Q such that h(Q) =
h(P)
 Collision resistance (aka strong collision resistance)
◼ It is hard to find a pair of plaintexts P and Q such that h(Q) = h(P)
 Collision resistance implies second preimage resistance
 Hash values of at least 256 bits recommended to defend
against brute-force attacks
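Python's standard-library hashlib illustrates the fixed-length digest and the fact that closely related inputs map to unrelated digests (SHA-256 meets the 256-bit recommendation above):

```python
import hashlib

def digest(msg: bytes) -> str:
    # Fixed-length 256-bit digest, regardless of the input length.
    return hashlib.sha256(msg).hexdigest()

print(len(digest(b"attack at dawn")))                          # 64 hex chars = 256 bits
print(digest(b"attack at dawn") == digest(b"attack at dusk"))  # False
```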
7-31

Hash Function Algorithms


 MD5 hash function widely used (RFC 1321)
 computes 128-bit message digest in 4-step process.
 given an arbitrary 128-bit string x, it appears difficult to construct a msg m whose MD5 hash is equal to x.
 SHA-1 is also used.
 US standard [NIST, FIPS PUB 180-1]
 160-bit message digest

 There are many hash functions, but most of them do


not satisfy cryptographic hash function requirements
 example: checksum
Message-Digest Algorithm 5 (MD5)
32

 Developed by Ron Rivest in 1991


 Uses 128-bit hash values
 Still widely used in legacy applications although considered
insecure
 Various severe vulnerabilities discovered
 Chosen-prefix collisions attacks found by Marc Stevens, Arjen
Lenstra and Benne de Weger
 Start with two arbitrary plaintexts P and Q
 One can compute suffixes S1 and S2 such that P||S1 and Q||S2 collide under MD5 by making about 2^50 hash evaluations
 Using this approach, a pair of different executable files or PDF
documents with the same MD5 hash can be computed
Secure Hash Algorithm (SHA)
33

 Developed by NSA and approved as a federal standard by NIST


 SHA-0 and SHA-1 (1993)
 160-bits
 Considered insecure
 Still found in legacy applications
 Vulnerabilities less severe than those of MD5
 SHA-2 family (2002)
 256 bits (SHA-256) or 512 bits (SHA-512)
 Still considered secure despite published attack techniques
 Public competition for SHA-3 announced in 2007
Iterated Hash Function
34
 A compression function works on input values of fixed length
 An iterated hash function extends a compression function to inputs of
arbitrary length
 padding, initialization vector, and chain of compression functions
 inherits collision resistance of compression function
 MD5 and SHA are iterated hash functions
[Diagram: iterated hash. The IV and padded plaintext blocks P1, P2, P3, P4 pass through a chain of compression functions; the output of the final stage is the digest.]
[Plot: hashing time in msec versus input size in bytes (0 to 1000) for SHA-1 and MD5.]
Cryptographic Hash Lifecycle
35

http://valerieaurora.org/hash.html

[via http://www.schneier.com/blog/archives/2011/06/the_life_cycle.html]
Birthday Attack
36

 The brute-force birthday attack aims at finding a collision for a hash function h
 Randomly generate a sequence of plaintexts X1, X2, X3,…
 For each Xi compute yi = h(Xi) and test whether yi = yj for some j < i
 Stop as soon as a collision has been found
 If there are m possible hash values, the probability that the i-th plaintext does not collide with any of the previous i − 1 plaintexts is 1 − (i − 1)/m
 The probability Fk that the attack fails (no collisions) after k plaintexts is
Fk = (1 − 1/m)(1 − 2/m)(1 − 3/m) … (1 − (k − 1)/m)
 Using the standard approximation 1 − x ≈ e^(−x)
Fk ≈ e^(−(1/m + 2/m + 3/m + … + (k−1)/m)) = e^(−k(k−1)/(2m))
 The attack succeeds/fails with probability ½ when Fk = ½, that is,
e^(−k(k−1)/(2m)) = ½
k ≈ 1.17 m^(1/2)
 We conclude that a hash function with b-bit values provides about b/2 bits of security
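The attack is easy to simulate on a deliberately truncated hash. The sketch below truncates SHA-256 to 16 bits (so m = 2^16), and a collision should appear after roughly 1.17 · 2^8 ≈ 300 plaintexts rather than the ~2^16 a brute-force preimage search would need:

```python
import hashlib

def birthday_attack(bits: int = 16):
    # Generate plaintexts X1, X2, ... and stop at the first collision
    # of the hash truncated to `bits` bits.
    seen = {}
    i = 0
    while True:
        p = str(i).encode()
        y = hashlib.sha256(p).digest()[: bits // 8]
        if y in seen:
            return seen[y], p, i + 1  # colliding pair and number of tries
        seen[y] = p
        i += 1

p, q, tries = birthday_attack(16)
print(p != q, tries)  # True, with tries on the order of a few hundred
```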
Public Key Cryptography
37

AKA asymmetric cryptography


AKA unconventional cryptography (?)

Public key: published, ideally known widely


Private key (NOT “secret key”): not published
Public key cryptography

K+B : Bob’s public key
K−B : Bob’s private key

plaintext P → encryption algorithm → ciphertext C = EK+B(P) → decryption algorithm → plaintext P = DK−B(C)
Facts About Numbers
39

 Prime number p:
 p is an integer
 p ≥ 2
 The only divisors of p are 1 and p
 Examples
 2, 7, 19 are primes
 −3, 0, 1, 6 are not primes
 Prime decomposition of a positive integer n:
n = p1^e1 × … × pk^ek
 Example:
 200 = 2^3 × 5^2
Fundamental Theorem of Arithmetic
The prime decomposition of a positive integer is unique
Greatest Common Divisor
40

 The greatest common divisor (GCD) of two positive integers a and b,


denoted gcd(a, b), is the largest positive integer that divides both a
and b
 The above definition is extended to arbitrary integers
 Examples:
gcd(18, 30) = 6 gcd(0, 20) = 20
gcd(−21, 49) = 7
 Two integers a and b are said to be relatively prime if
gcd(a, b) = 1
 Example:
 Integers 15 and 28 are relatively prime
Modular Arithmetic
41
 Modulo operator for a positive integer n
r = a mod n
equivalent to
a = r + kn
and
r = a − ⌊a/n⌋·n
 Example:
29 mod 13 = 3  13 mod 13 = 0  −1 mod 13 = 12
29 = 3 + 2·13  13 = 0 + 1·13  12 = −1 + 1·13
For a<0, we first add a large kn to a such that it becomes positive
 Modulo and GCD:
gcd(a, b) = gcd(b, a mod b)
 Example:
gcd(21, 12) = 3 gcd(12, 21 mod 12) = gcd(12, 9) = 3
42

Euclid’s GCD Algorithm


 Euclid’s algorithm for Algorithm EuclidGCD(a, b)
Input integers a and b
computing the GCD Output gcd(a, b)
repeatedly applies the
formula if b = 0
return a
gcd(a, b) = gcd(b, a mod b)
else
 Example return EuclidGCD(b, a mod b)
 gcd(412, 260) = 4

a 412 260 152 108 44 20 4


b 260 152 108 44 20 4 0
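The algorithm translates directly to Python (an iterative form of the recursion above):

```python
def euclid_gcd(a: int, b: int) -> int:
    # Repeatedly apply gcd(a, b) = gcd(b, a mod b) until b = 0.
    while b != 0:
        a, b = b, a % b
    return a

print(euclid_gcd(412, 260))  # 4
print(euclid_gcd(18, 30))    # 6
print(euclid_gcd(0, 20))     # 20
print(euclid_gcd(15, 28))    # 1 (relatively prime)
```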
RSA: Choosing keys
7-43

1. Choose two large prime numbers p, q (e.g., 1024 bits each)
2. Compute n = pq, z = (p−1)(q−1)
3. Choose e (with e < n) that has no common factors with z (e, z are “relatively prime”)
4. Choose d such that ed − 1 is exactly divisible by z (in other words: ed mod z = 1)
5. Public key K+B is (n, e). Private key K−B is (n, d).
RSA: Encryption, decryption
7-44

0. Given (n, e) and (n, d) as computed above
1. To encrypt bit pattern m, compute
   c = m^e mod n (i.e., the remainder when m^e is divided by n)
2. To decrypt received bit pattern c, compute
   m = c^d mod n (i.e., the remainder when c^d is divided by n)
Magic happens: m = (m^e mod n)^d mod n
RSA example:
45

Bob chooses p=5, q=7. Then n=35, z=24.
e=5 (so e, z relatively prime).
d=29 (so ed−1 exactly divisible by z).

encrypt: letter “l”, m = 12, m^e = 248832, c = m^e mod n = 17
decrypt: c = 17, c^d = 481968572106750915091411825223071697, m = c^d mod n = 12, letter “l”

Computationally expensive!
RSA: Why is m = (m^e mod n)^d mod n?

Useful number theory result: If p, q prime and n = pq, then:
x^y mod n = x^(y mod (p−1)(q−1)) mod n

(m^e mod n)^d mod n = m^(ed) mod n
= m^(ed mod (p−1)(q−1)) mod n (using the number theory result above)
= m^1 mod n (since we chose e and d such that ed mod (p−1)(q−1) = 1)
= m
RSA: another important property
7-47

The following property will be very useful later:

use the public key first, followed by the private key, or
use the private key first, followed by the public key:

Result is the same!
RSA Cryptosystem
48

 Setup: Example
n = pq, with p and q primes ◼ Setup:
 e relatively prime to  p = 7, q = 17
(n) = (p − 1) (q − 1)  n = 717 = 119
 d inverse of e in Z(n)  (n) = 616 = 96
◼ ed mod z = 1 e=5
 Keys:  d = 77
 Public key: KE = (n, e) ◼ Keys:
 Private key: KD = d  public key: (119, 5)
 private key: 77
 Encryption:
◼ Encryption:
 Plaintext M in Zn
 M = 19
 C = Me mod n  C = 195 mod 119 = 66
 Decryption: ◼ Decryption:
M = Cd mod n  C = 6677 mod 119 = 19
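The whole scheme fits in a few lines of Python: `pow(e, -1, phi)` (Python 3.8+) computes the modular inverse, and the numbers reproduce the example above (function names are illustrative):

```python
def rsa_keys(p: int, q: int, e: int):
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)        # modular inverse: e*d mod phi == 1
    return (n, e), d           # public key, private exponent

def rsa_encrypt(m: int, pub) -> int:
    n, e = pub
    return pow(m, e, n)        # C = M^e mod n

def rsa_decrypt(c: int, n: int, d: int) -> int:
    return pow(c, d, n)        # M = C^d mod n

pub, d = rsa_keys(7, 17, 5)
print(pub, d)                  # (119, 5) 77
print(rsa_encrypt(19, pub))    # 66
print(rsa_decrypt(66, 119, 77))  # 19
```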
Digital Signatures
49

Asymmetry:
Signature can only be generated by owner/knower of private key
Signature can be verified by anyone via public key

Non-repudiation:
Sender cannot credibly deny having sent the message (signature)
Caveat: the private key may have been stolen
Public Key Distribution and Authentication
50

 Using the “right” Public Key:


 Must be authentic, not necessarily secret
 Obtaining the “right” Public Key:
 Directly from its owner
 Indirectly, in a signed message from a Certification
Authority (CA):
◼ A Certificate is a digitally signed message from
a CA binding a public key to a name
◼ Certificates can be passed around, or managed
in directories
◼ Protocols for certificate generation: e.g. X.509
(RFC 2459), SPKI/SDSI
Public Key Cryptography Issues
51

 Efficiency
 Public key cryptographic algorithms are orders of magnitude slower than symmetric key algorithms
 Hybrid model
 Public key used to establish a temporary shared key
 Symmetric key used for the remainder of the communication
Computational Security
52

 An encryption scheme is computationally secure if it


takes exponentially long time to break the ciphertext.
 Lifetime of a cryptosystem: The minimum time for
unauthorized decoding of encrypted message
 Defined for each application
◼ Examples:
◼ Military orders = 1 hour to 3 years
◼ Check transactions = 1 year
◼ Business agreements = 10-15 years
Quantum Cryptography
53

 Quantum cryptography uses quantum mechanical effects (in


particular quantum communication and quantum computation) to
perform cryptographic tasks or to break cryptographic systems
 Quantum communication (or qubit-communication)
◼ Example: The parties can use exchange of photons through an optical fiber to
transmit data
 Quantum computation
◼ In a general computational state model, there are two definite states (0 or 1),
whereas quantum computation uses qubits (quantum bits) which can be a
superposition [0 and/or 1] of both the states
 Quantum mechanics
◼ The body of scientific principles that explains the behavior of matter and its
interactions with energy on the small scale of atoms and subatomic particles.
Quantum Cryptography (Cont.)
54
 Quantum cryptography uses
◼ a quantum mechanical property of an electron existing partly in all its
theoretically possible states simultaneously; but when measured or
observed, gives a result corresponding to only one of the possible
configurations
◼ transmission of information in quantum states, to implement a
communication system that detects eavesdropping.
 Quantum key distribution (QKD)* describes the process to
establish a shared key between two parties which include
encoding the bits of the key as quantum states and transmitting
them. If eavesdropper tries to learn these bits, the messages will
be disturbed and can be easily detected due to the above
quantum mechanical property
*http://en.wikipedia.org/wiki/Quantum_key_distribution
Quantum Cryptography (Cont.)
55
 Major advantages:
A key that is guaranteed to be secure can be produced,
under realistic constraints
 It allows the completion of various cryptographic tasks
which are shown or conjectured to be impossible using only
classical cryptographic techniques (example)
 Major limitation
 Quantum cryptography can only provide 1:1 connection
Quantum Cryptography (Cont.)
56

 Protocols for Quantum Key Exchange,


 BB84 protocol: Charles H. Bennett and Gilles Brassard
 E91 protocol: Artur Ekert
 Some of the Quantum Key distribution networks,
 DARPA
 SECOQC
 SwissQuantum
 Tokyo QKD Network
 Los Alamos National Labs
 The major advantage of quantum key distribution is its ability to
detect any interception of the key
http://en.wikipedia.org/wiki/Quantum_cryptography
http://en.wikipedia.org/wiki/Quantum_key_distribution
Steganography
57

⚫ In Greek, steganography means “covered writing”


⚫ The art of hiding information in ways that prevent
the detection of hidden messages.
⚫ Steganography and cryptography are cousins in
the spy craft family
⚫ Different goals:
• Cryptography: conceal the content of the messages
• Steganography: conceal the existence of the
messages
Steganography (cont.)
58

⚫ What to hide
⚫ Texts
⚫ Images
⚫ Sound
⚫ ……
⚫ How to hide
– embed text in text/images/audio/video files
– embed image in text/images/audio/video files
– embed sound in text/images/audio/video files
A Real Steganographic Example
59
 During WWI the following cipher message was
actually sent by a German spy
 “Apparently neutral’s protest is thoroughly discounted
and ignored. Isman hard hit. Blockade issue affects
pretext for embargo on by-products, ejecting suets and
vegetable oils”
 Hidden Message
 “Pershing sails from NY June 1”
 How to extract the hidden message from the sent
message?
A Steganographic System
60
CSE345/545 - Winter 2025
Network Basics and Security Concerns

Dr. Arun Balaji Buduru


Founding Head, Usable Security Group (USG)
Associate Professor, Dept. of CSE | HCD, IIIT-Delhi, India
Visiting Faculty, Indiana University – Bloomington, USA
OSI Network Model
1
Encapsulation
2

 Each protocol has its own “envelope”


 each protocol attaches its header to the packet
 so we have a protocol wrapped inside another protocol

 each layer of header contains a protocol demultiplexing field to identify the “packet handler” at the next layer up, e.g.,
◼ protocol number
◼ port number
IP Addressing: Introduction
3
IPv4 Addressing
4
NAT
5
IPv6
6
 Initial motivation: 32-bit address space exhaustion
 Additional motivation:
 header format helps speed processing/forwarding
 fixed-length 40 byte header (0.06% overhead)
 header checksum: removed entirely to reduce processing time at each
hop
 options: allowed, but outside of header, indicated by “next header”
field
 header changes to facilitate QoS:
 priority: identify priority among datagrams in flow (ToS bit)
 flow label: identify datagrams in the same “flow” (concept of “flow” not
well defined, originally these were “reserved” bits)
 Next header identifies “upper layer” protocol or IPv6 options:
 hop-by-hop option, destination option, routing, fragmentation,
authentication, encryption
IPv6 Addresses
7

 What does an IPv6 address look like?


 128 bits written as 8 16-bit integers separated by ’:’
 each 16 bit integer is represented by 4 hex digits

 Example: FEDC:BA98:7654:3210:FEDC:BA98:7654:3210

 Abbreviations:
 actual: 1080:0000:0000:0000:0008:0800:200C:417A
 skip 0’s: 1080:0:0:0:8:800:200C:417A

 double ’::’: 1080::8:800:200C:417A
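Python's standard-library ipaddress module applies these abbreviation rules automatically (it also lowercases the hex digits):

```python
import ipaddress

addr = ipaddress.IPv6Address("1080:0000:0000:0000:0008:0800:200C:417A")
print(addr)           # 1080::8:800:200c:417a  (zeros skipped, '::' compression)
print(addr.exploded)  # 1080:0000:0000:0000:0008:0800:200c:417a
```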


DNS Design Points
8

 DNS serves a core Internet function


 hosts, routers, and name servers communicate to resolve names (name-to-address translation)
 complexity at network’s “edge”

 Why not centralize DNS?


 single point of failure
 traffic volume
 performance: distant centralized database
 maintenance
 doesn’t scale!
 DNS can be “exploited” for server load balancing
DNS Caching
9

 Once a (any) name server learns mapping, it caches


mapping
 to reduce latency in DNS translation
 Cache entries timeout (disappear) after some time (TTL)
 TTL assigned by the authoritative server responsible for the
host name
 Local name servers typically also cache
 TLD name servers to reduce visits to root name servers
 all other name server referrals

 both positive and negative results


Security Issues in TCP/IP
10

 There are a number of serious security flaws inherent in


the protocols, regardless of the correctness of any
implementations
 There are variety of attacks based on these flaws, some
of them are as follows,
 Sequence number spoofing
 Routing attacks

 Authentication attacks
TCP Sequence Number Prediction
11

 The normal TCP connection establishment sequence


involves a 3-way handshake.
 The exchange may be shown schematically as follows:
C→S:SYN(ISNC)
S→C:SYN(ISNS), ACK(ISNC)
C→S:ACK(ISNS)
C→S:data
and/or
S→C:data
TCP Sequence Number Prediction
12

 Suppose, that there was a way for an intruder X to


predict ISNS
 In that case, it could send the following sequence to
impersonate trusted host T:
X→S:SYN(ISNX ) , SRC = T
S→T:SYN(ISNS ) , ACK(ISNX )
X→S:ACK(ISNS ) , SRC = T
X→S:ACK(ISNS ) , SRC = T, nasty − data
TCP Sequence Number Prediction
13

How, then, to predict the random ISN?


 If the initial sequence number variable is incremented by

a constant amount once per second, one can initiate a


legitimate connection to observe the ISNS and calculate,
with a high degree of confidence, ISNS′ used on the next
connection attempt
 The TCP specification requires that this variable be

incremented approximately 250,000 times per second


 Defense is due to high refresh rate
Routing attacks
14

 A number of the attacks described below can also be


used to accomplish denial of service by confusing the
routing tables on a host or gateway.
 Some of them are listed below,
 Source Routing
 Routing Information Protocol Attacks

 Exterior Gateway Protocol


Security Issues in IP
15

 source spoofing
 replay of packets
 no data integrity or confidentiality
• enabling DoS attacks, replay attacks, spying, and more…

Fundamental Issue:
Networks are not (and will never be) fully secure
Goals of IPSec
16

 to verify sources of IP packets


 authentication

 to prevent replaying of old packets


 to protect integrity and/or confidentiality of packets
 data Integrity/Data Encryption
Secure

Insecure
IPSec Architecture
17

ESP AH

Encapsulating Security Authentication Header


Payload
IPSec Security Policy

IKE

The Internet Key Exchange


IPSec Architecture
18

 IPSec provides security in three situations:


 Host-to-host, host-to-gateway and gateway-to-gateway
 IPSec operates in two modes:
 Transportmode (for end-to-end)
 Tunnel mode (for VPN)

Transport Mode

Router Router

Tunnel Mode
IPSec
19

 A collection of protocols (RFC 2401)


 Authentication Header (AH)
◼ RFC 2402
 Encapsulating Security Payload (ESP)
◼ RFC 2406
 Internet Key Exchange (IKE)
◼ RFC 2409
 IP Payload Compression (IPcomp)
◼ RFC 3137
ESP Packet Details
20

[Diagram: ESP packet layout. IP header; SPI; sequence number; initialization vector; TCP header and data (the encrypted TCP packet); pad, pad length, next header; authentication data. The fields from the SPI through the pad/next-header trailer are authenticated; the TCP packet and trailer are encrypted.]
How It Works
21

 IKE operates in two phases


 Phase 1: negotiate and establish an auxiliary end-to-end secure
channel
◼ Used by subsequent phase 2 negotiations
◼ Only established once between two end points!
 Phase 2: negotiate and establish custom secure channels
◼ Occurs multiple times
 Both phases use Diffie-Hellman key exchange to establish a
shared key
22

Firewalls
Firewalls
23

 Two primary types of firewalls are


 packet filtering firewalls
 proxy-server firewalls

 Sometimes both are employed to protect a network


 With a proxy-server based firewall, all network traffic
in a host is routed through the proxy server
 Packet filtering firewalls, on the other hand, take
advantage of the fact that direct support for TCP/IP is
built into the kernels of all major operating systems now
Firewalls
24

 In Linux, a packet filtering firewall is configured with the


Iptables modules.
 In a Windows machine, graphical interfaces are
provided through the Control Panel
 The latest packet filtering framework in Linux is known as
nftables.
 Meant as a more modern replacement for iptables, nftables
was merged into the Linux kernel mainline
Firewalls
25

 Iptables supports four tables: filter, mangle, nat, and


raw
Firewall Implementations
26
Firewall Implementations
27
Firewall Implementations
28
Firewall Implementations
29
CSE345/545 - Winter 2025
Protecting Privacy

Dr. Arun Balaji Buduru


Founding Head, Usable Security Group (USG)
Associate Professor, Dept. of CSE | HCD, IIIT-Delhi, India
Visiting Faculty, Indiana University – Bloomington, USA

Thanks to Yau, Boneh and others for materials


Privacy
1

 Ability of individuals or group to prevent their


personal information from being known to people
other than those the owners give the information
to.
 One of hottest topics in information assurance
due to increasing capabilities of IT technology:
 Collect information on individuals
 Combine facts from separate sources, and
merge them with other information; resulting in
various databases of private information
Basic Privacy Principles
2

 Lawfulness and fairness


 Necessity of data collection and processing
 Specification and binding
◼ No "non-sensitive" data
 Transparency
◼ Data subject´s right to information correction, erasure or
blocking of incorrect/illegally stored data
 Supervision by independent data protection
authority and sanctions
 Adequate organizational and technical safeguards
Computer Forensics vs. Privacy Protection
3

 Computer forensics focuses on finding hidden information


 Privacy protection concerns with individual’s right to hide
certain personal information
 The kind of information that can be collected under certain situations is usually limited and controlled by the Constitution or legislation
 In US, the most important high-level document that defines
this limitation is the Fourth Amendment to the Constitution.
After 9/11, PATRIOT Act gave federal authorities much
wider latitude in monitoring Internet
 In India, a recent Supreme Court verdict has elevated individual privacy to a fundamental right
◼ In law, the ITA (Information Technology Act) 2000 governs the privacy of individuals/entities over the internet
 GDPR [EU]
What Can Be Collected and When?
(Cont.)
4

 Most Privacy laws generally prohibit opening, accessing, or


viewing information from closed containers without a warrant.
If investigators identify that the suspect has a right to privacy, they should consider securing a warrant.
 Individuals may lose their right to privacy when transferring
data to a third party, and that their right to privacy does not
extend to searches conducted by private parties who are not
acting on behalf of government.
 If someone took his personal computer to a repair shop,
and the technician there noticed illegal content on the
system, the repair shop is mandated to notify the
authorities
What Can Be Collected and When?
5

 Two types of searches


Warranted: Investigator obtained explicit
authorization (warrant) from proper
authorities
Warrantless: Investigator has implicit
authorization (warrantless) from
probable cause or otherwise to conduct
the search
What Can Be Collected and When? (Cont.)
6

 Warrantless searches happen only when


 The suspect has lost his/her right to privacy
 Consent is given by the owner:
◼ An employer normally has full authority to
search corporate data systems because
employee signed certain agreement before
using those systems.
◼ Scope of the consent and who gave the
consent can both be complex legal issues.
Privacy Protection
7
 Privacy and data protection laws promoted
by government
 Self-regulation for fair information practices
by codes of conducts promoted by businesses
 Privacy-enhancing technologies (PETs)
adopted by individuals
 Privacy education of consumers and IT
professionals
Threats to Privacy
8

 Application level
 Threats to collection/transmission of large quantities of personal data
 Applications, such as research involving population studies, electronic
commerce, distance learning
 Communication level
 Threats to anonymity of sender / forwarder / receiver
 Threats to anonymity of service provider
 Threats to privacy of communication, such as via monitoring / logging
of transactional data: Extraction of user profiles & its long-term
storage
 System level
 For example, threats due to attacks on system in order to gain access
to its data
 Audit trails
Threats to Privacy (cont.)
9
 Identity theft – the most serious crime against privacy
 Aggregation and data mining
 Poor system security
 Government threats
◼ Taxes / homeland security / etc.
◼ People’s privacy vs. homeland security concerns
 The Internet as privacy threat
◼ Unencrypted e-mail/web surfing/attacks
 Corporate rights and private business
◼ Companies may collect certain data
 Privacy for sale - many traps
◼ “Free” is not free, such as frequent-buyer cards reducing your
privacy
Privacy Practices in E-Commerce
10

 The five privacy practices that all


companies engaged in e-commerce are
recommended to observe are
1. Notice/awareness
◼In general, websites should clearly inform
users how it collects and handles user
information
Privacy Practices in E-Commerce (cont.)
11

◼ Essential notifications
◼ Identification of the entity collecting the data
◼ Identification of the uses to which the data will be put
◼ Identification of any potential recipients of the data
◼ The nature of the data collected and the means by which
it is collected
◼ Whether the provision of the requested data is voluntary
or required, and the consequences of a refusal to provide
the requested information
◼ The steps taken by the data collector to ensure the
confidentiality, integrity and quality of the data
Privacy Practices in E-Commerce (cont.)
12
2. Choice/consent
◼ Websites must give consumers options as to how any
personal information collected from them may be
used
◼ Two traditional types of choice/consent
◼ Opt-in requires affirmative steps by the consumers
to allow the collection and/or use of information
◼ Opt-out requires affirmative steps to disallow the
collection and/or use of such information.
Privacy Practices in E-Commerce (cont.)
13

3. Access/participation
◼ User would be able to review, correct, and in some cases
delete personal information on a particular website.
◼ Access must encompass
◼ timely and inexpensive access to data
◼ simple means for contesting inaccurate or incomplete
data
◼ mechanism by which the data collector can verify the
information
◼ means by which corrections and/or consumer objections
can be added to the data file and sent to all data
recipients.
Privacy Practices in E-Commerce (cont.)
14

4. Security/integrity
◼Websites must use both managerial and
technical measures to protect against
loss and the unauthorized access,
destruction, use, or disclosure of the
data.
Privacy Practices in E-Commerce (cont.)
15
5. Enforcement/Redress
◼ Mechanisms to enforce all above privacy principles.
◼ Self-Regulation: Mechanisms to ensure compliance
(enforcement) and appropriate means of recourse by
injured parties (redress).
◼ Private Remedies: A statutory scheme could create
private rights of action for consumers harmed by an
entity's unfair information practices .
◼ Government Enforcement: Civil or criminal penalties
enforced by governments.
A Case Study
16
 A corporation collects customers’ transactions from over 1,500 stores in 10 countries, and allows more than 3,000 suppliers to access and analyze data on their products to identify customer buying patterns, manage local store inventory, and identify new merchandising opportunities.
 What privacy concerns does the collected information raise?
Possible Privacy Preservation Approaches
17
 Randomization: Add noise to the data without changing the data’s aggregate distribution
 Distributed privacy preservation: Analyze data across various entities without collecting the data in one place
 Downgrading application effectiveness: Modify the data so that data-mining results are less accurate, and remove sensitive information from the results
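To make the randomization idea concrete, here is a minimal Python sketch (the data and noise scale are invented for illustration): each record is perturbed with zero-mean noise, so individual values are distorted while aggregates such as the mean stay roughly intact.

```python
import random

def randomize(values, scale=5.0, rng=None):
    """Perturb each value with zero-mean uniform noise in [-scale, scale].

    Individual records are distorted, but because the noise averages
    to zero, aggregates such as the mean are roughly preserved.
    """
    rng = rng or random.Random(0)  # fixed seed, for reproducibility only
    return [v + rng.uniform(-scale, scale) for v in values]

ages = [23, 45, 31, 62, 27, 54, 38, 41, 29, 50]
noisy = randomize(ages)

print(f"true mean:  {sum(ages) / len(ages):.1f}")
print(f"noisy mean: {sum(noisy) / len(noisy):.1f}")
```

No single noisy record reveals a true age exactly, yet the two printed means are close, which is the trade-off randomization aims for.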
K-Anonymity
18
 The idea behind k-anonymity is to make it hard to link sensitive and insensitive
attributes
 In the below table each row is an individual, and the columns (attributes) are
labeled “sensitive” or “insensitive”. We want to protect the sensitive attributes.
 “Disease” is a sensitive attribute and the others are insensitive
 In general, the following rules are applied:
 Rows are “clustered” (partitioned) into sets of size at least k.
 Within each set, make the insensitive attributes identical. There are usually two ways of doing this:
◼ Suppression: delete an entry (e.g., let the “Gender” attribute be null).
◼ Generalization: replace with less specific info (e.g., for “Age”, substitute [40,49] for 42).
 Sensitive attributes remain untouched
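The generalization and suppression rules above can be sketched in a few lines of Python. The table, column names, and k value below are invented for illustration (they are not the slide's example):

```python
from collections import Counter

def generalize(row):
    # Generalization: replace exact age with a decade band;
    # Suppression: replace gender with "*". "disease" (sensitive) is untouched.
    lo = (row["age"] // 10) * 10
    return {"age": f"[{lo},{lo + 9}]", "gender": "*", "disease": row["disease"]}

records = [
    {"age": 42, "gender": "F", "disease": "flu"},
    {"age": 47, "gender": "M", "disease": "cancer"},
    {"age": 44, "gender": "M", "disease": "flu"},
    {"age": 31, "gender": "F", "disease": "cold"},
    {"age": 36, "gender": "F", "disease": "flu"},
]

anon = [generalize(r) for r in records]

def is_k_anonymous(rows, k, quasi=("age", "gender")):
    # every combination of insensitive (quasi-identifier) values
    # must occur at least k times
    counts = Counter(tuple(r[q] for q in quasi) for r in rows)
    return all(c >= k for c in counts.values())

print(is_k_anonymous(anon, k=2))  # → True
```

After generalization, the three 40-something rows form one equivalence class and the two 30-something rows another, so every row is indistinguishable from at least one other on the insensitive attributes.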
Example of K-Anonymity
19
Differential Privacy
20
 Differential Privacy: provides an effective trade-off between data privacy and data utility
 More relevant after GDPR and recent court rulings
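A standard way to realize this trade-off is the Laplace mechanism. Below is a hypothetical sketch for a counting query, whose sensitivity is 1 (adding or removing one person changes the count by at most 1); the data and ε are made up. Larger ε means less noise and better utility, smaller ε means stronger privacy.

```python
import math
import random

def dp_count(values, predicate, epsilon, rng):
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1, so Laplace noise with
    scale 1/epsilon suffices.
    """
    true_count = sum(1 for v in values if predicate(v))
    # inverse-CDF sample from Laplace(0, 1/epsilon)
    u = rng.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

rng = random.Random(42)
ages = [23, 45, 31, 62, 27, 54, 38, 41]
released = dp_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(f"noisy count of people aged 40+: {released:.1f}")  # true count is 4
```

Each individual release is noisy, but averaged over many releases the answer concentrates around the true count, which is what "utility" means here.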
Acts Related to Privacy Protection
21
 Indian Laws:
 Digital Personal Data Protection Bill, 2023 https://prsindia.org/billtrack/digital-personal-data-protection-bill-2023
 Generally covered by ITA 2000 and subsequent amendments
 Report published by the Centre for Internet and Society: https://cis-india.org/telecom/knowledge-repository-on-internet-access/internet-privacy-in-india
 US Laws:
 Privacy Act of 1974 http://www.justice.gov/opcl/privstat.htm
 Health Insurance Portability and Accountability Act (HIPAA) of 1996 http://www.dol.gov/dol/topic/health-plans/portability.htm
 E-Government Act of 2002 http://www.archives.gov/about/laws/egov-act-section-207.html
 Online Privacy Protection Act of 2003 http://en.wikipedia.org/wiki/Online_Privacy_Protection_Act
 Children's Online Privacy Protection Act of 1998 (COPPA) http://www.ftc.gov/ogc/coppa1.htm
 Fair Information Practice Principles (FIPs) specified by the Federal Trade Commission (FTC) http://www.ftc.gov/reports/privacy3/fairinfo.shtm
CSE345/545 - Winter 2025
Anonymity
Dr. Arun Balaji Buduru
Founding Head, Usable Security Group (USG)
Associate Professor, Dept. of CSE | HCD, IIIT-Delhi, India
Visiting Faculty, Indiana University – Bloomington, USA
Thanks to Norcie, Newman, Shambuddho and others for sample slides and materials
Who Needs Anonymity?
1
 Political Dissidents, Whistleblowers
 Censorship resistant publishers
 Socially sensitive communicants:
 Chat rooms and web forums for abuse survivors, people with
illnesses
 Law Enforcement:
 Anonymous tips or crime reporting
 Surveillance and honeypots (sting operations)
 Corporations:
 Hiding collaborations of sensitive business units or partners
 Hide procurement suppliers or patterns
 Competitive analysis
Who Needs Anonymity?
2
 You:
 Where are you sending email (who is emailing you)
 What web sites are you browsing
 Where do you work, where are you from
 What do you buy, what kind of physicians do you visit, what books do
you read, ...
 Governments
 Open source intelligence gathering
◼ Hiding individual analysts is not enough
◼ That a query was from a govt. source may be sensitive
 Defense in depth on open and classified networks
◼ Networks with only cleared users (but a million of them)
 Dynamic and semi-trusted international coalitions
◼ Network can be shared without revealing existence or amount of
communication between all parties
Anonymous From Whom?
Adversary Model
3
 Recipient of your message
 Sender of your message
 Need Channel and Data Anonymity
 Observer of network from outside
 Network Infrastructure (Insider)
 Need Channel Anonymity
 Note: Anonymous authenticated communication makes
perfect sense
 Communicant identification should be inside the basic
channel, not a property of the channel
Grab the code and try it out
4
 Published under the BSD license
 Not encumbered by the Onion Routing patent
 Works on Linux, BSD, OS X, Solaris, Win32
 Packages: Debian, Gentoo, *BSD, Win32
 Runs in user space; no need for kernel mods or root
https://tor.eff.org
How to enforce anonymity
5/53
 Most obvious choice: anonymous communication systems (e.g. Tor)
 Hide “who” communicates with “whom”
 Other examples: JAP, I2P, Mixminion, GNUnet
The Onion Router
6
 What is Tor?
 Sender/responder anonymity network
 Circuit-based overlay network
 Low-latency
 2nd-generation aims:
◼ Perfect forward secrecy, congestion control, directory servers, integrity checking, location-hidden servers...
Overlay Networks
7
Basic Tor ideas
8
 Each OR (onion router) maintains a TLS connection with the other ORs
 OPs (onion proxies) get a directory of ORs from a Trusted Directory Server
 The OP builds a circuit of ORs. Default length: 3 ORs.
Selection of Circuits
9
 Tor chooses the path for each new circuit before it builds it.
 The exit node is chosen first, followed by the other nodes in the circuit
 Some of the constraints:
 The exit relay should actually allow you to exit the Tor network
◼ Some only allow web traffic (port 80), which is not useful when someone wants to send emails
 The exit relay has to have available capacity
 No same router twice in the same path.
 No choosing any router in the same family as another in the same path. (Two routers are in the same family if each one lists the other in the "family" entries of its descriptor)
 No choosing more than one router in a given /16 subnet.
 The first node must be a Guard node.
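The constraints above can be sketched as a filter over candidate relays. A hypothetical Python illustration follows; the relay list, field names, and simple first-match selection are invented for this sketch (real Tor also weights choices by bandwidth and many other flags):

```python
import ipaddress

relays = [
    {"name": "guard1", "ip": "10.1.0.1", "guard": True,  "exit": False, "family": set()},
    {"name": "mid1",   "ip": "10.2.0.1", "guard": False, "exit": False, "family": {"exit1"}},
    {"name": "mid2",   "ip": "10.3.0.1", "guard": False, "exit": False, "family": set()},
    {"name": "exit1",  "ip": "10.4.0.1", "guard": False, "exit": True,  "family": {"mid1"}},
]

def same_slash16(a, b):
    # two addresses share a /16 iff their first two bytes match
    return ipaddress.ip_address(a).packed[:2] == ipaddress.ip_address(b).packed[:2]

def ok_to_add(candidate, path):
    return all(
        candidate["name"] != hop["name"]                   # no router twice
        and candidate["name"] not in hop["family"]         # no shared family
        and hop["name"] not in candidate["family"]
        and not same_slash16(candidate["ip"], hop["ip"])   # distinct /16 subnets
        for hop in path
    )

# exit chosen first, then a middle relay, then a Guard for the first hop
exit_node = next(r for r in relays if r["exit"])
path = [exit_node]
middle = next(r for r in relays if not r["exit"] and not r["guard"] and ok_to_add(r, path))
path.insert(0, middle)
guard = next(r for r in relays if r["guard"] and ok_to_add(r, path))
path.insert(0, guard)
print([r["name"] for r in path])  # → ['guard1', 'mid2', 'exit1']
```

Note that mid1 is rejected because it declares exit1 in its family entries, exactly the descriptor-based family rule described above.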
Overview of Tor
10/53
[Figure: a Tor circuit. The client wraps message M in three layers, E_entry(E_mid(E_exit(M))); the entry node strips one layer to leave E_mid(E_exit(M)), the middleman strips the next to leave E_exit(M), and the exit node recovers M and forwards it to the server. Links between the client and the relays are TLS encrypted; the exit-to-server link is unencrypted. A directory service lists the relays.]
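The layering E_entry(E_mid(E_exit(M))) can be illustrated with a toy sketch. XOR with a key-derived stream stands in for real encryption here, purely for illustration: unlike real ciphers, XOR layers commute, and Tor actually negotiates AES-CTR keys per hop.

```python
import hashlib
from itertools import count

def keystream(key: bytes, n: int) -> bytes:
    # expand a key into n bytes by hashing key || counter
    out = b""
    for i in count():
        if len(out) >= n:
            return out[:n]
        out += hashlib.sha256(key + i.to_bytes(4, "big")).digest()

def layer(data: bytes, key: bytes) -> bytes:
    # toy "encryption": XOR with the key's stream (it is its own inverse)
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

message = b"GET / HTTP/1.1"
keys = {"entry": b"k-entry", "mid": b"k-mid", "exit": b"k-exit"}

# the client builds the onion: E_entry(E_mid(E_exit(M)))
cell = layer(layer(layer(message, keys["exit"]), keys["mid"]), keys["entry"])

# each relay strips exactly one layer; only after the exit is M visible
for hop in ("entry", "mid", "exit"):
    cell = layer(cell, keys[hop])

print(cell)  # → b'GET / HTTP/1.1'
```

The point of the layering is that each relay can remove only its own layer, so no single relay sees both who sent M and what M is.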
Malicious Entry/Exit Points
11
 If the entry and exit points collude, they know that the initiator I and responder R are using Tor, and can conduct timing analysis to try to link I and R.
De-anonymization
12/53
De-anonymization is a 2-step process.
 First step: reducing the set of routers and/or hosts to monitor.
 Second step: finding the victim node.
◼ Feasibility of monitoring high-speed networks.
◼ Accuracy of identifying the victim amidst several clients.
◼ Is de-anonymization as easy as it seems?
 Defending against traffic analysis attacks.
 Finding the attackers.
De-anonymization of Anonymous
Communication: General Attack Strategy
13/53
[Figure: users reach services through a layer of anonymizing proxies; the general attack correlates traffic entering and leaving the proxy layer.]
The 3 Traditional Threats to Tor's Security:
• DNS Leaks
• Traffic Analysis
• Malicious Exit Nodes
Threat 1: DNS Leaks
• DNS requests are not sent through the Tor network by default
• An attacker could see what websites are being visited
• External software such as FoxyProxy and Privoxy can be used to route DNS requests through the Tor network, but this is _not_ default behavior
Threat 2: Traffic Analysis
• Traffic analysis is extracting and inferring information from network meta-data
• This includes the volumes and timing of network packets, as well as the visible network addresses they are originating from and destined for
• Tor is a low-latency network, and thus is vulnerable to an attacker who can see both ends of a connection
Metadata Analysis
17
 Metadata comes from a variety of sources:
 Info on the application running in a flow
 Domain names that were hosted on an IP address at the time traffic was captured
 IDs of intrusion detection and prevention system (IDPS) alerts triggered by network traffic or flow
 Users logged in to a system at the time traffic was captured
 URLs extracted from an email message
 User-agent strings in an HTTP transaction
Metadata Analysis
18
 Network profiling using flow
 With flow-monitoring capability, one can construct a profile of IP addresses based on past behaviors
 IPs can be tagged with labels like “DNS client” and “NAT/Gateway”
 Metadata is useful, especially if the analyst knows where it came from and how it was generated
 In many cases, this is derived from application-level analysis, both within the organization and from data shared by other organizations.
Traffic Analysis Against Anonymity Networks Using
Available Bandwidth Estimation: Overview
19/53
❑ Can network bandwidth variations be used to track anonymously communicating parties?
❑ Basic attack strategy:
- The adversary, together with one side of the connection (client or hidden server), induces a change in network traffic at one end of an (anonymous) TCP connection.
- The adversary tracks the induced traffic fluctuation as it propagates through various network elements, all the way to the peer.
“Tracing” the Path of Tor Clients to their
Entry Nodes
20/53
[Figure: a colluding server injects a traffic pattern into the victim's circuit (exit, middleman, entry); the adversary uses LinkWidth probes to trace the injected pattern from the entry node back toward the Tor client.]
NetFlow Based Traffic Analysis: Approach
[Figure: a colluding server injects a traffic pattern that travels through the victim circuit (entry, middleman, exit); the adversary collects NetFlow data near the entry node and computes the correlation coefficient (r) between the injected pattern and each observed flow to separate the victim from non-victims.]
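The correlation step can be sketched as follows; the injected on/off pattern and the per-flow byte counts are invented numbers, and Pearson's r is the correlation coefficient referred to above. The flow that best echoes the injected pattern is flagged as the victim:

```python
import math

def pearson_r(xs, ys):
    # Pearson correlation coefficient between two equal-length series
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

injected = [100, 20, 100, 20, 100, 20, 100, 20]        # on/off pattern
flows = {
    "victim":     [95, 30, 104, 25, 90, 18, 110, 27],  # echoes the pattern
    "non-victim": [60, 62, 55, 70, 64, 58, 66, 61],    # unrelated traffic
}

scores = {name: pearson_r(injected, obs) for name, obs in flows.items()}
suspect = max(scores, key=scores.get)
print(suspect, scores)
```

Even though NetFlow records are coarse (sampled byte counts, not packets), a deliberately injected pattern can still produce a correlation far above that of unrelated flows.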
Threat 3: Rogue Exit Nodes
• Traffic leaving the Tor network is not encrypted by Tor, just anonymized
• A malicious exit node can observe that traffic
• Swedish researcher Dan Egerstad obtained emails from embassies belonging to Australia, Japan, Iran, India and Russia, and published them on the net
• The Sydney Morning Herald called it the “hack of the year” in an interview with Egerstad
#1 Tor Usability Issue: TOR IS SLOW
• Example: TCP backoff slows down every circuit at once.
• Tor combines all the circuits going between two Tor relays into a single TCP connection.
• This is a smart approach in terms of anonymity, since putting all circuits on the same connection prevents an observer from learning which packets correspond to which circuit.
• It is a bad idea in terms of performance, since TCP’s backoff mechanism has only one option when that connection is sending too many bytes: slow it down, and thus slow down all the circuits going across it.
• This is only one subpart of one section of a 27-page paper entitled “Why Tor is Slow and What We're Doing About It”.
Something to Think About:
 "A hard-to-use system has fewer users — and because anonymity systems hide users among users, a system with fewer users provides less anonymity. Usability is thus not only a convenience: it is a security requirement"
-Tor Design Document
Summary
25/53
❑ The bandwidth-estimation-based traffic analysis technique can be used to identify the relays, as well as the routers, involved in a Tor circuit.
❑ Moderate success with real-world anonymization systems such as Tor.
❑ Lack of vantage points.
❑ Partly due to Tor’s poor QoS.
❑ A powerful adversary, with adequately provisioned and appropriately located vantage-point hosts, can verify the identity of anonymously communicating parties.
❑ What if the adversary had only a partial view of the network traffic (statistics)?
Additional Reading
• Tor design document: https://git.torproject.org/checkout/tor/master/doc/design-paper/tor-design.html
• Usability of anonymous web browsing: an examination of Tor interfaces and deployability. Clark, J., van Oorschot, P. C., and Adams, C. 2007. (http://cups.cs.cmu.edu/soups/2007/proceedings/p41_clark.pdf)
• Article in Wired on malicious exit nodes: http://www.wired.com/politics/security/news/2007/09/embassy_hacks?currentPage=1
• Dan Egerstad interview (one of the first to widely publish on malicious exit nodes): http://www.smh.com.au/news/security/the-hack-of-the-year/2007/11/12/1194766589522.html?page=fullpage#contentSwap1
• Low-Cost Traffic Analysis of Tor: http://www.cl.cam.ac.uk/users/sjm217/papers/oakland05torta.pdf
• Why Tor is Slow and What We're Doing About It: https://svn.torproject.org/svn/tor/trunk/doc/roadmaps/2009-03-11-performance.pdf