Lecture 4
UNIT 4 PPT
MADE BY: YUVRAJ MEHRA
SAP ID: 500106775
• Cryptography
• Data provenance
• Cross-cutting issues
• Operational overview
• Title: Data Security in Cloud Computing and GPU-Accelerated FHE for Applications
• Subtitle: A Focus on Modern Methods and Applications
Definitions and terminology
Symmetric Encryption
[Figure: encryption and decryption of a message using keys Key A and Key B]
• Types:
1. Block Ciphers
– Encrypt data one block at a time (typically 64 or 128 bits)
– Used for a single message
2. Stream Ciphers
– Encrypt data one bit or one byte at a time
– Used if the data is a constant stream of information
Key Strength
• The strength of an algorithm is determined by the size of the key
• The longer the key, the more difficult it is to crack
Caesar Cipher
• Each letter is shifted by the key; with key 3 the cipher alphabet becomes DEFGHIJKLMNOPQRSTUVWXYZABC
• Example (key = 3): the ciphertext "Dwwdfn Dw Gdzq" decrypts to the plaintext "Attack at Dawn"
• Decryption runs the same algorithm in reverse, shifting each letter back by the key
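A minimal Python sketch of the Caesar cipher above (the function name is my own, illustrative only):

def caesar(text, key, decrypt=False):
    """Shift letters by `key` positions; pass decrypt=True to reverse."""
    shift = -key if decrypt else key
    result = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)          # leave spaces and punctuation untouched
    return "".join(result)

print(caesar("Attack at Dawn", 3))                  # Dwwdfn dw Gdzq
print(caesar("Dwwdfn dw Gdzq", 3, decrypt=True))    # Attack at Dawn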
Substitution Cipher
• Monoalphabetic cipher: each plaintext letter maps to exactly one ciphertext letter
• Example: the message "Bob, I love you. Alice" encrypts to "Gnu, n etox dhz. tenvj"
• A polyalphabetic cipher applies several Caesar-style substitution alphabets in turn
Substitution Cipher
Using a key to shift the alphabet
• Obtain a key for the algorithm and then shift the alphabet
• For instance, if the key is WORD, the remaining letters are shifted by four positions and the letters w, o, r, and d are removed from the rest of the cipher alphabet
• We have to ensure that the mapping is one-to-one
• No single letter in the plaintext can map to two different letters in the ciphertext
• No single letter in the ciphertext can map to two different letters in the plaintext
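A short Python sketch of the keyed (monoalphabetic) substitution alphabet described above, assuming the key letters are placed at the front of the cipher alphabet and the remaining letters follow in order; helper names are my own:

import string

def keyed_alphabet(key):
    """Build a cipher alphabet: key letters first, then the remaining letters in order."""
    key = "".join(dict.fromkeys(key.lower()))          # drop duplicate letters, keep order
    rest = [c for c in string.ascii_lowercase if c not in key]
    return key + "".join(rest)

def substitute(text, key, decrypt=False):
    plain, cipher = string.ascii_lowercase, keyed_alphabet(key)
    src, dst = (cipher, plain) if decrypt else (plain, cipher)
    table = str.maketrans(src + src.upper(), dst + dst.upper())
    return text.translate(table)

ct = substitute("Attack at Dawn", "WORD")
print(ct, "->", substitute(ct, "WORD", decrypt=True))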
Transposition Cipher
Columnar Transposition
• This involves rearranging the characters of the plaintext into columns
• The following example shows how the letters are transformed
• If the number of letters is not an exact multiple of the column count, the last row will be short; it can be padded with an infrequent letter such as x or z
[Figure: the plaintext "THIS IS A MESSAGE TO SHOW HOW A COLUMNAR TRANSPOSITION WORKS" is written row by row into a five-column grid, then read off column by column to produce the ciphertext]
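A small Python sketch of the columnar transposition above (five columns, padding with "x"; the function name is my own):

def columnar_encrypt(text, cols=5, pad="x"):
    """Write the text row by row into `cols` columns, then read it off column by column."""
    text = text.replace(" ", "").lower()
    while len(text) % cols:
        text += pad                                   # pad the short last row
    rows = [text[i:i + cols] for i in range(0, len(text), cols)]
    return "".join(row[c] for c in range(cols) for row in rows)

plain = "THIS IS A MESSAGE TO SHOW HOW A COLUMNAR TRANSPOSITION WORKS"
print(columnar_encrypt(plain))   # tssohoaniwhaasolrstoimghwutpirseeoamrookistwcnasns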
Ciphers
Shannon’s Characteristics of “Good” Ciphers
DES (Data Encryption Standard)
[Figure: a 16-round Feistel structure — round i takes (Li, Ri), applies F(Li, Ri, Ki) with a 48-bit round key ki, and produces the next pair, ending with (L17, R17)]
• DES is run in reverse to decrypt
• Cracking DES: 1997 – 140 days; 1999 – 14 hours
• Triple DES uses DES three times in tandem
• The output from one DES stage is the input to the next
Encryption Algorithm Summary
David’s
Bob’s Bob’s Public Key
Message Trudeau
Cipher Encrypted David
+ Public key (Middle-man)
Message
Trudeau’s David’s
Trudeau’s Trudeau’s
New Message Message
Encrypted Cipher + public key Encrypted Cipher + public key
Message Message
Asymmetric Encryption
Session-Key Encryption
• Used to improve efficiency
• Symmetric key is used for encrypting data
• Asymmetric key is used for encrypting the symmetric key
[Figure: the data is encrypted with the session key using a symmetric cipher, and the session key itself is encrypted with an RSA cipher; the encrypted key is sent to the recipient along with the encrypted data]
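A sketch of session-key (hybrid) encryption in Python using the cryptography package, assuming Fernet for the symmetric part and RSA-OAEP for wrapping the session key (one common combination, not the only one):

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# Recipient's RSA key pair (asymmetric)
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

# Sender: generate a random session key and encrypt the data with it (symmetric)
session_key = Fernet.generate_key()
ciphertext = Fernet(session_key).encrypt(b"secret payload")

# Sender: encrypt the session key with the recipient's public key (asymmetric)
oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
encrypted_key = public_key.encrypt(session_key, oaep)

# Recipient: recover the session key, then decrypt the data
recovered_key = private_key.decrypt(encrypted_key, oaep)
plaintext = Fernet(recovered_key).decrypt(ciphertext)
print(plaintext)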
• Bob and Alice do not need to transfer any key
[Figure: using their own private keys and each other's public keys, Alice and Bob independently generate the same session key, which is then used with a symmetric (DES) cipher]

Asymmetric Encryption
Diffie-Hellman Key Exchange: Mathematical Analysis
• Bob and Alice agree on a non-secret prime p and a value a
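A toy Python sketch of the Diffie-Hellman exchange (the numbers are illustrative; real deployments use primes of at least 2048 bits):

import secrets

# Public, non-secret parameters agreed by Bob and Alice
p = 23      # prime modulus (toy value)
a = 5       # public base / generator

# Each side picks a private exponent and publishes a**x mod p
alice_secret = secrets.randbelow(p - 2) + 1
bob_secret   = secrets.randbelow(p - 2) + 1
alice_public = pow(a, alice_secret, p)
bob_public   = pow(a, bob_secret, p)

# Both compute the same shared session key without ever transmitting it
alice_key = pow(bob_public, alice_secret, p)
bob_key   = pow(alice_public, bob_secret, p)
assert alice_key == bob_key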
[Figure: a message and a secret key are fed into a message digest algorithm to produce a digest]
Password Authentication
Basics
• A password is a secret character string known only to the user and the server
• Message digests are commonly used for password authentication
• Storing a hash of the password is a lesser risk than storing the password itself
• A hacker cannot reverse the hash except by a brute-force attack
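A minimal Python sketch of storing and checking a salted password hash (PBKDF2 here as one common choice; the parameter values are illustrative):

import hashlib, hmac, os

def hash_password(password, salt=None):
    """Derive a salted hash; store (salt, digest) instead of the password itself."""
    salt = salt or os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 200_000)
    return salt, digest

def check_password(password, salt, stored_digest):
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)   # constant-time comparison

salt, stored = hash_password("correct horse battery staple")
print(check_password("correct horse battery staple", salt, stored))  # True
print(check_password("wrong guess", salt, stored))                   # False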
• Challenge-response (a minimal sketch follows after this list)
• The server sends a random value (challenge) to the client along with the authentication request; this must be included in the response
• Protects against replay
• Time stamp
• The authentication message from the client to the server must have a time stamp embedded
• The server checks whether the time is reasonable
• Protects against replay
• Depends on synchronization of clocks between computers
• One-time password
• A new password is obtained by passing the user's password through a one-way function n times, where the count changes with each login
• Protects against replay as well as eavesdropping
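A minimal Python sketch of challenge-response with a shared secret (HMAC used here for illustration; this is not a specific protocol specification):

import hmac, hashlib, secrets

shared_secret = b"correct horse battery staple"

# Server side: issue a fresh random challenge with each authentication request
challenge = secrets.token_bytes(16)

# Client side: prove knowledge of the secret by returning HMAC(secret, challenge)
response = hmac.new(shared_secret, challenge, hashlib.sha256).digest()

# Server side: recompute and compare in constant time; replaying an old response fails
expected = hmac.new(shared_secret, challenge, hashlib.sha256).digest()
assert hmac.compare_digest(response, expected)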
Authentication Protocols
Kerberos
Facts (Iris Recognition)
• Probability of two irises producing exactly the same code: 1 in 10^78
• Independent variables (degrees of freedom) extracted: 266
• IrisCode record size: 512 bytes
• Operating system compatibility: DOS and Windows (NT/95)
• Average identification speed (database of 100,000 IrisCode records): one to two seconds
Authentication
Digital Signatures
• A digital signature is a data item which
accompanies or is logically associated with a
digitally encoded message.
• It has two goals
• A guarantee of the source of the data
• Proof that the data has not been tampered with
[Figure: the sender computes a message digest and runs it through the signature algorithm with the sender's private key; the digital signature is sent to the receiver, who runs the signature algorithm with the sender's public key and checks whether the recovered digest matches the digest computed from the received message ("Same?")]
Sender / Receiver
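A Python sketch of sign-and-verify with the cryptography package (RSA-PSS with SHA-256 is one common choice, used here for illustration):

from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes
from cryptography.exceptions import InvalidSignature

sender_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
message = b"digitally encoded message"
pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)

# Sender: hash the message and sign the digest with the private key
signature = sender_key.sign(message, pss, hashes.SHA256())

# Receiver: verify with the sender's public key; tampering raises InvalidSignature
try:
    sender_key.public_key().verify(signature, message, pss, hashes.SHA256())
    print("signature valid: source and integrity confirmed")
except InvalidSignature:
    print("signature invalid: message was tampered with")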
Authentication
Digital Certificates
• A digital certificate is a signed statement by a trusted party that another party's public key belongs to them
• This allows one certificate authority to be authorized by a different authority (root CA)
• The top-level certificate must be self-signed
• Anyone can start a certificate authority
• Name recognition is key to someone recognizing a certificate authority
• VeriSign is an industry-standard certificate authority
[Figure: the sender's identity information and public key are fed into a signature algorithm together with the certificate authority's private key, producing a certificate]
Authentication
Chaining Certificates
• Chaining is the practice of signing a certificate with another private key that itself has a certificate for its public key
• Similar to a passport carrying the seal of the government
• A certificate is essentially a person's public key and some identifying information, signed by an authority's private key to verify the person's identity
• The authority's public key can be used to decipher (verify) the certificate
• The trusted party is called the certificate authority
[Figure: an existing certificate is passed through the signature algorithm with the certificate authority's private key to produce a new certificate]
Cryptanalysis
Basics
• The practice of analyzing and breaking cryptography
• Resistance to cryptanalysis increases with the key size
• With each extra bit, the key space (and thus the brute-force effort) doubles
• Network Level
• Host Level
• Application Level
The Network Level
• SaaS/PaaS
• Both the PaaS and SaaS platforms abstract and hide the host OS from end users
• Host security responsibilities are transferred to the CSP (Cloud Service Provider)
• You do not have to worry about protecting hosts
• However, as a customer, you still own the risk of managing information hosted in the cloud services.
From [6] Cloud Security and Privacy by Mather and Kumaraswamy
The Host Level (cont.)
From [6] Cloud Security and Privacy by Mather and Kumaraswamy
Case study: Amazon's EC2 infrastructure
• "Hey, You, Get Off of My Cloud: Exploring Information Leakage in Third-Party Compute Clouds"
• Multiple VMs of different organizations, with virtual boundaries separating each VM, can run within one physical server
• Virtual machines still have Internet Protocol (IP) addresses, visible to anyone within the cloud
• VMs located on the same physical server tend to have IP addresses that are close to each other and are assigned at the same time
• An attacker can set up lots of his own virtual machines, look at their IP addresses, and figure out which one shares the same physical resources as an intended target
• Once the malicious virtual machine is placed on the same server as its target, it is possible to carefully monitor how access to resources fluctuates and thereby potentially glean sensitive information about the victim
Local Host Security
The Application Level
• DoS
• EDoS (Economic Denial of Sustainability)
• An attack against the billing model that underlies the cost of providing a service, with the goal of bankrupting the service itself
• End user security
• Who is responsible for Web application security in the cloud?
• SaaS/PaaS/IaaS application security
• Customer-deployed application security
From [6] Cloud Security and Privacy by Mather and Kumaraswamy
Data Security and Storage
• Data remanence
• Inadvertent disclosure of sensitive information is possible
• Data security mitigation?
• Do not place any sensitive data in a public cloud
• Encrypted data is placed into the cloud?
• Provider data and its security: storage
• To the extent that quantities of data from many companies are centralized, this collection can become an attractive target for criminals
• Moreover, the physical security of the data center and the trustworthiness of system administrators take on new importance
• The organization's trust boundary becomes dynamic: it moves beyond the organization's control and extends into the service provider's domain
• Managing access for diverse user populations (employees, contractors, partners, etc.)
• Increased demand for authentication
• Personal, financial, and medical data will now be hosted in the cloud
• Software applications hosted in the cloud require access control
• Need for higher-assurance authentication
• Authentication in the cloud may mean authentication outside the firewall
• Limits of password authentication
• Need for authentication from mobile devices
What is Privacy?
• The concept of privacy varies widely among (and sometimes within) countries,
cultures, and jurisdictions.
• It is shaped by public expectations and legal interpretations; as such, a concise
definition is elusive if not impossible.
• Privacy rights or obligations are related to the collection, use, disclosure,
storage, and destruction of personal data (or Personally Identifiable Information
—PII).
• At the end of the day, privacy is about the accountability of organizations to data
subjects, as well as the transparency to an organization’s practice around
personal information.
• Whether the cloud provider itself has any right to see and access customer data?
• Some services today track user behaviour for a range of purposes, from sending targeted advertising to improving services
• Including logs that may show access to the data of Companies Y and Z?
• Full reliance on a third party to protect personal data?
• In-depth understanding of responsible data stewardship
• Organizations can transfer liability, but not accountability
• Risk assessment and mitigation throughout the data life cycle is critical.
• Many new risks and unknowns
• The overall complexity of privacy protection in the cloud represents a bigger
challenge.
• 3 dimensions of privacy:
1) Personal privacy
Protecting a person against undue interference (such as physical searches) and information that violates his/her moral sense
2) Territorial privacy
Protecting a physical area surrounding a person that may not be violated without the acquiescence of the person
• Safeguards: laws referring to trespassing, search warrants
3) Informational privacy
Deals with the gathering, compilation, and selective dissemination of information
1. Introduction (2) [cf. Simone Fischer-Hübner]
• By businesses
• Online consumers worrying about revealing personal data held back $15 billion in online revenue in 2001
• By the Federal government
• Privacy Act of 1974 for Federal agencies
• Health Insurance Portability and Accountability Act of 1996 (HIPAA)
2. Recognition of Need for Privacy Guarantees (2)
• UIUC
• Roy Campbell (Mist – preserving location privacy in pervasive computing)
• Marianne Winslett (trust negotiation with controlled release of private credentials)
• Perfect depersonalization:
• Data rendered anonymous in such a way that the data subject is no longer identifiable
• Practical depersonalization:
• The modification of personal data so that information concerning personal or material circumstances can no longer be attributed to an identified or identifiable individual, or can be attributed only with a disproportionate amount of time, expense, and labor
• Outline
a) Legal World Views on Privacy
b) International Privacy Laws:
• Comprehensive Privacy Laws
• Sectoral Privacy Laws
2) Sectoral Laws
• Idea: Avoid general laws; focus on specific sectors instead
• Advantage: enforcement through a range of mechanisms
• Disadvantage: each new technology requires new legislation
4.2. Legal Privacy Controls (5) -- b) International Privacy Laws
Comprehensive Laws - European Union

4.2. Legal Privacy Controls (6) -- b) International Privacy Laws
Sectoral Laws - United States (1)
• No explicit right to privacy in the constitution
• Limited constitutional right to privacy implied in a number of provisions in the Bill of Rights
• An evaluation conducted to assess how the adoption of new information policies, the
procurement of new computer systems, or the initiation of new data collection programs
will affect individual privacy
• The premise: Considering privacy issues at the early stages of a project cycle will reduce
potential adverse impacts on privacy after it has been implemented
• Requirements:
• PIA process should be independent
• PIA performed by an independent entity (office and/or commissioner) not linked to the project under
review
• Participating countries: US, EU, Canada, etc.
4.2. Legal Privacy Controls (10)
d) A Common Approach: PIA (2)
• EU implemented PIAs
• Under the European Union Data Protection Directive, all EU members must
have an independent privacy enforcement body
• Observation 2: Use of self-regulatory mechanisms for the protection of online activities seems
somewhat haphazard and is concentrated in a few member countries
• Observation 4: Not enough being done to encourage the implementation of technical solutions
for privacy compliance and enforcement
• Only a few member countries reported much activity in this area
4.2. Legal Privacy Controls (12)
e) Observations and Conclusions
[cf. A.M. Green, Yale, 2004]
• Conclusions
• Still work to be done to ensure the security of personal information for
all individuals in all countries
Outline
• Our belief: Socially based paradigms (such as trust-based approaches) will play
a big role in pervasive computing
• Solutions will vary (as in social settings)
• Heavyweight solutions for entities of high intelligence and capabilities (such as humans and intelligent systems) interacting in complex and important matters
• Lightweight solutions for less intelligent and capable entities interacting in simpler matters of lesser consequence
5. Selected Advanced Topics in Privacy
Outline
• Problem
• How to determine that a certain degree of data privacy is provided?
• Challenges
• Different privacy-preserving techniques and systems claim different degrees of data privacy
c) Related Work
d) Proposed Metrics
"Hiding in a crowd"
Anonymity Set
• Anonymity set A
A = {(s1, p1), (s2, p2), …, (sn, pn)}
• si: subject i who might access private data
or: i-th possible value for a private data attribute
• pi: probability that si accessed private data
or: probability that the attribute assumes the i-th
possible value
5.3. Privacy Metrics (7)
• Effective anonymity set size:
  L = Σ_{i=1}^{|A|} min(p_i, 1/|A|)
• Deficiency: L does not consider the violator's learning behavior
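A small Python sketch of the effective anonymity set size as reconstructed above, L = Σ min(p_i, 1/|A|):

def effective_anonymity_set_size(probabilities):
    """Sum over i of min(p_i, 1/|A|); p_i is the probability that subject i accessed the data."""
    n = len(probabilities)
    return sum(min(p, 1.0 / n) for p in probabilities)

# Uniform probabilities ("hiding in a crowd") maximize L; a confident guess lowers it.
print(effective_anonymity_set_size([0.25, 0.25, 0.25, 0.25]))   # 1.0
print(effective_anonymity_set_size([0.85, 0.05, 0.05, 0.05]))   # 0.4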
5.3. Privacy Metrics (8)
B. Entropy-based Metrics
Dynamics of Entropy
[Figure: entropy level plotted against the set of disclosed attributes — entropy starts near its maximum H*, (a) falls as attributes are disclosed, (b) reaches a threshold, (c) may drop to a very low level, and (d) can increase again]
• When entropy reaches a threshold (b), data evaporation can be
invoked to increase entropy by controlled data distortions
• When entropy drops to a very low level (c), apoptosis can be triggered
to destroy private data
• Entropy increases (d) if the set of attributes grows or the disclosed attributes become less valuable – e.g., they become obsolete or more data is now available
5.3. Privacy Metrics (10)
Entropy: Example
• Consider private data represented as a 10-digit phone number; each digit takes one of ten values with probability 0.1
• Suppose that after time t, the violator can figure out the state of the phone number, which may allow him to learn the three leftmost digits
• Entropy at time t is then given by:
  H(A, t) = Σ_{j=4}^{10} w_j · ( − Σ_{i=0}^{9} 0.1 · log2 0.1 ) ≈ 23.3 bits (with unit weights w_j)
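A quick Python check of the numbers above, under the assumptions just stated (uniform digits, unit weights):

import math

# Entropy of one unknown digit is -sum(0.1 * log2(0.1)) = log2(10) ≈ 3.32 bits
digit_entropy = -sum(0.1 * math.log2(0.1) for _ in range(10))

# Before any disclosure, all 10 digits are unknown
h_t0 = 10 * digit_entropy          # ≈ 33.2 bits

# After the violator learns the three leftmost digits, 7 digits remain unknown
h_t = 7 * digit_entropy            # ≈ 23.3 bits, matching the slide's value
print(round(h_t0, 1), round(h_t, 1))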
• At a high level, the client application uses a Client API to make use
of the Cloud Storage Service.
• The Client Management Service manages all of the keys, data,
and operations on behalf of the client.
• The Cloud Storage Service hosts the application server and the
Cloud Storage Engine.
• The application data is stored and processed in the cloud in
encrypted form, then returned to the client application for display
to the end-user.
Nomad Operational Overview
• When first using the system, the user must initialize the client and
generate their own public/private key pair (Keypublic, Keyprivate).
Nomad Operational Overview
• Data Storage Workflow:
• 1) System Initialization: Upon first using the system, the user sends a request to the
Client Management Engine to generate a public/private key pair (< IDuser,
Keypublic, Keyprivate >). The Client Management Engine then asks the HE Key
Manager to generate the key pair and store it in the Public/Private Key Store. The
Client Management Engine also sends the User ID and Public Key (< IDuser,
Keypublic >) to the Cloud Storage Engine for later usage. The Cloud Storage Engine
calls on the HE Key Manager to store the User ID and Public Key in the Public Key
Store.
• 2) The user initiates a request to store their Dataplaintext in the cloud storage
• 3) The Client Management Engine submits a request to encrypt the data to get the
ciphertext (Enc(Dataplaintext, Keypublic) = Dataciphertext).
• 4) The Client Management Engine submits a request to the server to store the
ID/data pair (< IDuser,IDdata, Dataciphertext >).
• 5) The Cloud Storage Engine receives the storage request and calls on the HE
Processing Engine to store the data (< IDdata, Dataciphertext >) in the Ciphertext
Store.
HE Operation Workflow
1) The user requests an operation (e.g., addition) to be performed on two integers (e.g., data1 and data2).
2) The Client Management Engine generates a request and sends it to the Cloud Storage Engine (< IDuser, IDdata1, IDdata2, Operation >).
3) The Cloud Storage Engine parses the request to identify the operation.
4) The Cloud Storage Engine retrieves the two data elements from the Ciphertext Store using their associated data IDs (IDdata1 and IDdata2).
5) The Cloud Storage Engine retrieves the Public Key associated with the user's ID (IDuser) from the Public Key Store.
6) The Cloud Storage Engine calls on the HE Processing Engine to compute the addition operation on the ciphertext data (Add(Keypublic, Dataciphertext1, Dataciphertext2) = Resultciphertext).
7) The Cloud Storage Engine returns Resultciphertext to the Client Management Engine.
8) The Client Management Engine calls on the HE Key Manager to retrieve the user's private key.
9) The Client Management Engine calls on the HE Processing Engine to decrypt the Resultciphertext (Dec(Keyprivate, Resultciphertext) = Resultplaintext).
10) The Client Management Engine sends the Resultplaintext to the user.
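A self-contained Python sketch of this end-to-end flow. The Paillier cryptosystem is used here as a simple additively homomorphic stand-in (Nomad itself uses a lattice-based FHE scheme via HElib); the stores are plain dictionaries and all names are illustrative, and the key sizes are toy values:

import math, secrets

# --- Toy Paillier (additively homomorphic), standing in for the FHE scheme ---
def keygen(p=10007, q=10009):
    n = p * q
    lam = math.lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)                  # valid because the generator is g = n + 1
    return (n,), (n, lam, mu)             # (public key), (private key)

def encrypt(pub, m):
    (n,) = pub
    n2 = n * n
    r = secrets.randbelow(n - 1) + 1
    while math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(priv, c):
    n, lam, mu = priv
    n2 = n * n
    return ((pow(c, lam, n2) - 1) // n) * mu % n

def he_add(pub, c1, c2):
    (n,) = pub
    return c1 * c2 % (n * n)              # ciphertext product = encryption of the sum

# --- Workflow sketch: dictionaries stand in for the key and ciphertext stores ---
public_key_store, private_key_store, ciphertext_store = {}, {}, {}

# 1) System initialization: generate the user's key pair and register the public key
pub, priv = keygen()
public_key_store["user1"], private_key_store["user1"] = pub, priv

# 2-5) Storage requests: the client encrypts, the cloud stores only ciphertext
ciphertext_store["data1"] = encrypt(pub, 17)
ciphertext_store["data2"] = encrypt(pub, 25)

# 6-7) The cloud adds the two ciphertexts without ever seeing the plaintexts
result_ct = he_add(public_key_store["user1"],
                   ciphertext_store["data1"], ciphertext_store["data2"])

# 8-10) The client decrypts the returned result with its private key
print(decrypt(private_key_store["user1"], result_ct))   # 42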
FULLY HOMOMORPHIC ENCRYPTION
• The BGV scheme for homomorphic encryption is based on the Ring Learning With Errors (RLWE) problem and is built on algebraic structures, in particular rings and ideal lattices.
• The security of RLWE schemes is based on the Shortest Vector Problem (SVP),
which has been shown to be at least as hard as worst-case lattice problems.
RLWE schemes are generally more computationally efficient than traditional
LWE because of the use of ideal lattices as the mathematical structure.
• BGV offers a choice of “leveled” FHE schemes based on either LWE or RLWE,
with the RLWE scheme having better performance. “Leveled” FHE means that
the size of the public key is linear in the depth of the circuits that the scheme
can evaluate, that is, its size is not constant.
• The key operation in the scheme is the REFRESH procedure, which switches
the moduli of the lattice structure and switches the key. In fact, with RLWE this
process runs in almost quasilinear time.
Brakerski, Gentry, and Vaikuntanathan (BGV) Scheme
• 1) Encryption: The actual encryption function is fairly straightforward given that the complexity is built into the mathematical structure. Given a message m ∈ R_2, let m = (m, 0, 0, ..., 0) ∈ R_2^N, where N = (2n + 1) · log(q) for an odd modulus q. Then the output ciphertext is given as c = m + A^T · r ∈ R_q^{n+1}, where A is the matrix of public keys and r is a vector of noise drawn from χ^N. In plain English, the plaintext message is masked by adding a noisy combination of the public key.
• 2) Decryption: The decryption algorithm is even simpler for those who hold the decryption key. Given a ciphertext c and the secret key s, compute m = (⟨c, s⟩ mod q) mod 2, where ⟨c, s⟩ mod q is the message together with the noise associated with s.
• 3) Arithmetic Operations: This scheme supports a number of operations including
element-wise multiplication and addition, scalar arithmetic, and automorphic rotations
that support efficient polynomial arithmetic (think shift-register encoding). An
automorphism is any mapping that sends every element in a given set to another
element in the same set, that is, φ : A → A with φ(a) = b for a, b ∈ A, and it may be the
case that a = b. Note that these operations take ciphertexts as their input and produce
ciphertexts as their output. While the results of these operations are accurate, the
computation overhead is very high. This project is aimed at reducing this computation
cost to make homomorphic encryption feasible for widespread use.
HElib
• Cryptography Layer:
• Contains modules for key-switching, encryption, decryption, and ciphertext operations.
• Key modules:
• KeySwitching: Manages matrices needed for key switching.
• FHE: Manages cryptographic operations (e.g., Key Generation, Encryption, Decryption).
• Ctxt: Manages all operations on ciphertexts.
• EncryptedArray: Routes plaintext slots for operations.
• FHEContext: Maintains parameters across operations.
HElib
• HElib Parameters:
• k: Security parameter, default value of 80 (minimum for security).
• m, p, r: Define the native plaintext space Z[X]/(Φ_m(X), p^r), where m is chosen to match the security parameter.
• d: Degree of field extension (default is 1 for linear factors).
• R: Number of rounds (default is 1).
• c: Columns in key-switching matrices (default is 2).
• L: Levels in the modulus chain (default is heuristic).
• s: Minimum number of plaintext slots (default is 0).
ACCELERATING HOMOMORPHIC ENCRYPTION
• Evolution of GPUs:
• GPUs evolved from simple display applications in the 1980s to complex computations like 3D rendering.
• In 2007, NVIDIA's CUDA enabled widespread use of General-Purpose GPU computing (GPGPU), managing multiple cores efficiently.
• Advantages of GPUs for HElib:
• GPUs act as mobile high-performance computing (HPC) devices, suitable for applications needing real-time data processing (e.g., on UAVs).
• Reprogrammable GPUs fit military and mobile applications by providing real-time feedback.
• Profiling and Identifying Intensive Tasks in HElib:
• Key HElib functions (encoding, encryption, arithmetic operations) were profiled to identify computational bottlenecks.
• Time-consuming, repetitive algorithms were targeted for GPU-based parallelization.
• CUDA Overhead:
• CUDA overhead includes memory allocation, transfers (host-to-device and device-to-host), and deallocation.
• GPU initialization incurs a one-time cost that must be factored in when porting code.
ACCELERATING HOMOMORPHIC ENCRYPTION
• Parallelization Strategy:
• Following Amdahl's Law, code sections with the most potential for parallelization were identified.
• BluesteinInit() and BluesteinFFT() were found to be the most computationally intensive functions in HElib, taking up about 10% and 46% of execution time, respectively.
• Implementation of Parallelization:
• The Bluestein FFT code was partially ported to the GPU using CUDA for significant speedup, leveraging GPU capabilities to reduce processing time for key operations in HElib.
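A quick Python check of Amdahl's Law applied to the profile figures above (roughly 10% + 46% = 56% of the runtime is a candidate for GPU acceleration):

def amdahl(f, s):
    """Overall speedup when a fraction f of the runtime is accelerated by a factor s."""
    return 1.0 / ((1.0 - f) + f / s)

for s in (2, 10, 100):
    print(f"GPU speedup {s:>3}x on 56% of the code -> overall {amdahl(0.56, s):.2f}x")
# Even with an infinite speedup on that 56%, the overall gain is capped at 1/(1-0.56) ≈ 2.3x.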
CallForFire
• Computation Overhead:
• Measures the time required for encryption, location computation, and decryption.
• Key findings:
• Encryption of location data (six parameters): ~0.64 seconds.
• Computation of the target's east and north coordinates (easting and northing): ~2.2 and ~2.34 seconds, respectively.
• Decryption of location data: ~39.2 seconds, which is higher due to the lack of efficient data packing.
• Transmission Overhead:
• Evaluates end-to-end time for transmitting encrypted vs. unencrypted data.
• Key findings:
• Transmission of the unencrypted FO location to the FDC: ~1.1 seconds.
• Transmission of the encrypted FO location to the FDC: ~23.5 seconds.
• The increased time reflects the additional overhead of encryption but still supports feasible communication.
• Storage Overhead:
• Assesses storage requirements for encrypted vs. unencrypted data.
• Key findings:
• Each encrypted location requires significantly more storage than the unencrypted version, due to the larger ciphertext size.
• Example: the average size of an encrypted location is around 8.96 MB, compared to 17.6 bytes for unencrypted data.
• Compression methods help reduce storage but remain a key area for optimization.
• Overall Feasibility:
• Despite higher computational and storage costs, CallForFire demonstrates that secure cloud-based operations are achievable.
• GPU-based parallelization and homomorphic encryption allow real-time processing for defense applications, balancing security and performance.
Thank you