
Session 3 WAHC ’21, November 15, 2021, Virtual Event, Republic of Korea

EVA Improved: Compiler and Extension Library for CKKS


Sangeeta Chowdhary, Rutgers University, New Brunswick, NJ, USA, [email protected]
Wei Dai, Microsoft Research, Redmond, WA, USA, [email protected]
Kim Laine, Microsoft Research, Redmond, WA, USA, [email protected]
Olli Saarikivi, Microsoft Research, Redmond, WA, USA, [email protected]

ABSTRACT
Homomorphic encryption (HE), especially the CKKS scheme, can be extremely challenging to use. The EVA language and compiler (Dathathri et al., PLDI 2020) was an attempt at addressing this challenge. EVA allows a developer to express their encrypted computation in a simple form with a Python-integrated language called PyEVA. It then compiles the program into an executable form by inserting operations such as relinearization and rescaling, applying optimizations, and choosing encryption parameters with the objective of minimizing execution time. Compiled programs can be executed with a parallelizing back-end against a library of HE primitives.
Our work improves upon the EVA toolchain in several ways: changes to the Python front-end make writing PyEVA programs more natural, while a rework of EVA’s C++ APIs makes writing new passes easier. We also implement two new optimizations, common subexpression elimination and reduction balancing, which we show allow users to write simpler and more modular PyEVA programs.
We argue that the abstraction EVA provides is insufficient to resolve some common usability challenges. For example, managing vectors of arbitrary size is non-trivial. To resolve these problems, we demonstrate how building a library of commonly used data structures and functions is simple in PyEVA. EVA’s automation allows writing very concise code, which gets fused and optimized together with the user program. We create the beginnings of an EVA Extension Library (EXL) that provides vector and matrix classes and a collection of common statistical functions, to demonstrate the power of this approach.

CCS CONCEPTS
• Security and privacy → Public key encryption.

KEYWORDS
compilers; fully homomorphic encryption

ACM Reference Format:
Sangeeta Chowdhary, Wei Dai, Kim Laine, and Olli Saarikivi. 2021. EVA Improved: Compiler and Extension Library for CKKS. In Proceedings of the 9th Workshop on Encrypted Computing & Applied Homomorphic Cryptography (WAHC ’21), November 15, 2021, Virtual Event, Republic of Korea. ACM, New York, NY, USA, 12 pages. https://doi.org/10.1145/3474366.3486929

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
WAHC ’21, November 15, 2021, Virtual Event, Republic of Korea
© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8656-2/21/11...$15.00
https://doi.org/10.1145/3474366.3486929

1 INTRODUCTION AND BACKGROUND
We will first discuss homomorphic encryption in general and details of the CKKS scheme, both to give background and to motivate the need for a compiler like EVA. Then we will introduce EVA from both a user’s and a developer’s point of view to motivate our contributions.

1.1 Homomorphic Encryption
Homomorphic encryption [22, 32] refers to encryption that allows computation to be done on encrypted data, without requiring secret key material. Modern fully homomorphic encryption schemes [6, 7, 12, 14, 20, 21, 23] support at least two different arithmetic or binary operations on encrypted data. In this work we focus exclusively on the CKKS scheme introduced in [12], which we will describe in the next section.
Since 2010 several homomorphic encryption libraries have been developed, implementing many of the schemes mentioned above. These libraries provide only low-level homomorphic encryption primitives, including key generation, encryption, decryption, a few scheme-dependent computational operations on encrypted data, and a variety of “maintenance” operations that are required for the functionality of the scheme. One problem is that homomorphic encryption schemes are generally very sensitive to the computation being organized in exactly the right way, and the maintenance operations must be used appropriately. Another problem is that the encryption schemes have a number of parameters that must be carefully chosen; otherwise, the result of the encrypted computation may be incorrect, or the performance may be several orders of magnitude worse than it needs to be. These details can be difficult to handle even for experts and make homomorphic encryption nearly inaccessible to most developers.
One approach to addressing these issues is to create a domain-specific language (DSL) tailored for homomorphic encryption, together with an optimizing compiler that reorders the computation according to some heuristic strategy, inserts necessary maintenance operations, selects optimal encryption parameters, and compiles the DSL into native homomorphic encryption library API calls. Several such compilers have been created recently, targeting different use-cases,
schemes, and library back-ends [4, 5, 15, 17–19]. In this work we improve upon the EVA compiler [18], which defines a general-purpose DSL for encrypted computations with the CKKS scheme, targeting the Microsoft SEAL [33] library back-end.

1.2 CKKS Scheme
The CKKS scheme [11, 12] was invented in 2016 and quickly gained significant popularity as one of the most useful homomorphic encryption schemes. It is now implemented in several libraries [28, 30, 33, 35]. We refer readers to [11] for full details on the scheme and will limit our discussion here to the parts that are essential for understanding the functionality.

1.2.1 Encryption Parameters. The CKKS scheme is parameterized by a power-of-two integer $n$ and an integer $q = \prod_i q_i$, where the $q_i$ are distinct prime numbers congruent to 1 modulo $2n$. In modern implementations the $q_i$ are usually up to around 60 bits in size, but may be much smaller as well. These parameters determine the security level of the scheme as described in [1]. In short, a larger $n$ increases the security level, whereas a larger $q$ decreases the security level. However, a larger $q$ enables computations with higher multiplicative depth to be computed on encrypted data. The number of primes in $q$ does not matter for security, but it does matter for the multiplicative depth. Let $\mathcal{P} = (n, \prod_{i=1}^{k} q_i)$ denote a parameter set for CKKS. We expand $\mathcal{P}$ to derive a chain of parameter sets $\mathcal{P}_0 \to \mathcal{P}_1 \to \cdots \to \mathcal{P}_{k'-1}$, where $\mathcal{P}_j = (n, \prod_{i=1}^{k-j} q_i)$ and $k' \le k$. Note that $\mathcal{P}_0 = \mathcal{P}$. We would like to emphasize that $q$ is an ordered product of (distinct) prime numbers $q_i$: the order matters a lot, as we will see soon.
We refer to the position of the parameter set used for a specific ciphertext or plaintext as its level. A ciphertext or plaintext using the parameter set $\mathcal{P}_0$ (resp., $\mathcal{P}_{k'-1}$) is said to be at the highest level (resp., lowest level). We will see below that it is easy to move ciphertexts and plaintexts down in the parameter chain until they reach the lowest level. Moving up in level is also possible, but much more challenging and less commonly implemented or used [9, 10, 27].

1.2.2 Encoding and Encryption. Before encrypting, data must first be encoded into CKKS plaintexts. The encoding process takes a parameter set $\mathcal{P}$ and a scale $s \in \mathbb{R}_{>0}$ and encodes an $n/2$-dimensional vector of complex numbers into a $(\mathcal{P}, s)$-plaintext $\mathrm{pt}_{\mathcal{P},s}$. The encoding process involves permutation, conjugation, and linear transformation on the complex vector, resulting in an output $n$-dimensional vector of real numbers, which are further multiplied with $s$ and rounded to $\mathbb{Z}$. Thus, the scale quite literally denotes the precision of the fractional part. Encryption converts $\mathrm{pt}_{\mathcal{P},s}$ into a $(\mathcal{P}, s)$-ciphertext $\mathrm{ct}_{\mathcal{P},s}$. There is no way to encode fewer than $n/2$ numbers into a $(\mathcal{P}, -)$-plaintext; consequently, a $(\mathcal{P}, -)$-ciphertext cannot encrypt fewer than $n/2$ numbers.

1.2.3 Changing the Level. Let $C_{\mathcal{P},s}$ and $P_{\mathcal{P},s}$ denote the $(\mathcal{P}, s)$-ciphertext and plaintext spaces, respectively. A parameter chain $\mathcal{P}_0 \to \mathcal{P}_1 \to \cdots \to \mathcal{P}_{k'-1}$ induces two operations between these spaces:
\[
\begin{aligned}
\mathrm{ModSwitch} &: C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_{j+1},s}, \quad P_{\mathcal{P}_j,s} \to P_{\mathcal{P}_{j+1},s} \\
\mathrm{Rescale} &: C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_{j+1},s/q_{k-j}}, \quad P_{\mathcal{P}_j,s} \to P_{\mathcal{P}_{j+1},s/q_{k-j}}
\end{aligned}
\]
In other words, modulus switching (ModSwitch) moves a ciphertext or plaintext one step down in level, without changing the scale, whereas rescaling (Rescale) moves a ciphertext or plaintext one step down in level and divides its scale by the prime number that was removed in the step $\mathcal{P}_j \to \mathcal{P}_{j+1}$ from the modulus $q$.
We note that encoding, encryption, and these operations do not commute in the mathematical sense as one might expect; instead, encryption is randomized and the other operations introduce various small amounts of error into the plaintexts and ciphertexts.

1.2.4 Encrypted Computing. The CKKS scheme supports encrypted element-wise vector addition, negation, multiplication, and complex conjugation, as well as cyclic vector rotation (in either direction); for addition and multiplication it supports both ciphertext-ciphertext and ciphertext-plaintext variants of the operations:
\[
\begin{aligned}
\mathrm{Add} &: C_{\mathcal{P}_j,s} \times C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_j,s}, \quad C_{\mathcal{P}_j,s} \times P_{\mathcal{P}_j,s} \to C_{\mathcal{P}_j,s} \\
\mathrm{Neg} &: C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_j,s} \\
\mathrm{Mul} &: C_{\mathcal{P}_j,s_1} \times C_{\mathcal{P}_j,s_2} \to C_{\mathcal{P}_j,s_1 s_2}, \quad C_{\mathcal{P}_j,s_1} \times P_{\mathcal{P}_j,s_2} \to C_{\mathcal{P}_j,s_1 s_2} \\
\mathrm{Conj} &: C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_j,s} \\
\mathrm{Rot} &: C_{\mathcal{P}_j,s} \times \mathbb{Z} \to C_{\mathcal{P}_j,s}
\end{aligned}
\]
As expected, addition requires both inputs to have the same scale, whereas multiplication tolerates different input scales and yields a result whose scale is the product of the input scales. For all binary operations both inputs must be at the same level.
There is one aspect of ciphertexts and plaintexts that we have ignored thus far: size. Concretely, each CKKS $(\mathcal{P}, -)$-plaintext is a single polynomial of degree at most $n - 1$ with coefficients modulo $q$, where $\mathcal{P} = (n, q)$; we say that the size of a plaintext is 1. Each freshly encrypted ciphertext consists instead of 2 such polynomials; thus, we say that the size of such a ciphertext is 2. In practice, addition, negation, ciphertext-plaintext multiplication, complex conjugation, and rotations all preserve the size. If addition takes two ciphertexts of different sizes as input, the output will have a size that is the larger of the input sizes. However, ciphertext-ciphertext multiplication increases the size as follows: if the inputs have sizes $A$ and $B$, the output has size $A + B - 1$. By far the most common case is multiplying two ciphertexts of size 2, resulting in an output ciphertext of size 3. Unfortunately, larger ciphertexts are very slow to operate on. To resolve this issue, CKKS supports a so-called relinearization operation that reduces the size of a ciphertext. In practice this always means reducing from size 3 back to size 2, without changing the underlying plaintext data:
\[
\mathrm{Relin} : C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_j,s}.
\]

1.2.5 Stabilizing the Scale. We define one more operation for CKKS: a manual scale change. This is a no-op that merely changes the metadata of a ciphertext or plaintext:
\[
\mathrm{ChangeScale}_{s'} : C_{\mathcal{P}_j,s} \to C_{\mathcal{P}_j,s'}.
\]
Namely, given a $(\mathcal{P}, s)$-plaintext, one can simply lie to the decoder, claiming that the plaintext has scale $s'$ instead. When the decoder divides the result vector (of complex numbers) by the scale, it incorrectly divides it by $s'$ instead of $s$, changing the result by a multiplicative factor $s/s'$ from what it was supposed to be. For example, given a positive real number $r \in \mathbb{R}_{>0}$, one could implement
a division-by-$r$ operation by simply changing the scale of a ciphertext or plaintext from initial scale $s$ to $rs$. In general this is not very practical or useful, but there is one very important application that we now describe.
Recall that all prime factors $q_i$ of the modulus $q$ are distinct. For example, consider a simple computation like $x^2 + y$ on encrypted inputs $x, y$, both encoded (and encrypted) at the highest level with the same scale $s$. Computing $x^2$ changes the scale to $s^2$ and maintains the same level; one may want to relinearize at this point, which does not affect the scale or level. But how can we add $y$ to the result? One option would be to change the scale of $y$ up: this is possible, for example, by encoding a vector of all ones with a scale $s$, and performing a ciphertext-plaintext multiplication with this plaintext. This approach has its uses, but the whole idea of the CKKS scheme is to enable removing unnecessary precision through rescaling, not adding more of it. Consider instead rescaling the $x^2$ result, yielding a $(\mathcal{P}_1, s^2/q_k)$-ciphertext, where $q_k$ is some prime. Since $y$ is a $(\mathcal{P}_0, s)$-ciphertext, we are far from being able to add these. We can modulus switch $y$ down one level, yielding a $(\mathcal{P}_1, s)$-ciphertext, but we still cannot add, as $s \ne s^2/q_k$. However, if $q_k \approx s$, then we could simply change the scale of the $x^2$ ciphertext to $s$, introducing a multiplicative scaling error $s/q_k \approx 1$, but allowing us to immediately compute the sum. In practice it is convenient to choose the prime numbers $q_i$ to be close to powers of two. For example, suppose every prime $q_i$ is chosen to be close to $2^{40}$; then we can encode all input data at scale exactly $2^{40}$ and the above technique results in the scale stabilizing at $2^{40}$ in multiplications. The forced scale changes result in a slight error in the result due to the primes $q_i$ only being close to $2^{40}$, but this is often insignificant and a small price to pay for the added convenience.
Scale stabilization often makes it convenient to rescale after every ciphertext-plaintext multiplication, and to relinearize and rescale after every ciphertext-ciphertext multiplication, although there are many situations where this naive rule results in much worse performance than more clever strategies. We would like to note that in some implementations the order of relinearization and rescaling may matter, but in SEAL it does not.
We would also like to note that many implementations, including SEAL, require the user to designate one or more of the primes in $q$ as special primes that do not take part in the parameter chain construction, but are used for other purposes. While EVA needs to know how to deal with them, special primes do not play any particular role in this work, and we will ignore them in the rest of the text for the sake of simplicity.

1.2.6 Error and Overflow. The CKKS scheme preserves approximate computation, producing results with errors in the least significant bits. One minor source of error, machine error in IEEE double precision floats, is introduced when the input complex vector is linearly transformed to a vector of real numbers in encoding (Section 1.2.2) and when the reverse transformation is performed in decoding. This is the maximum effective precision that can be achieved in most concrete implementations. Another, more important source of error is the so-called noise added to the least significant bits of encrypted data by all CKKS operations. To be precise, the CKKS scheme preserves approximate computation on a transformed vector of real or complex numbers, whose fractional precision is initially controlled by the user in choosing the scale. The exact fractional precision preserved lives between the noise and the scale, so if the scale is large enough, the relevant part of the data remains protected from the noise. In practice, one may want to test whether specific choices for scale in the inputs yield appropriate precision.
The underlying data in ciphertexts are real values multiplied by a scale $s$ and rounded to integers modulo $q$. If the integer part of the underlying data is too large, it exceeds the capacity of integers modulo $q$, causing an overflow situation. This typically wipes out the entire result. Overflow is rarely a problem in the encoding process, but can become a problem at lower levels (i.e., deeper into the encrypted computation). To avoid this problem, the programmer must ensure that throughout the computation the data always fits into integers modulo the current $q$. On one hand, encrypted computations often increase the underlying data. On the other hand, rescaling reduces the size of $q$. Thus, the user must ensure through proper choices of the scale, $q$, and the layout of the computation, that the data will not overflow, while maintaining a large enough scale for a high-enough effective precision.

1.2.7 Designing CKKS Programs. At this point the reader might already anticipate the challenge of designing CKKS programs. For a complicated program with multiple inputs and outputs it is incredibly difficult to find a good set of parameters that provides sufficient accuracy and efficiency, and yields consistency with the limitations of the various operations described above. It is difficult to appropriately balance the scales and levels for the internal “wires” in a computation. Moreover, changing the input scale often requires a total change in the encryption parameters and possibly changes to how the computation is structured, making it difficult for a programmer to test whether specific settings would yield better accuracy or performance than other settings. These are the problems the EVA language and compiler [18] were designed to solve.

1.3 EVA
To address the usability issues of CKKS, Dathathri et al. [18] presented a domain-specific language and an optimizing compiler targeting the CKKS scheme. EVA has a Python front-end, PyEVA, that allows computations to be expressed using basic arithmetic operations in Python, creates a computation graph, inserts appropriate rescaling (and other) operations, and uses heuristic approaches to find good encryption parameters.
We have made several changes to EVA to improve usability for both users and developers. We present EVA here in its improved form and detail our changes in Section 2.

1.3.1 PyEVA. The EVA toolchain creates several types of objects: a program (input or compiled), encryption parameters, a signature, a public context, and a secret context. The input program is created by the user using the PyEVA language. Such a program may look as follows:

    from eva import EvaProgram, Input, Output

    prog = EvaProgram('example', vec_size=4)
    with prog:
        inx = Input('x')
        iny = Input('y')
        sqsum = inx**2 + iny**2
        Output('out', sqsum + inx + iny)
    prog.set_input_scales(30)
    prog.set_output_ranges(20)

The above listing creates an EVA program computing the squared sum of two inputs plus the sum of the two inputs. Each value is an (encrypted) vector of size 4, and the program is evaluated in a SIMD fashion on the input vector values. Due to the limitations of how CKKS must be parameterized, the vector size must be a power of two. We set the scale for the program to be 30, which means EVA will attempt to stabilize the scales at $2^{30}$ (Section 1.2.5), and indicate that the absolute value of all outputs is at most $2^{20}$, which EVA uses to avoid overflow.
The next step is to compile the input program into a fully functioning EVA program that can be executed on encrypted data. For example, the above program is in no way ready to be executed: there is no information about what encryption parameters to use, where to perform modulus switching or rescaling, and where to relinearize. Compilation handles all this:

    from eva.ckks import CKKSCompiler

    compiler = CKKSCompiler()
    prog, params, signature = compiler.compile(prog)

EVA inserts the necessary maintenance operations into the computation and selects appropriate encryption parameters. In addition to the compiled program, EVA returns an object containing the encryption parameters, as well as a signature object describing how each input should be encoded.
Suppose you want to encrypt private input data and share it with an evaluator, e.g., an untrusted cloud computing environment, holding the program object. The evaluator would share the parameter and signature objects with you, allowing you to set up a public/secret key pair, and encrypt your input data:

    from eva.seal import generate_keys

    public_ctx, secret_ctx = generate_keys(params)
    inputs = {'x': [1, 2, 3, 4], 'y': [5, 6, 7, 8]}
    enc_input = public_ctx.encrypt(inputs, signature)

The public context holds only public key data and can in principle be shared with other parties for providing input data to the computation; the secret context holds the secret key and should be kept private. The input data is provided as a dictionary keyed by the input wire names. For each input wire, we provide a list of size 4, the vector size specified when the program object was created.
Next the encrypted inputs and the public context must be sent to the evaluator, who can then execute the EVA program:

    enc_output = public_ctx.execute(prog, enc_input)

The encrypted outputs can be sent back to the secret context owner, who can decrypt them and obtain the results:

    outputs = secret_ctx.decrypt(enc_output, signature)

The decrypted result is a dictionary, keyed by names of the outputs. In this case we had only a single output wire named out. Printing outputs['out'] shows a list of size 4.

1.3.2 Internal Representation (IR). EVA has a unified IR for input programs, all stages of compilation and executable format. This IR has an in-memory representation implemented in C++ on which the compiler’s passes operate. Additionally, there is a serialized representation implemented with Protocol Buffers for use as a wire and on-disk format.

    Opcode             Description
    -----------------  -------------------------------------------
    Input, Output      Markers for input/output values.
    Constant           Produce a Raw value.
    Negate             Unary negation.
    Add, Sub, Mul      Binary arithmetic operations.
    RotateLeftConst    New: Rotate left by a fixed offset.
    RotateRightConst   New: Rotate right by a fixed offset.
    -----------------  -------------------------------------------
    Relinearize        Relinearize result of multiplication.
    ModSwitch          Drop next modulus.
    Rescale            Use next modulus to divide value and scale.
    Undef              Illegal in valid programs. Internal use only.
    Encode             New: Encode Raw value into Plain.

Table 1: Operations supported by EVA, including new ones introduced in this work. Opcodes above the line are the ones user programs may use.

EVA programs are directed acyclic graphs (DAGs) of terms, where a term has and maintains:

    op         The opcode that determines the computation performed by this term, one of those in Table 1.
    operands   List of all terms used as operands for this term.
    uses       List of all terms that include this term in their operands.

Each term is additionally categorized into one of three types: Cipher, Plain and Raw (new; see Section 1.2.2). Types are deduced by EVA at compile time.
In addition to the terms, each EVA program has and maintains:

    name                 User-given name for the program.
    sources and sinks    The terms that have no operands and uses, respectively. These are used as entry points for forward and backward traversals.
    inputs and outputs   Maps from names to Input and Output terms, respectively.
    vec_size             Global size for all EVA vectors. Independent of the SEAL vector size, i.e., the number of slots.

1.3.3 Passes. EVA operates on this IR with two kinds of program traversals: rewrite passes are allowed to modify, create and remove terms and are always single-threaded, while analysis passes only visit each term, but may use a highly scalable multi-core traversal implemented with the Galois library [29]. Both kinds of passes may additionally use a forward or backward traversal, which guarantees that each term’s operands or uses, respectively, are visited before the term itself.
Table 2 lists both existing passes in EVA and passes added in this work. Most passes use a forward traversal, with ModSwitcher being the only example of a backward pass.

¹ The Minimum and Always policies can’t handle some programs, in which case this pass will instruct the user to switch policies.
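
The forward-traversal guarantee from Section 1.3.3 (every operand is visited before the term that uses it) can be illustrated with a small, sequential sketch. This is not EVA's implementation: EVA's IR is written in C++ and its analysis passes may run in parallel, whereas the `Term` class, its `payload` field, and the `forward_traversal` and `evaluate` functions below are hypothetical names chosen for illustration. The sketch assumes a Kahn-style worklist seeded with the program's sources, and it evaluates on plain Python lists rather than ciphertexts, loosely mimicking a back-end that builds a map from terms to values.

```python
from collections import deque


class Term:
    """Hypothetical stand-in for an EVA term: opcode, operands, and uses."""

    def __init__(self, op, operands=(), payload=None):
        self.op = op
        self.operands = list(operands)
        self.uses = []            # terms that include this term in their operands
        self.payload = payload    # input name (toy model only, not an EVA field)
        for operand in self.operands:
            operand.uses.append(self)


def forward_traversal(sources):
    """Yield each reachable term only after all of its operands were yielded."""
    waiting = {}                  # id(term) -> operand slots not yet visited
    ready = deque(sources)        # sources have no operands, so they start ready
    while ready:
        term = ready.popleft()
        yield term
        for user in term.uses:    # one entry per operand slot, so x*x appears twice
            left = waiting.get(id(user), len(user.operands)) - 1
            waiting[id(user)] = left
            if left == 0:
                ready.append(user)


def evaluate(sources, inputs):
    """Toy analysis pass: build a map from terms to plain (unencrypted) values."""
    values = {}
    for term in forward_traversal(sources):
        args = [values[id(operand)] for operand in term.operands]
        if term.op == "Input":
            values[id(term)] = list(inputs[term.payload])
        elif term.op == "Add":
            values[id(term)] = [a + b for a, b in zip(*args)]
        elif term.op == "Mul":
            values[id(term)] = [a * b for a, b in zip(*args)]
        elif term.op == "Output":
            values[id(term)] = args[0]
        else:
            raise ValueError(f"unsupported opcode: {term.op}")
    return values


# x**2 + y**2 + x + y, mirroring the PyEVA example from Section 1.3.1.
x = Term("Input", payload="x")
y = Term("Input", payload="y")
sqsum = Term("Add", [Term("Mul", [x, x]), Term("Mul", [y, y])])
out = Term("Output", [Term("Add", [Term("Add", [sqsum, x]), y])])

values = evaluate([x, y], {"x": [1, 2, 3, 4], "y": [5, 6, 7, 8]})
result = values[id(out)]
print(result)  # prints [32, 48, 68, 92]
```

Because the `uses` lists are maintained alongside `operands`, the traversal needs no separate graph search: it only counts down operand slots. A backward traversal would follow the same pattern with the roles of `operands` and `uses` swapped and the sinks as the starting worklist.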

    Pass                            Description
    ------------------------------  ----------------------------------------------------------
    CommonSubexpressionEliminator   New: Combine duplicate terms.
    TypeDeducer                     Populate map from terms to types.
    ConstantFolder                  Replace terms with only Constant operands with a Constant term.
    ReductionCombiner               New: Flatten trees of Add or Mul terms into N-ary terms.
    ReductionLogExpander            New: Expand N-ary terms
    MinimumRescaler                 Rescaling policy; see [18].
    AlwaysRescaler                  Rescaling policy; see [18].
    EagerWaterlineRescaler          Rescaling policy; see [18].
    LazyWaterlineRescaler           Similar to Eager version, but delay rescaling until Mul terms.
    EncodeInserter                  New: Add Encode terms for Raw operands of Cipher terms.
    EagerRelinearizer               Add Relinearize after each Mul.
    LazyRelinearizer                Add Relinearize after each Mul as late as possible.
    ModSwitcher                     Add ModSwitch terms to bring unequal operands to same level.
    SEALLowering                    Do SEAL specific transformations.
    LevelsChecker                   Assert all operand levels match.
    ParameterChecker                Check Rescale consistency.¹
    ScalesChecker                   Check that operands of Add and Sub terms have equal scale.
    EncryptionParametersSelector    Select CKKS parameters based on Rescale terms.
    RotationKeysSelector            Find all rotation offsets required.

Table 2: Passes implemented in EVA. Listed in order of use in EVA’s CKKS compiler, although not all are always used.

1.3.4 Back-end. EVA includes a back-end for executing compiled programs against SEAL’s CKKS implementation. This is implemented using a forward analysis pass that constructs a map from terms to values, with optional use of the multi-core support for parallelization. Mappings for Input terms are initialized with input values provided by the user, and after the pass has run, output is produced from the values stored for any Output terms.

1.3.5 Limitations. PyEVA allows the user to specify whether Input terms are encrypted or not. For example, a computation may involve data from multiple data owners, one of which hopes to input encrypted data, and the other unencrypted data. However, EVA did not implement support for computation between unencrypted values, as SEAL does not implement pure plaintext operations. Section 2.4 describes our changes to enable these scenarios.

1.4 Our Contributions
Our contributions are two-fold. First, we improve the EVA toolchain in multiple ways:
• PyEVA now has: (1) transparent support for Python numbers and lists; (2) named inputs and outputs instead of positional; (3) scaling specified outside the program (Section 2.1).
• A rewrite of EVA’s internal APIs makes it easier for developers to extend EVA with new passes (Section 2.2).
• A new Raw type and Encode operation allow unencrypted inputs to be used on multiple levels without having to perform modulus switching on SEAL plaintexts or transmit multiple encodings of the same input (Section 2.4).
• Two new optimizations, common subexpression elimination (CSE) and reduction balancing, allow simpler and more modular PyEVA programs (Sections 2.5 and 2.6).
Second, we note that despite our improvements to PyEVA, many common computations can still be relatively tedious to write due to its low level of abstraction. As one example, summing the elements of a ciphertext (i.e., “horizontal sum”) is a very standard operation, but is non-trivial to implement.² As another example, basic vector and matrix arithmetic can be surprisingly complicated to implement with EVA due to the limitations in the vector sizes that the SEAL back-end supports.³

² A version with alternating power-of-two rotations and addition requires $O(\log n)$ operations while the naïve sum of all rotations needs $O(n)$.
³ This is not exactly a limitation of SEAL, but rather a limitation of the algebraic structures CKKS is based on.

We propose an EVA Extension Library (EXL) containing such common functions implemented in PyEVA. The benefits of implementing EXL in PyEVA, instead of directly against SEAL, are: (1) it is much easier to do; (2) EVA can co-optimize EXL functions together with the rest of the program. We create the beginnings of EXL, including the horizontal sum function, simple vector and matrix classes, and a few statistical functions, to demonstrate the power of this approach (Section 3).

1.5 Related Work
Multiple homomorphic encryption compiler projects have been done in the past, targeting different schemes, different library back-ends, and different use-cases [37]. Some of the projects clearly define a custom DSL and an obvious optimizing compiler component, whereas others could more correctly be called libraries, wrapping the hard-to-use homomorphic encryption libraries with a more convenient interface targeting specific applications. Most of the projects are incomparable with each other: they target different scenarios or back-ends with vastly different properties.
One of the first attempts at a homomorphic encryption compiler was created for the IARPA RAMPARTS program [2]. This compiler allowed the programmer to write computations in the Julia language, and translated them to appropriate PALISADE [30] API calls.
Alchemy [17] compiles for the Λ∘λ library [16], which implements a BGV scheme variant. A valid program written in the Alchemy DSL is guaranteed to work correctly and minimize the chance of runtime errors. This is an important property for compilers, because mistakes in homomorphic encryption programs written using native library APIs, or improper parameterizations, will typically lead to runtime errors that can be difficult to debug for non-experts.
Some compilers even support multiple back-ends. Cingulata [8] (earlier known as Armadillo) allows the user to write programs in C++ and translates them into Boolean or arithmetic circuits that can be executed using the TFHE library (Boolean circuits) or a
custom implementation of BFV (arithmetic circuits). SHEEP [34] is possibly the most generic of all compiler projects: it defines a DSL that can be compiled to multiple library back-ends and schemes. Marble [38] is to some extent similar in nature to Cingulata. E3 [13] is also designed for multiple back-ends.

Porcupine [15] specifically targets the difficulty in manually vectorizing programs. As was explained earlier, homomorphic encryption often operates on large encrypted vectors in a SIMD fashion. However, producing code that leverages this parallelism is not straightforward for developers; this is what Porcupine helps with. It compiles a custom DSL to the BFV scheme API in SEAL.

CHET [19], nGraph-HE [4, 5], SEALion [36], and TenSEAL [3] target machine learning applications by compiling primarily to the CKKS scheme.

Most recently Google announced the homomorphic encryption transpiler [24] targeting the TFHE scheme. The transpiler allows the user to write normal C++ functions and repurposes the XLS toolchain to translate the functions to the TFHE API (Boolean circuits), instead of Verilog code.

As has been explained above, EVA [18] defines a DSL and compiles to the CKKS scheme API in SEAL. The present work improves upon EVA.

2 IMPROVED EVA
This section describes and motivates the improvements we have made to EVA. We have also made EVA available open-source under the MIT license to engage the homomorphic encryption users and research communities.4

4 URL to be provided after notification.

2.1 PyEVA Front-End Improvements
We have made several improvements to PyEVA that make writing programs more natural. Figure 1 shows Sobel edge detection implemented with the improved PyEVA language. We invite the reader to compare this with Figure 6 in [18], which presented the same program in the original PyEVA language.

     1  def convolution(image, width, kernel):
     2      conv = 0
     3      for i in range(len(kernel)):
     4          for j in range(len(kernel[0])):
     5              rot = image << i * width + j
     6              conv += rot * kernel[i][j]
     7      return conv
     8
     9  def sqrt(x):
    10      return 2.214*x - 1.098*x**2 + 0.173*x**3
    11
    12  sobel = EvaProgram("sobel", vec_size=64*64)
    13  with sobel:
    14      image = Input("image")
    15      hor = convolution(image, 64,
    16          [[-1,0,1], [-2,0,2], [-1,0,1]])
    17      ver = convolution(image, 64,
    18          [[-1,-2,-1], [0,0,0], [1,2,1]])
    19      Output("image", sqrt(hor**2 + ver**2))

Figure 1: Sobel edge detection in PyEVA

PyEVA now allows using Python numbers and lists of numbers transparently in expressions involving encrypted values. When a Python value is encountered, EVA automatically creates a Constant term. In contrast, the old PyEVA required the user to call a constant(scale,value) function to mark constants and indicate their scale. A consequence of this is that the user no longer has to set the scales of constants, and instead EVA’s new Raw type support described in Section 2.4 infers the required scales.

The new PyEVA omits scales in the Input and Output functions. Instead, the user calls set_input_scales and set_output_ranges to set these parameters. This allows the orthogonal concern of setting appropriate scales to be separated from specifying the computation.

EVA and PyEVA now use named arguments instead of positional arguments. This makes it easier to rearrange the computation inside a PyEVA program without having to change application code that handles inputs and outputs.

2.2 Making Passes Easy to Write
A primary goal of EVA, alongside making homomorphic encryption accessible to non-experts, is to make it easy for homomorphic encryption experts to translate their knowledge into executable form. By implementing passes that achieve the optimizations they desire, their knowledge can benefit every new application and user. From this point of view, it is critical that the internal APIs of EVA are expressive and ergonomic to use.

Dealing with loops in transformations and analyses is in our view one of the main challenges in compilers for beginners. The original EVA [18] took the most important step in making passes easy to write by designing the IR as a loop-free DAG. Instead of having to think about lattices, fixpoints, and termination, developers of new passes can focus on the logic of their transformations inside simpler forward and backward traversals. The tradeoff is that the program representation is less compact, but in our experience this has not been a problem because, due to the relative slow-down of homomorphic encryption, programs tend to be smaller than traditional ones.

We have continued the work started in the original EVA by further refining its APIs to make writing passes easy. This section details the most important improvements.

2.2.1 Rewriting API. We have analyzed the API usage patterns in existing passes and identified the most common transformations made to the DAG. While passes in the original EVA operated directly on C++ vectors for both operands and uses, we have encapsulated common modifications as the following new member functions in EVA’s Term class:

    void addOperand(const Ptr &term);
    bool eraseOperand(const Ptr &term);
    bool replaceOperand(Ptr oldTerm, Ptr newTerm);
    void setOperands(std::vector<Ptr> o);
    void replaceUsesWithIf(Ptr term,
                           std::function<bool(const Ptr &)>);
    void replaceAllUsesWith(Ptr term);
    void replaceOtherUsesWith(Ptr term);
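As a rough illustration of how operand-editing member functions can keep use lists consistent without any work from the pass author, the following Python sketch models a toy term class. It is not EVA's implementation, which is written in C++: the class layout and the snake_case names add_operand, replace_operand, and replace_all_uses_with are assumptions made only for this example.

```python
class Term:
    """Toy DAG node whose operand edits keep use lists in sync (hypothetical)."""

    def __init__(self, op, operands=()):
        self.op = op
        self.operands = []  # terms this term reads
        self.uses = []      # terms that read this term, one entry per edge
        for operand in operands:
            self.add_operand(operand)

    def add_operand(self, term):
        # Record the edge in both directions so the two lists never diverge.
        self.operands.append(term)
        term.uses.append(self)

    def replace_operand(self, old, new):
        """Replace every occurrence of old in the operand list; report success."""
        replaced = False
        for index, operand in enumerate(self.operands):
            if operand is old:
                self.operands[index] = new
                old.uses.remove(self)
                new.uses.append(self)
                replaced = True
        return replaced

    def replace_all_uses_with(self, term):
        """Redirect every user of self to term instead."""
        # Iterate over a de-duplicated snapshot because the use list shrinks
        # while operands are being replaced.
        for user in list(dict.fromkeys(self.uses)):
            user.replace_operand(self, term)


a = Term("Input")
b = Term("Input")
dup = Term("Input")
mul = Term("Mul", [a, dup])
add = Term("Add", [dup, dup])

# A pass only states its intent; the use lists are updated as a side effect.
dup.replace_all_uses_with(b)
```

After the call, mul reads a and b, add reads b twice, and dup has no remaining uses, so under an ownership scheme in which only users keep their operands alive, dup could be freed.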


The replace*UsesWith* functions significantly simplified existing passes in EVA by replacing complex loops over the uses of terms. Original EVA required passes to manually keep the pointers from terms to their uses up to date, which was a common source of bugs. Our changes remove direct write access to the operands list and instead have the member functions of terms automatically manage the use list. This significantly simplified code in passes.

The Ptr type above is an std::shared_ptr, which makes memory management simple. The original EVA had chosen to use shared_ptr for use pointers and weak_ptr for operands, which did not matter when uses were manually updated. However, with automatic use management, switching the direction of ownership gave us dead code elimination for free: any term that does not appear in a subexpression of an Output term is automatically freed.

2.2.2 Attributes. To improve EVA’s API ergonomics we have introduced attributes, which are strongly typed named constants attached to terms. Attributes solve several problems in EVA:

• Named and strongly typed attributes make passes more readable and safer by avoiding manual type checks.
• EVA’s type system for terms can be kept simple and targeted to homomorphic encryption by moving complex metadata into attributes.
• As EVA adds support for new schemes, back-ends and optimizations, adding information as fields in terms would inflate the memory footprint for all terms and during all passes. Attributes allow only storing what is required.

Table 3 lists the attributes currently included and which terms they are used in. Attributes are accessed with templated has<A>, get<A> and set<A> methods directly from the Term class, where A is a tag type naming the attribute. Terms store their attributes in a linked list that embeds the first element in the term instance, which for most cases avoids an indirection.

    Attribute                 Type                  Used in
    RescaleDivisorAttribute   uint32_t              Rescale
    RotationAttribute         int32_t               Rotate*Const
    ConstantValueAttribute    shared_ptr<Constant>  Constant
    TypeAttribute             Type                  Input
    RangeAttribute            uint32_t              Output
    EncodeAtScaleAttribute    uint32_t              Input, Encode
    EncodeAtLevelAttribute    uint32_t              Input, Encode

Table 3: Attributes currently in EVA.

2.2.3 Term Maps. While attributes are suitable for metadata that should be persisted, many passes require tracking more ephemeral per-term information. For example, EVA’s rescaling policies track the scale of all terms, but this is only required at execution time for Input and Encode terms. For such usage we provide two template classes: TermMap<T> and TermMapOptional<T>. These encapsulate a contiguous array of type T that can be indexed directly with term instances. Internally EVA maintains a unique index for each term in a program and automatically manages space in any registered term maps when new terms are created.

Term maps provide a safe replacement for EVA’s previous ad-hoc usage of std::vector with a globally incremented index per term. Hashmaps would have been another option, but EVA’s multi-core support would have required a concurrent implementation. Term maps are trivially thread safe when passes write only to the current term, thanks to pre-allocation of the buffer.

2.3 Pseudo-Code
For readability, the following sections present our new passes in pseudocode that closely matches the structure of their C++ counterparts in EVA. Common mathematical notation is used for brevity. The global mappings used in the pseudocode correspond to term maps. A map 𝑀 is indexed with a term 𝑡 with 𝑀[𝑡] and an uninitialized map is ∅. A list 𝐿 can be indexed by a non-negative integer 𝑖 with 𝐿[𝑖], lists are constructed with [𝑎, 𝑏, 𝑐, . . . ] and concatenated with 𝐿 ++ 𝐿′. Attributes are accessed with 𝑡.get⟨AttributeName⟩, modified with 𝑡.set⟨AttributeName⟩(), and their names are abbreviated for brevity.

2.4 Raw Type and Encoding Insertion
Many applications need to deal with both encrypted and unencrypted data. For example, machine learning inferencing tasks may operate on encrypted inputs, but allow weights to be unencrypted. EVA’s previous approach to unencrypted inputs was to perform CKKS encoding for all input values, but skip the encryption step for unencrypted inputs. This, however, meant that arithmetic between unencrypted values was not supported, as SEAL does not offer arithmetic for plaintext values and, while implementable, it would not be desirable either due to the slowdown involved. While it is always possible to move these kinds of computations to the surrounding application code, allowing unencrypted arithmetic inside EVA programs is more flexible. Another issue with performing encoding early is that this inflates the size of those inputs and may result in higher communication costs.

We have extended EVA’s type system with a new Raw type alongside the existing Cipher and Plain types. Values of type Raw represent vectors of vec_size double precision floating point elements. All Constant terms and Inputs that the user has marked unencrypted (by passing is_unencrypted=True to Input) are of type Raw. All of EVA’s arithmetic operations are supported between terms of type Raw, while operations specific to homomorphic encryption, such as ModSwitch, naturally are not.

To move values of type Raw to Plain, we’ve added a new opcode, Encode, which has a single operand for the Raw value to be encoded and two attributes, Scale and Level, to control how the value gets encoded. Encode terms are added by a new Encoding Insertion pass detailed in Algorithm 1. It depends on two prepopulated maps: 𝑆 for the scales and 𝑇 for types of all existing terms, which are produced by EVA’s rescaling insertion and type deduction passes, respectively. For each term of type Cipher, the pass checks if any of the operands are of type Raw and if so inserts new Encode terms. One complication handled by the pass is that for Add and Sub all operands must have the same scale and thus the pass sets the Scale attribute appropriately. For multiplications the scale set for the operand is used, which is typically one set by the user in PyEVA with set_input_scales. Encode terms also need the Level attribute


set, but this is added during EVA’s modulus switching insertion pass, which we modified to add support for Encode.

Require: 𝑆 and 𝑇 are maps from existing terms to their scales and types, respectively, and VisitEI is used in a forward traversal.
Ensure: There are no terms with mixed Cipher and Raw operands.
     1: procedure VisitEI(𝑡)
     2:   if 𝑇[𝑡] = Cipher then
     3:     for 𝑜 ∈ 𝑡.operands s.t. 𝑇[𝑜] = Raw do
     4:       𝑒 ← Newterm(Encode, [𝑜])
     5:       if 𝑡.op = Add ∨ 𝑡.op = Sub then
     6:         𝑒.set⟨Scale⟩(𝑆[𝑡])
     7:       else
     8:         𝑒.set⟨Scale⟩(𝑆[𝑜])
     9:       𝑇[𝑒] ← Plain
    10:       𝑆[𝑒] ← 𝑒.get⟨Scale⟩
    11:       𝑡.replaceOperand(𝑜, 𝑒)

Algorithm 1: Encoding Insertion

2.5 Common Subexpression Elimination
Common Subexpression Elimination (CSE) is a well known optimization that ensures a program has a unique representative term for each syntactically equivalent subexpression. Providing a CSE pass in EVA means that authors of EVA programs do not have to memoize common subexpressions. As an example, consider the program in Figure 1. The program calls the convolution function twice to run separate filters for detecting horizontal and vertical edges. However, as both are 3×3 kernels they will perform the same rotations on line 5. While this could be remedied in the user code by memoizing the results of line 5 or by fusing the loops of the two invocations (as was done in Figure 6 of [18]), being able to rely on CSE greatly simplifies the user code. Similarly, on line 10 the polynomial approximation of square root does not have to factor out the shared powers of x.

We have implemented CSE in EVA as detailed in Algorithm 2. During a forward pass CSE checks for each term if a syntactically equivalent term 𝑟 was already visited earlier. If so, all uses of the current term 𝑡 are redirected to 𝑟, by replacing 𝑡 with 𝑟 in all operand lists that mention 𝑡. If no such 𝑟 exists, then 𝑡 is kept and remembered as the representative of its syntactic equivalence class.

SyntacticEqals checks that the terms have the same opcode and operands. Crucially, the operands are compared using pointer equality, avoiding recursive calls into SyntacticEqals. This works, because in a forward pass operands are visited before the term itself. After checking the opcode and operands, operation specific attributes may be checked. For example, in the case of rescaling it is checked that the divisors match. Inputs and outputs are never eliminated, as EVA already ensures there’s a single representative for each named input/output. The implementation of CSE uses a hash set for 𝑈, which requires hash codes in addition to the equality operation. These syntactic hashes are calculated using the same fields that SyntacticEqals uses.

Require: 𝑈 = ∅ and VisitCSE is used in a forward traversal.
Ensure: Program has no two syntactically equivalent terms.
     1: procedure VisitCSE(𝑡)
     2:   if ∃𝑟 ∈ 𝑈 : SyntacticEqals(𝑡, 𝑟) then
     3:     𝑡.replaceAllUsesWith(𝑟)    ⊲ Remove term 𝑡.
     4:   else
     5:     𝑈 ← 𝑈 ∪ {𝑡}
     6: procedure SyntacticEqals(𝑡, 𝑟)
     7:   if 𝑡.op ≠ 𝑟.op ∨ 𝑡.operands ≠ 𝑟.operands then
     8:     return False
     9:   switch 𝑡.op do
    10:     case Undef
    11:       return False
    12:     case Negate, Add, Sub, Mul, Relinearize, ModSwitch
    13:       return True
    14:     case Input, Output
    15:       return 𝑡 = 𝑟    ⊲ Pointer equality.
    16:     case Constant
    17:       return 𝑡.get⟨ConstValue⟩ = 𝑟.get⟨ConstValue⟩
    18:     case RotateLeftConst, RotateRightConst
    19:       return 𝑡.get⟨Rotation⟩ = 𝑟.get⟨Rotation⟩
    20:     case Rescale
    21:       return 𝑡.get⟨Divisor⟩ = 𝑟.get⟨Divisor⟩
    22:     case Encode
    23:       return 𝑡.get⟨Scale⟩ = 𝑟.get⟨Scale⟩ ∧ 𝑡.get⟨Level⟩ = 𝑟.get⟨Level⟩
    24:   return False    ⊲ Unreachable, but it’s safe to return false.

Algorithm 2: Common Subexpression Elimination

2.6 Reduction Balancing
Consider the following arithmetic expressions:

    (𝑎 · 𝑏) · (𝑐 · 𝑑)        (1)
    ((𝑎 · 𝑏) · 𝑐) · 𝑑        (2)

While arithmetically equivalent, expression 1 is better because it has a lower multiplicative depth. While it is always possible to avoid the equivalent of expression 2 in PyEVA programs manually, doing so may make code less modular. Consider the following program:

    def poly(x):
        return 0.837 * x**2

    prog = EvaProgram("inbalanced", vec_size=4096)
    with prog:
        a = Input("a")
        b = Input("b")
        Output("c", poly(a) * b)

The expression constructed for the output c is essentially ((𝑥 · 𝑥) · 0.837) · 𝑏, which has depth 4. Balancing this expression in user code would require inlining the poly function by rewriting line 8 to Output("c", (0.837 * b) * a**2). In some cases the exponent operator has to be avoided too, as for example a * b**3 produces an unbalanced expression.
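The difference between expressions 1 and 2 can be checked with a small Python sketch. It is purely illustrative: the nested-tuple encoding of expressions and the mul_depth helper are assumptions made for this example, and the sketch counts every multiplication as exactly one level, which may differ from how EVA accounts for constants or rescaling.

```python
def mul_depth(expr):
    """Multiplicative depth of a nested-tuple expression (hypothetical encoding).

    A leaf is any non-tuple value. An inner node is ("mul", lhs, rhs) or
    ("add", lhs, rhs); only "mul" nodes add a level.
    """
    if not isinstance(expr, tuple):
        return 0
    op, lhs, rhs = expr
    depth = max(mul_depth(lhs), mul_depth(rhs))
    return depth + 1 if op == "mul" else depth


# Expression 1: (a * b) * (c * d)
balanced = ("mul", ("mul", "a", "b"), ("mul", "c", "d"))
# Expression 2: ((a * b) * c) * d
chained = ("mul", ("mul", ("mul", "a", "b"), "c"), "d")

depth_balanced = mul_depth(balanced)
depth_chained = mul_depth(chained)
```

Under this counting the balanced form has depth 2 and the chained form depth 3. For a product of 𝑛 factors the gap grows from roughly log₂ 𝑛 levels to 𝑛 − 1 levels.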


To remedy these problems we have added a reduction balancing feature to EVA by implementing two passes that run in succession. The first pass, detailed in Algorithm 3, flattens any trees of reductions with either the Mul or Add operation into a single term with all the source terms to the subtree as operands. Note that terms with multiple uses are always retained even if they are otherwise part of a reduction subtree, since that intermediate result is needed elsewhere.

Require: VisitRC is used in a forward traversal.
Ensure: Trees of reductions with Mul or Add are collapsed into single terms with multiple operands.
     1: procedure VisitRC(𝑡)
     2:   if 𝑡.op ≠ Mul ∧ 𝑡.op ≠ Add then
     3:     return
     4:   if |𝑡.uses| = 1 then
     5:     use ← 𝑡.uses[0]
     6:     if use.op = 𝑡.op then
     7:       while use.eraseOperand(𝑡) do
     8:         for 𝑜 ∈ 𝑡.operands do
     9:           use.addOperand(𝑜)

Algorithm 3: Reduction Combiner

The second pass, detailed in Algorithm 4, expands the flattened reductions back into balanced binary trees. One subtlety is that it is beneficial to reduce Raw terms together first, and likewise Cipher terms of a similar scale. For example, if in expression 1 terms 𝑎 and 𝑐 are of type Raw, while 𝑏 and 𝑑 are Cipher, then the encoding insertion pass in Algorithm 1 will add two Encode terms. However, if the Raw terms are 𝑎 and 𝑏 instead, only one Encode term is required. The sorting on line 6 together with the scale estimation in the SetScale procedure implement a heuristic that groups terms with the same type and scale together.

The balanced reduction trees produced by these passes minimize the amount of modulus consumed by multiplications. The transformation is also useful in the case of Add operations as balanced trees expose more parallelism: an unbalanced tree requires 𝑂(𝑛) time to evaluate in the worst case, while with sufficient processors a balanced tree only requires 𝑂(log 𝑛) time.

Require: 𝑆 = ∅, 𝑇 is a map from existing terms to their types, and VisitRLE is used in a forward traversal.
Ensure: There are no terms with mixed Cipher and Raw operands.
     1: procedure VisitRLE(𝑡)
     2:   SetScale(𝑡)
     3:   if (𝑡.op ≠ Mul ∧ 𝑡.op ≠ Add) ∨ |𝑡.operands| ≤ 2 then
     4:     return
     5:   𝑂 ← 𝑡.operands
     6:   𝑂 ← Sorted(𝑂, 𝜆𝑎, 𝑏 : 𝑇[𝑎] = Cipher ⇒ 𝑇[𝑏] = Cipher ∧ 𝑆[𝑎] ≤ 𝑆[𝑏])
     7:   while |𝑂| > 2 do    ⊲ Expand 𝑂 into balanced tree.
     8:     𝑂′ ← []
     9:     𝑖 ← 0
    10:     while 𝑖 + 1 < |𝑂| do
    11:       𝑂′ ← 𝑂′ ++ [Newterm(𝑡.op, [𝑂[𝑖], 𝑂[𝑖 + 1]])]
    12:       𝑖 ← 𝑖 + 2
    13:     if 𝑖 < |𝑂| then
    14:       𝑂′ ← 𝑂′ ++ [𝑂[𝑖]]
    15:     𝑂 ← 𝑂′
    16:   assert |𝑂| = 2
    17:   𝑡.operands ← 𝑂
    18: procedure SetScale(𝑡)
    19:   if |𝑡.operands| = 0 then
    20:     𝑆[𝑡] ← 𝑡.get⟨Scale⟩    ⊲ Sources have user given scales.
    21:   else if 𝑇[𝑡] = Cipher ∧ 𝑡.op = Mul then
    22:     𝑆[𝑡] ← Σ_{𝑜 ∈ 𝑡.operands} 𝑆[𝑜]
    23:   else
    24:     assert ∀𝑜, 𝑜′ ∈ 𝑡.operands : 𝑆[𝑜] = 𝑆[𝑜′]
    25:     𝑆[𝑡] ← 𝑆[𝑡.operands[0]]

Algorithm 4: Reduction Log Expander

3 EVA EXTENSION LIBRARY
3.1 Why an Extension Library?
Even though EVA greatly improves the user experience of CKKS, we note that it still leaves much to be desired. We illustrate the problem with a few examples.

Consider computing the sum of all elements in an EVA vector, e.g., when evaluating a high-dimensional dot product, where the EVA program’s vector size is set to a large value (several thousands) and a single SIMD multiplication is performed between two input vectors. To complete the dot product, the elements of the multiplied vector must be added up. This can be done using rotations and additions, but a naïve version will use several thousand rotations, while a good one gets by with a logarithmic number of rotations. Ideally, a good implementation for the “horizontal sum” would be readily available to avoid this pitfall.

It is common for applications to include computations that are not directly supported in homomorphic encryption, in which case polynomial approximation becomes useful. Given an input function, a Taylor polynomial, a Chebyshev approximation, or some other polynomial approximation valid for a specific input domain may be used. It is very application dependent which kind of approximation is appropriate, so a library of several methods would be useful.

As a third example, consider the case of arbitrary-sized vectors or matrices. Such objects are not simple to implement using the power-of-two size vectors that EVA natively supports. While these could be directly offered in EVA’s back-end, especially for matrices there are numerous ways to encode the data and implementing them all would inflate the complexity of EVA’s core.

We propose building an EVA Extension Library (EXL) in Python using PyEVA to address these use-cases. We argue that such a library would be impractical to build directly using SEAL or other implementations of CKKS, as such low-level implementations would become very complex due to the lack of optimizations and automation that a compiler like EVA provides. Building EXL in PyEVA makes it accessible to a huge group of developers, makes it modular and convenient to use, and makes it performant by allowing EVA to automatically optimize the library functions as they are fused into EVA programs.
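To make the rotation counts in the horizontal sum example concrete, the following plaintext simulation compares the two strategies on an ordinary Python list. It is only a sketch under stated assumptions: the helper names rotate_left, naive_sum, and log_sum are hypothetical, no encryption is involved, and the vector length is assumed to be a power of two.

```python
def rotate_left(values, steps):
    """Cyclically rotate a list to the left, as a SIMD slot rotation would."""
    steps %= len(values)
    return values[steps:] + values[:steps]


def naive_sum(values):
    """Add every nontrivial rotation of the input: n - 1 rotations."""
    total, rotations = list(values), 0
    for steps in range(1, len(values)):
        rotated = rotate_left(values, steps)
        total = [t + r for t, r in zip(total, rotated)]
        rotations += 1
    return total, rotations


def log_sum(values):
    """Rotate the running total by 1, 2, 4, ...: log2(n) rotations."""
    total, rotations, steps = list(values), 0, 1
    while steps < len(values):
        rotated = rotate_left(total, steps)
        total = [t + r for t, r in zip(total, rotated)]
        rotations += 1
        steps *= 2
    return total, rotations


data = list(range(1, 9))  # 8 slots holding 1..8, which sum to 36
naive_total, naive_rotations = naive_sum(data)
log_total, log_rotations = log_sum(data)
```

Both variants leave the full sum in every slot, but for 8 slots the naïve version uses 7 rotations and the doubling version 3. At several thousand slots the same gap is thousands of rotations versus about a dozen.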


3.2 Horizontal Sum
As the first demonstration of how easy EXL is to build in PyEVA, we present an efficient horizontal sum implementation:

    def horizontal_sum(x):
        i = 1
        while i < len(x):
            y = x << i  # rotation by i steps
            x = x + y
            i *= 2
        return x

By successively rotating the value by increasing powers-of-two and adding the rotated value back, a horizontal sum can be computed in a logarithmic number of rotations and additions. PyEVA operator overloading makes the code read like normal Python code.

3.3 Vector and Matrix
As a significant concrete demonstration, we implemented an EXL vector class for arbitrary-sized vectors in PyEVA.

3.3.1 SEAL vector and EVA vector. The new EXL vector is distinct both from the native SEAL vectors (i.e., ciphertexts and plaintexts) that always have a fixed size set by the encryption parameters (Section 1.2), and from the EVA vectors that still have a power-of-two size determined by the user when creating the EVA program (Section 1.3.1).

Internally, EVA vectors of size less than the SEAL vector size are implemented by replicating the EVA vector values until the SEAL vector is filled. Large EVA vector sizes are always accommodated by potentially inflating the encryption parameters to make the SEAL vector at least the same size. In a “wide” enough EVA program, the EVA vector size matches the SEAL vector size to avoid redundant computation and unnecessarily large encryption parameters.

3.3.2 EXL vector. These issues make some very natural computations unnecessarily complex, as applications often need to operate on data of arbitrary size. Consider an EVA program with a very large input vector, e.g., a million elements. Setting the EVA vector size to match would be very inefficient due to the large encryption parameters required. Instead, the user would have to manually break up their input data into multiple EVA vectors and express the computation in terms of these, but now the user has to also break up their computations to work in terms of these subsections of the data. Rotations, in particular, become very complex due to the number of corner-cases involved.

Our EXL vector class resolves all of these problems; it handles the idiosyncrasies of splitting up user data of any size to fit into EVA vectors of any size. The generated code also naturally benefits from EVA’s automatic parallelization. Consider the following example:

    def vector_horizontal_sum(x):
        x = sum(x.exprs)
        return horizontal_sum(x)

    def vector_dot_product(x, y):
        return vector_horizontal_sum(x * y)

    prog = EvaProgram('DotProduct', vec_size=4)
    with prog:
        v = VecInput('v', 5)
        w = VecInput('w', 5)
        VecOutput('y', vector_dot_product(v, w))

The vector_horizontal_sum function provides an equivalent to horizontal_sum for the EXL vector. The vector_dot_product function then implements dot product for two arbitrary-sized vectors in one line of code. The user program can now invoke these functions with EXL vector inputs obtained with the VecInput function and produce an output with the VecOutput function.

Figure 2 shows how the EXL vector of size 5 is divided into two EVA vectors of size 4, which are replicated into two SEAL vectors of size 4096. Note that in this case the user would benefit from increasing the EVA vector size to 8 or more.

[Figure 2 diagram: an input of size 5 with values (1, 2, 3, 4, 5) is split into two vectors of Vec Size = 4, (1, 2, 3, 4) and (5, 0, 0, 0), each of which is replicated to fill a vector of Slot Size = 4096.]

Figure 2: This graph shows the instance for splitting the input into vectors of size of power of two using EXL Vector class.

3.3.3 Matrix. Similarly to the EXL vector, we implement a simple EXL matrix class. Matrix-vector and matrix-matrix products are fundamental operations in many interesting applications, e.g., machine learning. We implemented the matrix-vector product using the well-known trick of splitting the matrix in diagonal order. Suppose 𝐴 = (𝑎_{𝑖,𝑗}) is an 𝑚 × 𝑚 matrix and 𝑣 = (𝑣_𝑖) is a vector of length 𝑚. EXL splits 𝐴 into 𝑚 vectors in diagonal order:

    d_1 = \{a_{1,1}, a_{2,2}, \ldots, a_{m,m}\}
    d_i = \{a_{1,i}, a_{2,i+1}, \ldots, a_{m-i+1,m}, a_{m-i+2,1}, \ldots, a_{m,i-1}\}

The product 𝐴𝑣 is computed as \sum_{i=1}^{m} d_i \cdot \mathrm{Rot}(v, i), where function Rot (Section 1.2.4) cyclically rotates the vector 𝑣 by 𝑖 steps.

3.4 Vector Size and Program Generators
EVA partly decouples the sizes of vectors used in input programs from the SEAL vector size, i.e., the number of slots, determined in EVA’s encryption parameter selection. If the EVA vector size is smaller than the SEAL vector size, the executor emulates the smaller size by replicating vectors. On the other hand, if it is larger, then the SEAL vector size is expanded to fit the EVA vector size.

While this scheme allows users to select any (power-of-two) vector size for their EVA programs and get a runnable program, it has drawbacks. If the EVA vector size is smaller than the SEAL vector size then each operation is performing redundant computation. On the other hand, if the SEAL vector size was expanded to a larger size to fit the EVA vector size then each operation is slower than necessary for security. EVA does instruct the user in these cases to consider possible optimizations.

EXL vector above introduces yet another level of vector sizes and, for programs that use EXL vectors exclusively, mostly hides the EVA vector size. However, users must still specify vec_size for


the EvaProgram constructor and different choices for this parameter will result in very different performance. EXL does improve the situation over vanilla EVA by allowing different values for the EVA vector size to be tried with no other changes to the program, but ideally this would be handled transparently by EVA.

We propose remedying this by having users provide a program generator instead of a concrete EVA program. The example program in Section 3.3.2 would for example be expressed as:

    def prog():
        v = VecInput('v', 5)
        w = VecInput('w', 5)
        VecOutput('y', vector_dot_product(v, w))

PyEVA would use this to generate candidate programs with different vec_size arguments and select a good one based on some criteria.

Algorithm 5 illustrates a simple way to select an EVA vector size given a user-provided GenerateProgram procedure and a minimum vector size the program needs. IterateUp will find the smallest valid vector size that is secure by iterating up from the minimum. This, however, has the drawback that the program will be compiled potentially many times. We leave designing a good general purpose selection procedure for future work.

Require: GenerateProgram works for all 𝑣 s.t. 𝑣 ≥ 𝑣_min and 𝑣_large ≥ 𝑣_min.
Ensure: The selected 𝑃, 𝑄, 𝑣 are secure and 𝑃 was generated with a 𝑣′ ≥ 𝑣_min.
     1: procedure IterateUp(𝑣_min)
     2:   𝑣 ← 𝑣_min
     3:   loop
     4:     𝑃, 𝑄, 𝑣_secure ← TryVecSize(𝑣)
     5:     if 𝑣 ≥ 𝑣_secure then
     6:       return 𝑃, 𝑄, 𝑣
     7:     else
     8:       𝑣 ← 2 · 𝑣
     9: procedure TryVecSize(𝑣)
    10:   𝑃 ← GenerateProgram(𝑣)
    11:   𝑃′, 𝑄 ← Compile(𝑃)
    12:   𝑣′ ← MinSlotsForModulus(𝑄)
    13:   return 𝑃′, 𝑄, 𝑣′

Algorithm 5: Simple vector size selection procedures

3.5 Statistical Functions and More
We have added a set of common statistical functions to EXL, as well as a polynomial approximation generator for evaluating non-polynomial functions. These were implemented using the EXL vector class and the functions for horizontal sum and dot product described in Section 3.3.2. Thanks to PyEVA’s expressiveness, implementing them is simple:

    import numpy as np

    def average(x):
        return vector_horizontal_sum(x) / len(x)

    def sum_of_squares(x):
        return vector_horizontal_sum(x**2)

    def sum_of_squared_errors(avg, x):
        return sum_of_squares(x - avg)

    def variance(x):
        # Keep x bound to the input so len(x) is the number of samples.
        sse = sum_of_squared_errors(average(x), x)
        return sse / (len(x) - 1)

    def poly_approx(fun, low, up, degree):
        def poly(x):
            ...
            return y
        return poly

    # Standard deviation
    def sd(x):
        sqrt = poly_approx(np.sqrt, 0, 100, degree=6)
        return sqrt(variance(x))

    def correlation(x, y):
        avg_x = average(x)
        avg_y = average(y)
        sum_xy = vector_dot_product(x - avg_x, y - avg_y)
        mul_sd_xy = sd(x) * sd(y) * len(x)
        return (sum_xy, mul_sd_xy)

We have omitted the polynomial approximation implementation above, as the choice of method is very application-dependent. We leave the task of designing a comprehensive library of polynomial approximation methods for future work.

The code above serves as a good example of how EXL functions build on both the EXL vector class and each other. This demonstrates the benefit of building a comprehensive library in PyEVA. We hope to start a virtuous cycle of developers contributing to EXL and making future contributions easier while doing so.

4 CONCLUSIONS AND FUTURE WORK
In this paper we have demonstrated valuable improvements to the EVA toolchain, providing further evidence that the approach EVA is taking towards a CKKS-specific compiler toolchain can provide good performance and a far better user-experience than the native CKKS API in any library. We have demonstrated that building higher level libraries on top of EVA is meaningful and can hugely aid in simple use-cases of CKKS, e.g., implementing simple statistical functions, or arbitrary-sized vector computations. Concretely, we built the beginnings of an EVA Extension Library (EXL). As future work, we believe EXL can be augmented with a much richer set of data types and functions, e.g., tensors and various implementations for neural network kernels.

There are multiple directions for extending EVA itself. As was discussed in Section 3.4, the EVA vector size is a tricky concept that would ideally be abstracted away by EXL, which would require yet another layer of abstraction in the program creation.

Another useful feature that EVA does not currently support is combining inputs from multiple data sources. For example, a computation may calculate the correlation between data from two


data owners, both holding the same public EVA context (public
key). However, EVA requires the user to specify all input wires at
once and does not allow one party to specify part of the inputs.

While EVA currently targets only CKKS, we believe there may be
benefit in targeting BFV and BGV as well. Both BGV and BFV would
require a noise estimator to be built into EVA, as SEAL currently
does not include such an estimator. However, some other libraries,
notably HElib [26], already implement noise estimators, which EVA
could leverage directly. Extending EVA to support other library
back-ends, targeting either CKKS or BFV/BGV, would be valuable.

On the usability side, several open questions remain. Do EVA
and EXL provide a sufficient level of abstraction to make homomorphic
encryption, and the CKKS scheme in particular, generally
available to developers without extensive training? Cryptographic
libraries are notorious for poor usability [25, 31], and homomorphic
encryption libraries are undoubtedly extremely challenging to use
for non-experts. Performing usability studies would help compiler
developer teams identify directions to pursue in the future.

Porcupine [15] takes an interesting approach to helping
users vectorize their computations. Since vectorization is essential
to making CKKS (and BFV/BGV) applications meaningfully performant,
it would make sense to combine a tool such as Porcupine with
EVA. Currently, EVA/EXL users would need to understand how to use
the EXL vector classes to their full extent to achieve good performance,
which will not be obvious to typical developers.

5 ACKNOWLEDGMENTS

We would like to thank Roshan Dathathri for helpful discussions.

REFERENCES

[1] Martin Albrecht, Melissa Chase, Hao Chen, Jintai Ding, Shafi Goldwasser, Sergey Gorbunov, Shai Halevi, Jeffrey Hoffstein, Kim Laine, Kristin Lauter, Satya Lokam, Daniele Micciancio, Dustin Moody, Travis Morrison, Amit Sahai, and Vinod Vaikuntanathan. 2018. Homomorphic encryption security standard. Technical Report. HomomorphicEncryption.org, Toronto, Canada.
[2] David W Archer, José Manuel Calderón Trilla, Jason Dagit, Alex Malozemoff, Yuriy Polyakov, Kurt Rohloff, and Gerard Ryan. 2019. Ramparts: A programmer-friendly system for building homomorphic encryption applications. In Proceedings of the 7th ACM Workshop on Encrypted Computing & Applied Homomorphic Cryptography. 57–68.
[3] Ayoub Benaissa, Bilal Retiat, Bogdan Cebere, and Alaa Eddine Belfedhal. 2021. TenSEAL: A library for encrypted tensor operations using homomorphic encryption. arXiv preprint arXiv:2104.03152 (2021).
[4] Fabian Boemer, Anamaria Costache, Rosario Cammarota, and Casimir Wierzynski. 2019. ngraph-he2: A high-throughput framework for neural network inference on encrypted data. In Proceedings of the 7th ACM Workshop on Encrypted Computing & Applied Homomorphic Cryptography. 45–56.
[5] Fabian Boemer, Yixing Lao, Rosario Cammarota, and Casimir Wierzynski. 2019. ngraph-he: A graph compiler for deep learning on homomorphically encrypted data. In Proceedings of the 16th ACM Int’l Conf. on Computing Frontiers. 3–13.
[6] Zvika Brakerski, Craig Gentry, and Vinod Vaikuntanathan. 2014. (Leveled) fully homomorphic encryption without bootstrapping. ACM Transactions on Computation Theory (TOCT) 6, 3 (2014), 1–36.
[7] Zvika Brakerski and Vinod Vaikuntanathan. 2011. Fully homomorphic encryption from ring-LWE and security for key dependent messages. In Annual Cryptology Conf. Springer, 505–524.
[8] Sergiu Carpov, Paul Dubrulle, and Renaud Sirdey. 2015. Armadillo: A compilation chain for privacy preserving applications. In Proceedings of the 3rd Int’l Workshop on Security in Cloud Computing. 13–19.
[9] Hao Chen, Ilaria Chillotti, and Yongsoo Song. 2019. Improved bootstrapping for approximate homomorphic encryption. In Annual Int’l Conf. on the Theory and Applications of Cryptographic Techniques. Springer, 34–54.
[10] Jung Hee Cheon, Kyoohyung Han, Andrey Kim, Miran Kim, and Yongsoo Song. 2018. Bootstrapping for approximate homomorphic encryption. In Annual Int’l Conf. on the Theory and Applications of Cryptographic Techniques. Springer, 360–384.
[11] Jung Hee Cheon, Kyoohyung Han, Andrey Kim, Miran Kim, and Yongsoo Song. 2018. A full RNS variant of approximate homomorphic encryption. In Int’l Conf. on Selected Areas in Cryptography. Springer, 347–368.
[12] Jung Hee Cheon, Andrey Kim, Miran Kim, and Yongsoo Song. 2017. Homomorphic encryption for arithmetic of approximate numbers. In Int’l Conf. on the Theory and Application of Cryptology and Information Security. Springer, 409–437.
[13] Eduardo Chielle, Oleg Mazonka, Nektarios Georgios Tsoutsos, and Michail Maniatakos. 2018. E3: A framework for compiling C++ programs with encrypted operands. IACR Cryptol. ePrint Arch. 2018 (2018), 1013.
[14] Ilaria Chillotti, Nicolas Gama, Mariya Georgieva, and Malika Izabachène. 2020. TFHE: Fast fully homomorphic encryption over the torus. Journal of Cryptology 33, 1 (2020), 34–91.
[15] Meghan Cowan, Deeksha Dangwal, Armin Alaghi, Caroline Trippel, Vincent T Lee, and Brandon Reagen. 2021. Porcupine: A synthesizing compiler for vectorized homomorphic encryption. arXiv preprint arXiv:2101.07841 (2021).
[16] Eric Crockett and Chris Peikert. 2016. Λ∘λ: Functional lattice cryptography. In Proceedings of the 2016 ACM SIGSAC Conf. on Computer and Communications Security. 993–1005.
[17] Eric Crockett, Chris Peikert, and Chad Sharp. 2018. Alchemy: A language and compiler for homomorphic encryption made easy. In Proceedings of the 2018 ACM SIGSAC Conf. on Computer and Communications Security. 1020–1037.
[18] Roshan Dathathri, Blagovesta Kostova, Olli Saarikivi, Wei Dai, Kim Laine, and Madan Musuvathi. 2020. EVA: An encrypted vector arithmetic language and compiler for efficient homomorphic computation. In Proceedings of the 41st ACM SIGPLAN Conf. on Programming Language Design and Implementation. 546–561.
[19] Roshan Dathathri, Olli Saarikivi, Hao Chen, Kim Laine, Kristin Lauter, Saeed Maleki, Madanlal Musuvathi, and Todd Mytkowicz. 2019. CHET: An optimizing compiler for fully-homomorphic neural-network inferencing. In Proceedings of the 40th ACM SIGPLAN Conf. on Programming Language Design and Implementation. 142–156.
[20] Léo Ducas and Daniele Micciancio. 2015. FHEW: Bootstrapping homomorphic encryption in less than a second. In Annual Int’l Conf. on the Theory and Applications of Cryptographic Techniques. Springer, 617–640.
[21] Junfeng Fan and Frederik Vercauteren. 2012. Somewhat practical fully homomorphic encryption. IACR Cryptol. ePrint Arch. 2012 (2012), 144.
[22] Craig Gentry. 2009. Fully homomorphic encryption using ideal lattices. In Proceedings of the forty-first annual ACM Symp. on Theory of computing. 169–178.
[23] Craig Gentry, Amit Sahai, and Brent Waters. 2013. Homomorphic encryption from learning with errors: Conceptually-simpler, asymptotically-faster, attribute-based. In Annual Cryptology Conf. Springer, 75–92.
[24] Shruthi Gorantala, Rob Springer, Sean Purser-Haskell, William Lam, Royce Wilson, Asra Ali, Eric P Astor, Itai Zukerman, Sam Ruth, Christoph Dibak, et al. 2021. A general purpose transpiler for fully homomorphic encryption. arXiv preprint arXiv:2106.07893 (2021).
[25] Matthew Green and Matthew Smith. 2016. Developers are not the enemy!: The need for usable security APIs. IEEE Security & Privacy 14, 5 (2016), 40–46.
[26] Shai Halevi and Victor Shoup. 2014. Algorithms in HElib. In Annual Cryptology Conf. Springer, 554–571.
[27] Kyoohyung Han and Dohyeong Ki. 2020. Better bootstrapping for approximate homomorphic encryption. In Cryptographers’ Track at the RSA Conf. Springer, 364–390.
[28] Lattigo 2020. Lattigo v2.1.1. http://github.com/ldsec/lattigo. EPFL-LDS.
[29] Donald Nguyen, Andrew Lenharth, and Keshav Pingali. 2013. A lightweight infrastructure for graph analytics. In Proceedings of the Twenty-Fourth ACM Symp. on Operating Systems Principles (Farmington, Pennsylvania) (SOSP ’13). Association for Computing Machinery, New York, NY, USA, 456–471. https://doi.org/10.1145/2517349.2522739
[30] PALISADE 2021. PALISADE Lattice Cryptography Library (release 1.11.2). https://palisade-crypto.org.
[31] Nikhil Patnaik, Joseph Hallett, and Awais Rashid. 2019. Usability smells: An analysis of developers’ struggle with crypto libraries. In Fifteenth Symp. on Usable Privacy and Security (SOUPS 2019).
[32] Ronald L Rivest, Len Adleman, Michael L Dertouzos, et al. 1978. On data banks and privacy homomorphisms. Foundations of secure computation 4, 11 (1978), 169–180.
[33] SEAL 2020. Microsoft SEAL (release 3.6). https://github.com/Microsoft/SEAL. Microsoft Research, Redmond, WA.
[34] SHEEP 2019. SHEEP is a homomorphic encryption evaluation platform. https://github.com/alan-turing-institute/SHEEP.
[35] Seoul National University. 2020. HEAAN. https://github.com/snucrypto/HEAAN.
[36] Tim van Elsloo, Giorgio Patrini, and Hamish Ivey-Law. 2019. SEALion: A framework for neural network inference on encrypted data. arXiv preprint arXiv:1904.12840 (2019).
[37] Alexander Viand. 2021. SoK: Fully homomorphic encryption compilers. In IEEE Symp. on Security and Privacy.
[38] Alexander Viand and Hossein Shafagh. 2018. Marble: Making fully homomorphic encryption accessible to all. In Proceedings of the 6th Workshop on Encrypted Computing & Applied Homomorphic Cryptography. 49–60.
