Privacy and Utility Tradeoff in Approximate Differential Privacy
Google AI
New York, NY 10011
Email: qgeng, vvei, guorq, [email protected]
arXiv:1810.00877v2 [cs.CR] 5 Feb 2019

Abstract

We characterize the minimum noise amplitude and power for noise-adding mechanisms in (ε, δ)-differential privacy for a single real-valued query function. We derive new lower bounds using the duality of linear programming, and new upper bounds by proposing a new class of (ε, δ)-differentially private mechanisms, the truncated Laplacian mechanisms. We show that the multiplicative gap of the lower bounds and upper bounds goes to zero in various high privacy regimes, proving the tightness of the lower and upper bounds and thus establishing the optimality of the truncated Laplacian mechanism. In particular, our results close the previous constant multiplicative gap in the discrete setting. Numerical experiments show the improvement of the truncated Laplacian mechanism over the optimal Gaussian mechanism in all privacy regimes.
1 Introduction
Differential privacy, introduced by Dwork et al. (2006b), is a framework to quantify to what extent individual privacy in a statistical dataset is preserved while releasing useful aggregate information about the dataset. Differential privacy provides strong privacy guarantees by requiring the near-indistinguishability of whether an individual is in the dataset or not based on the released information. For more motivation and background on differential privacy, we refer the reader to the survey by Dwork (2008) and the book by Dwork & Roth (2014).
Since its introduction, differential privacy has spawned a large body of research in differentially private data-releasing mechanism design, and the noise-adding mechanism has been applied in many machine learning algorithms to preserve differential privacy, e.g., logistic regression (Chaudhuri & Monteleoni, 2008), empirical risk minimization (Chaudhuri et al., 2011; Wang et al., 2018), online learning (Jain et al., 2012), statistical risk minimization (Duchi et al., 2012), statistical learning (Dziugaite & Roy, 2018), deep learning (Shokri & Shmatikov, 2015; Abadi et al., 2016; Phan et al., 2016), distributed optimization (Agarwal et al., 2018), hypothesis testing (Sheffet, 2018), matrix completion (Jain et al., 2018), expectation maximization (Park et al., 2017), and principal component analysis (Chaudhuri et al., 2012; Ge et al., 2018).
The classic differential privacy is ε-differential privacy, which imposes an upper bound e^ε on the multiplicative distance between the probability distributions of the randomized query outputs for any two neighboring datasets. The standard approach for preserving ε-differential privacy is to add noise with the Laplacian distribution to the query output. Introduced by Dwork et al. (2006a), approximate differential privacy is (ε, δ)-differential privacy, and the common interpretation of (ε, δ)-differential privacy is that it is ε-differential privacy except with probability δ (Mironov, 2017). The standard approach for preserving (ε, δ)-differential privacy is the Gaussian mechanism, which adds Gaussian noise to the query output.
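As a minimal sketch of these two standard noise-adding mechanisms, the following uses the classical Gaussian tail-bound calibration σ = Δ·sqrt(2 ln(1.25/δ))/ε (valid for ε < 1), not the optimal calibration discussed later; function names are illustrative:

```python
import math
import random

def laplace_mechanism(true_answer, sensitivity, epsilon, rng=random):
    """epsilon-DP: add Laplace noise with scale Delta/epsilon
    (inverse-CDF sampling of a Laplace variate)."""
    scale = sensitivity / epsilon
    u = rng.random() - 0.5
    return true_answer - scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def gaussian_mechanism(true_answer, sensitivity, epsilon, delta, rng=random):
    """(epsilon, delta)-DP for epsilon < 1: add Gaussian noise with the
    classical tail-bound calibration sigma = Delta*sqrt(2 ln(1.25/delta))/epsilon."""
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return true_answer + rng.gauss(0.0, sigma)
```

Both mechanisms release `true_answer` plus a single noise draw; only the noise distribution (and hence the privacy guarantee) differs.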
To make full use of differentially private mechanisms, it is important to understand the fundamental trade-off between privacy and utility (accuracy). For example, within the class of noise-adding mechanisms, given the privacy constraints ε and δ, we are interested in deriving the minimum amount of noise that must be added to achieve the highest accuracy and utility while preserving differential privacy. In the literature, there have been many works on optimal differential privacy mechanism design and on characterizing the privacy and utility trade-off in differential privacy. For a single count query function under ε-differential privacy, Ghosh et al. (2009) show that the geometric mechanism is universally optimal under a Bayesian framework, and Gupte & Sundararajan (2010) derived the optimal noise probability distributions under a minimax cost framework. Geng & Viswanath (2016b) show that the optimal noise distribution has
a staircase-shaped probability density function for a single real-valued query function under ε-differential privacy, and Geng et al. (2015) generalized the result to two-dimensional query functions. Soria-Comas & Domingo-Ferrer (2013) also independently derived the staircase-shaped noise probability distribution under a different optimization framework.
Geng & Viswanath (2016a) show that for a single integer-valued query function under (ε, δ)-differential privacy, the discrete uniform noise distribution and the discrete Laplacian noise distribution are asymptotically optimal within a constant multiplicative gap in the high privacy regimes. Balle & Wang (2018) improved the classic analysis of the Gaussian mechanism for (ε, δ)-differential privacy in the high privacy regime (ε → 0), and developed an optimal Gaussian mechanism whose variance is calibrated directly using the Gaussian cumulative density function instead of a tail bound approximation.
1.2 Organization
The paper is organized as follows. In Section 2, we give some preliminaries on differential privacy, derive the (ε, δ)-differential privacy constraint on the additive noise probability distribution, and define the minimum noise amplitude and noise power under (ε, δ)-differential privacy. Section 3 presents the truncated Laplacian mechanism for preserving (ε, δ)-differential privacy, and derives new upper bounds on the minimum noise amplitude and noise power. Section 4 derives new lower bounds on the minimum noise amplitude and noise power. Section 5 shows that the multiplicative gap between the lower bounds and the upper bounds goes to zero in various privacy regimes, and thus proves the tightness of the new lower and upper bounds. Section 6 conducts comprehensive numerical experiments to compare the performance of the truncated Laplacian mechanism with the optimal Gaussian mechanism, and demonstrates the improvement in all privacy regimes. Section 7 discusses some additional properties of the truncated Laplacian mechanism and concludes the paper.
2 Problem Formulation
In this section, we first give some preliminaries on differential privacy, and then define the minimum noise amplitude V1* and minimum noise power V2* for (ε, δ)-differentially private noise-adding mechanisms.
Consider a real-valued query function q : D → R, where D is the set of all possible datasets. The real-valued query function q will be applied to a dataset, and the query output is a real number. Two datasets D1, D2 ∈ D are called neighboring datasets if they differ in at most one element, i.e., one is a proper subset of the other and the larger dataset contains just one additional element (Dwork, 2008). A randomized query-answering mechanism K for the query function q will randomly output a number with probability distribution depending on the query output q(D), where D is the dataset.
Definition 1 ((ε, δ)-differential privacy (Dwork et al., 2006a)). A randomized mechanism K gives (ε, δ)-differential privacy if for all datasets D1 and D2 differing on at most one element, and for any measurable set S ⊆ Range(K),

    P(K(D1) ∈ S) ≤ e^ε P(K(D2) ∈ S) + δ.   (1)

The sensitivity of a real-valued query function measures how much the query output can change for neighboring datasets.

Definition 2 (Query Sensitivity). The sensitivity of a real-valued query function q is defined as Δ := max |q(D1) − q(D2)| over all neighboring datasets D1, D2 ∈ D.

Consider a noise-adding mechanism K that adds a random noise X with probability distribution P to the query output t = q(D), i.e.,

    K(D) = t + X.
We derive the differential privacy constraint on the noise probability distribution P in Lemma 1.

Lemma 1. Given the query sensitivity Δ and privacy parameters ε and δ, the noise probability distribution P preserves (ε, δ)-differential privacy if and only if

    P(S) − e^ε P(S + d) ≤ δ, for all measurable S ⊆ R and all |d| ≤ Δ,   (2)

where S + d := {s + d : s ∈ S}.

Proof. The differential privacy constraint (1) on K is that for any t1, t2 ∈ R such that |t1 − t2| ≤ Δ (corresponding to the query outputs for two neighboring datasets¹),

    P(t1 + X ∈ S) ≤ e^ε P(t2 + X ∈ S) + δ, for all measurable S ⊆ R,

which is equivalent to (2) by a change of variable.
Let P_{ε,δ} denote the set of noise probability distributions satisfying the (ε, δ)-differential privacy constraint (2). Given P ∈ P_{ε,δ}, the expected noise amplitude and noise power are ∫_{x∈R} |x| P(dx) and ∫_{x∈R} x² P(dx). The goal of this work is to characterize the minimum expected noise amplitude and noise power under (ε, δ)-differential privacy. More precisely, define

    V1* := inf_{P ∈ P_{ε,δ}} ∫_{x∈R} |x| P(dx)   (minimum noise amplitude),
    V2* := inf_{P ∈ P_{ε,δ}} ∫_{x∈R} x² P(dx)   (minimum noise power).
In this work, we characterize V1* and V2* in terms of Δ, ε, δ by deriving tight lower bounds V1low, V2low and upper bounds V1upp, V2upp such that V1low ≤ V1* ≤ V1upp and V2low ≤ V2* ≤ V2upp.
In the next section, we present the new upper bounds V1upp and V2upp. The lower bounds V1low and V2low are presented in Section 4.
¹ In this work we impose no prior on the query function other than the query sensitivity Δ. For any t1, t2 ∈ R such that |t1 − t2| ≤ Δ, there may exist two neighboring datasets D1 and D2 with q(D1) = t1 and q(D2) = t2.
3 Upper Bound: Truncated Laplacian Mechanism
In this section, we present a new class of (ε, δ)-differentially private noise-adding mechanisms, the truncated Laplacian mechanism. Applying the truncated Laplacian mechanism, we derive new achievable (and tight) upper bounds V1upp and V2upp on the minimum noise amplitude V1* and minimum noise power V2* in Theorem 2 and Theorem 3.
Before presenting the exact form of the truncated Laplacian mechanism, we first discuss some key ideas and insights behind the new mechanism design.
The standard Laplacian distribution for preserving ε-differential privacy has a symmetric probability density function f(x) = (ε/(2Δ)) e^{−ε|x|/Δ}. Note that for any x ≥ 0, the probability density decay rate, f(x)/f(x + Δ), is exactly e^ε. Geng & Viswanath (2016b) show that the decay rate e^ε is optimal under ε-differential privacy. Indeed, if the decay rate is higher, the mechanism is no longer ε-differentially private; if the decay rate is lower, it incurs a higher cost. However, under (ε, δ)-differential privacy, the Laplacian distribution is not optimal, as it has a heavy tail.
(ε, δ)-differential privacy relaxes the ε-differential privacy constraint: it allows the decay rate to exceed e^ε on a set of points with probability mass δ. The Gaussian mechanism is widely used in (ε, δ)-differential privacy, and for x > 0, its probability density decay rate is

    f(x)/f(x + Δ) = e^{−x²/σ²} / e^{−(x+Δ)²/σ²} = e^{(Δ² + 2Δx)/σ²} = e^{Δ²/σ²} e^{(2Δ/σ²) x},

which is exponentially increasing with respect to x. When x is large, the decay rate can be very high. While the Gaussian mechanism addresses the long tail to some extent by having a higher decay rate for large x, the decay rate is smaller than e^ε when x is small.
Motivated by the observation that under (ε, δ)-differential privacy, the decay rate shall be as high as possible without exceeding e^ε, except for a set of points with probability mass δ (for those there is no limit on the decay rate), we derive a symmetric truncated Laplacian distribution where the probability density decay rate is exactly e^ε, except for a set of points with probability mass δ where the decay rate is infinite.
Definition 3 (Truncated Laplacian Distribution). Given the privacy parameters ε > 0 and 0 < δ < 1/2, and the query sensitivity Δ > 0, the probability density function of the truncated Laplacian distribution PTLap is defined as

    fTLap(x) := B e^{−|x|/λ} for x ∈ [−A, A], and fTLap(x) := 0 otherwise,   (4)

where

    λ := Δ/ε,
    A := (Δ/ε) log(1 + (e^ε − 1)/(2δ)),
    B := 1/(2λ(1 − e^{−A/λ})) = 1/(2λ(1 − 1/(1 + (e^ε − 1)/(2δ)))).
Figure 1: Noise probability density function fTLap of the truncated Laplacian mechanism. fTLap is a symmetric truncated exponential function with a probability mass δ in the last interval of length Δ in the support of fTLap, i.e., the interval [A − Δ, A]. The decay rate fTLap(x)/fTLap(x + Δ) is exactly e^ε for x ∈ [0, A − Δ). The parameters A and B are then derived by solving the equations ∫_{x∈R} fTLap(x)dx = 1 and ∫_{A−Δ}^{A} fTLap(x)dx = δ.
A |x|
fTLap is a valid probability density function, as fTLap (x) ≥ 0 and x∈R fTLap (x)dx = 0 2Be− λ dx =
A
2λB(1 − e− λ ) = 1.
We discuss the key properties of the symmetric probability density function fTLap(x):

• The decay rate in [0, A − Δ] is exactly e^ε, i.e., fTLap(x)/fTLap(x + Δ) = e^{Δ/λ} = e^ε, for all x ∈ [0, A − Δ].

• The probability mass in the interval [A − Δ, A] is δ, i.e., PTLap([A − Δ, A]) = δ. Indeed,

    ∫_{A−Δ}^{A} fTLap(x)dx = ∫_{A−Δ}^{A} B e^{−x/λ} dx = λB(e^{−(A−Δ)/λ} − e^{−A/λ}) = λB e^{−A/λ}(e^{Δ/λ} − 1)
        = (e^{Δ/λ} − 1) e^{−A/λ} / (2(1 − e^{−A/λ})) = (e^{Δ/λ} − 1) / (2(e^{A/λ} − 1)) = δ.

• The decay rate fTLap(x)/fTLap(x + Δ) is +∞ for x ∈ (A − Δ, A], as fTLap(x) = 0 for x ∈ (A, +∞).
Definition 4 (Truncated Laplacian Mechanism). Given the query sensitivity Δ and the privacy parameters ε, δ, the truncated Laplacian mechanism adds a noise with probability distribution PTLap defined in (4) to the query output.
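As a sketch (function names are illustrative), the truncated Laplacian noise can be sampled by drawing a random sign and a magnitude from the truncated exponential density on [0, A] via inverse-CDF sampling, with λ, A, B as in Definition 3:

```python
import math
import random

def truncated_laplace_params(sensitivity, epsilon, delta):
    """lambda, A, and the normalizer B from Definition 3."""
    lam = sensitivity / epsilon
    A = lam * math.log1p((math.exp(epsilon) - 1.0) / (2.0 * delta))
    B = 1.0 / (2.0 * lam * (1.0 - math.exp(-A / lam)))
    return lam, A, B

def sample_truncated_laplace(sensitivity, epsilon, delta, rng=random):
    """One noise draw: a random sign times a magnitude sampled from the
    exponential density restricted to [0, A] (inverse-CDF sampling)."""
    lam, A, _ = truncated_laplace_params(sensitivity, epsilon, delta)
    u = rng.random()
    magnitude = -lam * math.log(1.0 - u * (1.0 - math.exp(-A / lam)))
    return magnitude if rng.random() < 0.5 else -magnitude
```

The parameters satisfy the identities above: the density integrates to one, and the last interval [A − Δ, A] carries probability mass exactly δ.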
Theorem 1. The truncated Laplacian mechanism preserves (ε, δ)-differential privacy.

Proof. Equivalently, we need to show that the truncated Laplacian distribution PTLap defined in (4) satisfies the (ε, δ)-differential privacy constraint (2).
We are interested in maximizing PTLap(S) − e^ε PTLap(S + d) in (2) and showing that the maximum over measurable S ⊆ R is upper bounded by δ. Since fTLap(x) is symmetric and monotonically decreasing in [0, +∞), without loss of generality, we can assume d ≥ 0 and thus d ∈ [0, Δ].
To maximize PTLap(S) − e^ε PTLap(S + d), S shall not contain points in (−∞, −Δ/2], as

    fTLap(x) ≤ fTLap(x + d), ∀x ∈ (−∞, −Δ/2].

S shall not contain points in [−Δ/2, A − Δ], as

    fTLap(x) ≤ e^ε fTLap(x + d), ∀x ∈ [−Δ/2, A − Δ].

Therefore, PTLap(S) − e^ε PTLap(S + d) is maximized for some set S ⊆ [A − Δ, +∞). Since fTLap(x) is monotonically decreasing in [A − Δ, +∞), PTLap(S) − e^ε PTLap(S + d) is maximized at S = [A − Δ, +∞), and the maximum value is ∫_{A−Δ}^{A−Δ+d} fTLap(x)dx ≤ ∫_{A−Δ}^{A} fTLap(x)dx = δ.
We conclude that PTLap satisfies the (ε, δ)-differential privacy constraint (2).
Next, we apply the truncated Laplacian mechanism to derive new upper bounds on the minimum
noise amplitude V1∗ and noise power V2∗ .
Theorem 2 (Upper Bound on Minimum Noise Amplitude).

    V1* ≤ V1upp := (Δ/ε) (1 − log(1 + (e^ε − 1)/(2δ)) / ((e^ε − 1)/(2δ))).   (5)
Proof. We can compute the expected noise amplitude of the truncated Laplacian distribution PTLap defined in (4) via

    V1upp := ∫_{x∈R} fTLap(x)|x|dx = 2 ∫_0^A B e^{−x/λ} x dx
           = 2Bλ(−A e^{−A/λ} + ∫_0^A e^{−x/λ} dx)
           = 2Bλ(−A e^{−A/λ} + λ(1 − e^{−A/λ}))
           = −A e^{−A/λ}/(1 − e^{−A/λ}) + λ = λ − A/(e^{A/λ} − 1)
           = (Δ/ε)(1 − log(1 + (e^ε − 1)/(2δ))/((e^ε − 1)/(2δ))).

Since the truncated Laplacian mechanism preserves (ε, δ)-differential privacy, this gives an upper bound on the minimum noise amplitude V1* under (ε, δ)-differential privacy.
In Theorem 2, the upper bound V1upp is composed of two parts. The first part is Δ/ε, which is the expected noise amplitude of the Laplacian mechanism under ε-differential privacy. The second part reduces the noise by the fraction log(1 + (e^ε − 1)/(2δ))/((e^ε − 1)/(2δ)), due to the δ-relaxation in (ε, δ)-differential privacy.
We analyze the asymptotic properties of V1upp in the high privacy regimes as ε → 0, δ → 0:

• Given ε, lim_{δ→0} V1upp = Δ/ε. The truncated Laplacian mechanism reduces to the standard Laplacian mechanism as δ → 0.

• Given δ, lim_{ε→0} V1upp = Δ/(4δ). Indeed, when ε → 0, y := (e^ε − 1)/(2δ) → 0, and thus, using log(1 + y) ≈ y − y²/2,

    V1upp ≈ (Δ/ε)(1 − (y − y²/2)/y) = (Δ/ε)(y/2) = (Δ/ε)(e^ε − 1)/(4δ) ≈ Δ/(4δ).

As ε → 0, the truncated Laplacian distribution reduces to a uniform distribution on the interval [−Δ/(2δ), Δ/(2δ)] with probability density δ/Δ.

• In the regime δ = ε → 0, the upper bound

    V1upp = (Δ/ε)(1 − log(1 + (e^ε − 1)/(2ε))/((e^ε − 1)/(2ε)))
          ≈ (Δ/ε)(1 − log(1 + 1/2)/(1/2))
          = (Δ/ε)(1 − 2 log(3/2)).   (6)

In Section 5, we show that the constant factor (1 − 2 log(3/2)) is actually tight and the upper bound V1upp matches the lower bound V1low defined in Theorem 4.
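The three limits above can be checked numerically with a direct implementation of the bound (5) (a sketch; `v1_upp` is an illustrative name):

```python
import math

def v1_upp(sensitivity, epsilon, delta):
    """Upper bound (5): (Delta/eps) * (1 - log(1 + y)/y), y = (e^eps - 1)/(2*delta)."""
    y = (math.exp(epsilon) - 1.0) / (2.0 * delta)
    return (sensitivity / epsilon) * (1.0 - math.log1p(y) / y)

# delta -> 0 with fixed eps: bound approaches Delta/eps (here 1.0)
print(v1_upp(1.0, 1.0, 1e-12))
# eps -> 0 with fixed delta: bound approaches Delta/(4*delta) (here 25.0)
print(v1_upp(1.0, 1e-6, 0.01))
```

`math.log1p` is used for numerical accuracy when y is tiny, which is exactly the ε → 0 regime analyzed above.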
Theorem 3 (Upper Bound on Minimum Noise Power). Define

    V2upp := (2Δ²/ε²)(1 − ((1/2) log²(1 + (e^ε − 1)/(2δ)) + log(1 + (e^ε − 1)/(2δ)))/((e^ε − 1)/(2δ))).   (7)

We have V2* ≤ V2upp.
Proof. We can compute the noise power of the truncated Laplacian distribution via

    V2upp := ∫_{x∈R} fTLap(x)x²dx = 2 ∫_0^A B e^{−x/λ} x² dx
           = 2Bλ(−A² e^{−A/λ} + ∫_0^A e^{−x/λ} 2x dx)
           = 2Bλ(−A² e^{−A/λ} + 2λ(−A e^{−A/λ} + ∫_0^A e^{−x/λ} dx))
           = 2Bλ(−A² e^{−A/λ} + 2λ(−A e^{−A/λ} + λ − λ e^{−A/λ}))
           = (−A² e^{−A/λ} − 2λA e^{−A/λ} + 2λ² − 2λ² e^{−A/λ})/(1 − e^{−A/λ})
           = 2λ² − (A² e^{−A/λ} + 2λA e^{−A/λ})/(1 − e^{−A/λ})
           = 2λ² − (A² + 2λA)/(e^{A/λ} − 1)
           = (2Δ²/ε²)(1 − ((1/2) log²(1 + (e^ε − 1)/(2δ)) + log(1 + (e^ε − 1)/(2δ)))/((e^ε − 1)/(2δ))).

Since fTLap preserves (ε, δ)-differential privacy, this gives an upper bound on V2*.
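As a sanity check, the closed forms (5) and (7) can be cross-checked against direct numerical integration of fTLap (a sketch with illustrative names; a midpoint rule on [−A, A] suffices since the density has bounded support):

```python
import math

def truncated_laplace_moments(sensitivity, epsilon, delta, n_grid=200000):
    """Numerically integrate |x| and x^2 against f_TLap on [-A, A]
    (midpoint rule) to cross-check the closed-form bounds."""
    lam = sensitivity / epsilon
    y = (math.exp(epsilon) - 1.0) / (2.0 * delta)
    A = lam * math.log1p(y)
    B = 1.0 / (2.0 * lam * (1.0 - math.exp(-A / lam)))
    h = 2.0 * A / n_grid
    m1 = m2 = 0.0
    for i in range(n_grid):
        x = -A + (i + 0.5) * h
        w = B * math.exp(-abs(x) / lam) * h  # density times cell width
        m1 += abs(x) * w
        m2 += x * x * w
    return m1, m2

def v1_upp(sensitivity, epsilon, delta):
    """Closed-form noise amplitude (5)."""
    y = (math.exp(epsilon) - 1.0) / (2.0 * delta)
    return (sensitivity / epsilon) * (1.0 - math.log1p(y) / y)

def v2_upp(sensitivity, epsilon, delta):
    """Closed-form noise power (7)."""
    y = (math.exp(epsilon) - 1.0) / (2.0 * delta)
    L = math.log1p(y)
    return 2.0 * (sensitivity / epsilon) ** 2 * (1.0 - (0.5 * L * L + L) / y)
```

The numerical moments agree with the closed forms to well below 1e-4 for moderate grid sizes.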
It turns out that the upper bounds V1upp and V2upp in Theorem 2 and Theorem 3 are tight. We derive
new lower bounds for V1∗ and V2∗ in the next section, and show that the multiplicative gap between the
lower bounds and the upper bounds goes to zero in the high privacy regions in Section 5.
4 Lower Bound
In this section, we derive new lower bounds V1low and V2low on the minimum noise amplitude V1* and minimum noise power V2*, respectively. The key technique is to discretize the continuous probability distribution and the loss function, transform the continuous functional optimization problem into a linear program, and then apply the discrete result from Geng & Viswanath (2016a).
Geng & Viswanath (2016a) derived lower bounds for an integer-valued query function under (ε, δ)-differential privacy. For integer-valued query functions, they formulate a linear programming problem with the objective of minimizing the additive noise. They studied the dual problem and constructed a dual feasible solution, which gives a lower bound. Extending this result to the continuous setting, we show a similar lower bound for real-valued query functions under (ε, δ)-differential privacy.
First, we give a lower bound for (ε, δ)-differential privacy for integer-valued query functions due to Geng & Viswanath (2016a).
Define

    a := (δ + (e^ε − 1)/2)/e^ε,
    b := e^{−ε}.

To avoid integer rounding issues, assume that there exists an integer n such that Σ_{k=0}^{n−1} a b^k = 1/2.
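Treating n as real-valued (the text assumes an integer n exists so that the geometric sum is exactly 1/2), a, b, and the corresponding n can be computed as follows (an illustrative sketch; the closed form b^n = 1 − (1 − b)/(2a) follows from summing the geometric series):

```python
import math

def discrete_params(epsilon, delta):
    """a, b from the definitions above, and the real-valued n solving
    sum_{k=0}^{n-1} a b^k = 1/2, i.e., b^n = 1 - (1 - b)/(2a)."""
    a = (delta + (math.exp(epsilon) - 1.0) / 2.0) / math.exp(epsilon)
    b = math.exp(-epsilon)
    n = math.log(1.0 - (1.0 - b) / (2.0 * a)) / math.log(b)
    return a, b, n
```

In the regime ε = δ → 0 used in Section 5, this gives b^n → 2/3, consistent with the tightness analysis there.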
Lemma 2 (Theorem 8 in Geng & Viswanath (2016a)). Consider a symmetric cost function L(·) : Z → R, where Z denotes the set of all integers. Given the privacy parameters ε, δ and the discrete query sensitivity Δ̃ ∈ Z+, if a discrete probability distribution P satisfies

    P(S) − e^ε P(S + d) ≤ δ, ∀S ⊆ Z, ∀d ∈ Z with |d| ≤ Δ̃,   (8)

and the cost function L(·) satisfies

    Σ_{i=1}^{n−1} b^i [2L(iΔ̃) − L(1 + (i − 1)Δ̃) − L(1 + iΔ̃)] ≥ L(1),   (9)

then we have

    Σ_{i∈Z} L(i)P(i) ≥ 2 Σ_{k=0}^{n−1} a b^k L(1 + kΔ̃).   (10)
Theorem 4 (Lower Bound on Minimum Noise Amplitude).

    V1* ≥ V1low := 2a((b − b^n)/(1 − b)² − (n − 1)b^n/(1 − b))Δ.

Proof. We first discretize the probability distribution P and the ℓ1 cost function. Given a positive integer N, define a discrete probability distribution P̃N via P̃N(i) := P([Δ(2i − 1)/(2N), Δ(2i + 1)/(2N))) for all i ∈ Z, and a discrete cost function L̃N(i) := (Δ/(2N))(2|i| − 1), which lower bounds |x| on each discretization interval.
As the continuous probability distribution P satisfies the (ε, δ)-differential privacy constraint (2) with the query sensitivity Δ, the discrete probability distribution P̃N satisfies the discrete (ε, δ)-differential privacy constraint (8) with query sensitivity Δ̃ = N, i.e., P̃N satisfies

    P̃N(S) − e^ε P̃N(S + d) ≤ δ, ∀S ⊆ Z, ∀d ∈ Z with |d| ≤ N.

We can verify that the condition (9) in Lemma 2 holds for L̃N and P̃N with query sensitivity Δ̃ = N when N is sufficiently large. Indeed, when N ≥ a + 2,
    Σ_{i=1}^{n−1} b^i [2L̃N(iN) − L̃N(1 + (i − 1)N) − L̃N(1 + iN)] − L̃N(1)
    = Σ_{i=1}^{n−1} b^i (Δ/(2N))[2(2iN − 1) − (2(1 + (i − 1)N) − 1) − (2(1 + iN) − 1)] − Δ/(2N)
    = Σ_{i=1}^{n−1} b^i (Δ/(2N))(2N − 4) − Δ/(2N)
    = (Δ/(2N))(Σ_{i=1}^{n−1} b^i (2N − 4) − 1)
    = (Δ/(2N))((2N − 4)/(2a) − 1) = (Δ/(2N))((N − 2)/a − 1) ≥ 0.
The corresponding lower bound in (10) for L̃N and P̃N is

    2 Σ_{k=0}^{n−1} a b^k L̃N(1 + kN) = 2 Σ_{k=0}^{n−1} a b^k (Δ/(2N))(2kN + 1)
    = 2 Σ_{k=0}^{n−1} a b^k (kΔ + Δ/(2N)) = 2aΔ Σ_{k=0}^{n−1} b^k k + Δ/(2N)
    ≥ 2aΔ Σ_{k=0}^{n−1} b^k k = 2a((b − b^n)/(1 − b)² − (n − 1)b^n/(1 − b))Δ
    = V1low.

Since L̃N(i) lower bounds |x| on each discretization interval, Σ_{i∈Z} L̃N(i)P̃N(i) ≤ ∫_{x∈R} |x|P(dx), and thus V1* ≥ V1low.
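A sketch (with illustrative names, and n treated as real-valued) that evaluates the lower bound above alongside the upper bound (5), confirming V1low ≤ V1upp numerically and the vanishing gap in the regime ε = δ → 0:

```python
import math

def v1_low(sensitivity, epsilon, delta):
    """Lower bound: 2a((b - b^n)/(1-b)^2 - (n-1) b^n/(1-b)) * Delta."""
    a = (delta + (math.exp(epsilon) - 1.0) / 2.0) / math.exp(epsilon)
    b = math.exp(-epsilon)
    n = math.log(1.0 - (1.0 - b) / (2.0 * a)) / math.log(b)
    bn = b ** n
    return 2.0 * a * ((b - bn) / (1.0 - b) ** 2
                      - (n - 1.0) * bn / (1.0 - b)) * sensitivity

def v1_upp(sensitivity, epsilon, delta):
    """Upper bound (5)."""
    y = (math.exp(epsilon) - 1.0) / (2.0 * delta)
    return (sensitivity / epsilon) * (1.0 - math.log1p(y) / y)
```

For example, at ε = 0.1, δ = 1e-3 the two bounds already agree to within a few percent, and at ε = δ = 1e-4 the ratio exceeds 0.99.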
Similarly, we derive the lower bound for the minimum noise power V2*.

Theorem 5 (Lower Bound on Minimum Noise Power).

    V2* ≥ V2low := 2a Σ_{k=0}^{n−1} b^k k² Δ².

Proof. We first discretize the probability distribution P. Given a positive integer N, define a discrete probability distribution P̃N via

    P̃N(i) := P([Δ(2i − 1)/(2N), Δ(2i + 1)/(2N))), ∀i ∈ Z.

As the continuous probability distribution P satisfies the (ε, δ)-differential privacy constraint (2) with continuous query sensitivity Δ, the discrete probability distribution P̃N satisfies the discrete (ε, δ)-differential privacy constraint (8) with discrete query sensitivity Δ̃ = N, i.e., P̃N satisfies

    P̃N(S) − e^ε P̃N(S + d) ≤ δ, ∀S ⊆ Z, ∀d ∈ Z with |d| ≤ N.
Next, we verify that the condition (9) in Lemma 2 holds when N is sufficiently large for the ℓ2 cost function L̃N(i) := (Δ²/(4N²))(2|i| − 1)², which lower bounds x² on each discretization interval. Indeed,

    Σ_{i=1}^{n−1} b^i [2L̃N(iN) − L̃N(1 + (i − 1)N) − L̃N(1 + iN)] − L̃N(1)
    = Σ_{i=1}^{n−1} b^i (Δ²/(4N²))[2(2iN − 1)² − (2(1 + (i − 1)N) − 1)² − (2(1 + iN) − 1)²] − Δ²/(4N²)
    = Σ_{i=1}^{n−1} b^i (Δ²/(4N²))((8i − 4)N² − 16iN + 4N) − Δ²/(4N²)
    ≥ 0

when N is sufficiently large. The corresponding lower bound in (10) then gives, after dropping the vanishing O(1/N) terms, V2* ≥ 2a Σ_{k=0}^{n−1} b^k k² Δ² = V2low.
5 Tightness of the Lower and Upper Bounds

Theorem 6 (Tightness of Lower Bound and Upper Bound on Minimum Noise Amplitude).

    lim_{ε→0} V1low/V1upp ≥ 1 − 2δ.
    lim_{δ→0} V1low/V1upp ≥ ε/(e^ε − 1) = 1 − ε/2 + O(ε²).
    lim_{ε=δ→0} V1low/V1upp = 1.

Proof. 1. δ is fixed, and ε → 0:
When ε → 0, the upper bound V1upp → Δ/(4δ). For the lower bound, a → δ, b → 1, and n → 1/(2δ), so that V1low = 2aΔ Σ_{k=0}^{n−1} b^k k → 2δΔ · n(n − 1)/2 = (Δ/(4δ))(1 − 2δ). Therefore, lim_{ε→0} V1low/V1upp ≥ 1 − 2δ.
Note that 1 − 2δ → 1 as δ → 0, and thus the multiplicative gap between V1low and V1upp converges to zero.
2. ε is fixed, and δ → 0:
When δ → 0, the upper bound V1upp → Δ/ε. For the lower bound V1low, we have

    a → (1 − e^{−ε})/2,
    b^n → 0,
    nb^n → 0,

and thus V1low → Δ/(e^ε − 1) as δ → 0. Therefore,

    lim_{δ→0} V1low/V1upp ≥ (Δ/(e^ε − 1))/(Δ/ε) = ε/(e^ε − 1) = 1 − ε/2 + O(ε²).

Therefore, the multiplicative gap between V1low and V1upp converges to zero as ε → 0.
3. ε = δ → 0:
In this regime, V1upp ≈ (Δ/ε)(1 − 2 log(3/2)) as shown in Section 3. For the lower bound V1low, since Σ_{k=0}^{n−1} a b^k = 1/2, we have

    a(1 − b^n)/(1 − b) = 1/2 ⇒ b^n = 1 − (1 − b)/(2a).

As ε = δ → 0,

    (1 − b)/(2a) = (e^ε − 1)/(2δ + e^ε − 1) → 1/3,

and thus

    lim_{ε=δ→0} b^n = 1 − 1/3 = 2/3,
    n = Θ(log(3/2)/δ).

Note that a = Θ((3/2)δ) as δ → 0. Therefore, as ε = δ → 0,

    2a((b − b^n)/(1 − b)² − (n − 1)b^n/(1 − b))Δ
    ≈ 2a((1 − 2/3)/δ² − (log(3/2)/δ)(2/3)(1/δ))Δ
    = 2a(1/(3δ²) − (2/3) log(3/2)/δ²)Δ
    ≈ 2 · (3/2)δ · (1/(3δ²) − (2/3) log(3/2)/δ²)Δ
    = (1 − 2 log(3/2)) Δ/δ.

Therefore, V1* is lower bounded by V1low ≈ (1 − 2 log(3/2)) Δ/δ in the regime ε = δ → 0. Since it is also upper bounded by V1upp ≈ (Δ/ε)(1 − 2 log(3/2)), we conclude that lim_{ε=δ→0} V1low/V1upp = 1.
Note that our result closes the constant multiplicative gap in the discrete setting (see Equation (67)
and (69) in Geng & Viswanath (2016a)).
Similarly, we show that the lower bound V2low and the upper bound V2upp on the minimum noise power
are also tight.
Theorem 7 (Tightness of Lower Bound and Upper Bound on Minimum Noise Power).

    lim_{ε→0} V2low/V2upp ≥ (1 − δ)(1 − 2δ) = 1 − 3δ + 2δ².
    lim_{δ→0} V2low/V2upp ≥ ε²(1 + e^ε)/(2(e^ε − 1)²) = 1 − ε/2 + O(ε²).
    lim_{ε=δ→0} V2low/V2upp = 1.
Proof. • Case ε → 0: When ε → 0, the upper bound V2upp converges to Δ²/(12δ²). For the lower bound, when ε → 0, we have

    a → δ,
    b → 1,
    n → 1/(2δ),

and thus the lower bound V2low = 2a Σ_{k=0}^{n−1} b^k k² Δ² converges to

    2δ · ((n − 1)n(2n − 1)/6) Δ² = 2δ · (1/(2δ) − 1)(1/(2δ))(1/δ − 1)/6 · Δ² = (Δ²/12)(1/δ − 1)(1/δ − 2),

which matches the upper bound as δ → 0. Therefore,

    lim_{ε→0} V2low/V2upp ≥ ((Δ²/12)(1/δ − 1)(1/δ − 2))/(Δ²/(12δ²)) = (1 − δ)(1 − 2δ).
• Case δ → 0: When δ → 0, the upper bound V2upp converges to 2Δ²/ε². For the lower bound, we have

    a → (1 − e^{−ε})/2 = (1 − b)/2,
    b^n → 0,
    n²b^n → 0,

and thus

    V2low = 2a Σ_{k=0}^{n−1} b^k k² Δ² → (1 − b) · (b(1 + b)/(1 − b)³) Δ² = ((b² + b)/(1 − b)²) Δ²
          = ((e^{−2ε} + e^{−ε})/(1 − e^{−ε})²) Δ² = ((1 + e^ε)/(e^ε − 1)²) Δ²,

and this matches 2Δ²/ε² as ε → 0. Therefore,

    lim_{δ→0} V2low/V2upp ≥ (((1 + e^ε)/(e^ε − 1)²) Δ²)/(2Δ²/ε²) = ε²(1 + e^ε)/(2(e^ε − 1)²).
• Case ε = δ → 0: The upper bound V2upp converges to (2Δ²/ε²)(1 − log²(3/2) − 2 log(3/2)). When ε = δ → 0, the lower bound V2low = 2a Σ_{k=0}^{n−1} b^k k² Δ² can be evaluated using b^n → 2/3, n = Θ(log(3/2)/δ), 1 − b ≈ δ, and a ≈ (3/2)δ:

    V2low ≈ 2 · (3/2)δ · (1/δ³) ∫_0^{log(3/2)} t² e^{−t} dt · Δ²
          = (3Δ²/ε²)(2/3 − (4/3) log(3/2) − (2/3) log²(3/2))
          = (2Δ²/ε²)(1 − 2 log(3/2) − log²(3/2)),

which matches the upper bound. We conclude that

    lim_{ε=δ→0} V2low/V2upp = 1.
6 Numerical Experiments

Figure 2: Ratio of the noise amplitude of the truncated Laplacian mechanism to that of the optimal Gaussian mechanism.

We plot the ratio of the noise amplitude of the truncated Laplacian mechanism to that of the optimal Gaussian mechanism in Fig. 2, and the ratio of the noise power of the truncated Laplacian mechanism to that of the optimal Gaussian mechanism in Fig. 3, for ε ∈ [10^{−4}, 10] and δ ∈ [10^{−6}, 0.1]. Note that compared with the optimal Gaussian mechanism, the truncated Laplacian mechanism significantly reduces the noise amplitude and noise power in all privacy regimes. The improvement is not surprising, as the truncated Laplacian mechanism universally improves the probability density decay rate (for both small and large noise) and thus leads to smaller expected noise amplitude and noise power.

Figure 3: Ratio of the noise power of the truncated Laplacian mechanism to that of the optimal Gaussian mechanism.
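The figures use the optimal Gaussian calibration of Balle & Wang (2018). As a rough, self-contained proxy (a sketch with illustrative names), one can compare the noise power V2upp of Theorem 3 against the classical tail-bound Gaussian (σ = Δ·sqrt(2 ln(1.25/δ))/ε, valid for ε < 1); note this proxy overstates the Gaussian baseline's noise relative to the optimal calibration, so these ratios are smaller than those in Figs. 2–3:

```python
import math

def truncated_laplace_power(sensitivity, epsilon, delta):
    """Noise power V2upp of the truncated Laplacian mechanism (Theorem 3)."""
    y = (math.exp(epsilon) - 1.0) / (2.0 * delta)
    L = math.log1p(y)
    return 2.0 * (sensitivity / epsilon) ** 2 * (1.0 - (0.5 * L * L + L) / y)

def classical_gaussian_power(sensitivity, epsilon, delta):
    """Noise power sigma^2 of the classical tail-bound Gaussian mechanism
    (valid for epsilon < 1)."""
    sigma = sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon
    return sigma * sigma

for eps in (0.1, 0.5, 0.9):
    for delta in (1e-6, 1e-3):
        ratio = (truncated_laplace_power(1.0, eps, delta)
                 / classical_gaussian_power(1.0, eps, delta))
        print(f"eps={eps}, delta={delta}: power ratio = {ratio:.3f}")
```

Over this grid the ratio stays well below one, consistent with the qualitative conclusion of this section.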
7 Conclusion and Discussion
In this work, we characterize the minimum noise amplitude and noise power for noise-adding mechanisms in (ε, δ)-differential privacy for a single real-valued query function. We derive new lower bounds using the duality of linear programming, and derive new upper bounds by proposing a new class of (ε, δ)-differentially private mechanisms, the truncated Laplacian mechanisms. We show that the multiplicative gap of the lower bounds and upper bounds goes to zero in various high privacy regimes, proving the tightness of the lower and upper bounds and thus establishing the optimality of the truncated Laplacian mechanism. In particular, our results close the previous constant multiplicative gap in Geng & Viswanath (2016a). Comprehensive numerical experiments show the improvement of the truncated Laplacian mechanism over the optimal Gaussian mechanism of Balle & Wang (2018) in all privacy regimes.
An obvious question is how to further improve the truncated Laplacian mechanism to provide stronger privacy guarantees. To minimize the additive noise, an important property of the truncated Laplacian mechanism is that the output noise is bounded in [−A, A]. Therefore, for two neighboring datasets, the randomized output ranges will have some non-overlapping set. While the truncated Laplacian mechanism strictly preserves (ε, δ)-differential privacy, with a small probability up to δ (corresponding to the probability that the output is in the non-overlapping set), an adversary can distinguish the two neighboring datasets. To address this concern, one can improve on the truncated Laplacian mechanism by imposing an arbitrarily light tail distribution over [A, +∞) to ensure that the output space is the same for all possible datasets.
References
Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., and Zhang, L. Deep learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, CCS '16, pp. 308–318. ACM, 2016.

Agarwal, N., Suresh, A. T., Yu, F., Kumar, S., and McMahan, B. cpSGD: Communication-efficient and differentially-private distributed SGD. In Advances in Neural Information Processing Systems. 2018.

Balle, B. and Wang, Y.-X. Improving the Gaussian mechanism for differential privacy: Analytical calibration and optimal denoising. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 394–403. PMLR, 2018.

Chaudhuri, K. and Monteleoni, C. Privacy-preserving logistic regression. In Advances in Neural Information Processing Systems 21, 2008.

Chaudhuri, K., Monteleoni, C., and Sarwate, A. D. Differentially private empirical risk minimization. Journal of Machine Learning Research, 12:1069–1109, 2011.

Chaudhuri, K., Sarwate, A., and Sinha, K. Near-optimal differentially private principal components. In Advances in Neural Information Processing Systems 25, pp. 989–997. 2012.

Duchi, J., Jordan, M., and Wainwright, M. Privacy aware learning. In Advances in Neural Information Processing Systems, pp. 1430–1438, 2012.

Dwork, C. Differential privacy: A survey of results. In Theory and Applications of Models of Computation, volume 4978, pp. 1–19, 2008.

Dwork, C. and Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211–407, 2014.

Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., and Naor, M. Our data, ourselves: Privacy via distributed noise generation. In Proceedings of the 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, EUROCRYPT '06, pp. 486–503. Springer-Verlag, 2006a.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography, volume 3876 of Lecture Notes in Computer Science, pp. 265–284. Springer Berlin / Heidelberg, 2006b.

Dziugaite, G. K. and Roy, D. M. Data-dependent PAC-Bayes priors via differential privacy. In Advances in Neural Information Processing Systems 31, pp. 8440–8450. 2018.

Ge, J., Wang, Z., Wang, M., and Liu, H. Minimax-optimal privacy-preserving sparse PCA in distributed systems. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, volume 84 of Proceedings of Machine Learning Research, pp. 1589–1598. PMLR, 2018.

Geng, Q. and Viswanath, P. Optimal noise adding mechanisms for approximate differential privacy. IEEE Transactions on Information Theory, 62(2):952–969, Feb 2016a.

Geng, Q. and Viswanath, P. The optimal noise-adding mechanism in differential privacy. IEEE Transactions on Information Theory, 62(2):925–951, Feb 2016b.

Geng, Q., Kairouz, P., Oh, S., and Viswanath, P. The staircase mechanism in differential privacy. IEEE Journal of Selected Topics in Signal Processing, 9(7):1176–1184, Oct 2015.

Ghosh, A., Roughgarden, T., and Sundararajan, M. Universally utility-maximizing privacy mechanisms. In Proceedings of the 41st Annual ACM Symposium on Theory of Computing, STOC '09, pp. 351–360. ACM, 2009.

Gupte, M. and Sundararajan, M. Universally optimal privacy mechanisms for minimax agents. In Symposium on Principles of Database Systems, pp. 135–146, 2010.

Jain, P., Kothari, P., and Thakurta, A. Differentially private online learning. In Proceedings of the 25th Annual Conference on Learning Theory, volume 23 of Proceedings of Machine Learning Research, pp. 24.1–24.34. PMLR, 2012.

Jain, P., Thakkar, O. D., and Thakurta, A. Differentially private matrix completion revisited. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 2215–2224. PMLR, 2018.

Mironov, I. Rényi differential privacy. In 2017 IEEE 30th Computer Security Foundations Symposium (CSF), pp. 263–275, Aug. 2017.

Park, M., Foulds, J., Chaudhuri, K., and Welling, M. DP-EM: Differentially private expectation maximization. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, volume 54 of Proceedings of Machine Learning Research, pp. 896–904. PMLR, 2017.

Phan, N.-S., Wang, Y., Wu, X., and Dou, D. Differential privacy preservation for deep auto-encoders: An application of human behavior prediction. In AAAI, 2016.

Sheffet, O. Locally private hypothesis testing. In Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 4605–4614. PMLR, 2018.

Shokri, R. and Shmatikov, V. Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, CCS '15, pp. 1310–1321. ACM, 2015.

Soria-Comas, J. and Domingo-Ferrer, J. Optimal data-independent noise for differential privacy. Information Sciences, 250:200–214, 2013.

Wang, D., Gaboardi, M., and Xu, J. Empirical risk minimization in non-interactive local differential privacy revisited. In Advances in Neural Information Processing Systems 31, pp. 973–982. 2018.