
Privacy and Utility Tradeoff in Approximate Differential Privacy

Quan Geng, Wei Ding, Ruiqi Guo, and Sanjiv Kumar

Google AI
New York, NY 10011
Email: qgeng, vvei, guorq, [email protected]

arXiv:1810.00877v2 [cs.CR] 5 Feb 2019

Abstract
We characterize the minimum noise amplitude and power for noise-adding mechanisms in (ε, δ)-differential privacy for single real-valued query functions. We derive new lower bounds using the duality of linear programming, and new upper bounds by proposing a new class of (ε, δ)-differentially private mechanisms, the truncated Laplacian mechanisms. We show that the multiplicative gap between the lower bounds and upper bounds goes to zero in various high privacy regimes, proving the tightness of the lower and upper bounds and thus establishing the optimality of the truncated Laplacian mechanism. In particular, our results close the previous constant multiplicative gap in the discrete setting. Numeric experiments show the improvement of the truncated Laplacian mechanism over the optimal Gaussian mechanism in all privacy regimes.

1 Introduction
Differential privacy, introduced by Dwork et al. (2006b), is a framework to quantify to what extent individual privacy in a statistical dataset is preserved while releasing useful aggregate information about the dataset. Differential privacy provides strong privacy guarantees by requiring the near-indistinguishability of whether an individual is in the dataset or not based on the released information. For more motivation and background on differential privacy, we refer the readers to the survey by Dwork (2008) and the book by Dwork & Roth (2014).
Since its introduction, differential privacy has spawned a large body of research on differentially private data-releasing mechanism design, and noise-adding mechanisms have been applied in many machine learning algorithms to preserve differential privacy, e.g., logistic regression (Chaudhuri & Monteleoni, 2008), empirical risk minimization (Chaudhuri et al., 2011; Wang et al., 2018), online learning (Jain et al., 2012), statistical risk minimization (Duchi et al., 2012), statistical learning (Dziugaite & Roy, 2018), deep learning (Shokri & Shmatikov, 2015; Abadi et al., 2016; Phan et al., 2016), distributed optimization (Agarwal et al., 2018), hypothesis testing (Sheffet, 2018), matrix completion (Jain et al., 2018), expectation maximization (Park et al., 2017), and principal component analysis (Chaudhuri et al., 2012; Ge et al., 2018).
The classic differential privacy is ε-differential privacy, which imposes an upper bound e^ε on the multiplicative distance between the probability distributions of the randomized query outputs for any two neighboring datasets. The standard approach for preserving ε-differential privacy is to add noise with the Laplacian distribution to the query output. Introduced by Dwork et al. (2006a), approximate differential privacy is (ε, δ)-differential privacy, and the common interpretation of (ε, δ)-differential privacy is that it is ε-differential privacy except with probability δ (Mironov, 2017). The standard approach for preserving (ε, δ)-differential privacy is the Gaussian mechanism, which adds Gaussian noise to the query output.
To make full use of differentially private mechanisms, it is important to understand the fundamental trade-off between privacy and utility (accuracy). For example, within the class of noise-adding mechanisms, given the privacy constraints ε and δ, we are interested in deriving the minimum amount of noise to be added to achieve the highest accuracy and utility while preserving differential privacy. In the literature, there have been many works on optimal differential privacy mechanism design and on characterizing the privacy and utility trade-off in differential privacy. For a single count query function under ε-differential privacy, Ghosh et al. (2009) show that the geometric mechanism is universally optimal under a Bayesian framework, and Gupte & Sundararajan (2010) derived the optimal noise probability distributions under a minimax cost framework. Geng & Viswanath (2016b) show that the optimal noise distribution has a staircase-shaped probability density function for a single real-valued query function under ε-differential privacy, and Geng et al. (2015) generalized the result to two-dimensional query functions. Soria-Comas & Domingo-Ferrer (2013) also independently derived the staircase-shaped noise probability distribution under a different optimization framework.
Geng & Viswanath (2016a) show that for a single integer-valued query function under (ε, δ)-differential privacy, the discrete uniform noise distribution and the discrete Laplacian noise distribution are asymptotically optimal within a constant multiplicative gap in the high privacy regions. Balle & Wang (2018) improved the classic analysis of the Gaussian mechanism for (ε, δ)-differential privacy in the high privacy regime (ε → 0), and developed an optimal Gaussian mechanism whose variance is calibrated directly using the Gaussian cumulative density function instead of a tail bound approximation.

1.1 Our Contributions

In this work, we characterize the minimum noise amplitude and power for noise-adding mechanisms in (ε, δ)-differential privacy for single real-valued query functions. Our contributions are three-fold:
First, we present a new class of (ε, δ)-differentially private noise-adding mechanisms, the truncated Laplacian mechanisms. Applying the truncated Laplacian mechanism, we derive new achievable upper bounds on the minimum noise amplitude and noise power in (ε, δ)-differential privacy for single real-valued query functions. The key insight from the new mechanism design is that the noise probability density function shall decay as fast as possible while remaining ε-differentially private when the noise is small, and then sharply reduce to zero when the noise is big, to avoid a heavy-tailed distribution which would incur a high cost.
Second, we derive new lower bounds on the minimum noise amplitude and minimum noise power. The key technique is to discretize the continuous probability distribution and the loss function, and transform the continuous functional optimization problem into linear programming. Applying the lower bound result in Geng & Viswanath (2016a) for integer-valued query functions, which is based on the duality of linear programming, we derive new lower bounds for real-valued query functions under (ε, δ)-differential privacy.
Third, we show that the multiplicative gap between the lower bounds and upper bounds goes to zero in various high privacy regimes, proving the tightness of the lower and upper bounds, and thus establishing the optimality of the truncated Laplacian mechanism for minimizing the noise amplitude and noise power under (ε, δ)-differential privacy. In particular, our result closes the previous constant multiplicative gap between the lower bound and the upper bound (using the discrete uniform distribution and the discrete Laplacian distribution) in Geng & Viswanath (2016a).
Comprehensive numeric experiments show the improvement of the truncated Laplacian mechanism over the optimal Gaussian mechanism in Balle & Wang (2018) by significantly reducing the noise amplitude and noise power in all privacy regimes.

1.2 Organization
The paper is organized as follows. In Section 2, we give some preliminaries on differential privacy, derive the (ε, δ)-differential privacy constraint on the additive noise probability distribution, and define the minimum noise amplitude and noise power under (ε, δ)-differential privacy. Section 3 presents the truncated Laplacian mechanism for preserving (ε, δ)-differential privacy, and derives new upper bounds for the minimum noise amplitude and noise power. Section 4 derives new lower bounds on the minimum noise magnitude and noise power. Section 5 shows that the multiplicative gap between the lower bounds and the upper bounds goes to zero in various privacy regimes, and thus proves the tightness of the new lower and upper bounds. Section 6 conducts comprehensive numeric experiments to compare the performance of the truncated Laplacian mechanism with the optimal Gaussian mechanism, and demonstrates the improvement in all privacy regimes. Section 7 discusses some additional properties of the truncated Laplacian mechanism and concludes this paper.

2 Problem Formulation
In this section, we first give some preliminaries on differential privacy, and then define the minimum noise amplitude V1∗ and minimum noise power V2∗ for (ε, δ)-differentially private noise-adding mechanisms.
Consider a real-valued query function q : D → R, where D is the set of all possible datasets. The real-valued query function q will be applied to a dataset, and the query output is a real number. Two datasets D1, D2 ∈ D are called neighboring datasets if they differ in at most one element, i.e., one is a proper subset of the other and the larger dataset contains just one additional element (Dwork, 2008). A randomized query-answering mechanism K for the query function q will randomly output a number with a probability distribution depending on the query output q(D), where D is the dataset.

Denition 1 ((, δ)-dierential privacy (Dwork et al., 2006a)). A randomized mechanism K gives (, δ)-
dierential privacy if for all data sets D1 and D2 diering on at most one element, and for any measurable
set S ⊂ Range(K),

Pr[K(D1 ) ∈ S] ≤ e Pr[K(D2 ) ∈ S] + δ. (1)

The sensitivity of a real-valued query function measures how much the query output can change between neighboring datasets.

Denition 2 (Query Sensitivity). The sensitivity of q is dened as

∆ := max q(D1 ) − q(D2 ),


D1 ,D2 ∈D

for all D1 , D2 diering in at most one element.

A standard approach for preserving differential privacy is query-output independent noise-adding mechanisms, where a random noise is added to the query output. Given a dataset D, a query-output independent noise-adding mechanism K will release the query output t = q(D) corrupted by an additive random noise X with probability distribution P:

K(D) = t + X.

We derive the differential privacy constraint on the noise probability distribution P in Lemma 1.

Lemma 1. Given the query sensitivity ∆ and privacy parameters ε and δ, the noise probability distribution P preserves (ε, δ)-differential privacy if and only if

P(S) − e^ε P(S + d) ≤ δ, ∀ |d| ≤ ∆, ∀ measurable set S ⊂ R. (2)

Proof. The differential privacy constraint (1) on K is that for any t1, t2 ∈ R such that |t1 − t2| ≤ ∆ (corresponding to the query outputs for two neighboring datasets¹),

P(S − t1) ≤ e^ε P(S − t2) + δ, ∀ measurable set S ⊂ R, (3)

where for all t ∈ R, S + t is defined as the set {s + t | s ∈ S}.

Since (3) has to hold for any measurable set S and any |t1 − t2| ≤ ∆, equivalently, we have

P(S) ≤ e^ε P(S + d) + δ, ∀ |d| ≤ ∆, ∀ measurable set S ⊂ R.

Let P,δ denote the set of noise probability distributions satisfying the (, δ)-dierential
 privacy
constraint (2). Given P ∈ P,δ , the expected noise amplitude and noise power are x∈R xP(dx) and

x∈R
x2 P(dx). The goal of this work is to characterize the minimum expected noise amplitude and noise
power under (, δ)-dierential privacy. More precisely, dene


V1 := inf xP(dx) (min noise amplitude),
P∈P,δ x∈R

V2∗ := inf x2 P(dx) (min noise power).
P∈P,δ x∈R

In this work, we characterize V1∗ and V2∗ in terms of ∆, , δ by deriving tight lower bounds V1low , V2low
and upper bounds V1upp , V2upp such that V1low ≤ V1∗ ≤ V1upp and V2low ≤ V2∗ ≤ V2upp .
In the next section, we present the new upper bounds V1upp and V2upp . The lower bounds V1low and
V2low are presented in Section 4.
¹ In this work we impose no prior on the query function other than the query sensitivity ∆. For any t1, t2 ∈ R such that |t1 − t2| ≤ ∆, there may exist two neighboring datasets D1 and D2 with q(D1) = t1 and q(D2) = t2.
3 Upper Bound: Truncated Laplacian Mechanism
In this section, we present a new class of (ε, δ)-differentially private noise-adding mechanisms, the truncated Laplacian mechanisms. Applying the truncated Laplacian mechanism, we derive new achievable (and tight) upper bounds V1upp and V2upp on the minimum noise amplitude V1∗ and the minimum noise power V2∗ in Theorem 2 and Theorem 3.
Before presenting the exact form of the truncated Laplacian mechanism, we first discuss some key ideas and insights behind the new mechanism design.
The standard Laplacian distribution for preserving ε-differential privacy has a symmetric probability density function f(x) = (ε/(2∆)) e^{−ε|x|/∆}. Note that for any x ≥ 0, the probability density decay rate, f(x)/f(x+∆), is exactly e^ε. Geng & Viswanath (2016b) show that the decay rate e^ε is optimal under ε-differential privacy. Indeed, if the decay rate is higher, the distribution is no longer ε-differentially private; if the decay rate is lower, it incurs a higher cost. However, under (ε, δ)-differential privacy, the Laplacian distribution is not optimal, as it has a heavy tail.
(ε, δ)-differential privacy relaxes the ε-differential privacy constraint: it allows that on a set of points with probability mass δ, the decay rate can exceed e^ε. The Gaussian mechanism is widely used in (ε, δ)-differential privacy, and for x > 0 its probability density decay rate is

f(x)/f(x+∆) = e^{−x²/σ²} / e^{−(x+∆)²/σ²} = e^{(∆² + 2∆x)/σ²} = e^{∆²/σ²} e^{(2∆/σ²)x},

which is exponentially increasing with respect to x. When x is big, the decay rate can be very high. While the Gaussian mechanism addresses the long tail to some extent by having a higher decay rate for large x, its decay rate is smaller than e^ε when x is small.
Motivated by the observation that under (ε, δ)-differential privacy the decay rate shall be as high as possible without exceeding e^ε, except on a set of points with probability mass δ (where there is no limit on the decay rate), we derive a symmetric truncated Laplacian distribution where the probability density decay rate is exactly e^ε, except on a set of points with probability mass δ where the decay rate is infinite.

Denition 3 (Truncated Laplacian Distribution). Given the privacy parameters 0 < δ < 12 ,  > 0 and
the query sensitivity ∆ > 0, the probability density function of the truncated Laplacian distribution PTLap
is dened as:
{ |x|
Be− λ , for x ∈ [−A, A]
fTLap (x) := (4)
0, otherwise

where

λ := ,

∆ e − 1
A := log(1 + ),
 2δ

1 1
B := = .
2 ∆ (1 − 1+ e1 −1 )
A
2λ(1 − e− λ )

Figure 1: Noise probability density function f_TLap of the truncated Laplacian mechanism. f_TLap is a symmetric truncated exponential function with a probability mass δ in the last interval of length ∆ in the support of f_TLap, i.e., the interval [A − ∆, A]. The decay rate f_TLap(x)/f_TLap(x+∆) is exactly e^ε for x ∈ [0, A − ∆). The parameters A and B are derived by solving the equations ∫_{x∈R} f_TLap(x) dx = 1 and ∫_{A−∆}^{A} f_TLap(x) dx = δ.

f_TLap is a valid probability density function, as f_TLap(x) ≥ 0 and ∫_{x∈R} f_TLap(x) dx = 2 ∫_0^A B e^{−x/λ} dx = 2λB(1 − e^{−A/λ}) = 1.
We discuss the key properties of the symmetric probability density function f_TLap(x):

• The decay rate in [0, A − ∆] is exactly e^ε, i.e., f_TLap(x)/f_TLap(x+∆) = e^ε, ∀x ∈ [0, A − ∆].

• The probability mass in the interval [A − ∆, A] is δ, i.e., P_TLap([A − ∆, A]) = δ. Indeed,

∫_{A−∆}^{A} f_TLap(x) dx = ∫_{A−∆}^{A} B e^{−x/λ} dx
= λB(e^{−(A−∆)/λ} − e^{−A/λ}) = λB e^{−A/λ}(e^{∆/λ} − 1)
= (e^{∆/λ} − 1) e^{−A/λ} / (2(1 − e^{−A/λ})) = (e^{∆/λ} − 1) / (2(e^{A/λ} − 1)) = δ.

• The decay rate f_TLap(x)/f_TLap(x+∆) is +∞ for x ∈ (A − ∆, A], as f_TLap(x) = 0 for x ∈ (A, +∞).
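As a numerical sanity check on Definition 3 and the properties above, the following sketch (with illustrative parameter values ε = 0.5, δ = 10⁻³, ∆ = 1, which are our choices and not from the paper) computes λ, A, B and verifies by midpoint quadrature that f_TLap integrates to one and that the interval [A − ∆, A] carries probability mass δ:

```python
import math

def tlap_params(eps, delta, Delta):
    # Parameters of the truncated Laplacian distribution (Definition 3).
    lam = Delta / eps
    A = (Delta / eps) * math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta))
    B = 1.0 / (2.0 * lam * (1.0 - math.exp(-A / lam)))
    return lam, A, B

def f_tlap(x, lam, A, B):
    # Density: B * exp(-|x|/lam) on [-A, A], zero outside.
    return B * math.exp(-abs(x) / lam) if -A <= x <= A else 0.0

def midpoint_integral(f, lo, hi, n=200000):
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

eps, delta, Delta = 0.5, 1e-3, 1.0   # illustrative privacy parameters
lam, A, B = tlap_params(eps, delta, Delta)

total_mass = midpoint_integral(lambda x: f_tlap(x, lam, A, B), -A, A)
tail_mass = midpoint_integral(lambda x: f_tlap(x, lam, A, B), A - Delta, A, n=20000)
print(total_mass, tail_mass)  # total_mass ~ 1, tail_mass ~ delta
```

The same check can be repeated for any ε > 0 and 0 < δ < 1/2; only the quadrature resolution matters for the tolerance.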

Denition 4 (Truncated Laplacian mechanism). Given the query sensitivity ∆, and the privacy parameters
, δ, the truncated Laplacian mechanism adds a noise with probability distribution PTLap dened in (4) to
the query output.
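Definition 4 leaves the sampling procedure implicit. One standard way to draw the noise, sketched below as a hypothetical implementation of ours (not from the paper), is inversion of the one-sided CDF combined with a random sign: for a magnitude x ∈ [0, A], the normalized one-sided CDF is (1 − e^{−x/λ})/(1 − e^{−A/λ}), which inverts in closed form.

```python
import math
import random

def sample_tlap(eps, delta, Delta, rng=random):
    # Truncated Laplacian parameters (Definition 3).
    lam = Delta / eps
    A = lam * math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta))
    # Inverse-CDF sampling: draw a magnitude in [0, A], then a random sign.
    v = rng.random()
    magnitude = -lam * math.log(1.0 - v * (1.0 - math.exp(-A / lam)))
    return magnitude if rng.random() < 0.5 else -magnitude

random.seed(0)
eps, delta, Delta = 0.5, 1e-3, 1.0   # illustrative privacy parameters
samples = [sample_tlap(eps, delta, Delta) for _ in range(200000)]

A = (Delta / eps) * math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta))
mean_abs = sum(abs(s) for s in samples) / len(samples)
# Closed-form expected amplitude, lam - A/(e^{A/lam} - 1), derived later in this section.
expected = Delta / eps - A / (math.exp(A * eps / Delta) - 1.0)
```

The empirical mean amplitude should match the closed form up to Monte Carlo error, and every sample lies in [−A, A] by construction.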

Theorem 1. The truncated Laplacian mechanism preserves (ε, δ)-differential privacy.

Proof. Equivalently, we need to show that the truncated Laplacian distribution P_TLap defined in (4) satisfies the (ε, δ)-differential privacy constraint (2).
We are interested in maximizing P_TLap(S) − e^ε P_TLap(S + d) in (2) and showing that the maximum over S ⊆ R is upper bounded by δ. Since f_TLap(x) is symmetric and monotonically decreasing in [0, +∞), without loss of generality we can assume d ≥ 0, and thus d ∈ [0, ∆].
To maximize P_TLap(S) − e^ε P_TLap(S + d), S shall not contain points in (−∞, −∆/2], as

f_TLap(x) ≤ f_TLap(x + d), ∀x ∈ (−∞, −∆/2].

S shall not contain points in [−∆/2, A − ∆], as

f_TLap(x) ≤ e^ε f_TLap(x + d), ∀x ∈ [−∆/2, A − ∆].

Therefore, P_TLap(S) − e^ε P_TLap(S + d) is maximized for some set S ⊆ [A − ∆, +∞). Since f_TLap(x) is monotonically decreasing in [A − ∆, +∞), P_TLap(S) − e^ε P_TLap(S + d) is maximized at S = [A − ∆, +∞), and the maximum value is at most ∫_{A−∆}^{A−∆+d} f_TLap(x) dx ≤ ∫_{A−∆}^{A} f_TLap(x) dx = δ.
We conclude that P_TLap satisfies the (ε, δ)-differential privacy constraint (2).

Next, we apply the truncated Laplacian mechanism to derive new upper bounds on the minimum noise amplitude V1∗ and noise power V2∗.

Theorem 2 (Upper Bound on Minimum Noise Amplitude).

V1∗ ≤ V1upp := (∆/ε) (1 − log(1 + (e^ε − 1)/(2δ)) / ((e^ε − 1)/(2δ))). (5)

Proof. We can compute the expected noise amplitude for the truncated Laplacian distribution P_TLap defined in (4) via

V1upp := ∫_{x∈R} f_TLap(x) |x| dx = 2 ∫_0^A B e^{−x/λ} x dx
= 2Bλ (−A e^{−A/λ} + ∫_0^A e^{−x/λ} dx)
= 2Bλ (−A e^{−A/λ} + λ(1 − e^{−A/λ}))
= −A e^{−A/λ}/(1 − e^{−A/λ}) + λ = λ − A/(e^{A/λ} − 1)
= (∆/ε) (1 − log(1 + (e^ε − 1)/(2δ)) / ((e^ε − 1)/(2δ))).

Since the truncated Laplacian mechanism preserves (ε, δ)-differential privacy, this gives an upper bound on the minimum noise amplitude V1∗ under (ε, δ)-differential privacy.
In Theorem 2, the upper bound V1upp is composed of two parts. The first part is ∆/ε, which is the noise amplitude of the Laplacian mechanism under ε-differential privacy. The second part reduces the noise by a fraction log(1 + (e^ε − 1)/(2δ)) / ((e^ε − 1)/(2δ)) due to the δ-relaxation in (ε, δ)-differential privacy.
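The closed form (5) can be checked against direct numerical integration of E|X| under f_TLap. The sketch below (our illustrative parameters) compares the two and confirms that the bound improves on the Laplacian amplitude ∆/ε:

```python
import math

def v1_upp(eps, delta, Delta):
    # Closed-form upper bound (5): (Delta/eps) * (1 - log(1+c)/c), c = (e^eps - 1)/(2 delta).
    c = (math.exp(eps) - 1.0) / (2.0 * delta)
    return (Delta / eps) * (1.0 - math.log(1.0 + c) / c)

eps, delta, Delta = 0.5, 1e-3, 1.0   # illustrative privacy parameters
lam = Delta / eps
A = lam * math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta))
B = 1.0 / (2.0 * lam * (1.0 - math.exp(-A / lam)))

# E|X| = 2 * integral_0^A B x e^{-x/lam} dx, by midpoint quadrature.
n = 200000
h = A / n
numeric = 2.0 * h * sum(
    B * ((i + 0.5) * h) * math.exp(-((i + 0.5) * h) / lam) for i in range(n)
)
closed = v1_upp(eps, delta, Delta)
```

The quadrature value and the closed form agree to well within the discretization error.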

We analyze the asymptotic properties of V1upp in the high privacy regimes as ε → 0, δ → 0:

• Given ε, lim_{δ→0} V1upp = ∆/ε. The truncated Laplacian mechanism reduces to the standard Laplacian mechanism as δ → 0.

• Given δ, lim_{ε→0} V1upp = ∆/(4δ). Indeed, let c := (e^ε − 1)/(2δ); when ε → 0, c → 0, and using log(1 + c) ≈ c − c²/2,

V1upp ≈ (∆/ε)(1 − (c − c²/2)/c) = (∆/ε)(c/2) = (∆/ε)(e^ε − 1)/(4δ) ≈ (∆/ε)(ε/(4δ)) = ∆/(4δ).

As ε → 0, the truncated Laplacian distribution reduces to a uniform distribution on the interval [−∆/(2δ), ∆/(2δ)] with probability density δ/∆.

• In the regime δ = ε → 0, the upper bound

V1upp = (∆/ε)(1 − log(1 + (e^ε − 1)/(2ε)) / ((e^ε − 1)/(2ε)))
≈ (∆/ε)(1 − log(3/2)/(1/2))
= (∆/ε)(1 − 2 log(3/2)), (6)

since (e^ε − 1)/(2ε) → 1/2 as ε → 0.
In Section 5, we show that the constant factor (1 − 2 log(3/2)) is actually tight and the upper bound V1upp matches the lower bound V1low defined in Theorem 4.
Theorem 3 (Upper Bound on Minimum Noise Power). Define

V2upp := (2∆²/ε²) (1 − ((1/2) log²(1 + (e^ε − 1)/(2δ)) + log(1 + (e^ε − 1)/(2δ))) / ((e^ε − 1)/(2δ))). (7)

We have
V2∗ ≤ V2upp.
Proof. We can compute the cost for the truncated Laplacian distribution via

V2upp := ∫_{x∈R} f_TLap(x) x² dx = 2 ∫_0^A B e^{−x/λ} x² dx
= 2Bλ (−A² e^{−A/λ} + ∫_0^A e^{−x/λ} 2x dx)
= 2Bλ (−A² e^{−A/λ} + 2λ(−A e^{−A/λ} + ∫_0^A e^{−x/λ} dx))
= 2Bλ (−A² e^{−A/λ} + 2λ(−A e^{−A/λ} + λ − λ e^{−A/λ}))
= (−A² e^{−A/λ} − 2λA e^{−A/λ} + 2λ² − 2λ² e^{−A/λ}) / (1 − e^{−A/λ})
= 2λ² − (A² e^{−A/λ} + 2λA e^{−A/λ}) / (1 − e^{−A/λ})
= 2λ² − (A² + 2λA) / (e^{A/λ} − 1)
= 2∆²/ε² − ((∆²/ε²) log²(1 + (e^ε − 1)/(2δ)) + (2∆²/ε²) log(1 + (e^ε − 1)/(2δ))) / ((e^ε − 1)/(2δ))
= (2∆²/ε²) (1 − ((1/2) log²(1 + (e^ε − 1)/(2δ)) + log(1 + (e^ε − 1)/(2δ))) / ((e^ε − 1)/(2δ))).

Since f_TLap(x) preserves (ε, δ)-differential privacy, this gives an upper bound on V2∗.
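As with the amplitude bound, the closed form (7) can be checked against direct numerical integration of E[X²] under f_TLap (our illustrative parameters again):

```python
import math

def v2_upp(eps, delta, Delta):
    # Closed-form upper bound (7).
    c = (math.exp(eps) - 1.0) / (2.0 * delta)
    L = math.log(1.0 + c)
    return (2.0 * Delta**2 / eps**2) * (1.0 - (0.5 * L * L + L) / c)

eps, delta, Delta = 0.5, 1e-3, 1.0   # illustrative privacy parameters
lam = Delta / eps
A = lam * math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta))
B = 1.0 / (2.0 * lam * (1.0 - math.exp(-A / lam)))

# E[X^2] = 2 * integral_0^A B x^2 e^{-x/lam} dx, by midpoint quadrature.
n = 200000
h = A / n
numeric = 2.0 * h * sum(
    B * ((i + 0.5) * h) ** 2 * math.exp(-((i + 0.5) * h) / lam) for i in range(n)
)
closed = v2_upp(eps, delta, Delta)
```

The bound is strictly below the 2∆²/ε² noise power of the (untruncated) Laplacian mechanism.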

It turns out that the upper bounds V1upp and V2upp in Theorem 2 and Theorem 3 are tight. We derive new lower bounds for V1∗ and V2∗ in the next section, and show that the multiplicative gap between the lower bounds and the upper bounds goes to zero in the high privacy regions in Section 5.

4 Lower Bound
In this section, we derive new lower bounds V1low and V2low on the minimum noise amplitude V1∗ and the minimum noise power V2∗, respectively. The key technique is to discretize the continuous probability distribution and the loss function, transform the continuous functional optimization problem into linear programming, and then apply the discrete result from Geng & Viswanath (2016a).
Geng & Viswanath (2016a) derived lower bounds for an integer-valued query function under (ε, δ)-differential privacy. For integer-valued query functions, they formulate a linear programming problem with the objective of minimizing the additive noise. They studied the dual problem and constructed a dual feasible solution which gives a lower bound. Extending this result to the continuous setting, we show a similar lower bound for real-valued query functions under (ε, δ)-differential privacy.
First, we give a lower bound for (ε, δ)-differential privacy for integer-valued query functions due to Geng & Viswanath (2016a).
Define

a := (δ + (e^ε − 1)/2) / e^ε,
b := e^{−ε}.

To avoid integer rounding issues, assume that there exists an integer n such that Σ_{k=0}^{n−1} a b^k = 1/2.

Lemma 2 (Theorem 8 in Geng & Viswanath (2016a)). Consider a symmetric cost function L(·) : Z → R, where Z denotes the set of all integers. Given the privacy parameters ε, δ and the discrete query sensitivity ∆̃ ∈ Z⁺, if a discrete probability distribution P satisfies

P(S) − e^ε P(S + d) ≤ δ, ∀S ⊆ Z, ∀d ∈ Z with |d| ≤ ∆̃, (8)

and the cost function L(·) satisfies

Σ_{i=1}^{n−1} b^i (2L(i∆̃) − L(1 + (i − 1)∆̃) − L(1 + i∆̃)) ≥ L(1), (9)

then we have

Σ_{i∈Z} L(i)P(i) ≥ 2 Σ_{k=0}^{n−1} a b^k L(1 + k∆̃). (10)

Theorem 4 (Lower Bound on Minimum Noise Amplitude). Define

V1low := 2a Σ_{k=0}^{n−1} b^k k ∆
= 2a ((b − b^n)/(1 − b)² − (n − 1)b^n/(1 − b)) ∆. (11)

We have
V1∗ ≥ V1low.
Proof. Given P ∈ P,δ , we can derive a lower bound on the cost by discretizing the probability distributions
and applying the lower bound (10) for integer-valued query functions in Lemma 2.
We rst discretize the probability distributions P. Given a positive integer N ≥ 0, dene a discrete
probability distribution P̃N via
 ∆ ∆ 
P̃N (i) := P [ (2i − 1), (2i + 1)) , ∀i ∈ Z.
2N 2N
For the noise cost function x, dene the corresponding discrete cost function L̃N via


0, i=0
L̃N (i) , 2N∆
(2i − 1), i≥1


L̃N (−i), i < 0.
It is ready to see that

xP(dx) ≥ Σi∈Z P̃N (i)L̃N (i).
x∈R

As the continuous probability distribution P satises (, δ)-dierential privacy constraint (2) with the
query sensitivity ∆, the discrete probability distribution P̃N satises the discrete (, δ)-dierential privacy
˜ = N , i.e., P̃N satises
constraint (8) with query sensitivity ∆
P̃N (S) − e P̃N (S + d) ≤ δ, ∀S ⊆ Z, d ≤ N.
˜ =N
We can verify that the condition (9) in Lemma 2 holds for L̃N and P̃N with query sensitivity ∆
when N is suciently large. Indeed, when N ≥ a + 2,
n−1

bi [2L̃N (iN ) − L̃N (1 + (i − 1)N ) − L̃N (1 + iN )]
i=1

− L̃N (1)
n−1
∑ ∆
= bi [2(2iN − 1) − 2(1 + (i − 1)N ) + 1
i=1
2N

− 2(1 + iN ) + 1] −
2N
n−1
∑ ∆ ∆
= bi (2N − 4) −
i=1
2N 2N
n−1
∆ ∑ i
= ( b (2N − 4) − 1)
2N i=1
∆ 2N − 4 ∆ N −2
= ( − 1) = ( − 1) ≥ 0.
2N 2a 2N a

8
The corresponding lower bound (10) for L̃_N and P̃_N is

2 Σ_{k=0}^{n−1} a b^k L̃_N(1 + kN) = 2 Σ_{k=0}^{n−1} a b^k (∆/(2N))(2kN + 1)
= 2 Σ_{k=0}^{n−1} a b^k (k∆ + ∆/(2N)) = 2a∆ Σ_{k=0}^{n−1} b^k k + ∆/(2N)
≥ 2a∆ Σ_{k=0}^{n−1} b^k k = 2a ((b − b^n)/(1 − b)² − (n − 1)b^n/(1 − b)) ∆
= V1low.

Therefore, for any P ∈ P_{ε,δ}, we have

∫_{x∈R} |x| P(dx) ≥ Σ_{i∈Z} P̃_N(i) L̃_N(i) ≥ V1low,

and thus V1∗ ≥ V1low.
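To make Theorem 4 concrete, V1low can be evaluated numerically. The sketch below treats n as the real-valued solution of Σ_{k=0}^{n−1} a b^k = 1/2, i.e., n = log(1 + (e^ε − 1)/(2δ))/ε, which ignores the integer-rounding caveat stated before Lemma 2 (an approximation of ours, not the paper's), and checks V1low ≤ V1upp at two illustrative parameter settings:

```python
import math

def v1_bounds(eps, delta, Delta):
    # Lower bound (11) with a, b from Section 4; n is the real-valued root of
    # a (1 - b^n) / (1 - b) = 1/2 (integrality of n is ignored in this sketch).
    a = (delta + (math.exp(eps) - 1.0) / 2.0) / math.exp(eps)
    b = math.exp(-eps)
    n = math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta)) / eps
    bn = b ** n
    lower = 2.0 * a * ((b - bn) / (1.0 - b) ** 2 - (n - 1.0) * bn / (1.0 - b)) * Delta
    c = (math.exp(eps) - 1.0) / (2.0 * delta)
    upper = (Delta / eps) * (1.0 - math.log(1.0 + c) / c)   # (5)
    return lower, upper

lo1, up1 = v1_bounds(0.5, 1e-3, 1.0)    # moderate privacy
lo2, up2 = v1_bounds(0.01, 0.01, 1.0)   # high privacy regime eps = delta -> 0
```

In the high privacy regime the two bounds are already within a few percent of each other, as Theorem 6 below makes precise.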

Similarly, we derive a lower bound for the minimum noise power V2∗.

Theorem 5 (Lower Bound on Minimum Noise Power). Define

V2low := 2 Σ_{k=0}^{n−1} a b^k k² ∆²
= (2a∆²/(1 − b)) (−b + 2(b(1 − b^{n−1})/(1 − b)² − (n − 1)b^n/(1 − b)) − b²(1 − b^{n−2})/(1 − b) − (n − 1)² b^n). (12)

We have
V2∗ ≥ V2low.

Proof. We rst discretize the probability distribution P. Given a positive integer N ≥ 0, dene a discrete
probability distribution P̃N via
 ∆ ∆ 
P̃N (i) , P [ (2i − 1), (2i + 1)) , ∀i ∈ Z.
2N 2N

Dene the corresponding discrete cost function L̃N via




0, i=0
L̃N (i) , ( 2N

(2i − 1))2 , i≥1


L̃N (−i), i < 0.

It is easy to see that



L(x)P(dx) ≥ Σi∈Z P̃N (i)L̃N (i).
x∈R

As the continuous probability distribution P satises (, δ)-dierential privacy constraint with continu-
ous query sensitivity ∆, the discrete probability distribution P̃N satises the discrete (, δ)-dierential
privacy constraint with discrete query sensitivity N , i.e., P̃N satises

P̃N (S) − e P̃N (S + d) ≤ δ, ∀S ⊆ Z, d ≤ N.

9
Next we verify that the condition (9) in Lemma 2 holds when N is suciently large for the `2 cost
function. Indeed,
n−1
∑  
bi 2L̃N (iN ) − L̃N (1 + (i − 1)N ) − L̃N (1 + iN ) − L̃N (1)
i=1
n−1
∑ ∆2   ∆2
= bi 2
2(2iN − 1)2 − (2(1 + (i − 1)N ) − 1)2 − (2(1 + iN ) − 1)2 −
i=1
4N 4N 2
n−1
∑ ∆2 ∆2
= bi ((8i − 4)N 2
− 16iN + 4N ) −
i=1
4N 2 4N 2
≥ 0,

where the last step holds when N is suciently large.


The lower bound in (10) is
n−1

2 abk L̃N (1 + kN )
k=0
n−1
∑ ∆2
=2 abk (2kN + 1)2
4N 2
k=0
n−1
∑ ∆2
=2 abk (4k 2 N 2 + 4kN + 1)
4N 2
k=0
n−1
∑ ∆2
≥2 abk 4k 2 N 2
4N 2
k=0
n−1

=2 abk k 2 ∆2
k=0
n−1
(n−1)bn b2 (1−bn−2 )
−b + 2( b(1−b
(1−b)2
)
− 1−b ) − 1−b − (n − 1)2 bn
= 2a ∆2 .
1−b
Therefore, for any P ∈ P,δ , we have

x2 P(dx) ≥ Σi∈Z P̃N (i)L̃N (i)
x∈R
n−1
(n−1)bn b2 (1−bn−2 )
−b + 2( b(1−b
(1−b)2
)
− 1−b ) − 1−b − (n − 1)2 bn
≥ 2a ∆2 ,
1−b
and thus
n−1
(n−1)bn b2 (1−bn−2 )
−b + 2( b(1−b
(1−b)2
)
− 1−b ) − 1−b − (n − 1)2 bn
V2∗ ≥ 2a ∆2 = V2low .
1−b

5 Tightness of the Lower and Upper Bounds

In this section, we compare the lower bounds V1low, V2low and the upper bounds V1upp, V2upp (derived from the truncated Laplacian mechanism) for the minimum noise amplitude and noise power under (ε, δ)-differential privacy. We show that they are close in the high privacy regions and that the multiplicative gap goes to zero, which proves the tightness of these lower and upper bounds and thus establishes the near-optimality of the truncated Laplacian mechanism.
Theorem 6 (Tightness of Lower Bound and Upper Bound on Minimum Noise Amplitude).

lim_{ε→0} V1low/V1upp ≥ 1 − 2δ,
lim_{δ→0} V1low/V1upp ≥ ε/(e^ε − 1) = 1 − ε/2 + O(ε²),
lim_{ε=δ→0} V1low/V1upp = 1.

Proof. 1. δ is xed, and  → 0:


1 1
( 2δ −1)
When  → 0, the upper bound V1upp → ∆
4δ , and the lower bound V1low → 2δ n(n−1)
2 ∆ = 2δ 2δ 2 ∆ =
1
( 4δ − 12 )∆. Therefore,
1
V1low ( 4δ − 12 )∆
lim upp ≥ ∆
= 1 − 2δ.
→0 V1

Note that 1 − 2δ → 1, as δ → 0, and thus the multiplicative gap between V1low and V1upp converges to
zero.
2.  is xed, and δ → 0:
When δ → 0, the upper bound V1upp → ∆ . For the lower bound V1low , we have
1 − e−
a→ ,
2
bn → 0,
nbn → 0,

and thus V1low →  −1 as δ → 0. Therefore,

V1low  −1  
lim upp ≥ ∆
= = 1 − + O(2 ).
δ→0 V1

e − 1 2

Therefore, the multiplicative gap between V1low and V1upp converges to zero as  → 0.
3. ε = δ → 0:
In this regime, V1upp ≈ (∆/ε)(1 − 2 log(3/2)) as shown in Section 3. For the lower bound V1low, since Σ_{k=0}^{n−1} a b^k = 1/2, we have

a (1 − b^n)/(1 − b) = 1/2 ⇒ b^n = 1 − (1 − b)/(2a).

As ε = δ → 0,

(1 − b)/(2a) = (1 − e^{−ε}) / (2(δ + (e^ε − 1)/2)/e^ε) = (e^ε − 1)/(2δ + e^ε − 1) → 1/3,

and thus

lim_{ε=δ→0} b^n = 1 − 1/3 = 2/3,
n = Θ(log(3/2)/δ).

Note that a = Θ((3/2)δ) as δ → 0.
Therefore, as ε = δ → 0,

2a ((b − b^n)/(1 − b)² − (n − 1)b^n/(1 − b)) ∆
≈ 2a ((1 − 2/3)/δ² − (log(3/2)/δ)(2/3)/δ) ∆
= 2a (1/(3δ²) − (2/3) log(3/2)/δ²) ∆
≈ 2 (3/2) δ (1/(3δ²) − (2/3) log(3/2)/δ²) ∆
= (1 − 2 log(3/2)) ∆/δ.

Therefore, V1∗ is lower bounded by V1low ≈ (1 − 2 log(3/2)) ∆/δ in the regime ε = δ → 0. Since it is also upper bounded by V1upp ≈ (∆/ε)(1 − 2 log(3/2)), we conclude that lim_{ε=δ→0} V1low/V1upp = 1.
Note that our result closes the constant multiplicative gap in the discrete setting (see Equations (67) and (69) in Geng & Viswanath (2016a)).
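The first limit in Theorem 6 is easy to observe numerically: fixing δ and letting ε shrink, the ratio V1low/V1upp approaches 1 − 2δ. In the sketch below, n is again taken as the real-valued solution of Σ_{k=0}^{n−1} a b^k = 1/2 (an approximation of ours, since the theorem's n is an integer):

```python
import math

def v1_ratio(eps, delta, Delta=1.0):
    # V1low / V1upp with n treated as real-valued (the integer n of Lemma 2 is approximated).
    a = (delta + (math.exp(eps) - 1.0) / 2.0) / math.exp(eps)
    b = math.exp(-eps)
    n = math.log(1.0 + (math.exp(eps) - 1.0) / (2.0 * delta)) / eps
    bn = b ** n
    lower = 2.0 * a * ((b - bn) / (1.0 - b) ** 2 - (n - 1.0) * bn / (1.0 - b)) * Delta
    c = (math.exp(eps) - 1.0) / (2.0 * delta)
    upper = (Delta / eps) * (1.0 - math.log(1.0 + c) / c)   # (5)
    return lower / upper

delta = 0.05
ratio_small_eps = v1_ratio(1e-4, delta)   # should approach 1 - 2*delta = 0.9
```

Shrinking ε further drives the ratio even closer to 1 − 2δ, matching the first case of the proof.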

Similarly, we show that the lower bound V2low and the upper bound V2upp on the minimum noise power are also tight.

Theorem 7 (Tightness of Lower Bound and Upper Bound on Minimum Noise Power).

lim_{ε→0} V2low/V2upp ≥ (1 − δ)(1 − 2δ) = 1 − 3δ + 2δ²,
lim_{δ→0} V2low/V2upp ≥ ε²(1 + e^ε)/(2(e^ε − 1)²) = 1 − ε/2 + O(ε²),
lim_{ε=δ→0} V2low/V2upp = 1.

Proof. • Case ε → 0: When ε → 0, the upper bound V2upp converges to ∆²/(12δ²). For the lower bound, when ε → 0, we have

a → δ,
b → 1,
n → 1/(2δ),

and thus the lower bound V2low = 2a Σ_{k=0}^{n−1} b^k k² ∆² converges to

2δ ((n − 1)n(2n − 1)/6) ∆² = 2δ ((1/(2δ) − 1)(1/(2δ))(1/δ − 1)/6) ∆² = (∆²/12)(1/δ − 1)(1/δ − 2),

which matches the upper bound as δ → 0. Therefore,

lim_{ε→0} V2low/V2upp ≥ ((∆²/12)(1/δ − 1)(1/δ − 2)) / (∆²/(12δ²)) = (1 − δ)(1 − 2δ).

• Case δ → 0: When δ → 0, the upper bound V2upp converges to 2∆²/ε². For the lower bound, we have

a → (1 − e^{−ε})/2 = (1 − b)/2,
b^n → 0,
n² b^n → 0,

and thus the lower bound (2a/(1 − b))(−b + 2(b(1 − b^{n−1})/(1 − b)² − (n − 1)b^n/(1 − b)) − b²(1 − b^{n−2})/(1 − b) − (n − 1)²b^n) ∆² converges to

(−b + 2b/(1 − b)² − b²/(1 − b)) ∆² = ((b² + b)/(1 − b)²) ∆² = ((e^{−2ε} + e^{−ε})/(1 − e^{−ε})²) ∆² = ((1 + e^ε)/(e^ε − 1)²) ∆²,

and this matches 2∆²/ε² as ε → 0. Therefore,

lim_{δ→0} V2low/V2upp ≥ (((1 + e^ε)/(e^ε − 1)²) ∆²) / (2∆²/ε²) = ε²(1 + e^ε)/(2(e^ε − 1)²).

• Case ε = δ → 0: The upper bound V2upp converges to (2∆²/ε²)(1 − log²(3/2) − 2 log(3/2)). When ε = δ → 0, the lower bound V2low is

(2a/(1 − b)) (−b + 2(b(1 − b^{n−1})/(1 − b)² − (n − 1)b^n/(1 − b)) − b²(1 − b^{n−2})/(1 − b) − (n − 1)²b^n) ∆².

Using b^n → 2/3, n ≈ log(3/2)/δ, 1 − b ≈ δ, and a ≈ (3/2)δ (so 2a/(1 − b) ≈ 3), and keeping the leading Θ(1/δ²) terms,

V2low ≈ 3 (2(1/(3δ²) − (2/3) log(3/2)/δ²) − (2/3) log²(3/2)/δ²) ∆²
= (2∆²/δ²)(1 − 2 log(3/2) − log²(3/2)),

which matches the upper bound. We conclude that

lim_{ε=δ→0} V2low/V2upp = 1.
6 Comparison with the Optimal Gaussian Mechanism

In this section, we conduct numeric experiments to compare the performance of the truncated Laplacian mechanism with the optimal Gaussian mechanism described in Balle & Wang (2018).
A classic result on the Gaussian mechanism is that for any ε, δ ∈ (0, 1), adding Gaussian noise with standard deviation σ = √(2 log(1.25/δ)) ∆/ε preserves (ε, δ)-differential privacy (Dwork & Roth, 2014). Balle & Wang (2018) developed the optimal Gaussian mechanism, whose variance is calibrated directly using the Gaussian cumulative density function instead of a tail bound approximation.
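For intuition about the scale of the improvement, one can compare V1upp with the expected amplitude of the classic (tail-bound) Gaussian mechanism, E|X| = σ√(2/π); the Balle & Wang (2018) optimal calibration is numerical and is not reproduced in this sketch. Parameters below are our illustrative choices:

```python
import math

eps, delta, Delta = 0.5, 1e-3, 1.0   # illustrative; classic result requires eps, delta in (0, 1)

# Classic Gaussian mechanism: sigma = sqrt(2 log(1.25/delta)) * Delta / eps.
sigma = math.sqrt(2.0 * math.log(1.25 / delta)) * Delta / eps
gauss_amplitude = sigma * math.sqrt(2.0 / math.pi)   # E|X| for X ~ N(0, sigma^2)

# Truncated Laplacian amplitude, closed form (5).
c = (math.exp(eps) - 1.0) / (2.0 * delta)
tlap_amplitude = (Delta / eps) * (1.0 - math.log(1.0 + c) / c)

ratio = tlap_amplitude / gauss_amplitude   # < 1 means the truncated Laplacian adds less noise
```

Since the optimal Gaussian mechanism adds no more noise than the classic one, this comparison only loosely suggests the improvement reported in Figs. 2 and 3; the paper's experiments compare against the optimal calibration.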

Figure 2: Ratio of the Noise Amplitude of the Truncated Laplacian Mechanism and the Optimal Gaussian
Mechanism.

We plot the ratio of the noise amplitude of the truncated Laplacian mechanism to that of the optimal
Gaussian mechanism in Fig. 2, and the ratio of the noise power of the truncated Laplacian mechanism to
that of the optimal Gaussian mechanism in Fig. 3, for ε ∈ [10^{-4}, 10] and δ ∈ [10^{-6}, 0.1]. Note that compared
with the optimal Gaussian mechanism, the truncated Laplacian mechanism significantly reduces the
noise amplitude and noise power in all privacy regimes. The improvement is not surprising, as the
truncated Laplacian mechanism universally improves the probability density decay rate (for both small
and large noise values) and thus leads to smaller noise amplitude and noise power in expectation.
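The gap can be reproduced approximately with a few lines of code. The sketch below assumes the truncated Laplacian parameters λ = Δ/ε and A = λ log(1 + (e^ε − 1)/(2δ)) from the construction earlier in the paper, and, for simplicity, compares against the classical (tail-bound) Gaussian calibration rather than the optimal one, so the true ratios against the optimal Gaussian mechanism are somewhat larger:

```python
import math

def trunc_laplace_expected_abs(eps, delta, sensitivity):
    # E|X| for a density proportional to exp(-|x|/lam) on [-A, A]:
    # with t = A/lam, E|X| = lam * (1 - e^{-t} (1 + t)) / (1 - e^{-t}).
    lam = sensitivity / eps
    t = math.log(1 + (math.exp(eps) - 1) / (2 * delta))  # t = A / lam
    return lam * (1 - math.exp(-t) * (1 + t)) / (1 - math.exp(-t))

def gaussian_expected_abs(eps, delta, sensitivity):
    # Classical Gaussian calibration; for N(0, sigma^2), E|X| = sigma * sqrt(2/pi).
    sigma = math.sqrt(2 * math.log(1.25 / delta)) * sensitivity / eps
    return sigma * math.sqrt(2 / math.pi)

for eps, delta in [(0.1, 1e-5), (1.0, 1e-5)]:
    ratio = trunc_laplace_expected_abs(eps, delta, 1.0) / gaussian_expected_abs(eps, delta, 1.0)
    print(f"eps={eps}, delta={delta}: amplitude ratio ~= {ratio:.3f}")
```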

Figure 3: Ratio of the Noise Power of the Truncated Laplacian Mechanism and the Optimal Gaussian
Mechanism.

7 Conclusion and Discussion
In this work, we characterize the minimum noise amplitude and noise power for noise-adding mechanisms
in (ε, δ)-differential privacy for a single real-valued query function. We derive new lower bounds using the
duality of linear programming, and new upper bounds by proposing a new class of (ε, δ)-differentially
private mechanisms, the truncated Laplacian mechanisms. We show that the multiplicative gap between the
lower and upper bounds goes to zero in various high-privacy regimes, proving the tightness of
both bounds and thus establishing the optimality of the truncated Laplacian mechanism.
In particular, our results close the previous constant multiplicative gap in Geng & Viswanath (2016a).
Comprehensive numerical experiments show the improvement of the truncated Laplacian mechanism over
the optimal Gaussian mechanism of Balle & Wang (2018) in all privacy regimes.
An obvious question is how to further improve the truncated Laplacian mechanism to provide stronger
privacy guarantees. To minimize the additive noise, an important property of the truncated Laplacian
mechanism is that the output noise is bounded within [−A, A]. Therefore, for two neighboring
datasets, the randomized output ranges will have some non-overlapping set. While the truncated Laplacian
mechanism strictly preserves (ε, δ)-differential privacy, with a small probability of up to δ (corresponding
to the probability that the output falls in the non-overlapping set), an adversary can distinguish the two
neighboring datasets. To address this concern, one can improve on the truncated Laplacian mechanism
by imposing an arbitrarily light tail distribution over [A, +∞) to ensure that the output space is the same
for all possible datasets.
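The bounded support discussed above is easy to see in an inverse-CDF sampler for the truncated Laplacian noise. The parameterization below (λ = Δ/ε, A = λ log(1 + (e^ε − 1)/(2δ))) is assumed from the construction earlier in the paper:

```python
import math
import random

def sample_truncated_laplace(eps, delta, sensitivity, rng=random):
    # Scale and cutoff of the truncated Laplacian (assumed parameterization).
    lam = sensitivity / eps
    A = lam * math.log(1 + (math.exp(eps) - 1) / (2 * delta))
    # |X| follows a truncated exponential on [0, A]; invert its CDF, then pick a sign.
    u = rng.random()
    magnitude = -lam * math.log(1 - u * (1 - math.exp(-A / lam)))
    return magnitude if rng.random() < 0.5 else -magnitude

# Every sample lies in [-A, A]; the light-tail fix discussed above would instead
# spread a delta-small amount of mass over (A, +infinity).
samples = [sample_truncated_laplace(1.0, 1e-3, 1.0) for _ in range(1000)]
print(min(samples), max(samples))
```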

References
Abadi, M., Chu, A., Goodfellow, I., McMahan, H. B., Mironov, I., Talwar, K., and Zhang, L. Deep
learning with differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer
and Communications Security, CCS ’16, pp. 308–318. ACM, 2016.

Agarwal, N., Suresh, A. T., Yu, F., Kumar, S., and McMahan, B. cpSGD: Communication-efficient and
differentially-private distributed SGD. In Advances in Neural Information Processing Systems. 2018.

Balle, B. and Wang, Y.-X. Improving the Gaussian mechanism for differential privacy: Analytical
calibration and optimal denoising. In Proceedings of the 35th International Conference on Machine
Learning, volume 80 of Proceedings of Machine Learning Research, pp. 394–403. PMLR, 2018.

Chaudhuri, K. and Monteleoni, C. Privacy-preserving logistic regression. In Neural Information Processing
Systems, pp. 289–296, 2008.

Chaudhuri, K., Monteleoni, C., and Sarwate, A. D. Differentially private empirical risk minimization.
Journal of Machine Learning Research, 12:1069–1109, 2011.

Chaudhuri, K., Sarwate, A., and Sinha, K. Near-optimal differentially private principal components. In
Advances in Neural Information Processing Systems 25, pp. 989–997. 2012.

Duchi, J., Jordan, M., and Wainwright, M. Privacy aware learning. In Advances in Neural Information
Processing Systems, pp. 1430–1438, 2012.

Dwork, C. Dierential Privacy: A Survey of Results. In Theory and Applications of Models of Computation,
volume 4978, pp. 1–19, 2008.

Dwork, C. and Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in
Theoretical Computer Science, 9(3-4):211–407, 2014.

Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., and Naor, M. Our data, ourselves: privacy via
distributed noise generation. In Proceedings of the 24th annual international conference on The Theory
and Applications of Cryptographic Techniques, EUROCRYPT’06, pp. 486–503. Springer-Verlag, 2006a.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. Calibrating noise to sensitivity in private data
analysis. In Theory of Cryptography, volume 3876 of Lecture Notes in Computer Science, pp. 265–284.
Springer Berlin / Heidelberg, 2006b.

Dziugaite, G. K. and Roy, D. M. Data-dependent PAC-Bayes priors via differential privacy. In Advances in
Neural Information Processing Systems 31, pp. 8440–8450. 2018.

14
Ge, J., Wang, Z., Wang, M., and Liu, H. Minimax-optimal privacy-preserving sparse PCA in distributed
systems. In Proceedings of the Twenty-First International Conference on Artificial Intelligence and
Statistics, volume 84 of Proceedings of Machine Learning Research, pp. 1589–1598. PMLR, 09–11 Apr
2018.

Geng, Q. and Viswanath, P. Optimal noise adding mechanisms for approximate differential privacy. IEEE
Transactions on Information Theory, 62(2):952–969, Feb 2016a.

Geng, Q. and Viswanath, P. The optimal noise-adding mechanism in differential privacy. IEEE Transactions
on Information Theory, 62(2):925–951, Feb 2016b.

Geng, Q., Kairouz, P., Oh, S., and Viswanath, P. The staircase mechanism in differential privacy. IEEE
Journal of Selected Topics in Signal Processing, 9(7):1176–1184, Oct 2015.

Ghosh, A., Roughgarden, T., and Sundararajan, M. Universally utility-maximizing privacy mechanisms.
In Proceedings of the 41st annual ACM symposium on Theory of computing, STOC ’09, pp. 351–360.
ACM, 2009.

Gupte, M. and Sundararajan, M. Universally optimal privacy mechanisms for minimax agents. In
Symposium on Principles of Database Systems, pp. 135–146, 2010.

Jain, P., Kothari, P., and Thakurta, A. Differentially private online learning. In Proceedings of the 25th
Annual Conference on Learning Theory, volume 23 of Proceedings of Machine Learning Research, pp.
24.1–24.34. PMLR, 25–27 Jun 2012.

Jain, P., Thakkar, O. D., and Thakurta, A. Differentially private matrix completion revisited. In
Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of
Machine Learning Research, pp. 2215–2224. PMLR, 10–15 Jul 2018.

Mironov, I. Rényi differential privacy. In 2017 IEEE 30th Computer Security Foundations Symposium
(CSF), pp. 263–275, Aug. 2017.

Park, M., Foulds, J., Chaudhuri, K., and Welling, M. DP-EM: Differentially Private Expectation
Maximization. In Proceedings of the 20th International Conference on Artificial Intelligence and
Statistics, volume 54 of Proceedings of Machine Learning Research, pp. 896–904. PMLR, 2017.

Phan, N.-S., Wang, Y., Wu, X., and Dou, D. Differential privacy preservation for deep auto-encoders: an
application of human behavior prediction. In AAAI, 2016.

Sheet, O. Locally private hypothesis testing. In Proceedings of the 35th International Conference on
Machine Learning, volume 80 of Proceedings of Machine Learning Research, pp. 4605–4614. PMLR,
10–15 Jul 2018.

Shokri, R. and Shmatikov, V. Privacy-preserving deep learning. In Proceedings of the 22Nd ACM SIGSAC
Conference on Computer and Communications Security, CCS ’15, pp. 1310–1321. ACM, 2015.

Soria-Comas, J. and Domingo-Ferrer, J. Optimal data-independent noise for differential privacy.
Information Sciences, 250:200–214, 2013.

Wang, D., Gaboardi, M., and Xu, J. Empirical risk minimization in non-interactive local differential
privacy revisited. In Advances in Neural Information Processing Systems 31, pp. 973–982. 2018.
