-
Adapting to Function Difficulty and Growth Conditions in Private Optimization
Authors:
Hilal Asi,
Daniel Levy,
John Duchi
Abstract:
We develop algorithms for private stochastic convex optimization that adapt to the hardness of the specific function we wish to optimize. While previous work provides worst-case bounds for arbitrary convex functions, it is often the case that the function at hand belongs to a smaller class that enjoys faster rates. Concretely, we show that for functions exhibiting $κ$-growth around the optimum, i.e., $f(x) \ge f(x^*) + λκ^{-1} \|x-x^*\|_2^κ$ for $κ> 1$, our algorithms improve upon the standard ${\sqrt{d}}/{n\varepsilon}$ privacy rate to the faster $({\sqrt{d}}/{n\varepsilon})^{\tfrac{κ}{κ-1}}$. Crucially, they achieve these rates without knowledge of the growth constant $κ$ of the function. Our algorithms build upon the inverse sensitivity mechanism, which adapts to instance difficulty (Asi & Duchi, 2020), and recent localization techniques in private optimization (Feldman et al., 2020). We complement our algorithms with matching lower bounds for these function classes and demonstrate that our adaptive algorithm is \emph{simultaneously} (minimax) optimal over all $κ\ge 1+c$ whenever $c = Θ(1)$.
Submitted 5 August, 2021;
originally announced August 2021.
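As a worked instance of the rate quoted in the abstract above, consider quadratic growth ($κ = 2$) and the limit of large $κ$. Constants and logarithmic factors are omitted here, which is an assumption of this sketch rather than a statement from the paper; the improvement is meaningful when $\sqrt{d}/(n\varepsilon) < 1$.

```latex
\[
\left.\left(\frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{\kappa}{\kappa-1}}\right|_{\kappa=2}
= \frac{d}{n^{2}\varepsilon^{2}},
\qquad
\lim_{\kappa\to\infty}\left(\frac{\sqrt{d}}{n\varepsilon}\right)^{\frac{\kappa}{\kappa-1}}
= \frac{\sqrt{d}}{n\varepsilon}.
\]
```

So the exponent $κ/(κ-1)$ equals $2$ for quadratic growth and decreases toward $1$ as $κ$ grows, recovering the standard rate in the limit.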
-
Commuting Varieties and Cohomological Complexity
Authors:
Nham V. Ngo,
Paul D. Levy,
Klemen Šivic
Abstract:
In this paper we determine, for all $r$ sufficiently large, the irreducible component(s) of maximal dimension of the variety of commuting $r$-tuples of nilpotent elements of $\mathfrak{gl}_n$. Our main result is that in characteristic $\neq 2,3$, this nilpotent commuting variety has dimension $(r+1)\lfloor \frac{n^2}{4}\rfloor$ for $n\geq 4$, $r\geq 7$. We use this to find the dimension of the (ordinary) $r$-th commuting varieties of $\mathfrak{gl}_n$ and $\mathfrak{sl}_n$ for the same range of values of $r$ and $n$.
Our principal motivation is the connection between nilpotent commuting varieties and cohomological complexity of finite group schemes, which we exploit in the last section of the paper to obtain explicit values for complexities of a large family of modules over the $r$-th Frobenius kernel $({\rm GL}_n)_{(r)}$. These results indicate an inequality between the complexities of a rational $G$-module $M$ when restricted to $G_{(r)}$ or to $G(\mathbb F_{p^r})$; we subsequently establish this inequality for every simple algebraic group $G$ defined over an algebraically closed field of good characteristic, significantly extending a result of Lin and Nakano.
Submitted 4 April, 2022; v1 submitted 17 May, 2021;
originally announced May 2021.
-
Learning with User-Level Privacy
Authors:
Daniel Levy,
Ziteng Sun,
Kareem Amin,
Satyen Kale,
Alex Kulesza,
Mehryar Mohri,
Ananda Theertha Suresh
Abstract:
We propose and analyze algorithms to solve a range of learning tasks under user-level differential privacy constraints. Rather than guaranteeing only the privacy of individual samples, user-level DP protects a user's entire contribution ($m \ge 1$ samples), providing more stringent but more realistic protection against information leaks. We show that for high-dimensional mean estimation, empirical risk minimization with smooth losses, stochastic convex optimization, and learning hypothesis classes with finite metric entropy, the privacy cost decreases as $O(1/\sqrt{m})$ as users provide more samples. In contrast, when increasing the number of users $n$, the privacy cost decreases at a faster $O(1/n)$ rate. We complement these results with lower bounds showing the minimax optimality of our algorithms for mean estimation and stochastic convex optimization. Our algorithms rely on novel techniques for private mean estimation in arbitrary dimension with error scaling as the concentration radius $τ$ of the distribution rather than the entire range.
Submitted 3 December, 2021; v1 submitted 23 February, 2021;
originally announced February 2021.
-
Large-Scale Methods for Distributionally Robust Optimization
Authors:
Daniel Levy,
Yair Carmon,
John C. Duchi,
Aaron Sidford
Abstract:
We propose and analyze algorithms for distributionally robust optimization of convex losses with conditional value at risk (CVaR) and $χ^2$ divergence uncertainty sets. We prove that our algorithms require a number of gradient evaluations independent of training set size and number of parameters, making them suitable for large-scale applications. For $χ^2$ uncertainty sets these are the first such guarantees in the literature, and for CVaR our guarantees scale linearly in the uncertainty level rather than quadratically as in previous work. We also provide lower bounds proving the worst-case optimality of our algorithms for CVaR and a penalized version of the $χ^2$ problem. Our primary technical contributions are novel bounds on the bias of batch robust risk estimation and the variance of a multilevel Monte Carlo gradient estimator due to [Blanchet & Glynn, 2015]. Experiments on MNIST and ImageNet confirm the theoretical scaling of our algorithms, which are 9--36 times more efficient than full-batch methods.
Submitted 10 December, 2020; v1 submitted 12 October, 2020;
originally announced October 2020.
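For readers unfamiliar with the CVaR objective named in the abstract above, the following minimal sketch computes the standard empirical CVaR of a finite batch of losses: the worst-case reweighted average in which no sample may receive weight above $1/(αn)$. The function name `cvar` and the parameter name `alpha` are hypothetical, and this is only the textbook definition, not the batch or multilevel Monte Carlo estimators analyzed in the paper.

```python
def cvar(losses, alpha):
    """Empirical CVaR at level alpha in (0, 1]: the largest weighted average of
    the losses when each sample's weight is capped at 1 / (alpha * n)."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    n = len(losses)
    cap = 1.0 / (alpha * n)   # maximum weight allowed on any single sample
    remaining = 1.0           # probability mass still to be assigned
    total = 0.0
    # Greedily put as much mass as allowed on the largest losses first.
    for loss in sorted(losses, reverse=True):
        weight = min(cap, remaining)
        total += weight * loss
        remaining -= weight
        if remaining <= 0.0:
            break
    return total
```

With `alpha = 1` this reduces to the ordinary mean; smaller values of `alpha` focus the average on the largest losses.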
-
Necessary and Sufficient Geometries for Gradient Methods
Authors:
Daniel Levy,
John C. Duchi
Abstract:
We study the impact of the constraint set and gradient geometry on the convergence of online and stochastic methods for convex optimization, providing a characterization of the geometries for which stochastic gradient and adaptive gradient methods are (minimax) optimal. In particular, we show that when the constraint set is quadratically convex, diagonally pre-conditioned stochastic gradient methods are minimax optimal. We further provide a converse that shows that when the constraints are not quadratically convex---for example, any $\ell_p$-ball for $p < 2$---the methods are far from optimal. Based on this, we can provide concrete recommendations for when one should use adaptive, mirror or stochastic gradient methods.
Submitted 28 October, 2019; v1 submitted 23 September, 2019;
originally announced September 2019.
-
Dioid Partitions of Groups
Authors:
Ishay Haviv,
Dan Levy
Abstract:
A partition of a group is a dioid partition if the following three conditions are met: The setwise product of any two parts is a union of parts, there is a part that multiplies as an identity element, and the inverse of a part is a part. This kind of a group partition was first introduced by Tamaschke in 1968. We show that a dioid partition defines a dioid structure over the group, analogously to the way a Schur ring over a group is defined. After proving fundamental properties of dioid partitions, we focus on three part dioid partitions of cyclic groups of prime order. We provide classification results for their isomorphism types as well as for the partitions themselves.
Submitted 7 July, 2018; v1 submitted 9 August, 2017;
originally announced August 2017.
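The three defining conditions of a dioid partition in the abstract above can be checked by brute force for a small cyclic group. The sketch below works in $\mathbb{Z}_n$ written additively, so the setwise product becomes a setwise sum and the inverse of a part is its negation; the function name `is_dioid_partition` and the list-of-sets input format are assumptions made for illustration only.

```python
from itertools import product


def is_dioid_partition(parts, n):
    """Check the three dioid-partition conditions for a partition of Z_n."""
    parts = [frozenset(x % n for x in p) for p in parts]
    # The parts must be non-empty, pairwise disjoint, and cover Z_n.
    if any(not p for p in parts):
        return False
    if sum(len(p) for p in parts) != n:
        return False
    if frozenset().union(*parts) != frozenset(range(n)):
        return False
    part_set = set(parts)

    def setwise_sum(a, b):
        return frozenset((x + y) % n for x in a for y in b)

    def is_union_of_parts(s):
        # Every part is either contained in s or disjoint from it.
        return all(p <= s or not (p & s) for p in parts)

    # Condition 1: the setwise sum of any two parts is a union of parts.
    closed = all(is_union_of_parts(setwise_sum(a, b))
                 for a, b in product(parts, repeat=2))
    # Condition 2: some part acts as an identity on every part.
    has_identity = any(all(setwise_sum(e, p) == p for p in parts)
                       for e in parts)
    # Condition 3: the inverse (negation) of each part is again a part.
    inverses = all(frozenset((-x) % n for x in p) in part_set for p in parts)
    return closed and has_identity and inverses
```

For instance, the three-part partition `[{0}, {1, 4}, {2, 3}]` of $\mathbb{Z}_5$ passes all three checks, while `[{0}, {1}, {2, 3, 4}]` fails because the inverse of `{1}` is not a part.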
-
Set-Direct Factorizations of Groups
Authors:
Dan Levy,
Attila Maróti
Abstract:
We consider factorizations $G=XY$ where $G$ is a general group, $X$ and $Y$ are normal subsets of $G$ and any $g\in G$ has a unique representation $g=xy$ with $x\in X$ and $y\in Y$. This definition coincides with the customary and extensively studied definition of a direct product decomposition by subsets of a finite abelian group. Our main result states that a group $G$ has such a factorization if and only if $G$ is a central product of $\left\langle X\right\rangle $ and $\left\langle Y\right\rangle $ and the central subgroup $\left\langle X\right\rangle \cap \left\langle Y\right\rangle $ satisfies certain abelian factorization conditions. We analyze some special cases and give examples. In particular, simple groups have no non-trivial set-direct factorization.
Submitted 10 October, 2018; v1 submitted 14 July, 2017;
originally announced July 2017.
-
Symmetric Complete Sum-free Sets in Cyclic Groups
Authors:
Ishay Haviv,
Dan Levy
Abstract:
We present constructions of symmetric complete sum-free sets in general finite cyclic groups. It is shown that the relative sizes of the sets are dense in $[0,\frac{1}{3}]$, answering a question of Cameron, and that the number of those contained in the cyclic group of order $n$ is exponential in $n$. For primes $p$, we provide a full characterization of the symmetric complete sum-free subsets of $\mathbb{Z}_p$ of size at least $(\frac{1}{3}-c) \cdot p$, where $c>0$ is a universal constant.
Submitted 1 May, 2017; v1 submitted 12 March, 2017;
originally announced March 2017.
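To make the objects in the abstract above concrete, here is a small brute-force check, assuming the usual definitions: a subset $S$ of $\mathbb{Z}_n$ is symmetric if $S = -S$, and it is complete sum-free if $S + S$ equals the complement of $S$ (which implies both that $S$ is sum-free and that every element outside $S$ is a sum of two elements of $S$). The function name is hypothetical and the check is illustrative only; it is not one of the constructions from the paper.

```python
def is_symmetric_complete_sum_free(s, n):
    """Return True if s is a symmetric complete sum-free subset of Z_n."""
    s = {x % n for x in s}
    # All pairwise sums a + b (mod n) with a, b in s, including a == b.
    sums = {(a + b) % n for a in s for b in s}
    symmetric = {(-x) % n for x in s} == s
    # S + S must be exactly the complement of S in Z_n.
    complete_sum_free = sums == set(range(n)) - s
    return symmetric and complete_sum_free
```

For example, `{2, 3}` in $\mathbb{Z}_5$ satisfies the definition, whereas `{1, 6}` in $\mathbb{Z}_7$ is symmetric and sum-free but not complete.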
-
Primitive permutation groups as products of point stabilizers
Authors:
Martino Garonzi,
Dan Levy,
Attila Maróti,
Iulian I. Simion
Abstract:
We prove that there exists a universal constant $c$ such that any finite primitive permutation group of degree $n$ with a non-trivial point stabilizer is a product of no more than $c\log n$ point stabilizers.
Submitted 23 August, 2015;
originally announced August 2015.
-
Factorizations of finite groups by conjugate subgroups which are solvable or nilpotent
Authors:
Martino Garonzi,
Dan Levy,
Attila Maróti,
Iulian I. Simion
Abstract:
We consider factorizations of a finite group $G$ into conjugate subgroups, $G=A^{x_{1}}\cdots A^{x_{k}}$ for $A\leq G$ and $x_{1},\ldots ,x_{k}\in G$, where $A$ is nilpotent or solvable. First we exploit the split $BN$-pair structure of finite simple groups of Lie type to give a unified self-contained proof that every such group is a product of four or three unipotent Sylow subgroups. Then we derive an upper bound on the minimal length of a solvable conjugate factorization of a general finite group. Finally, using conjugate factorizations of a general finite solvable group by any of its Carter subgroups, we obtain an upper bound on the minimal length of a nilpotent conjugate factorization of a general finite group.
Submitted 6 March, 2015; v1 submitted 22 January, 2015;
originally announced January 2015.
-
Groups equal to a product of three conjugate subgroups
Authors:
John Cannon,
Martino Garonzi,
Dan Levy,
Attila Maróti,
Iulian I. Simion
Abstract:
Let $G$ be a finite non-solvable group. We prove that there exists a proper subgroup $A$ of $G$ such that $G$ is the product of three conjugates of $A$, thus replacing an earlier upper bound of $36$ with the smallest possible value. The proof relies on an equivalent formulation in terms of double cosets, and uses the following theorem which is of independent interest and wider scope: Any group $G$ with a $BN$-pair and a finite Weyl group $W$ satisfies $G=\left( Bn_{0}B\right) ^{2}=BB^{n_{0}}B$ where $n_{0}$ is any preimage of the longest element of $W$. The proof of the last theorem is formulated in the dioid consisting of all unions of double cosets of $B$ in $G$. Other results on minimal length product covers of a group by conjugates of a proper subgroup are given.
Submitted 22 January, 2015;
originally announced January 2015.
-
Factorizing a Finite Group into Conjugates of a Subgroup
Authors:
Dan Levy,
Martino Garonzi
Abstract:
For every non-nilpotent finite group $G$, there exists at least one proper subgroup $M$ such that $G$ is the setwise product of a finite number of conjugates of $M$. We define $γ_{\text{cp}}\left( G\right) $ to be the smallest number $k$ such that $G$ is a product, in some order, of $k$ pairwise conjugate proper subgroups of $G$. We prove that if $G$ is non-solvable then $γ_{\text{cp}}\left( G\right) \leq36$, whereas if $G$ is solvable then $γ_{\text{cp}}\left( G\right) $ can attain any integer value greater than $2$ and, on the other hand, $γ_{\text{cp}}\left( G\right) \leq4\log_{2}\left\vert G\right\vert $.
Submitted 22 July, 2014;
originally announced July 2014.
-
Criteria for solvable radical membership via p-elements
Authors:
Simon Guest,
Dan Levy
Abstract:
Guralnick, Kunyavskii, Plotkin and Shalev have shown that the solvable radical of a finite group $G$ can be characterized as the set of all $x\in G$ such that $\langle x,y\rangle$ is solvable for all $y\in G$. We prove two generalizations of this result. Firstly, it is enough to check the solvability of $\langle x,y\rangle$ for every $p$-element $y\in G$ for every odd prime $p$. Secondly, if $x$ has odd order, then it is enough to check the solvability of $\langle x,y\rangle$ for every 2-element $y\in G$.
Submitted 22 February, 2013;
originally announced February 2013.
-
The Neighbor-Net Algorithm
Authors:
Dan Levy,
Lior Pachter
Abstract:
The neighbor-joining algorithm is a popular phylogenetics method for constructing trees from dissimilarity maps. The neighbor-net algorithm is an extension of the neighbor-joining algorithm and is used for constructing split networks. We begin by describing the output of neighbor-net in terms of the tessellation of $\overline{\mathcal{M}}_{0}^{n}(\mathbb{R})$ by associahedra. This highlights the fact that neighbor-net outputs a tree in addition to a circular ordering and we explain when the neighbor-net tree is the neighbor-joining tree. A key observation is that the tree constructed in existing implementations of neighbor-net is not a neighbor-joining tree. Next, we show that neighbor-net is a greedy algorithm for finding circular split systems of minimal balanced length. This leads to an interpretation of neighbor-net as a greedy algorithm for the traveling salesman problem. The algorithm is optimal for Kalmanson matrices, from which it follows that neighbor-net is consistent and has optimal radius 1/2. We also provide a statistical interpretation for the balanced length for a circular split system as the length based on weighted least squares estimates of the splits. We conclude with applications of these results and demonstrate the implications of our theorems for a recently published comparison of Papuan and Austronesian languages.
Submitted 12 May, 2008; v1 submitted 17 February, 2007;
originally announced February 2007.
-
Neighbor joining with phylogenetic diversity estimates
Authors:
Dan Levy,
Ruriko Yoshida,
Lior Pachter
Abstract:
The Neighbor-Joining algorithm is a recursive procedure for reconstructing trees that is based on a transformation of pairwise distances between leaves. We present a generalization of the neighbor-joining transformation, which uses estimates of phylogenetic diversity rather than pairwise distances in the tree. This leads to an improved neighbor-joining algorithm whose total running time is still polynomial in the number of taxa. On simulated data, the method outperforms other distance-based methods.
We have implemented neighbor-joining for subtree weights in a program called MJOIN which is freely available under the GNU Public License at http://bio.math.berkeley.edu/mjoin/.
Submitted 30 July, 2005;
originally announced August 2005.
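The "transformation of pairwise distances" that the abstract above generalizes is, in the classical neighbor-joining algorithm, the criterion $Q(i,j) = (n-2)\,d(i,j) - \sum_k d(i,k) - \sum_k d(j,k)$, with the pair minimizing $Q$ joined at each step. The sketch below implements only that classical pair-selection step on a symmetric distance matrix; the function name `nj_pair` is hypothetical, and this is not the phylogenetic-diversity variant or the MJOIN program described in the abstract.

```python
def nj_pair(d):
    """Return ((i, j), q) for the pair of taxa minimizing the classical
    neighbor-joining criterion on a symmetric distance matrix d."""
    n = len(d)
    if n < 3:
        raise ValueError("need at least three taxa")
    row_sums = [sum(row) for row in d]
    best_pair, best_q = None, float("inf")
    for i in range(n):
        for j in range(i + 1, n):
            # Q(i, j) = (n - 2) * d(i, j) - sum_k d(i, k) - sum_k d(j, k)
            q = (n - 2) * d[i][j] - row_sums[i] - row_sums[j]
            if q < best_q:  # strict inequality keeps the first minimizer
                best_pair, best_q = (i, j), q
    return best_pair, best_q
```

On the four-taxon tree metric with cherries {0, 1} and {2, 3} and unit edge lengths, the criterion selects a cherry, as expected for tree-like distances.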
-
A Third-Order Semi-Discrete Central Scheme for Conservation Laws and Convection-Diffusion Equations
Authors:
Alexander Kurganov,
Doron Levy
Abstract:
We present a new third-order, semi-discrete, central method for approximating solutions to multi-dimensional systems of hyperbolic conservation laws, convection-diffusion equations, and related problems. Our method is a high-order extension of the recently proposed second-order, semi-discrete method in [16].
The method is derived independently of the specific piecewise polynomial reconstruction which is based on the previously computed cell-averages. We demonstrate our results by focusing on the new third-order CWENO reconstruction presented in [21]. The numerical results we present show the desired accuracy, high resolution and robustness of our method.
Submitted 16 February, 2000;
originally announced February 2000.
-
Statistical Mechanics of the Periodic Camassa-Holm Equation
Authors:
Adrian Constantin,
Doron Levy
Abstract:
The paper has been withdrawn
Submitted 8 May, 2000; v1 submitted 10 January, 2000;
originally announced January 2000.
-
Optimal Prediction for Hamiltonian partial differential equations
Authors:
A. J. Chorin,
R. Kupferman,
D. Levy
Abstract:
Optimal prediction methods compensate for a lack of resolution in the numerical solution of time-dependent differential equations through the use of prior statistical information. We present a new derivation of the basic methodology, show that field-theoretical perturbation theory provides a useful device for dealing with quasi-linear problems, and provide a nonlinear example that illuminates the difference between a pseudo-spectral method and an optimal prediction method with Fourier kernels. Along the way, we explain the differences and similarities between optimal prediction, the representer method in data assimilation, and duality methods for finding weak solutions. We also discuss the conditions under which a simple implementation of the optimal prediction method can be expected to perform well.
Submitted 12 November, 1999;
originally announced November 1999.
-
Compact Central WENO Schemes for Multidimensional Conservation Laws
Authors:
D. Levy,
G. Puppo,
G. Russo
Abstract:
We present a new third-order central scheme for approximating solutions of systems of conservation laws in one and two space dimensions. In the spirit of Godunov-type schemes, our method is based on reconstructing a piecewise-polynomial interpolant from cell-averages which is then advanced exactly in time. In the reconstruction step, we introduce a new third-order reconstruction as a convex combination of interpolants based on different stencils. The heart of the matter is that one of these interpolants is taken as an arbitrary quadratic polynomial and the weights of the convex combination are set so as to obtain third-order accuracy in smooth regions. The embedded mechanism in the WENO-like schemes guarantees that in regions with discontinuities or large gradients, there is an automatic switch to a one-sided second-order reconstruction, which prevents the creation of spurious oscillations. In the one-dimensional case, our new third-order scheme is based on an extremely compact point stencil. Analogous compactness is retained in more space dimensions. The accuracy, robustness and high-resolution properties of our scheme are demonstrated in a variety of one- and two-dimensional problems.
Submitted 12 November, 1999;
originally announced November 1999.
-
Dissipative Behavior of Some Fully Non-Linear KdV-Type Equations
Authors:
D. Levy,
Y. Brenier
Abstract:
The KdV equation can be considered as a special case of the general equation $u_{t} + f(u)_{x} - δ g(u_{xx})_{x} = 0$, $δ > 0$, where $f$ is non-linear and $g$ is linear, namely $f(u)=u^2/2$ and $g(v)=v$. As the parameter $δ$ tends to 0, the dispersive behavior of the KdV equation has been thoroughly investigated. We show through numerical evidence that a completely different, dissipative behavior occurs when $g$ is non-linear, namely when $g$ is an even concave function such as $g(v)=-|v|$ or $g(v)=-v^2$. In particular, our numerical results hint that as $δ \to 0$ the solutions converge to the unique entropy solution of the formal limit equation, in total contrast with the solutions of the KdV equation.
Submitted 12 November, 1999;
originally announced November 1999.