
Re-Simulation-based Self-Supervised Learning for Pre-Training Foundation Models

Authors: Philip Harris, Michael Kagan, Jeffrey Krupa, Benedikt Maier, Nathaniel Woodward

Abstract: Self-Supervised Learning (SSL) is at the core of training modern large machine learning models, providing a scheme for learning powerful representations that can be used in a variety of downstream tasks. However, SSL strategies must be adapted to the type of training data and downstream tasks required. We propose RS3L, a novel simulation-based SSL strategy that employs a method of re-simulation to drive data augmentation for contrastive learning. By intervening in the middle of the simulation process and re-running simulation components downstream of the intervention, we generate multiple realizations of an event, thus producing a set of augmentations covering all physics-driven variations available in the simulator. Using experiments from high-energy physics, we explore how this strategy may enable the development of a foundation model; we show how RS3L pre-training enables powerful performance in downstream tasks such as discrimination of a variety of objects and uncertainty mitigation. In addition to our results, we make the RS3L dataset publicly available for further studies on how to improve SSL strategies.

Submitted 11 March, 2024; originally announced March 2024.

Comments: 24 pages, 9 figures
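
The abstract above describes using several re-simulated realizations of the same event as augmentations for contrastive learning. As a minimal sketch, assuming an InfoNCE-style loss with cosine similarity (the abstract does not state which contrastive loss is used; the names `cosine` and `info_nce` and the temperature value are hypothetical choices for illustration), the objective could be written as:

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE-style loss (an assumed choice, not confirmed by the abstract).

    anchors[i] and positives[i] are embeddings of two re-simulated
    realizations of the same event i; every positives[j] with j != i
    acts as a negative for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n


# Usage: two events, each embedded twice (one embedding per realization).
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[0.9, 0.1], [0.1, 0.9]]
loss = info_nce(anchors, positives)
```

The loss is small when the two realizations of each event are embedded close together and far from other events, which is the behavior a re-simulation-driven augmentation scheme would aim to encourage.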

arXiv:2401.13537 [pdf, other]

Masked Particle Modeling on Sets: Towards Self-Supervised High Energy Physics Foundation Models

Authors: Tobias Golling, Lukas Heinrich, Michael Kagan, Samuel Klein, Matthew Leigh, Margarita Osadchy, John Andrew Raine

Abstract: We propose masked particle modeling (MPM) as a self-supervised method for learning generic, transferable, and reusable representations on unordered sets of inputs for use in high energy physics (HEP) scientific data. This work provides a novel scheme to perform masked modeling based pre-training to learn permutation invariant functions on sets. More generally, this work provides a step towards building large foundation models for HEP that can be generically pre-trained with self-supervised learning and later fine-tuned for a variety of down-stream tasks. In MPM, particles in a set are masked and the training objective is to recover their identity, as defined by a discretized token representation of a pre-trained vector quantized variational autoencoder. We study the efficacy of the method in samples of high energy jets at collider physics experiments, including studies on the impact of discretization, permutation invariance, and ordering. We also study the fine-tuning capability of the model, showing that it can be adapted to tasks such as supervised and weakly supervised jet classification, and that the model can transfer efficiently with small fine-tuning data sets to new classes and new data domains.

Submitted 11 July, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

arXiv:2310.12804 [pdf, other]

Differentiable Vertex Fitting for Jet Flavour Tagging

Authors: Rachel E. C. Smith, Inês Ochoa, Rúben Inácio, Jonathan Shoemaker, Michael Kagan

Abstract: We propose a differentiable vertex fitting algorithm that can be used for secondary vertex fitting, and that can be seamlessly integrated into neural networks for jet flavour tagging. Vertex fitting is formulated as an optimization problem where gradients of the optimized solution vertex are defined through implicit differentiation and can be passed to upstream or downstream neural network components for network training. More broadly, this is an application of differentiable programming to integrate physics knowledge into neural network models in high energy physics. We demonstrate how differentiable secondary vertex fitting can be integrated into larger transformer-based models for flavour tagging and improve heavy flavour jet classification.

Submitted 19 October, 2023; originally announced October 2023.

Comments: 11 pages
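
The abstract above defines gradients of a fitted vertex through implicit differentiation of an optimization problem. As a toy sketch only (a one-dimensional weighted least-squares fit, not the paper's algorithm; `fit_vertex` and `dvertex_dweights` are hypothetical names), the implicit function theorem gives the derivative of the optimum with respect to the track weights without differentiating through the solver:

```python
def fit_vertex(z, w):
    """Minimise f(v) = sum_i w_i * (v - z_i)**2 over v.

    One Newton step from v = 0 is exact because f is quadratic in v.
    """
    v = 0.0
    grad = sum(2.0 * wi * (v - zi) for zi, wi in zip(z, w))
    hess = 2.0 * sum(w)
    return v - grad / hess


def dvertex_dweights(z, w):
    """Gradient of the fitted vertex with respect to each weight.

    Implicit function theorem at the optimum v*, where df/dv = 0:
        dv*/dw_k = -(d2f/dv2)^(-1) * d2f/(dv dw_k)
    """
    v_star = fit_vertex(z, w)
    hess = 2.0 * sum(w)                        # d2f/dv2
    cross = [2.0 * (v_star - zi) for zi in z]  # d2f/(dv dw_k)
    return [-c / hess for c in cross]


# Usage: three track positions along one axis with per-track weights.
z = [1.0, 2.0, 4.0]
w = [1.0, 2.0, 0.5]
vertex = fit_vertex(z, w)
gradient = dvertex_dweights(z, w)
```

In a network, such gradients would let a loss computed downstream of the fit update the component that produced the weights; the one-dimensional setup here is an assumption made to keep the example short.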

arXiv:2308.16680 [pdf, other]

Branches of a Tree: Taking Derivatives of Programs with Discrete and Branching Randomness in High Energy Physics

Authors: Michael Kagan, Lukas Heinrich

Abstract: We propose to apply several gradient estimation techniques to enable the differentiation of programs with discrete randomness in High Energy Physics. Such programs are common in High Energy Physics due to the presence of branching processes and clustering-based analysis. Thus differentiating such programs can open the way for gradient based optimization in the context of detector design optimization, simulator tuning, or data analysis and reconstruction optimization. We discuss several possible gradient estimation strategies, including the recent Stochastic AD method, and compare them in simplified detector design experiments. In doing so we develop, to the best of our knowledge, the first fully differentiable branching program.

Submitted 31 August, 2023; originally announced August 2023.

Comments: 8 pages
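
The abstract above concerns gradient estimation for programs with discrete randomness. As a minimal sketch of one standard technique, the score-function (REINFORCE) estimator, applied to a single Bernoulli branch (the abstract does not say which estimators beyond Stochastic AD are compared, so this choice is an assumption; `score_function_gradient` is a hypothetical name):

```python
import random


def score_function_gradient(f, p, n_samples=200_000, seed=0):
    """Monte Carlo estimate of d/dp E[f(X)] for X ~ Bernoulli(p).

    Uses the identity d/dp E[f(X)] = E[f(X) * d/dp log P(X; p)],
    which needs only samples of X and never differentiates f.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = 1 if rng.random() < p else 0
        # d/dp log P(x; p): 1/p when x == 1, -1/(1 - p) when x == 0.
        score = x / p - (1 - x) / (1.0 - p)
        total += f(x) * score
    return total / n_samples


def branch_cost(x):
    """Toy cost of taking (x == 1) or not taking (x == 0) a branch."""
    return 3.0 if x == 1 else 1.0


# The exact gradient is branch_cost(1) - branch_cost(0); the estimate is noisy.
estimate = score_function_gradient(branch_cost, p=0.3)
```

The estimator is unbiased but can have high variance, which is one reason alternative estimators are of interest for branching programs.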

arXiv:2208.03284 [pdf, ps, other]

doi 10.2172/1886020

Interpretable Uncertainty Quantification in AI for HEP

Authors: Thomas Y. Chen, Biprateep Dey, Aishik Ghosh, Michael Kagan, Brian Nord, Nesar Ramachandra

Abstract: Estimating uncertainty is at the core of performing scientific measurements in HEP: a measurement is not useful without an estimate of its uncertainty. The goal of uncertainty quantification (UQ) is inextricably linked to the question, "how do we physically and statistically interpret these uncertainties?" The answer to this question depends not only on the computational task we aim to undertake, but also on the methods we use for that task. For artificial intelligence (AI) applications in HEP, there are several areas where interpretable methods for UQ are essential, including inference, simulation, and control/decision-making. There exist some methods for each of these areas, but they have not yet been demonstrated to be as trustworthy as more traditional approaches currently employed in physics (e.g., non-AI frequentist and Bayesian methods). Shedding light on the questions above requires additional understanding of the interplay of AI systems and uncertainty quantification. We briefly discuss the existing methods in each area and relate them to tasks across HEP. We then discuss recommendations for avenues to pursue to develop the necessary techniques for reliable widespread usage of AI with UQ over the next decade.

Submitted 6 September, 2022; v1 submitted 5 August, 2022; originally announced August 2022.

Comments: Submitted to the Proceedings of the US Community Study on the Future of Particle Physics (Snowmass 2021)

Report number: FERMILAB-FN-1179-SCD; arXiv:2208.03284 oai:inspirehep.net:2132723

arXiv:2207.00559 [pdf, other]

Ultra-low latency recurrent neural network inference on FPGAs for physics applications with hls4ml

Authors: Elham E Khoda, Dylan Rankin, Rafael Teixeira de Lima, Philip Harris, Scott Hauck, Shih-Chieh Hsu, Michael Kagan, Vladimir Loncar, Chaitanya Paikara, Richa Rao, Sioni Summers, Caterina Vernieri, Aaron Wang

Abstract: Recurrent neural networks have been shown to be effective architectures for many tasks in high energy physics, and thus have been widely adopted. Their use in low-latency environments has, however, been limited as a result of the difficulties of implementing recurrent architectures on field-programmable gate arrays (FPGAs). In this paper we present an implementation of two types of recurrent neural network layers -- long short-term memory and gated recurrent unit -- within the hls4ml framework. We demonstrate that our implementation is capable of producing effective designs for both small and large models, and can be customized to meet specific design requirements for inference latencies and FPGA resources. We show the performance and synthesized designs for multiple neural networks, many of which are trained specifically for jet identification tasks at the CERN Large Hadron Collider.

Submitted 1 July, 2022; originally announced July 2022.

Comments: 12 pages, 6 figures, 5 tables

arXiv:2205.11480 [pdf, other]

doi 10.1088/1748-0221/17/08/P08021

Novel Light Field Imaging Device with Enhanced Light Collection for Cold Atom Clouds

Authors: Sanha Cheong, Josef C. Frisch, Sean Gasiorowski, Jason M. Hogan, Michael Kagan, Murtaza Safdari, Ariel Schwartzman, Maxime Vandegar

Abstract: We present a light field imaging system that captures multiple views of an object with a single shot. The system is designed to maximize the total light collection by accepting a larger solid angle of light than a conventional lens with equivalent depth of field. This is achieved by populating a plane of virtual objects using mirrors and fully utilizing the available field of view and depth of field. Simulation results demonstrate that this design is capable of single-shot tomography of objects of size $\mathcal{O}$(1 mm$^3$), reconstructing the 3-dimensional (3D) distribution and features not accessible from any single view angle in isolation. In particular, for atom clouds used in atom interferometry experiments, the system can reconstruct 3D fringe patterns with size $\mathcal{O}$(100 $\mu$m). We also demonstrate this system with a 3D-printed prototype. The prototype is used to take images of $\mathcal{O}$(1 mm$^{3}$) sized objects, and 3D reconstruction algorithms running on a single-shot image successfully reconstruct $\mathcal{O}$(100 $\mu$m) internal features. The prototype also shows that the system can be built with 3D printing technology and hence can be deployed quickly and cost-effectively in experiments with needs for enhanced light collection or 3D reconstruction. Imaging of cold atom clouds in atom interferometry is a key application of this new type of imaging device where enhanced light collection, high depth of field, and 3D tomographic reconstruction can provide new handles to characterize the atom clouds.

Submitted 23 May, 2022; originally announced May 2022.

Journal ref: 2022 JINST 17 P08021

arXiv:2203.12852 [pdf, other]

Graph Neural Networks in Particle Physics: Implementations, Innovations, and Challenges

Authors: Savannah Thais, Paolo Calafiura, Grigorios Chachamis, Gage DeZoort, Javier Duarte, Sanmay Ganguly, Michael Kagan, Daniel Murnane, Mark S. Neubauer, Kazuhiro Terao

Abstract: Many physical systems can be best understood as sets of discrete data with associated relationships. Where previously these sets of data have been formulated as series or image data to match the available machine learning architectures, with the advent of graph neural networks (GNNs), these systems can be learned natively as graphs. This allows a wide variety of high- and low-level physical features to be attached to measurements and, by the same token, a wide variety of HEP tasks to be accomplished by the same GNN architectures. GNNs have found powerful use-cases in reconstruction, tagging, generation and end-to-end analysis. With the wide-spread adoption of GNNs in industry, the HEP community is well-placed to benefit from rapid improvements in GNN latency and memory usage. However, industry use-cases are not perfectly aligned with HEP and much work needs to be done to best match unique GNN capabilities to unique HEP obstacles. We present here a range of these capabilities, predictions of which are currently being well-adopted in HEP communities, and which are still immature. We hope to capture the landscape of graph techniques in machine learning as well as point out the most significant gaps that are inhibiting potentially large leaps in research.

Submitted 25 March, 2022; v1 submitted 23 March, 2022; originally announced March 2022.

Comments: contribution to Snowmass 2021

arXiv:2203.08806 [pdf, other]

New directions for surrogate models and differentiable programming for High Energy Physics detector simulation

Authors: Andreas Adelmann, Walter Hopkins, Evangelos Kourlitis, Michael Kagan, Gregor Kasieczka, Claudius Krause, David Shih, Vinicius Mikuni, Benjamin Nachman, Kevin Pedro, Daniel Winklehner

Abstract: The computational cost for high energy physics detector simulation in future experimental facilities is going to exceed the current available resources. To overcome this challenge, new ideas on surrogate models using machine learning methods are being explored to replace computationally expensive components. Additionally, differentiable programming has been proposed as a complementary approach, providing controllable and scalable simulation routines. In this document, new and ongoing efforts for surrogate models and differentiable programming applied to detector simulation are discussed in the context of the 2021 Particle Physics Community Planning Exercise ('Snowmass').

Submitted 15 March, 2022; originally announced March 2022.

Comments: contribution to Snowmass 2021

Report number: FERMILAB-CONF-22-199-SCD

arXiv:2203.00057 [pdf, other]

doi 10.1088/1742-6596/2438/1/012137

Differentiable Matrix Elements with MadJax

Authors: Lukas Heinrich, Michael Kagan

Abstract: MadJax is a tool for generating and evaluating differentiable matrix elements of high energy scattering processes. As such, it is a step towards a differentiable programming paradigm in high energy physics that facilitates the incorporation of high energy physics domain knowledge, encoded in simulation software, into gradient based learning and optimization pipelines. MadJax comprises two components: (a) a plugin to the general purpose matrix element generator MadGraph that integrates matrix element and phase space sampling code with the JAX differentiable programming framework, and (b) a standalone wrapping API for accessing the matrix element code and its gradients, which are computed with automatic differentiation. The MadJax implementation and example applications of simulation based inference and normalizing flow based matrix element modeling, with capabilities enabled uniquely with differentiable matrix elements, are presented.

Submitted 28 February, 2022; originally announced March 2022.

Comments: 6 pages, Proceedings of the 20th International Workshop on Advanced Computing and Analysis Techniques in Physics Research (ACAT 2021)

arXiv:2107.02958 [pdf, other]

End-to-End Simultaneous Learning of Single-particle Orientation and 3D Map Reconstruction from Cryo-electron Microscopy Data

Authors: Youssef S. G. Nashed, Frederic Poitevin, Harshit Gupta, Geoffrey Woollard, Michael Kagan, Chuck Yoon, Daniel Ratner

Abstract: Cryogenic electron microscopy (cryo-EM) provides images from different copies of the same biomolecule in arbitrary orientations. Here, we present an end-to-end unsupervised approach that learns individual particle orientations from cryo-EM data while reconstructing the average 3D map of the biomolecule, starting from a random initialization. The approach relies on an auto-encoder architecture where the latent space is explicitly interpreted as orientations used by the decoder to form an image according to the linear projection model. We evaluate our method on simulated data and show that it is able to reconstruct 3D particle maps from noisy- and CTF-corrupted 2D projection images of unknown particle orientations.

Submitted 6 July, 2021; originally announced July 2021.

Comments: 13 pages, 4 figures

arXiv:2012.09719 [pdf, other]

Image-Based Jet Analysis

Authors: Michael Kagan

Abstract: Image-based jet analysis is built upon the jet image representation of jets that enables a direct connection between high energy physics and the fields of computer vision and deep learning. Through this connection, a wide array of new jet analysis techniques have emerged. In this text, we survey jet image based classification models, built primarily on the use of convolutional neural networks, examine the methods to understand what these models have learned and what is their sensitivity to uncertainties, and review the recent successes in moving these models from phenomenological studies to real world application on experiments at the LHC. Beyond jet classification, several other applications of jet image based techniques, including energy estimation, pileup noise reduction, data generation, and anomaly detection, are discussed.

Submitted 18 December, 2020; v1 submitted 17 December, 2020; originally announced December 2020.

Comments: To appear in Artificial Intelligence for High Energy Physics, World Scientific Publishing

arXiv:2011.05836 [pdf, other]

Neural Empirical Bayes: Source Distribution Estimation and its Applications to Simulation-Based Inference

Authors: Maxime Vandegar, Michael Kagan, Antoine Wehenkel, Gilles Louppe

Abstract: We revisit empirical Bayes in the absence of a tractable likelihood function, as is typical in scientific domains relying on computer simulations. We investigate how the empirical Bayesian can make use of neural density estimators first to use all noise-corrupted observations to estimate a prior or source distribution over uncorrupted samples, and then to perform single-observation posterior inference using the fitted source distribution. We propose an approach based on the direct maximization of the log-marginal likelihood of the observations, examining both biased and de-biased estimators, and comparing to variational approaches. We find that, up to symmetries, a neural empirical Bayes approach recovers ground truth source distributions. With the learned source distribution in hand, we show the applicability to likelihood-free inference and examine the quality of the resulting posterior estimates. Finally, we demonstrate the applicability of Neural Empirical Bayes on an inverse problem from collider physics.

Submitted 26 February, 2021; v1 submitted 11 November, 2020; originally announced November 2020.

Comments: Camera-ready version presented at AISTATS 2021

arXiv:2002.04632 [pdf, other]

Black-Box Optimization with Local Generative Surrogates

Authors: Sergey Shirobokov, Vladislav Belavin, Michael Kagan, Andrey Ustyuzhanin, Atılım Güneş Baydin

Abstract: We propose a novel method for gradient-based optimization of black-box simulators using differentiable local surrogate models. In fields such as physics and engineering, many processes are modeled with non-differentiable simulators with intractable likelihoods. Optimization of these forward models is particularly challenging, especially when the simulator is stochastic. To address such cases, we introduce the use of deep generative models to iteratively approximate the simulator in local neighborhoods of the parameter space. We demonstrate that these local surrogates can be used to approximate the gradient of the simulator, and thus enable gradient-based optimization of simulator parameters. In cases where the dependence of the simulator on the parameter space is constrained to a low dimensional submanifold, we observe that our method attains minima faster than baseline methods, including Bayesian optimization, numerical optimization, and approaches using score function gradient estimators.

Submitted 15 June, 2020; v1 submitted 11 February, 2020; originally announced February 2020.

Journal ref: In Advances in Neural Information Processing Systems 34 (NeurIPS), 2020

arXiv:1903.04476 [pdf, other]

Continual Learning via Neural Pruning

Authors: Siavash Golkar, Michael Kagan, Kyunghyun Cho

Abstract: We introduce Continual Learning via Neural Pruning (CLNP), a new method aimed at lifelong learning in fixed capacity models based on neuronal model sparsification. In this method, subsequent tasks are trained using the inactive neurons and filters of the sparsified network and cause zero deterioration to the performance of previous tasks. In order to deal with the possible compromise between model sparsity and performance, we formalize and incorporate the concept of graceful forgetting: the idea that it is preferable to suffer a small amount of forgetting in a controlled manner if it helps regain network capacity and prevents uncontrolled loss of performance during the training of future tasks. CLNP also provides simple continual learning diagnostic tools in terms of the number of free neurons left for the training of future tasks as well as the number of neurons that are being reused. In particular, we see in experiments that CLNP verifies and automatically takes advantage of the fact that the features of earlier layers are more transferable. We show empirically that CLNP leads to significantly improved results over current weight elasticity based methods.

Submitted 11 March, 2019; originally announced March 2019.

Comments: 12 pages, 5 figures, 3 tables

arXiv:1807.02876 [pdf, other]

Machine Learning in High Energy Physics Community White Paper

Authors: Kim Albertsson, Piero Altoe, Dustin Anderson, John Anderson, Michael Andrews, Juan Pedro Araque Espinosa, Adam Aurisano, Laurent Basara, Adrian Bevan, Wahid Bhimji, Daniele Bonacorsi, Bjorn Burkle, Paolo Calafiura, Mario Campanelli, Louis Capps, Federico Carminati, Stefano Carrazza, Yi-fan Chen, Taylor Childers, Yann Coadou, Elias Coniavitis, Kyle Cranmer, Claire David, Douglas Davis, Andrea De Simone , et al. (103 additional authors not shown)

Abstract: Machine learning has been applied to several problems in particle physics research, beginning with applications to high-level physics analysis in the 1990s and 2000s, followed by an explosion of applications in particle and event identification and reconstruction in the 2010s. In this document we discuss promising future research and development areas for machine learning in particle physics. We detail a roadmap for their implementation, software and hardware resource requirements, collaborative initiatives with the data science community, academia and industry, and training the particle physics community in data science. The main objective of the document is to connect and motivate these areas of research and development with the physics drivers of the High-Luminosity Large Hadron Collider and future neutrino experiments and identify the resource needs for their implementation. Additionally we identify areas where collaboration with external communities will be of great benefit.

Submitted 16 May, 2019; v1 submitted 8 July, 2018; originally announced July 2018.

Comments: Editors: Sergei Gleyzer, Paul Seyfert and Steven Schramm

arXiv:1611.01046 [pdf, other]

Learning to Pivot with Adversarial Networks

Authors: Gilles Louppe, Michael Kagan, Kyle Cranmer

Abstract: Several techniques for domain adaptation have been proposed to account for differences in the distribution of the data used for training and testing. The majority of this work focuses on a binary domain label. Similar problems occur in a scientific context where there may be a continuous family of plausible data generation processes associated to the presence of systematic uncertainties. Robust inference is possible if it is based on a pivot -- a quantity whose distribution does not depend on the unknown values of the nuisance parameters that parametrize this family of data generation processes. In this work, we introduce and derive theoretical results for a training procedure based on adversarial networks for enforcing the pivotal property (or, equivalently, fairness with respect to continuous attributes) on a predictive model. The method includes a hyperparameter to control the trade-off between accuracy and robustness. We demonstrate the effectiveness of this approach with a toy example and examples from particle physics.

Submitted 1 June, 2017; v1 submitted 3 November, 2016; originally announced November 2016.

Comments: v1: Original submission. v2: Fixed references. v3: version submitted to NIPS'2017. Code available at https://github.com/glouppe/paper-learning-to-pivot

Journal ref: Advances in Neural Information Processing Systems 30, pages 981-990, 2017
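
The abstract above describes adversarial training with a hyperparameter that trades accuracy against robustness. One common way to write such an objective is sketched below. The notation is an assumption, since the abstract gives no formula: $\theta_f$ and $\theta_r$ stand for the parameters of the predictive model and the adversary, $\mathcal{L}_f$ and $\mathcal{L}_r$ for their respective losses, and $\lambda \ge 0$ for the trade-off hyperparameter.

```latex
\hat{\theta}_f, \hat{\theta}_r
  = \arg\min_{\theta_f} \max_{\theta_r} E_\lambda(\theta_f, \theta_r),
\qquad
E_\lambda(\theta_f, \theta_r)
  = \mathcal{L}_f(\theta_f) - \lambda \, \mathcal{L}_r(\theta_f, \theta_r)
```

Under this reading, the adversary tries to infer the nuisance parameter from the model output, and a larger $\lambda$ penalizes the predictive model more heavily whenever the adversary succeeds, favoring robustness over accuracy.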

Showing 1–17 of 17 results for author: Kagan, M