
Showing 1–10 of 10 results for author: Gesmundo, A

  1. arXiv:2009.04381  [pdf, other]

    cs.LG stat.ML

    Routing Networks with Co-training for Continual Learning

    Authors: Mark Collier, Efi Kokiopoulou, Andrea Gesmundo, Jesse Berent

    Abstract: The core challenge with continual learning is catastrophic forgetting, the phenomenon that when neural networks are trained on a sequence of tasks they rapidly forget previously learned tasks. It has been observed that catastrophic forgetting is most severe when tasks are dissimilar to each other. We propose the use of sparse routing networks for continual learning. For each input, these network a…

    Submitted 9 September, 2020; originally announced September 2020.

    Comments: Presented at ICML Workshop on Continual Learning 2020

  2. arXiv:1911.11481  [pdf, other]

    cs.LG stat.ML

    Ranking architectures using meta-learning

    Authors: Alina Dubatovka, Efi Kokiopoulou, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

    Abstract: Neural architecture search has recently attracted lots of research efforts as it promises to automate the manual design of neural networks. However, it requires a large amount of computing resources and in order to alleviate this, a performance prediction network has been recently proposed that enables efficient architecture search by forecasting the performance of candidate architectures, instead…

    Submitted 26 November, 2019; originally announced November 2019.

    Comments: NeurIPS 2019 Meta-Learning workshop

  3. arXiv:1910.04915  [pdf, other]

    cs.LG stat.ML

    Flexible Multi-task Networks by Learning Parameter Allocation

    Authors: Krzysztof Maziarz, Efi Kokiopoulou, Andrea Gesmundo, Luciano Sbaiz, Gabor Bartok, Jesse Berent

    Abstract: This paper proposes a novel learning method for multi-task applications. Multi-task neural networks can learn to transfer knowledge across different tasks by using parameter sharing. However, sharing parameters between unrelated tasks can hurt performance. To address this issue, we propose a framework to learn fine-grained patterns of parameter sharing. Assuming that the network is composed of sev…

    Submitted 18 July, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

  4. arXiv:1906.08102  [pdf, other]

    cs.LG stat.ML

    Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents

    Authors: Zalán Borsos, Andrey Khorlin, Andrea Gesmundo

    Abstract: Recent advances in Neural Architecture Search (NAS) have produced state-of-the-art architectures on several tasks. NAS shifts the efforts of human experts from developing novel architectures directly to designing architecture search spaces and methods to explore them efficiently. The search space definition captures prior knowledge about the properties of the architectures and it is crucial for th…

    Submitted 19 June, 2019; originally announced June 2019.

    Comments: 6th ICML Workshop on Automated Machine Learning

  5. arXiv:1902.05781  [pdf, other]

    cs.LG stat.ML

    Fast Task-Aware Architecture Inference

    Authors: Efi Kokiopoulou, Anja Hauth, Luciano Sbaiz, Andrea Gesmundo, Gabor Bartok, Jesse Berent

    Abstract: Neural architecture search has been shown to hold great promise towards the automation of deep learning. However in spite of its potential, neural architecture search remains quite costly. To this point, we propose a novel gradient-based framework for efficient architecture search by sharing information across several tasks. We start by training many model architectures on several related (trainin…

    Submitted 15 February, 2019; originally announced February 2019.

  6. arXiv:1902.00751  [pdf, other]

    cs.LG cs.CL stat.ML

    Parameter-Efficient Transfer Learning for NLP

    Authors: Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly

    Abstract: Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can…

    Submitted 13 June, 2019; v1 submitted 2 February, 2019; originally announced February 2019.

  7. arXiv:1812.10666  [pdf, other]

    cs.LG stat.ML

    Neural Architecture Search Over a Graph Search Space

    Authors: Stanisław Jastrzębski, Quentin de Laroussilhe, Mingxing Tan, Xiao Ma, Neil Houlsby, Andrea Gesmundo

    Abstract: Neural Architecture Search (NAS) enabled the discovery of state-of-the-art architectures in many domains. However, the success of NAS depends on the definition of the search space. Current search spaces are defined as a static sequence of decisions and a set of available actions for each decision. Each possible sequence of actions defines an architecture. We propose a more expressive class of sear…

    Submitted 31 July, 2019; v1 submitted 27 December, 2018; originally announced December 2018.

  8. arXiv:1811.09828  [pdf, other]

    cs.LG cs.NE stat.ML

    Evolutionary-Neural Hybrid Agents for Architecture Search

    Authors: Krzysztof Maziarz, Mingxing Tan, Andrey Khorlin, Marin Georgiev, Andrea Gesmundo

    Abstract: Neural Architecture Search has shown potential to automate the design of neural networks. Deep Reinforcement Learning based agents can learn complex architectural patterns, as well as explore a vast and compositional search space. On the other hand, evolutionary algorithms offer higher sample efficiency, which is critical for such a resource intensive application. In order to capture the best of b…

    Submitted 15 February, 2020; v1 submitted 24 November, 2018; originally announced November 2018.

  9. arXiv:1803.02780  [pdf, other]

    cs.LG stat.ML

    Transfer Learning with Neural AutoML

    Authors: Catherine Wong, Neil Houlsby, Yifeng Lu, Andrea Gesmundo

    Abstract: We reduce the computational cost of Neural AutoML with transfer learning. AutoML relieves human effort by automating the design of ML algorithms. Neural AutoML has become popular for the design of deep learning architectures, however, this method has a high computation cost. To address this we propose Transfer Neural AutoML that uses knowledge from prior tasks to speed up network design. We extend…

    Submitted 28 January, 2019; v1 submitted 7 March, 2018; originally announced March 2018.

  10. arXiv:1710.10776  [pdf, other]

    cs.AI cs.LG stat.ML

    Transfer Learning to Learn with Multitask Neural Model Search

    Authors: Catherine Wong, Andrea Gesmundo

    Abstract: Deep learning models require extensive architecture design exploration and hyperparameter optimization to perform well on a given task. The exploration of the model design space is often made by a human expert, and optimized using a combination of grid search and search heuristics over a large space of possible choices. Neural Architecture Search (NAS) is a Reinforcement Learning approach that has…

    Submitted 30 October, 2017; originally announced October 2017.