Romain Laroche
2020 – today
- 2024
- [c59]Harry Zhao, Safa Alver, Harm van Seijen, Romain Laroche, Doina Precup, Yoshua Bengio:
Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning. ICLR 2024
- [c58]Jikun Kang, Romain Laroche, Xingdi Yuan, Adam Trischler, Xue Liu, Jie Fu:
Think Before You Act: Decision Transformers with Working Memory. ICML 2024
- 2023
- [j4]Hiba Dakdouk, Raphaël Féraud, Nadège Varsier, Patrick Maillé, Romain Laroche:
Massive multi-player multi-armed bandits for IoT networks: An application on LoRa networks. Ad Hoc Networks 151: 103283 (2023)
- [j3]Yuchen Lu, Zhen Liu, Aristide Baratin, Romain Laroche, Aaron C. Courville, Alessandro Sordoni:
Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods. Trans. Mach. Learn. Res. 2023 (2023)
- [c57]Nathaniel Weir, Xingdi Yuan, Marc-Alexandre Côté, Matthew J. Hausknecht, Romain Laroche, Ida Momennejad, Harm van Seijen, Benjamin Van Durme:
One-Shot Learning from a Demonstration with Hierarchical Latent Language. AAMAS 2023: 2388-2390
- [c56]Zhang-Wei Hong, Pulkit Agrawal, Remi Tachet des Combes, Romain Laroche:
Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting. ICLR 2023
- [c55]Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Remi Tachet des Combes, Romain Laroche:
Behavior Prior Representation learning for Offline Reinforcement Learning. ICLR 2023
- [c54]Romain Laroche, Remi Tachet des Combes:
On the Occupancy Measure of Non-Markovian Policies in Continuous MDPs. ICML 2023: 18548-18562
- [c53]Shangtong Zhang, Remi Tachet des Combes, Romain Laroche:
On the Convergence of SARSA with Linear Function Approximation. ICML 2023: 41613-41646
- [c52]Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal:
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets. NeurIPS 2023
- [c51]Hongyu Zang, Xin Li, Leiji Zhang, Yang Liu, Baigui Sun, Riashat Islam, Remi Tachet des Combes, Romain Laroche:
Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning. NeurIPS 2023
- [i35]Jikun Kang, Romain Laroche, Xindi Yuan, Adam Trischler, Xue Liu, Jie Fu:
Think Before You Act: Decision Transformers with Internal Working Memory. CoRR abs/2305.16338 (2023)
- [i34]Zhang-Wei Hong, Pulkit Agrawal, Rémi Tachet des Combes, Romain Laroche:
Harnessing Mixed Offline Reinforcement Learning Datasets via Trajectory Weighting. CoRR abs/2306.13085 (2023)
- [i33]Mingde Zhao, Safa Alver, Harm van Seijen, Romain Laroche, Doina Precup, Yoshua Bengio:
Combining Spatial and Temporal Abstraction in Planning for Better Generalization. CoRR abs/2310.00229 (2023)
- [i32]Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal:
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets. CoRR abs/2310.04413 (2023)
- [i31]Hongyu Zang, Xin Li, Leiji Zhang, Yang Liu, Baigui Sun, Riashat Islam, Remi Tachet des Combes, Romain Laroche:
Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning. CoRR abs/2310.17139 (2023)
- 2022
- [j2]Shangtong Zhang, Remi Tachet des Combes, Romain Laroche:
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch. J. Mach. Learn. Res. 23: 343:1-343:91 (2022)
- [c50]Romain Laroche, Remi Tachet des Combes:
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms. AISTATS 2022: 5658-5688
- [c49]Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes:
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. AAMAS 2022: 1491-1499
- [c48]David Brandfonbrener, Alberto Bietti, Jacob Buckman, Romain Laroche, Joan Bruna:
When does return-conditioned supervised learning work for offline reinforcement learning? NeurIPS 2022
- [c47]Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes:
Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning. NeurIPS 2022
- [i30]Shangtong Zhang, Remi Tachet des Combes, Romain Laroche:
On the Chattering of SARSA with Linear Function Approximation. CoRR abs/2202.06828 (2022)
- [i29]Romain Laroche, Remi Tachet des Combes:
Beyond the Policy Gradient Theorem for Efficient Policy Updates in Actor-Critic Algorithms. CoRR abs/2202.07496 (2022)
- [i28]Nathaniel Weir, Xingdi Yuan, Marc-Alexandre Côté, Matthew J. Hausknecht, Romain Laroche, Ida Momennejad, Harm van Seijen, Benjamin Van Durme:
One-Shot Learning from a Demonstration with Hierarchical Latent Language. CoRR abs/2203.04806 (2022)
- [i27]Romain Laroche, Remi Tachet des Combes, Jacob Buckman:
Non-Markovian policies occupancy measures. CoRR abs/2205.13950 (2022)
- [i26]David Brandfonbrener, Alberto Bietti, Jacob Buckman, Romain Laroche, Joan Bruna:
When does return-conditioned supervised learning work for offline reinforcement learning? CoRR abs/2206.01079 (2022)
- [i25]David Brandfonbrener, Remi Tachet des Combes, Romain Laroche:
Incorporating Explicit Uncertainty Estimates into Deep Offline Reinforcement Learning. CoRR abs/2206.01085 (2022)
- [i24]Yuchen Lu, Zhen Liu, Aristide Baratin, Romain Laroche, Aaron C. Courville, Alessandro Sordoni:
Expressiveness and Learnability: A Unifying View for Evaluating Self-Supervised Learning. CoRR abs/2206.01251 (2022)
- [i23]Yoann Lemesle, Tristan Karch, Romain Laroche, Clément Moulin-Frier, Pierre-Yves Oudeyer:
Emergence of Shared Sensory-motor Graphical Language from Visual Input. CoRR abs/2210.06468 (2022)
- [i22]Riashat Islam, Hongyu Zang, Anirudh Goyal, Alex Lamb, Kenji Kawaguchi, Xin Li, Romain Laroche, Yoshua Bengio, Remi Tachet des Combes:
Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning. CoRR abs/2211.00247 (2022)
- [i21]Hongyu Zang, Xin Li, Jie Yu, Chen Liu, Riashat Islam, Remi Tachet des Combes, Romain Laroche:
Behavior Prior Representation learning for Offline Reinforcement Learning. CoRR abs/2211.00863 (2022)
- 2021
- [c46]Eva Portelance, Michael C. Frank, Dan Jurafsky, Alessandro Sordoni, Romain Laroche:
The Emergence of the Shape Bias Results from Communicative Efficiency. CoNLL 2021: 607-623
- [c45]Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche:
Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs. NeurIPS 2021: 2004-2017
- [c44]Romain Laroche, Remi Tachet des Combes:
Dr Jekyll & Mr Hyde: the strange case of off-policy policy updates. NeurIPS 2021: 24442-24454
- [i20]Harsh Satija, Philip S. Thomas, Joelle Pineau, Romain Laroche:
Multi-Objective SPIBB: Seldonian Offline Policy Improvement with Safety Constraints in Finite MDPs. CoRR abs/2106.00099 (2021)
- [i19]Eva Portelance, Michael C. Frank, Dan Jurafsky, Alessandro Sordoni, Romain Laroche:
The Emergence of the Shape Bias Results from Communicative Efficiency. CoRR abs/2109.06232 (2021)
- [i18]Romain Laroche, Remi Tachet des Combes:
Dr Jekyll and Mr Hyde: the Strange Case of Off-Policy Policy Updates. CoRR abs/2109.14727 (2021)
- [i17]Romain Laroche, Othmane Safsafi, Raphaël Féraud, Nicolas Broutin:
Batched Bandits with Crowd Externalities. CoRR abs/2109.14733 (2021)
- [i16]Shangtong Zhang, Remi Tachet des Combes, Romain Laroche:
Global Optimality and Finite Sample Analysis of Softmax Off-Policy Actor Critic under State Distribution Mismatch. CoRR abs/2111.02997 (2021)
- 2020
- [c43]Thiago D. Simão, Romain Laroche, Rémi Tachet des Combes:
Safe Policy Improvement with an Estimated Baseline Policy. AAMAS 2020: 1269-1277
- [c42]Dmitrii Krylov, Remi Tachet des Combes, Romain Laroche, Michael Rosenblum, Dmitry V. Dylov:
Reinforcement Learning Framework for Deep Brain Stimulation Study. IJCAI 2020: 2847-2854
- [c41]Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikulas Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, William L. Hamilton:
Learning Dynamic Belief Graphs to Generalize on Text-Based Games. NeurIPS 2020
- [i15]Ashutosh Adhikari, Xingdi Yuan, Marc-Alexandre Côté, Mikulas Zelinka, Marc-Antoine Rondeau, Romain Laroche, Pascal Poupart, Jian Tang, Adam Trischler, William L. Hamilton:
Learning Dynamic Knowledge Graphs to Generalize on Text-Based Games. CoRR abs/2002.09127 (2020)
- [i14]Dmitrii Krylov, Rémi Tachet des Combes, Romain Laroche, Michael Rosenblum, Dmitry V. Dylov:
Reinforcement Learning Framework for Deep Brain Stimulation Study. CoRR abs/2002.10948 (2020)
- [i13]Shangtong Zhang, Romain Laroche, Harm van Seijen, Shimon Whiteson, Remi Tachet des Combes:
A Deeper Look at Discounting Mismatch in Actor-Critic Algorithms. CoRR abs/2010.01069 (2020)
2010 – 2019
- 2019
- [c40]Raphaël Féraud, Réda Alami, Romain Laroche:
Decentralized Exploration in Multi-Armed Bandits. ICML 2019: 1901-1909
- [c39]Romain Laroche, Paul Trichelair, Remi Tachet des Combes:
Safe Policy Improvement with Baseline Bootstrapping. ICML 2019: 3652-3661
- [c38]Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin:
Budgeted Reinforcement Learning in Continuous State Space. NeurIPS 2019: 9295-9305
- [c37]Kimia Nadjahi, Romain Laroche, Rémi Tachet des Combes:
Safe Policy Improvement with Soft Baseline Bootstrapping. ECML/PKDD (3) 2019: 53-68
- [i12]Nicolas Carrara, Edouard Leurent, Romain Laroche, Tanguy Urvoy, Odalric-Ambrym Maillard, Olivier Pietquin:
Scaling up budgeted reinforcement learning. CoRR abs/1903.01004 (2019)
- [i11]Kimia Nadjahi, Romain Laroche, Rémi Tachet des Combes:
Safe Policy Improvement with Soft Baseline Bootstrapping. CoRR abs/1907.05079 (2019)
- [i10]Thiago D. Simão, Romain Laroche, Rémi Tachet des Combes:
Safe Policy Improvement with an Estimated Baseline Policy. CoRR abs/1909.05236 (2019)
- [i9]Mikulas Zelinka, Xingdi Yuan, Marc-Alexandre Côté, Romain Laroche, Adam Trischler:
Building Dynamic Knowledge Graphs from Text-based Games. CoRR abs/1910.09532 (2019)
- 2018
- [j1]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
A methodology for turn-taking capabilities enhancement in Spoken Dialogue Systems using Reinforcement Learning. Comput. Speech Lang. 47: 93-111 (2018)
- [c36]Lucas Lehnert, Romain Laroche, Harm van Seijen:
On Value Function Representation of Long Horizon Problems. AAAI 2018: 3457-3465
- [c35]Merwan Barlier, Romain Laroche, Olivier Pietquin:
Training Dialogue Systems With Human Advice. AAMAS 2018: 999-1007
- [c34]Romain Laroche, Raphaël Féraud:
Reinforcement Learning Algorithm Selection. ICLR (Poster) 2018
- [c33]Romain Laroche, Harm van Seijen:
In reinforcement learning, all objective functions are not equal. ICLR (Workshop) 2018
- [i8]Xingdi Yuan, Marc-Alexandre Côté, Alessandro Sordoni, Romain Laroche, Remi Tachet des Combes, Matthew J. Hausknecht, Adam Trischler:
Counting to Explore and Generalize in Text-based Games. CoRR abs/1806.11525 (2018)
- [i7]Raphaël Féraud, Réda Alami, Romain Laroche:
Decentralized Exploration in Multi-Armed Bandits. CoRR abs/1811.07763 (2018)
- 2017
- [c32]Romain Laroche, Merwan Barlier:
Transfer Reinforcement Learning with Shared Dynamics. AAAI 2017: 2147-2153
- [c31]Harm van Seijen, Mehdi Fatemi, Romain Laroche, Joshua Romoff, Tavian Barnes, Jeffrey Tsang:
Hybrid Reward Architecture for Reinforcement Learning. NIPS 2017: 5392-5402
- [i6]Romain Laroche, Raphaël Féraud:
Algorithm selection of off-policy reinforcement learning algorithm. CoRR abs/1701.08810 (2017)
- [i5]Romain Laroche, Mehdi Fatemi, Joshua Romoff, Harm van Seijen:
Multi-Advisor Reinforcement Learning. CoRR abs/1704.00756 (2017)
- [i4]Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche, Tavian Barnes, Jeffrey Tsang:
Hybrid Reward Architecture for Reinforcement Learning. CoRR abs/1706.04208 (2017)
- [i3]Romain Laroche:
The Complex Negotiation Dialogue Game. CoRR abs/1707.01450 (2017)
- [i2]Romain Laroche, Paul Trichelair:
Safe Policy Improvement with Baseline Bootstrapping. CoRR abs/1712.06924 (2017)
- 2016
- [c30]Layla El Asri, Bilal Piot, Matthieu Geist, Romain Laroche, Olivier Pietquin:
Score-based Inverse Reinforcement Learning. AAMAS 2016: 457-465
- [c29]Aude Genevay, Romain Laroche:
Transfer Learning for User Adaptation in Spoken Dialogue Systems. AAMAS 2016: 975-983
- [c28]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
Reinforcement Learning for Turn-Taking Management in Incremental Spoken Dialogue Systems. IJCAI 2016: 2831-2837
- [c27]Merwan Barlier, Romain Laroche, Olivier Pietquin:
A Stochastic Model for Computer-Aided Human-Human Dialogue. INTERSPEECH 2016: 2051-2055
- [c26]Layla El Asri, Romain Laroche, Olivier Pietquin:
Compact and Interpretable Dialogue State Representation with Genetic Sparse Distributed Memory. IWSDS 2016: 39-51
- [c25]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
Incremental Human-Machine Dialogue Simulation. IWSDS 2016: 53-66
- [c24]Romain Laroche, Aude Genevay:
The Negotiation Dialogue Game. IWSDS 2016: 403-410
- [c23]Merwan Barlier, Romain Laroche, Olivier Pietquin:
Learning dialogue dynamics with the method of moments. SLT 2016: 98-105
- [c22]Tatiana Ekeinhor-Komi, Jean Léon Bouraoui, Romain Laroche, Fabrice Lefèvre:
Towards a virtual personal assistant based on a user-defined portfolio of multi-domain vocal applications. SLT 2016: 106-113
- [i1]Harm van Seijen, Mehdi Fatemi, Joshua Romoff, Romain Laroche:
Improving Scalability of Reinforcement Learning by Separation of Concerns. CoRR abs/1612.05159 (2016)
- 2015
- [c21]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
Turn-taking phenomena in incremental dialogue systems. EMNLP 2015: 1890-1895
- [c20]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
Dialogue Efficiency Evaluation of Turn-Taking Phenomena in a Multi-layer Incremental Simulated Environment. HCI (27) 2015: 753-758
- [c19]Romain Laroche:
Content finder AssistanT. ICIN 2015: 231-238
- [c18]Merwan Barlier, Julien Pérolat, Romain Laroche, Olivier Pietquin:
Human-Machine Dialogue as a Stochastic Game. SIGDIAL Conference 2015: 2-11
- [c17]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
Optimising Turn-Taking Strategies With Reinforcement Learning. SIGDIAL Conference 2015: 315-324
- 2014
- [c16]Layla El Asri, Hatim Khouzaimi, Romain Laroche, Olivier Pietquin:
Ordinal regression for interaction quality prediction. ICASSP 2014: 3221-3225
- [c15]Djallel Bouneffouf, Romain Laroche, Tanguy Urvoy, Raphaël Féraud, Robin Allesiardo:
Contextual Bandit for Active Learning: Active Thompson Sampling. ICONIP (1) 2014: 405-412
- [c14]Layla El Asri, Rémi Lemonnier, Romain Laroche, Olivier Pietquin, Hatim Khouzaimi:
NASTIA: Negotiating Appointment Setting Interface. LREC 2014: 266-271
- [c13]Layla El Asri, Romain Laroche, Olivier Pietquin:
DINASTI: Dialogues with a Negotiating Appointment Setting Interface. LREC 2014: 272-278
- [c12]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
An easy method to make dialogue systems incremental. SIGDIAL Conference 2014: 98-107
- [c11]Romain Laroche:
CFAsT: Content-Finder AssistanT [in French]. TALN (3) 2014: 9-10
- [c10]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
DictaNum: a dialogue system for numbers dictation (DictaNum : système de dialogue incrémental pour la dictée de numéros.) [in French]. TALN (3) 2014: 23-25
- [c9]Tatiana Ekeinhor-Komi, Hajar Falih, Christine Chardenon, Romain Laroche, Fabrice Lefèvre:
Enia : A customizable multi-domain assistant (Un assistant vocal personnalisable) [in French]. TALN (3) 2014: 28-29
- [c8]Hatim Khouzaimi, Romain Laroche, Fabrice Lefèvre:
A simple approach to make dialogue systems incremental (Vers une approche simplifiée pour introduire le caractère incrémental dans les systèmes de dialogue) [in French]. TALN (1) 2014: 196-207
- 2013
- [c7]Layla El Asri, Romain Laroche:
Will my Spoken Dialogue System be a Slow Learner ? SIGDIAL Conference 2013: 97-101
- [c6]Layla El Asri, Romain Laroche, Olivier Pietquin:
Reward Shaping for Statistical Optimisation of Dialogue Management. SLSP 2013: 93-101
- 2012
- [c5]Layla El Asri, Romain Laroche, Olivier Pietquin:
Reward Function Learning for Dialogue Management. STAIRS 2012: 95-106
- 2010
- [c4]Romain Laroche, Philippe Bretier, Ghislain Putois:
Enhanced monitoring tools and online dialogue optimisation merged into a new spoken dialogue system design experience. INTERSPEECH 2010: 3006-3009
- [c3]Romain Laroche, Ghislain Putois, Philippe Bretier:
Optimising a handcrafted dialogue system design. INTERSPEECH 2010: 3010-3013
- [c2]Ghislain Putois, Romain Laroche, Philippe Bretier:
Enhanced Monitoring Tools and Online Dialogue Optimisation Merged into a New Spoken Dialogue System Design Experience. SIGDIAL Conference 2010: 185-192
2000 – 2009
- 2009
- [c1]Romain Laroche, Ghislain Putois, Philippe Bretier, Bernadette Bouchon-Meunier:
Hybridisation of expertise and reinforcement learning in dialogue systems. INTERSPEECH 2009: 2479-2482
last updated on 2025-01-09 13:00 CET by the dblp team
all metadata released as open data under CC0 1.0 license