Aviral Kumar
2020 – today
- 2024
- [c44] Kevin Black, Mitsuhiko Nakamoto, Pranav Atreya, Homer Rich Walke, Chelsea Finn, Aviral Kumar, Sergey Levine: Zero-Shot Robotic Manipulation with Pre-Trained Image-Editing Diffusion Models. ICLR 2024
- [c43] Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal: Stop Regressing: Training Value Functions via Classification for Scalable Deep RL. ICML 2024
- [c42] Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar: Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data. ICML 2024
- [c41] Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar: ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL. ICML 2024
- [c40] Chethan Bhateja, Derek Guo, Dibya Ghosh, Anikait Singh, Manan Tomar, Quan Vuong, Yevgen Chebotar, Sergey Levine, Aviral Kumar: Robotic Offline RL from Internet Videos via Value-Function Learning. ICRA 2024: 16977-16984
- [c39] Rafael Rafailov, Kyle Beltran Hatch, Anikait Singh, Aviral Kumar, Laura M. Smith, Ilya Kostrikov, Philippe Hansen-Estruch, Victor Kolev, Philip J. Ball, Jiajun Wu, Sergey Levine, Chelsea Finn: D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning. RLC 2024: 2178-2197
- [i57] William Chen, Oier Mees, Aviral Kumar, Sergey Levine: Vision-Language Models Provide Promptable Representations for Reinforcement Learning. CoRR abs/2402.02651 (2024)
- [i56] Yifei Zhou, Andrea Zanette, Jiayi Pan, Sergey Levine, Aviral Kumar: ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL. CoRR abs/2402.19446 (2024)
- [i55] Jesse Farebrother, Jordi Orbay, Quan Vuong, Adrien Ali Taïga, Yevgen Chebotar, Ted Xiao, Alex Irpan, Sergey Levine, Pablo Samuel Castro, Aleksandra Faust, Aviral Kumar, Rishabh Agarwal: Stop Regressing: Training Value Functions via Classification for Scalable Deep RL. CoRR abs/2403.03950 (2024)
- [i54] Katie Kang, Eric Wallace, Claire J. Tomlin, Aviral Kumar, Sergey Levine: Unfamiliar Finetuning Examples Control How Language Models Hallucinate. CoRR abs/2403.05612 (2024)
- [i53] Fahim Tajwar, Anikait Singh, Archit Sharma, Rafael Rafailov, Jeff Schneider, Tengyang Xie, Stefano Ermon, Chelsea Finn, Aviral Kumar: Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data. CoRR abs/2404.14367 (2024)
- [i52] Seohong Park, Kevin Frans, Sergey Levine, Aviral Kumar: Is Value Learning Really the Main Bottleneck in Offline RL? CoRR abs/2406.09329 (2024)
- [i51] Hao Bai, Yifei Zhou, Mert Cemri, Jiayi Pan, Alane Suhr, Sergey Levine, Aviral Kumar: DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning. CoRR abs/2406.11896 (2024)
- [i50] Amrith Setlur, Saurabh Garg, Xinyang Geng, Naman Garg, Virginia Smith, Aviral Kumar: RL on Incorrect Synthetic Data Scales the Efficiency of LLM Math Reasoning by Eight-Fold. CoRR abs/2406.14532 (2024)
- [i49] Yuxiao Qu, Tianjun Zhang, Naman Garg, Aviral Kumar: Recursive Introspection: Teaching Language Model Agents How to Self-Improve. CoRR abs/2407.18219 (2024)
- [i48] Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar: Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters. CoRR abs/2408.03314 (2024)
- [i47] Rafael Rafailov, Kyle Hatch, Anikait Singh, Laura M. Smith, Aviral Kumar, Ilya Kostrikov, Philippe Hansen-Estruch, Victor Kolev, Philip J. Ball, Jiajun Wu, Chelsea Finn, Sergey Levine: D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning. CoRR abs/2408.08441 (2024)
- [i46] Lunjun Zhang, Arian Hosseini, Hritik Bansal, Mehran Kazemi, Aviral Kumar, Rishabh Agarwal: Generative Verifiers: Reward Modeling as Next-Token Prediction. CoRR abs/2408.15240 (2024)
- [i45] Aviral Kumar, Vincent Zhuang, Rishabh Agarwal, Yi Su, John D. Co-Reyes, Avi Singh, Kate Baumli, Shariq Iqbal, Colton Bishop, Rebecca Roelofs, Lei M. Zhang, Kay McKinney, Disha Shrivastava, Cosmin Paduraru, George Tucker, Doina Precup, Feryal M. P. Behbahani, Aleksandra Faust: Training Language Models to Self-Correct via Reinforcement Learning. CoRR abs/2409.12917 (2024)
- [i44] Tianqi Liu, Wei Xiong, Jie Ren, Lichang Chen, Junru Wu, Rishabh Joshi, Yang Gao, Jiaming Shen, Zhen Qin, Tianhe Yu, Daniel Sohn, Anastasiia Makarova, Jeremiah Z. Liu, Yuan Liu, Bilal Piot, Abe Ittycheriah, Aviral Kumar, Mohammad Saleh: RRM: Robust Reward Model Training Mitigates Reward Hacking. CoRR abs/2409.13156 (2024)
- 2023
- [c38] Jianlan Luo, Perry Dong, Jeffrey Wu, Aviral Kumar, Xinyang Geng, Sergey Levine: Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning. CoRL 2023: 1348-1361
- [c37] Yevgen Chebotar, Quan Vuong, Karol Hausman, Fei Xia, Yao Lu, Alex Irpan, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Anand Sontakke, Grecia Salazar, Huong T. Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singh, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine: Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions. CoRL 2023: 3909-3928
- [c36] Joey Hong, Aviral Kumar, Sergey Levine: Confidence-Conditioned Value Functions for Offline Reinforcement Learning. ICLR 2023
- [c35] Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine: Offline Q-learning on Diverse Multi-Task Data Both Scales And Generalizes. ICLR 2023
- [c34] Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine: Efficient Deep Reinforcement Learning Requires Regulating Overfitting. ICLR 2023
- [c33] Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal: Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets. NeurIPS 2023
- [c32] Mitsuhiko Nakamoto, Simon Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine: Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning. NeurIPS 2023
- [c31] Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine: ReDS: Offline RL With Heteroskedastic Datasets via Support Constraints. NeurIPS 2023
- [c30] Aviral Kumar, Anikait Singh, Frederik D. Ebert, Mitsuhiko Nakamoto, Yanlai Yang, Chelsea Finn, Sergey Levine: Pre-Training for Robots: Offline RL Enables Learning New Tasks in a Handful of Trials. Robotics: Science and Systems 2023
- [i43] Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine: Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning. CoRR abs/2303.05479 (2023)
- [i42] Qiyang Li, Aviral Kumar, Ilya Kostrikov, Sergey Levine: Efficient Deep Reinforcement Learning Requires Regulating Overfitting. CoRR abs/2304.10466 (2023)
- [i41] Yevgen Chebotar, Quan Vuong, Alex Irpan, Karol Hausman, Fei Xia, Yao Lu, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Sontakke, Grecia Salazar, Huong T. Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singh, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine: Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions. CoRR abs/2309.10150 (2023)
- [i40] Chethan Bhateja, Derek Guo, Dibya Ghosh, Anikait Singh, Manan Tomar, Quan Vuong, Yevgen Chebotar, Sergey Levine, Aviral Kumar: Robotic Offline RL from Internet Videos via Value-Function Pre-Training. CoRR abs/2309.13041 (2023)
- [i39] Zhang-Wei Hong, Aviral Kumar, Sathwik Karnik, Abhishek Bhandwaldar, Akash Srivastava, Joni Pajarinen, Romain Laroche, Abhishek Gupta, Pulkit Agrawal: Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets. CoRR abs/2310.04413 (2023)
- [i38] Han Qi, Xinyang Geng, Stefano Rando, Iku Ohama, Aviral Kumar, Sergey Levine: Latent Conservative Objective Models for Data-Driven Crystal Structure Prediction. CoRR abs/2310.10056 (2023)
- [i37] Kevin Black, Mitsuhiko Nakamoto, Pranav Atreya, Homer Walke, Chelsea Finn, Aviral Kumar, Sergey Levine: Zero-Shot Robotic Manipulation with Pretrained Image-Editing Diffusion Models. CoRR abs/2310.10639 (2023)
- [i36] Jianlan Luo, Perry Dong, Jeffrey Wu, Aviral Kumar, Xinyang Geng, Sergey Levine: Action-Quantized Offline Reinforcement Learning for Robotic Skill Learning. CoRR abs/2310.11731 (2023)
- 2022
- [c29] Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine: Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning. CoRL 2022: 1652-1662
- [c28] Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron C. Courville, George Tucker, Sergey Levine: DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization. ICLR 2022
- [c27] Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine: Should I Run Offline Reinforcement Learning or Behavioral Cloning? ICLR 2022
- [c26] Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine: Data-Driven Offline Optimization for Architecting Hardware Accelerators. ICLR 2022
- [c25] Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine: Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization. ICML 2022: 21658-21676
- [c24] Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Chelsea Finn, Sergey Levine: How to Leverage Unlabeled Data in Offline Reinforcement Learning. ICML 2022: 25611-25635
- [c23] Han Qi, Yi Su, Aviral Kumar, Sergey Levine: Data-Driven Offline Decision-Making via Invariant Representation Learning. NeurIPS 2022
- [c22] Quan Vuong, Aviral Kumar, Sergey Levine, Yevgen Chebotar: DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning. NeurIPS 2022
- [c21] Minmin Chen, Can Xu, Vince Gatto, Devanshu Jain, Aviral Kumar, Ed H. Chi: Off-Policy Actor-critic for Recommender Systems. RecSys 2022: 338-349
- [i35] Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Chelsea Finn, Sergey Levine: How to Leverage Unlabeled Data in Offline Reinforcement Learning. CoRR abs/2202.01741 (2022)
- [i34] Brandon Trabucco, Xinyang Geng, Aviral Kumar, Sergey Levine: Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization. CoRR abs/2202.08450 (2022)
- [i33] Aviral Kumar, Joey Hong, Anikait Singh, Sergey Levine: When Should We Prefer Offline Reinforcement Learning Over Behavioral Cloning? CoRR abs/2204.05618 (2022)
- [i32] Homer Walke, Jonathan Yang, Albert Yu, Aviral Kumar, Jedrzej Orbik, Avi Singh, Sergey Levine: Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning. CoRR abs/2207.04703 (2022)
- [i31] Aviral Kumar, Anikait Singh, Frederik Ebert, Yanlai Yang, Chelsea Finn, Sergey Levine: Pre-Training for Robots: Offline RL Enables Learning New Tasks from a Handful of Trials. CoRR abs/2210.05178 (2022)
- [i30] Anikait Singh, Aviral Kumar, Quan Vuong, Yevgen Chebotar, Sergey Levine: Offline RL With Realistic Datasets: Heteroskedasticity and Support Constraints. CoRR abs/2211.01052 (2022)
- [i29] Quan Vuong, Aviral Kumar, Sergey Levine, Yevgen Chebotar: Dual Generator Offline Reinforcement Learning. CoRR abs/2211.01471 (2022)
- [i28] Han Qi, Yi Su, Aviral Kumar, Sergey Levine: Data-Driven Offline Decision-Making via Invariant Representation Learning. CoRR abs/2211.11349 (2022)
- [i27] Aviral Kumar, Rishabh Agarwal, Xinyang Geng, George Tucker, Sergey Levine: Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes. CoRR abs/2211.15144 (2022)
- [i26] Joey Hong, Aviral Kumar, Sergey Levine: Confidence-Conditioned Value Functions for Offline Reinforcement Learning. CoRR abs/2212.04607 (2022)
- 2021
- [c20] Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine: A Workflow for Offline Model-Free Robotic Reinforcement Learning. CoRL 2021: 417-428
- [c19] Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum: OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning. ICLR 2021
- [c18] Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg: Conservative Safety Critics for Exploration. ICLR 2021
- [c17] Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine: Benchmarks for Deep Off-Policy Evaluation. ICLR 2021
- [c16] Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine: Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning. ICLR 2021
- [c15] Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine: Conservative Objective Models for Effective Offline Model-Based Optimization. ICML 2021: 10358-10368
- [c14] Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn: Conservative Data Sharing for Multi-Task Offline Reinforcement Learning. NeurIPS 2021: 11501-11516
- [c13] Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan P. Adams, Sergey Levine: Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability. NeurIPS 2021: 25502-25515
- [c12] Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn: COMBO: Conservative Offline Model-Based Policy Optimization. NeurIPS 2021: 28954-28967
- [i25] Tianhe Yu, Aviral Kumar, Rafael Rafailov, Aravind Rajeswaran, Sergey Levine, Chelsea Finn: COMBO: Conservative Offline Model-Based Policy Optimization. CoRR abs/2102.08363 (2021)
- [i24] Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine: Benchmarks for Deep Off-Policy Evaluation. CoRR abs/2103.16596 (2021)
- [i23] Dibya Ghosh, Jad Rahme, Aviral Kumar, Amy Zhang, Ryan P. Adams, Sergey Levine: Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability. CoRR abs/2107.06277 (2021)
- [i22] Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine: Conservative Objective Models for Effective Offline Model-Based Optimization. CoRR abs/2107.06882 (2021)
- [i21] Tianhe Yu, Aviral Kumar, Yevgen Chebotar, Karol Hausman, Sergey Levine, Chelsea Finn: Conservative Data Sharing for Multi-Task Offline Reinforcement Learning. CoRR abs/2109.08128 (2021)
- [i20] Aviral Kumar, Anikait Singh, Stephen Tian, Chelsea Finn, Sergey Levine: A Workflow for Offline Model-Free Robotic Reinforcement Learning. CoRR abs/2109.10813 (2021)
- [i19] Aviral Kumar, Amir Yazdanbakhsh, Milad Hashemi, Kevin Swersky, Sergey Levine: Data-Driven Offline Optimization For Architecting Hardware Accelerators. CoRR abs/2110.11346 (2021)
- [i18] Aviral Kumar, Rishabh Agarwal, Tengyu Ma, Aaron C. Courville, George Tucker, Sergey Levine: DR3: Value-Based Deep Reinforcement Learning Requires Explicit Regularization. CoRR abs/2112.04716 (2021)
- 2020
- [c11] Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine: Chaining Behaviors from Data with Model-Free Reinforcement Learning. CoRL 2020: 2162-2177
- [c10] Aviral Kumar, Abhishek Gupta, Sergey Levine: DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction. NeurIPS 2020
- [c9] Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn: One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL. NeurIPS 2020
- [c8] Aviral Kumar, Sergey Levine: Model Inversion Networks for Model-Based Optimization. NeurIPS 2020
- [c7] Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine: Conservative Q-Learning for Offline Reinforcement Learning. NeurIPS 2020
- [i17] Aviral Kumar, Abhishek Gupta, Sergey Levine: DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction. CoRR abs/2003.07305 (2020)
- [i16] Justin Fu, Aviral Kumar, Ofir Nachum, George Tucker, Sergey Levine: D4RL: Datasets for Deep Data-Driven Reinforcement Learning. CoRR abs/2004.07219 (2020)
- [i15] Sergey Levine, Aviral Kumar, George Tucker, Justin Fu: Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. CoRR abs/2005.01643 (2020)
- [i14] Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine: Conservative Q-Learning for Offline Reinforcement Learning. CoRR abs/2006.04779 (2020)
- [i13] Anurag Ajay, Aviral Kumar, Pulkit Agrawal, Sergey Levine, Ofir Nachum: OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning. CoRR abs/2010.13611 (2020)
- [i12] Saurabh Kumar, Aviral Kumar, Sergey Levine, Chelsea Finn: One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL. CoRR abs/2010.14484 (2020)
- [i11] Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg: Conservative Safety Critics for Exploration. CoRR abs/2010.14497 (2020)
- [i10] Aviral Kumar, Rishabh Agarwal, Dibya Ghosh, Sergey Levine: Implicit Under-Parameterization Inhibits Data-Efficient Deep Reinforcement Learning. CoRR abs/2010.14498 (2020)
- [i9] Avi Singh, Albert Yu, Jonathan Yang, Jesse Zhang, Aviral Kumar, Sergey Levine: COG: Connecting New Skills to Past Experience with Offline Reinforcement Learning. CoRR abs/2010.14500 (2020)
2010 – 2019
- 2019
- [c6] Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine: Diagnosing Bottlenecks in Deep Q-learning Algorithms. ICML 2019: 2021-2030
- [c5] Aviral Kumar, Justin Fu, Matthew Soh, George Tucker, Sergey Levine: Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction. NeurIPS 2019: 11761-11771
- [c4] Jenny Liu, Aviral Kumar, Jimmy Ba, Jamie Kiros, Kevin Swersky: Graph Normalizing Flows. NeurIPS 2019: 13556-13566
- [i8] Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine: Diagnosing Bottlenecks in Deep Q-learning Algorithms. CoRR abs/1902.10250 (2019)
- [i7] Aviral Kumar, Sunita Sarawagi: Calibration of Encoder Decoder Models for Neural Machine Translation. CoRR abs/1903.00802 (2019)
- [i6] Jenny Liu, Aviral Kumar, Jimmy Ba, Jamie Kiros, Kevin Swersky: Graph Normalizing Flows. CoRR abs/1905.13177 (2019)
- [i5] Aviral Kumar, Justin Fu, George Tucker, Sergey Levine: Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction. CoRR abs/1906.00949 (2019)
- [i4] Xue Bin Peng, Aviral Kumar, Grace Zhang, Sergey Levine: Advantage-Weighted Regression: Simple and Scalable Off-Policy Reinforcement Learning. CoRR abs/1910.00177 (2019)
- [i3] Aviral Kumar, Sergey Levine: Model Inversion Networks for Model-Based Optimization. CoRR abs/1912.13464 (2019)
- [i2] Aviral Kumar, Xue Bin Peng, Sergey Levine: Reward-Conditioned Policies. CoRR abs/1912.13465 (2019)
- 2018
- [c3] Aviral Kumar, Sunita Sarawagi, Ujjwal Jain: Trainable Calibration Measures For Neural Networks From Kernel Mean Embeddings. ICML 2018: 2810-2819
- 2017
- [c2] Shankara Narayanan Krishna, Aviral Kumar, Fabio Somenzi, Behrouz Touri, Ashutosh Trivedi: The Reach-Avoid Problem for Constant-Rate Multi-mode Systems. ATVA 2017: 463-479
- [c1] Stanley Bak, Sergiy Bogomolov, Thomas A. Henzinger, Aviral Kumar: Challenges and Tool Implementation of Hybrid Rapidly-Exploring Random Trees. NSV@CAV 2017: 83-89
- [i1] Shankara Narayanan Krishna, Aviral Kumar, Fabio Somenzi, Behrouz Touri, Ashutosh Trivedi: The Reach-Avoid Problem for Constant-Rate Multi-Mode Systems. CoRR abs/1707.04151 (2017)