


Hongyi Guo
2020 – today
- 2025
- [i13] Han Zhong, Yutong Yin, Shenao Zhang, Xiaojun Xu, Yuanxin Liu, Yifei Zuo, Zhihan Liu, Boyi Liu, Sirui Zheng, Hongyi Guo, Liwei Wang, Mingyi Hong, Zhaoran Wang:
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning. CoRR abs/2501.18858 (2025)
- 2024
- [j4] Xudong Yu, Chenjia Bai, Hongyi Guo, Changhong Wang, Zhen Wang:
Diverse randomized value functions: A provably pessimistic approach for offline reinforcement learning. Inf. Sci. 680: 121146 (2024)
- [j3] Hongyi Guo, Antonio Miguel Martínez-Graña:
Landslide Hazard Prediction Based on Small Baseline Subset-Interferometric Synthetic-Aperture Radar Technology Combined with Land-Use Dynamic Change and Hydrological Conditions (Sichuan, China). Remote. Sens. 16(15): 2715 (2024)
- [c10] Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang:
Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents. ICML 2024
- [c9] Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose H. Blanchet, Zhaoran Wang:
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer. NeurIPS 2024
- [i12] Hongyi Guo, Yuanshun Yao, Wei Shen, Jiaheng Wei, Xiaoying Zhang, Zhaoran Wang, Yang Liu:
Human-Instruction-Free LLM Self-Alignment with Limited Samples. CoRR abs/2401.06785 (2024)
- [i11] Jiaheng Wei, Yuanshun Yao, Jean-Francois Ton, Hongyi Guo, Andrew Estornell, Yang Liu:
Measuring and Reducing LLM Hallucination without Gold-Standard Answers via Expertise-Weighting. CoRR abs/2402.10412 (2024)
- [i10] Hongyi Guo, Zhihan Liu, Yufeng Zhang, Zhaoran Wang:
Can Large Language Models Play Games? A Case Study of A Self-Play Approach. CoRR abs/2403.05632 (2024)
- [i9] Wei Shen, Xiaoying Zhang, Yuanshun Yao, Rui Zheng, Hongyi Guo, Yang Liu:
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards. CoRR abs/2403.07708 (2024)
- [i8] Xudong Yu, Chenjia Bai, Hongyi Guo, Changhong Wang, Zhen Wang:
Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning. CoRR abs/2404.06188 (2024)
- [i7] Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose H. Blanchet, Zhaoran Wang:
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer. CoRR abs/2405.16436 (2024)
- [i6] Rui Zheng, Hongyi Guo, Zhihan Liu, Xiaoying Zhang, Yuanshun Yao, Xiaojun Xu, Zhaoran Wang, Zhiheng Xi, Tao Gui, Qi Zhang, Xuanjing Huang, Hang Li, Yang Liu:
Toward Optimal LLM Alignments Using Two-Player Games. CoRR abs/2406.10977 (2024)
- 2023
- [c8] Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li:
Behavior Contrastive Learning for Unsupervised Skill Discovery. ICML 2023: 39183-39204
- [i5] Rushuai Yang, Chenjia Bai, Hongyi Guo, Siyuan Li, Bin Zhao, Zhen Wang, Peng Liu, Xuelong Li:
Behavior Contrastive Learning for Unsupervised Skill Discovery. CoRR abs/2305.04477 (2023)
- [i4] Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang:
Reason for Future, Act for Now: A Principled Framework for Autonomous LLM Agents with Provable Sample Efficiency. CoRR abs/2309.17382 (2023)
- 2022
- [c7] Hongyi Guo, Qi Cai, Yufeng Zhang, Zhuoran Yang, Zhaoran Wang:
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes. ICML 2022: 8016-8038
- 2021
- [c6] Liheng Chen, Hongyi Guo, Yali Du, Fei Fang, Haifeng Zhang, Weinan Zhang, Yong Yu:
Signal Instructed Coordination in Cooperative Multi-agent Reinforcement Learning. DAI 2021: 185-205
- [c5] Hongyi Guo, Zuyue Fu, Zhuoran Yang, Zhaoran Wang:
Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games. ICML 2021: 3899-3909
- [c4] Jingkang Wang, Hongyi Guo, Zhaowei Zhu, Yang Liu:
Policy Learning Using Weak Supervision. NeurIPS 2021: 19960-19973
- 2020
- [j2] Zhigang Gao, Hongyi Guo, Yunfeng Xie, Huijuan Lu, Jianhui Zhang, Wenjie Diao, Ruichao Xu:
An improved localization method in cyber-social environments with obstacles. Comput. Electr. Eng. 86: 106694 (2020)
- [c3] Yang Liu, Hongyi Guo:
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates. ICML 2020: 6226-6236
- [i3] Jingkang Wang, Hongyi Guo, Zhaowei Zhu, Yang Liu:
Policy Learning Using Weak Supervision. CoRR abs/2010.01748 (2020)
2010 – 2019
- 2019
- [c2] Wenjie Diao, Zhigang Gao, Ruichao Xu, Yunfeng Xie, Ke Yan, Hongyi Guo:
Life Assistants for the Elderly Based on Mobile Devices. DASC/PiCom/DataCom/CyberSciTech 2019: 537-542
- [i2] Liheng Chen, Hongyi Guo, Haifeng Zhang, Fei Fang, Yaoming Zhu, Ming Zhou, Weinan Zhang, Qing Wang, Yong Yu:
Signal Instructed Coordination in Team Competition. CoRR abs/1909.04224 (2019)
- [i1] Yang Liu, Hongyi Guo:
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates. CoRR abs/1910.03231 (2019)
- 2017
- [j1] Zhigang Gao, Hongyi Guo, Yunfeng Xie, Yanjun Luo, Huijuan Lu, Ke Yan:
ChildGuard: A Child-Safety Monitoring System. IEEE Multim. 24(4): 48-57 (2017)
- 2016
- [c1] Gongshen Liu, Kui Meng, Hongyi Guo, Li Pan, Jianhua Li:
Automatic Threshold Calculation Based Label Propagation Algorithm for Overlapping Community. DSC 2016: 382-387

last updated on 2025-03-05 20:46 CET by the dblp team
all metadata released as open data under CC0 1.0 license
