


default search action
Souradip Chakraborty
- > Home > Persons > Souradip Chakraborty
Publications
- 2025
- [c20]Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang:
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? AAAI 2025: 25002-25009 - [c19]Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang:
Is Poisoning a Real Threat to DPO? Maybe More So Than You Think. AAAI 2025: 27556-27564 - [c17]Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh, Tianrui Guan, Mengdi Wang, Ahmad Beirami, Furong Huang, Alvaro Velasquez, Dinesh Manocha, Amrit Singh Bedi:
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment. CVPR 2025: 25038-25049 - [c16]Souradip Chakraborty, Sujay Bhatt, Udari Madhushani Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh:
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment. ICLR 2025 - [i32]Souradip Chakraborty, Sujay Bhatt, Udari Madhushani Sehwag, Soumya Suvra Ghosal, Jiahao Qiu, Mengdi Wang, Dinesh Manocha, Furong Huang, Alec Koppel, Sumitra Ganesh:
Collab: Controlled Decoding using Mixture of Agents for LLM Alignment. CoRR abs/2503.21720 (2025) - [i31]Souradip Chakraborty, Mohammadreza Pourreza, Ruoxi Sun, Yiwen Song, Nino Scherrer, Furong Huang, Amrit Singh Bedi, Ahmad Beirami, Jindong Gu, Hamid Palangi, Tomas Pfister:
Review, Refine, Repeat: Understanding Iterative Decoding of AI Agents with Dynamic Evaluation and Selection. CoRR abs/2504.01931 (2025) - [i29]Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy, Yifu Lu, Mengdi Wang, Dinesh Manocha, Furong Huang, Mohammad Ghavamzadeh, Amrit Singh Bedi:
Does Thinking More always Help? Understanding Test-Time Scaling in Reasoning Models. CoRR abs/2506.04210 (2025) - 2024
- [c15]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Huazheng Wang, Dinesh Manocha, Mengdi Wang, Furong Huang:
PARL: A Unified Framework for Policy Alignment in Reinforcement Learning from Human Feedback. ICLR 2024 - [c14]Xiangyu Liu, Souradip Chakraborty, Yanchao Sun, Furong Huang:
Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in RL. ICLR 2024 - [c13]Souradip Chakraborty, Amrit S. Bedi, Sicheng Zhu, Bang An, Dinesh Manocha, Furong Huang:
Position: On the Possibilities of AI-Generated Text Detection. ICML 2024 - [c12]Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Dinesh Manocha, Furong Huang, Amrit S. Bedi, Mengdi Wang:
MaxMin-RLHF: Alignment with Diverse Human Preferences. ICML 2024 - [c11]Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang:
Transfer Q-star : Principled Decoding for LLM Alignment. NeurIPS 2024 - [i27]Souradip Chakraborty, Jiahao Qiu, Hui Yuan, Alec Koppel, Furong Huang, Dinesh Manocha, Amrit Singh Bedi, Mengdi Wang
:
MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences. CoRR abs/2402.08925 (2024) - [i24]Souradip Chakraborty, Soumya Suvra Ghosal, Ming Yin, Dinesh Manocha, Mengdi Wang, Amrit Singh Bedi, Furong Huang:
Transfer Q Star: Principled Decoding for LLM Alignment. CoRR abs/2405.20495 (2024) - [i23]Pankayaraj Pathmanathan, Souradip Chakraborty, Xiangyu Liu, Yongyuan Liang, Furong Huang:
Is poisoning a real threat to LLM alignment? Maybe more so than you think. CoRR abs/2406.12091 (2024) - [i22]Mucong Ding, Souradip Chakraborty, Vibhu Agrawal, Zora Che, Alec Koppel, Mengdi Wang, Amrit S. Bedi, Furong Huang:
SAIL: Self-Improving Efficient Online Alignment of Large Language Models. CoRR abs/2406.15567 (2024) - [i21]Michael-Andrei Panaitescu-Liess, Zora Che, Bang An, Yuancheng Xu, Pankayaraj Pathmanathan, Souradip Chakraborty, Sicheng Zhu, Tom Goldstein, Furong Huang:
Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? CoRR abs/2407.17417 (2024) - [i17]Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh, Tianrui Guan, Mengdi Wang, Ahmad Beirami, Furong Huang, Alvaro Velasquez, Dinesh Manocha, Amrit Singh Bedi:
Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment. CoRR abs/2411.18688 (2024) - [i16]James Beetham, Souradip Chakraborty, Mengdi Wang, Furong Huang, Amrit Singh Bedi, Mubarak Shah:
LIAR: Leveraging Alignment (Best-of-N) to Jailbreak LLMs in Seconds. CoRR abs/2412.05232 (2024) - 2023
- [j1]Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit S. Bedi:
A Survey on the Possibilities & Impossibilities of AI-generated Text Detection. Trans. Mach. Learn. Res. 2023 (2023) - [c10]Souradip Chakraborty, Amrit Singh Bedi, Pratap Tokekar, Alec Koppel, Brian M. Sadler, Furong Huang, Dinesh Manocha:
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning. AAAI 2023: 6980-6988 - [c9]Souradip Chakraborty, Amrit S. Bedi, Alec Koppel, Mengdi Wang, Furong Huang, Dinesh Manocha:
STEERING : Stein Information Directed Exploration for Model-Based Reinforcement Learning. ICML 2023: 3949-3978 - [i15]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Mengdi Wang
, Furong Huang, Dinesh Manocha:
STEERING: Stein Information Directed Exploration for Model-Based Reinforcement Learning. CoRR abs/2301.12038 (2023) - [i13]Souradip Chakraborty, Amrit Singh Bedi, Sicheng Zhu, Bang An, Dinesh Manocha, Furong Huang:
On the Possibilities of AI-Generated Text Detection. CoRR abs/2304.04736 (2023) - [i12]Xiangyu Liu, Souradip Chakraborty, Yanchao Sun, Furong Huang:
Rethinking Adversarial Policies: A Generalized Attack Formulation and Provable Defense in Multi-Agent RL. CoRR abs/2305.17342 (2023) - [i11]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Dinesh Manocha, Huazheng Wang, Furong Huang, Mengdi Wang
:
Aligning Agent Policy with Externalities: Reward Design via Bilevel RL. CoRR abs/2308.02585 (2023) - [i10]Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Singh Bedi:
Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey. CoRR abs/2310.15264 (2023) - 2022
- [i7]Souradip Chakraborty, Amrit Singh Bedi, Alec Koppel, Brian M. Sadler, Furong Huang, Pratap Tokekar, Dinesh Manocha:
Posterior Coreset Construction with Kernelized Stein Discrepancy for Model-Based Reinforcement Learning. CoRR abs/2206.01162 (2022)

manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from ,
, and
to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and
to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2025-08-22 02:07 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint