Showing 1–50 of 122 results for author: Lan, T

Searching in archive cs.

  1. arXiv:2408.09456  [pdf, other]

    cs.AR cs.AI cs.ET cs.LG

    In-Memory Learning Automata Architecture using Y-Flash Cell

    Authors: Omar Ghazal, Tian Lan, Shalman Ojukwu, Komal Krishnamurthy, Alex Yakovlev, Rishad Shafik

    Abstract: The modern implementation of machine learning architectures faces significant challenges due to frequent data transfer between memory and processing units. In-memory computing, primarily through memristor-based analog computing, offers a promising solution to overcome this von Neumann bottleneck. In this technology, data processing and storage are located inside the memory. Here, we introduce a no…

    Submitted 18 August, 2024; originally announced August 2024.

  2. arXiv:2408.07094  [pdf]

    cs.LG stat.ML

    Overcoming Imbalanced Safety Data Using Extended Accident Triangle

    Authors: Kailai Sun, Tianxiang Lan, Yang Miang Goh, Yueng-Hsiang Huang

    Abstract: There is growing interest in using safety analytics and machine learning to support the prevention of workplace incidents, especially in high-risk industries like construction and trucking. Although existing safety analytics studies have made remarkable progress, they suffer from imbalanced datasets, a common problem in safety analytics, resulting in prediction inaccuracies. This can lead to manag…

    Submitted 11 August, 2024; originally announced August 2024.

  3. arXiv:2408.07060  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents

    Authors: Kexun Zhang, Weiran Yao, Zuxin Liu, Yihao Feng, Zhiwei Liu, Rithesh Murthy, Tian Lan, Lei Li, Renze Lou, Jiacheng Xu, Bo Pang, Yingbo Zhou, Shelby Heinecke, Silvio Savarese, Huan Wang, Caiming Xiong

    Abstract: Large language model (LLM) agents have shown great potential in solving real-world software engineering (SWE) problems. The most advanced open-source SWE agent can resolve over 27% of real GitHub issues in SWE-Bench Lite. However, these sophisticated agent frameworks exhibit varying strengths, excelling in certain tasks while underperforming in others. To fully harness the diversity of these agent…

    Submitted 13 August, 2024; originally announced August 2024.

  4. arXiv:2408.00930  [pdf, other]

    cs.LG cs.AI

    Enabling High Data Throughput Reinforcement Learning on GPUs: A Domain Agnostic Framework for Data-Driven Scientific Research

    Authors: Tian Lan, Huan Wang, Caiming Xiong, Silvio Savarese

    Abstract: We introduce WarpSci, a domain agnostic framework designed to overcome crucial system bottlenecks encountered in the application of reinforcement learning to intricate environments with vast datasets featuring high-dimensional observation or action spaces. Notably, our framework eliminates the need for data transfer between the CPU and GPU, enabling the concurrent execution of thousands of simulat…

    Submitted 1 August, 2024; originally announced August 2024.

  5. arXiv:2407.11477  [pdf, other]

    cs.LG cs.AI

    XTraffic: A Dataset Where Traffic Meets Incidents with Explainability and More

    Authors: Xiaochuan Gou, Ziyue Li, Tian Lan, Junpeng Lin, Zhishuai Li, Bingyu Zhao, Chen Zhang, Di Wang, Xiangliang Zhang

    Abstract: Long-separated research has been conducted on two highly correlated tracks: traffic and incidents. Traffic track witnesses complicating deep learning models, e.g., to push the prediction a few percent more accurate, and the incident track only studies the incidents alone, e.g., to infer the incident risk. We, for the first time, spatiotemporally aligned the two tracks in a large-scale region (16,9…

    Submitted 16 July, 2024; originally announced July 2024.

  6. arXiv:2407.02031  [pdf, other]

    cs.DC cs.AI cs.LG

    SwiftDiffusion: Efficient Diffusion Model Serving with Add-on Modules

    Authors: Suyi Li, Lingyun Yang, Xiaoxiao Jiang, Hanfeng Lu, Zhipeng Di, Weiyi Lu, Jiawei Chen, Kan Liu, Yinghao Yu, Tao Lan, Guodong Yang, Lin Qu, Liping Zhang, Wei Wang

    Abstract: This paper documents our characterization study and practices for serving text-to-image requests with stable diffusion models in production. We first comprehensively analyze inference request traces for commercial text-to-image applications. It commences with our observation that add-on modules, i.e., ControlNets and LoRAs, that augment the base stable diffusion models, are ubiquitous in generatin…

    Submitted 2 July, 2024; originally announced July 2024.

  7. arXiv:2406.18518  [pdf, other]

    cs.CL cs.AI cs.LG cs.SE

    APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

    Authors: Zuxin Liu, Thai Hoang, Jianguo Zhang, Ming Zhu, Tian Lan, Shirley Kokane, Juntao Tan, Weiran Yao, Zhiwei Liu, Yihao Feng, Rithesh Murthy, Liangwei Yang, Silvio Savarese, Juan Carlos Niebles, Huan Wang, Shelby Heinecke, Caiming Xiong

    Abstract: The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scal…

    Submitted 26 June, 2024; originally announced June 2024.

  8. arXiv:2405.19878  [pdf, other]

    cs.LG cs.GT

    Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models

    Authors: Zeyu Fang, Tian Lan

    Abstract: Generative models such as diffusion have been employed as world models in offline reinforcement learning to generate synthetic data for more effective learning. Existing work either generates diffusion models one-time prior to training or requires additional interaction data to update it. In this paper, we propose a novel approach for offline reinforcement learning with closed-loop policy evaluati…

    Submitted 30 May, 2024; originally announced May 2024.

  9. arXiv:2405.16386  [pdf, other]

    cs.LG cs.AI

    Variational Offline Multi-agent Skill Discovery

    Authors: Jiayu Chen, Bhargav Ganguly, Tian Lan, Vaneet Aggarwal

    Abstract: Skills are effective temporal abstractions established for sequential decision making tasks, which enable efficient hierarchical learning for long-horizon tasks and facilitate multi-task learning through their transferability. Despite extensive research, research gaps remain in multi-agent scenarios, particularly for automatically extracting subgroup coordination patterns in a multi-agent task. In…

    Submitted 25 May, 2024; originally announced May 2024.

  10. arXiv:2405.14122  [pdf, other]

    cs.GT

    Modeling Other Players with Bayesian Beliefs for Games with Incomplete Information

    Authors: Zuyuan Zhang, Mahdi Imani, Tian Lan

    Abstract: Bayesian games model interactive decision-making where players have incomplete information -- e.g., regarding payoffs and private data on players' strategies and preferences -- and must actively reason and update their belief models (with regard to such information) using observation and interaction history. Existing work on counterfactual regret minimization has shown great success for games wit…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2105.08440 by other authors

  11. arXiv:2405.13748  [pdf, other]

    cs.CV

    Monocular Gaussian SLAM with Language Extended Loop Closure

    Authors: Tian Lan, Qinwei Lin, Haoqian Wang

    Abstract: Recently, 3D Gaussian Splatting has shown great potential in visual Simultaneous Localization And Mapping (SLAM). Existing methods have achieved encouraging results on RGB-D SLAM, but studies of the monocular case are still scarce. Moreover, they also fail to correct drift errors due to the lack of loop closure and global optimization. In this paper, we present MG-SLAM, a monocular Gaussian SLAM with a la…

    Submitted 22 May, 2024; originally announced May 2024.

  12. arXiv:2405.03967  [pdf, other]

    cs.LG cs.AI cs.AR

    SwiftRL: Towards Efficient Reinforcement Learning on Real Processing-In-Memory Systems

    Authors: Kailash Gogineni, Sai Santosh Dayapule, Juan Gómez-Luna, Karthikeya Gogineni, Peng Wei, Tian Lan, Mohammad Sadrosadati, Onur Mutlu, Guru Venkataramani

    Abstract: Reinforcement Learning (RL) trains agents to learn optimal behavior by maximizing reward signals from experience datasets. However, RL training often faces memory limitations, leading to execution latencies and prolonged training times. To overcome this, SwiftRL explores Processing-In-Memory (PIM) architectures to accelerate RL workloads. We achieve near-linear performance scaling by implementing…

    Submitted 6 May, 2024; originally announced May 2024.

  13. arXiv:2403.15341  [pdf, other]

    cs.AI cs.MA

    Collaborative AI Teaming in Unknown Environments via Active Goal Deduction

    Authors: Zuyuan Zhang, Hanhan Zhou, Mahdi Imani, Taeyoung Lee, Tian Lan

    Abstract: With the advancements of artificial intelligence (AI), we're seeing more scenarios that require AI to work closely with other agents, whose goals and strategies might not be known beforehand. However, existing approaches for training collaborative agents often require defined and known reward signals and cannot address the problem of teaming with unknown agents that often have latent objectives/re…

    Submitted 22 March, 2024; originally announced March 2024.

  14. arXiv:2403.01954  [pdf, other]

    cs.CL cs.AI cs.LO

    DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation

    Authors: Chen Xu, Tian Lan, Changlong Yu, Wei Wang, Jun Gao, Yu Ji, Qunxi Dong, Kun Qian, Piji Li, Wei Bi, Bin Hu

    Abstract: Constrained decoding approaches aim to control the meaning or style of text generated by a Pre-trained Language Model (PLM) using specific target words during inference. However, these methods often guide plausible continuations by greedily selecting targets, which, while completing the task, may disrupt the natural patterns of human language generation. In this work, we propose a novel decoding f…

    Submitted 7 July, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Submitted to IEEE TKDE (Major Revision), 13 pages, 6 figures

  15. arXiv:2403.01890  [pdf, other]

    cs.RO

    Aerial Tensile Perching and Disentangling Mechanism for Long-Term Environmental Monitoring

    Authors: Tian Lan, Luca Romanello, Mirko Kovac, Sophie F. Armanini, Basaran Bahadir Kocer

    Abstract: Aerial robots show significant potential for forest canopy research and environmental monitoring by providing data collection capabilities at high spatial and temporal resolutions. However, limited flight endurance hinders their application. Inspired by natural perching behaviours, we propose a multi-modal aerial robot system that integrates tensile perching for energy conservation and a suspended…

    Submitted 5 March, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: 7 pages, 8 figures, Accepted in IEEE International Conference on Robotics and Automation (ICRA) 2024

  16. arXiv:2403.01642  [pdf]

    cs.LG cs.CE eess.SY

    Blue and Green-Mode Energy-Efficient Chemiresistive Sensor Array Realized by Rapid Ensemble Learning

    Authors: Zeheng Wang, James Cooper, Muhammad Usman, Timothy van der Laan

    Abstract: The rapid advancement of Internet of Things (IoT) necessitates the development of optimized Chemiresistive Sensor (CRS) arrays that are both energy-efficient and capable. This study introduces a novel optimization strategy that employs a rapid ensemble learning-based model committee approach to achieve these goals. Utilizing machine learning models such as Elastic Net Regression, Random Forests, a…

    Submitted 3 March, 2024; originally announced March 2024.

    Comments: First version before submission

  17. arXiv:2402.15538  [pdf, other]

    cs.MA cs.AI

    AgentLite: A Lightweight Library for Building and Advancing Task-Oriented LLM Agent System

    Authors: Zhiwei Liu, Weiran Yao, Jianguo Zhang, Liangwei Yang, Zuxin Liu, Juntao Tan, Prafulla K. Choubey, Tian Lan, Jason Wu, Huan Wang, Shelby Heinecke, Caiming Xiong, Silvio Savarese

    Abstract: The booming success of LLMs initiates rapid development in LLM agents. Though the foundation of an LLM agent is the generative model, it is critical to devise the optimal reasoning strategies and agent architectures. Accordingly, LLM agent research advances from the simple chain-of-thought prompting to more complex ReAct and Reflection reasoning strategy; agent architecture also evolves from singl…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: preprint. Library is available at https://github.com/SalesforceAIResearch/AgentLite

  18. arXiv:2402.15506  [pdf, other]

    cs.AI cs.CL cs.LG

    AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

    Authors: Jianguo Zhang, Tian Lan, Rithesh Murthy, Zhiwei Liu, Weiran Yao, Juntao Tan, Thai Hoang, Liangwei Yang, Yihao Feng, Zuxin Liu, Tulika Awalgaonkar, Juan Carlos Niebles, Silvio Savarese, Shelby Heinecke, Huan Wang, Caiming Xiong

    Abstract: Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks presents inherent challenges due to the heterogeneous nature of diverse data sources featuring multi-turn trajectories. In this paper, we introduce AgentOhana as a comprehensive solution to address these challenges. …

    Submitted 20 March, 2024; v1 submitted 23 February, 2024; originally announced February 2024.

    Comments: Add GitHub repo link at https://github.com/SalesforceAIResearch/xLAM and HuggingFace model link at https://huggingface.co/Salesforce/xLAM-v0.1-r

  19. arXiv:2402.13777  [pdf, other]

    cs.LG cs.AI

    Deep Generative Models for Offline Policy Learning: Tutorial, Survey, and Perspectives on Future Directions

    Authors: Jiayu Chen, Bhargav Ganguly, Yang Xu, Yongsheng Mei, Tian Lan, Vaneet Aggarwal

    Abstract: Deep generative models (DGMs) have demonstrated great success across various domains, particularly in generating texts, images, and videos using models trained from offline data. Similarly, data-driven decision-making and robotic control also necessitate learning a generator function from the offline data to serve as the strategy or policy. In this case, applying deep generative models in offline…

    Submitted 25 May, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

    Comments: We restructured the paper and added more discussion

  20. arXiv:2402.13764  [pdf, other]

    cs.CL cs.AI

    CriticBench: Evaluating Large Language Models as Critic

    Authors: Tian Lan, Wenwei Zhang, Chen Xu, Heyan Huang, Dahua Lin, Kai Chen, Xian-ling Mao

    Abstract: Critique ability is crucial in the scalable oversight and self-improvement of Large Language Models (LLMs). While many recent studies explore the critique ability of LLMs to judge and refine flaws in generations, how to comprehensively and reliably measure the critique abilities of LLMs is under-explored. This paper introduces CriticBench, a novel benchmark designed to comprehensively and reliabl…

    Submitted 22 February, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

  21. arXiv:2402.12417  [pdf]

    cs.LG cs.AI

    Predicting trucking accidents with truck drivers' safety climate perception across companies: A transfer learning approach

    Authors: Kailai Sun, Tianxiang Lan, Say Hong Kam, Yang Miang Goh, Yueng-Hsiang Huang

    Abstract: There is a rising interest in using artificial intelligence (AI)-powered safety analytics to predict accidents in the trucking industry. Companies may face the practical challenge, however, of not having enough data to develop good safety analytics models. Although pretrained models may offer a solution for such companies, existing safety research using transfer learning has mostly focused on comp…

    Submitted 19 February, 2024; originally announced February 2024.

    Comments: submitted to journal: accident analysis and prevention

  22. arXiv:2402.10941  [pdf, other]

    cs.CL cs.AI cs.LG

    Text2Data: Low-Resource Data Generation with Textual Control

    Authors: Shiyu Wang, Yihao Feng, Tian Lan, Ning Yu, Yu Bai, Ran Xu, Huan Wang, Caiming Xiong, Silvio Savarese

    Abstract: Natural language serves as a common and straightforward control signal for humans to interact seamlessly with machines. Recognizing the importance of this interface, the machine learning community is investing considerable effort in generating data that is semantically coherent with textual instructions. While strides have been made in text-to-data generation spanning image editing, audio synthesi…

    Submitted 7 February, 2024; originally announced February 2024.

    Comments: We propose a method that can achieve text-to-data generation under low-resource situation

  23. arXiv:2401.14544  [pdf, other]

    cs.LG math.FA math.PR

    Bayesian Optimization through Gaussian Cox Process Models for Spatio-temporal Data

    Authors: Yongsheng Mei, Mahdi Imani, Tian Lan

    Abstract: Bayesian optimization (BO) has established itself as a leading strategy for efficiently optimizing expensive-to-evaluate functions. Existing BO methods mostly rely on Gaussian process (GP) surrogate models and are not applicable to (doubly-stochastic) Gaussian Cox processes, where the observation process is modulated by a latent intensity function modeled as a GP. In this paper, we propose a novel…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 2024 International Conference on Learning Representations (ICLR)

  24. arXiv:2312.15555  [pdf, other]

    cs.MA

    ConcaveQ: Non-Monotonic Value Function Factorization via Concave Representations in Deep Multi-Agent Reinforcement Learning

    Authors: Huiqun Li, Hanhan Zhou, Yifei Zou, Dongxiao Yu, Tian Lan

    Abstract: Value function factorization has achieved great success in multi-agent reinforcement learning by optimizing joint action-value functions through the maximization of factorized per-agent utilities. To ensure Individual-Global-Maximum property, existing works often focus on value factorization using monotonic functions, which are known to result in restricted representation expressiveness. In this p…

    Submitted 24 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI 2024

    Journal ref: AAAI 2024

  25. arXiv:2312.11742  [pdf, other]

    cs.DC cs.AR cs.LG cs.NI

    ACCL+: an FPGA-Based Collective Engine for Distributed Applications

    Authors: Zhenhao He, Dario Korolija, Yu Zhu, Benjamin Ramhorst, Tristan Laan, Lucian Petrica, Michaela Blott, Gustavo Alonso

    Abstract: FPGAs are increasingly prevalent in cloud deployments, serving as Smart NICs or network-attached accelerators. Despite their potential, developing distributed FPGA-accelerated applications remains cumbersome due to the lack of appropriate infrastructure and communication abstractions. To facilitate the development of distributed applications with FPGAs, in this paper we propose ACCL+, an open-sour…

    Submitted 18 December, 2023; originally announced December 2023.

  26. arXiv:2312.07696  [pdf, ps, other]

    cs.CR cs.AI

    Real-time Network Intrusion Detection via Decision Transformers

    Authors: Jingdi Chen, Hanhan Zhou, Yongsheng Mei, Gina Adam, Nathaniel D. Bastian, Tian Lan

    Abstract: Many cybersecurity problems that require real-time decision-making based on temporal observations can be abstracted as a sequence modeling problem, e.g., network intrusion detection from a sequence of arriving packets. Existing approaches like reinforcement learning may not be suitable for such cybersecurity decision problems, since the Markovian property may not necessarily hold and the underlyin…

    Submitted 16 December, 2023; v1 submitted 12 December, 2023; originally announced December 2023.

  27. arXiv:2312.07060  [pdf, other]

    cs.DC

    Layered Randomized Quantization for Communication-Efficient and Privacy-Preserving Distributed Learning

    Authors: Guangfeng Yan, Tan Li, Tian Lan, Kui Wu, Linqi Song

    Abstract: Next-generation wireless networks, such as edge intelligence and wireless distributed learning, face two critical challenges: communication efficiency and privacy protection. In this work, our focus is on addressing these issues in a distributed learning framework. We consider a new approach that simultaneously achieves communication efficiency and privacy protection by exploiting the privacy adva…

    Submitted 12 December, 2023; originally announced December 2023.

  28. arXiv:2312.02515  [pdf, other]

    cs.LG cs.AI

    ASPEN: High-Throughput LoRA Fine-Tuning of Large Language Models with a Single GPU

    Authors: Zhengmao Ye, Dengchun Li, Jingqi Tian, Tingfeng Lan, Jie Zuo, Lei Duan, Hui Lu, Yexi Jiang, Jian Sha, Ke Zhang, Mingjie Tang

    Abstract: Transformer-based large language models (LLMs) have demonstrated outstanding performance across diverse domains, particularly when fine-tuned for specific domains. Recent studies suggest that the resources required for fine-tuning LLMs can be economized through parameter-efficient methods such as Low-Rank Adaptation (LoRA). While LoRA effectively reduces computational burdens and resource demands…

    Submitted 5 December, 2023; originally announced December 2023.

    Comments: 14 pages, 14 figures

  29. arXiv:2311.17630  [pdf, other]

    cs.NI eess.SP

    Optimization in Mobile Augmented Reality Systems for the Metaverse over Wireless Communications

    Authors: Tianming Lan, Jun Zhao

    Abstract: As the essential technical support for Metaverse, Mobile Augmented Reality (MAR) has attracted the attention of many researchers. MAR applications rely on real-time processing of visual and audio data, and thus those heavy workloads can quickly drain the battery of a mobile device. To address this problem, edge-based solutions have appeared for handling some tasks that require more computing power…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: This paper appears in IEEE Global Communications Conference (GLOBECOM) 2023

  30. arXiv:2311.16018  [pdf, other]

    cs.CR cs.AI

    RIDE: Real-time Intrusion Detection via Explainable Machine Learning Implemented in a Memristor Hardware Architecture

    Authors: Jingdi Chen, Lei Zhang, Joseph Riem, Gina Adam, Nathaniel D. Bastian, Tian Lan

    Abstract: Deep Learning (DL) based methods have shown great promise in network intrusion detection by identifying malicious network traffic behavior patterns with high accuracy, but their applications to real-time, packet-level detections in high-speed communication networks are challenging due to the high computation time and resource requirements of Deep Neural Networks (DNNs), as well as lack of explaina…

    Submitted 27 November, 2023; originally announced November 2023.

  31. arXiv:2310.19841  [pdf]

    cs.LG

    An interpretable clustering approach to safety climate analysis: examining driver group distinction in safety climate perceptions

    Authors: Kailai Sun, Tianxiang Lan, Yang Miang Goh, Sufiana Safiena, Yueng-Hsiang Huang, Bailey Lytle, Yimin He

    Abstract: The transportation industry, particularly the trucking sector, is prone to workplace accidents and fatalities. Accidents involving large trucks accounted for a considerable percentage of overall traffic fatalities. Recognizing the crucial role of safety climate in accident prevention, researchers have sought to understand its factors and measure its impact within organizations. While existing data…

    Submitted 30 October, 2023; originally announced October 2023.

    Comments: Submitted to Journal: Accident Analysis and Prevention

  32. arXiv:2310.10226  [pdf, other]

    cs.CL

    Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective

    Authors: Huayang Li, Tian Lan, Zihao Fu, Deng Cai, Lemao Liu, Nigel Collier, Taro Watanabe, Yixuan Su

    Abstract: There are a number of diverging hypotheses about the neural text degeneration problem, i.e., generating repetitive and dull loops, which makes this problem both interesting and confusing. In this work, we aim to advance our understanding by presenting a straightforward and fundamental explanation from the data perspective. Our preliminary investigation reveals a strong correlation between the dege…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  33. arXiv:2310.08670  [pdf, other]

    cs.LG cs.DC

    Every Parameter Matters: Ensuring the Convergence of Federated Learning with Dynamic Heterogeneous Models Reduction

    Authors: Hanhan Zhou, Tian Lan, Guru Venkataramani, Wenbo Ding

    Abstract: Cross-device Federated Learning (FL) faces significant challenges where low-end clients that could potentially make unique contributions are excluded from training large models due to their resource bottlenecks. Recent research efforts have focused on model-heterogeneous FL, by extracting reduced-size models from the global model and applying them to local clients accordingly. Despite the empirica…

    Submitted 26 October, 2023; v1 submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted at NeurIPS 2023

  34. arXiv:2309.04707  [pdf, other]

    cs.AI cs.LG

    Advantage Actor-Critic with Reasoner: Explaining the Agent's Behavior from an Exploratory Perspective

    Authors: Muzhe Guo, Feixu Yu, Tian Lan, Fang Jin

    Abstract: Reinforcement learning (RL) is a powerful tool for solving complex decision-making problems, but its lack of transparency and interpretability has been a major challenge in domains where decisions have significant real-world consequences. In this paper, we propose a novel Advantage Actor-Critic with Reasoner (A2CR), which can be easily applied to Actor-Critic-based RL models and make them interpre…

    Submitted 9 September, 2023; originally announced September 2023.

  35. arXiv:2308.14897  [pdf, other]

    cs.LG cs.AI cs.DC

    Statistically Efficient Variance Reduction with Double Policy Estimation for Off-Policy Evaluation in Sequence-Modeled Reinforcement Learning

    Authors: Hanhan Zhou, Tian Lan, Vaneet Aggarwal

    Abstract: Offline reinforcement learning aims to utilize datasets of previously gathered environment-action interaction records to learn a policy without access to the real environment. Recent work has shown that offline reinforcement learning can be formulated as a sequence modeling problem and solved via supervised learning with approaches such as decision transformer. While these sequence-based methods a…

    Submitted 28 August, 2023; originally announced August 2023.

  36. arXiv:2308.03358  [pdf, other]

    cs.AI

    RGMComm: Return Gap Minimization via Discrete Communications in Multi-Agent Reinforcement Learning

    Authors: Jingdi Chen, Tian Lan, Carlee Joe-Wong

    Abstract: Communication is crucial for solving cooperative Multi-Agent Reinforcement Learning tasks in partially observable Markov Decision Processes. Existing works often rely on black-box methods to encode local information/features into messages shared with other agents, leading to the generation of continuous messages with high communication overhead and poor interpretability. Prior attempts at discrete…

    Submitted 18 December, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  37. arXiv:2308.00258  [pdf, other]

    cs.LG cs.DC

    AQUILA: Communication Efficient Federated Learning with Adaptive Quantization in Device Selection Strategy

    Authors: Zihao Zhao, Yuzhu Mao, Zhenpeng Shi, Yang Liu, Tian Lan, Wenbo Ding, Xiao-Ping Zhang

    Abstract: The widespread adoption of Federated Learning (FL), a privacy-preserving distributed learning methodology, has been impeded by the challenge of high communication overheads, typically arising from the transmission of large-scale models. Existing adaptive quantization methods, designed to mitigate these overheads, operate under the impractical assumption of uniform device participation in every tra…

    Submitted 4 October, 2023; v1 submitted 31 July, 2023; originally announced August 2023.

  38. arXiv:2307.11629  [pdf, other]

    cs.LG cs.MA

    Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs

    Authors: Jiayu Chen, Jingdi Chen, Tian Lan, Vaneet Aggarwal

    Abstract: Covering skill (a.k.a., option) discovery has been developed to improve the exploration of RL in single-agent scenarios with sparse reward signals, through connecting the most distant states in the embedding space provided by the Fiedler vector of the state transition graph. Given that joint state space grows exponentially with the number of agents in multi-agent systems, existing research still…

    Submitted 20 August, 2023; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: Accepted to NeurIPS 2022. arXiv admin note: substantial text overlap with arXiv:2201.08227

  39. arXiv:2307.06962  [pdf, other]

    cs.CL cs.AI

    Copy Is All You Need

    Authors: Tian Lan, Deng Cai, Yan Wang, Heyan Huang, Xian-Ling Mao

    Abstract: The dominant text generation models compose the output by sequentially selecting words from a fixed vocabulary. In this paper, we formulate text generation as progressively copying text segments (e.g., words or phrases) from an existing text collection. We compute the contextualized representations of meaningful text segments and index them using efficient vector search toolkits. The task of text…

    Submitted 13 July, 2023; originally announced July 2023.

    Journal ref: The Eleventh International Conference on Learning Representations (ICLR 2023)

  40. arXiv:2306.17054  [pdf, other]

    cs.NI

    Two-tiered Online Optimization of Region-wide Datacenter Resource Allocation via Deep Reinforcement Learning

    Authors: Chang-Lin Chen, Hanhan Zhou, Jiayu Chen, Mohammad Pedramfar, Vaneet Aggarwal, Tian Lan, Zheqing Zhu, Chi Zhou, Tim Gasser, Pol Mauri Ruiz, Vijay Menon, Neeraj Kumar, Hongbo Dong

    Abstract: This paper addresses the important need for advanced techniques in continuously allocating workloads on shared infrastructures in data centers, a problem arising due to the growing popularity and scale of cloud computing. It particularly emphasizes the scarcity of research ensuring guaranteed capacity in capacity reservations during large-scale failures. To tackle these issues, the paper presents…

    Submitted 29 June, 2023; originally announced June 2023.

  41. arXiv:2306.02831  [pdf, other]

    stat.ML cs.LG

    MM-DAG: Multi-task DAG Learning for Multi-modal Data -- with Application for Traffic Congestion Analysis

    Authors: Tian Lan, Ziyue Li, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Wolfgang Ketter, Rui Zhao, Chen Zhang

    Abstract: This paper proposes to learn Multi-task, Multi-modal Directed Acyclic Graphs (MM-DAGs), which are commonly observed in complex systems, e.g., traffic, manufacturing, and weather systems, whose variables are multi-modal with scalars, vectors, and functions. This paper takes the traffic congestion analysis as a concrete case, where a traffic intersection is usually regarded as a DAG. In a road network…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted in SIGKDD 2023

  42. arXiv:2306.01075  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    Pedestrian Crossing Action Recognition and Trajectory Prediction with 3D Human Keypoints

    Authors: Jiachen Li, Xinwei Shi, Feiyu Chen, Jonathan Stroud, Zhishuai Zhang, Tian Lan, Junhua Mao, Jeonhyung Kang, Khaled S. Refaat, Weilong Yang, Eugene Ie, Congcong Li

    Abstract: Accurate understanding and prediction of human behaviors are critical prerequisites for autonomous vehicles, especially in highly dynamic and interactive scenarios such as intersections in dense urban areas. In this work, we aim at identifying crossing pedestrians and predicting their future trajectories. To achieve these goals, we not only need the context information of road geometry and other t…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: ICRA 2023

  43. arXiv:2306.00187  [pdf, other]

    cs.MA

    AccMER: Accelerating Multi-Agent Experience Replay with Cache Locality-aware Prioritization

    Authors: Kailash Gogineni, Yongsheng Mei, Peng Wei, Tian Lan, Guru Venkataramani

    Abstract: Multi-Agent Experience Replay (MER) is a key component of off-policy reinforcement learning (RL) algorithms. By remembering and reusing experiences from the past, experience replay significantly improves the stability of RL algorithms and their learning efficiency. In many scenarios, multiple agents interact in a shared environment during online training under centralized training and decentralize…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: Accepted to ASAP'23

  44. arXiv:2305.19153  [pdf, other]

    cs.NI cs.AI

    FERN: Leveraging Graph Attention Networks for Failure Evaluation and Robust Network Design

    Authors: Chenyi Liu, Vaneet Aggarwal, Tian Lan, Nan Geng, Yuan Yang, Mingwei Xu, Qing Li

    Abstract: Robust network design, which aims to guarantee network availability under various failure scenarios while optimizing performance/cost objectives, has received significant attention. Existing approaches often rely on model-based mixed-integer optimization that is hard to scale or employ deep learning to solve specific engineering problems yet with limited generalizability. In this paper, we show th…

    Submitted 30 May, 2023; originally announced May 2023.

    Journal ref: IEEE/ACM Transactions on Networking 2023

  45. arXiv:2305.17327  [pdf, other]

    cs.LG

    Hierarchical Deep Counterfactual Regret Minimization

    Authors: Jiayu Chen, Tian Lan, Vaneet Aggarwal

    Abstract: Imperfect Information Games (IIGs) offer robust models for scenarios where decision-makers face uncertainty or lack complete information. Counterfactual Regret Minimization (CFR) has been one of the most successful families of algorithms for tackling IIGs. The integration of skill-based strategy learning with CFR could potentially mirror a more human-like decision-making process and enhance the learni…

    Submitted 26 September, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

  46. arXiv:2305.16355  [pdf, other]

    cs.CL cs.CV

    PandaGPT: One Model To Instruction-Follow Them All

    Authors: Yixuan Su, Tian Lan, Huayang Li, Jialu Xu, Yan Wang, Deng Cai

    Abstract: We present PandaGPT, an approach to emPower large lANguage moDels with visual and Auditory instruction-following capabilities. Our pilot experiments show that PandaGPT can perform complex tasks such as detailed image description generation, writing stories inspired by videos, and answering questions about audios. More interestingly, PandaGPT can take multimodal inputs simultaneously and compose th…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: Technical report, work in progress. Our project page is at https://panda-gpt.github.io/

  47. arXiv:2305.13411  [pdf, other]

    cs.MA

    Towards Efficient Multi-Agent Learning Systems

    Authors: Kailash Gogineni, Peng Wei, Tian Lan, Guru Venkataramani

    Abstract: Multi-Agent Reinforcement Learning (MARL) is an increasingly important research field that can model and control multiple large-scale autonomous systems. Despite its achievements, existing multi-agent learning methods typically involve expensive computations in terms of training time and power arising from large observation-action space and a huge number of training steps. Therefore, a key challen…

    Submitted 23 May, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Accepted at MLArchSys, ISCA 2023. Compared to arXiv:2302.05007, we explore a neighbor sampling strategy to improve the locality of data access within the mini-batch sampling phase. Our preliminary experiments show a run-time improvement in the sampling phase of training ranging from 26.66% (3 agents) to 27.39% (12 agents).

  48. arXiv:2305.12633  [pdf, other]

    cs.LG

    Multi-task Hierarchical Adversarial Inverse Reinforcement Learning

    Authors: Jiayu Chen, Dipesh Tamboli, Tian Lan, Vaneet Aggarwal

    Abstract: Multi-task Imitation Learning (MIL) aims to train a policy capable of performing a distribution of tasks based on multi-task expert demonstrations, which is essential for general-purpose robots. Existing MIL algorithms suffer from low data efficiency and poor performance on complex long-horizontal tasks. We develop Multi-task Hierarchical Adversarial Inverse Reinforcement Learning (MH-AIRL) to lea…

    Submitted 28 June, 2023; v1 submitted 21 May, 2023; originally announced May 2023.

    Comments: This paper is accepted at ICML 2023. arXiv admin note: text overlap with arXiv:2210.01969

  49. arXiv:2304.01468  [pdf, other]

    cs.DC cs.AI cs.DB

    DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud

    Authors: Qinlong Wang, Tingfeng Lan, Yinghao Tang, Ziling Huang, Yiheng Du, Haitao Zhang, Jian Sha, Hui Lu, Yuanchun Zhou, Ke Zhang, Mingjie Tang

    Abstract: Deep learning recommendation models (DLRM) rely on large embedding tables to manage categorical sparse features. Expanding such embedding tables can significantly enhance model performance, but at the cost of increased GPU/CPU/memory usage. Meanwhile, tech companies have built extensive cloud-based services to accelerate training DLRM models at scale. In this paper, we conduct a deep investigation…

    Submitted 28 June, 2024; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: Accepted in VLDB'24

  50. arXiv:2302.10418  [pdf, other]

    cs.LG cs.AI cs.MA

    MAC-PO: Multi-Agent Experience Replay via Collective Priority Optimization

    Authors: Yongsheng Mei, Hanhan Zhou, Tian Lan, Guru Venkataramani, Peng Wei

    Abstract: Experience replay is crucial for off-policy reinforcement learning (RL) methods. By remembering and reusing the experiences from past different policies, experience replay significantly improves the training efficiency and stability of RL algorithms. Many decision-making problems in practice naturally involve multiple agents and require multi-agent reinforcement learning (MARL) under centralized t…

    Submitted 27 February, 2023; v1 submitted 20 February, 2023; originally announced February 2023.

    Comments: The 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023). arXiv admin note: text overlap with arXiv:2302.05593