Showing 1–27 of 27 results for author: Liu, I

Searching in archive cs.
  1. arXiv:2408.10198  [pdf, other]

    cs.CV cs.GR

    MeshFormer: High-Quality Mesh Generation with 3D-Guided Reconstruction Model

    Authors: Minghua Liu, Chong Zeng, Xinyue Wei, Ruoxi Shi, Linghao Chen, Chao Xu, Mengqi Zhang, Zhaoning Wang, Xiaoshuai Zhang, Isabella Liu, Hongzhi Wu, Hao Su

    Abstract: Open-world 3D reconstruction models have recently garnered significant attention. However, without sufficient 3D inductive bias, existing methods typically entail expensive training costs and struggle to extract high-quality 3D meshes. In this work, we introduce MeshFormer, a sparse-view reconstruction model that explicitly leverages 3D native structure, input guidance, and training supervision. S…

    Submitted 19 August, 2024; originally announced August 2024.

    Comments: 20 pages, 9 figures

  2. arXiv:2407.04152  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    VoxAct-B: Voxel-Based Acting and Stabilizing Policy for Bimanual Manipulation

    Authors: I-Chun Arthur Liu, Sicheng He, Daniel Seita, Gaurav Sukhatme

    Abstract: Bimanual manipulation is critical to many robotics applications. In contrast to single-arm manipulation, bimanual manipulation tasks are challenging due to higher-dimensional action spaces. Prior works leverage large amounts of data and primitive actions to address this problem, but may suffer from sample inefficiency and limited generalization across various tasks. To this end, we propose VoxAct-…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2407.03640  [pdf, other]

    cs.LG cs.CL cs.CV

    Generative Technology for Human Emotion Recognition: A Scope Review

    Authors: Fei Ma, Yucheng Yuan, Yifan Xie, Hongwei Ren, Ivan Liu, Ying He, Fuji Ren, Fei Richard Yu, Shiguang Ni

    Abstract: Affective computing stands at the forefront of artificial intelligence (AI), seeking to imbue machines with the ability to comprehend and respond to human emotions. Central to this field is emotion recognition, which endeavors to identify and interpret human emotional states from different modalities, such as speech, facial images, text, and physiological signals. In recent years, important progre…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Under Review

  4. arXiv:2406.12834  [pdf, other]

    cs.CV

    GroPrompt: Efficient Grounded Prompting and Adaptation for Referring Video Object Segmentation

    Authors: Ci-Siang Lin, I-Jieh Liu, Min-Hung Chen, Chien-Yi Wang, Sifei Liu, Yu-Chiang Frank Wang

    Abstract: Referring Video Object Segmentation (RVOS) aims to segment the object referred to by the query sentence throughout the entire video. Most existing methods require end-to-end training with dense mask annotations, which could be computation-consuming and less scalable. In this work, we aim to efficiently adapt foundation segmentation models for addressing RVOS from weak supervision with the proposed…

    Submitted 23 June, 2024; v1 submitted 18 June, 2024; originally announced June 2024.

    Comments: CVPR Workshop (CVinW) 2024. Project page: https://jack24658735.github.io/groprompt/

  5. arXiv:2404.12379  [pdf, other]

    cs.CV

    Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos

    Authors: Isabella Liu, Hao Su, Xiaolong Wang

    Abstract: Modern 3D engines and graphics pipelines require mesh as a memory-efficient representation, which allows efficient rendering, geometry processing, texture editing, and many other downstream operations. However, it is still highly difficult to obtain high-quality mesh in terms of structure and detail from monocular visual observations. The problem becomes even more challenging for dynamic scenes an…

    Submitted 22 April, 2024; v1 submitted 18 April, 2024; originally announced April 2024.

    Comments: Project page: https://www.liuisabella.com/DG-Mesh/

  6. arXiv:2403.03608  [pdf, other]

    cs.CV cs.AI cs.LG

    GSNeRF: Generalizable Semantic Neural Radiance Fields with Enhanced 3D Scene Understanding

    Authors: Zi-Ting Chou, Sheng-Yu Huang, I-Jieh Liu, Yu-Chiang Frank Wang

    Abstract: Utilizing multi-view inputs to synthesize novel-view images, Neural Radiance Fields (NeRF) have emerged as a popular research topic in 3D vision. In this work, we introduce a Generalizable Semantic Neural Radiance Field (GSNeRF), which uniquely takes image semantics into the synthesis process so that both novel view images and the associated semantic maps can be produced for unseen scenes. Our GSN…

    Submitted 6 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  7. arXiv:2401.11117  [pdf]

    eess.SP cs.CY

    A Finger on the Pulse of Cardiovascular Health: Estimating Blood Pressure with Smartphone Photoplethysmography-Based Pulse Waveform Analysis

    Authors: Ivan Liu, Fangyuan Liu, Qi Zhong, Shiguang Ni

    Abstract: Utilizing mobile phone cameras for continuous blood pressure (BP) monitoring presents a cost-effective and accessible approach, yet it is challenged by limitations in accuracy and interpretability. This study introduces four innovative strategies to enhance smartphone-based photoplethysmography for BP estimation (SPW-BP), addressing the interpretability-accuracy dilemma. First, we employ often-neg…

    Submitted 24 July, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

    Comments: 30 pages, 7 figures

  8. arXiv:2401.09145  [pdf]

    cs.CY

    Your blush gives you away: detecting hidden mental states with remote photoplethysmography and thermal imaging

    Authors: Ivan Liu, Fangyuan Liu, Qi Zhong, Fei Ma, Shiguang Ni

    Abstract: Multimodal emotion recognition techniques are increasingly essential for assessing mental states. Image-based methods, however, tend to focus predominantly on overt visual cues and often overlook subtler mental state changes. Psychophysiological research has demonstrated that HR and skin temperature are effective in detecting ANS activities, thereby revealing these subtle changes. However, traditi…

    Submitted 17 January, 2024; originally announced January 2024.

    Comments: 28 pages, 6 figures

  9. arXiv:2312.12891  [pdf, other]

    cs.AI

    MinePlanner: A Benchmark for Long-Horizon Planning in Large Minecraft Worlds

    Authors: William Hill, Ireton Liu, Anita De Mello Koch, Damion Harvey, Nishanth Kumar, George Konidaris, Steven James

    Abstract: We propose a new benchmark for planning tasks based on the Minecraft game. Our benchmark contains 45 tasks overall, but also provides support for creating both propositional and numeric instances of new Minecraft tasks automatically. We benchmark numeric and propositional planning systems on these tasks, with results demonstrating that state-of-the-art planners are currently incapable of dealing w…

    Submitted 28 April, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: Accepted to the 6th ICAPS Workshop on the International Planning Competition (WIPC 2024)

  10. arXiv:2312.07165  [pdf, other]

    cs.CV cs.AI cs.LG

    Language-Guided Transformer for Federated Multi-Label Classification

    Authors: I-Jieh Liu, Ci-Siang Lin, Fu-En Yang, Yu-Chiang Frank Wang

    Abstract: Federated Learning (FL) is an emerging paradigm that enables multiple users to collaboratively train a robust model in a privacy-preserving manner without sharing their private data. Most existing approaches of FL only consider traditional single-label image classification, ignoring the impact when transferring the task to multi-label image classification. Nevertheless, it is still challenging for…

    Submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted by AAAI 2024

  11. arXiv:2312.06686  [pdf, other]

    cs.CV cs.RO

    Robo360: A 3D Omnispective Multi-Material Robotic Manipulation Dataset

    Authors: Litian Liang, Liuyu Bian, Caiwei Xiao, Jialin Zhang, Linghao Chen, Isabella Liu, Fanbo Xiang, Zhiao Huang, Hao Su

    Abstract: Building robots that can automate labor-intensive tasks has long been the core motivation behind the advancements in computer vision and the robotics community. Recent interest in leveraging 3D algorithms, particularly neural fields, has led to advancements in robot perception and physical understanding in manipulation scenarios. However, the real world's complexity poses significant challenges. T…

    Submitted 9 December, 2023; originally announced December 2023.

  12. arXiv:2309.07921  [pdf, other]

    cs.CV

    OpenIllumination: A Multi-Illumination Dataset for Inverse Rendering Evaluation on Real Objects

    Authors: Isabella Liu, Linghao Chen, Ziyang Fu, Liwen Wu, Haian Jin, Zhong Li, Chin Ming Ryan Wong, Yi Xu, Ravi Ramamoorthi, Zexiang Xu, Hao Su

    Abstract: We introduce OpenIllumination, a real-world dataset containing over 108K images of 64 objects with diverse materials, captured under 72 camera views and a large number of different illuminations. For each image in the dataset, we provide accurate camera parameters, illumination ground truth, and foreground segmentation masks. Our dataset enables the quantitative evaluation of most inverse renderin…

    Submitted 1 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

  13. arXiv:2304.12461  [pdf, other]

    cs.CV

    TensoIR: Tensorial Inverse Rendering

    Authors: Haian Jin, Isabella Liu, Peijia Xu, Xiaoshuai Zhang, Songfang Han, Sai Bi, Xiaowei Zhou, Zexiang Xu, Hao Su

    Abstract: We propose TensoIR, a novel inverse rendering approach based on tensor factorization and neural fields. Unlike previous works that use purely MLP-based neural fields, thus suffering from low capacity and high computation costs, we extend TensoRF, a state-of-the-art approach for radiance field modeling, to estimate scene geometry, surface reflectance, and environment illumination from multi-view im…

    Submitted 17 March, 2024; v1 submitted 24 April, 2023; originally announced April 2023.

    Comments: Project page: https://haian-jin.github.io/TensoIR

  14. arXiv:2304.03833  [pdf, other]

    cs.RO cs.LG

    Learning Robot Manipulation from Cross-Morphology Demonstration

    Authors: Gautam Salhotra, I-Chun Arthur Liu, Gaurav Sukhatme

    Abstract: Some Learning from Demonstrations (LfD) methods handle small mismatches in the action spaces of the teacher and student. Here we address the case where the teacher's morphology is substantially different from that of the student. Our framework, Morphological Adaptation in Imitation Learning (MAIL), bridges this gap allowing us to train an agent from demonstrations by other agents with significantl…

    Submitted 29 October, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: Accepted to the Conference on Robot Learning (CoRL) 2023

  15. arXiv:2207.10148  [pdf, other]

    cs.RO cs.AI cs.LG

    Learning Deformable Object Manipulation from Expert Demonstrations

    Authors: Gautam Salhotra, I-Chun Arthur Liu, Marcus Dominguez-Kuhne, Gaurav S. Sukhatme

    Abstract: We present a novel Learning from Demonstration (LfD) method, Deformable Manipulation from Demonstrations (DMfD), to solve deformable manipulation tasks using states or images as inputs, given expert demonstrations. Our method uses demonstrations in three different ways, and balances the trade-off between exploring the environment online and using guidance from experts to explore high dimensional s…

    Submitted 20 July, 2022; originally announced July 2022.

    Comments: Accepted to IEEE Robotics & Automation Letters (RA-L) and IEEE IROS 2022. Project website: https://uscresl.github.io/dmfd

    Journal ref: IEEE Robotics & Automation Letters (RA-L) Oct 2022

  16. arXiv:2205.06111  [pdf, other]

    cs.AI cs.CL

    Asking for Knowledge: Training RL Agents to Query External Knowledge Using Language

    Authors: Iou-Jen Liu, Xingdi Yuan, Marc-Alexandre Côté, Pierre-Yves Oudeyer, Alexander G. Schwing

    Abstract: To solve difficult tasks, humans ask questions to acquire knowledge from external sources. In contrast, classical reinforcement learning agents lack such an ability and often resort to exploratory behavior. This is exacerbated as few present-day environments support querying for knowledge. In order to study how agents can be taught to query external knowledge via language, we first introduce two n…

    Submitted 3 July, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: ICML 2022; Project page: https://ioujenliu.github.io/AFK/

  17. arXiv:2112.02772  [pdf, other]

    cs.CV

    ActiveZero: Mixed Domain Learning for Active Stereovision with Zero Annotation

    Authors: Isabella Liu, Edward Yang, Jianyu Tao, Rui Chen, Xiaoshuai Zhang, Qing Ran, Zhu Liu, Hao Su

    Abstract: Traditional depth sensors generate accurate real world depth estimates that surpass even the most advanced learning approaches trained only on simulation domains. Since ground truth depth is readily available in the simulation domain but quite difficult to obtain in the real domain, we propose a method that leverages the best of both worlds. In this paper we present a new framework, ActiveZero, wh…

    Submitted 5 December, 2021; originally announced December 2021.

  18. arXiv:2111.06383  [pdf, other]

    cs.LG cs.AI cs.RO

    Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation

    Authors: I-Chun Arthur Liu, Shagun Uppal, Gaurav S. Sukhatme, Joseph J. Lim, Peter Englert, Youngwoon Lee

    Abstract: Learning complex manipulation tasks in realistic, obstructed environments is a challenging problem due to hard exploration in the presence of obstacles and high-dimensional visual observations. Prior work tackles the exploration problem by integrating motion planning and reinforcement learning. However, the motion planner augmented policy requires access to state information, which is often not av…

    Submitted 11 November, 2021; originally announced November 2021.

    Comments: Published at the Conference on Robot Learning (CoRL) 2021

  19. arXiv:2108.03319  [pdf, other]

    cs.AI

    Semantic Tracklets: An Object-Centric Representation for Visual Multi-Agent Reinforcement Learning

    Authors: Iou-Jen Liu, Zhongzheng Ren, Raymond A. Yeh, Alexander G. Schwing

    Abstract: Solving complex real-world tasks, e.g., autonomous fleet control, often involves a coordinated team of multiple agents which learn strategies from visual inputs via reinforcement learning. Many existing multi-agent reinforcement learning (MARL) algorithms however don't scale to environments where agents operate on visual inputs. To address this issue, algorithmically, recent works have focused on…

    Submitted 6 August, 2021; originally announced August 2021.

    Comments: IROS 2021; Project page: https://ioujenliu.github.io/SemanticTracklets/

  20. arXiv:2107.11444  [pdf, other]

    cs.AI

    Cooperative Exploration for Multi-Agent Deep Reinforcement Learning

    Authors: Iou-Jen Liu, Unnat Jain, Raymond A. Yeh, Alexander G. Schwing

    Abstract: Exploration is critical for good results in deep reinforcement learning and has attracted much attention. However, existing multi-agent deep reinforcement learning algorithms still use mostly noise-based techniques. Very recently, exploration methods that consider cooperation among multiple agents have been developed. However, existing methods suffer from a common challenge: agents struggle to ide…

    Submitted 23 July, 2021; originally announced July 2021.

    Comments: ICML 2021; Project Page: https://ioujenliu.github.io/CMAE/

  21. arXiv:2105.00931  [pdf, other]

    cs.CV cs.AI cs.LG cs.MA

    GridToPix: Training Embodied Agents with Minimal Supervision

    Authors: Unnat Jain, Iou-Jen Liu, Svetlana Lazebnik, Aniruddha Kembhavi, Luca Weihs, Alexander Schwing

    Abstract: While deep reinforcement learning (RL) promises freedom from hand-labeled data, great successes, especially for Embodied AI, require significant work to create supervision via carefully shaped rewards. Indeed, without shaped rewards, i.e., with only terminal rewards, present-day Embodied AI results degrade significantly across Embodied AI problems from single-agent Habitat-based PointGoal Navigati…

    Submitted 13 October, 2021; v1 submitted 14 April, 2021; originally announced May 2021.

    Comments: Project page: https://unnat.github.io/gridtopix/ ; last two authors contributed equally

  22. arXiv:2105.00811  [pdf, other]

    cs.CL

    CBench: Towards Better Evaluation of Question Answering Over Knowledge Graphs

    Authors: Abdelghny Orogat, Isabelle Liu, Ahmed El-Roby

    Abstract: Recently, there has been an increase in the number of knowledge graphs that can be only queried by experts. However, describing questions using structured queries is not straightforward for non-expert users who need to have sufficient knowledge about both the vocabulary and the structure of the queried knowledge graph, as well as the syntax of the structured query language used to describe the use…

    Submitted 5 April, 2021; originally announced May 2021.

  23. arXiv:2012.09849  [pdf, other]

    cs.LG cs.AI

    High-Throughput Synchronous Deep RL

    Authors: Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing

    Abstract: Deep reinforcement learning (RL) is computationally demanding and requires processing of many data points. Synchronous methods enjoy training stability while having lower data throughput. In contrast, asynchronous methods achieve high throughput but suffer from stability issues and lower sample efficiency due to `stale policies.' To combine the advantages of both methods we propose High-Throughput…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: Accepted to NeurIPS 2020; Project page: https://ioujenliu.github.io/HTS-RL/

  24. arXiv:2007.12173  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Bridging the Imitation Gap by Adaptive Insubordination

    Authors: Luca Weihs, Unnat Jain, Iou-Jen Liu, Jordi Salvador, Svetlana Lazebnik, Aniruddha Kembhavi, Alexander Schwing

    Abstract: In practice, imitation learning is preferred over pure reinforcement learning whenever it is possible to design a teaching agent to provide expert supervision. However, we show that when the teaching agent makes decisions with access to privileged information that is unavailable to the student, this information is marginalized during imitation learning, resulting in an "imitation gap" and, potenti…

    Submitted 3 December, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: NeurIPS'21 version. The first two authors contributed equally. Project page: https://unnat.github.io/advisor/

  25. arXiv:1911.00025  [pdf, other]

    cs.LG cs.CV stat.ML

    PIC: Permutation Invariant Critic for Multi-Agent Deep Reinforcement Learning

    Authors: Iou-Jen Liu, Raymond A. Yeh, Alexander G. Schwing

    Abstract: Sample efficiency and scalability to a large number of agents are two important goals for multi-agent reinforcement learning systems. Recent works got us closer to those goals, addressing non-stationarity of the environment from a single agent's perspective by utilizing a deep net critic which depends on all observations and actions. The critic input concatenates agent observations and actions in…

    Submitted 31 October, 2019; originally announced November 2019.

    Comments: Accepted to CoRL 2019

  26. arXiv:1904.05878  [pdf, other]

    cs.LG stat.ML

    Knowledge Flow: Improve Upon Your Teachers

    Authors: Iou-Jen Liu, Jian Peng, Alexander G. Schwing

    Abstract: A zoo of deep nets is available these days for almost any given task, and it is increasingly unclear which net to start with when addressing a new task, or which net to use as an initialization for fine-tuning a new model. To address this issue, in this paper, we develop knowledge flow which moves 'knowledge' from multiple deep nets, referred to as teachers, to a new deep net model, called the stu…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: Accepted to ICLR 2019

  27. arXiv:1412.3191  [pdf, other]

    cs.AI cs.NE

    Bach in 2014: Music Composition with Recurrent Neural Network

    Authors: I-Ting Liu, Bhiksha Ramakrishnan

    Abstract: We propose a framework for computer music composition that uses resilient propagation (RProp) and long short term memory (LSTM) recurrent neural network. In this paper, we show that LSTM network learns the structure and characteristics of music pieces properly by demonstrating its ability to recreate music. We also show that predicting existing music using RProp outperforms Back propagation throug…

    Submitted 13 December, 2014; v1 submitted 9 December, 2014; originally announced December 2014.