Showing 1–50 of 213 results for author: Qi, J

  1. arXiv:2407.16716  [pdf, ps, other]

    cs.NE cs.CV cs.LG

    Exploring The Neural Burden In Pruned Models: An Insight Inspired By Neuroscience

    Authors: Zeyu Wang, Weichen Dai, Xiangyu Zhou, Ji Qi, Yi Zhou

    Abstract: Vision Transformer and its variants have been adopted in many visual tasks due to their powerful capabilities, which also bring significant challenges in computation and storage. Consequently, researchers have introduced various compression methods in recent years, among which the pruning techniques are widely used to remove a significant fraction of the network. Therefore, these methods can reduc…

    Submitted 22 July, 2024; originally announced July 2024.

  2. arXiv:2407.13561  [pdf, other]

    cs.CL

    Research on Tibetan Tourism Viewpoints information generation system based on LLM

    Authors: Jinhu Qi, Shuai Yan, Wentao Zhang, Yibo Zhang, Zirui Liu, Ke Wang

    Abstract: Tibet, ensconced within China's territorial expanse, is distinguished by its labyrinthine and heterogeneous topography, a testament to its profound historical heritage, and the cradle of a unique religious ethos. The very essence of these attributes, however, has impeded the advancement of Tibet's tourism service infrastructure, rendering existing smart tourism services inadequate for the region's…

    Submitted 18 July, 2024; originally announced July 2024.

    Journal ref: ICWOC 2024

  3. arXiv:2407.00056  [pdf, other]

    cs.IR cs.AI cs.SI

    MMBee: Live Streaming Gift-Sending Recommendations via Multi-Modal Fusion and Behaviour Expansion

    Authors: Jiaxin Deng, Shiyao Wang, Yuchen Wang, Jiansong Qi, Liqin Zhao, Guorui Zhou, Gaofeng Meng

    Abstract: Live streaming services are becoming increasingly popular due to real-time interactions and entertainment. Viewers can chat and send comments or virtual gifts to express their preferences for the streamers. Accurately modeling the gifting interaction not only enhances users' experience but also increases streamers' revenue. Previous studies on live streaming gifting prediction treat this task as a…

    Submitted 15 June, 2024; originally announced July 2024.

    Comments: Accepted at KDD 2024

  4. arXiv:2406.19999  [pdf, other]

    cs.CL

    The SIFo Benchmark: Investigating the Sequential Instruction Following Ability of Large Language Models

    Authors: Xinyi Chen, Baohao Liao, Jirui Qi, Panagiotis Eustratiadis, Christof Monz, Arianna Bisazza, Maarten de Rijke

    Abstract: Following multiple instructions is a crucial ability for large language models (LLMs). Evaluating this ability comes with significant challenges: (i) limited coherence between multiple instructions, (ii) positional bias where the order of instructions affects model performance, and (iii) a lack of objectively verifiable tasks. To address these issues, we introduce a benchmark designed to evaluate…

    Submitted 28 June, 2024; originally announced June 2024.

  5. arXiv:2406.14709  [pdf, other]

    cs.CL

    Factual Dialogue Summarization via Learning from Large Language Models

    Authors: Rongxin Zhu, Jey Han Lau, Jianzhong Qi

    Abstract: Factual consistency is an important quality in dialogue summarization. Large language model (LLM)-based automatic text summarization models generate more factually consistent summaries compared to those by smaller pretrained language models, but they face deployment challenges in real-world applications due to privacy or resource constraints. In this paper, we investigate the use of symbolic knowl…

    Submitted 20 June, 2024; originally announced June 2024.

    ACM Class: F.2.2; I.2.7

  6. arXiv:2406.13663  [pdf, other]

    cs.CL cs.AI cs.LG

    Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

    Authors: Jirui Qi, Gabriele Sarti, Raquel Fernández, Arianna Bisazza

    Abstract: Ensuring the verifiability of model answers is a fundamental challenge for retrieval-augmented generation (RAG) in the question answering (QA) domain. Recently, self-citation prompting was proposed to make large language models (LLMs) generate citations to supporting documents along with their answers. However, self-citing LLMs often struggle to match the required format, refer to non-existent sou…

    Submitted 1 July, 2024; v1 submitted 19 June, 2024; originally announced June 2024.

    Comments: Under review. Code and data released at https://github.com/Betswish/MIRAGE

  7. arXiv:2406.08035  [pdf, other]

    cs.CV cs.AI

    LVBench: An Extreme Long Video Understanding Benchmark

    Authors: Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, Ji Qi, Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang

    Abstract: Recent progress in multimodal large language models has markedly enhanced the understanding of short videos (typically under one minute), and several evaluation datasets have emerged accordingly. However, these advancements fall short of meeting the demands of real-world applications such as embodied intelligence for long-term decision-making, in-depth movie reviews and discussions, and live sport…

    Submitted 12 June, 2024; originally announced June 2024.

  8. arXiv:2406.07925  [pdf, other]

    cs.DC

    FDLoRA: Personalized Federated Learning of Large Language Model via Dual LoRA Tuning

    Authors: Jiaxing QI, Zhongzhi Luan, Shaohan Huang, Carol Fung, Hailong Yang, Depei Qian

    Abstract: Large language models (LLMs) have emerged as important components across various fields, yet their training requires substantial computation resources and abundant labeled data. This poses a challenge for robustly training LLMs for individual users (clients). To tackle this challenge, the intuitive idea is to introduce federated learning (FL), which can collaboratively train models on distributed pri…

    Submitted 12 June, 2024; originally announced June 2024.

  9. arXiv:2406.03150  [pdf, other]

    cs.LG cs.CV

    Sample-specific Masks for Visual Reprogramming-based Prompting

    Authors: Chengyi Cai, Zesheng Ye, Lei Feng, Jianzhong Qi, Feng Liu

    Abstract: Visual reprogramming (VR) is a prompting technique that aims to re-purpose a pre-trained model (e.g., a classifier on ImageNet) to target tasks (e.g., medical data prediction) by learning a small-scale pattern added into input images instead of tuning considerable parameters within the model. The location of the pattern within input samples is usually determined by a pre-defined mask shared across…

    Submitted 5 June, 2024; originally announced June 2024.

  10. arXiv:2405.03989  [pdf]

    cs.DB

    A Method for Parsing and Vectorization of Semi-structured Data used in Retrieval Augmented Generation

    Authors: Hang Yang, Jing Guo, Jianchuan Qi, Jinliang Xie, Si Zhang, Siqi Yang, Nan Li, Ming Xu

    Abstract: This paper presents a novel method for parsing and vectorizing semi-structured data to enhance the functionality of Retrieval-Augmented Generation (RAG) within Large Language Models (LLMs). We developed a comprehensive pipeline for converting various data formats into .docx, enabling efficient parsing and structured data extraction. The core of our methodology involves the construction of a vector…

    Submitted 8 May, 2024; v1 submitted 7 May, 2024; originally announced May 2024.

    Comments: 20 pages, 4 figures, 5 tables

  11. arXiv:2404.05091  [pdf, other]

    cs.CL

    MM-MATH: Advancing Multimodal Math Evaluation with Process Evaluation and Fine-grained Classification

    Authors: Kai Sun, Yushi Bai, Ji Qi, Lei Hou, Juanzi Li

    Abstract: To advance the evaluation of multimodal math reasoning in large multimodal models (LMMs), this paper introduces a novel benchmark, MM-MATH. MM-MATH consists of 5,929 open-ended middle school math problems with visual contexts, with fine-grained classification across difficulty, grade level, and knowledge points. Unlike existing benchmarks relying on binary answer comparison, MM-MATH incorporates b…

    Submitted 2 July, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

  12. arXiv:2403.18282  [pdf, other]

    cs.CV

    SGDM: Static-Guided Dynamic Module Make Stronger Visual Models

    Authors: Wenjie Xing, Zhenchao Cui, Jing Qi

    Abstract: The spatial attention mechanism has been widely used to improve object detection performance. However, its operation is currently limited to static convolutions lacking content-adaptive features. This paper innovatively approaches from the perspective of dynamic convolution. We propose Razor Dynamic Convolution (RDConv) to address the two flaws in dynamic weight convolution, making it hard to imple…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 16 pages, 4 figures

  13. arXiv:2403.17448  [pdf, other]

    cs.RO

    Adaptive Line-Of-Sight guidance law based on vector fields path following for underactuated unmanned surface vehicle

    Authors: Jie Qi, Ronghua Wanga, Nailong Wu

    Abstract: The focus of this paper is to develop a methodology that enables an unmanned surface vehicle (USV) to efficiently track a planned path. The introduction of a vector field-based adaptive line-of-sight guidance law (VFALOS) for accurate trajectory tracking and minimizing the overshoot response time during USV tracking of curved paths improves the overall line-of-sight (LOS) guidance method. These im…

    Submitted 5 April, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

  14. arXiv:2403.10309  [pdf, other]

    cs.RO

    Revolutionizing Packaging: A Robotic Bagging Pipeline with Constraint-aware Structure-of-Interest Planning

    Authors: Jiaming Qi, Peng Zhou, Pai Zheng, Hongmin Wu, Chenguang Yang, David Navarro-Alarcon, Jia Pan

    Abstract: Bagging operations, common in packaging and assisted living applications, are challenging due to a bag's complex deformable properties. To address this, we develop a robotic system for automated bagging tasks using an adaptive structure-of-interest (SOI) manipulation approach. Our method relies on real-time visual feedback to dynamically adjust manipulation without requiring prior knowledge of bag…

    Submitted 15 March, 2024; originally announced March 2024.

  15. arXiv:2403.02576  [pdf, other]

    cs.DL cs.LG cs.SI

    AceMap: Knowledge Discovery through Academic Graph

    Authors: Xinbing Wang, Luoyi Fu, Xiaoying Gan, Ying Wen, Guanjie Zheng, Jiaxin Ding, Liyao Xiang, Nanyang Ye, Meng Jin, Shiyu Liang, Bin Lu, Haiwen Wang, Yi Xu, Cheng Deng, Shao Zhang, Huquan Kang, Xingli Wang, Qi Li, Zhixin Guo, Jiexing Qi, Pan Liu, Yuyang Ren, Lyuwen Wu, Jungang Yang, Jianping Zhou, et al. (1 additional author not shown)

    Abstract: The exponential growth of scientific literature requires effective management and extraction of valuable insights. While existing scientific search engines excel at delivering search results based on relational databases, they often neglect the analysis of collaborations between scientific entities and the evolution of ideas, as well as the in-depth analysis of content within scientific publicatio…

    Submitted 14 April, 2024; v1 submitted 4 March, 2024; originally announced March 2024.

    Comments: Technical Report for AceMap (https://www.acemap.info)

  16. arXiv:2403.01799  [pdf, other]

    cs.CV

    Superpixel Graph Contrastive Clustering with Semantic-Invariant Augmentations for Hyperspectral Images

    Authors: Jianhan Qi, Yuheng Jia, Hui Liu, Junhui Hou

    Abstract: Hyperspectral image (HSI) clustering is an important but challenging task. The state-of-the-art (SOTA) methods usually rely on superpixels; however, they do not fully utilize the spatial and spectral information in HSI 3-D structure, and their optimization targets are not clustering-oriented. In this work, we first use 3-D and 2-D hybrid convolutional neural networks to extract the high-order spa…

    Submitted 4 March, 2024; originally announced March 2024.

  17. arXiv:2403.00799  [pdf, other]

    cs.CL cs.AI cs.LG

    An Empirical Study of Data Ability Boundary in LLMs' Math Reasoning

    Authors: Zui Chen, Yezeng Chen, Jiaqi Han, Zhijie Huang, Ji Qi, Yi Zhou

    Abstract: Large language models (LLMs) are displaying emergent abilities for math reasoning tasks, and there is growing attention on enhancing the ability of open-source LLMs through supervised fine-tuning (SFT). In this paper, we aim to explore a general data strategy for supervised data to help optimize and expand math reasoning ability. Firstly, we determine the ability boundary of reasoning paths augment…

    Submitted 23 February, 2024; originally announced March 2024.

    Comments: 33 pages, 5 figures

  18. arXiv:2402.04798  [pdf, other]

    cs.CV

    Spiking-PhysFormer: Camera-Based Remote Photoplethysmography with Parallel Spike-driven Transformer

    Authors: Mingxuan Liu, Jiankai Tang, Haoxiang Li, Jiahao Qi, Siwei Li, Kegang Wang, Yuntao Wang, Hong Chen

    Abstract: Artificial neural networks (ANNs) can help camera-based remote photoplethysmography (rPPG) in measuring cardiac activity and physiological signals from facial videos, such as pulse wave, heart rate and respiration rate with better accuracy. However, most existing ANN-based methods require substantial computing resources, which poses challenges for effective deployment on mobile devices. Spiking ne…

    Submitted 9 February, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

    Comments: Mingxuan Liu and Jiankai Tang are co-first authors of the article

  19. arXiv:2402.04236  [pdf, other]

    cs.CV cs.CL

    CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations

    Authors: Ji Qi, Ming Ding, Weihan Wang, Yushi Bai, Qingsong Lv, Wenyi Hong, Bin Xu, Lei Hou, Juanzi Li, Yuxiao Dong, Jie Tang

    Abstract: Vision-Language Models (VLMs) have demonstrated their broad effectiveness thanks to extensive training in aligning visual instructions to responses. However, such training of conclusive alignment leads models to ignore essential visual reasoning, further resulting in failures in meticulous visual problems and unfaithful responses. Drawing inspiration from human cognition in solving visual problems…

    Submitted 22 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

    Comments: 19 pages, 9 figures

  20. arXiv:2401.18058  [pdf, other]

    cs.CL cs.LG

    LongAlign: A Recipe for Long Context Alignment of Large Language Models

    Authors: Yushi Bai, Xin Lv, Jiajie Zhang, Yuze He, Ji Qi, Lei Hou, Jie Tang, Yuxiao Dong, Juanzi Li

    Abstract: Extending large language models to effectively handle long contexts requires instruction fine-tuning on input sequences of similar length. To address this, we present LongAlign -- a recipe of the instruction data, training, and evaluation for long context alignment. First, we construct a long instruction-following dataset using Self-Instruct. To ensure the data diversity, it covers a broad range o…

    Submitted 31 January, 2024; originally announced January 2024.

  21. arXiv:2401.12436  [pdf, other]

    cs.LG cs.CR

    Wasserstein Differential Privacy

    Authors: Chengyi Yang, Jiayin Qi, Aimin Zhou

    Abstract: Differential privacy (DP) has achieved remarkable results in the field of privacy-preserving machine learning. However, existing DP frameworks do not satisfy all the conditions for becoming metrics, which prevents them from deriving better basic private properties and leads to exaggerated values on privacy budgets. We propose Wasserstein differential privacy (WDP), an alternative DP framework to m…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  22. arXiv:2401.11818  [pdf, other]

    cs.MM

    MInD: Improving Multimodal Sentiment Analysis via Multimodal Information Disentanglement

    Authors: Weichen Dai, Xingyu Li, Pengbo Hu, Zeyu Wang, Ji Qi, Jianlin Peng, Yi Zhou

    Abstract: Learning effective joint representations has been a central task in multimodal sentiment analysis. Previous methods focus on leveraging the correlations between different modalities and enhancing performance through sophisticated fusion techniques. However, challenges still exist due to the inherent heterogeneity of distinct modalities, which may lead to a distributional gap, impeding the full explo…

    Submitted 22 January, 2024; originally announced January 2024.

  23. arXiv:2401.11432  [pdf, other]

    cs.RO

    Bimanual Deformable Bag Manipulation Using a Structure-of-Interest Based Latent Dynamics Model

    Authors: Peng Zhou, Pai Zheng, Jiaming Qi, Chenxi Li, Chenguang Yang, David Navarro-Alarcon, Jia Pan

    Abstract: The manipulation of deformable objects by robotic systems presents a significant challenge due to their complex and infinite-dimensional configuration spaces. This paper introduces a novel approach to Deformable Object Manipulation (DOM) by emphasizing the identification and manipulation of Structures of Interest (SOIs) in deformable fabric bags. We propose a bimanual manipulation framework that l…

    Submitted 21 January, 2024; originally announced January 2024.

  24. arXiv:2401.10518  [pdf, other]

    cs.LG

    Spatial-temporal Forecasting for Regions without Observations

    Authors: Xinyu Su, Jianzhong Qi, Egemen Tanin, Yanchuan Chang, Majid Sarvi

    Abstract: Spatial-temporal forecasting plays an important role in many real-world applications, such as traffic forecasting, air pollutant forecasting, crowd-flow forecasting, and so on. State-of-the-art spatial-temporal forecasting models take data-driven approaches and rely heavily on data availability. Such models suffer from accuracy issues when data is incomplete, which is common in reality due to the…

    Submitted 19 January, 2024; originally announced January 2024.

    Comments: Accepted by EDBT2024

  25. arXiv:2401.02992  [pdf]

    cs.CL cs.AI

    Advanced Unstructured Data Processing for ESG Reports: A Methodology for Structured Transformation and Enhanced Analysis

    Authors: Jiahui Peng, Jing Gao, Xin Tong, Jing Guo, Hang Yang, Jianchuan Qi, Ruiqiao Li, Nan Li, Ming Xu

    Abstract: In the evolving field of corporate sustainability, analyzing unstructured Environmental, Social, and Governance (ESG) reports is a complex challenge due to their varied formats and intricate content. This study introduces an innovative methodology utilizing the "Unstructured Core Library", specifically tailored to address these challenges by transforming ESG reports into structured, analyzable for…

    Submitted 4 January, 2024; originally announced January 2024.

  26. arXiv:2401.01577  [pdf, other]

    cs.CV

    Test-Time Personalization with Meta Prompt for Gaze Estimation

    Authors: Huan Liu, Julia Qi, Zhenhao Li, Mohammad Hassanpour, Yang Wang, Konstantinos Plataniotis, Yuanhao Yu

    Abstract: Despite the recent remarkable achievement in gaze estimation, efficient and accurate personalization of gaze estimation without labels is a practical problem but rarely touched on in the literature. To achieve efficient personalization, we take inspiration from the recent advances in Natural Language Processing (NLP) by updating a negligible number of parameters, "prompts", at test time. Speci…

    Submitted 12 March, 2024; v1 submitted 3 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024

  27. arXiv:2312.17259  [pdf]

    cs.CL cs.AI

    Empowering Working Memory for Large Language Model Agents

    Authors: Jing Guo, Nan Li, Jianchuan Qi, Hang Yang, Ruiqiao Li, Yuzhen Feng, Si Zhang, Ming Xu

    Abstract: Large language models (LLMs) have achieved impressive linguistic capabilities. However, a key limitation persists in their lack of human-like memory faculties. LLMs exhibit constrained memory retention across sequential interactions, hindering complex reasoning. This paper explores the potential of applying cognitive psychology's working memory frameworks to enhance LLM architecture. The limitati…

    Submitted 28 May, 2024; v1 submitted 22 December, 2023; originally announced December 2023.

  28. arXiv:2312.16355  [pdf, other]

    cs.DB

    Efficient Cost Modeling of Space-filling Curves

    Authors: Guanli Liu, Lars Kulik, Christian S. Jensen, Tianyi Li, Jianzhong Qi

    Abstract: A space-filling curve (SFC) maps points in a multi-dimensional space to one-dimensional points by discretizing the multi-dimensional space into cells and imposing a linear order on the cells. This way, an SFC enables the indexing of multi-dimensional data using a one-dimensional index such as a B+-tree. Choosing an appropriate SFC is crucial, as different SFCs have different effects on query perfo…

    Submitted 26 December, 2023; originally announced December 2023.

  29. arXiv:2312.05402  [pdf, other]

    cs.CL

    Towards Controlled Table-to-Text Generation with Scientific Reasoning

    Authors: Zhixin Guo, Jianping Zhou, Jiexing Qi, Mingxuan Yan, Ziwei He, Guanjie Zheng, Zhouhan Lin, Xinbing Wang, Chenghu Zhou

    Abstract: The sheer volume of scientific experimental results and complex technical statements, often presented in tabular formats, presents a formidable barrier to individuals acquiring preferred information. The realms of scientific reasoning and content generation that adhere to user preferences encounter distinct challenges. In this work, we present a new task for generating fluent and logical descripti…

    Submitted 8 December, 2023; originally announced December 2023.

  30. arXiv:2312.04606  [pdf, other]

    cs.LG cs.DB

    Urban Region Representation Learning with Attentive Fusion

    Authors: Fengze Sun, Jianzhong Qi, Yanchuan Chang, Xiaoliang Fan, Shanika Karunasekera, Egemen Tanin

    Abstract: An increasing number of related urban data sources have brought forth novel opportunities for learning urban region representations, i.e., embeddings. The embeddings describe latent features of urban regions and enable discovering similar regions for urban planning applications. Existing methods learn an embedding for a region using every different type of region feature data, and subsequently fus…

    Submitted 26 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

  31. arXiv:2311.03557  [pdf, other]

    cs.LG cs.CV eess.IV

    Spatio-Temporal Similarity Measure based Multi-Task Learning for Predicting Alzheimer's Disease Progression using MRI Data

    Authors: Xulong Wang, Yu Zhang, Menghui Zhou, Tong Liu, Jun Qi, Po Yang

    Abstract: Identifying and utilising various biomarkers for tracking Alzheimer's disease (AD) progression has received much recent attention and helps clinicians make prompt decisions. Traditional progression models focus on extracting morphological biomarkers in regions of interest (ROIs) from MRI/PET images, such as regional average cortical thickness and regional volume. They are effective…

    Submitted 6 November, 2023; originally announced November 2023.

  32. arXiv:2311.03079  [pdf, other]

    cs.CV

    CogVLM: Visual Expert for Pretrained Language Models

    Authors: Weihan Wang, Qingsong Lv, Wenmeng Yu, Wenyi Hong, Ji Qi, Yan Wang, Junhui Ji, Zhuoyi Yang, Lei Zhao, Xixuan Song, Jiazheng Xu, Bin Xu, Juanzi Li, Yuxiao Dong, Ming Ding, Jie Tang

    Abstract: We introduce CogVLM, a powerful open-source visual language foundation model. Different from the popular shallow alignment method which maps image features into the input space of the language model, CogVLM bridges the gap between the frozen pretrained language model and image encoder by a trainable visual expert module in the attention and FFN layers. As a result, CogVLM enables deep fusion of vision…

    Submitted 4 February, 2024; v1 submitted 6 November, 2023; originally announced November 2023.

  33. arXiv:2311.00960  [pdf, other]

    cs.DB

    Trajectory Similarity Measurement: An Efficiency Perspective

    Authors: Yanchuan Chang, Egemen Tanin, Gao Cong, Christian S. Jensen, Jianzhong Qi

    Abstract: Trajectories that capture object movement have numerous applications, in which similarity computation between trajectories often plays a key role. Traditionally, the similarity between two trajectories is quantified by means of heuristic measures, e.g., Hausdorff or ERP, that operate directly on the trajectories. In contrast, recent studies exploit deep learning to map trajectories to d-dimensiona…

    Submitted 11 June, 2024; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: Accepted by VLDB 2024

  34. arXiv:2310.11466  [pdf, other]

    cs.LG cs.AI q-bio.QM

    Protein 3D Graph Structure Learning for Robust Structure-based Protein Property Prediction

    Authors: Yufei Huang, Siyuan Li, Jin Su, Lirong Wu, Odin Zhang, Haitao Lin, Jingqi Qi, Zihan Liu, Zhangyang Gao, Yuyang Liu, Jiangbin Zheng, Stan. ZQ. Li

    Abstract: Protein structure-based property prediction has emerged as a promising approach for various biological tasks, such as protein function prediction and sub-cellular location estimation. The existing methods highly rely on experimental protein structure data and fail in scenarios where these data are unavailable. Predicted protein structures from AI tools (e.g., AlphaFold2) were utilized as alternati…

    Submitted 19 October, 2023; v1 submitted 14 October, 2023; originally announced October 2023.

  35. Computational synthesis of locomotive soft robots by topology optimization

    Authors: Hiroki Kobayashi, Farzad Gholami, S. Macrae Montgomery, Masato Tanaka, Liang Yue, Changyoung Yuhn, Yuki Sato, Atsushi Kawamoto, H. Jerry Qi, Tsuyoshi Nomura

    Abstract: Locomotive soft robots (SoRos) have gained prominence due to their adaptability. Traditional locomotive SoRo design is based on limb structures inspired by biological organisms and requires human intervention. Evolutionary robotics, designed using evolutionary algorithms (EAs), have shown potential for automatic design. However, EA-based methods face the challenge of high computational cost when c…

    Submitted 24 July, 2024; v1 submitted 17 October, 2023; originally announced October 2023.

    Comments: 36 total pages (27 pages, 9 supplementary pages), 5 Figures, 9 Supplementary figures. 1 Supplementary table

    Journal ref: Sci. Adv. 10, eadn6129 (2024)

  36. arXiv:2310.10590  [pdf, other]

    cs.CL

    Mastering the Task of Open Information Extraction with Large Language Models and Consistent Reasoning Environment

    Authors: Ji Qi, Kaixuan Ji, Xiaozhi Wang, Jifan Yu, Kaisheng Zeng, Lei Hou, Juanzi Li, Bin Xu

    Abstract: Open Information Extraction (OIE) aims to extract objective structured knowledge from natural texts, which has attracted growing attention to build dedicated models with human experience. As the large language models (LLMs) have exhibited remarkable in-context learning capabilities, a question arises as to whether the task of OIE can be effectively tackled with this paradigm. In this paper, we exp…

    Submitted 16 October, 2023; originally announced October 2023.

  37. arXiv:2310.10586  [pdf, other]

    cs.CV cs.CL

    VidCoM: Fast Video Comprehension through Large Language Models with Multimodal Tools

    Authors: Ji Qi, Kaixuan Ji, Jifan Yu, Duokang Wang, Bin Xu, Lei Hou, Juanzi Li

    Abstract: Building models that comprehend videos and respond to specific user instructions is a practical and challenging topic, as it requires mastery of both vision understanding and knowledge reasoning. Compared to language and image modalities, training efficiency remains a serious problem as existing studies train models on massive sparse videos paired with brief descriptions. In this paper, we introduc…

    Submitted 27 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

  38. arXiv:2310.10378  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Cross-Lingual Consistency of Factual Knowledge in Multilingual Language Models

    Authors: Jirui Qi, Raquel Fernández, Arianna Bisazza

    Abstract: Multilingual large-scale Pretrained Language Models (PLMs) have been shown to store considerable amounts of factual knowledge, but large variations are observed across languages. With the ultimate goal of ensuring that users with different language backgrounds obtain consistent feedback from the same model, we study the cross-lingual consistency (CLC) of factual knowledge in various multilingual P…

    Submitted 9 November, 2023; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP2023 main conference. All code and data are released at https://github.com/Betswish/Cross-Lingual-Consistency

  39. arXiv:2310.08026  [pdf, other]

    cs.CV cs.AI

    Beyond Sharing Weights in Decoupling Feature Learning Network for UAV RGB-Infrared Vehicle Re-Identification

    Authors: Xingyue Liu, Jiahao Qi, Chen Chen, Kangcheng Bin, Ping Zhong

    Abstract: Owing to the capacity of performing full-time target search, cross-modality vehicle re-identification (Re-ID) based on unmanned aerial vehicle (UAV) is gaining more attention in both video surveillance and public security. However, this promising and innovative research has not been studied sufficiently due to the data inadequacy issue. Meanwhile, the cross-modality discrepancy and orientation dis…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: 13 pages, 10 figures, 64 citations, submitted to TMM

  40. arXiv:2310.05070   

    cs.CY cs.SI

    CO-ASnet: A Smart Contract Architecture Design based on Blockchain Technology with Active Sensor Networks

    Authors: Feng Liu, Jie Yang, Kun-peng Xu, Cang-long Pu, Jiayin Qi

    Abstract: The influence of opinion leaders impacts different aspects of social finance. How to analyse the utility of opinion leaders' influence in realizing assets on the blockchain and adopt a compliant regulatory scheme is worth exploring and pondering. Taking Musk's call on social media to buy Dogecoin as an example, this paper uses an event study to empirically investigate the phenomenon in which opini…

    Submitted 29 June, 2024; v1 submitted 8 October, 2023; originally announced October 2023.

    Comments: This is a study that remains to be optimised and continued to be updated, and all authors have agreed to request a retraction

    MSC Class: 68-06 ACM Class: C.2.0

  41. arXiv:2310.04965  [pdf, other]

    cs.CL cs.AI cs.LG

    MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks

    Authors: Jingyuan Qi, Minqian Liu, Ying Shen, Zhiyang Xu, Lifu Huang

    Abstract: Automatically generating scripts (i.e. sequences of key steps described in text) from video demonstrations and reasoning about the subsequent steps are crucial for modern AI virtual assistants to guide humans to complete everyday tasks, especially unfamiliar ones. However, current methods for generative script learning rely heavily on well-structured preceding steps described in text and/or ima…

    Submitted 18 January, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

    Comments: Accepted by AAAI 2024. 11 pages, 9 figures, 4 tables

  42. MaaSDB: Spatial Databases in the Era of Large Language Models (Vision Paper)

    Authors: Jianzhong Qi, Zuqing Li, Egemen Tanin

    Abstract: Large language models (LLMs) are advancing rapidly. Such models have demonstrated strong capabilities in learning from large-scale (unstructured) text data and answering user queries. Users do not need to be experts in structured query languages to interact with systems built upon such models. This provides great opportunities to reduce the barrier of information retrieval for the general public.…

    Submitted 29 September, 2023; originally announced September 2023.

    Comments: Accepted to appear in ACM SIGSPATIAL 2023

  43. arXiv:2309.11244  [pdf, other]

    cs.RO

    Integrating Visual Foundation Models for Enhanced Robot Manipulation and Motion Planning: A Layered Approach

    Authors: Chen Yang, Peng Zhou, Jiaming Qi

    Abstract: This paper presents a novel layered framework that integrates visual foundation models to improve robot manipulation tasks and motion planning. The framework consists of five layers: Perception, Cognition, Planning, Execution, and Learning. Using visual foundation models, we enhance the robot's perception of its environment, enabling more efficient task understanding and accurate motion planning.…

    Submitted 20 September, 2023; originally announced September 2023.

    Comments: 3 pages, 2 figures, IEEE Workshop

  44. arXiv:2309.03084  [pdf, other]

    cs.AI cs.GT cs.LG

    Pure Monte Carlo Counterfactual Regret Minimization

    Authors: Ju Qi, Ting Feng, Falun Hei, Zhemei Fang, Yunfeng Luo

    Abstract: Counterfactual Regret Minimization (CFR) and its variants are the best algorithms so far for solving large-scale incomplete information games. However, we believe that there are two problems with CFR: First, matrix multiplication is required in CFR iteration, and the time complexity of one iteration is too high; Secondly, the game characteristics in the real world are different. Just using one CFR…

    Submitted 13 October, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

  45. arXiv:2309.01189  [pdf, other]

    cs.LG cs.AI cs.SE

    LogGPT: Exploring ChatGPT for Log-Based Anomaly Detection

    Authors: Jiaxing Qi, Shaohan Huang, Zhongzhi Luan, Carol Fung, Hailong Yang, Depei Qian

    Abstract: The increasing volume of log data produced by software-intensive systems makes it impractical to analyze them manually. Many deep learning-based methods have been proposed for log-based anomaly detection. These methods face several challenges such as high-dimensional and noisy log data, class imbalance, generalization, and model interpretability. Recently, ChatGPT has shown promising results in va…

    Submitted 3 September, 2023; originally announced September 2023.

  46. arXiv:2308.09658  [pdf, other]

    cs.CL cs.AI cs.CV

    Tree-of-Mixed-Thought: Combining Fast and Slow Thinking for Multi-hop Visual Reasoning

    Authors: Pengbo Hu, Ji Qi, Xingyu Li, Hong Li, Xinqi Wang, Bing Quan, Ruiyu Wang, Yi Zhou

    Abstract: There emerges a promising trend of using large language models (LLMs) to generate code-like plans for complex inference tasks such as visual reasoning. This paradigm, known as LLM-based planning, provides flexibility in problem solving and endows better interpretability. However, current research is mostly limited to basic scenarios of simple questions that can be straightforward answered in a few…

    Submitted 20 August, 2023; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: 16 pages, 1 figure, under review

  47. arXiv:2308.00783  [pdf, other]

    cs.CV

    Hybrid-SORT: Weak Cues Matter for Online Multi-Object Tracking

    Authors: Mingzhan Yang, Guangxin Han, Bin Yan, Wenhua Zhang, Jinqing Qi, Huchuan Lu, Dong Wang

    Abstract: Multi-Object Tracking (MOT) aims to detect and associate all desired objects across frames. Most methods accomplish the task by explicitly or implicitly leveraging strong cues (i.e., spatial and appearance information), which exhibit powerful instance-level discrimination. However, when object occlusion and clustering occur, spatial and appearance information will become ambiguous simultaneously d…

    Submitted 20 January, 2024; v1 submitted 1 August, 2023; originally announced August 2023.

    Comments: Accepted to AAAI 2024

  48. arXiv:2307.12639  [pdf, other]

    cs.SI cs.CL cs.GR cs.LG

    Fake News Detection Through Graph-based Neural Networks: A Survey

    Authors: Shuzhi Gong, Richard O. Sinnott, Jianzhong Qi, Cecile Paris

    Abstract: The popularity of online social networks has enabled rapid dissemination of information. People now can share and consume information much more rapidly than ever before. However, low-quality and/or accidentally/deliberately fake information can also spread rapidly. This can lead to considerable and negative impacts on society. Identifying, labelling and debunking online misinformation as early as…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 18 pages, 3 tables, 7 figures

  49. arXiv:2307.11772  [pdf, other]

    cs.IR cs.CL cs.LG

    AutoAlign: Fully Automatic and Effective Knowledge Graph Alignment enabled by Large Language Models

    Authors: Rui Zhang, Yixin Su, Bayu Distiawan Trisedya, Xiaoyan Zhao, Min Yang, Hong Cheng, Jianzhong Qi

    Abstract: The task of entity alignment between knowledge graphs (KGs) aims to identify every pair of entities from two different KGs that represent the same entity. Many machine learning-based methods have been proposed for this task. However, to our best knowledge, existing methods all require manually crafted seed alignments, which are expensive to obtain. In this paper, we propose the first fully automat…

    Submitted 13 November, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

    Comments: 14 pages, 5 figures, 4 tables, IEEE Transactions on Knowledge and Data Engineering

  50. arXiv:2307.11436  [pdf, other]

    math.OC cs.LG eess.SY math.AP

    Neural Operators for PDE Backstepping Control of First-Order Hyperbolic PIDE with Recycle and Delay

    Authors: Jie Qi, Jing Zhang, Miroslav Krstic

    Abstract: The recently introduced DeepONet operator-learning framework for PDE control is extended from the results for basic hyperbolic and parabolic PDEs to an advanced hyperbolic class that involves delays on both the state and the system output or input. The PDE backstepping design produces gain functions that are outputs of a nonlinear operator, mapping functions on a spatial domain into functions on a…

    Submitted 14 June, 2024; v1 submitted 21 July, 2023; originally announced July 2023.

    Comments: 20 pages

    Journal ref: Systems & Control Letters, 2024