-
Search for $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0h_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (653 additional authors not shown)
Abstract:
Using $(2712.4 \pm 14.3) \times 10^6~ψ$(3686) events collected with the BESIII detector operating at the BEPCII collider, we search for the hadronic transition $h_c \to π^+π^-J/ψ$ via $ψ(3686)\to π^0 h_c$. No significant signal is observed. We set the most stringent upper limits to date on the branching fractions $\mathcal{B}(ψ(3686)\to π^0 h_c)\times\mathcal{B}(h_c\toπ^+π^-J/ψ)$ and $\mathcal{B}(h_c \to π^+π^-J/ψ)$ at the 90$\%$ confidence level, which are determined to be $6.7\times 10^{-7}$ and $9.4 \times10^{-4}$, respectively.
Submitted 30 August, 2024;
originally announced August 2024.
-
Measurement of the Decay $Ξ^{0}\toΛγ$ with Entangled $Ξ^{0}\barΞ^{0}$ Pairs
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
In this Letter, a systematic study of the weak radiative hyperon decay $Ξ^{0}\toΛγ$ at an electron-positron collider using entangled $Ξ^{0}\barΞ^{0}$ pair events is presented. The absolute branching fraction for this decay has been measured for the first time, and is $\left(1.347 \pm 0.066_{\mathrm{stat.}}\pm0.054_{\mathrm{syst.}}\right)\times 10^{-3}$. The decay asymmetry parameter, which characterizes the effect of parity violation in the decay, is determined to be $-0.741 \pm 0.062_{\mathrm{stat.}}\pm 0.019_{\mathrm{syst.}}$. The obtained results are consistent with the world average values within the uncertainties, offering valuable insights into the underlying mechanism governing the weak radiative hyperon decays. The charge conjugation parity ($CP$) symmetries of branching fraction and decay asymmetry parameter in the decay are also studied. No statistically significant violation of charge conjugation parity symmetry is observed.
Submitted 29 August, 2024; v1 submitted 29 August, 2024;
originally announced August 2024.
-
Seeking the Sufficiency and Necessity Causal Features in Multimodal Representation Learning
Authors:
Boyu Chen,
Junjie Liu,
Zhu Li,
Mengyue Yang
Abstract:
Learning representations with a high Probability of Necessary and Sufficient Causes (PNS) has been shown to enhance deep learning models' ability. This task involves identifying causal features that are both sufficient (guaranteeing the outcome) and necessary (without which the outcome cannot occur). However, current research predominantly focuses on unimodal data, and extending PNS learning to multimodal settings presents significant challenges. The challenges arise as the conditions for PNS identifiability, Exogeneity and Monotonicity, need to be reconsidered in a multimodal context, where sufficient and necessary causal features are distributed across different modalities. To address this, we first propose conceptualizing multimodal representations as comprising modality-invariant and modality-specific components. We then analyze PNS identifiability for each component, while ensuring non-trivial PNS estimation. Finally, we formulate tractable optimization objectives that enable multimodal models to learn high-PNS representations, thereby enhancing their predictive performance. Experiments demonstrate the effectiveness of our method on both synthetic and real-world data.
Submitted 29 August, 2024;
originally announced August 2024.
-
Model-independent determination of the strong-phase difference between $D^0$ and $\bar{D}^0 \to π^+π^-π^+π^-$ decays
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (647 additional authors not shown)
Abstract:
Measurements of the strong-phase difference between $D^0$ and $\bar{D}^0\toπ^+π^-π^+π^-$ are performed in bins of phase space. The study exploits a sample of quantum-correlated $D\bar{D}$ mesons collected by the BESIII experiment in $e^+e^-$ collisions at a center-of-mass energy of 3.773~GeV, corresponding to an integrated luminosity of 2.93~fb$^{-1}$. Here, $D$ denotes a neutral charm meson in a superposition of flavor eigenstates. The reported results are valuable for measurements of the $C\!P$-violating phase $γ$ (also denoted $φ_3$) in $B^\pm \to DK^\pm$, $D \to π^+π^-π^+π^-$ decays, and the binning schemes are designed to provide good statistical sensitivity to this parameter. The expected uncertainty on $γ$ arising from the precision of the strong-phase measurements, when applied to very large samples of $B$-meson decays, is around $1.5^\circ$ or $2^\circ$, depending on the binning scheme. The binned strong-phase parameters are combined to give a value of $F_+^{4π} = 0.746 \pm 0.010 \pm 0.004$ for the $C\!P$-even fraction of $D^0 \to π^+π^-π^+π^-$ decays, which is around 30\% more precise than the previous best measurement of this quantity.
Submitted 29 August, 2024;
originally announced August 2024.
-
Correntropy-Based Improper Likelihood Model for Robust Electrophysiological Source Imaging
Authors:
Yuanhao Li,
Badong Chen,
Zhongxu Hu,
Keita Suzuki,
Wenjun Bai,
Yasuharu Koike,
Okito Yamashita
Abstract:
Bayesian learning provides a unified framework for solving the electrophysiological source imaging task. From this perspective, existing source imaging algorithms utilize the Gaussian assumption for the observation noise to build the likelihood function for Bayesian inference. However, the electromagnetic measurements of brain activity are usually affected by miscellaneous artifacts, leading to a potentially non-Gaussian distribution for the observation noise. Hence the conventional Gaussian likelihood model is a suboptimal choice for the real-world source imaging task. In this study, we aim to solve this problem by proposing a new likelihood model which is robust with respect to non-Gaussian noise. Motivated by the robust maximum correntropy criterion, we propose a new improper distribution model concerning the noise assumption. This new noise distribution is leveraged to structure a robust likelihood function and integrated with hierarchical prior distributions to estimate source activities by variational inference. In particular, score matching is adopted to determine the hyperparameters for the improper likelihood model. A comprehensive performance evaluation is performed to compare the proposed noise assumption to the conventional Gaussian model. Simulation results with a known ground truth show that the proposed method achieves more precise source reconstruction. Results on a real-world dataset from a visual perception task also demonstrate the superiority of our new method. This study provides a new backbone for Bayesian source imaging, which would facilitate its application to real-world noisy brain signals.
Submitted 27 August, 2024;
originally announced August 2024.
-
Channel-wise Influence: Estimating Data Influence for Multivariate Time Series
Authors:
Muyao Wang,
Zeke Xie,
Bo Chen
Abstract:
The influence function, a technique from robust statistics, measures the impact on model parameters or related functions when training data is removed or modified. This effective and valuable post-hoc method allows for studying the interpretability of machine learning models without requiring costly model retraining. It supports extensions such as increasing model performance, improving model generalization, and offering interpretability. Recently, Multivariate Time Series (MTS) analysis has become an important yet challenging task, attracting significant attention. However, there is no preceding research on the influence functions of MTS to shed light on the effects of modifying the channel of training MTS. Given that each channel in an MTS plays a crucial role in its analysis, it is essential to characterize the influence of different channels. To fill this gap, we propose a channel-wise influence function, which is the first method that can estimate the influence of different channels in MTS, utilizing a first-order gradient approximation that leverages the more informative average gradient of the data set. Additionally, we demonstrate how this influence function can be used to estimate the impact of a channel in MTS. Finally, we validate the accuracy and effectiveness of our influence estimation function in critical MTS analysis tasks, such as MTS anomaly detection and MTS forecasting. Extensive experiments on real-world datasets show that the original influence function performs worse than our method and even fails on the channel pruning problem, which demonstrates the superiority and necessity of the channel-wise influence function in MTS analysis tasks.
Submitted 26 August, 2024;
originally announced August 2024.
-
Morphology of molecular clouds at kiloparsec scale in the Milky Way: Shear-induced alignment and vertical confinement
Authors:
Yi-Heng Xie,
Guang-Xing Li,
Bing-Qiu Chen
Abstract:
The shape of the cold interstellar molecular gas is determined by several processes, including self-gravity, tidal force, turbulence, magnetic field, and galactic shear. Based on the 3D dust extinction map derived by Vergely et al., we identify a sample of 550 molecular clouds (MCs) within $3\hspace{0.2em}\rm kpc$ of the solar vicinity in the Galactic disk. Our sample contains clouds whose sizes range from parsec to kiloparsec scales, which enables us to study the effect of Galactic-scale processes, such as shear, on cloud evolution. We find that our sample clouds follow a power-law mass-size relation of $M\propto 32.00\hspace{0.2em}{R_{\rm{max}}}^{1.77}$, $M\propto 20.59\hspace{0.2em}{R_S}^{2.04}$ and $M\propto 14.41\hspace{0.2em}{R_V}^{2.29}$, where $R_{\rm{max}}$ is the major axis-based cloud radius, $R_S$ is the area-based radius, and $R_V$ is the volume-based radius, respectively. These clouds have a mean constant surface density of $\sim 7 \hspace{0.2em} \rm {M_{\odot}pc^{-2}}$, and follow a volume density-size relation of $ρ\propto 2.60\hspace{0.2em}{R_{\rm{max}}}^{-0.55}$. As cloud size increases, their shapes gradually transition from ellipsoidal to disk-like to bar-like structures. Large clouds tend to have a pitch angle of $28^{\circ} - 45^{\circ}$, where the angle is measured with respect to the Galactic tangential direction. These giant clouds also tend to stay parallel to the Galactic disk plane and are confined within the Galactic molecular gas disk. Our results show that large molecular clouds in the Milky Way can be shaped by Galactic shear and confined in the vertical direction by gravity.
Submitted 26 August, 2024;
originally announced August 2024.
-
Disentangled Generative Graph Representation Learning
Authors:
Xinyue Hu,
Zhibin Duan,
Xinyang Liu,
Yuxin Li,
Bo Chen,
Mingyuan Zhou
Abstract:
Recently, generative graph models have shown promising results in learning graph representations through self-supervised methods. However, most existing generative graph representation learning (GRL) approaches rely on random masking across the entire graph, which overlooks the entanglement of learned representations. This oversight results in non-robustness and a lack of explainability. Furthermore, disentangling the learned representations remains a significant challenge and has not been sufficiently explored in GRL research. Based on these insights, this paper introduces DiGGR (Disentangled Generative Graph Representation Learning), a self-supervised learning framework. DiGGR aims to learn latent disentangled factors and utilizes them to guide graph mask modeling, thereby enhancing the disentanglement of learned representations and enabling end-to-end joint learning. Extensive experiments on 11 public datasets for two different graph learning tasks demonstrate that DiGGR consistently outperforms many previous self-supervised methods, verifying the effectiveness of the proposed approach.
Submitted 24 August, 2024;
originally announced August 2024.
-
What Do You Want? User-centric Prompt Generation for Text-to-image Synthesis via Multi-turn Guidance
Authors:
Yilun Liu,
Minggui He,
Feiyu Yao,
Yuhe Ji,
Shimin Tao,
Jingzhou Du,
Duan Li,
Jian Gao,
Li Zhang,
Hao Yang,
Boxing Chen,
Osamu Yoshie
Abstract:
The emergence of text-to-image synthesis (TIS) models has significantly influenced digital image creation by producing high-quality visuals from written descriptions. Yet these models heavily rely on the quality and specificity of textual prompts, posing a challenge for novice users who may not be familiar with TIS-model-preferred prompt writing. Existing solutions relieve this via automatic model-preferred prompt generation from user queries. However, this single-turn manner suffers from limited user-centricity in terms of result interpretability and user interactivity. To address these issues, we propose DialPrompt, a multi-turn dialogue-based TIS prompt generation model that emphasises user-centricity. DialPrompt is designed to follow a multi-turn guidance workflow, where in each round of dialogue the model queries the user about their preferences on possible optimization dimensions before generating the final TIS prompt. To achieve this, we mined 15 essential dimensions for high-quality prompts from advanced users and curated a multi-turn dataset. Through training on this dataset, DialPrompt can improve interpretability by allowing users to understand the correlation between specific phrases and image attributes. Additionally, it enables greater user control and engagement in the prompt generation process, leading to more personalized and visually satisfying outputs. Experiments indicate that DialPrompt achieves a competitive result in the quality of synthesized images, outperforming existing prompt engineering approaches by 5.7%. Furthermore, in our user evaluation, DialPrompt outperforms existing approaches by 46.5% in user-centricity score and is rated 7.9/10 by 19 human reviewers.
Submitted 23 August, 2024;
originally announced August 2024.
-
Unrolled Decomposed Unpaired Learning for Controllable Low-Light Video Enhancement
Authors:
Lingyu Zhu,
Wenhan Yang,
Baoliang Chen,
Hanwei Zhu,
Zhangkai Ni,
Qi Mao,
Shiqi Wang
Abstract:
Obtaining pairs of low/normal-light videos with motion is more challenging than obtaining pairs of still images, which raises technical issues and gives unpaired learning a critical role. This paper explores learning for low-light video enhancement without using paired ground truth. Compared to low-light image enhancement, enhancing low-light videos is more difficult due to the intertwined effects of noise, exposure, and contrast in the spatial domain, jointly with the need for temporal coherence. To address the above challenge, we propose the Unrolled Decomposed Unpaired Network (UDU-Net) for enhancing low-light videos by unrolling the optimization functions into a deep network to decompose the signal into spatial and temporal-related factors, which are updated iteratively. Firstly, we formulate low-light video enhancement as a Maximum A Posteriori estimation (MAP) problem with carefully designed spatial and temporal visual regularization. Then, via unrolling the problem, the optimization of the spatial and temporal constraints can be decomposed into different steps and updated in a stage-wise manner. From the spatial perspective, the designed Intra subnet leverages unpaired prior information from expert photo-retouching skills to adjust the statistical distribution. Additionally, we introduce a novel mechanism that integrates human perception feedback to guide network optimization, suppressing over/under-exposure conditions. Meanwhile, to address the issue from the temporal perspective, the designed Inter subnet fully exploits temporal cues in progressive optimization, which helps achieve improved temporal consistency in enhancement results. Consequently, the proposed method achieves superior performance to state-of-the-art methods in video illumination, noise suppression, and temporal consistency across outdoor and indoor scenes.
Submitted 22 August, 2024;
originally announced August 2024.
-
Toward End-to-End Bearing Fault Diagnosis for Industrial Scenarios with Spiking Neural Networks
Authors:
Yongqi Ding,
Lin Zuo,
Mengmeng Jing,
Kunshan Yang,
Biao Chen,
Yunqian Yu
Abstract:
Spiking neural networks (SNNs) transmit information via low-power binary spikes and have received widespread attention in areas such as computer vision and reinforcement learning. However, there have been very few explorations of SNNs in more practical industrial scenarios. In this paper, we focus on the application of SNNs in bearing fault diagnosis to facilitate the integration of high-performance AI algorithms and real-world industries. In particular, we identify two key limitations of existing SNN fault diagnosis methods: inadequate encoding capacity that necessitates cumbersome data preprocessing, and non-spike-oriented architectures that constrain the performance of SNNs. To alleviate these problems, we propose a Multi-scale Residual Attention SNN (MRA-SNN) to simultaneously improve the efficiency, performance, and robustness of SNN methods. By incorporating a lightweight attention mechanism, we have designed a multi-scale attention encoding module to extract multiscale fault features from vibration signals and encode them as spatio-temporal spikes, eliminating the need for complicated preprocessing. Then, the spike residual attention block extracts high-dimensional fault features and enhances the expressiveness of sparse spikes with the attention mechanism for end-to-end diagnosis. In addition, the performance and robustness of MRA-SNN are further enhanced by introducing the lightweight attention mechanism within the spiking neurons to simulate the biological dendritic filtering effect. Extensive experiments on MFPT and JNU benchmark datasets demonstrate that MRA-SNN significantly outperforms existing methods in terms of accuracy, energy consumption and noise robustness, and is more feasible for deployment in real-world industrial scenarios.
Submitted 17 August, 2024;
originally announced August 2024.
-
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Authors:
Jian Chen,
Vashisth Tiwari,
Ranajoy Sadhukhan,
Zhuoming Chen,
Jinyuan Shi,
Ian En-Hsu Yen,
Beidi Chen
Abstract:
Large Language Models (LLMs) have become more prevalent in long-context applications such as interactive chatbots, document analysis, and agent workflows, but it is challenging to serve long-context requests with low latency and high throughput. Speculative decoding (SD) is a widely used technique to reduce latency without sacrificing performance but the conventional wisdom suggests that its efficacy is limited to small batch sizes. In MagicDec, we show that surprisingly SD can achieve speedup even for a high throughput inference regime for moderate to long sequences. More interestingly, an intelligent drafting strategy can achieve better speedup with increasing batch size based on our rigorous analysis. MagicDec first identifies the bottleneck shifts with increasing batch size and sequence length, and uses these insights to deploy speculative decoding more effectively for high throughput inference. Then, it leverages draft models with sparse KV cache to address the KV bottleneck that scales with both sequence length and batch size. This finding underscores the broad applicability of speculative decoding in long-context serving, as it can enhance throughput and reduce latency without compromising accuracy. For moderate to long sequences, we demonstrate up to 2x speedup for LLaMA-2-7B-32K and 1.84x speedup for LLaMA-3.1-8B when serving batch sizes ranging from 32 to 256 on 8 NVIDIA A100 GPUs. The code is available at https://github.com/Infini-AI-Lab/MagicDec/.
Submitted 23 August, 2024; v1 submitted 20 August, 2024;
originally announced August 2024.
-
Efficient and Deployable Knowledge Infusion for Open-World Recommendations via Large Language Models
Authors:
Yunjia Xi,
Weiwen Liu,
Jianghao Lin,
Muyan Weng,
Xiaoling Cai,
Hong Zhu,
Jieming Zhu,
Bo Chen,
Ruiming Tang,
Yong Yu,
Weinan Zhang
Abstract:
Recommender systems (RSs) play a pervasive role in today's online services, yet their closed-loop nature constrains their access to open-world knowledge. Recently, large language models (LLMs) have shown promise in bridging this gap. However, previous attempts to directly implement LLMs as recommenders fall short in meeting the requirements of industrial RSs, particularly in terms of online inference latency and offline resource efficiency. Thus, we propose REKI to acquire two types of external knowledge about users and items from LLMs. Specifically, we introduce factorization prompting to elicit accurate knowledge reasoning on user preferences and items. We develop individual knowledge extraction and collective knowledge extraction tailored for different scales of scenarios, effectively reducing offline resource consumption. Subsequently, generated knowledge undergoes efficient transformation and condensation into augmented vectors through a hybridized expert-integrated network, ensuring compatibility. The obtained vectors can then be used to enhance any conventional recommendation model. We also ensure efficient inference by preprocessing and prestoring the knowledge from LLMs. Experiments demonstrate that REKI outperforms state-of-the-art baselines and is compatible with lots of recommendation algorithms and tasks. Now, REKI has been deployed to Huawei's news and music recommendation platforms and gained a 7% and 1.99% improvement during the online A/B test.
Submitted 19 August, 2024;
originally announced August 2024.
-
Sliced Maximal Information Coefficient: A Training-Free Approach for Image Quality Assessment Enhancement
Authors:
Kang Xiao,
Xu Wang,
Yulin He,
Baoliang Chen,
Xuelin Shen
Abstract:
Full-reference image quality assessment (FR-IQA) models generally operate by measuring the visual differences between a degraded image and its reference. However, existing FR-IQA models including both the classical ones (e.g., PSNR and SSIM) and deep-learning based measures (e.g., LPIPS and DISTS) still exhibit limitations in capturing the full perception characteristics of the human visual system (HVS). In this paper, instead of designing a new FR-IQA measure, we aim to explore a generalized human visual attention estimation strategy to mimic the process of human quality rating and enhance existing IQA models. In particular, we model human attention generation by measuring the statistical dependency between the degraded image and the reference image. The dependency is captured in a training-free manner by our proposed sliced maximal information coefficient and exhibits surprising generalization in different IQA measures. Experimental results verify that the performance of existing IQA models can be consistently improved when our attention module is incorporated. The source code is available at https://github.com/KANGX99/SMIC.
Submitted 19 August, 2024;
originally announced August 2024.
-
HybridOcc: NeRF Enhanced Transformer-based Multi-Camera 3D Occupancy Prediction
Authors:
Xiao Zhao,
Bo Chen,
Mingyang Sun,
Dingkang Yang,
Youxing Wang,
Xukun Zhang,
Mingcheng Li,
Dongliang Kou,
Xiaoyi Wei,
Lihua Zhang
Abstract:
Vision-based 3D semantic scene completion (SSC) describes autonomous driving scenes through 3D volume representations. However, the occlusion of invisible voxels by scene surfaces poses challenges to current SSC methods in hallucinating refined 3D geometry. This paper proposes HybridOcc, a hybrid 3D volume query proposal method generated by Transformer framework and NeRF representation and refined in a coarse-to-fine SSC prediction framework. HybridOcc aggregates contextual features through the Transformer paradigm based on hybrid query proposals while combining it with NeRF representation to obtain depth supervision. The Transformer branch contains multiple scales and uses spatial cross-attention for 2D to 3D transformation. The newly designed NeRF branch implicitly infers scene occupancy through volume rendering, including visible and invisible voxels, and explicitly captures scene depth rather than generating RGB color. Furthermore, we present an innovative occupancy-aware ray sampling method to orient the SSC task instead of focusing on the scene surface, further improving the overall performance. Extensive experiments on nuScenes and SemanticKITTI datasets demonstrate the effectiveness of our HybridOcc on the SSC task.
Submitted 17 August, 2024;
originally announced August 2024.
-
Search for the rare decay $J/ψ\to γD^0+c.c.$ at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (642 additional authors not shown)
Abstract:
Using $(10087\pm44)\times10^6J/ψ$ events collected with the BESIII detector, we search for the rare decay $J/ψ\to γD^0+c.c.$ for the first time. No obvious signal is observed and the upper limit on the branching fraction is determined to be ${\cal B}(J/ψ\to γD^{0}+c.c.)< 9.1 \times 10^{-8}$ at 90\% confidence level.
Submitted 16 August, 2024;
originally announced August 2024.
-
AIE: Auction Information Enhanced Framework for CTR Prediction in Online Advertising
Authors:
Yang Yang,
Bo Chen,
Chenxu Zhu,
Menghui Zhu,
Xinyi Dai,
Huifeng Guo,
Muyu Zhang,
Zhenhua Dong,
Ruiming Tang
Abstract:
Click-Through Rate (CTR) prediction is a fundamental technique for online advertising recommendation and the complex online competitive auction process also brings many difficulties to CTR optimization. Recent studies have shown that introducing posterior auction information contributes to the performance of CTR prediction. However, existing work doesn't fully capitalize on the benefits of auction information and overlooks the data bias brought by the auction, leading to biased and suboptimal results. To address these limitations, we propose Auction Information Enhanced Framework (AIE) for CTR prediction in online advertising, which delves into the problem of insufficient utilization of auction signals and first reveals the auction bias. Specifically, AIE introduces two pluggable modules, namely Adaptive Market-price Auxiliary Module (AM2) and Bid Calibration Module (BCM), which work collaboratively to excavate the posterior auction signals better and enhance the performance of CTR prediction. Furthermore, the two proposed modules are lightweight, model-agnostic, and friendly to inference latency. Extensive experiments are conducted on a public dataset and an industrial dataset to demonstrate the effectiveness and compatibility of AIE. Besides, a one-month online A/B test in a large-scale advertising platform shows that AIE improves the base model by 5.76% and 2.44% in terms of eCPM and CTR, respectively.
Submitted 14 August, 2024;
originally announced August 2024.
-
RSEA-MVGNN: Multi-View Graph Neural Network with Reliable Structural Enhancement and Aggregation
Authors:
Junyu Chen,
Long Shi,
Badong Chen
Abstract:
Graph Neural Networks (GNNs) have exhibited remarkable efficacy in learning from multi-view graph data. In the framework of multi-view graph neural networks, a critical challenge lies in effectively combining diverse views, where each view has distinct graph structure features (GSFs). Existing approaches to this challenge primarily focus on two aspects: 1) prioritizing the most important GSFs, 2) utilizing GNNs for feature aggregation. However, prioritizing the most important GSFs can lead to limited feature diversity, and existing GNN-based aggregation strategies equally treat each view without considering view quality. To address these issues, we propose a novel Multi-View Graph Neural Network with Reliable Structural Enhancement and Aggregation (RSEA-MVGNN). Firstly, we estimate view-specific uncertainty employing subjective logic. Based on this uncertainty, we design reliable structural enhancement by feature de-correlation algorithm. This approach enables each enhancement to focus on different GSFs, thereby achieving diverse feature representation in the enhanced structure. Secondly, the model learns view-specific beliefs and uncertainty as opinions, which are utilized to evaluate view quality. Based on these opinions, the model enables high-quality views to dominate GNN aggregation, thereby facilitating representation learning. Experimental results conducted on five real-world datasets demonstrate that RSEA-MVGNN outperforms several state-of-the-art GNN-based methods.
Submitted 14 August, 2024;
originally announced August 2024.
-
Vestigial Gapless Boson Density Wave Emerging between $ν= 1/2$ Fractional Chern Insulator and Finite-Momentum Supersolid
Authors:
Hongyu Lu,
Han-Qing Wu,
Bin-Bin Chen,
Zi Yang Meng
Abstract:
The roton-triggered charge-density-wave (CDW) is widely studied in fractional quantum Hall (FQH) and fractional Chern insulator (FCI) systems, and there also exist field theoretical and numerical realizations of a continuous transition from FCI to superfluid (SF). However, the theory and numerical explorations of the transition between FCI and supersolid (SS) are still lacking. In this work, we study the topological flat-band lattice models with $ν$ = 1/2 hard-core bosons, where previous studies have discovered the existence of FCI states and possible direct FCI-SS transitions. While the FCI is robust, we find the direct FCI-SS transition is absent, and there exist more intriguing scenarios. In the case of the checkerboard lattice, we find an intermediate gapless CDW state without SF, sandwiched between FCI and SS. This novel state is triggered by the roton instability in FCI and it further continuously brings about the intertwined finite-momentum SF fluctuation when the CDW order is strong enough, eventually transitioning into an unconventional finite-momentum SS state. The intermediate gapless CDW state is a vestige of the SS state, since the increasing quantum fluctuation melts only the Larkin-Ovchinnikov-type SF order in SS but its (secondary) product -- the CDW order -- survives. On the honeycomb lattice, we find no evidence of SS, but discover an interesting sequence of FCI-Solid I-Solid II transitions, with both solids incompressible. Moreover, in contrast to previous single-roton condensation, this sequence of FCI-Solid I-Solid II transitions is triggered by the softening of multi-roton modes in FCI. Considering the intertwined wave vectors of the CDW orders, Solid I is a vestige of Solid II. Our work provides a new horizon not only for the quantum phase transitions in FCI but also for the intertwined orders and gapless states in bosonic systems, which will inspire future studies.
Submitted 13 August, 2024;
originally announced August 2024.
-
Search for $η_c(2S)\toωω$ and $ωφ$ decays and measurements of $χ_{cJ}\toωω$ and $ωφ$ in $ψ(2S)$ radiative processes
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $(2712\pm 14)$ $\times$ 10$^{6}$ $ψ(2S)$ events collected with the BESIII detector at the BEPCII collider, we search for the decays $η_{c}(2S)\toωω$ and $η_{c}(2S)\toωφ$ via the process $ψ(2S)\toγη_{c}(2S)$. Evidence of $η_{c}(2S)\toωω$ is found with a statistical significance of $3.2σ$. The branching fraction is measured to be $\mathcal{B}(η_{c}(2S)\toωω)=(5.65\pm3.77(\rm stat.)\pm5.32(\rm syst.))\times10^{-4}$. No statistically significant signal is observed for the decay $η_{c}(2S)\toωφ$. The upper limit of the branching fraction at the 90\% confidence level is determined to be $\mathcal{B}(ψ(2S)\toγη_{c}(2S),η_{c}(2S)\toωφ)<2.24\times 10^{-7}$. We also update the branching fractions of $χ_{cJ}\to ωω$ and $χ_{cJ}\toωφ$ decays via the $ψ(2S)\toγχ_{cJ}$ transition. The branching fractions are determined to be $\mathcal{B}(χ_{c0}\toωω)=(10.63\pm0.11\pm0.46)\times 10^{-4}$, $\mathcal{B}(χ_{c1}\toωω)=(6.39\pm0.07\pm0.29)\times 10^{-4}$, $\mathcal{B}(χ_{c2}\toωω)=(8.50\pm0.08\pm0.38)\times 10^{-4}$, $\mathcal{B}(χ_{c0}\toωφ)=(1.18\pm0.03\pm0.05)\times 10^{-4}$, $\mathcal{B}(χ_{c1}\toωφ)=(2.03\pm0.15\pm0.12)\times 10^{-5}$, and $\mathcal{B}(χ_{c2}\toωφ)=(9.37\pm1.07\pm0.59)\times 10^{-6}$, where the first uncertainties are statistical and the second are systematic.
Submitted 13 August, 2024;
originally announced August 2024.
-
Observation of vortex stripes in UTe$_2$
Authors:
Y. F. Wang,
H. X. Yao,
T. Winyard,
Christopher Broyles,
Shannon Gould,
Q. S. He,
P. H. Zhang,
K. Z. Yao,
J. J. Zhu,
B. K. Xiang,
K. Y. Liang,
Z. J. Li,
B. R. Chen,
Q. Z. Zhou,
D. F. Agterberg,
E. Babaev,
S. Ran,
Y. H. Wang
Abstract:
Quantum vortices are fundamentally important for the properties of superconductors. In conventional type-II superconductors, they determine the magnetic response of the system and tend to form regular lattices. UTe$_2$ is a recently discovered heavy fermion superconductor exhibiting many anomalous macroscopic behaviors. However, the question of whether it has a multicomponent order parameter remains open. Here, we study the magnetic properties of UTe$_2$ by employing scanning superconducting quantum interference device microscopy. We find vortex behavior which is very different from that in ordinary superconductors. We imaged vortices generated by cooling in a magnetic field applied along different crystalline directions. While a small out-of-plane magnetic field produces typical isolated vortices, a higher field generates vortex stripe patterns which evolve with vortex density. The stripes form at different locations and along different directions in the surface plane when the vortices are crystallized along the crystalline b or c axes. The behavior is reproduced by our simulation based on an anisotropic two-component order parameter. This study shows that UTe$_2$ has a nontrivial disparity of multiple length scales, placing constraints on multicomponent superconductivity. The tendency of vortex stripe formation and their control by external field may be useful in fluxonics applications.
Submitted 12 August, 2024;
originally announced August 2024.
-
Observation of single-quantum vortex splitting in the Ba$_{1-x}$K$_x$Fe$_2$As$_2$ superconductor
Authors:
Q. Z. Zhou,
B. R. Chen,
B. K. Xiang,
I. Timoshuk,
J. Garaud,
Y. Li,
K. Y. Liang,
Q. S. He,
Z. J. Li,
P. H. Zhang,
K. Z. Yao,
H. X. Yao,
E. Babaev,
V. Grinenko,
Y. H. Wang
Abstract:
Since their theoretical discovery more than a half-century ago, vortices observed in bulk superconductors have carried a quantized value of magnetic flux determined only by fundamental constants. A recent experiment reported 'unquantized' quantum vortices carrying the same fraction of flux quantum in Ba$_{0.23}$K$_{0.77}$Fe$_2$As$_2$ in a small temperature range below its superconducting critical temperature ($T_C$). Here, we use scanning superconducting quantum interference device (sSQUID) microscopy with improved sensitivity to investigate the genesis of fractional vortices in Ba$_{0.23}$K$_{0.77}$Fe$_2$As$_2$. We report the direct observation of a single-flux quantum vortex splitting into two different fractions with increasing temperature. The flux of the two fractions has opposite dependence on temperature, while the total flux sums up to one flux quantum despite their spatial separation. Overall, our study shows the existence of different fractional vortices and their stability in temperature ranging from 0.1 to 0.99 $T_C$. Besides the implications of this observation for the fundamental question of quantum vorticity, the discovery of these objects paves the way for the new platform for anyon quasiparticles and applications for fractional fluxonics.
Submitted 27 August, 2024; v1 submitted 11 August, 2024;
originally announced August 2024.
-
Divergence Maximizing Linear Projection for Supervised Dimension Reduction
Authors:
Biao Chen,
Joshua Kortje
Abstract:
This paper proposes two linear projection methods for supervised dimension reduction using only the first and second-order statistics. The methods, each catering to a different parameter regime, are derived under the general Gaussian model by maximizing the Kullback-Leibler divergence between the two classes in the projected sample for a binary classification problem. They subsume existing linear projection approaches developed under simplifying assumptions of Gaussian distributions, such as these distributions might share an equal mean or covariance matrix. As a by-product, we establish that the multi-class linear discriminant analysis, a celebrated method for classification and supervised dimension reduction, is provably optimal for maximizing pairwise Kullback-Leibler divergence when the Gaussian populations share an identical covariance matrix. For the case when the Gaussian distributions share an equal mean, we establish conditions under which the optimal subspace remains invariant regardless of how the Kullback-Leibler divergence is defined, despite the asymmetry of the divergence measure itself. Such conditions encompass the classical case of signal plus noise, where both the signal and noise have zero mean and arbitrary covariance matrices. Experiments are conducted to validate the proposed solutions, demonstrate their superior performance over existing alternatives, and illustrate the procedure for selecting the appropriate linear projection solution.
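The objective described in this abstract, the Kullback-Leibler divergence between two Gaussian classes after a linear projection, can be illustrated with a minimal sketch. This is not the authors' method: the function names `gaussian_kl` and `projected_kl` and the projection matrix `A` are hypothetical, and the closing check only covers the two-class, equal-covariance case, where the classical LDA direction $Σ^{-1}(μ_1-μ_0)$ preserves the full-space divergence.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for multivariate Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    # slogdet returns (sign, log|det|); only the log-determinant is needed here
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (np.trace(cov1_inv @ cov0)
                  + diff @ cov1_inv @ diff
                  - k + logdet1 - logdet0)

def projected_kl(A, mu0, cov0, mu1, cov1):
    """KL divergence between the two classes after the projection y = A x."""
    return gaussian_kl(A @ mu0, A @ cov0 @ A.T, A @ mu1, A @ cov1 @ A.T)

# Illustrative (made-up) parameters with a shared covariance matrix.
mu0 = np.zeros(3)
mu1 = np.array([1.0, -2.0, 0.5])
cov = np.array([[2.0, 0.3, 0.0],
                [0.3, 1.0, 0.2],
                [0.0, 0.2, 1.5]])

# LDA direction for the equal-covariance case: w = cov^{-1} (mu1 - mu0).
w = np.linalg.solve(cov, mu1 - mu0)
full_kl = gaussian_kl(mu0, cov, mu1, cov)
lda_kl = projected_kl(w[None, :], mu0, cov, mu1, cov)
print(full_kl, lda_kl)
```

Since a projection cannot increase the divergence, a one-dimensional projection that attains the full-space value is optimal in this special case; the general unequal-mean, unequal-covariance regimes treated in the paper are not covered by this sketch.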
Submitted 11 August, 2024;
originally announced August 2024.
-
A Decoding Acceleration Framework for Industrial Deployable LLM-based Recommender Systems
Authors:
Yunjia Xi,
Hangyu Wang,
Bo Chen,
Jianghao Lin,
Menghui Zhu,
Weiwen Liu,
Ruiming Tang,
Weinan Zhang,
Yong Yu
Abstract:
Recently, increasing attention has been paid to LLM-based recommender systems, but their deployment is still under exploration in the industry. Most deployments utilize LLMs as feature enhancers, generating augmentation knowledge in the offline stage. However, in recommendation scenarios, involving numerous users and items, even offline generation with LLMs consumes considerable time and resources. This generation inefficiency stems from the autoregressive nature of LLMs, and a promising direction for acceleration is speculative decoding, a Draft-then-Verify paradigm that increases the number of generated tokens per decoding step. In this paper, we first identify that recommendation knowledge generation is suitable for retrieval-based speculative decoding. Then, we discern two characteristics: (1) extensive items and users in RSs bring retrieval inefficiency, and (2) RSs exhibit high diversity tolerance for text generated by LLMs. Based on the above insights, we propose a Decoding Acceleration Framework for LLM-based Recommendation (dubbed DARE), with Customized Retrieval Pool to improve retrieval efficiency and Relaxed Verification to increase the acceptance rate of draft tokens, respectively. Extensive experiments demonstrate that DARE achieves a 3-5x speedup and is compatible with various frameworks and backbone LLMs. DARE has also been deployed to online advertising scenarios within a large-scale commercial environment, achieving a 3.45x speedup while maintaining the downstream performance.
Submitted 10 August, 2024;
originally announced August 2024.
-
Recent Advances in Metallic Riemannian Geometry: A Comprehensive Review
Authors:
Bang-Yen Chen,
Majid Ali Choudhary,
Afshan Perween
Abstract:
Metallic structures, introduced by V. de Spinadel in 2002, opened a new avenue in differential geometry. Building upon this concept, C. E. Hreţcanu and M. Crasmareanu laid the foundation for metallic Riemannian manifolds in 2013. The field's rich potential and diverse applications have since attracted significant research efforts, leading to a wealth of valuable insights. This review delves into the latest advances in metallic Riemannian geometry, a rapidly progressing area within the broader field of differential geometry.
Submitted 9 August, 2024;
originally announced August 2024.
-
A Comprehensive Review of Solitonic Inequalities in Riemannian Geometry
Authors:
Bang-Yen Chen,
Majid Ali Choudhary,
Mohammed Nisar,
Mohd Danish Siddiqi
Abstract:
In Riemannian geometry, Ricci soliton inequalities are an important field of study that provide profound insights into the geometric and analytic characteristics of Riemannian manifolds. An extensive study of Ricci soliton inequalities is given in this review article, which also summarizes their historical evolution, core ideas, important findings, and applications. We investigate the complex interactions between curvature conditions and geometric inequalities as well as the several kinds of Ricci solitons, such as expanding, steady, and shrinking solitons. We also go over current developments, unresolved issues, and possible paths for further study in this fascinating area.
Submitted 9 August, 2024;
originally announced August 2024.
-
The Economic Analysis of the Common Pool Method through the HARA Utility Functions
Authors:
Mu Lin,
Di Zhang,
Ben Chen,
Hang Zheng
Abstract:
The water market is a contemporary marketplace for water trading and is deemed to be one of the most efficient instruments for improving social welfare. In modern water markets, the two widely used trading systems are an improved pair-wise trading, and a 'smart market' or common pool method. In comparison with the economic model, this paper constructs a conceptual mathematical model through the HARA utility functions. Mirroring concepts from economics such as Nash equilibrium, Pareto optimality and stable matching, three significant propositions are obtained which illustrate the advantages of the common pool method compared with the improved pair-wise trading.
Submitted 9 August, 2024;
originally announced August 2024.
-
Instruction Tuning-free Visual Token Complement for Multimodal LLMs
Authors:
Dongsheng Wang,
Jiequan Cui,
Miaoge Li,
Wang Lin,
Bo Chen,
Hanwang Zhang
Abstract:
As the open community of large language models (LLMs) matures, multimodal LLMs (MLLMs) have promised an elegant bridge between vision and language. However, current research is inherently constrained by challenges such as the need for high-quality instruction pairs and the loss of visual information in image-to-text training objectives. To this end, we propose a Visual Token Complement framework (VTC) that helps MLLMs regain the missing visual features and thus improve response accuracy. Specifically, our VTC integrates text-to-image generation as a guide to identifying the text-irrelevant features, and a visual selector is then developed to generate complementary visual tokens to enrich the original visual input. Moreover, an iterative strategy is further designed to extract more visual information by iteratively using the visual selector without any additional training. Notably, the training pipeline requires no additional image-text pairs, resulting in a desired instruction tuning-free property. Both qualitative and quantitative experiments demonstrate the superiority and efficiency of our VTC.
Submitted 9 August, 2024;
originally announced August 2024.
-
Analysis of the dynamics of the decay $D^{+}\to K_{S}^{0} π^{0} e^{+}ν_{e}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The branching fraction of $D^+\to K_{S}^{0} π^{0}e^+ν_e$ is measured for the first time using $7.93~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at the center-of-mass energy $\sqrt{s}=3.773$~GeV with the BESIII detector operating at the BEPCII collider, and is determined to be ${\mathcal B}$($D^+\to K_S^0π^0e^+ν_e$) = $(0.881~\pm~0.017_{\rm stat.}~\pm~0.016_{\rm syst.})$\%. Based on an analysis of the $D^+\to K_S^0π^0e^+ν_e$ decay dynamics, we observe the $S\text{-}{\rm wave}$ and $P$-wave components with fractions of $f_{S\text{-}{\rm wave}}$ = $(6.13~\pm~0.27_{\rm stat.}~\pm ~0.30_{\rm syst.})\%$ and $f_{\bar K^{*}(892)^0}$ = $(93.88~\pm~0.27_{\rm stat.}~\pm~0.29_{\rm syst.})$\%, respectively. From these results, we obtain the branching fractions ${\mathcal B}$($D^+\to (K_S^0π^0)_{S\text{-}{\rm wave}}~e^+ν_e$) = $(5.41~\pm~0.35_{\rm stat.}~\pm~0.37_{\rm syst.})\times10^{-4}$ and ${\mathcal B}$($D^+\to \bar K^{*}(892)^0e^+ν_e$) = $(4.97~\pm~0.11_{\rm stat.}~\pm~0.12_{\rm syst.})$\%. In addition, the hadronic form-factor ratios of $D^{+} \to \bar {K}^{*}(892)^0e^+ν_e$ at $q^2=0$, assuming a single-pole dominance parameterization, are determined to be $r_V=\frac{V(0)}{A_1(0)}= 1.43~\pm~0.07_{\rm stat.}~\pm~0.03_{\rm syst.}$ and $r_2=\frac{A_2(0)}{A_1(0)}=0.72~\pm~0.06_{\rm stat.}~\pm~0.02_{\rm syst.}$.
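As a consistency check on the numbers quoted above, the partial branching fractions follow from the total branching fraction and the fit fractions. The factor $\mathcal{B}(\bar K^{*}(892)^0\to K_S^0π^0)=\frac{1}{3}\times\frac{1}{2}=\frac{1}{6}$ used below is an assumption based on isospin and the $K_S^0$ content of the $\bar K^0$; it is not stated in the abstract.

$$
\mathcal{B}\big(D^+\to (K_S^0π^0)_{S\text{-wave}}\,e^+ν_e\big) \approx f_{S\text{-wave}} \times \mathcal{B}(D^+\to K_S^0π^0e^+ν_e) = 0.0613 \times 0.881\% \approx 5.40\times10^{-4}
$$

$$
\mathcal{B}\big(D^+\to \bar K^{*}(892)^0 e^+ν_e\big) \approx \frac{f_{\bar K^{*}(892)^0} \times \mathcal{B}(D^+\to K_S^0π^0e^+ν_e)}{1/6} = 0.9388 \times 0.881\% \times 6 \approx 4.96\%
$$

Both values agree with the quoted $5.41\times10^{-4}$ and $4.97\%$ within the rounding of the inputs.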
Submitted 8 August, 2024;
originally announced August 2024.
-
Lifelong Personalized Low-Rank Adaptation of Large Language Models for Recommendation
Authors:
Jiachen Zhu,
Jianghao Lin,
Xinyi Dai,
Bo Chen,
Rong Shan,
Jieming Zhu,
Ruiming Tang,
Yong Yu,
Weinan Zhang
Abstract:
We primarily focus on the field of large language models (LLMs) for recommendation, which has been actively explored recently and poses a significant challenge in effectively enhancing recommender systems with logical reasoning abilities and open-world knowledge. Current mainstream efforts mainly center around injecting personalized information from recommendation models into LLMs by customizing input templates or aligning representations between semantic and recommendation spaces at the prediction layer. However, they face three significant limitations: (1) LoRA is mostly used as a core component in existing works, but personalization is not well established in LoRA parameters as the LoRA matrix shared by every user may not cater to different users' characteristics, leading to suboptimal performance. (2) Although lifelong personalized behavior sequences are ideal for personalization, their use raises effectiveness and efficiency issues since LLMs require escalating training and inference time to extend text lengths. (3) Existing approaches aren't scalable for large datasets due to training efficiency constraints. Thus, LLMs only see a small fraction of the datasets (e.g., less than 10%) instead of the whole datasets, limiting their exposure to the full training space. To address these problems, we propose RecLoRA. This model incorporates a Personalized LoRA module that maintains independent LoRAs for different users and a Long-Short Modality Retriever that retrieves different history lengths for different modalities, significantly improving performance while adding minimal time cost. Furthermore, we design a Few2Many Learning Strategy, using a conventional recommendation model as a lens to magnify small training spaces to full spaces. Extensive experiments on public datasets demonstrate the efficacy of our RecLoRA compared to existing baseline models.
Submitted 11 August, 2024; v1 submitted 7 August, 2024;
originally announced August 2024.
-
Measurement of the Branching Fraction of $ψ(2S) \to γπ^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Based on $(2712.4\pm14.1)\times10^{6}~ψ(2S)$ events, 7.9 fb$^{-1}$ $ψ(3773)$ data, and 0.8 fb$^{-1}$ off-resonance data samples collected with the BESIII detector, we measure the branching fraction of $ψ(2S)\rightarrowγπ^{0}$ and the $e^{+}e^{-}\rightarrowγπ^{0}$ form factor at momentum transfers $Q^{2}\sim13$ GeV$^{2}$. The $e^{+}e^{-}\rightarrowγπ^{0}$ cross section is fitted taking into account the interference between the $ψ(2S)$ and continuum amplitudes, and two solutions are found: ${\cal B}=3.74\times10^{-7}$ with $φ=3.93$ rad and ${\cal B}=7.87\times10^{-7}$ with $φ=2.08$ rad. Here, ${\cal B}$ is the branching fraction of $ψ(2S)\rightarrowγπ^{0}$ and $φ$ is the relative phase angle between the $ψ(2S)$ and continuum amplitudes. Due to insufficient off-resonance data, the branching fraction ${\cal B}(ψ(2S)\rightarrowγπ^{0})$ is determined to be in the range $[2.7, 9.7]\times10^{-7}$ within one standard deviation of the contour region.
Submitted 7 August, 2024; v1 submitted 7 August, 2024;
originally announced August 2024.
-
A Non-negative VAE: the Generalized Gamma Belief Network
Authors:
Zhibin Duan,
Tiansheng Wen,
Muyao Wang,
Bo Chen,
Mingyuan Zhou
Abstract:
The gamma belief network (GBN), often regarded as a deep topic model, has demonstrated its potential for uncovering multi-layer interpretable latent representations in text data. Its notable capability to acquire interpretable latent factors is partially attributed to sparse and non-negative gamma-distributed latent variables. However, the existing GBN and its variations are constrained by the linear generative model, thereby limiting their expressiveness and applicability. To address this limitation, we introduce the generalized gamma belief network (Generalized GBN) in this paper, which extends the original linear generative model to a more expressive non-linear generative model. Since the parameters of the Generalized GBN no longer possess an analytic conditional posterior, we further propose an upward-downward Weibull inference network to approximate the posterior distribution of the latent variables. The parameters of both the generative model and the inference network are jointly trained within the variational inference framework. Finally, we conduct comprehensive experiments on both expressivity and disentangled representation learning tasks to evaluate the performance of the Generalized GBN against state-of-the-art Gaussian variational autoencoders serving as baselines.
Submitted 15 August, 2024; v1 submitted 6 August, 2024;
originally announced August 2024.
-
Measurement of $Σ^+$ transverse polarization in $e^+e^-$ collisions at $\sqrt{s} = 3.68-3.71$ GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at seven energy points ranging from 3.68 to 3.71 GeV and corresponding to an integrated luminosity of $652.1~{\rm pb^{-1}}$, we present an energy-dependent measurement of the transverse polarization, relative phase and modulus ratio of the electromagnetic form factors of the $Σ^+$ hyperon in the $e^+e^- \to Σ^+ \barΣ^-$ reaction. These results are helpful to understand the production mechanism of the $Σ^+$-$\barΣ^-$ pairs.
Submitted 7 August, 2024; v1 submitted 6 August, 2024;
originally announced August 2024.
-
User-in-the-loop Evaluation of Multimodal LLMs for Activity Assistance
Authors:
Mrinal Verghese,
Brian Chen,
Hamid Eghbalzadeh,
Tushar Nagarajan,
Ruta Desai
Abstract:
Our research investigates the capability of modern multimodal reasoning models, powered by Large Language Models (LLMs), to facilitate vision-powered assistants for multi-step daily activities. Such assistants must be able to 1) encode relevant visual history from the assistant's sensors, e.g., camera, 2) forecast future actions for accomplishing the activity, and 3) replan based on the user in the loop. To evaluate the first two capabilities, grounding visual history and forecasting in short and long horizons, we conduct benchmarking of two prominent classes of multimodal LLM approaches -- Socratic Models and Vision Conditioned Language Models (VCLMs) on video-based action anticipation tasks using offline datasets. These offline benchmarks, however, do not allow us to close the loop with the user, which is essential to evaluate the replanning capabilities and measure successful activity completion in assistive scenarios. To that end, we conduct a first-of-its-kind user study, with 18 participants performing 3 different multi-step cooking activities while wearing an egocentric observation device called Aria and following assistance from multimodal LLMs. We find that the Socratic approach outperforms VCLMs in both offline and online settings. We further highlight how grounding long visual history, common in activity assistance, remains challenging in current models, especially for VCLMs, and demonstrate that offline metrics do not indicate online performance.
Submitted 11 August, 2024; v1 submitted 4 August, 2024;
originally announced August 2024.
-
Observation of $η_{c}(2S) \to K^{+}K^{-}η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
By analyzing $(27.12 \pm 0.14)\times10^{8}$ $ψ(3686)$ events accumulated with the BESIII detector, the decay $η_{c}(2S) \to K^{+} K^{-} η$ is observed for the first time with a significance of $6.2σ$ after considering systematic uncertainties. The product of the branching fractions of $ψ(3686) \to γη_{c}(2S)$ and $η_{c}(2S) \to K^{+} K^{-} η$ is measured to be $\mathcal{B}(ψ(3686) \toγη_{c}(2S))\times \mathcal{B}(η_{c}(2S)\to K^{+} K^{-}η)=(2.39 \pm 0.32 \pm 0.34) \times 10^{-6}$, where the first uncertainty is statistical, and the second one is systematic. The branching fraction of $η_{c}(2S)\to K^{+} K^{-}η$ is determined to be $\mathcal{B}(η_{c}(2S)\to K^{+} K^{-}η) = (3.42 \pm 0.46 \pm 0.48 \pm 2.44) \times 10^{-3}$, where the third uncertainty is due to the branching fraction of $ψ(3686) \to γη_{c}(2S)$. Using a recent BESIII measurement of $\mathcal{B} (η_{c}(2S) \to K^{+} K^{-}π^{0})$, we also determine the ratio between the branching fractions of $η_{c}(2S) \to K^{+} K^{-}η$ and $η_{c}(2S) \to K^{+} K^{-}π^{0}$ to be $1.49 \pm 0.22 \pm 0.25$, which is consistent with the previous result of BaBar at a comparable precision level.
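The relation between the two quoted results can be checked with simple arithmetic. The sketch below is only an illustration: it divides the measured product by the derived branching fraction to recover the normalization $\mathcal{B}(ψ(3686) \to γη_{c}(2S))$ that the abstract implies, and compares the third uncertainty with the central value. The variable and function names are hypothetical, and the implied normalization is inferred here rather than quoted in the abstract.

```python
# Numbers quoted in the abstract.
product_bf = 2.39e-6        # B(psi(3686) -> gamma eta_c(2S)) x B(eta_c(2S) -> K+ K- eta)
eta_c_bf = 3.42e-3          # derived B(eta_c(2S) -> K+ K- eta)
eta_c_bf_ext_unc = 2.44e-3  # third uncertainty, from B(psi(3686) -> gamma eta_c(2S))


def implied_normalization(product: float, derived: float) -> float:
    """Back out the normalization branching fraction implied by the two results."""
    return product / derived


norm = implied_normalization(product_bf, eta_c_bf)
# Relative size of the uncertainty inherited from the normalization.
rel_unc = eta_c_bf_ext_unc / eta_c_bf

print(f"implied B(psi(3686) -> gamma eta_c(2S)) ~ {norm:.2e}")
print(f"relative external uncertainty ~ {rel_unc:.0%}")
```

Under these inputs the implied normalization is close to $7\times10^{-4}$ and the external uncertainty is roughly 70% of the central value, which is why it dominates the total uncertainty on $\mathcal{B}(η_{c}(2S)\to K^{+} K^{-}η)$.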
Submitted 5 August, 2024;
originally announced August 2024.
-
A comprehensive review of golden Riemannian manifolds
Authors:
Bang-Yen Chen,
Majid Ali Choudhary,
Afshan Perween
Abstract:
In differential geometry, the concept of golden structure, initially proposed by S. I. Goldberg and K. Yano in 1970, presents a compelling area with wide-ranging applications. The exploration of golden Riemannian manifolds was initiated by C. E. Hretcanu and M. Crasmareanu in 2008, following the principles of the golden structure. Subsequently, numerous researchers have contributed significant insights into golden Riemannian manifolds. The purpose of this paper is to provide a comprehensive survey of the work on golden Riemannian manifolds done over the past decade.
Submitted 5 August, 2024;
originally announced August 2024.
-
Inflight Performance and Calibrations of the Lyman-alpha Solar Telescope on board the Advanced Space-based Solar Observatory
Authors:
Bo Chen,
Li Feng,
Guang Zhang,
Hui Li,
Lingping He,
Kefei Song,
Quanfeng Guo,
Ying Li,
Yu Huang,
Jingwei Li,
Jie Zhao,
Jianchao Xue,
Gen Li,
Guanglu Shi,
Dechao Song,
Lei Lu,
Beili Ying,
Haifeng Wang,
Shuang Dai,
Xiaodong Wang,
Shilei Mao,
Peng Wang,
Kun Wu,
Shuai Ren,
Liang Sun
, et al. (18 additional authors not shown)
Abstract:
The Lyman-alpha Solar Telescope (LST) on board the Advanced Space-based Solar Observatory (ASO-S) is the first payload to image the full solar disk and the solar corona in both white-light (WL) and ultraviolet (UV) H I Lya, extending up to 2.5 solar radii (Rs). Since the launch of the ASO-S on 9 October 2022, LST has captured various significant solar activities including flares, prominences, and coronal mass ejections (CMEs). LST covers passbands at 121.6 nm, 360 nm and 700 nm. The Lya Solar Disk Imager (SDI) has a field of view (FOV) of 38.4 arcmin and a spatial resolution of around 9.5 arcsec, while the White-Light Solar Telescope (WST) has a FOV of 38.43 arcmin and a spatial resolution of around 3.0 arcsec. The FOV of the Lya Solar Corona Imager (SCI) reaches 81.1 arcmin and its spatial resolution is 4.3 arcsec. The stray-light level in the 700 nm waveband is about 7.8e-6 MSB (mean solar brightness) at 1.1 Rs and 7.6e-7 MSB at 2.5 Rs, and in the Lya waveband it is around 4.3e-3 MSB at 1.1 Rs and 4.1e-4 MSB at 2.5 Rs. This article details the results from on-orbit tests and calibrations.
Submitted 4 August, 2024;
originally announced August 2024.
-
3DStoryline: Immersive Visual Storytelling
Authors:
Haonan Yao,
Lixiang Zhao,
Boyuan Chen,
Kaiwen Li,
Hai-Ning Liang,
Lingyun Yu
Abstract:
Storyline visualization has emerged as an innovative method for illustrating the development and changes in stories across various domains. Traditional approaches typically represent stories with one line per character, progressing from left to right. While effective for simpler narratives, this method faces significant challenges when dealing with complex stories involving multiple characters, as well as temporal and spatial dynamics. In this study, we investigate the potential of immersive environments for enhancing storyline visualizations. We begin by summarizing the key design considerations for effective storyline visualization in virtual reality (VR). Guided by these principles, we develop 3DStoryline, a system that allows users to view and interact with 3D immersive storyline visualizations. To evaluate the effectiveness of 3DStoryline, we conduct a task-based user study, revealing that the system significantly enhances users' comprehension of complex narratives.
Submitted 3 August, 2024;
originally announced August 2024.
-
Search for $X(3872)\toπ^0π^0χ_{c1,2}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using 10.1 fb$^{-1}$ of $e^+e^-$ collision data collected by the BESIII detector with center-of-mass energies between 4.15 GeV and 4.30 GeV, we search for the decays $X(3872)\toπ^0π^0χ_{c1,2}$, where the $X(3872)$ is produced in $e^+e^-\toγX(3872)$. No evidence above $3σ$ is found for either decay. Upper limits at the $90\%$ C.L. on the branching fractions of $X(3872)\toπ^0π^0χ_{c1,2}$ normalized to the branching fraction of $X(3872)\toπ^+π^-J/ψ$ are set to be $\mathcal{B}(X(3872)\toπ^0π^0χ_{c1})/\mathcal{B}(X(3872)\toπ^+π^-J/ψ) < 1.1$ and $\mathcal{B}(X(3872)\toπ^0π^0χ_{c2})/\mathcal{B}(X(3872)\toπ^+π^-J/ψ) < 0.5$, taking into account both statistical and systematic uncertainties.
Submitted 2 August, 2024;
originally announced August 2024.
-
Trainable Pointwise Decoder Module for Point Cloud Segmentation
Authors:
Bike Chen,
Chen Gong,
Antti Tikanmäki,
Juha Röning
Abstract:
Point cloud segmentation (PCS) aims to make per-point predictions and enables robots and autonomous driving cars to understand the environment. The range image is a dense representation of a large-scale outdoor point cloud, and segmentation models built upon the image commonly execute efficiently. However, the projection of the point cloud onto the range image inevitably leads to dropping points because, at each image coordinate, only one point is kept despite multiple points being projected onto the same location. More importantly, it is challenging to assign correct predictions to the dropped points that belong to classes different from that of the kept point. Besides, existing post-processing methods, such as K-nearest neighbor (KNN) search and kernel point convolution (KPConv), cannot be trained with the models in an end-to-end manner or cannot process varying-density outdoor point clouds well, thereby leaving the models with sub-optimal performance. To alleviate this problem, we propose a trainable pointwise decoder module (PDM) as the post-processing approach, which gathers weighted features from the neighbors and then makes the final prediction for the query point. In addition, we introduce a virtual range image-guided copy-rotate-paste (VRCrop) strategy in data augmentation. VRCrop constrains the total number of points and eliminates undesirable artifacts in the augmented point cloud. With PDM and VRCrop, existing range image-based segmentation models consistently perform better than their counterparts on the SemanticKITTI, SemanticPOSS, and nuScenes datasets.
Submitted 2 August, 2024;
originally announced August 2024.
-
Randomized Strategyproof Mechanisms with Best of Both Worlds Fairness and Efficiency
Authors:
Ankang Sun,
Bo Chen
Abstract:
We study the problem of mechanism design for allocating a set of indivisible items among agents with private preferences on items. We are interested in such a mechanism that is strategyproof (where agents' best strategy is to report their true preferences) and is expected to ensure fairness and efficiency to a certain degree. We first present an impossibility result that a deterministic mechanism does not exist that is strategyproof, fair and efficient for allocating indivisible chores. We then utilize randomness to overcome the strong impossibility. For allocating indivisible chores, we propose a randomized mechanism that is strategyproof in expectation as well as ex-ante and ex-post (best of both worlds) fair and efficient. For allocating mixed items, where an item can be a good (i.e., with a positive utility) for one agent but a chore (i.e., with a negative utility) for another, we propose a randomized mechanism that is strategyproof in expectation with best of both worlds fairness and efficiency when there are two agents.
Submitted 2 August, 2024;
originally announced August 2024.
-
Partial wave analysis of $ψ(3686)\toΛ\barΣ^0π^0+c.c.$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Based on a sample of $(2712.4\pm14.3)\times10^6\;ψ(3686)$ events collected with the BESIII detector, a partial wave analysis of the decay $ψ(3686)\toΛ\barΣ^0π^0+c.c.$ is performed to investigate $Λ^*$ and $Σ^*$ resonances in the $π^0\barΣ^0$ and $π^0Λ$ invariant mass distributions. Significant contributions are found from the $Λ(1405)$, $Λ(1520)$, $Λ(1600)$, $Λ(1670)$, $Λ(1690)$, $Λ(1800)$, $Λ(1890)$, $Λ(2325)$, $Σ(1385)$, $Σ(1660)$, $Σ(1670)$, $Σ(1750)$, and $Σ(1910)$. The masses, widths, and production branching fractions for each component are determined. In addition, the branching fraction of $ψ(3686)\toΛ\barΣ^0π^0+c.c.$ is measured to be $(1.544\pm0.013\pm0.069)\times10^{-4}$ for the first time, where the first uncertainty is statistical and the second systematic.
Submitted 1 August, 2024;
originally announced August 2024.
-
CREW: Facilitating Human-AI Teaming Research
Authors:
Lingyu Zhang,
Zhengran Ji,
Boyuan Chen
Abstract:
With the increasing deployment of artificial intelligence (AI) technologies, the potential of humans working with AI agents has been growing at a great speed. Human-AI teaming is an important paradigm for studying various aspects when humans and AI agents work together. The unique aspect of Human-AI teaming research is the need to jointly study humans and AI agents, demanding multidisciplinary research efforts from machine learning to human-computer interaction, robotics, cognitive science, neuroscience, psychology, social science, and complex systems. However, existing platforms for Human-AI teaming research are limited, often supporting oversimplified scenarios and a single task, or specifically focusing on either human-teaming research or multi-agent AI algorithms. We introduce CREW, a platform to facilitate Human-AI teaming research and engage collaborations from multiple scientific disciplines, with a strong emphasis on human involvement. It includes pre-built tasks for cognitive studies and Human-AI teaming with expandable potentials from our modular design. Following conventional cognitive neuroscience research, CREW also supports multimodal human physiological signal recording for behavior analysis. Moreover, CREW benchmarks real-time human-guided reinforcement learning agents using state-of-the-art algorithms and well-tuned baselines. With CREW, we were able to conduct 50 human subject studies within a week to verify the effectiveness of our benchmark.
Submitted 31 July, 2024;
originally announced August 2024.
-
Contrastive Factor Analysis
Authors:
Zhibin Duan,
Tiansheng Wen,
Yifei Wang,
Chen Zhu,
Bo Chen,
Mingyuan Zhou
Abstract:
Factor analysis, often regarded as a Bayesian variant of matrix factorization, offers superior capabilities in capturing uncertainty, modeling complex dependencies, and ensuring robustness. As the deep learning era arrives, factor analysis is receiving less and less attention due to its limited expressive ability. In contrast, contrastive learning has emerged as a potent technique with demonstrated efficacy in unsupervised representational learning. While the two methods are different paradigms, recent theoretical analysis has revealed the mathematical equivalence between contrastive learning and matrix factorization, suggesting that factor analysis could be combined with contrastive learning. Motivated by the interconnectedness of contrastive learning, matrix factorization, and factor analysis, this paper introduces a novel Contrastive Factor Analysis framework, aiming to leverage factor analysis's advantageous properties within the realm of contrastive learning. To further leverage the interpretability properties of non-negative factor analysis, which can learn disentangled representations, contrastive factor analysis is extended to a non-negative version. Finally, extensive experimental validation showcases the efficacy of the proposed contrastive (non-negative) factor analysis methodology across multiple key properties, including expressiveness, robustness, interpretability, and accurate uncertainty estimation.
Submitted 31 July, 2024; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Observation of $D^0\to b_1(1235)^- e^+ν_e$ and evidence for $D^+\to b_1(1235)^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (647 additional authors not shown)
Abstract:
By analyzing a data sample of $e^+e^-$ collisions with center-of-mass energy $\sqrt{s}=3.773$ GeV, corresponding to an integrated luminosity of $7.9~\rm {fb}^{-1}$ collected with the BESIII detector operating at the BEPCII collider, we study semileptonic decays of the $D^{0(+)}$ mesons into the axial-vector meson $b_1(1235)$ via the decay $b_1(1235)\to ωπ$. The decay $D^0\to b_1(1235)^-e^{+}ν_{e}$ is observed with a significance of 5.2$σ$ after considering systematic uncertainty, while evidence for the decay $D^+\to b_1(1235)^0 e^+ν_e$ is obtained with a 3.1$σ$ significance. The product branching fractions are determined to be ${\mathcal B}(D^0\to b_{1}(1235)^-e^{+}ν_{e})\times {\mathcal B} (b_1(1235)^-\to ωπ^-) = (0.72\pm0.18^{+0.06}_{-0.08})\times10^{-4}$ and ${\mathcal B}(D^+\to b_{1}(1235)^0e^{+}ν_{e})\times {\mathcal B} (b_1(1235)^0~\to ωπ^0) = (1.16\pm0.44\pm0.16)\times10^{-4}$, where the first uncertainties are statistical and the second systematic. The ratio of their partial decay widths is determined to be $\frac{Γ(D^0\to b_{1}(1235)^-e^{+}ν_{e})}{2Γ(D^+\to b_{1}(1235)^0e^{+}ν_{e})}=0.78\pm0.19^{+0.04}_{-0.05}$, which is consistent with unity, predicted by isospin invariance, within uncertainties.
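The quoted width ratio can be roughly reproduced from the two product branching fractions using $Γ = \mathcal{B}/τ$. The sketch below is a consistency check under stated assumptions, not the collaboration's procedure: the $D^0$ and $D^+$ lifetimes are approximate world-average values supplied here (they are not given in the abstract), the $b_1(1235)\to ωπ$ branching fractions are assumed equal for the charged and neutral states so that they cancel, and the function name `width_ratio` is hypothetical. Uncertainties are not propagated.

```python
# Product branching fractions quoted in the abstract, in units of 1e-4.
bf_d0 = 0.72  # B(D0 -> b1(1235)- e+ nu_e) x B(b1(1235)- -> omega pi-)
bf_dp = 1.16  # B(D+ -> b1(1235)0 e+ nu_e) x B(b1(1235)0 -> omega pi0)

# ASSUMPTION: approximate world-average lifetimes in femtoseconds;
# these values are not taken from the abstract.
tau_d0 = 410.3
tau_dp = 1033.0


def width_ratio(bf_neutral: float, bf_charged: float,
                tau_neutral: float, tau_charged: float) -> float:
    """Gamma(D0) / (2 * Gamma(D+)) with Gamma = B / tau.

    Assumes the b1 -> omega pi branching fractions cancel in the ratio.
    """
    gamma_neutral = bf_neutral / tau_neutral
    gamma_charged = bf_charged / tau_charged
    return gamma_neutral / (2.0 * gamma_charged)


ratio = width_ratio(bf_d0, bf_dp, tau_d0, tau_dp)
print(f"Gamma(D0) / (2 Gamma(D+)) ~ {ratio:.2f}")  # prints 0.78 for these inputs
```

With these inputs the central value matches the quoted 0.78. The factor of 2 in the denominator is the isospin factor under which the ratio is expected to equal unity.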
Submitted 30 July, 2024;
originally announced July 2024.
-
Learning Feature-Preserving Portrait Editing from Generated Pairs
Authors:
Bowei Chen,
Tiancheng Zhi,
Peihao Zhu,
Shen Sang,
Jing Liu,
Linjie Luo
Abstract:
Portrait editing is challenging for existing techniques due to difficulties in preserving subject features like identity. In this paper, we propose a training-based method leveraging auto-generated paired data to learn desired editing while ensuring the preservation of unchanged subject features. Specifically, we design a data generation process to create reasonably good training pairs for desired editing at low cost. Based on these pairs, we introduce a Multi-Conditioned Diffusion Model to effectively learn the editing direction and preserve subject features. During inference, our model produces an accurate editing mask that can guide the inference process to further preserve detailed subject features. Experiments on costume editing and cartoon expression editing show that our method achieves state-of-the-art quality, quantitatively and qualitatively.
Submitted 29 July, 2024;
originally announced July 2024.
-
Making LLMs Work for Enterprise Data Tasks
Authors:
Çağatay Demiralp,
Fabian Wenz,
Peter Baile Chen,
Moe Kayali,
Nesime Tatbul,
Michael Stonebraker
Abstract:
Large language models (LLMs) know little about enterprise database tables in the private data ecosystem, which substantially differ from web text in structure and content. As LLMs' performance is tied to their training data, a crucial question is how useful they can be in improving enterprise database management and analysis tasks. To address this, we contribute experimental results on LLMs' performance for text-to-SQL and semantic column-type detection tasks on enterprise datasets. The performance of LLMs on enterprise data is significantly lower than on benchmark datasets commonly used. Informed by our findings and feedback from industry practitioners, we identify three fundamental challenges -- latency, cost, and quality -- and propose potential solutions to use LLMs in enterprise data workflows effectively.
Submitted 22 July, 2024;
originally announced July 2024.
-
Measurement of the $\boldsymbol{e^{+}e^{-}\to K^+K^-ψ(2S)}$ Cross Section at Center-of-Mass Energies from 4.699 to 4.951 GeV and Search for $\boldsymbol{Z_{cs}^{\pm}}$ in the $\boldsymbol{Z_{cs}^\pm\to K^\pmψ(2S)}$ Decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (646 additional authors not shown)
Abstract:
We perform the first investigation of the process $e^{+}e^{-}\to K^+K^-ψ(2S)$ and report its Born cross sections over a range of center-of-mass energies from 4.699 to 4.951~GeV. The measurements are carried out with several partial reconstruction techniques, using data samples collected by the BESIII detector with a total integrated luminosity of 2.5~fb$^{-1}$. We search for new tetraquark candidates $Z_{cs}^\pm$ in the decays $Z_{cs}^\pm\to K^\pmψ(2S)$. No significant $Z_{cs}^\pm$ signals are observed.
Submitted 29 July, 2024;
originally announced July 2024.
-
Practical Marketplace Optimization at Uber Using Causally-Informed Machine Learning
Authors:
Bobby Chen,
Siyu Chen,
Jason Dowlatabadi,
Yu Xuan Hong,
Vinayak Iyer,
Uday Mantripragada,
Rishabh Narang,
Apoorv Pandey,
Zijun Qin,
Abrar Sheikh,
Hongtao Sun,
Jiaqi Sun,
Matthew Walker,
Kaichen Wei,
Chen Xu,
Jingnan Yang,
Allen T. Zhang,
Guoqing Zhang
Abstract:
Budget allocation of marketplace levers, such as incentives for drivers and promotions for riders, has long been a technical and business challenge at Uber. Understanding the impact of lever budget changes and estimating cost efficiency to achieve predefined budgets is crucial, with the goal of optimal allocations that maximize business value. We introduce an end-to-end machine learning and optimization procedure to automate budget decision-making for cities, relying on a feature store, model training and serving, optimizers, and backtesting. Proposing a state-of-the-art deep learning (DL) estimator based on S-Learner and a novel tensor B-Spline regression model, we solve high-dimensional optimization with ADMM and primal-dual interior point convex optimization, substantially improving Uber's resource allocation efficiency.
Submitted 26 July, 2024;
originally announced July 2024.
-
HICEScore: A Hierarchical Metric for Image Captioning Evaluation
Authors:
Zequn Zeng,
Jianqiao Sun,
Hao Zhang,
Tiansheng Wen,
Yudi Su,
Yan Xie,
Zhengjue Wang,
Bo Chen
Abstract:
Image captioning evaluation metrics can be divided into two categories, reference-based metrics and reference-free metrics. However, reference-based approaches may struggle to evaluate descriptive captions with abundant visual details produced by advanced multimodal large language models, due to their heavy reliance on limited human-annotated references. In contrast, previous reference-free metrics have been proven effective via CLIP cross-modality similarity. Nonetheless, CLIP-based metrics, constrained by their solution of global image-text compatibility, often have a deficiency in detecting local textual hallucinations and are insensitive to small visual objects. Besides, their single-scale designs are unable to provide an interpretable evaluation process such as pinpointing the position of caption mistakes and identifying visual regions that have not been described. To move forward, we propose a novel reference-free metric for image captioning evaluation, dubbed Hierarchical Image Captioning Evaluation Score (HICE-S). By detecting local visual regions and textual phrases, HICE-S builds an interpretable hierarchical scoring mechanism, breaking through the barriers of the single-scale structure of existing reference-free metrics. Comprehensive experiments indicate that our proposed metric achieves the SOTA performance on several benchmarks, outperforming existing reference-free metrics like CLIP-S and PAC-S, and reference-based metrics like METEOR and CIDEr. Moreover, several case studies reveal that the assessment process of HICE-S on detailed captions closely resembles interpretable human judgments. Our code is available at https://github.com/joeyz0z/HICE.
Submitted 26 July, 2024;
originally announced July 2024.