-
LAMBDA: A Large Model Based Data Agent
Authors:
Maojun Sun,
Ruijian Han,
Binyan Jiang,
Houduo Qi,
Defeng Sun,
Yancheng Yuan,
Jian Huang
Abstract:
We introduce ``LAMBDA," a novel open-source, code-free multi-agent data analysis system that that harnesses the power of large models. LAMBDA is designed to address data analysis challenges in complex data-driven applications through the use of innovatively designed data agents that operate iteratively and generatively using natural language. At the core of LAMBDA are two key agent roles: the prog…
▽ More
We introduce ``LAMBDA," a novel open-source, code-free multi-agent data analysis system that that harnesses the power of large models. LAMBDA is designed to address data analysis challenges in complex data-driven applications through the use of innovatively designed data agents that operate iteratively and generatively using natural language. At the core of LAMBDA are two key agent roles: the programmer and the inspector, which are engineered to work together seamlessly. Specifically, the programmer generates code based on the user's instructions and domain-specific knowledge, enhanced by advanced models. Meanwhile, the inspector debugs the code when necessary. To ensure robustness and handle adverse scenarios, LAMBDA features a user interface that allows direct user intervention in the operational loop. Additionally, LAMBDA can flexibly integrate external models and algorithms through our knowledge integration mechanism, catering to the needs of customized data analysis. LAMBDA has demonstrated strong performance on various machine learning datasets. It has the potential to enhance data science practice and analysis paradigm by seamlessly integrating human and artificial intelligence, making it more accessible, effective, and efficient for individuals from diverse backgrounds. The strong performance of LAMBDA in solving data science problems is demonstrated in several case studies, which are presented at \url{https://www.polyu.edu.hk/ama/cmfai/lambda.html}.
△ Less
Submitted 24 July, 2024;
originally announced July 2024.
-
BlueTempNet: A Temporal Multi-network Dataset of Social Interactions in Bluesky Social
Authors:
Ujun Jeong,
Bohan Jiang,
Zhen Tan,
H. Russell Bernard,
Huan Liu
Abstract:
Decentralized social media platforms like Bluesky Social (Bluesky) have made it possible to publicly disclose some user behaviors with millisecond-level precision. Embracing Bluesky's principles of open-source and open-data, we present the first collection of the temporal dynamics of user-driven social interactions. BlueTempNet integrates multiple types of networks into a single multi-network, inc…
▽ More
Decentralized social media platforms like Bluesky Social (Bluesky) have made it possible to publicly disclose some user behaviors with millisecond-level precision. Embracing Bluesky's principles of open-source and open-data, we present the first collection of the temporal dynamics of user-driven social interactions. BlueTempNet integrates multiple types of networks into a single multi-network, including user-to-user interactions (following and blocking users) and user-to-community interactions (creating and joining communities). Communities are user-formed groups in custom Feeds, where users subscribe to posts aligned with their interests. Following Bluesky's public data policy, we collect existing Bluesky Feeds, including the users who liked and generated these Feeds, and provide tools to gather users' social interactions within a date range. This data-collection strategy captures past user behaviors and supports the future data collection of user behavior.
△ Less
Submitted 24 July, 2024;
originally announced July 2024.
-
Data-driven stellar intrinsic colors and dust reddenings for spectro-photometric data: From the blue-edge method to a machine-learning approach
Authors:
He Zhao,
Shu Wang,
Biwei Jiang,
Jun Li,
Dongwei Fan,
Yi Ren,
Xiaoxiao Ma
Abstract:
Intrinsic colors (ICs) of stars are essential for the studies on both stellar physics and dust reddening. In this work, we developed an XGBoost model to predict the ICs with the atmospheric parameters $T_{\rm eff}$, ${\rm log}\,g$, and $\rm [M/H]$. The model was trained and tested for three colors at Gaia and 2MASS bands with 1,040,446 low-reddening sources. The atmospheric parameters were determi…
▽ More
Intrinsic colors (ICs) of stars are essential for the studies on both stellar physics and dust reddening. In this work, we developed an XGBoost model to predict the ICs with the atmospheric parameters $T_{\rm eff}$, ${\rm log}\,g$, and $\rm [M/H]$. The model was trained and tested for three colors at Gaia and 2MASS bands with 1,040,446 low-reddening sources. The atmospheric parameters were determined by the Gaia DR3 GSP-phot module and were validated by comparing with APOGEE and LAMOST. We further confirmed that the biases in GSP-phot parameters, especially for $\rm [M/H]$, do not present a significant impact on the IC prediction. The generalization error of the model estimated by the test set is 0.014 mag for $(G_{\rm BP}\,{-}\,G_{\rm RP})_0$, 0.050 mag for $(G_{\rm BP}\,{-}\,K_{\rm S})_0$, and 0.040 mag for $(J\,{-}\,K_{\rm S})_0$. The model was applied to a sample containing 5,714,528 reddened stars with stellar parameters from Andrae et al. (2023) to calculate ICs and reddenings. The high consistency in the comparison of $E(J\,{-}\,K_{\rm S})$ between our results and literature values further validates the accuracy of the XGBoost model. The variation of $E(G_{\rm BP}\,{-}\,K_{\rm S})/E(G_{\rm BP}\,{-}\,G_{\rm RP})$, a representation of the extinction law, with Galactic longitude is found on large scales. This work preliminarily presents the feasibility and the accuracy of the machine-learning approach for IC and dust reddening calculation, whose products could be widely applied to spectro-photometric data. The data sets and trained model can be accessed via \url{https://doi.org/10.5281/zenodo.12787594}. The models for more bands will be completed in the following works.
△ Less
Submitted 24 July, 2024;
originally announced July 2024.
-
Boosting Large Language Models with Socratic Method for Conversational Mathematics Teaching
Authors:
Yuyang Ding,
Hanglei Hu,
Jie Zhou,
Qin Chen,
Bo Jiang,
Liang He
Abstract:
With the introduction of large language models (LLMs), automatic math reasoning has seen tremendous success. However, current methods primarily focus on providing solutions or using techniques like Chain-of-Thought to enhance problem-solving accuracy. In this paper, we focus on improving the capability of mathematics teaching via a Socratic teaching-based LLM (\texttt{SocraticLLM}), which guides l…
▽ More
With the introduction of large language models (LLMs), automatic math reasoning has seen tremendous success. However, current methods primarily focus on providing solutions or using techniques like Chain-of-Thought to enhance problem-solving accuracy. In this paper, we focus on improving the capability of mathematics teaching via a Socratic teaching-based LLM (\texttt{SocraticLLM}), which guides learners toward profound thinking with clarity and self-discovery via conversation. We collect and release a high-quality mathematical teaching dataset, named \texttt{SocraticMATH}, which provides Socratic-style conversations of problems with extra knowledge. Also, we propose a knowledge-enhanced LLM as a strong baseline to generate reliable responses with review, guidance/heuristic, rectification, and summarization. Experimental results show the great advantages of \texttt{SocraticLLM} by comparing it with several strong generative models. The codes and datasets are available on \url{https://github.com/ECNU-ICALK/SocraticMath}.
△ Less
Submitted 24 July, 2024;
originally announced July 2024.
-
Search for $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Using $(2.712\pm0.014)\times10^{9}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII, we find an evidence of the $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay with a statistical significance of 3.1$σ$. Its decay branching fraction is measured to be $(12.24\pm4.60(\mathrm{stat.})\pm2.37(\mathrm{syst.})\pm4.68(\mathrm{extr.}))\times 10^{-4}$, where the first uncertainty is stati…
▽ More
Using $(2.712\pm0.014)\times10^{9}$ $ψ(3686)$ events collected with the BESIII detector operating at the BEPCII, we find an evidence of the $η_{c}(2S)\to K^+ K^- η^{\prime}$ decay with a statistical significance of 3.1$σ$. Its decay branching fraction is measured to be $(12.24\pm4.60(\mathrm{stat.})\pm2.37(\mathrm{syst.})\pm4.68(\mathrm{extr.}))\times 10^{-4}$, where the first uncertainty is statistical, the second is systematic, and the third uncertainty is from the branching fraction of the $ψ(3686)\toγη_{c}(2S)$ decay. The upper limit on the product branching fraction $B[ψ(3686)\toγη_{c}(2S)] \times$ $B[η_{c}(2S)\to K^+ K^- η^{\prime}]$ is set to be $1.14 \times 10^{-6}$ at $90\%$ confidence level. In addition, the branching fractions of $χ_{c1}\to K^+ K^- η^{\prime}$ and $χ_{c2}\to K^+ K^- η^{\prime}$ are updated to be $(8.47\pm0.09(\mathrm{stat.})\pm0.47(\mathrm{syst.}))\times 10^{-4}$ and $(1.53\pm0.04(\mathrm{stat.})\pm0.08(\mathrm{syst.}))\times 10^{-4}$, respectively. The precision is improved by twofold.
△ Less
Submitted 24 July, 2024;
originally announced July 2024.
-
Observation of $Λ_c^+ \to Λa_0(980)^+$ and Evidence for $Σ(1380)^+$ in $Λ_c^+ \to Λπ^+ η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Based on $6.1~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at center-of-mass energies from 4.600~GeV to 4.843~GeV with the BESIII detector at the BEPCII collider, a partial wave analysis of $Λ_c^+\toΛπ^+η$ is performed, and branching fractions and decay asymmetry parameters of intermediate processes are determined. The process $Λ_c^+\toΛa_0(980)^+$ is observed for the first time, and…
▽ More
Based on $6.1~\mathrm{fb}^{-1}$ of $e^+e^-$ annihilation data collected at center-of-mass energies from 4.600~GeV to 4.843~GeV with the BESIII detector at the BEPCII collider, a partial wave analysis of $Λ_c^+\toΛπ^+η$ is performed, and branching fractions and decay asymmetry parameters of intermediate processes are determined. The process $Λ_c^+\toΛa_0(980)^+$ is observed for the first time, and evidence for the pentaquark candidate $Σ(1380)^+$ decaying into $Λπ^+$ is found with statistical significance larger than $3σ$. The branching fraction product $\mathcal{B}(Λ_{c}^{+} \to Λa_0(980)^+) \; \mathcal{B}( a_0(980)^+ \to π^{+}η)$ is determined to be $(1.05 \pm 0.16_{\mathrm{stat}} \pm 0.05_{\mathrm{syst}} \pm 0.07_{\mathrm{ext}})\%$, which is larger than theoretical calculations by $1 - 2$ orders of magnitude. Here the third (external) systematic is from $\mathcal{B}(Λ_{c}^{+} \to Λπ^+ η)$. Finally, we precisely obtain the absolute branching fraction $\mathcal{B}(Λ_{c}^{+} \to Λπ^+ η) = (1.94 \pm 0.07_{\mathrm{stat}} \pm 0.11_{\mathrm{syst}})\%$.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of $D^+_s\to \ell^+ν_\ell$ via $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(0.547\pm0.026_{\rm stat}\pm0.016_{\rm syst})\%$ a…
▽ More
Based on $10.64~\mathrm{fb}^{-1}$ of $e^+e^-$ collision data taken at center-of-mass energies between 4.237 and 4.699 GeV with the BESIII detector, we study the leptonic $D^+_s$ decays using the $e^+e^-\to D^{*+}_{s} D^{*-}_{s}$ process. The branching fractions of $D_s^+\to\ell^+ν_{\ell}\,(\ell=μ,τ)$ are measured to be $\mathcal{B}(D_s^+\toμ^+ν_μ)=(0.547\pm0.026_{\rm stat}\pm0.016_{\rm syst})\%$ and $\mathcal{B}(D_s^+\toτ^+ν_τ)=(5.60\pm0.16_{\rm stat}\pm0.20_{\rm syst})\%$, respectively. The product of the decay constant and Cabibbo-Kobayashi-Maskawa matrix element $|V_{cs}|$ is determined to be $f_{D_s^+}|V_{cs}|=(246.5\pm5.9_{\rm stat}\pm3.6_{\rm syst}\pm0.5_{\rm input})_{μν}~\mathrm{MeV}$ and $f_{D_s^+}|V_{cs}|=(252.7\pm3.6_{\rm stat}\pm4.5_{\rm syst}\pm0.6_{\rm input}))_{τν}~\mathrm{MeV}$, respectively. Taking the value of $|V_{cs}|$ from a global fit in the Standard Model, we obtain ${f_{D^+_s}}=(252.8\pm6.0_{\rm stat}\pm3.7_{\rm syst}\pm0.6_{\rm input})_{μν}$ MeV and ${f_{D^+_s}}=(259.2\pm3.6_{\rm stat}\pm4.5_{\rm syst}\pm0.6_{\rm input})_{τν}$ MeV, respectively. Conversely, taking the value for $f_{D_s^+}$ from the latest lattice quantum chromodynamics calculation, we obtain $|V_{cs}| =(0.986\pm0.023_{\rm stat}\pm0.014_{\rm syst}\pm0.003_{\rm input})_{μν}$ and $|V_{cs}| = (1.011\pm0.014_{\rm stat}\pm0.018_{\rm syst}\pm0.003_{\rm input})_{τν}$, respectively.
△ Less
Submitted 18 July, 2024; v1 submitted 16 July, 2024;
originally announced July 2024.
-
HACMan++: Spatially-Grounded Motion Primitives for Manipulation
Authors:
Bowen Jiang,
Yilin Wu,
Wenxuan Zhou,
Chris Paxton,
David Held
Abstract:
Although end-to-end robot learning has shown some success for robot manipulation, the learned policies are often not sufficiently robust to variations in object pose or geometry. To improve the policy generalization, we introduce spatially-grounded parameterized motion primitives in our method HACMan++. Specifically, we propose an action representation consisting of three components: what primitiv…
▽ More
Although end-to-end robot learning has shown some success for robot manipulation, the learned policies are often not sufficiently robust to variations in object pose or geometry. To improve the policy generalization, we introduce spatially-grounded parameterized motion primitives in our method HACMan++. Specifically, we propose an action representation consisting of three components: what primitive type (such as grasp or push) to execute, where the primitive will be grounded (e.g. where the gripper will make contact with the world), and how the primitive motion is executed, such as parameters specifying the push direction or grasp orientation. These three components define a novel discrete-continuous action space for reinforcement learning. Our framework enables robot agents to learn to chain diverse motion primitives together and select appropriate primitive parameters to complete long-horizon manipulation tasks. By grounding the primitives on a spatial location in the environment, our method is able to effectively generalize across object shape and pose variations. Our approach significantly outperforms existing methods, particularly in complex scenarios demanding both high-level sequential reasoning and object generalization. With zero-shot sim-to-real transfer, our policy succeeds in challenging real-world manipulation tasks, with generalization to unseen objects. Videos can be found on the project website: https://sgmp-rss2024.github.io.
△ Less
Submitted 11 July, 2024;
originally announced July 2024.
-
Study of the decay and production properties of $D_{s1}(2536)$ and $D_{s2}^*(2573)$
Authors:
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere,
A. Brueggemann
, et al. (645 additional authors not shown)
Abstract:
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be…
▽ More
The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ processes are studied using data samples collected with the BESIII detector at center-of-mass energies from 4.530 to 4.946~GeV. The absolute branching fractions of $D_{s1}(2536)^- \rightarrow \bar{D}^{*0}K^-$ and $D_{s2}^*(2573)^- \rightarrow \bar{D}^0K^-$ are measured for the first time to be $(35.9\pm 4.8\pm 3.5)\%$ and $(37.4\pm 3.1\pm 4.6)\%$, respectively. The measurements are in tension with predictions based on the assumption that the $D_{s1}(2536)$ and $D_{s2}^*(2573)$ are dominated by a bare $c\bar{s}$ component. The $e^+e^-\rightarrow D_s^+D_{s1}(2536)^-$ and $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ cross sections are measured, and a resonant structure at around 4.6~GeV with a width of 50~MeV is observed for the first time with a statistical significance of $15σ$ in the $e^+e^-\rightarrow D_s^+D^*_{s2}(2573)^-$ process. It could be the $Y(4626)$ found by the Belle collaboration in the $D_s^+D_{s1}(2536)^{-}$ final state, since they have similar masses and widths. There is also evidence for a structure at around 4.75~GeV in both processes.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
Oracle Bone Inscriptions Multi-modal Dataset
Authors:
Bang Li,
Donghao Luo,
Yujie Liang,
Jing Yang,
Zengmao Ding,
Xu Peng,
Boyuan Jiang,
Shengwei Han,
Dan Sui,
Peichao Qin,
Pian Wu,
Chaoyang Wang,
Yun Qi,
Taisong Jin,
Chengjie Wang,
Xiaoming Huang,
Zhan Shu,
Rongrong Ji,
Yongge Liu,
Yunsheng Wu
Abstract:
Oracle bone inscriptions(OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can prove extremely challenging. Out of the 4,500 oracle bone characters excavated, only a third have been successfully identified. Therefore, leveraging…
▽ More
Oracle bone inscriptions(OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can prove extremely challenging. Out of the 4,500 oracle bone characters excavated, only a third have been successfully identified. Therefore, leveraging the advantages of advanced AI technology to assist in the decipherment of OBI is a highly essential research topic. However, fully utilizing AI's capabilities in these matters is reliant on having a comprehensive and high-quality annotated OBI dataset at hand whereas most existing datasets are only annotated in just a single or a few dimensions, limiting the value of their potential application. For instance, the Oracle-MNIST dataset only offers 30k images classified into 10 categories. Therefore, this paper proposes an Oracle Bone Inscriptions Multi-modal Dataset(OBIMD), which includes annotation information for 10,077 pieces of oracle bones. Each piece has two modalities: pixel-level aligned rubbings and facsimiles. The dataset annotates the detection boxes, character categories, transcriptions, corresponding inscription groups, and reading sequences in the groups of each oracle bone character, providing a comprehensive and high-quality level of annotations. This dataset can be used for a variety of AI-related research tasks relevant to the field of OBI, such as OBI Character Detection and Recognition, Rubbing Denoising, Character Matching, Character Generation, Reading Sequence Prediction, Missing Characters Completion task and so on. We believe that the creation and publication of a dataset like this will help significantly advance the application of AI algorithms in the field of OBI research.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
DART: Deep Adversarial Automated Red Teaming for LLM Safety
Authors:
Bojian Jiang,
Yi Jing,
Tianhao Shen,
Qing Yang,
Deyi Xiong
Abstract:
Manual Red teaming is a commonly-used method to identify vulnerabilities in large language models (LLMs), which, is costly and unscalable. In contrast, automated red teaming uses a Red LLM to automatically generate adversarial prompts to the Target LLM, offering a scalable way for safety vulnerability detection. However, the difficulty of building a powerful automated Red LLM lies in the fact that…
▽ More
Manual Red teaming is a commonly-used method to identify vulnerabilities in large language models (LLMs), which, is costly and unscalable. In contrast, automated red teaming uses a Red LLM to automatically generate adversarial prompts to the Target LLM, offering a scalable way for safety vulnerability detection. However, the difficulty of building a powerful automated Red LLM lies in the fact that the safety vulnerabilities of the Target LLM are dynamically changing with the evolution of the Target LLM. To mitigate this issue, we propose a Deep Adversarial Automated Red Teaming (DART) framework in which the Red LLM and Target LLM are deeply and dynamically interacting with each other in an iterative manner. In each iteration, in order to generate successful attacks as many as possible, the Red LLM not only takes into account the responses from the Target LLM, but also adversarially adjust its attacking directions by monitoring the global diversity of generated attacks across multiple iterations. Simultaneously, to explore dynamically changing safety vulnerabilities of the Target LLM, we allow the Target LLM to enhance its safety via an active learning based data selection mechanism. Experimential results demonstrate that DART significantly reduces the safety risk of the target LLM. For human evaluation on Anthropic Harmless dataset, compared to the instruction-tuning target LLM, DART eliminates the violation risks by 53.4\%. We will release the datasets and codes of DART soon.
△ Less
Submitted 4 July, 2024;
originally announced July 2024.
-
Measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be…
▽ More
A high precision measurement of the branching fraction of the decay $J/ψ\to p \bar{p} η$ is performed using $(10 087 \pm 44) \times 10^6$ $J/ψ$ events recorded by the {BESIII} detector at the {BEPCII} storage ring. The branching fractions of the two decays $J/ψ\to p \bar{p} η(η\to γγ)$ and $J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)$ are measured individually to be $\mathcal{B}(J/ψ\to p \bar{p} η(η\to γγ)) = (1.480 \pm 0.001 \pm 0.024)\times\,10^{-3}$ and $\mathcal{B}(J/ψ\to p \bar{p} η(η\to π^+ π^- π^0)) = (1.557 \pm 0.003 \pm 0.038)\times\,10^{-3}$, where the first uncertainties are statistical and the second systematic. Both results are compatible within their uncorrelated systematic uncertainties. The combined result is $\mathcal{B}(J/ψ\to p \bar{p} η)=(1.495 \pm 0.001 \pm 0.023)\times\,10^{-3}$ where the first uncertainty is the combined statistical uncertainty and the second one the combined systematic uncertainty of both analyses, incorporating correlations between them. In addition, the $p \bar{p}$ threshold region is investigated for a potential threshold enhancement, and no evidence for one is observed.
△ Less
Submitted 3 July, 2024;
originally announced July 2024.
-
SUPER: Seated Upper Body Pose Estimation using mmWave Radars
Authors:
Bo Zhang,
Zimeng Zhou,
Boyu Jiang,
Rong Zheng
Abstract:
In industrial countries, adults spend a considerable amount of time sedentary each day at work, driving and during activities of daily living. Characterizing the seated upper body human poses using mmWave radars is an important, yet under-studied topic with many applications in human-machine interaction, transportation and road safety. In this work, we devise SUPER, a framework for seated upper bo…
▽ More
In industrial countries, adults spend a considerable amount of time sedentary each day at work, driving and during activities of daily living. Characterizing the seated upper body human poses using mmWave radars is an important, yet under-studied topic with many applications in human-machine interaction, transportation and road safety. In this work, we devise SUPER, a framework for seated upper body human pose estimation that utilizes dual-mmWave radars in close proximity. A novel masking algorithm is proposed to coherently fuse data from the radars to generate intensity and Doppler point clouds with complementary information for high-motion but small radar cross section areas (e.g., upper extremities) and low-motion but large RCS areas (e.g. torso). A lightweight neural network extracts both global and local features of upper body and output pose parameters for the Skinned Multi-Person Linear (SMPL) model. Extensive leave-one-subject-out experiments on various motion sequences from multiple subjects show that SUPER outperforms a state-of-the-art baseline method by 30 -- 184%. We also demonstrate its utility in a simple downstream task for hand-object interaction.
△ Less
Submitted 2 July, 2024;
originally announced July 2024.
-
Observation of the Electromagnetic Dalitz Transition $h_c \rightarrow e^+e^-η_c$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
S. Ahmed,
M. Albrecht,
R. Aliberti,
A. Amoroso,
M. R. An,
Q. An,
X. H. Bai,
Y. Bai,
O. Bakina,
R. Baldini Ferroli,
I. Balossino,
Y. Ban,
K. Begzsuren,
N. Berger,
M. Bertani,
D. Bettoni,
F. Bianchi,
J. Bloms,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (495 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions…
▽ More
Using $(27.12\pm 0.14)\times10^8$ $ψ(3686)$ decays and data samples of $e^+e^-$ collisions with $\sqrt{s}$ from 4.130 to 4.780~GeV collected with the BESIII detector, we report the first observation of the electromagnetic Dalitz transition $h_c\to e^+e^-η_c$ with a statistical significance of $5.4σ$. We measure the ratio of the branching fractions $\frac{\mathcal{B}(h_c\rightarrow e^+e^-η_c)}{\mathcal{B}(h_c\rightarrow γη_c)}$ separately for the $h_c$ samples produced via $ψ(3686)\toπ^0h_c$ and $e^+e^-\toπ^+π^-h_c$. The average ratio is determined to be $(0.59\pm0.10(\text{stat.})\pm0.04(\text{syst.}))\%$, where the uncertainty includes both statistical and systematic components.
△ Less
Submitted 2 July, 2024; v1 submitted 28 June, 2024;
originally announced July 2024.
-
Improved measurement of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential dec…
▽ More
Analyzing $e^+e^-$ collision data corresponding to an integrated luminosity of $7.33~\mathrm{fb}^{-1}$ collected at center-of-mass energies between 4.128 and 4.226~GeV with the BESIII detector, we measure the branching fraction of the semileptonic decay $D^+_{s}\to K^0 e^+ν_e$ to be $(2.98\pm0.23\pm0.12)\times10^{-3}$. The $D_s^+\to K^0$ hadronic form factor is determined from the differential decay rate of $D^+_s\to K^0 e^+ν_e$ to be $f^{K^0}_+(0)=0.636\pm0.049\pm0.013$. For both measurements, the first uncertainty is statistical and the second systematic. The branching fraction and form factor measurements are factors of 1.6 and 1.7 more precise than the previous world averages, respectively.
△ Less
Submitted 27 June, 2024;
originally announced June 2024.
-
GeoReasoner: Geo-localization with Reasoning in Street Views using a Large Vision-Language Model
Authors:
Ling Li,
Yu Ye,
Bingchuan Jiang,
Wei Zeng
Abstract:
This work tackles the problem of geo-localization with a new paradigm using a large vision-language model (LVLM) augmented with human inference knowledge. A primary challenge here is the scarcity of data for training the LVLM - existing street-view datasets often contain numerous low-quality images lacking visual clues, and lack any reasoning inference. To address the data-quality issue, we devise…
▽ More
This work tackles the problem of geo-localization with a new paradigm using a large vision-language model (LVLM) augmented with human inference knowledge. A primary challenge here is the scarcity of data for training the LVLM - existing street-view datasets often contain numerous low-quality images lacking visual clues, and lack any reasoning inference. To address the data-quality issue, we devise a CLIP-based network to quantify the degree of street-view images being locatable, leading to the creation of a new dataset comprising highly locatable street views. To enhance reasoning inference, we integrate external knowledge obtained from real geo-localization games, tapping into valuable human inference capabilities. The data are utilized to train GeoReasoner, which undergoes fine-tuning through dedicated reasoning and location-tuning stages. Qualitative and quantitative evaluations illustrate that GeoReasoner outperforms counterpart LVLMs by more than 25% at country-level and 38% at city-level geo-localization tasks, and surpasses StreetCLIP performance while requiring fewer training resources. The data and code are available at https://github.com/lingli1996/GeoReasoner.
△ Less
Submitted 3 June, 2024;
originally announced June 2024.
-
Measurement of the cross sections of $e^+e^-\to K^{-}\barΞ^{+}Λ/Σ^{0}$ at center-of-mass energies between 3.510 and 4.914 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of…
▽ More
Using $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at center-of-mass energies between 3.510 and 4.914GeV, corresponding to an integrated luminosity of 25 fb$^{-1}$, we measure the Born cross sections for the process $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$ at thirty-five energy points with a partial-reconstruction strategy. By fitting the dressed cross sections of $e^+e^-\to K^-\barΞ^+Λ/Σ^{0}$, evidence for $ψ(4160) \to K^{-}\barΞ^{+}Λ$ is found for the first time with a significance of 4.4$σ$, including systematic uncertainties. No evidence for other possible resonances is found. In addition, the products of electronic partial width and branching fraction for all assumed resonances decaying into $K^{-}\barΞ^{+}Λ/Σ^{0}$ are determined.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Measurements of $K_S^0$-$K_L^0$ asymmetries in the decays $Λ_c^+ \to pK_{L,S}^0$, $pK_{L,S}^0π^+π^-$ and $pK_{L,S}^0π^0$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, an…
▽ More
Using $e^+e^-$ annihilation data sets corresponding to an integrated luminosity of 4.5 $\text{fb}^{-1}$, collected with the BESIII detector at center-of-mass energies between 4.600 and 4.699 GeV, we report the first measurements of the absolute branching fractions $\mathcal{B}(Λ_c^+\to pK_{L}^{0})=(1.67 \pm 0.06 \pm 0. 04)\%$, $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^+π^-)=(1.69 \pm 0.10 \pm 0.05)\%$, and $\mathcal{B}(Λ_c^+\to pK_{L}^{0}π^0)=(2.02 \pm 0.13 \pm 0.05)\%$, where the first uncertainties are statistical and the second systematic. Combining with the known branching fractions of $Λ_c^+ \to pK_{S}^{0}$, $Λ_c^+ \to pK_{S}^{0}π^+π^-$, and $Λ_c^+ \to pK_{S}^{0}π^0$, we present the first measurements of the $K_{S}^{0}$-$K_{L}^{0}$ asymmetries $R(Λ_c^+, K_{S,L}^0X) = \frac{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) - \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}{\mathcal{B}(Λ_c^+ \to K_{S}^{0} X) + \mathcal{B}(Λ_c^+ \to K_{L}^{0} X)}$ in charmed baryon decays: $R(Λ_c^+, pK_{S,L}^0) = -0.025 \pm 0.031$, $R(Λ_c^+, pK_{S,L}^0π^+π^-) = -0.027 \pm 0.048$, and $R(Λ_c^+, pK_{S,L}^0π^0) =-0.015 \pm 0.046$. No significant asymmetries within the uncertainties are observed.
△ Less
Submitted 26 June, 2024;
originally announced June 2024.
-
Catching Chameleons: Detecting Evolving Disinformation Generated using Large Language Models
Authors:
Bohan Jiang,
Chengshuai Zhao,
Zhen Tan,
Huan Liu
Abstract:
Despite recent advancements in detecting disinformation generated by large language models (LLMs), current efforts overlook the ever-evolving nature of this disinformation. In this work, we investigate a challenging yet practical research problem of detecting evolving LLM-generated disinformation. Disinformation evolves constantly through the rapid development of LLMs and their variants. As a cons…
▽ More
Despite recent advancements in detecting disinformation generated by large language models (LLMs), current efforts overlook the ever-evolving nature of this disinformation. In this work, we investigate a challenging yet practical research problem of detecting evolving LLM-generated disinformation. Disinformation evolves constantly through the rapid development of LLMs and their variants. As a consequence, the detection model faces significant challenges. First, it is inefficient to train separate models for each disinformation generator. Second, the performance decreases in scenarios when evolving LLM-generated disinformation is encountered in sequential order. To address this problem, we propose DELD (Detecting Evolving LLM-generated Disinformation), a parameter-efficient approach that jointly leverages the general fact-checking capabilities of pre-trained language models (PLM) and the independent disinformation generation characteristics of various LLMs. In particular, the learned characteristics are concatenated sequentially to facilitate knowledge accumulation and transformation. DELD addresses the issue of label scarcity by integrating the semantic embeddings of disinformation with trainable soft prompts to elicit model-specific knowledge. Our experiments show that \textit{DELD} significantly outperforms state-of-the-art methods. Moreover, our method provides critical insights into the unique patterns of disinformation generation across different LLMs, offering valuable perspectives in this line of research.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Enhancing Explainability of Knowledge Learning Paths: Causal Knowledge Networks
Authors:
Yuang Wei,
Yizhou Zhou,
Yuan-Hao Jiang,
Bo Jiang
Abstract:
A reliable knowledge structure is a prerequisite for building effective adaptive learning systems and intelligent tutoring systems. Pursuing an explainable and trustworthy knowledge structure, we propose a method for constructing causal knowledge networks. This approach leverages Bayesian networks as a foundation and incorporates causal relationship analysis to derive a causal network. Additionall…
▽ More
A reliable knowledge structure is a prerequisite for building effective adaptive learning systems and intelligent tutoring systems. Pursuing an explainable and trustworthy knowledge structure, we propose a method for constructing causal knowledge networks. This approach leverages Bayesian networks as a foundation and incorporates causal relationship analysis to derive a causal network. Additionally, we introduce a dependable knowledge-learning path recommendation technique built upon this framework, improving teaching and learning quality while maintaining transparency in the decision-making process.
△ Less
Submitted 25 June, 2024; v1 submitted 25 June, 2024;
originally announced June 2024.
-
Study of the $f_{0}(980)$ through the decay $D_{s}^{+}\rightarrow π^{+}π^{+}π^{-}π^{0}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (649 additional authors not shown)
Abstract:
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and…
▽ More
We perform the first amplitude analysis of $D^+_s \to π^+π^+π^-π^0$ decays, based on data samples of electron-positron collisions recorded with the BESIII detector at center-of-mass energies between 4.128 and 4.226 GeV, corresponding to an integrated luminosity of 7.33~fb$^{-1}$. We report the observation of $D_{s}^{+} \to f_0(980)ρ(770)^{+}$ with a statistical significance greater than 10$σ$ and determine the branching fractions $\mathcal{B}(D_s^+\toπ^+π^+π^-π^0|_{{\rm non}-η})=(2.04\pm0.08_{\rm stat.}\pm0.05_{\rm syst.})\%$ and $\mathcal{B}(D_s^+\toηπ^+)=(1.56\pm0.09_{\rm stat.}\pm0.04_{\rm syst.})\%$. Moreover, we measure the relative branching fraction between $φ\toπ^+π^-π^0$ and $φ\to K^+K^-$ to be $\frac{\mathcal{B}(φ(1020) \to π^+π^-π^0)}{\mathcal{B}(φ(1020) \to K^+K^-)}=0.230 \pm 0.014_{\rm stat.} \pm 0.010_{\rm syst.}$, which deviates from the world average value by more than $4σ$.
△ Less
Submitted 25 June, 2024;
originally announced June 2024.
-
Expansive Synthesis: Generating Large-Scale Datasets from Minimal Samples
Authors:
Vahid Jebraeeli,
Bo Jiang,
Hamid Krim,
Derya Cansever
Abstract:
The challenge of limited availability of data for training in machine learning arises in many applications and the impact on performance and generalization is serious. Traditional data augmentation methods aim to enhance training with a moderately sufficient data set. Generative models like Generative Adversarial Networks (GANs) often face problematic convergence when generating significant and di…
▽ More
The challenge of limited availability of data for training in machine learning arises in many applications and the impact on performance and generalization is serious. Traditional data augmentation methods aim to enhance training with a moderately sufficient data set. Generative models like Generative Adversarial Networks (GANs) often face problematic convergence when generating significant and diverse data samples. Diffusion models, though effective, still struggle with high computational cost and long training times. This paper introduces an innovative Expansive Synthesis model that generates large-scale, high-fidelity datasets from minimal samples. The proposed approach exploits expander graph mappings and feature interpolation to synthesize expanded datasets while preserving the intrinsic data distribution and feature structural relationships. The rationale of the model is rooted in the non-linear property of neural networks' latent space and in its capture by a Koopman operator to yield a linear space of features to facilitate the construction of larger and enriched consistent datasets starting with a much smaller dataset. This process is optimized by an autoencoder architecture enhanced with self-attention layers and further refined for distributional consistency by optimal transport. We validate our Expansive Synthesis by training classifiers on the generated datasets and comparing their performance to classifiers trained on larger, original datasets. Experimental results demonstrate that classifiers trained on synthesized data achieve performance metrics on par with those trained on full-scale datasets, showcasing the model's potential to effectively augment training data. This work represents a significant advancement in data generation, offering a robust solution to data scarcity and paving the way for enhanced data availability in machine learning applications.
△ Less
Submitted 24 June, 2024;
originally announced June 2024.
-
Statistical ranking with dynamic covariates
Authors:
Pinjun Dong,
Ruijian Han,
Binyan Jiang,
Yiming Xu
Abstract:
We consider a covariate-assisted ranking model grounded in the Plackett--Luce framework. Unlike existing works focusing on pure covariates or individual effects with fixed covariates, our approach integrates individual effects with dynamic covariates. This added flexibility enhances realistic ranking yet poses significant challenges for analyzing the associated estimation procedures. This paper ma…
▽ More
We consider a covariate-assisted ranking model grounded in the Plackett--Luce framework. Unlike existing works focusing on pure covariates or individual effects with fixed covariates, our approach integrates individual effects with dynamic covariates. This added flexibility enhances realistic ranking yet poses significant challenges for analyzing the associated estimation procedures. This paper makes an initial attempt to address these challenges. We begin by discussing the sufficient and necessary condition for the model's identifiability. We then introduce an efficient alternating maximization algorithm to compute the maximum likelihood estimator (MLE). Under suitable assumptions on the topology of comparison graphs and dynamic covariates, we establish a quantitative uniform consistency result for the MLE with convergence rates characterized by the asymptotic graph connectivity. The proposed graph topology assumption holds for several popular random graph models under optimal leading-order sparsity conditions. A comprehensive numerical study is conducted to corroborate our theoretical findings and demonstrate the application of the proposed model to real-world datasets, including horse racing and tennis competitions.
△ Less
Submitted 8 July, 2024; v1 submitted 24 June, 2024;
originally announced June 2024.
-
Search for the $e^+e^- \to φχ_{c1}(3872)$ process at BESIII
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (639 additional authors not shown)
Abstract:
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction…
▽ More
Based on 368.5 pb$^{-1}$ of $e^+e^-$ collision data collected at center-of-mass energies 4.914 and 4.946 GeV by the BESIII detector, the $e^+e^- \to φχ_{c1}(3872)$ process is searched for the first time. No significant signal is observed and the upper limits at the 90\% confidence level on the product of the Born cross section $σ(e^+e^- \to φχ_{c1}(3872))$ and the branching fraction $\mathcal{B}[χ_{c1}(3872)\toπ^+π^- J/ψ]$ at 4.914 and 4.946 GeV are set to be 0.85 and 0.96 pb, respectively. These measurements provide useful information for the production of the $χ_{c1}(3872)$ at $e^+e^-$ collider and deepen our understanding about the nature of this particle.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Graph Edge Representation via Tensor Product Graph Convolutional Representation
Authors:
Bo Jiang,
Sheng Ge,
Ziyan Zhang,
Beibei Wang,
Jin Tang,
Bin Luo
Abstract:
Graph Convolutional Networks (GCNs) have been widely studied. The core of GCNs is the definition of convolution operators on graphs. However, existing Graph Convolution (GC) operators are mainly defined on adjacency matrix and node features and generally focus on obtaining effective node embeddings which cannot be utilized to address the graphs with (high-dimensional) edge features. To address thi…
▽ More
Graph Convolutional Networks (GCNs) have been widely studied. The core of GCNs is the definition of convolution operators on graphs. However, existing Graph Convolution (GC) operators are mainly defined on adjacency matrix and node features and generally focus on obtaining effective node embeddings which cannot be utilized to address the graphs with (high-dimensional) edge features. To address this problem, by leveraging tensor contraction representation and tensor product graph diffusion theories, this paper analogously defines an effective convolution operator on graphs with edge features which is named as Tensor Product Graph Convolution (TPGC). The proposed TPGC aims to obtain effective edge embeddings. It provides a complementary model to traditional graph convolutions (GCs) to address the more general graph data analysis with both node and edge features. Experimental results on several graph learning tasks demonstrate the effectiveness of the proposed TPGC.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Leveraging Pedagogical Theories to Understand Student Learning Process with Graph-based Reasonable Knowledge Tracing
Authors:
Jiajun Cui,
Hong Qian,
Bo Jiang,
Wei Zhang
Abstract:
Knowledge tracing (KT) is a crucial task in intelligent education, focusing on predicting students' performance on given questions to trace their evolving knowledge. The advancement of deep learning in this field has led to deep-learning knowledge tracing (DLKT) models that prioritize high predictive accuracy. However, many existing DLKT methods overlook the fundamental goal of tracking students'…
▽ More
Knowledge tracing (KT) is a crucial task in intelligent education, focusing on predicting students' performance on given questions to trace their evolving knowledge. The advancement of deep learning in this field has led to deep-learning knowledge tracing (DLKT) models that prioritize high predictive accuracy. However, many existing DLKT methods overlook the fundamental goal of tracking students' dynamical knowledge mastery. These models do not explicitly model knowledge mastery tracing processes or yield unreasonable results that educators find difficulty to comprehend and apply in real teaching scenarios. In response, our research conducts a preliminary analysis of mainstream KT approaches to highlight and explain such unreasonableness. We introduce GRKT, a graph-based reasonable knowledge tracing method to address these issues. By leveraging graph neural networks, our approach delves into the mutual influences of knowledge concepts, offering a more accurate representation of how the knowledge mastery evolves throughout the learning process. Additionally, we propose a fine-grained and psychological three-stage modeling process as knowledge retrieval, memory strengthening, and knowledge learning/forgetting, to conduct a more reasonable knowledge tracing process. Comprehensive experiments demonstrate that GRKT outperforms eleven baselines across three datasets, not only enhancing predictive accuracy but also generating more reasonable knowledge tracing results. This makes our model a promising advancement for practical implementation in educational settings. The source code is available at https://github.com/JJCui96/GRKT.
△ Less
Submitted 7 June, 2024;
originally announced June 2024.
-
WebCanvas: Benchmarking Web Agents in Online Environments
Authors:
Yichen Pan,
Dehan Kong,
Sida Zhou,
Cheng Cui,
Yifei Leng,
Bing Jiang,
Hangyu Liu,
Yanyi Shang,
Shuyan Zhou,
Tongshuang Wu,
Zhengyang Wu
Abstract:
For web agents to be practically useful, they must adapt to the continuously evolving web environment characterized by frequent updates to user interfaces and content. However, most existing benchmarks only capture the static aspects of the web. To bridge this gap, we introduce WebCanvas, an innovative online evaluation framework for web agents that effectively addresses the dynamic nature of web…
▽ More
For web agents to be practically useful, they must adapt to the continuously evolving web environment characterized by frequent updates to user interfaces and content. However, most existing benchmarks only capture the static aspects of the web. To bridge this gap, we introduce WebCanvas, an innovative online evaluation framework for web agents that effectively addresses the dynamic nature of web interactions. WebCanvas contains three main components to facilitate realistic assessments: (1) A novel evaluation metric which reliably capture critical intermediate actions or states necessary for task completions while disregarding noise caused by insignificant events or changed web-elements. (2) A benchmark dataset called Mind2Web-Live, a refined version of original Mind2Web static dataset containing 542 tasks with 2439 intermediate evaluation states; (3) Lightweight and generalizable annotation tools and testing pipelines that enables the community to collect and maintain the high-quality, up-to-date dataset. Building on WebCanvas, we open-source an agent framework with extensible modules for reasoning, providing a foundation for the community to conduct online inference and evaluations. Our best-performing agent achieves a task success rate of 23.1% and a task completion rate of 48.8% on the Mind2Web-Live test set. Additionally, we analyze the performance discrepancies across various websites, domains, and experimental environments. We encourage the community to contribute further insights on online agent evaluation, thereby advancing this field of research.
△ Less
Submitted 16 July, 2024; v1 submitted 18 June, 2024;
originally announced June 2024.
-
A Peek into Token Bias: Large Language Models Are Not Yet Genuine Reasoners
Authors:
Bowen Jiang,
Yangxinyu Xie,
Zhuoqun Hao,
Xiaomeng Wang,
Tanwi Mallick,
Weijie J. Su,
Camillo J. Taylor,
Dan Roth
Abstract:
This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We go beyond evaluating LLMs on accuracy; rather, we aim to investigate their token bias in solving logical reasoning tasks. Specifically, we develop carefully controlled synthetic datasets, featuring conjunction fallacy and syll…
▽ More
This study introduces a hypothesis-testing framework to assess whether large language models (LLMs) possess genuine reasoning abilities or primarily depend on token bias. We go beyond evaluating LLMs on accuracy; rather, we aim to investigate their token bias in solving logical reasoning tasks. Specifically, we develop carefully controlled synthetic datasets, featuring conjunction fallacy and syllogistic problems. Our framework outlines a list of hypotheses where token biases are readily identifiable, with all null hypotheses assuming genuine reasoning capabilities of LLMs. The findings in this study suggest, with statistical guarantee, that most LLMs still struggle with logical reasoning. While they may perform well on classic problems, their success largely depends on recognizing superficial patterns with strong token bias, thereby raising concerns about their actual reasoning and generalization abilities.
△ Less
Submitted 16 June, 2024;
originally announced June 2024.
-
A Unified Graph Selective Prompt Learning for Graph Neural Networks
Authors:
Bo Jiang,
Hao Wu,
Ziyan Zhang,
Beibei Wang,
Jin Tang
Abstract:
In recent years, graph prompt learning/tuning has garnered increasing attention in adapting pre-trained models for graph representation learning. As a kind of universal graph prompt learning method, Graph Prompt Feature (GPF) has achieved remarkable success in adapting pre-trained models for Graph Neural Networks (GNNs). By fixing the parameters of a pre-trained GNN model, the aim of GPF is to mod…
▽ More
In recent years, graph prompt learning/tuning has garnered increasing attention in adapting pre-trained models for graph representation learning. As a kind of universal graph prompt learning method, Graph Prompt Feature (GPF) has achieved remarkable success in adapting pre-trained models for Graph Neural Networks (GNNs). By fixing the parameters of a pre-trained GNN model, the aim of GPF is to modify the input graph data by adding some (learnable) prompt vectors into graph node features to better align with the downstream tasks on the smaller dataset. However, existing GPFs generally suffer from two main limitations. First, GPFs generally focus on node prompt learning which ignore the prompting for graph edges. Second, existing GPFs generally conduct the prompt learning on all nodes equally which fails to capture the importances of different nodes and may perform sensitively w.r.t noisy nodes in aligning with the downstream tasks. To address these issues, in this paper, we propose a new unified Graph Selective Prompt Feature learning (GSPF) for GNN fine-tuning. The proposed GSPF integrates the prompt learning on both graph node and edge together, which thus provides a unified prompt model for the graph data. Moreover, it conducts prompt learning selectively on nodes and edges by concentrating on the important nodes and edges for prompting which thus make our model be more reliable and compact. Experimental results on many benchmark datasets demonstrate the effectiveness and advantages of the proposed GSPF method.
△ Less
Submitted 15 June, 2024;
originally announced June 2024.
-
Search for $X(1870)$ via the decay $J/ψ\to ωK^+ K^-η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the…
▽ More
Using a sample of $(10087\pm 44)\times10^{6}$ $J/ψ$ events collected by the BESIII detector at the BEPCII collider, we search for the decay $X(1870)\to K^+ K^-η$ via the $J/ψ\to ωK^+ K^- η$ process for the first time. No significant $X(1870)$ signal is observed. The upper limit on the branching fraction of the decay $ J/ψ\to ωX(1870) \toωK^+ K^- η$ is determined to be $9.55\times 10^{-7}$ at the $90\%$ confidence level. In addition, the branching faction $B(J/ψ\toωK^+ K^- η)$ is measured to be $(3.33\pm0.02(\rm{stat.})\pm 0.12(\rm{syst.}))\times 10^{-4}$.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Oblivious subspace embeddings for compressed Tucker decompositions
Authors:
Matthew Pietrosanu,
Bei Jiang,
Linglong Kong
Abstract:
Emphasis in the tensor literature on random embeddings (tools for low-distortion dimension reduction) for the canonical polyadic (CP) tensor decomposition has left analogous results for the more expressive Tucker decomposition comparatively lacking. This work establishes general Johnson-Lindenstrauss (JL) type guarantees for the estimation of Tucker decompositions when an oblivious random embeddin…
▽ More
Emphasis in the tensor literature on random embeddings (tools for low-distortion dimension reduction) for the canonical polyadic (CP) tensor decomposition has left analogous results for the more expressive Tucker decomposition comparatively lacking. This work establishes general Johnson-Lindenstrauss (JL) type guarantees for the estimation of Tucker decompositions when an oblivious random embedding is applied along each mode. When these embeddings are drawn from a JL-optimal family, the decomposition can be estimated within $\varepsilon$ relative error under restrictions on the embedding dimension that are in line with recent CP results. We implement a higher-order orthogonal iteration (HOOI) decomposition algorithm with random embeddings to demonstrate the practical benefits of this approach and its potential to improve the accessibility of otherwise prohibitive tensor analyses. On moderately large face image and fMRI neuroimaging datasets, empirical results show that substantial dimension reduction is possible with minimal increase in reconstruction error relative to traditional HOOI ($\leq$5% larger error, 50%-60% lower computation time for large models with 50% dimension reduction along each mode). Especially for large tensors, our method outperforms traditional higher-order singular value decomposition (HOSVD) and recently proposed TensorSketch methods.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Observation of $η_{c}$(1S, 2S) and $χ_{cJ}$ decays to 2$(π^{+}π^{-})η$ via $ψ$(3686) radiative transitions
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (636 additional authors not shown)
Abstract:
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measur…
▽ More
Based on $2.7 \times 10^9~ψ(3686)$ decays collected with the BESIII detector, the radiative decay $ψ(3686)\to\gamma2(π^{+}π^{-})η$ is investigated to measure properties of S- and P-wave charmonium states. The branching fraction of the decay $η_{c}(1S) \to 2(π^{+}π^{-})η$, which is found to have a strong dependence on the interference pattern between $η_c(1S)$ and non-$η_c(1S)$ processes, is measured in both destructive and constructive interference scenarios for the first time. The mass and width of the $η_{c}(1S)$ are measured to be $M=(2984.14 \pm 0.13 \pm 0.38)$ MeV/$c^{2}$ and $Γ=(28.82 \pm 0.11 \pm 0.82)$ MeV, respectively. Clear signals for the decays of the $χ_{cJ}(J=0,1,2)$ and the $η_{c}(2S)$ to $2(π^{+}π^{-})η$ are also observed for the first time, and the corresponding branching fractions are measured. The ratio of the branching fractions between the $η_{c}(2S)$ and $η_{c}(1S)$ decays is significantly lower than the theoretical prediction, which might suggest different dynamics in their decays.
△ Less
Submitted 12 June, 2024;
originally announced June 2024.
-
Label Smoothing Improves Machine Unlearning
Authors:
Zonglin Di,
Zhaowei Zhu,
Jinghan Jia,
Jiancheng Liu,
Zafar Takhirov,
Bo Jiang,
Yuanshun Yao,
Sijia Liu,
Yang Liu
Abstract:
The objective of machine unlearning (MU) is to eliminate previously learned data from a model. However, it is challenging to strike a balance between computation cost and performance when using existing MU techniques. Taking inspiration from the influence of label smoothing on model confidence and differential privacy, we propose a simple gradient-based MU approach that uses an inverse process of…
▽ More
The objective of machine unlearning (MU) is to eliminate previously learned data from a model. However, it is challenging to strike a balance between computation cost and performance when using existing MU techniques. Taking inspiration from the influence of label smoothing on model confidence and differential privacy, we propose a simple gradient-based MU approach that uses an inverse process of label smoothing. This work introduces UGradSL, a simple, plug-and-play MU approach that uses smoothed labels. We provide theoretical analyses demonstrating why properly introducing label smoothing improves MU performance. We conducted extensive experiments on six datasets of various sizes and different modalities, demonstrating the effectiveness and robustness of our proposed method. The consistent improvement in MU performance is only at a marginal cost of additional computations. For instance, UGradSL improves over the gradient ascent MU baseline by 66% unlearning accuracy without sacrificing unlearning efficiency.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
VersiCode: Towards Version-controllable Code Generation
Authors:
Tongtong Wu,
Weigang Wu,
Xingyu Wang,
Kang Xu,
Suyu Ma,
Bo Jiang,
Ping Yang,
Zhenchang Xing,
Yuan-Fang Li,
Gholamreza Haffari
Abstract:
Significant research has focused on improving the performance of large language model on code-related tasks due to their practical importance. Although performance is typically evaluated using public benchmark datasets, the existing datasets do not account for the concept of \emph{version}, which is crucial in professional software development. In this paper, we introduce VersiCode, the first comp…
▽ More
Significant research has focused on improving the performance of large language model on code-related tasks due to their practical importance. Although performance is typically evaluated using public benchmark datasets, the existing datasets do not account for the concept of \emph{version}, which is crucial in professional software development. In this paper, we introduce VersiCode, the first comprehensive dataset designed to assess the ability of large language models to generate verifiable code for specific library versions. VersiCode encompasses 300 libraries across more than 2,000 versions spanning 9 years. We design two dedicated evaluation tasks: version-specific code completion (VSCC) and version-aware code editing (VACE). Comprehensive experiments are conducted to benchmark the performance of LLMs, revealing the challenging nature of these tasks and VersiCode, that even state-of-the-art LLMs struggle to generate version-correct code. This dataset, together with the proposed tasks, sheds light on LLMs' capabilities and limitations in handling version-specific code generation, and opens up an important new area of research for further investigation. The resources can be found at https://github.com/wutong8023/VersiCode.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
Strong and weak $CP$ tests in sequential decays of polarized $Σ^0$ hyperons
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (644 additional authors not shown)
Abstract:
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The wea…
▽ More
The $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ processes and subsequent decays are studied using the world's largest $J/ψ$ and $ψ(3686)$ data samples collected with the BESIII detector. The strong-$CP$ symmetry is tested in the decays of the $Σ^0$ hyperons for the first time by measuring the decay parameters, $α_{Σ^0} = -0.0017 \pm 0.0021 \pm 0.0018$ and $\barα_{Σ^0} = 0.0021 \pm 0.0020 \pm 0.0022$. The weak-$CP$ test is performed in the subsequent decays of their daughter particles $Λ$ and $\barΛ$. Also for the first time, the transverse polarizations of the $Σ^0$ hyperons in $J/ψ$ and $ψ(3686)$ decays are observed with opposite directions, and the ratios between the S-wave and D-wave contributions of the $J/ψ, ψ(3686) \to Σ^0 \barΣ^{0}$ decays are obtained. These results are crucial to understand the decay dynamics of the charmonium states and the production mechanism of the $Σ^0-\barΣ^0$ pairs.
△ Less
Submitted 16 July, 2024; v1 submitted 10 June, 2024;
originally announced June 2024.
-
Measurement of the integrated luminosity of the data collected at 3.773 GeV by BESIII from 2021 to 2024
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (634 additional authors not shown)
Abstract:
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$,…
▽ More
We present a measurement of the integrated luminosity of $e^+e^-$ collision data collected with the BESIII detector at the BEPCII collider at a center-of-mass energy of $E_{\rm cm} = 3.773$~GeV. The integrated luminosities of the data sets taken from December 2021 to June 2022, from November 2022 to June 2023, and from October 2023 to February 2024 are determined to be $4.995 \pm 0.019$~fb$^{-1}$, $8.157 \pm 0.031$~fb$^{-1}$, and $4.191 \pm 0.016$~fb$^{-1}$, respectively, by analyzing large angle Bhabha scattering events. The uncertainties are dominated by systematic effects and the statistical uncertainties are negligible. Our results provide essential input for future analyses and precision measurements.
△ Less
Submitted 9 June, 2024;
originally announced June 2024.
-
Probing the distinct extinction law of the Pillars of Creation in M16 with JWST
Authors:
Jun Li,
Bingqiu Chen,
Biwei Jiang,
Jian Gao,
Xi Chen
Abstract:
Investigating the extinction law in regions of high dust extinction, such as the Pillars of Creation within the M16 region, is crucial for understanding the densest parts of the interstellar medium (ISM). In this study, we utilize observations from the Near-Infrared Camera (NIRCam) and the Mid-Infrared Instrument (MIRI) onboard the James Webb Space Telescope (JWST) to analyze the color-excess rati…
▽ More
Investigating the extinction law in regions of high dust extinction, such as the Pillars of Creation within the M16 region, is crucial for understanding the densest parts of the interstellar medium (ISM). In this study, we utilize observations from the Near-Infrared Camera (NIRCam) and the Mid-Infrared Instrument (MIRI) onboard the James Webb Space Telescope (JWST) to analyze the color-excess ratios $E(F090W-λ)/E(F090W-F200W)$ across a wavelength range of $0.9-7.7\,μ\mathrm{m}$. Our method involves performing linear regression on color-color diagrams to derive these ratios. The enhanced detection capabilities of JWST data allow us to probe the distinct extinction law to the densest regions in M16 corresponding to an extinction depth up to $A_V \sim 60$\,mag. Remarkably, the resultant color-excess ratio curve exhibits a flatter profile than predicted by typical dust extinction models with $R_V = 5.5$ for dense ISM environments. Moreover, we observe that the mid-infrared (MIR) extinction law diverges from the near-infrared (NIR) power-law, showing a tendency for the slope to flatten as the wavelength increases. These findings have significant implications for our understanding of the dust properties in dense interstellar environments.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^-π^0/η$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (643 additional authors not shown)
Abstract:
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for…
▽ More
Based on $(2712.4\pm 14.3)\times10^{6}$ $ψ(3686)$ events, we investigate four hadronic decay modes of the $P$-wave charmonium spin-singlet state $h_c(^1P_1) \to h^+ h^- π^0/η$ ($h=π$ or $K$) via the process $ψ(3686) \to π^{0}h_c$ at BESIII. The $h_c \to π^+ π^- π^0$ decay is observed with a significance of 9.6$σ$ after taking into account systematic uncertainties. Evidences for $h_c \to K^+ K^- π^0$ and $h_c \to K^+ K^- η$ are found with significances of $3.5σ$ and $3.3σ$, respectively, after considering the systematic uncertainties. The branching fractions of these decays are measured to be $\mathcal{B}(h_c \to π^+ π^- π^0)=(1.36\pm0.16\pm0.14)\times10^{-3}$, $\mathcal{B}(h_c \to K^+ K^- π^0)=(3.26\pm0.84\pm0.36)\times10^{-4}$, and $\mathcal{B}(h_c \to K^+ K^- η)=(3.13\pm1.08\pm0.38)\times10^{-4}$, where the first uncertainties are statistical and the second are systematic. No significant signal of $h_c\toπ^+π^-η$ is found, and the upper limit of its decay branching fraction is determined to be $\mathcal{B}(h_c\toπ^+π^-η) < 4.0 \times 10^{-4}$ at 90% confidence level.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
CE-NAS: An End-to-End Carbon-Efficient Neural Architecture Search Framework
Authors:
Yiyang Zhao,
Yunzhuo Liu,
Bo Jiang,
Tian Guo
Abstract:
This work presents a novel approach to neural architecture search (NAS) that aims to increase carbon efficiency for the model design process. The proposed framework CE-NAS addresses the key challenge of high carbon cost associated with NAS by exploring the carbon emission variations of energy and energy differences of different NAS algorithms. At the high level, CE-NAS leverages a reinforcement-le…
▽ More
This work presents a novel approach to neural architecture search (NAS) that aims to increase carbon efficiency for the model design process. The proposed framework CE-NAS addresses the key challenge of high carbon cost associated with NAS by exploring the carbon emission variations of energy and energy differences of different NAS algorithms. At the high level, CE-NAS leverages a reinforcement-learning agent to dynamically adjust GPU resources based on carbon intensity, predicted by a time-series transformer, to balance energy-efficient sampling and energy-intensive evaluation tasks. Furthermore, CE-NAS leverages a recently proposed multi-objective optimizer to effectively reduce the NAS search space. We demonstrate the efficacy of CE-NAS in lowering carbon emissions while achieving SOTA results for both NAS datasets and open-domain NAS tasks. For example, on the HW-NasBench dataset, CE-NAS reduces carbon emissions by up to 7.22X while maintaining a search efficiency comparable to vanilla NAS. For open-domain NAS tasks, CE-NAS achieves SOTA results with 97.35% top-1 accuracy on CIFAR-10 with only 1.68M parameters and a carbon consumption of 38.53 lbs of CO2. On ImageNet, our searched model achieves 80.6% top-1 accuracy with a 0.78 ms TensorRT latency using FP16 on NVIDIA V100, consuming only 909.86 lbs of CO2, making it comparable to other one-shot-based NAS baselines.
△ Less
Submitted 17 July, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Measurements of the branching fractions of semileptonic $D^{+}_s$ decays via $e^+e^-\to D_s^{*+}D_s^{*-}$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are…
▽ More
We measure the absolute branching fractions of semileptonic $D^+_s$ decays via the $e^+e^-\to D_s^{*+}D_s^{*-}$ process using $e^+e^-$ collision data corresponding to an integrated luminosity of $10.64~\mathrm{fb}^{-1}$ collected by the BESIII detector at center-of-mass energies between 4.237 and 4.699 GeV. The branching fractions are ${\mathcal B}(D_s^+\to ηe^+ν_e)=(2.35\pm0.11_{\rm stat}\pm 0.10_{\rm syst})\%,$ ${\mathcal
B}(D_s^+\to η^\prime e^+ν_e)=(0.82\pm0.09_{\rm stat}\pm 0.04_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to φe^+ν_e)=(2.21\pm0.16_{\rm stat}\pm 0.11_{\rm syst})\%,$ ${\mathcal B}(D_s^+\to f_0(980) e^+ν_e,f_0(980)\toπ^+π^-)=(0.15\pm0.02_{\rm stat}\pm 0.01_{\rm syst})\%,$ ${\mathcal
B}(D_s^+\to K^0 e^+ν_e)=(0.24\pm0.04_{\rm stat}\pm 0.01_{\rm syst})\%,$ and ${\mathcal B}(D_s^+\to K^{*0} e^+ν_e)=(0.19\pm0.03_{\rm stat}\pm 0.01_{\rm syst})\%.$ These results are consistent with those measured via the $e^+e^-\to D_s^{*\pm}D_s^{\mp}$ process by BESIII and CLEO. The hadronic transition form factors $D^+_s\to ηe^+ν_e$, $D^+_s\to η^\prime e^+ν_e$, and $D^+_s\to K^0 e^+ν_e$ at four-momentum transfer squared $q^2$ = 0 are determined to be $f^η_+(0) = 0.482 \pm 0.011_{\rm stat} \pm 0.009_{\rm syst}\pm0.004_{\rm input},$ $f^{η^{\prime}}_+(0) = 0.562 \pm 0.031_{\rm stat} \pm 0.014_{\rm
syst}\pm0.003_{\rm input},$ and $f^{K^0}_+(0) = 0.624 \pm 0.052_{\rm
stat} \pm 0.013_{\rm syst}\pm0.002_{\rm input}.$
△ Less
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey
Authors:
Bowen Jiang,
Yangxinyu Xie,
Xiaomeng Wang,
Weijie J. Su,
Camillo J. Taylor,
Tanwi Mallick
Abstract:
Rationality is the quality of being guided by reason, characterized by logical thinking and decision-making that align with evidence and logical rules. This quality is essential for effective problem-solving, as it ensures that solutions are well-founded and systematically derived. Despite the advancements of large language models (LLMs) in generating human-like text with remarkable accuracy, they…
▽ More
Rationality is the quality of being guided by reason, characterized by logical thinking and decision-making that align with evidence and logical rules. This quality is essential for effective problem-solving, as it ensures that solutions are well-founded and systematically derived. Despite the advancements of large language models (LLMs) in generating human-like text with remarkable accuracy, they present biases inherited from the training data, inconsistency across different contexts, and difficulty understanding complex scenarios involving multiple layers of context. Therefore, recent research attempts to leverage the strength of multiple agents working collaboratively with various types of data and tools for enhanced consistency and reliability. To that end, this paper aims to understand whether multi-modal and multi-agent systems are advancing toward rationality by surveying the state-of-the-art works, identifying advancements over single-agent and single-modal systems in terms of rationality, and discussing open problems and future directions. We maintain an open repository at https://github.com/bowen-upenn/MMMA_Rationality.
△ Less
Submitted 18 June, 2024; v1 submitted 31 May, 2024;
originally announced June 2024.
-
Design, Calibration, and Control of Compliant Force-sensing Gripping Pads for Humanoid Robots
Authors:
Yuanfeng Han,
Boren Jiang,
Gregory S. Chirikjian
Abstract:
This paper introduces a pair of low-cost, light-weight and compliant force-sensing gripping pads used for manipulating box-like objects with smaller-sized humanoid robots. These pads measure normal gripping forces and center of pressure (CoP). A calibration method is developed to improve the CoP measurement accuracy. A hybrid force-alignment-position control framework is proposed to regulate the g…
▽ More
This paper introduces a pair of low-cost, light-weight and compliant force-sensing gripping pads used for manipulating box-like objects with smaller-sized humanoid robots. These pads measure normal gripping forces and center of pressure (CoP). A calibration method is developed to improve the CoP measurement accuracy. A hybrid force-alignment-position control framework is proposed to regulate the gripping forces and to ensure the surface alignment between the grippers and the object. Limit surface theory is incorporated as a contact friction modeling approach to determine the magnitude of gripping forces for slippage avoidance. The integrated hardware and software system is demonstrated with a NAO humanoid robot. Experiments show the effectiveness of the overall approach.
△ Less
Submitted 31 May, 2024;
originally announced May 2024.
-
Search for $e^{+}e^{-}\toη'ψ(2S)$ at center-of-mass energies from 4.66 to 4.95 GeV
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using data samples with an integrated luminosity of $4.67~\mathrm{fb}^{-1}$ collected by the BESIII detector operating at the BEPCII collider, we search for the process $e^+e^- \rightarrow η' ψ(2S)$ at center-of-mass energies from $4.66$ to $4.95~\mathrm{GeV}$. No significant signal is observed, and upper limits for the Born cross sections $σ^B(e^+e^-\rightarrowη'ψ(2S))$ at the 90\% confidence lev…
▽ More
Using data samples with an integrated luminosity of $4.67~\mathrm{fb}^{-1}$ collected by the BESIII detector operating at the BEPCII collider, we search for the process $e^+e^- \rightarrow η' ψ(2S)$ at center-of-mass energies from $4.66$ to $4.95~\mathrm{GeV}$. No significant signal is observed, and upper limits for the Born cross sections $σ^B(e^+e^-\rightarrowη'ψ(2S))$ at the 90\% confidence level are determined.
△ Less
Submitted 31 May, 2024;
originally announced May 2024.
-
Study of the decays $χ_{cJ} \rightarrow Λ\barΛφ$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (637 additional authors not shown)
Abstract:
Based on $(2712.4 \pm 14.3) \times 10^{6}$ $ e^{+}e^{-}\toψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we report the first evidence of $χ_{c0}\to Λ\bar Λφ$ decays and the first observation of $χ_{c1,2}\to Λ\bar Λφ$ decays, with significances of $4.5σ$, $11.3σ$ and $13.0σ$, respectively. The decay branching fractions of $χ_{c0,1,2}\to Λ\bar Λφ$ are measured t…
▽ More
Based on $(2712.4 \pm 14.3) \times 10^{6}$ $ e^{+}e^{-}\toψ(3686)$ events collected with the BESIII detector operating at the BEPCII collider, we report the first evidence of $χ_{c0}\to Λ\bar Λφ$ decays and the first observation of $χ_{c1,2}\to Λ\bar Λφ$ decays, with significances of $4.5σ$, $11.3σ$ and $13.0σ$, respectively. The decay branching fractions of $χ_{c0,1,2}\to Λ\bar Λφ$ are measured to be $( 2.99\pm1.24\pm0.19) \times 10^{-5}$, $(6.01\pm0.90\pm0.40 )\times 10^{-5}$, and $(7.13\pm0.81\pm0.36) \times 10^{-5}$, where the first uncertainties are statistical and the second systematic. No obvious enhancement near the $Λ\barΛ$ production threshold or excited $Λ$ state is found in the $Λφ$ (or $\barΛφ$) system.
△ Less
Submitted 31 May, 2024;
originally announced May 2024.
-
NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models
Authors:
Kai Wu,
Boyuan Jiang,
Zhengkai Jiang,
Qingdong He,
Donghao Luo,
Shengzhi Wang,
Qingwen Liu,
Chengjie Wang
Abstract:
Multimodal large language models (MLLMs) contribute a powerful mechanism to understanding visual information building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating lengthy, detailed descriptions for images. Our analysis reveals that hallucinations stem from the inherent summarization mechanism of large language models, leading…
▽ More
Multimodal large language models (MLLMs) contribute a powerful mechanism to understanding visual information building on large language models. However, MLLMs are notorious for suffering from hallucinations, especially when generating lengthy, detailed descriptions for images. Our analysis reveals that hallucinations stem from the inherent summarization mechanism of large language models, leading to excessive dependence on linguistic tokens while neglecting vision information. In this paper, we propose NoiseBoost, a broadly applicable and simple method for alleviating hallucinations for MLLMs through the integration of noise feature perturbations. Noise perturbation acts as a regularizer, facilitating a balanced distribution of attention weights among visual and linguistic tokens. Despite its simplicity, NoiseBoost consistently enhances the performance of MLLMs across common training strategies, including supervised fine-tuning and reinforcement learning. Further, NoiseBoost pioneerly enables semi-supervised learning for MLLMs, unleashing the power of unlabeled data. Comprehensive experiments demonstrate that NoiseBoost improves dense caption accuracy by 8.1% with human evaluation and achieves comparable results with 50% of the data by mining unlabeled data. Code and models are available at https://kaiwu5.github.io/noiseboost.
△ Less
Submitted 31 May, 2024; v1 submitted 30 May, 2024;
originally announced May 2024.
-
Global and local observability of hypergraphs
Authors:
Chencheng Zhang,
Hao Yang,
Shaoxuan Cui,
Bin Jiang,
Ming Cao
Abstract:
This paper studies observability for non-uniform hypergraphs with inputs and outputs. To capture higher-order interactions, we define a canonical non-homogeneous dynamical system with nonlinear outputs on hypergraphs. We then construct algebraic necessary and sufficient conditions based on polynomial ideals and varieties for global observability at an initial state of hypergraphs. An example is gi…
▽ More
This paper studies observability for non-uniform hypergraphs with inputs and outputs. To capture higher-order interactions, we define a canonical non-homogeneous dynamical system with nonlinear outputs on hypergraphs. We then construct algebraic necessary and sufficient conditions based on polynomial ideals and varieties for global observability at an initial state of hypergraphs. An example is given to illustrate the proposed criteria for observability. Further, necessary and sufficient conditions for local observability are derived based on rank conditions of observability matrices, which provide a framework to study local observability for non-uniform hypergraphs. Finally, the similarity of observability for hypergraphs is proposed using similarity of tensors, which reveals the relation of observability between two hypergraphs and helps to check the observability intuitively.
△ Less
Submitted 29 May, 2024;
originally announced May 2024.
-
Knowing What Not to Do: Leverage Language Model Insights for Action Space Pruning in Multi-agent Reinforcement Learning
Authors:
Zhihao Liu,
Xianliang Yang,
Zichuan Liu,
Yifan Xia,
Wei Jiang,
Yuanyu Zhang,
Lijuan Li,
Guoliang Fan,
Lei Song,
Bian Jiang
Abstract:
Multi-agent reinforcement learning (MARL) is employed to develop autonomous agents that can learn to adopt cooperative or competitive strategies within complex environments. However, the linear increase in the number of agents leads to a combinatorial explosion of the action space, which may result in algorithmic instability, difficulty in convergence, or entrapment in local optima. While research…
▽ More
Multi-agent reinforcement learning (MARL) is employed to develop autonomous agents that can learn to adopt cooperative or competitive strategies within complex environments. However, the linear increase in the number of agents leads to a combinatorial explosion of the action space, which may result in algorithmic instability, difficulty in convergence, or entrapment in local optima. While researchers have designed a variety of effective algorithms to compress the action space, these methods also introduce new challenges, such as the need for manually designed prior knowledge or reliance on the structure of the problem, which diminishes the applicability of these techniques. In this paper, we introduce Evolutionary action SPAce Reduction with Knowledge (eSpark), an exploration function generation framework driven by large language models (LLMs) to boost exploration and prune unnecessary actions in MARL. Using just a basic prompt that outlines the overall task and setting, eSpark is capable of generating exploration functions in a zero-shot manner, identifying and pruning redundant or irrelevant state-action pairs, and then achieving autonomous improvement from policy feedback. In reinforcement learning tasks involving inventory management and traffic light control encompassing a total of 15 scenarios, eSpark consistently outperforms the combined MARL algorithm in all scenarios, achieving an average performance gain of 34.4% and 9.9% in the two types of tasks respectively. Additionally, eSpark has proven to be capable of managing situations with a large number of agents, securing a 29.7% improvement in scalability challenges that featured over 500 agents. The code can be found in https://github.com/LiuZhihao2022/eSpark.git.
△ Less
Submitted 27 May, 2024;
originally announced May 2024.
-
Koopcon: A new approach towards smarter and less complex learning
Authors:
Vahid Jebraeeli,
Bo Jiang,
Derya Cansever,
Hamid Krim
Abstract:
In the era of big data, the sheer volume and complexity of datasets pose significant challenges in machine learning, particularly in image processing tasks. This paper introduces an innovative Autoencoder-based Dataset Condensation Model backed by Koopman operator theory that effectively packs large datasets into compact, information-rich representations. Inspired by the predictive coding mechanis…
▽ More
In the era of big data, the sheer volume and complexity of datasets pose significant challenges in machine learning, particularly in image processing tasks. This paper introduces an innovative Autoencoder-based Dataset Condensation Model backed by Koopman operator theory that effectively packs large datasets into compact, information-rich representations. Inspired by the predictive coding mechanisms of the human brain, our model leverages a novel approach to encode and reconstruct data, maintaining essential features and label distributions. The condensation process utilizes an autoencoder neural network architecture, coupled with Optimal Transport theory and Wasserstein distance, to minimize the distributional discrepancies between the original and synthesized datasets. We present a two-stage implementation strategy: first, condensing the large dataset into a smaller synthesized subset; second, evaluating the synthesized data by training a classifier and comparing its performance with a classifier trained on an equivalent subset of the original data. Our experimental results demonstrate that the classifiers trained on condensed data exhibit comparable performance to those trained on the original datasets, thus affirming the efficacy of our condensation model. This work not only contributes to the reduction of computational resources but also paves the way for efficient data handling in constrained environments, marking a significant step forward in data-efficient machine learning.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Mechanistic Insights into Non-Adiabatic Interband Transitions on a Semiconductor Surface Induced by Hydrogen Atom Collisions
Authors:
Lingjun Zhu,
Qijing Zheng,
Yingqi Wang,
Kerstin Krüger,
Alec M. Wodtke,
Oliver Bünermann,
Jin Zhao,
Hua Guo,
Bin Jiang
Abstract:
To understand the recently observed mysterious non-adiabatic energy transfer for hyperthermal H atom scattering from a semiconductor surface, Ge(111)c(2*8), we present a mixed quantum-classical non-adiabatic molecular dynamics model based on time-dependent evolution of Kohn-Sham orbitals and a classical path approximation. Our results suggest that facile non-adiabatic transitions occur selectively…
▽ More
To understand the recently observed mysterious non-adiabatic energy transfer for hyperthermal H atom scattering from a semiconductor surface, Ge(111)c(2*8), we present a mixed quantum-classical non-adiabatic molecular dynamics model based on time-dependent evolution of Kohn-Sham orbitals and a classical path approximation. Our results suggest that facile non-adiabatic transitions occur selectively at the rest atom site, featuring excitation of valance band electrons to the conduction band, but not at the adatom site. This drastic site specificity can be attributed to the changes of the local band structure upon energetic H collisions at different surface sites, leading to transient near-degeneracies and significant couplings between occupied and unoccupied orbitals at the rest atom, but not at the adatom. These insights shed valuable light on the collisional induced non-adiabatic dynamics at semiconductor surfaces.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Study of the decays $χ_{cJ}\toΛ\barΛω$
Authors:
BESIII Collaboration,
M. Ablikim,
M. N. Achasov,
P. Adlarson,
O. Afedulidis,
X. C. Ai,
R. Aliberti,
A. Amoroso,
Q. An,
Y. Bai,
O. Bakina,
I. Balossino,
Y. Ban,
H. -R. Bao,
V. Batozskaya,
K. Begzsuren,
N. Berger,
M. Berlowski,
M. Bertani,
D. Bettoni,
F. Bianchi,
E. Bianco,
A. Bortone,
I. Boyko,
R. A. Briere
, et al. (638 additional authors not shown)
Abstract:
Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$,…
▽ More
Using $(27.12\pm 0.14)\times10^{8}$ $ψ(3686)$ events collected with the BESIII detector, we present the first observation of the decays $χ_{cJ}\toΛ\barΛω$, where $J=0, 1, 2$, with statistical significances of $11.7 σ, 11.2 σ$, and $11.8 σ$. The branching fractions of these decays are determined to be $\mathcal{B}(χ_{c0}\toΛ\barΛω)=({2.37 \pm 0.22 \pm 0.23}) \times 10^{-4}$, $\mathcal{B}(χ_{c1}\toΛ\barΛω)=({1.01 \pm 0.10 \pm 0.11}) \times 10^{-4}$, and $\mathcal{B}(χ_{c2}\toΛ\barΛω)=({1.40 \pm 0.13 \pm 0.17}) \times 10^{-4}$, where the first uncertainties are statistical and the second are systematic. We observe no clear intermediate structures.
△ Less
Submitted 21 May, 2024;
originally announced May 2024.