Abstract
A search for the rare decay is performed using
collision data collected with the LHCb detector at centre-of-mass energies of 7, 8 and 13 TeV, corresponding to an integrated luminosity of 9 fb⁻¹. No significant signal of the decay is observed and an upper limit of
at 90% confidence level is set on the branching fraction.

Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. Article funded by SCOAP³ and published under licence by Chinese Physical Society and the Institute of High Energy Physics of the Chinese Academy of Sciences and the Institute of Modern Physics of the Chinese Academy of Sciences and IOP Publishing Ltd.
I. INTRODUCTION
The
decay was first observed by the LHCb experiment with a branching fraction of
[1]. It proceeds primarily through the Cabibbo-suppressed
transition. The
pair can come either directly from the
decay via an
pair created in the vacuum, or from the decay of intermediate states that contain both
and
components, such as the
resonance
①
. There is a potential contribution from the
meson as an intermediate state. The decay
is suppressed by the Okubo-Zweig-Iizuka (OZI) rule that forbids disconnected quark diagrams [2-4]. The size of this contribution and the exact mechanism to produce the
meson in this process are of particular theoretical interest [5-7]. Under the assumption that the dominant contribution is via a small
component in the
wave-function, arising from
mixing (Fig. 1(a)), the branching fraction of the
decay is predicted to be of the order of
[5]. Contributions to
decays from the OZI-suppressed tri-gluon fusion (Fig. 1(b)), photoproduction and final-state rescattering are estimated to be at least one order of magnitude lower [7]. Experimental studies of the decay
could provide important information about the dynamics of OZI-suppressed decays.
Fig. 1 Feynman diagrams for the decay
via (a)
mixing and (b) tri-gluon fusion.
No significant signal of
decay has been observed in previous searches by several experiments. Upper limits on the branching fraction of the decay have been set by BaBar [8], Belle [9] and LHCb [1]. The LHCb limit was obtained using a data sample corresponding to an integrated luminosity of 1
of
collision data, collected at a centre-of-mass energy of 7
. This paper presents an update on the search for
decays using a data sample corresponding to an integrated luminosity of 9
, including 3
collected at 7 and 8
, denoted as Run 1, and 6
collected at 13
, denoted as Run 2.
The LHCb measurement in Ref. [1] is obtained from an amplitude analysis of
decays over a wide
range from the
mass threshold to 2200
. This paper focuses on the
region, with the
mass in the range 1000–1050
, and on studies of the
and
mass distributions, to distinguish the
signal from the non-resonant decay
and background contaminations. The abundant decay
is used as the normalisation channel. The choice of mass fits over a full amplitude analysis is motivated by several considerations. The sharp
mass peak provides a clear signal characteristic and the lineshape can be very well determined using the copious
decays. On the other hand, interference of the S-wave (either
(980) or non-resonant) and P-wave amplitudes vanishes in the
spectrum, up to negligible angular acceptance effects, after integrating over the angular variables. Furthermore, significant correlations observed between
,
and angular variables make it challenging to describe the mass-dependent angular distributions of both signal and background, which are required for an amplitude analysis. Finally, the power of the amplitude analysis in discriminating the signal from the non-
contribution and background is reduced by the large number of parameters that need to be determined in the fit. In addition, a good understanding of the contamination from
decays in the
mass-region is essential in the search for
.
II. DETECTOR AND SIMULATION
The LHCb detector [10, 11] is a single-arm forward spectrometer covering the pseudorapidity range
, designed for the study of particles containing b or c quarks. The detector includes a high-precision tracking system consisting of a silicon-strip vertex detector surrounding the
interaction region, a large-area silicon-strip detector located upstream of a dipole magnet with a bending power of about
, and three stations of silicon-strip detectors and straw drift tubes placed downstream of the magnet. The tracking system provides a measurement of the momentum, p, of charged particles with a relative uncertainty that varies from 0.5% at low momentum to 1.0% at 200
. The minimum distance of a track to a primary vertex (PV), the impact parameter (IP), is measured with a resolution of
, where
is the component of the momentum transverse to the beam, in
. Different types of charged hadrons are distinguished using information from two ring-imaging Cherenkov detectors. Photons, electrons and hadrons are identified by a calorimeter system consisting of scintillating-pad and preshower detectors, an electromagnetic and a hadronic calorimeter. Muons are identified by a system composed of alternating layers of iron and multiwire proportional chambers.
Samples of simulated decays are used to optimise the signal candidate selection and to derive the selection efficiency. In the simulation,
collisions are generated using PYTHIA [12, 13] with a specific LHCb configuration [14]. Decays of unstable particles are described by EVTGEN [15], in which final-state radiation is generated using PHOTOS [16]. The interaction of the generated particles with the detector, and its response, are implemented using the GEANT4 toolkit [17, 18] as described in Ref. [19].
III. CANDIDATE SELECTION
The online event selection is performed by a trigger, which consists of a hardware stage, based on information from the calorimeter and muon systems, followed by a software stage, which applies a full event reconstruction. An inclusive approach for the hardware trigger is used to maximise the available data sample, as described in Ref. [20]. Since the centre-of-mass energies and trigger thresholds are different for the Run 1 and Run 2 data-taking, the offline selection is performed separately for the two periods, following the procedure described below. The resulting data samples for the two periods are treated separately in the subsequent analysis procedure.
The offline selection comprises two stages. First, a loose selection is used to reconstruct both
and
candidates in the same way, given their similar kinematics. Two oppositely charged muon candidates with
are combined to form a
candidate. The muon pair is required to have a common vertex and an invariant mass,
, in the range 3020–3170
. A pair of oppositely charged kaon candidates identified by the Cherenkov detectors is combined to form a
candidate. The
pair is required to have an invariant mass,
, in the range 1000–1050
. The
and
candidates are combined to form a
candidate, which is required to have good vertex quality and an invariant mass,
, in the range 5200–5550
. The resulting
candidate is assigned to the PV with which it has the smallest
, where
is defined as the difference in the vertex-fit
of a given PV reconstructed with and without the particle being considered. The invariant mass of the
candidate is calculated from a kinematic fit for which the momentum vector of the
candidates is aligned with the vector connecting the PV to the
decay vertex and
is constrained to the known
meson mass [21]. In order to suppress the background due to the random combination of a prompt
meson and a pair of charged kaons, the decay time of the
candidate is required to be greater than 0.3
.
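The mass and decay-time requirements of the loose selection can be summarised as a simple filter. The following is a minimal sketch, not the collaboration's software: the dictionary keys are hypothetical names, the windows are treated as inclusive by assumption, and the numbers are used in the units of the text (the unit symbols are missing from this copy). Vertex-quality, muon and particle-identification requirements are not modelled.

```python
# Windows quoted in the text, in the text's own units (assumed consistent).
DIMUON_MASS_WINDOW = (3020.0, 3170.0)
DIKAON_MASS_WINDOW = (1000.0, 1050.0)
B_MASS_WINDOW = (5200.0, 5550.0)
MIN_DECAY_TIME = 0.3


def in_window(value, window):
    """Return True if value lies inside the (low, high) window, edges included."""
    low, high = window
    return low <= value <= high


def passes_loose_selection(candidate):
    """Apply the mass windows and the decay-time cut to one candidate.

    `candidate` is a dict with the hypothetical keys
    m_mumu, m_kk, m_b and decay_time.
    """
    return (
        in_window(candidate["m_mumu"], DIMUON_MASS_WINDOW)
        and in_window(candidate["m_kk"], DIKAON_MASS_WINDOW)
        and in_window(candidate["m_b"], B_MASS_WINDOW)
        and candidate["decay_time"] > MIN_DECAY_TIME
    )


good = {"m_mumu": 3097.0, "m_kk": 1020.0, "m_b": 5280.0, "decay_time": 1.5}
prompt_like = dict(good, decay_time=0.1)
print(passes_loose_selection(good), passes_loose_selection(prompt_like))
```

The decay-time cut is what removes the prompt-like candidate in this toy input; the mass windows are identical for both.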
In a second selection stage, a boosted decision tree (BDT) classifier [22, 23] is used to further suppress combinatorial background. The BDT classifier is trained using simulated
decays representing the signal, and candidates with
in the range 5480–5550
as background. Candidates in both samples are required to have passed the trigger and the loose selection described above. Using a multivariate technique [24], the
simulation sample is corrected to match the observed distributions in background-subtracted data, including that of the
and pseudorapidity of the
, the
of the
decay vertex, the
of the decay chain of the
candidate [25], the particle identification variables, the track-fit
of the muon and kaon candidates, and the numbers of tracks measured simultaneously in both the vertex detector and tracking stations.
The input variables of the BDT classifier are the minimum track–fit
of the muons and the kaons, the
of the
candidate and the
combination, the
of the
decay vertex, particle identification probabilities for muons and kaons, the minimum
of the muons and kaons, the
of the
decay vertex, the
of the
candidate, and the
of the
decay chain fit. The optimal requirement on the BDT response for the
candidates is obtained by maximising the quantity
, where
is the signal efficiency determined in simulation and N is the number of candidates found in the
region around the known
mass [21].
In addition to combinatorial background, the data also contain fake candidates from
(
) decays, where the proton (pion) is misidentified as a kaon. To suppress these background sources, a
candidate is rejected if its invariant mass, computed with one kaon interpreted as a proton (pion), lies within
of the known
(
) mass [21] and if the kaon candidate also satisfies proton (pion) identification requirements.
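The veto logic described above, recomputing the candidate mass under an alternative mass hypothesis for one kaon, can be sketched as follows. This is an illustration under stated assumptions: the function names, the half-window argument and the PID flag are hypothetical, and the window size used by the analysis is not given in this copy of the text. The particle masses are standard values in MeV.

```python
import math

KAON_MASS = 493.677    # MeV
PROTON_MASS = 938.272  # MeV
PION_MASS = 139.570    # MeV


def invariant_mass(particles):
    """Invariant mass of a list of ((px, py, pz), mass) entries, in MeV."""
    e_tot = px_tot = py_tot = pz_tot = 0.0
    for (px, py, pz), mass in particles:
        e_tot += math.sqrt(px * px + py * py + pz * pz + mass * mass)
        px_tot += px
        py_tot += py
        pz_tot += pz
    m2 = e_tot ** 2 - px_tot ** 2 - py_tot ** 2 - pz_tot ** 2
    return math.sqrt(max(m2, 0.0))


def is_vetoed(particles, kaon_index, alt_mass, known_mass, half_window, passes_alt_pid):
    """Reject a candidate if, with one kaon given the alternative mass,
    the recomputed mass is within half_window of known_mass AND the kaon
    also satisfies the alternative particle-identification requirement."""
    swapped = [
        (momentum, alt_mass if i == kaon_index else mass)
        for i, (momentum, mass) in enumerate(particles)
    ]
    return passes_alt_pid and abs(invariant_mass(swapped) - known_mass) < half_window


# Toy input: two back-to-back kaons, momenta in MeV.
pair = [((0.0, 0.0, 300.0), KAON_MASS), ((0.0, 0.0, -300.0), KAON_MASS)]
nominal = invariant_mass(pair)
as_proton = invariant_mass([(pair[0][0], PROTON_MASS), pair[1]])
print(nominal, as_proton)
```

Both conditions must hold for the veto to fire, so a candidate whose kaon fails the alternative identification requirement is kept even if its recomputed mass falls in the window.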
A previous study of
decays found that the yield of the background from
decays is only 0.1% of the
signal yield [20]. Furthermore, only 1.2% of these decays, corresponding to about one candidate (three candidates) in the Run 1 (Run 2) data sample, fall in the
mass region 5265–5295
, according to simulation. This background is therefore neglected. The fraction of events containing more than one candidate is 0.11% in Run 1 data and 0.70% in Run 2 data; these events are removed from the total data sample. The acceptance, trigger, reconstruction and selection efficiencies of the signal and normalisation channels are determined using simulation, which is corrected for the efficiency differences with respect to the data. The ratio of the total efficiencies of
and
is estimated to be
for Run 1 and
for Run 2, where the first uncertainties are statistical and the second ones are associated with corrections to the simulation. The polarisation amplitudes are assumed to be the same in
and
decays. The systematic uncertainty associated with this assumption is found to be small and is neglected.
IV. MASS FITS
There is a significant correlation between
and
in
decays, as illustrated in Fig. 2. Hence, the search for
decays is carried out by performing sequential fits to the distributions of
and
. A fit to the
distribution is used to estimate the yields of the background components in the
regions around the
and
nominal masses. A subsequent simultaneous fit to the
distributions of candidates falling in the two
mass windows, with the background yields fixed to their values from the first step, is performed to estimate the yield of
decays.
Fig. 2 (color online) Distributions of the invariant mass
in different
intervals with boundaries at 5220, 5265, 5295, 5330, 5400 and 5550
. They are obtained using simulated
decays and normalised to unity.
The probability density function (PDF) for the
distribution of both the
and
decays is modelled by the sum of a Hypatia [26] and a Gaussian function sharing the same mean. The fraction, the width ratio between the Hypatia and Gaussian functions and the Hypatia tail parameters are determined from simulation. The
shape of the
background is described by a template obtained from simulation, while the combinatorial background is described by an exponential function with the slope left to vary. The PDFs of
and
decays share the same shape parameters, and the difference between the
and
masses is constrained to the known mass difference of
[21].
An unbinned maximum-likelihood fit is performed in the
range 5220–5480
for Run 1 and Run 2 data samples separately. The yield of
is estimated from a fit to the
mass distribution with one kaon interpreted as a proton. This yield is then constrained to the resulting estimate of
(
) in the
mass fit for the Run 1 (Run 2). The
distributions, with the fit results superimposed, are shown in Fig. 3. Table 1 lists the obtained yields of the
and
decays, the
background and the combinatorial background in the full range as well as in the
regions around the known
and
masses.
Fig. 3 (color online) The distributions of
, with the fit results superimposed, for the (left) Run 1 and (right) Run 2 data samples. The top row shows the full
signals on a logarithmic scale, while the bottom row uses a reduced vertical range to make the B0 peaks visible. The violet (red) solid lines represent the
decays, the orange dotted lines show the
background and the green dotted lines show the combinatorial background.
Table 1. Measured yields of all contributions from the fit to
mass distribution, showing the results for the full mass range and for the
and
regions.
Data | Category | Full | ![]() | ![]() |
---|---|---|---|---|
Run 1 | ![]() | 55498 ± 238 | 51859 ± 220 | 35 ± 6 |
 | ![]() | 127 ± 19 | 0 | 119 ± 18 |
 | ![]() | 407 ± 26 | 55 ± 8 | 61 ± 8 |
 | Combinatorial background | 758 ± 55 | 85 ± 11 | 94 ± 11 |
Run 2 | ![]() | 249670 ± 504 | 233663 ± 472 | 153 ± 12 |
 | ![]() | 637 ± 39 | 0 | 596 ± 38 |
 | ![]() | 1943 ± 47 | 261 ± 16 | 290 ± 17 |
 | Combinatorial background | 2677 ± 109 | 303 ± 20 | 331 ± 21 |
Assuming the efficiency is independent of
, the
meson lineshape from
(
) decays in the
(
) region is given by

where
is a relativistic Breit-Wigner amplitude function [27] defined as

The parameter m (
) denotes the reconstructed (true)
invariant mass,
and
are the mass and decay width of the
meson,
is the
momentum in the
(
) rest frame,
(
) is the momentum of the kaons in the
(
) rest frame,
is the orbital angular momentum between the
and
,
is the Blatt-Weisskopf function, and d is the size of the decaying particle, which is set to be 1.5
0.3 fm [28]. The amplitude squared is folded with a Gaussian resolution function G. For
,
has the form

and depends on the momentum of the decay products
[27].
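The equations themselves are not reproduced in this copy, so the following sketch uses one common textbook convention for a relativistic Breit-Wigner amplitude with a mass-dependent width and an L = 1 Blatt-Weisskopf factor. It is not necessarily the exact form of Eqs. (1)–(3): the factors describing the parent decay and the Gaussian resolution are omitted. The resonance mass, width and kaon mass are standard values in MeV, and the radius parameter of 1.5 in inverse-GeV units is an assumption made for illustration.

```python
import math

M_PHI = 1019.461      # MeV, resonance mass
GAMMA_PHI = 4.249     # MeV, resonance width
M_KAON = 493.677      # MeV, charged-kaon mass
D_RADIUS = 1.5e-3     # MeV^-1 (1.5 GeV^-1); assumed unit, for illustration only


def kaon_momentum(m):
    """Momentum of each kaon in the rest frame of a kaon pair of mass m."""
    arg = 0.25 * m * m - M_KAON * M_KAON
    return math.sqrt(arg) if arg > 0.0 else 0.0


def barrier_factor_l1(q, q0, d=D_RADIUS):
    """L = 1 Blatt-Weisskopf factor, normalised to unity at q = q0."""
    z, z0 = (q * d) ** 2, (q0 * d) ** 2
    return math.sqrt((1.0 + z0) / (1.0 + z))


def running_width(m):
    """Mass-dependent width for a P-wave decay into two kaons."""
    q, q0 = kaon_momentum(m), kaon_momentum(M_PHI)
    return GAMMA_PHI * (q / q0) ** 3 * (M_PHI / m) * barrier_factor_l1(q, q0) ** 2


def breit_wigner(m):
    """Complex relativistic Breit-Wigner amplitude at pair mass m."""
    return 1.0 / complex(M_PHI ** 2 - m ** 2, -M_PHI * running_width(m))


for mass in (1010.0, M_PHI, 1030.0):
    print(mass, abs(breit_wigner(mass)) ** 2)
```

With this normalisation the running width equals the nominal width at the pole mass, which makes the effect of the barrier factor easy to isolate when the radius parameter is varied.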
As is shown in Fig. 2, due to the correlation between the reconstructed masses of
and
, the shape of the
distribution strongly depends on the chosen
range. The top two plots in Fig. 3 show the
distributions for Run 1 and Run 2 separately, where a small
signal can be seen on the tail of a large
signal. Therefore, it is necessary to estimate the lineshape of the
mass spectrum from
decays in the
region. The
distribution of the
tail leaking into the
mass window can be effectively described by Eq. (1) with modified values of
and
, which are extracted from an unbinned maximum-likelihood fit to the
simulation sample.
The non-
contributions to
(
) decays include that from
(980) [1] (
(980) [29]) and nonresonant
in an S-wave configuration. The PDF for this contribution is given by

where m is the
invariant mass,
is the known
mass [21],
is the Blatt-Weisskopf barrier factor of the
meson,
and
represent the resonant (
(980) or
(980)) and nonresonant amplitudes, and
is a relative phase between them. The nonresonant amplitude
is modelled as a constant function. The lineshape of the
(980) (
(980)) resonance can be described by a Flatté function [30] considering the coupled channels
(
) and
. The Flatté functions are given by

for the
(980) resonance and

for the
(980) resonance. The parameter
denotes the pole mass of the resonance for both cases. The constants
(
) and
are the coupling strengths of
(980) (
(980)) to the
(
) and
final states, respectively. The
factors are given by the Lorentz-invariant phase space:



The parameters for the
(980) lineshape are
,
, and
, determined by the Crystal Barrel experiment [31]; the parameters for the
(980) lineshape are
,
, and
, according to the previous analysis of
decays [32].
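As an illustration of a coupled-channel lineshape of this kind, the sketch below implements a generic two-channel Flatté amplitude with a Lorentz-invariant phase-space factor. Conventions for the Flatté denominator differ between analyses, and the equations and parameter values of this paper are missing from this copy, so the form used here and the pole mass and couplings in the example are hypothetical placeholders rather than the values of Refs. [31, 32].

```python
import cmath


def phase_space(m, m1, m2):
    """Two-body phase-space factor rho = 2q/m for daughters m1, m2.

    Below threshold the square root is continued analytically and the
    result is purely imaginary.
    """
    s = m * m
    return cmath.sqrt((1.0 - (m1 + m2) ** 2 / s) * (1.0 - (m1 - m2) ** 2 / s))


def flatte(m, m_pole, g1, channel1, g2, channel2):
    """Generic two-channel Flatte amplitude (one common convention).

    g1, g2 are coupling strengths in mass units; channel1 and channel2
    are (m_a, m_b) daughter-mass pairs.
    """
    rho1 = phase_space(m, *channel1)
    rho2 = phase_space(m, *channel2)
    return 1.0 / (m_pole ** 2 - m ** 2 - 1j * m_pole * (g1 * rho1 + g2 * rho2))


M_PION, M_KAON = 139.570, 493.677  # MeV
# Hypothetical pole mass and couplings, for illustration only.
amplitude = flatte(1020.0, 980.0, 200.0, (M_PION, M_PION), 600.0, (M_KAON, M_KAON))
print(abs(amplitude) ** 2)
```

In the mass range studied here both channels are open, so the phase-space factors are real; the analytic continuation only matters if the function is evaluated below the kaon-pair threshold.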
For the
background, no dependency of the
shape on
is observed in simulation. Therefore, a common PDF is used to describe the
distributions in both the
and
regions. The PDF is modelled by a third-order Chebyshev polynomial function, obtained from the unbinned maximum-likelihood fit to the simulation shown in Fig. 4.
Fig. 4 Distribution of
in a
simulation sample superimposed with a fit to a polynomial function.
In order to study the
shape of the combinatorial background in the
region, a BDT requirement that strongly favours background is applied to form a background-dominated sample. Simulated
and
events are then injected into this sample with negative weights to subtract these contributions. The resulting
distribution is shown in Fig. 5, which comprises a
resonance contribution and random
combinations, where the shape of the former is described by Eq. (1) and the latter by a second-order Chebyshev polynomial function. To validate the underlying assumptions of this procedure, the
shape has been checked to be compatible in different
mass regions and with different BDT requirements.
Fig. 5 (color online)
distributions of the enhanced combinatorial background in the (left) Run 1 and (right) Run 2 data samples. The
and
backgrounds are subtracted by injecting simulated events with negative weights.
A simultaneous unbinned maximum-likelihood fit to the four
distributions in both
and
regions of Run 1 and Run 2 data samples is performed. The
resonance in
decays is modelled by Eq. (1). The non-
contribution to
decays is described by Eq. (4). The tail of
decays in the
region is described by the extracted shape from simulation. The
background and the combinatorial background are described by the shapes shown in Figs. 4 and 5, respectively. All
shapes are common to the
and
regions, except that of the
tail, which is only needed for the
region. The mass and decay width of the
meson are constrained to their PDG values [21] while the width of the
resolution function is allowed to vary in the fit. The pole mass of
(980) (
(980)) and the coupling factors, including
,
,
and
, are fixed to their central values in the reference fit. The amplitude
is allowed to vary freely, while the relative phase
between the
(980) (
(980)) and nonresonant amplitudes is constrained to
(
) degrees, which was determined in the amplitude analysis of
(
) decays [1, 29]. The yields of the
background, the
tail leaking into the
region and the combinatorial background are fixed to the corresponding values in Table 1, while the yields of non-
for
and
decays as well as the yield of
decays take different values for Run 1 and Run 2 data samples and are left to vary in the fit.
The branching fraction
, the parameter of interest to be determined by the fit, is common for Run 1 and Run 2. The yield of
decays is internally expressed according to

where the branching fraction
has been measured by the LHCb collaboration [29],
is the efficiency ratio given in Sec. III,
is the ratio of the production fractions of
and
mesons in
collisions, which has been measured at 7
to be
in the LHCb detector acceptance [33]. The effect of increasing collision energy on
is found to be negligible for 8
and a scaling factor of
is needed for 13
[34]. The parameters
,
and
are fixed to their central values in the baseline fit and their uncertainties are propagated to
in the evaluation of systematic uncertainties.
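The relation between the signal yield and the branching fraction described above can be sketched as a one-line function. The yield equation is missing from this copy, so this is a reconstruction of the standard normalisation-channel relation under stated assumptions: the efficiency ratio is taken as signal over normalisation, the production-fraction ratio as normalisation meson over signal meson, and the 13 TeV scaling factor multiplies that ratio. All numbers in the example are placeholders, not values from the paper.

```python
def signal_yield(norm_yield, bf_signal, bf_norm, eff_ratio, fs_over_fd, energy_scale=1.0):
    """Expected signal yield for a given signal branching fraction.

    norm_yield   : fitted yield of the normalisation channel
    bf_signal    : signal branching fraction (parameter of interest)
    bf_norm      : branching fraction of the normalisation channel
    eff_ratio    : eff(signal) / eff(normalisation)  (assumed direction)
    fs_over_fd   : production-fraction ratio at the reference energy
    energy_scale : multiplicative correction to fs_over_fd at another energy
    """
    return norm_yield * (bf_signal / bf_norm) * eff_ratio / (fs_over_fd * energy_scale)


# Placeholder inputs chosen only to make the arithmetic easy to follow.
print(signal_yield(1000.0, 1e-7, 1e-3, 1.0, 0.25))
```

Writing the yield this way lets a single branching-fraction parameter be shared between data-taking periods while the yields differ through the period-dependent inputs.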
The
distributions in the
and
regions are shown in Fig. 6 for both Run 1 and Run 2 data samples. The branching fraction
is found to be
. The significance of the decay
, over the background-only hypothesis, is estimated to be 2.3 standard deviations using Wilks' theorem [35].
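For a single parameter of interest, Wilks' theorem relates the significance to the difference in negative log-likelihood between the background-only fit and the best fit. The sketch below shows that conversion; the likelihood values are invented for illustration and only demonstrate what difference corresponds to 2.3 standard deviations.

```python
import math


def significance_from_nll(nll_background_only, nll_best_fit):
    """Significance in standard deviations from two negative log-likelihoods,
    assuming one parameter of interest and the asymptotic (Wilks) regime."""
    delta = nll_background_only - nll_best_fit
    return math.sqrt(2.0 * max(delta, 0.0))


# Hypothetical values: a difference of 2.645 units of -ln L.
print(round(significance_from_nll(102.645, 100.0), 2))
```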
Fig. 6 (color online) Distributions in the (top)
and (bottom)
regions, with the fit results superimposed. The left and right columns show the results for the Run 1 and Run 2 data samples, respectively. The violet (red) solid lines are
decays, violet (red) dashed lines are non-
signal, green dotted lines are the combinatorial background component, and the orange dotted lines are the
background component.
To validate the sequential fit procedure, a large number of pseudosamples were generated according to the fit models for the
and
distributions. The model parameters were taken from the result of the baseline fit to the data. The fit procedure described above was applied to each pseudosample. The distributions of the obtained estimate of
and the corresponding pulls are found to be consistent with the reference result, which indicates that the procedure has negligible bias and its uncertainty estimate is reliable. A similar check has been performed using pseudosamples generated with an alternative model for the
decays, which is based on the amplitude model developed for the
analysis [20] and includes contributions from P-wave
decays, S-wave
decays and their interference. In this case, the robustness of the fit method has also been confirmed.
V. SYSTEMATIC UNCERTAINTIES
Two categories of systematic uncertainties are considered: multiplicative uncertainties, which are associated with the normalisation factors; and additive uncertainties, which affect the determination of the yields of the
and
modes.
The multiplicative uncertainties include those propagated from the estimates of
,
and
. Using the
measurement at 7
[29, 33],
was measured to be
. The third uncertainty is completely anti-correlated with the uncertainty on
, since the estimate of
is inversely proportional to the value used for
. Taking this correlation into account yields
for 7
. The luminosity-weighted average of the scaling factor for
for 13
has a relative uncertainty of 3.4%. For the efficiency ratio
, its luminosity-weighted average has a relative uncertainty of 1.8%. Summing these three contributions in quadrature gives a total relative uncertainty of 7.3% on
.
The additive uncertainties are due to imperfect modelling of the
and
shapes of the signal and background components. To evaluate the systematic effect associated with the
model of the combinatorial background, the fit procedure is repeated by replacing the exponential function for the combinatorial background with a second-order polynomial function. A large number of simulated pseudosamples were generated according to the obtained alternative model. Each pseudosample was fitted twice, using the baseline and alternative combinatorial shape, respectively. The average difference of
is
, which is taken as a systematic uncertainty.
In the
fit, the yields of
decay, combinatorial backgrounds under the
and
peaks, and that of the
tail leaking into the
region are fixed to the values in Table 1. Varying these yields separately leads to a change of
by
for
,
for the combinatorial background and
for the
tail in the
region, and these are assigned as systematic uncertainties on
.
The constant d in Eq. (3) is varied between 1.0 and 3.0
. The maximum change of
is evaluated to be
, which is taken as a systematic uncertainty.
The
shape of the
tail under the
peak is extracted using a
simulation sample. The statistical uncertainty due to the limited size of this sample is estimated using the bootstrapping technique [36]. A large number of new data sets of the same size as the original simulation sample were formed by randomly cloning events from the original sample, allowing one event to be cloned more than once. The spread in the results of
obtained by using these pseudosamples in the analysis procedure is then adopted as a systematic uncertainty, which is evaluated to be
.
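The bootstrapping technique described above, resampling with replacement and taking the spread of the results, can be sketched in a few lines. Here the statistic is the sample mean of a toy Gaussian sample, standing in for the full analysis procedure applied to each resampled simulation sample; the function name and settings are hypothetical.

```python
import random
import statistics


def bootstrap_spread(sample, statistic, n_resamples=500, seed=2021):
    """Standard deviation of `statistic` over resamples drawn with
    replacement, each of the same size as the original sample."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=len(sample))
        results.append(statistic(resample))
    return statistics.stdev(results)


rng = random.Random(7)
toy_sample = [rng.gauss(0.0, 2.0) for _ in range(400)]
spread = bootstrap_spread(toy_sample, statistics.fmean)
print(spread)  # expected to be close to 2 / sqrt(400) = 0.1
```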
In the reference model, the
shape of the
background is determined from simulation, under the assumption that this shape is insensitive to the
region. A sideband sample enriched with
contributions is selected by requiring one kaon to have a large probability to be a proton. An alternative
shape is extracted from this sample after subtracting the random combinations, and used in the
fit. The resulting change of
is
, which is assigned as a systematic uncertainty.
The
shape of the combinatorial background is represented by that of the
combinations with a BDT selection that strongly favours the background over the signal, under the assumption that this shape is insensitive to the BDT requirement. Repeating the
fit by using the combinatorial background shape obtained with two non-overlapping sub-intervals of BDT response, the result for
is found to be stable, with a maximum variation of
, which is regarded as a systematic uncertainty.
In Eqs. (7)–(9), the coupling factors
,
,
and
, are fixed to their mean values from Refs. [31, 32]. The fit is repeated by varying each factor by its experimental uncertainty and the maximum variation of the branching fraction is considered for each parameter. The sum of the variations in quadrature is
, which is assigned as a systematic uncertainty.
The systematic uncertainties are summarised in Table 2. The total systematic uncertainty is the sum in quadrature of all these contributions.
Table 2. Systematic uncertainties on
for multiplicative and additive sources.
Multiplicative uncertainties | Value (%) |
---|---|
![]() | 6.2 |
Scaling factor for ![]() | 3.4 |
![]() | 1.8 |
Total | 7.3 |
Additive uncertainties | Value (10⁻⁸) |
![]() | 0.03 |
Fixed yields of ![]() ![]() | 0.05 |
Fixed yields of combinatorial background in ![]() | 0.61 |
Fixed yields of ![]() ![]() | 0.24 |
Constant d | 0.01 |
![]() ![]() | 0.29 |
![]() ![]() | 0.28 |
![]() | 0.16 |
![]() ![]() | 0.06 |
Total | 0.80 |
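The totals in Table 2 can be cross-checked by summing the listed entries in quadrature. The short sketch below does this with the rounded values from the table; the small difference for the additive total (about 0.79 from the rounded entries, against the quoted 0.80) is presumably due to rounding of the individual inputs.

```python
import math


def quadrature_sum(values):
    """Square root of the sum of squares of the given uncertainties."""
    return math.sqrt(sum(v * v for v in values))


multiplicative_percent = [6.2, 3.4, 1.8]
additive_1e8 = [0.03, 0.05, 0.61, 0.24, 0.01, 0.29, 0.28, 0.16, 0.06]

total_multiplicative = quadrature_sum(multiplicative_percent)
total_additive = quadrature_sum(additive_1e8)
print(round(total_multiplicative, 1), round(total_additive, 2))
```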
A profile likelihood method is used to compute the upper limit of
[37, 38]. The profile likelihood ratio as a function of
is defined as

where
represents the set of fit parameters other than
,
and
are the maximum likelihood estimators, and
is the profiled value of the parameter
that maximises L for the specified
. Systematic uncertainties are incorporated by smearing the profile likelihood ratio function with a Gaussian function which has a zero mean and a width equal to the total systematic uncertainty:

The smeared profile likelihood ratio curve is shown in Fig. 7. The 90% confidence interval starting at
is shown as the red area, which covers 90% of the integral of the
function in the physical region. The obtained upper limit on
at 90% CL is
.
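The limit-setting procedure, smearing the likelihood-ratio curve with a zero-mean Gaussian and integrating over the physical region, can be sketched numerically. This is a toy illustration: the curve is a Gaussian centred at zero with hypothetical statistical and systematic widths, the convolution is a plain Riemann sum on a uniform grid, and none of the numbers are taken from the analysis.

```python
import math


def gaussian(x, sigma):
    """Zero-mean Gaussian density with width sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def smear(curve, grid, sigma_sys):
    """Convolve a curve sampled on a uniform grid with a zero-mean Gaussian."""
    step = grid[1] - grid[0]
    return [
        step * sum(value * gaussian(x - xp, sigma_sys) for xp, value in zip(grid, curve))
        for x in grid
    ]


def upper_limit(curve, grid, cl=0.90):
    """Smallest grid point at which the integral from zero reaches the
    fraction cl of the total integral over the physical region x >= 0."""
    physical = [(x, y) for x, y in zip(grid, curve) if x >= 0.0]
    total = sum(y for _, y in physical)
    running = 0.0
    for x, y in physical:
        running += y
        if running >= cl * total:
            return x
    return physical[-1][0]


grid = [0.2 * i for i in range(-200, 201)]
toy_curve = [math.exp(-0.5 * (x / 3.0) ** 2) for x in grid]  # statistical width 3
limit = upper_limit(smear(toy_curve, grid, 4.0), grid)       # systematic width 4
print(limit)
```

For this toy input the smeared curve is a Gaussian of total width 5, so the result should lie close to 1.645 × 5, which gives a quick check that the smearing and the integration over the physical region behave as intended.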
Fig. 7 (color online) Smeared profile likelihood ratio curve shown as the blue solid line, and the 90% confidence interval indicated by the red area.
VI. CONCLUSION
A search for the rare decay
has been performed using the full Run 1 and Run 2 data samples of
collisions collected with the LHCb experiment, corresponding to an integrated luminosity of 9
. A branching fraction of
is measured, which indicates no statistically significant excess of the decay
above the background-only hypothesis. The upper limit on its branching fraction at 90% CL is determined to be
, which is compatible with theoretical expectations and improved compared with the previous limit of
obtained by the LHCb experiment using Run 1 data, with a corresponding integrated luminosity of 1
.
ACKNOWLEDGEMENTS
We express our gratitude to our colleagues in the CERN accelerator departments for the excellent performance of the LHC. We thank the technical and administrative staff at the LHCb institutes. We acknowledge support from CERN and from the national agencies: CAPES, CNPq, FAPERJ and FINEP (Brazil); MOST and NSFC (China); CNRS/IN2P3 (France); BMBF, DFG and MPG (Germany); INFN (Italy); NWO (Netherlands); MNiSW and NCN (Poland); MEN/IFA (Romania); MSHE (Russia); MICINN (Spain); SNSF and SER (Switzerland); NASU (Ukraine); STFC (United Kingdom); DOE NP and NSF (USA). We acknowledge the computing resources that are provided by CERN, IN2P3 (France), KIT and DESY (Germany), INFN (Italy), SURF (Netherlands), PIC (Spain), GridPP (United Kingdom), RRCKI and Yandex LLC (Russia), CSCS (Switzerland), IFINHH (Romania), CBPF (Brazil), PL-GRID (Poland) and OSC (USA). We are indebted to the communities behind the multiple open-source software packages on which we depend.
Footnotes
- *
Individual groups or members have received support from AvH Foundation (Germany); EPLANET, Marie Skłodowska-Curie Actions and ERC (European Union); A*MIDEX, ANR, Labex P2IO and OCEVU, and Région Auvergne-Rhône-Alpes (France); Key Research Program of Frontier Sciences of CAS, CAS PIFI, Thousand Talents Program, and Sci. & Tech. Program of Guangzhou (China); RFBR, RSF and Yandex LLC (Russia); GVA, XuntaGal and GENCAT (Spain); the Royal Society and the Leverhulme Trust (United Kingdom).
The inclusion of charge-conjugate processes is implied throughout this paper.