1 Introduction

Weak decays of heavy-flavoured hadrons provide a range of methods to test the Standard Model (SM) of particle physics. In particular, many such transitions are suppressed by particular features of the SM such as the GIM mechanism [1] and the CKM quark mixing matrix [2, 3]. As a consequence, the SM predicts a distinctive pattern of decay rates to various different final states, which may be modified by contributions from physics beyond the SM. Experimental and theoretical investigations in this area are therefore a priority in the field.

Until now, experimental studies of weak decays have been almost completely limited to the ground-state hadrons; considering neutral heavy-flavoured mesons, these are the pseudoscalar \({{D} ^0} \), \({{B} ^0} \) and \({{B} ^0_{s}} \) states. The leptonic decays, to the \(\ell ^+\ell ^-\) final state where \(\ell = e, \mu \) and (for \({B} ^{0}_{({s})}\) decays) \(\tau \), have branching fractions that are suppressed by the chiral structure of the SM weak interaction, and that can be predicted with small theoretical uncertainties [4]. These features together make them highly sensitive to potential contributions from physics beyond the SM. Intense activity on the \({{B} ^{0}_{({s})}} \rightarrow {\mu ^+\mu ^-} \) channels has resulted in the observation of the \({{B} ^0_{s}} \rightarrow {\mu ^+\mu ^-} \) decay by the LHCb, CMS and ATLAS experiments, and a combined limit on the \({{B} ^0} \rightarrow {\mu ^+\mu ^-} \) branching fraction that approaches its SM value [5,6,7,8,9,10]. The experimental limits on \({\mathcal {B}}\left( {{D} ^0} \rightarrow {e ^+e ^-} \right) \) [11], \({\mathcal {B}}\left( {{D} ^0} \rightarrow {\mu ^+\mu ^-} \right) \) [12], \({\mathcal {B}}\left( {{B} ^0_{({s})}} \rightarrow {e ^+e ^-} \right) \) [13] and \({\mathcal {B}}\left( {{B} ^0_{({s})}} \rightarrow {\tau ^+\tau ^-} \right) \) [14] are still several orders of magnitude above their SM predictions.

It is also possible to consider weak decays of the excited counterparts of the pseudoscalar mesons, the vector \({{D} ^{*0}} \), \({{B} ^{*0}} \) and \({{B} ^{*0}_{s}} \) resonances. In contrast to the situation for pseudoscalar mesons, the leptonic vector meson decays have no chiral suppression. Consequently the decay widths for each of the \(\ell ^+\ell ^-\) final states are expected to be the same, in the SM, up to effects related to the lepton mass (e.g., the available phase space), and will be larger than those for the pseudoscalar decays. However, the vector mesons can also decay via electromagnetic and (for the \({{D} ^{*0}} \) meson) strong transitions, which have widths many orders of magnitude larger than those for the weak decays. Therefore, the branching fractions of the weak decays are suppressed to what would usually be considered unobservably small levels. As an illustration, one can compare the width of the \({{D} ^{*+}} \) vector state, \(\Gamma _{{{D} ^{*+}}} = 83 \mathrm {\,keV} \) [15], with that of its \({{D} ^+} \) pseudoscalar counterpart, \(\Gamma _{{{D} ^+}} = \hbar /\tau _{{{D} ^+}} \approx \hbar /(1.0 \mathrm{\,ps}) = 0.7~\mathrm{meV}\) [16], a difference of over 8 orders of magnitude. This is of the order that one would naively expect, given that the weak decays are suppressed by the Fermi constant, but have more phase space available than the electromagnetic and strong transitions in which flavour is conserved. Given that weak decays lead to a plethora of different final states, the branching fractions for even the most favoured are unlikely to be above \(10^{-9}\), unless large enhancement factors due to physics beyond the SM are at play.
For the \({{D} ^{*0}} \!\rightarrow \ell ^+\ell ^-\) decays, further suppression by the GIM mechanism [1] reduces the Standard Model prediction for the branching fractions to the \(\mathcal{O}\left( 10^{-20}\right) \) level, while for \({{B} ^{*0}_{s}} \!\rightarrow \ell ^+\ell ^-\) decays the prediction is in the range \((0.7\text {--}2.2) \times 10^{-11}\) [17, 18].
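As an illustration, the comparison of the \({{D} ^{*+}} \) and \({{D} ^+} \) widths quoted above can be reproduced with a few lines of arithmetic. The sketch below is not part of the original analysis; the function and variable names are illustrative, the widths and lifetime are those quoted in the text, and the value of \(\hbar \) is the standard one.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV s


def width_from_lifetime_ev(tau_s):
    """Total width in eV of a state with mean lifetime tau_s in seconds."""
    return HBAR_EV_S / tau_s


# Inputs as quoted in the text.
gamma_dstar_plus_ev = 83e3                          # D*+ width: 83 keV
gamma_d_plus_ev = width_from_lifetime_ev(1.0e-12)   # D+ lifetime rounded to 1.0 ps

ratio = gamma_dstar_plus_ev / gamma_d_plus_ev

print(f"Gamma(D+) = {gamma_d_plus_ev * 1e3:.2f} meV")
print(f"Gamma(D*+) / Gamma(D+) = {ratio:.2e}")
print(f"orders of magnitude = {math.log10(ratio):.1f}")
```

With these inputs the ratio is slightly above \(10^{8}\), in line with the statement in the text.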

Hints of physics beyond the SM in B meson decays have, nevertheless, prompted theoretical activity on weak decays of excited heavy-flavoured states. Collectively referred to as the flavour anomalies, these hints include tensions between SM predictions and experimental measurements in branching fractions and angular observables in decays mediated by \(b \rightarrow s \ell ^+\ell ^-\) transitions, including observables that are sensitive to violations of lepton universality (see Ref. [19] for a recent review). The leptonic \({{D} ^{*0}} \), \({{B} ^{*0}} \) and \({{B} ^{*0}_{s}} \) decays are of most interest, since these are theoretically cleanest and, for the \({B} ^{*0}_{s} \) case, are sensitive to the same operators that could be responsible for the flavour anomalies. These decays have been considered as a potential probe of physics beyond the SM in Refs. [17, 18], with further investigations of the impact of particular extensions of the SM considered in Refs. [20,21,22,23,24].

There is extra motivation to search for leptonic \({{D} ^{*0}} \) decays. The relevant operators for the \({{B} ^{*0}} \) and \({{B} ^{*0}_{s}} \) decays are already constrained from measurements of pseudoscalar \(B_{(s)} \rightarrow \ell ^+\ell ^-\) and \(h\ell ^+\ell ^-\) transitions (where h is a hadron), and can be used to set limits on \({\mathcal {B}}\left( {{B} ^{*0}_{({s})}} \!\rightarrow \ell ^+\ell ^- \right) \) that are below the current experimentally achievable sensitivity [18]. The interpretation of results from \({{D} ^0} \rightarrow \ell ^+\ell ^-\) and \(D_{(s)} \rightarrow h\ell ^+\ell ^-\) decays, on the other hand, is challenging due to long-distance effects [25,26,27,28]. Correspondingly, constraints on the relevant operators are weaker, and the possibility of a large enhancement to \({\mathcal {B}}\left( {{D} ^{*0}} \!\rightarrow \ell ^+\ell ^- \right) \) from physics beyond the Standard Model cannot be ruled out. Thus, searches for \({{D} ^{*0}} \rightarrow \ell ^+\ell ^-\) decays could in principle provide an important complementary approach to probe the operators involved, if sufficient experimental precision can be obtained.

Since there is no suppression of the coupling of the heavy-flavoured vector resonances to dielectron, compared to dimuon, states, it may be attractive experimentally to search for these interactions through production in \({e ^+e ^-} \) collisions, rather than in decay processes [17]. A search for the \(e^+e^- \rightarrow D^{*}(2007)^0\) process has been carried out by the CMD-3 collaboration, resulting in an upper limit of \({\mathcal {B}}\left( {{D} ^{*0}} \rightarrow {e ^+e ^-} \right) < 1.7 \times 10^{-6}\) at 90% confidence level [29]. While it is likely that this result can be significantly improved through analyses of larger data samples with better background suppression, the limit is not so stringent as to suggest that other approaches to studying these processes would be futile.

In particular, the copious production of heavy flavoured hadrons at the LHC makes it worthwhile to consider what sensitivity might be achievable with current and future data samples. As the dimuon signature allows for effective background suppression in hadron collider experiments, the \({{D} ^{*0}} \), \({{B} ^{*0}} \) and \({{B} ^{*0}_{s}} \) decays to the \(\mu ^+\mu ^-\) final state are the most amenable. There are several possible avenues to investigate these processes using LHC data. In particular, both “prompt” and “displaced” production can be considered, where in the former the heavy flavoured state is produced at the primary vertex of the proton-proton collision while in the latter the signal hadron originates from the decay of another weakly decaying particle some distance from the primary vertex. Specifically, \({{D} ^{*0}} \) mesons are produced at high rates in b hadron decays, and \({{B} ^{*0}_{({s})}} \) mesons can be produced in \({{B} _{c} ^+} \) meson, and potentially in other doubly-heavy hadron, decays. While prompt production has the highest rate, the large numbers of tracks originating from the proton-proton collision vertices lead to large backgrounds that limit the sensitivity of any rare decay search. In displaced production it is unlikely to find random tracks appearing to come from the vertex position, so long as the vertex resolution is sufficient. If in addition the signal is part of an exclusive decay process, further constraints can be applied to suppress background. Displaced production therefore appears to provide the most promising approach.

In the main part of this paper, the potential sensitivity of the LHCb experiment to \({{D} ^{*0}} \), \({{B} ^{*0}} \) and \({{B} ^{*0}_{s}} \) decays to the \(\mu ^+\mu ^-\) final state, using displaced production in exclusive final states, is investigated. Possibilities with prompt production, and with inclusive and semi-inclusive search approaches with displaced production, are considered for completeness in Appendices A and B, respectively. In principle the ATLAS and CMS experiments could also make competitive measurements, but they have fewer relevant measurements to date, which makes extrapolations difficult. Moreover, as their vertexing and charged hadron identification capabilities are not as good as those of LHCb, it is expected that they will suffer from larger backgrounds.

2 \({{{B} ^-}} \rightarrow {{D} ^{*0}} {{\pi } ^-} \rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decays

The decay chain \({{{B} ^-}} \rightarrow {{D} ^{*0}} {{\pi } ^-} \rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) not only provides an excellent illustrative example, it also allows an estimated upper limit on the \({{D} ^{*0}} \rightarrow {\mu ^+\mu ^-} \) branching fraction to be obtained from published data. In particular, LHCb has studied the \({{{B} ^-}} \rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decay using a data sample corresponding to an integrated luminosity of \(3.0 \hbox {\,fb}^{-1} \), collected at centre-of-mass energies of 7 and 8 TeV [30]. The measured differential branching fraction \(\mathrm{d}{\mathcal {B}}/\mathrm{d}{q^2} \), where \({q^2} \) is the square of the dimuon invariant mass, is shown in Fig. 1. An upper limit on \({\mathcal {B}}\left( {{D} ^{*0}} \rightarrow {\mu ^+\mu ^-} \right) \) can be estimated by assuming conservatively that the \({{{B} ^-}} \rightarrow {{D} ^{*0}} {{\pi } ^-} \rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decay contributes not more than half of the \({{{B} ^-}} \rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) signal in the two bins either side of \({q^2} = 4 {\mathrm {\,GeV^2\!/}c^4} \), i.e. a branching fraction of . Then, using the world average value of \({\mathcal {B}}\left( {{{B} ^-}} \rightarrow {{D} ^{*0}} {{\pi } ^-} \right) \) [31], one obtains

This limit, while already more stringent than the result on \({{D} ^{*0}} \rightarrow {e ^+e ^-} \) from the CMD-3 collaboration [29], could most likely be improved by at least an order of magnitude by a dedicated LHCb analysis. In particular, the experimental mass resolution, which we expect to be around \(5 {\mathrm {\,MeV\!/}c^2} \), is much better than can be obtained from the coarse \({q^2} \) binning of Fig. 1. Moreover, LHCb has already collected a significantly larger data sample than was analysed in Ref. [30], and a somewhat higher selection efficiency could be anticipated in a dedicated analysis. It is therefore of interest to ask what sensitivity might ultimately be achieved by LHCb in such a search.

In LHC experiments, it is advantageous to measure branching fractions relative to those of appropriate normalisation channels. It is anticipated that the \({{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} \) decay will be used for this purpose and hence the signal yield is converted to a measurement of the branching fraction as

$$\begin{aligned}&{\mathcal {B}}\left( {{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \right) = \frac{N_{{{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} }}{N_{{{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} }} \, \frac{\epsilon _{{{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} }}{\epsilon _{{{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} }} \, \frac{{\mathcal {B}}\left( {{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} \right) }{{\mathcal {B}}\left( {{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} \right) } {\mathcal {B}}\left( {{J /\psi }} \!\rightarrow {\mu ^+\mu ^-} \right) , \end{aligned}$$
(1)
$$\begin{aligned}&\displaystyle = \alpha _{{{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} } N_{{{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} } , \end{aligned}$$
(2)

where \(\alpha _{{{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} }\) is the single-event sensitivity, which corresponds to the branching fraction at which one signal event is expected in the data sample. In Eq. (1), N and \(\epsilon \) represent the yield and the efficiency for the decay indicated in the subscript, where the reconstruction of the \({D} ^{*0}\) or \({J /\psi }\) vector meson in the \(\mu ^+\mu ^-\) final state is implied.

The single-event sensitivity is estimated from the yield of \({{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} \) decays, scaled from measurements with existing data [30] appropriately to each integrated luminosity value, and the branching fractions \(\mathcal{B} ({{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} )\), \(\mathcal{B} ({{J /\psi }} \!\rightarrow {\mu ^+\mu ^-} )\) and \(\mathcal{B} ({{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} )\) [31]. It is assumed that \(\epsilon _{{{{B} ^-}} \!\rightarrow {{J /\psi }} {{K} ^-} }\approx \epsilon _{{{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} }\), since the final state is the same (apart from \(K \leftrightarrow \pi \)) and the efficiency varies slowly with \({q^2} \). The single-event sensitivity is shown in Fig. 2 as a function of the data set size.
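The relation between the normalisation yield and the single-event sensitivity in Eqs. (1) and (2) can be sketched as a short function. This is a minimal illustration rather than the code used for Fig. 2: the function name and all numerical inputs below are round placeholder values chosen for the example, not the measured yields or the branching fractions of Ref. [31].

```python
def single_event_sensitivity(n_norm, bf_norm, bf_jpsi_mumu,
                             bf_parent_to_vector, eff_ratio=1.0):
    """Single-event sensitivity alpha following Eqs. (1) and (2).

    n_norm              : yield of the normalisation channel
    bf_norm             : B(B- -> J/psi K-)
    bf_jpsi_mumu        : B(J/psi -> mu+ mu-)
    bf_parent_to_vector : B(B- -> D*0 pi-)
    eff_ratio           : eps(normalisation) / eps(signal); unity is the
                          assumption made in the text
    """
    return eff_ratio * bf_norm * bf_jpsi_mumu / (n_norm * bf_parent_to_vector)


# Placeholder inputs (hypothetical round numbers, for illustration only).
alpha = single_event_sensitivity(
    n_norm=1.0e6,
    bf_norm=1.0e-3,
    bf_jpsi_mumu=6.0e-2,
    bf_parent_to_vector=5.0e-3,
)

# The normalisation yield grows linearly with the data set size, so
# doubling the yield halves the single-event sensitivity.
alpha_double = single_event_sensitivity(2.0e6, 1.0e-3, 6.0e-2, 5.0e-3)

print(f"alpha = {alpha:.2e}, with doubled yield = {alpha_double:.2e}")
```

A branching fraction estimate then follows from multiplying the observed signal yield by the returned value, as in Eq. (2).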

The achievable precision depends not only on the single-event sensitivity but also on the uncertainty on the signal yield, which is often limited by the background level. To investigate how the achievable limit may scale with integrated luminosity, pseudoexperiments are generated under a background-only hypothesis. Three background components are considered: combinatorial background from random combinations of tracks from two or more decays; background from nonresonant \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decays; and background from \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) decays, where the \({K} ^-\) meson is mistakenly identified as a \({\pi } ^-\). The combinatorial background is assumed to be uniformly distributed in the dimuon mass, \(m({\mu ^+\mu ^-})\). The backgrounds from \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) and \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decays are assumed to be uniform in \({q^2} = m^{2}({\mu ^+\mu ^-})\), consistent with both the expected [35] and the observed [36] shape of the differential \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) decay rate in the \(q^2\) range of interest. The backgrounds are generated over the interval \(2< {q^2} < 6\mathrm {\,GeV} ^{2}\), which covers the two bins of the LHCb \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) analysis closest to the \({D} ^{*0}\) mass. The level of each of the three backgrounds is taken from Ref. [30] and is scaled to the considered integrated luminosity. Proton-proton collisions during LHC Run 2 (2015–18) were at 13 TeV centre-of-mass energy and those in future run periods are expected to be at similar or slightly higher energies. When extrapolating to future data sample sizes it is assumed that the \({b} {\overline{{b}}} \) production cross-section scales linearly with centre-of-mass energy [37].
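The difference between a background that is uniform in \(m({\mu ^+\mu ^-})\) and one that is uniform in \({q^2} \) can be made concrete with a small generator. The following is a simplified sketch under stated assumptions, not the code used in the study: the yields passed to the function are hypothetical, they are held fixed rather than fluctuated from pseudoexperiment to pseudoexperiment, and the two \(B^-\) decay backgrounds are treated identically.

```python
import math
import random

Q2_MIN, Q2_MAX = 2.0, 6.0  # generation window in q^2 quoted in the text (GeV^2)


def sample_flat_in_mass(rng):
    """Dimuon mass drawn uniformly in m (combinatorial background)."""
    return rng.uniform(math.sqrt(Q2_MIN), math.sqrt(Q2_MAX))


def sample_flat_in_q2(rng):
    """Dimuon mass for a candidate drawn uniformly in q^2 = m^2."""
    return math.sqrt(rng.uniform(Q2_MIN, Q2_MAX))


def generate_background(n_comb, n_pimumu, n_kmumu, seed=1):
    """Return dimuon masses (GeV) for one background-only pseudoexperiment.

    The three yields are hypothetical inputs; a complete study would draw
    them from Poisson distributions around the expected levels.
    """
    rng = random.Random(seed)
    masses = [sample_flat_in_mass(rng) for _ in range(n_comb)]
    masses += [sample_flat_in_q2(rng) for _ in range(n_pimumu + n_kmumu)]
    return masses


masses = generate_background(n_comb=100, n_pimumu=50, n_kmumu=50)
print(len(masses), min(masses), max(masses))
```

A density that is flat in \({q^2} \) rises linearly with \(m({\mu ^+\mu ^-})\), so the two assumptions differ in shape across the full window, although both are close to flat over the narrow region around \(m_{{{D} ^{*0}}}\).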

Fig. 1 Differential branching fraction of the \({{{B} ^-}} \rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decay as a function of \({q^2} = m^2({\mu ^+\mu ^-})\), taken from Ref. [30]. The hashed regions correspond to theoretical predictions from Refs. [32,33,34]

Limits are set at 90% confidence level for each pseudoexperiment using the method described in Ref. [38] (as implemented in the TRolke class in ROOT), taking the uncertainty on the single-event sensitivity into account. Two types of mass region are defined: a signal region \(\pm 10{\mathrm {\,MeV\!/}c^2} \) around the known \({D} ^{*0}\) mass, \(m_{{{D} ^{*0}}}\), comprising a mixture of signal and background candidates, and two sideband regions, \(-35< m({\mu ^+\mu ^-}) - m_{{{D} ^{*0}}} < -15{\mathrm {\,MeV\!/}c^2} \) and \(15< m({\mu ^+\mu ^-}) - m_{{{D} ^{*0}}} < 35{\mathrm {\,MeV\!/}c^2} \), comprising only background candidates. The sideband regions are used to estimate the background in the signal region. The width of the signal region is taken as \(\pm 2\) times the expected \(m({\mu ^+} {\mu ^-})\) resolution of \(\sim 5{\mathrm {\,MeV\!/}c^2} \) [39], which is minimised by applying a kinematic fit to the \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) process in which the \({{B} ^-} \) mass is constrained to its known value.
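The sideband extrapolation described above can be illustrated as follows. The two sidebands together span \(40{\mathrm {\,MeV\!/}c^2} \) and the signal region spans \(20{\mathrm {\,MeV\!/}c^2} \), so for a locally flat background the expected background in the signal region is half the sideband count. The sketch below then converts an observed count into an upper limit on the signal yield with a classical Poisson construction in which the background is treated as exactly known. This is a deliberately simplified stand-in and not the method of Ref. [38], which also accounts for the uncertainties on the background and on the single-event sensitivity. All function names are illustrative.

```python
import math

SIGNAL_HALF_WIDTH = 10.0                      # MeV/c^2
SIDEBAND_INNER, SIDEBAND_OUTER = 15.0, 35.0   # MeV/c^2


def region(delta_m):
    """Classify a candidate by delta_m = m(mu+mu-) - m(D*0) in MeV/c^2."""
    distance = abs(delta_m)
    if distance < SIGNAL_HALF_WIDTH:
        return "signal"
    if SIDEBAND_INNER < distance < SIDEBAND_OUTER:
        return "sideband"
    return "other"


def expected_background(n_sideband):
    """Scale the sideband count to the signal region (flat background)."""
    sideband_width = 2.0 * (SIDEBAND_OUTER - SIDEBAND_INNER)  # 40 MeV/c^2
    signal_width = 2.0 * SIGNAL_HALF_WIDTH                    # 20 MeV/c^2
    return n_sideband * signal_width / sideband_width


def poisson_cdf(n, mu):
    """P(k <= n) for a Poisson distribution with mean mu (moderate mu only)."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def upper_limit_signal(n_obs, bkg, cl=0.90):
    """Classical upper limit on the signal yield for a known background."""
    target = 1.0 - cl
    if poisson_cdf(n_obs, bkg) <= target:
        return 0.0  # observation already unlikely under background alone
    lo, hi = 0.0, 10.0
    while poisson_cdf(n_obs, hi + bkg) > target:
        hi *= 2.0
    for _ in range(100):  # bisection on the signal yield
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n_obs, mid + bkg) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


bkg = expected_background(n_sideband=8)
limit = upper_limit_signal(n_obs=4, bkg=bkg)
print(f"expected background = {bkg:.1f}, yield limit = {limit:.2f}")
```

Multiplying the yield limit by the single-event sensitivity gives the corresponding limit on the branching fraction, as in Eq. (2).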

The results of this pseudoexperiment-based study are shown in Fig. 3. An upper limit at the level of \(10^{-8}\) appears to be possible with the current LHCb data set. This can be further improved to \(\mathcal{O}(10^{-9})\) with the total sample of \(300 \hbox {\,fb}^{-1} \) anticipated with future LHCb upgrades [40, 41]. The rate of reduction of the expected limit as the sample size increases slows markedly at around \(20 \hbox {\,fb}^{-1} \) as background starts to become limiting. While combinatorial background and misidentified \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) decays can be further reduced with tighter selection requirements, the contribution from nonresonant \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decays is irreducible in this approach.

Fig. 2 Expected single-event sensitivity for \({D} ^{*0}\) produced in \({{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{\pi } ^-} \) decays as a function of the integrated luminosity of the LHCb data set

Fig. 3 Expected upper limit at 90% confidence level (CL) on \(\mathcal{B} ({{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} )\) obtained using reconstructed \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) decays as a function of the integrated luminosity of the LHCb data set. The curve represents the median value from an ensemble of pseudoexperiments and the shaded regions the one and two sigma intervals

The studies resulting in Fig. 3 neglect interference effects between the \({{{B} ^-}} \!\rightarrow \left( {{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \right) {{\pi } ^-} \) signal and nonresonant \({{{B} ^-}} \!\rightarrow {{\pi } ^-} {\mu ^+\mu ^-} \) decays, which is reasonable given the narrow width of the \({D} ^{*0}\) meson. Nonetheless, since the \({D} ^{*0}\) quantum numbers are the same as part of the nonresonant contribution, a net interference effect is expected, with size depending on the relative phase between the two interfering amplitudes. Even though the \({D} ^{*0}\) width is below the experimental resolution, this effect could in principle be used, once the sample size is sufficiently large, to obtain a better limit than indicated in Fig. 3. Indeed, the LHCb collaboration has already demonstrated the possibility to measure a similar interference effect between \({{{B} ^-}} \!\rightarrow \left( {{J /\psi }} \!\rightarrow {\mu ^+\mu ^-} \right) {{K} ^-} \) and nonresonant \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) decays [39]. The results of this analysis allow the maximum size of the \({{{B} ^-}} \!\rightarrow {{D} ^{*0}} {{K} ^-} \) contribution to \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) decays to be estimated, and hence a limit on \({\mathcal {B}}\left( {{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \right) \) can be derived. Assuming the yield of \({{{B} ^-}} \!\rightarrow \left( {{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \right) {{K} ^-} \) decays is not more than \(2\times \sqrt{N}\), where \(N \sim 50\) is the yield in the \(m({\mu ^+\mu ^-})\) bin around \(m_{{{D} ^{*0}}}\), taken from Fig. 3 of Ref. [39], and normalising to the \({{{B} ^-}} \!\rightarrow \left( {{J /\psi }} \!\rightarrow {\mu ^+\mu ^-} \right) {{K} ^-} \) contribution which has a yield of \(\sim 900\,000\) (around 90% of the total \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) yield), and with an efficiency ratio of 0.85 (from Fig. 2 of Ref. [39]), then with known branching fractions [31, 42] one obtains

As expected, this mode is not as sensitive as \({{{B} ^-}} \!\rightarrow \left( {{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \right) {{\pi } ^-} \) due to the smaller \({{B} ^-} \) branching fraction and the larger background from nonresonant \({{{B} ^-}} \!\rightarrow {{K} ^-} {\mu ^+\mu ^-} \) decays.

3 \({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) decays

A similar strategy can in principle also be employed to search for \({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) decays, where the \({{B} _{c} ^+} \rightarrow {{J /\psi }} {{\pi } ^+} \) decay can be used for normalisation. In this case, however, some of the necessary ingredients to convert the signal yield to the \({{B} ^{*0}_{({s})}} \!\rightarrow {\mu ^+\mu ^-} \) branching fraction are not currently available. Specifically, the equivalent expression to Eq. (1) is

$$\begin{aligned} {\mathcal {B}}\left( {{B} ^{*0}_{({s})}} \!\rightarrow {\mu ^+\mu ^-} \right) = \frac{N_{{{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} }}{N_{{{B} _{c} ^+} \!\rightarrow {{J /\psi }} {{\pi } ^+} }} \, \frac{\epsilon _{{{B} _{c} ^+} \!\rightarrow {{J /\psi }} {{\pi } ^+} }}{\epsilon _{{{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} }} \, \frac{{\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{J /\psi }} {{\pi } ^+} \right) }{{\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \right) } {\mathcal {B}}\left( {{J /\psi }} \!\rightarrow {\mu ^+\mu ^-} \right) , \end{aligned}$$
(3)
$$\begin{aligned}&\quad = \alpha _{{{B} ^{*0}_{({s})}} \!\rightarrow {\mu ^+\mu ^-} } N_{{{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} }, \end{aligned}$$
(4)

and the branching fractions for \({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{s}} {{\pi } ^+} \) and \({{B} _{c} ^+} \rightarrow {{B} ^{*0}} {{\pi } ^+} \) decays are at present unmeasured. It seems possible, however, that at least the former decay could be observed with existing data. A previous LHCb analysis has observed \({{B} _{c} ^+} \rightarrow {{B} ^0_{s}} {{\pi } ^+} \) with \(3\hbox {\,fb}^{-1} \) of data [43] and this channel has also been used with the full current data sample of \(9\hbox {\,fb}^{-1} \) to measure the \({{B} _{c} ^+} \) mass [44]. The signal for \({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{s}} {{\pi } ^+} \) would be expected in the same spectrum as a satellite peak shifted below the \({{B} _{c} ^+} \) mass by approximately \(m_{{{B} ^{*0}_{s}}} - m_{{{B} ^0_{s}}}\) due to the soft photon that is not included in the reconstructed candidate. Similar “partial reconstruction” techniques have been used by LHCb to observe \({{B} _{c} ^+} \rightarrow {{J /\psi }} {{D} ^{*+}_{s}} \) [45] and \({{{B} ^-}} \rightarrow {{D} ^{*0}} {{\pi } ^-} \) [46] decays, where the soft neutral particles from the \({{D} ^{*+}_{s}} \) and \({{D} ^{*0}} \) decays are not included in the reconstructed candidate. Assuming that \({\mathcal {B}}({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{s}} {{\pi } ^+})\) is not much smaller than \({\mathcal {B}}({{B} _{c} ^+} \rightarrow {{B} ^0_{s}} {{\pi } ^+})\), as is expected theoretically [47,48,49,50,51], it should be possible to observe partially reconstructed \({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{s}} {{\pi } ^+} \) decays, and to measure the corresponding branching fraction, in the existing LHCb data sample. 
The \({{B} _{c} ^+} \rightarrow {{B} ^0} {{\pi } ^+} \) and \({{B} _{c} ^+} \rightarrow {{B} ^{*0}} {{\pi } ^+} \) decays could be searched for with a similar technique although, since these transitions are Cabibbo-suppressed relative to \({{B} _{c} ^+} \rightarrow {{B} ^{(*)0}_{s}} {{\pi } ^+} \), larger data samples may be required to observe them. A further, albeit minor, complication is that the \(m_{{{B} ^{*0}}} - m_{{{B} ^0}}\) mass difference is as yet unmeasured, although it can be predicted rather reliably from the \(m_{{{B} ^{*+}}} - m_{{{{B} ^+}}}\) mass difference invoking isospin symmetry. (This can also be interpreted as a further motivation for the analysis since a first measurement of \(m_{{{B} ^{*0}}}\) may be possible.)

Another apparent problem is that even for \({{B} _{c} ^+} \) decay modes which have been observed, there are currently no measurements of absolute branching fractions. Rather, only the product of the branching fraction with a ratio of production fractions is known. However, the \({{B} _{c} ^+} \) meson production rate cancels out in Eq. (3), allowing some simplification. With \({\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{J /\psi }} {{\pi } ^+} \right) \), which appears in the numerator, measured relative to \({{{B} ^+}} \!\rightarrow {{J /\psi }} {{K} ^+} \) and \({\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \right) \), which appears in the denominator, measured relative to \({{B} ^0_{s}} \!\rightarrow {{J /\psi }} \phi \), it is only necessary to know the relative production rate of \({{{B} ^+}} \) and \({{B} ^0_{s}} \) mesons, which has been measured precisely [52]. Taking the yield of \(25.2 \times 10^3\) \({{B} _{c} ^+} \!\rightarrow {{J /\psi }} {{\pi } ^+} \) decays in \(9 \hbox {\,fb}^{-1} \) of LHCb data from Ref. [44] and other inputs from Refs. [31, 43, 52, 53], and assuming \({\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{s}} {{\pi } ^+} \right) = \frac{1}{2}{\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{B} ^0_{s}} {{\pi } ^+} \right) \) and that the ratio of efficiencies in Eq. (3) is unity, a single-event sensitivity with \(9 \hbox {\,fb}^{-1} \) of \(\alpha _{{{B} ^{*0}_{s}} \!\rightarrow {\mu ^+\mu ^-} } \approx 5.5 \times 10^{-8}\) is obtained. Further assuming the \({{B} _{c} ^+} \!\rightarrow {{B} ^{*0}} {{\pi } ^+} \) decay has a Cabibbo-suppression factor of \(\left| V_{cd}/V_{cs}\right| ^2 \approx 5\%\) relative to \({{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{s}} {{\pi } ^+} \), the corresponding single-event sensitivity is found to be \(\alpha _{{{B} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} } \approx 1.1 \times 10^{-6}\). 
Scaling to a data sample of \(300 \hbox {\,fb}^{-1} \), the achievable single-event sensitivities could reach \(\approx 1.3 \times 10^{-9}\) and \(2.7 \times 10^{-8}\) for \({{B} ^{*0}_{s}} \!\rightarrow {\mu ^+\mu ^-} \) and \({{B} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \) decays, respectively.
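The arithmetic linking these single-event sensitivities can be checked with a short sketch. The inputs are the values quoted in the text; the luminosity scaling shown is a naive one in which yields grow linearly with integrated luminosity only, which is an assumption of this illustration and not necessarily the procedure used to obtain the quoted values.

```python
# Values quoted in the text.
alpha_bs_9 = 5.5e-8      # B*0_s single-event sensitivity with 9 fb^-1
cabibbo_factor = 0.05    # |V_cd / V_cs|^2, approximately 5%

# A Cabibbo-suppressed parent decay gives fewer B*0 mesons, so the
# single-event sensitivity is larger by the inverse of the factor.
alpha_b0_9 = alpha_bs_9 / cabibbo_factor


def scale_with_luminosity(alpha, lumi_from, lumi_to):
    """Naive scaling: yields proportional to luminosity, alpha to 1/L."""
    return alpha * lumi_from / lumi_to


alpha_bs_300_naive = scale_with_luminosity(alpha_bs_9, 9.0, 300.0)
alpha_b0_300_naive = scale_with_luminosity(alpha_b0_9, 9.0, 300.0)

print(f"alpha(B*0, 9 fb^-1)           = {alpha_b0_9:.2e}")
print(f"alpha(B*0_s, 300 fb^-1) naive = {alpha_bs_300_naive:.2e}")
print(f"alpha(B*0, 300 fb^-1) naive   = {alpha_b0_300_naive:.2e}")
```

The naive values, \(1.65 \times 10^{-9}\) and \(3.3 \times 10^{-8}\), are somewhat larger than those quoted above. A plausible reason, consistent with the extrapolation described in Sect. 2, is that the quoted values also include the increase of the production cross-section with collision energy.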

Interpretation of these single-event sensitivities must be made with care, however, since they assume a selection efficiency comparable to that for \({{B} _{c} ^+} \!\rightarrow {{J /\psi }} {{\pi } ^+} \) decays in Ref. [44]. In practice the selection requirements will be optimised to account for the level of background in the signal region for the \({{B} _{c} ^+} \rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) search. Since there is no published search for \({{B} _{c} ^+} \rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) decays outside the mass regions where the dimuon pair originates from a \({{J /\psi }} \) or \(\psi (2S)\) charmonium state, it is hard to judge what the optimal requirements are likely to be. Moreover a contribution from nonresonant \({{B} _{c} ^+} \rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) decays, which can occur in the SM through an annihilation diagram, may provide a limiting background if it is not negligible around \(m({\mu ^+\mu ^-}) \sim m_{{{B} ^{*0}_{({s})}}}\). Nevertheless, it appears possible that interesting sensitivity to \({{B} ^{*0}_{s}} \!\rightarrow {\mu ^+\mu ^-} \) and \({{B} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \) decays may be achievable.

4 Summary

In summary, the \({{{B} ^-}} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^-} \) and \({{B} _{c} ^+} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) decays provide interesting possibilities to search for the leptonic weak decays of \({D} ^{*0}\), \({B} ^{*0}\) and \({B} ^{*0}_{s} \) vector mesons. Published data allow a world-leading limit on \({\mathcal {B}}\left( {{D} ^{*0}} \!\rightarrow {\mu ^+\mu ^-} \right) \) to be obtained, and this can be significantly improved upon with a dedicated analysis of existing data. Sensitivity at the level of \(\mathcal{O}(10^{-9})\) is expected to be possible with the data sample to be collected by the end of HL-LHC operation with upgrades of the LHCb experiment. Competitive measurements could potentially also be made by the ATLAS and CMS experiments, but dedicated studies will be necessary to evaluate how well background can be suppressed. Good sensitivity to \({{B} ^{*0}_{({s})}} \!\rightarrow {\mu ^+\mu ^-} \) decays also appears achievable, although further experimental investigations will be needed before a firm conclusion can be reached on this point. In particular, measurements of \({\mathcal {B}}\left( {{B} _{c} ^+} \!\rightarrow {{B} ^{*0}_{({s})}} {{\pi } ^+} \right) \) and studies of nonresonant \({{B} _{c} ^+} \!\rightarrow {\mu ^+\mu ^-} {{\pi } ^+} \) decays are needed. Further studies of semi-inclusive search approaches, exploiting vector meson production through semileptonic b-hadron decays, will be necessary to understand if these may allow even better sensitivity to be reached.