QUBO-based Density Matrix Electronic Structure Method
Christian F. A. Negre, Alejandro Lopez-Bezanilla, Yu Zhang, Prosper D. Akrobotu, Susan M. Mniszewski, Sergei
Tretiak, and Pavel A. Dub
Density matrix electronic structure theory is used in many quantum chemistry methods to alleviate the computational cost that arises from directly using wave functions. Although density-matrix-based methods are computationally more efficient than wave-function-based methods, significant computational effort is still involved. Since the Schrödinger equation needs to be solved as an eigenvalue problem, the time-to-solution scales cubically with the system size, and the problem must be solved many times in order to reach charge or field self-consistency. We propose and study a method to compute the density matrix by using a quadratic unconstrained binary optimization (QUBO) solver. This method could be useful to solve the problem with quantum computers and, more specifically, quantum annealers. The proposed method is based on a direct construction of the density matrix using a QUBO eigensolver. We explore the main parameters of the algorithm, focusing on precision and efficiency. We show that, while direct construction of the density matrix using a QUBO formulation is possible, the efficiency and precision have room for improvement. Moreover, calculations performing quantum annealing with D-Wave's new Advantage quantum processing units are compared with classical simulated annealing, further highlighting some problems of the proposed method. We also show some alternative methods that could lead to better performance of the density matrix construction.

arXiv:2201.04720v1 [physics.chem-ph] 12 Jan 2022
I. INTRODUCTION

The correct choice of the model or method used to represent the electronic structure of a chemical system is crucial to get a good description of any observable (forces, energies, etc.) of such a system. The quality of a model can be estimated as the ratio between its predictive power and its complexity (Quality = Predictive Power / Complexity), the predictive power being related to the amount of information that can be extracted from a calculation, and the complexity mainly referring to how the time-to-solution scales with the system size [1].

Although this definition is vague, it gives us the general understanding that, in order to optimize a method (increase its quality), it is necessary to work on approaches that could decrease the complexity and, at the same time, avoid losing predictive power. In many cases the predictive power translates into accuracy. Reducing the complexity is the typical approach that has been taken, given that computational resources have always been limited due to the ever-increasing need of addressing larger systems. However, with the arrival of quantum computers (an alternative computational paradigm) and, more specifically, quantum annealers (QAs) [2], it is possible that more complex formulations (in terms of traditional computation) of a given problem end up having a shorter or even instantaneous time-to-solution without loss of accuracy. This has been the case for graph problems, including graph partitioning and community detection, whose solution with QAs was made possible by a quadratic unconstrained binary optimization (QUBO) reformulation of the problem [3,4]. For quantum chemistry methods, the long-term vision is one in which all the computational burden of the eigenvalue problem and the self-consistent process could be shifted towards an “extremely complex,” yet easily solvable QUBO problem (see Figure 1).

[Figure 1: flow diagram. Left panel (traditional SCC loop): $H = H_0 + V$; $\rho = C f_F(\epsilon) C^{\dagger}$; $V = f_C(\rho)$; if $|\rho - \rho_{old}| < tol$, STOP. An arrow labeled "Paradigm shift" points to the right panel: $x^t Q x$.]

FIG. 1. Scheme showing the paradigm shift hereby proposed. On the left we represent the traditional self-consistent charge (SCC) approach with steps that include the correction of the Hamiltonian elements $H$; the construction of the density matrix $\rho$ from a Fermi function $f_F$ of the Hamiltonian; and the calculation of the effective potential $V$ as a Coulombic function $f_C$ of $\rho$ to be used to reconstruct the Hamiltonian in the subsequent SCC step. On the right, the alternative QUBO idea is proposed.

In this paper we take an initial step towards the aforementioned vision by exploring how feasible it is to construct the density matrix (DM) using linear algebra QUBO algorithms. Previous work on QAs in quantum chemistry has determined the ground and first excited states in the Full Configuration Interaction (CI) method by diagonalizing the Full CI (FCI) matrix [5]; downfolded the FCI matrix to reduce the size of the effective matrix [6]; and applied Time Dependent Density Functional Theory (DFT) to compute excitations [7].

The density matrix (DM) formalism is ubiquitous in several methods in computational chemistry that compute the electronic structure of a molecular system. Among these methods, the Kohn-Sham method within DFT is arguably the most frequently used of the quantum chemistry methods, preferred over post-Hartree-Fock methods for its performance in terms of predictive power relative to its computational cost [8,9]. The foundations of this theory are briefly stated below.

The many-body wave function of interacting electrons
in a chemical system resulting from the solution of the Schrödinger equation depends on $3 \times N_e$ spatial coordinates, where $N_e$ is the number of electrons in the system. DFT can be used to overcome this complexity, allowing for the calculation of observables just by knowing the electronic density of the system. This theory is based on using the electronic density $n(r)$ ($r$ being the electron coordinates) as a variable of the total energy functional $E[n(r)]$ of the system, and the variational principle to minimize the latter. The theorems of P. Hohenberg and W. Kohn [10] ensure that the energy, wave function, and all other electronic properties are uniquely determined by the electron density $n(r)$. Moreover, for any external [...]

[...] energy of non-interacting electrons, and the inter-electronic interaction containing the non-kinetic correlation and exchange. The exact form of this functional is not known, but many approximations exist nowadays with varying accuracies and computational costs [11]. In this work we will use the most basic one, where the functional is represented by a local density approximation (LDA) of the form $E_{xc}[n(r)] = \int \epsilon_{xc}(n)\, n(r)\, d^3r$, where $\epsilon_{xc}(n)$ is the correlation and exchange energy per electron for a uniform gas of density $n$. Equation 2 is solved iteratively using equation 3 until reaching self-consistency between the density $n(r)$ and the effective potential $v_{eff}(r)$.

When written using a basis set of functions $\{\phi_i\}_{i \in 1,..,N}$ with $S_{ij} = \int_r \phi_i^*(r)\phi_j(r)\,dr$ and $H_{ij} = \int_r \phi_i^*(r)\hat{H}\phi_j(r)\,dr$, equation 2 turns into the following generalized eigenvalue problem:

$$Hc = \epsilon_i S c \tag{4}$$

where $c_k$ is the $k$-th component of the expansion of the wavefunction $\psi_i(r)$ in the chosen basis set. In matrix form equation 4 becomes:

$$HC = SC\epsilon \tag{5}$$

which is a generalized matrix eigenvalue problem, where $\epsilon$ is a diagonal matrix with entries $\epsilon_{ij} = \delta_{ij}\epsilon_i$ containing all the eigenvalues of matrix $H$. Finally, $C$ is a matrix containing all the expansion coefficients for each of the $N$ eigenfunctions $\psi_i(r)$.

Given the expansion coefficients $C$, the DM, which is our object of interest here, is constructed as:

$$\rho = C f(\epsilon) C^{t} \tag{6}$$

where we have used the Fermi distribution function $f_n$ for a system in thermal equilibrium, defined as:

$$f_n(\epsilon) = \frac{2}{1 + \exp\left(\frac{\epsilon - \epsilon_F}{k_b T}\right)} \tag{7}$$

[...] which can be reused in equation 3 to recompute $\hat{H}$. Other methods such as Hartree-Fock, tight-binding, and semiempirical methods are also based on constructing the single-particle DM [1,13–15]. From what was explained above, matrices $\epsilon$ and $C$ arise from diagonalizing the matrix representation of $\hat{H}$ in the aforementioned basis set. This diagonalization step is typically the bottleneck of the whole calculation in all density-matrix-based methods, and scales as $O(N^3)$, where $N$ is the number of elements in the basis set (which scales linearly with the system size). Over the years, computational chemists have dedicated considerable effort to developing algorithms that could solve the density matrix with $O(N)$ scaling [16–18], leading to the so-called “order n methods.” Some of them have also adapted algorithms to the ever-evolving classical computer architectures [19,20]. The problem with all these order n approaches is that they require a trade-off between accuracy and efficiency: the faster we want the method to be, the lower the accuracy of our results will be. In this paper we analyze the tractability of quantum annealing as an alternative computational paradigm to get accurate results at a lower computational cost for DM-based quantum chemistry methods.
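As a concrete illustration of Eqs. (6) and (7), the following minimal Python sketch builds $\rho$ from a given set of eigenvalues and expansion coefficients. The function names `fermi_occupation` and `density_matrix`, the two-orbital model Hamiltonian, and the parameter values are illustrative assumptions and are not taken from the paper. An orthonormal basis ($S = I$) is assumed so that the eigenpairs can be written down analytically instead of being obtained from a diagonalization routine.

```python
import math

def fermi_occupation(eps, eps_f, kT):
    """Eq. (7): occupation 2 / (1 + exp((eps - eps_f) / kT))."""
    arg = (eps - eps_f) / kT
    if arg > 700.0:  # guard against overflow in math.exp; occupation is ~0
        return 0.0
    return 2.0 / (1.0 + math.exp(arg))

def density_matrix(C, eps, eps_f, kT):
    """Eq. (6): rho = C f(eps) C^t, where C[i][k] is coefficient i of state k."""
    n = len(C)
    occ = [fermi_occupation(e, eps_f, kT) for e in eps]
    return [[sum(C[i][k] * occ[k] * C[j][k] for k in range(len(eps)))
             for j in range(n)]
            for i in range(n)]

# Hypothetical two-orbital model with S = I: H = [[0, -1], [-1, 0]].
# Its eigenvalues are -1 and +1, with eigenvectors (1, 1)/sqrt(2) and
# (1, -1)/sqrt(2) stored as the columns of C.
s = 1.0 / math.sqrt(2.0)
C = [[s, s],
     [s, -s]]
eps = [-1.0, 1.0]

# Fermi level placed in the gap; kT = 0.05 is an arbitrary illustrative value.
rho = density_matrix(C, eps, eps_f=0.0, kT=0.05)
n_electrons = rho[0][0] + rho[1][1]  # tr(rho S) with S = I
```

With the Fermi level in the gap and a small $k_b T$, the lower state is essentially doubly occupied, so the trace of $\rho$ is close to the two electrons of this toy model. In a real calculation, $C$ and $\epsilon$ would come from solving Eq. (5), which is the $O(N^3)$ step discussed above.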
[...] guarantees convergence to any arbitrary desired precision.

For every “next” eigenpair we want to compute, the eigenvalue associated with the previous eigenvector $x_{min} \equiv x_{i-1}$ of $H$ needs to be “pushed out” of the eigenspectrum by using the following transformation:

$$H = H - w S x_{i-1} x_{i-1}^T S \tag{13}$$

where $w$ is an estimation for the “eigenspectrum width.” A good estimation of $w$ can be calculated as $w = h - l$, where $h$ and $l$ are the highest and lowest Gershgorin bounds of $H$, respectively.

Using the previously explained procedure to compute eigenvectors, the density matrix is constructed as explained before, using a straightforward summation of eigenvector outer products up to the number of occupied states. This is:

$$\rho = \sum_{k}^{n_{occ}} x_k x_k^T \tag{14}$$

A Python-like pseudocode for the full algorithm is detailed in Pseudocode 1. The inputs are the Hamiltonian $H$, the overlap $S$, the occupation $nocc$, and the target precision $precT$. The output is the DM $\rho$.

Pseudocode 1. Full DM construction algorithm using the high-precision QUBO eigensolver. Description of the variables follows. H: Input Hamiltonian; precT: Target precision; nocc: Number of occupied states; λ: Minimal eigenvalue being computed; N: Size of H; err: Convergence error; tol: Tolerance for the initial phase; S: Overlap matrix; x: Minimal eigenvector being computed; prec: Convergence precision; ∇f: Gradient of f; ℋ: Hessian matrix; ρ: Output DM.

    def get_rho(H, S, nocc, precT):
        # Eigenspectrum width estimation
        w = get_Gershgorin_width(H)
        for i in range(nocc):                # Loop over occupied states
            λ = tr(H)/N                      # Initial estimate of λ
            # Initial phase
            while (err > tol):
                x = solve(x^T (H − λS) x)    # Send to annealer
                x ← x / sqrt(x^T S x)        # Normalize
                λ = x^T H x                  # Compute new eigenvalue
                err = x^T (H − λS) x         # Compute error
            # Descent phase
            prec = 0.1
            while (prec > precT):
                δ = solve(∇f δ + δ ℋ δ)      # Send to annealer
                x ← x + δ
                x ← x / sqrt(x^T S x)        # Normalize
                λ = x^T H x                  # Compute new eigenvalue
                err = x^T (H − λS) x         # Compute error
                if (err > err_old):
                    v ← 0.1 v
                    prec = 0.1 prec
                err_old = err
            H ← H − w S x x^T S              # Shift eigenvalue up
            ρ ← ρ + 2 x x^T                  # Construct partial DM
        return ρ

Pseudocode 1 implements a straightforward way of computing the density matrix using the high-precision QUBO eigensolver. From this pseudocode we can immediately identify some important properties. We see that the matrix operations in the inner loops are $O(N^2)$ unless they can be solved using sparse matrix formats to get linear scaling. The other important point is that the two inner loops (initial and descent phases) have to be solved $n_{occ}$ times, where $n_{occ}$ is the number of occupied states, which depends on the number of electrons and therefore scales linearly with the number of orbitals $N$. From this, we can deduce that even if the solution of the quadratic problem takes no time, the overall scaling of the algorithm ($O(N^2)$ work repeated $n_{occ} \propto N$ times) gives back the “undesired” $O(N^3)$ scaling. Provided the matrix operations in the inner loop could be solved with $O(N)$ scaling methods, and the quadratic problem takes no time, we will have at best an $O(N^2)$ scaling. Even though this seems to be an undesired result, it is a first direct approach to solve the DM using a QUBO eigensolver and deserves some further study.

C. DFT calculations

We used the DFT-based code SIESTA [24,25] to obtain all the molecules' Hamiltonians in a tight-binding-like representation. SIESTA is a first-principles electronic structure code for molecules and solids which represents wavefunctions as a combination of atomic-like orbitals [26]. The Kohn-Sham equations are solved self-consistently in a linear combination of atomic orbitals basis set and they are restricted to a fixed-size cell condition. This framework allows us to obtain a description of the operator associated with the system energy employing a reduced number of orbitals in the basis set with complete multiple-zeta and polarized bases, depending on the required accuracy. In this study, all first-principles calculations were performed with a minimal single-ζ basis for each atom. The interaction between atoms was described within the local density approximation [27] for the exchange-correlation functional. The integration over the Brillouin zone (BZ) was performed using a Monkhorst-Pack sampling at the Γ point. The radial extension of the orbitals had a finite range with a kinetic energy cutoff of 50 meV. A separation of 20 Å in a cubic simulation box prevents virtual periodic molecules from interacting. We used different numbers of molecules to get the Hamiltonian (H) and overlap (S) matrices needed to test our algorithm. We also computed H and S for benzene in order to test for resiliency regarding degeneracy of eigenstates.

III. RESULTS AND DISCUSSION

We first studied the propagation of the errors with the number of occupied states. The error in the calculated DM is computed as follows:

$$\mathrm{error} = \|\rho - \rho^{ex}\|_1 = \sum_{ij} |\rho_{ij} - \rho^{ex}_{ij}| \tag{15}$$

where $\rho^{ex}$ is the DM obtained using regular diagonalization in double precision, which we hereby set as our standard. In order to reach chemical accuracy, the error needs to satisfy $\mathrm{error} \le 1.0\,[\mathrm{kCal/mol}] / \max_{lm}(H_{lm})$ [...]
From Pseudocode 1, we can deduce that the error committed in the calculation of the very first computed eigenpairs is liable to propagate towards the higher eigenpairs, hence making the calculation of DMs with higher occupations less accurate. Figure 2 shows the error as a function of the occupation number for a single water molecule with six total atomic orbitals. Three different precision values were used to show that this problem happens regardless of the precision used. In general, we notice that the higher we set the precision for the computation of the eigenvectors, the less error is committed in the construction of the DM across different values of occupations. The larger the number of occupied orbitals, the larger the error that is committed in the computation of the DM.

[Figure 2: plot of log10(Error) versus number of occupied orbitals (1 to 6); series: Prec = 1.0E-1, Prec = 1.0E-3, Prec = 1.0E-5.]

FIG. 2. Error as a function of the number of occupied states for a single water molecule with N = 6. Results for three different precision values are shown.

The problem of error propagation worsens with system size. As shown in Figure 3, the error gets larger when the total number of orbitals increases, regardless of the precision and number of bits that are used. The increase of the error with system size is a sign that a higher precision will be required for larger systems to reach chemical accuracy.

Figure 4 shows the number of iterations (total QUBO solutions) that are needed to construct the DM for two and four bits. The total number of iterations is computed by adding up the number of iterations needed to achieve the desired precision for the calculation of each eigenvector. From Figure 3 it does not seem that much is gained by increasing the number of bits; however, as shown in Figure 4, fewer iterations are needed to reach convergence when using a larger number of bits.

[Figure 3: plot of Error versus number of orbitals (5 to 35).]

FIG. 3. Error as a function of system size (number of orbitals). Calculations using two and four bits of precision are shown.

[Figure 4: plot of number of iterations (0 to 1400) versus number of orbitals (5 to 35); series: Nbits = 2, Nbits = 4.]

FIG. 4. Number of iterations as a function of the system size (number of orbitals). Calculations using two and four bits of precision are shown.

Finally, we computed the number of iterations needed to achieve the desired precision to compute every single state for the benzene molecule. This particular case has been selected since, due to its high symmetry (see the molecular representation in the inset of Figure 5), the electronic structure of benzene is characterized by having several groups of degenerate molecular orbitals (orbitals that are very close in energy). In general, the closer two molecular orbitals are in energy, the stronger the degeneracy between them. Figure 5 shows the number of iterations for computing a single eigenvector as a function of the proximity in energy to the next eigenvector. We notice that a group of molecular orbitals that are close in energy to the next (< 0.1 eV) needs significantly more iterations to be computed at the desired precision. This point is a significantly limiting factor since the electronic structure [...]
[Figure residue, legend labels only: "Degenerated states"; "DW5000Q", "SA-1000", "SA-100", "SA-10".]

In this section we present a comparison between the results obtained using SA with different numbers of sweeps and QA performed with D-Wave's new Advantage 4.1 machine [29]. The chain strength for QA was calculated as 30% of the largest absolute value in the QUBO matrix [30]. This helps ensure that few chain breaks contribute to erroneous solutions. The default anneal time of 20 microseconds was used. The system DM calculation was performed over sample Hamiltonians taken from different configurations over 100 molecular dynamics (MD) steps. We extracted H from 20 configurations evenly spaced over the course of the MD trajectory. The DM was computed using an eigenpair precision of $10^{-5}$. From Figure 6 we can see that, on average, the results using QA have a larger error. Moreover, we have observed that more iterations are needed to converge. In the case of SA, a larger number of sweeps tends to converge to results with lower errors. The mean errors of SA and QA are statistically different even for small numbers of sweeps. One of the major issues is the fact that the QUBO matrices have a high dynamic range, which causes a “resolution problem.” When the number of bits is large there are many small QUBO entries that could have a large relative error at the moment of embedding but that might have a large influence on the results. One could scale the QUBO matrix Q so that the minimum entry has a value of 1.0, but this introduces the need of having very strong [...]

V. ALTERNATIVE APPROACHES

In this section we will briefly describe some alternative approaches that could potentially overcome the issues that were found by using a direct method to construct the DM using a QUBO eigensolver. Three different approaches are suggested, together with a discussion of possible issues they might have.

1) Using a QUBO-based linear solver on the $O(N)$ method recently proposed by L. Andersson [32]. This algorithm shifts the computational cost of solving $\rho$ to the solution of a linear system ($Ax = b$) using the Conjugate Gradient method, which can be replaced by:

$$x = \underset{x}{\operatorname{argmin}}\,(Ax - b)^2$$

and can be easily formulated as a QUBO problem and solved on a QA. The problem that this method could have is the fact that several iterations (QUBO solves) might be needed to reach the desired precision of $x$. This will add yet another inner loop to the algorithm.

2) We can use the Green's function ($G(z)$) method and find $\rho$ by integrating over the complex contour (with variable $z \in \mathbb{C}$) containing the eigenspectrum of $H$. For this proposition we would need to have a method to invert a complex matrix using a QUBO formulation. Provided we can compute the inverse of a matrix using a QUBO
formulation, $\rho$ would be determined as:

$$\rho = \int_C G(z)\,dz \tag{16}$$

where $G(z) = (H - zI)^{-1}$ and $C$ is the semicircle above the real axis. Inverting a matrix $A$ can be simply done by solving $N$ linear systems of the form $A x_i - e_i = 0$, with $e_i$ being the $i$-th canonical vector and $x_i$ the $i$-th column of $A^{-1}$. This can be easily formulated as a QUBO provided both the solution and the matrix to be inverted are real. If this is not the case, as for the Green's function method proposed here, the formulation becomes more complicated and, as far as we know, no previous work has addressed this issue.

3) Finally, we could think about solving for $\rho$ “all at once” from a QUBO problem where $\rho$ would be written as a long vector $x$ containing $N \times N$ entries, such that $x_k = \rho_{ij}$, where $i = \lfloor k/N \rfloor + 1$ and $j = k - N\lfloor k/N \rfloor$, and where $\lfloor \cdot \rfloor$ is the operator that takes the “floor” value of a real number. In this case, the function to be minimized will be the total energy

$$E = \mathrm{tr}(\rho H) \equiv \mathbf{1}^T (H \otimes I)\, x \tag{17}$$

where in this case $\mathbf{1}_k = \delta_{ij}$, with $i = \lfloor k/N \rfloor + 1$ and $j = k - N\lfloor k/N \rfloor$. Although equation 17 can be easily formulated as a QUBO problem, the energy needs to be minimized under certain constraints given by the properties of $\rho$, some of which can be extremely difficult to formulate as a QUBO. Moreover, the size of the QUBO problem to be solved will scale as $N^2 \times N^2$, probably opening up other bottlenecks in the algorithm even if the annealing process requires no time.

We have demonstrated that the DM can be computed by using a QUBO-based eigensolver and summing up the self outer products of the eigenvectors up to the number of occupied states. Although the results obtained with this direct approach are encouraging, many issues still need to be addressed. We reported on the problem of the $O(N^2)$ operations inside the loops that compute the eigenpairs, which leads back to an $O(N^3)$ scaling. By experimenting with different occupation numbers, we have seen that the error rapidly increases with the occupation number regardless of the precision at which we compute each eigenvector. We have also seen that the error increases with the system size regardless of the number of bits used to compute the eigenpairs. We observed that by increasing the number of bits the number of iterations to converge decreases significantly. We have identified a problem where the number of iterations to reach precision increases for degenerate states. Finally, we showed that QA does not show any advantage when compared to SA for computing the DM within this QUBO formulation, and more research needs to be done to fully understand the QA vs SA comparison.

We have also proposed other alternative methods to compute the DM and analyzed some possible issues. These methods include using linear scaling algorithms based on linear systems that could be solved using QUBO solvers; using Green's functions obtained from a QUBO solver that could then be used to compute the DM by a closed integral surrounding the eigenspectrum through the complex plane; and solving the full density matrix “all at once” from a QUBO problem using a vectorized form of the DM. Even if all these alternative methods have challenges, there is still an opportunity for further improvements. An in-depth study of these other methods will be the subject of future work.

VII. ACKNOWLEDGMENTS

Research presented in this article was supported by the Laboratory Directed Research and Development (LDRD) program of Los Alamos National Laboratory (LANL) under project number 20200056DR. This research was also supported by the U.S. Department of Energy (DOE) National Nuclear Security Administration (NNSA) Advanced Simulation and Computing (ASC) program at LANL. We acknowledge the ASC program for providing the support for accessing D-Wave's Advantage 4.1 computing resource. LANL is operated by Triad National Security, LLC, for the National Nuclear Security Administration of U.S. Department of Energy (Contract No. 89233218NCA000001). Assigned: LA-UR-22-20271

VIII. APPENDIX

Multiplying $x$ and $y$ in the binary representation leads to the following operation:

$$x \times y = q_x^T (v^T \otimes v)\, q_y$$

An example with 3 bits follows. Given $x = 0.5 \equiv (0, 1, 0)$ and $y = 0.25 \equiv (0, 0, 1)$:

$$x \times y = (0, 1, 0) \begin{pmatrix} -1 & -0.5 & -0.25 \\ -0.5 & 0.25 & 0.125 \\ -0.25 & 0.125 & 0.0625 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = 0.125$$

B. Density matrix upper bound error

The error in $\rho$ is computed as $\Delta\rho = \|\rho - \rho^{ex}\|$, where $\rho^{ex}$ is the exact DM constructed using regular diagonalization methods. To ensure chemical accuracy we need to ensure that the total energy error remains below 1.0 kCal/mol. This means $\Delta E < 1.0$ kCal/mol.

$$\Delta E = \frac{\partial\, \mathrm{tr}(\rho H)}{\partial \rho}\,\Delta\rho + \frac{\partial\, \mathrm{tr}(\rho H)}{\partial H}\,\Delta H$$
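As a reading aid for the differential of $\mathrm{tr}(\rho H)$, the two partial derivatives can be written out entrywise. This is a short worked step under the assumption (not stated explicitly in this appendix) that $\rho$ and $H$ are treated as independent real matrices, with $\Delta\rho_{ij}$ and $\Delta H_{ij}$ denoting entrywise deviations, which is an assumed notation:

```latex
\mathrm{tr}(\rho H) = \sum_{ij} \rho_{ij} H_{ji}
\quad\Longrightarrow\quad
\frac{\partial\, \mathrm{tr}(\rho H)}{\partial \rho_{ij}} = H_{ji},
\qquad
\frac{\partial\, \mathrm{tr}(\rho H)}{\partial H_{ij}} = \rho_{ji},
\\[6pt]
\Delta E = \sum_{ij} H_{ji}\, \Delta\rho_{ij} + \sum_{ij} \rho_{ji}\, \Delta H_{ij}.
```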
Assuming no error is committed in the calculation of the Hamiltonian, we get: [...]

[...] “Graph partitioning using quantum annealing on the D-Wave system,” in Proceedings of the Second International Workshop on Post Moores Era Supercomputing, PMES’17 (Association for Computing Machinery, New York, NY, USA, 2017) pp. 22–29.
[4] C. F. A. Negre, H. Ushijima-Mwesigwa, and S. M. Mniszewski, “Detecting multiple communities using quantum annealing on the D-Wave system,” PLOS ONE (2019).
[5] A. Teplukhin, B. K. Kendrick, S. Tretiak, and P. A. Dub, “Electronic structure with direct diagonalization on a dwave quantum annealer,” Sci. Rep. 10, 20753 (2020).
[6] S. M. Mniszewski, P. A. Dub, S. Tretiak, P. M. Anisimov, Y. Zhang, and C. F. A. Negre, “Reduction of the molecular hamiltonian matrix using quantum community detection,” Scientific Reports 11 (2021).
[7] A. Teplukhin, B. K. Kendrick, S. M. Mniszewski, Y. Zhang, A. Kumar, C. F. A. Negre, P. M. Anisimov, S. Tretiak, and P. A. Dub, “Computing molecular excited states on a D-Wave quantum annealer,” Sci. Rep. 11, 18796 (2021).
[8] Á. Vázquez-Mayagoitia, W. Scott Thornton, J. R. Hammond, and R. J. Harrison, “Quantum chemistry methods with multiwavelet bases on massive parallel computers,” (2014).
[9] C. Fiolhais, F. Nogueira, and M. Marques, A Primer in Density [...]
[10] P. Hohenberg and W. Kohn, “Inhomogeneous electron gas,” Phys. Rev. 136, 864–870 (1964).
[11] S. F. Sousa, P. A. Fernandes, and M. J. Ramos, “General per- [...]
[...] density-functional calculations for very large systems,” Physical Review B 53, R10441–R10444 (1996).
[25] E. Artacho, D. Sánchez-Portal, P. Ordejón, A. García, and J. M. Soler, “Linear-scaling ab-initio calculations for large and complex systems,” physica status solidi (b) 215, 809–817 (1999).
[26] A. A. Demkov, J. Ortega, O. F. Sankey, and M. P. Grumbach, “Electronic structure approach for complex silicas,” Physical Review B 52, 1618–1630 (1995).
[27] W. Kohn and L. J. Sham, “Self-consistent equations including exchange and correlation effects,” Phys. Rev. 140, A1133–A1138 (1965).
[28] “dwave-neal: An implementation of a simulated annealing sampler for general ising model graphs in c++ with a dimod python wrapper.”
[29] C. M. Pau Farre, “The advantage system: Performance update,” Tech. Rep. 14-1054A-A (DWAVE, 2021).
[30] D-Wave, “Programming the D-Wave QPU: Setting the chain strength,” D-Wave Systems Whitepaper 2020-04-14, 11 (2020).
[31] A. S. Koshikawa, M. Ohzeki, T. Kadowaki, and K. Tanaka, “Benchmark test of black-box optimization using D-Wave quantum annealer,” (2021), arXiv:2103.12320 [cond-mat.stat-mech].
[32] L. Andersson, “Linear-scaling recursive expansion of the Fermi- [...]