Estimating Graph Counts From Signatures: A Maximum Entropy Approach

Henri Lhomond
November 1, 2023
Abstract
This paper presents a maximum entropy model to estimate graph counts
given geodesic ball signatures specifying node reachability. We derive
probability distributions based on entropy maximization subject to
signature constraints. Solving for Lagrange multipliers determines
maximum entropy distributions, with partition functions estimating
valid graph counts. Computational complexity and signature uncertainty
are analyzed. Numerical simulations demonstrate the approach,
indicating limitations and potential extensions.
1 Introduction
We examine the problem of counting graphs consistent with a reachability
signature. Let G be the set of possible graphs on N nodes. A signature
specifies R(d), the number of nodes reachable at distance d from a node,
for each distance d. We seek the number of valid graphs |G∗|, where
G∗ ⊆ G is the set of graphs whose reachability R∗(d) equals R(d) for
every d.
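To make the signature concrete, the following minimal sketch computes R∗(d) for one graph by breadth-first search. It assumes that R(d) counts nodes at exact shortest-path distance d, averaged over all start nodes; the name `reachability_signature` and the adjacency-list input are illustrative choices, not taken from the paper. A cumulative sum of the result would give the ball-style variant.

```python
from collections import deque

def reachability_signature(adj, max_d):
    """Average, over all start nodes, of the number of nodes at exact
    shortest-path distance d, for d = 1..max_d (assumed definition).

    adj: adjacency list, where adj[u] lists the neighbours of node u.
    """
    n = len(adj)
    totals = [0] * max_d
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == max_d:
                continue  # neighbours would lie beyond the signature range
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    totals[dist[v] - 1] += 1
                    queue.append(v)
    return [t / n for t in totals]

# Path graph 0 - 1 - 2 - 3
path = [[1], [0, 2], [1, 3], [2]]
print(reachability_signature(path, 3))  # [1.5, 1.0, 0.5]
```

Each call costs one breadth-first search per node, so evaluating R∗(g) for a single graph is polynomial in N.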
Our approach models graph generation as a maximum entropy process that
reproduces the signature in expectation. The entropy H measures state
multiplicity. Maximizing H subject to the constraint ⟨R∗⟩ = R gives a
probabilistic estimate of the graph count via the partition function.
We derive Lagrange conditions for the signature constraints and analyze
computational complexity. Experiments highlight successes and
limitations. Possible extensions incorporating uncertainty are discussed.
2 Maximum Entropy Model
Let p(g) be the probability of generating graph g ∈ G. The entropy is:
$$ H = -\sum_{g \in G} p(g) \log p(g) \tag{1} $$
We constrain expected reachability to match the signature:
$$ \langle R^* \rangle = \sum_{g \in G} p(g) R^*(g) = R \tag{2} $$
where R∗(g) gives reachabilities for graph g.

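Maximizing entropy subject to the constraints yields an equilibrium
distribution. We introduce a vector of Lagrange multipliers λ, one per
signature entry, with the sign chosen so that the solution takes the
form of Eq. 4. Normalization of p(g) is enforced separately and absorbed
into Z:
$$ L = H - \lambda^T (\langle R^* \rangle - R) \tag{3} $$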
Taking derivatives w.r.t. p(g) and λ gives:
$$ p(g) = \frac{1}{Z(\lambda)} e^{-\lambda^T R^*(g)} \tag{4} $$
$$ R = -\nabla_\lambda \log Z(\lambda) \tag{5} $$
where $Z(\lambda) = \sum_{g \in G} e^{-\lambda^T R^*(g)}$ is the
partition function.
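The step from the Lagrangian to Eqs. 4 and 5 can be spelled out as follows. This is a sketch under stated assumptions: the multiplier term is taken with a minus sign so that the exponent in Eq. 4 is negative, and the multiplier μ for the normalization constraint is introduced here for illustration and does not appear in the paper.

```latex
\begin{align*}
\mathcal{L} &= -\sum_{g \in G} p(g)\log p(g)
  - \lambda^T\Big(\sum_{g \in G} p(g) R^*(g) - R\Big)
  - \mu\Big(\sum_{g \in G} p(g) - 1\Big) \\
\frac{\partial \mathcal{L}}{\partial p(g)}
  &= -\log p(g) - 1 - \lambda^T R^*(g) - \mu = 0 \\
\Rightarrow\quad p(g) &= e^{-1-\mu}\, e^{-\lambda^T R^*(g)}
  = \frac{1}{Z(\lambda)}\, e^{-\lambda^T R^*(g)},
  \qquad Z(\lambda) = \sum_{g \in G} e^{-\lambda^T R^*(g)} \\
-\nabla_\lambda \log Z(\lambda)
  &= \frac{1}{Z(\lambda)} \sum_{g \in G} R^*(g)\, e^{-\lambda^T R^*(g)}
  = \langle R^* \rangle = R
\end{align*}
```

The normalization constraint fixes e^{-1-μ} = 1/Z(λ), which is why only λ remains as a free parameter.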
Given a signature R, we solve Eq. 5 for λ∗ and evaluate Z(λ∗) to
approximate |G∗|, the graph count:
$$ |G^*| \approx e^{H^*} = Z(\lambda^*) \tag{6} $$
Low entropy indicates constraints dominate, limiting valid graphs. Next
we analyze computational complexity.
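For very small N the model of this section can be evaluated exactly by enumeration. The sketch below is illustrative only: it assumes labeled, simple, undirected graphs, assumes R∗(g) counts nodes at exact distance d averaged over start nodes, and uses hypothetical helper names (`signature`, `maxent_distribution`). It builds p(g) from Eq. 4 for a given λ and returns Z(λ) together with the expected signature.

```python
from itertools import combinations, product
from math import exp

def signature(n, edges, max_d):
    """Average number of nodes at exact distance d = 1..max_d (assumed definition)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    totals = [0] * max_d
    for source in range(n):
        seen = {source}
        frontier = [source]
        for d in range(max_d):
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v)
                        nxt.append(v)
            totals[d] += len(nxt)  # nodes first reached at distance d + 1
            frontier = nxt
    return tuple(t / n for t in totals)

def maxent_distribution(n, lam):
    """Enumerate all labeled graphs on n nodes and apply Eq. 4."""
    pairs = list(combinations(range(n), 2))
    feats = [
        signature(n, [p for p, bit in zip(pairs, mask) if bit], len(lam))
        for mask in product((0, 1), repeat=len(pairs))
    ]
    weights = [exp(-sum(l * r for l, r in zip(lam, f))) for f in feats]
    Z = sum(weights)                      # partition function
    probs = [w / Z for w in weights]      # p(g) from Eq. 4
    mean = [sum(p * f[k] for p, f in zip(probs, feats)) for k in range(len(lam))]
    return Z, probs, mean

# With lambda = 0 every graph has weight 1, so Z is the number of graphs, 2**6.
Z, probs, mean = maxent_distribution(4, (0.0, 0.0))
print(Z, mean)
```

Enumeration visits 2^(N(N-1)/2) graphs, so it is only practical for a handful of nodes; it is useful mainly as a reference against which sampling estimates can be checked.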
3 Computational Considerations
Exact counting of graph configurations is #P-complete in general [1].
Our approach approximates the count via sampling.
Finding λ∗ is a convex optimization problem in D dimensions, where D is
the signature length (the number of distances d in the signature). It
can be solved efficiently with interior-point methods or gradient
descent.
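One way to carry out this optimization is plain gradient descent on the convex dual F(λ) = log Z(λ) + λ^T R, whose gradient is R − ⟨R∗⟩ and which is therefore stationary exactly when Eq. 5 holds. The sketch below is a minimal illustration, not the paper's procedure: the function name `solve_multipliers`, the step size, and the toy feature vectors standing in for R∗(g) are all assumptions, and the expectation is computed exactly over a small list rather than by sampling.

```python
from math import exp

def solve_multipliers(feats, target, lr=0.5, steps=5000):
    """Gradient descent on the dual F(lam) = log Z(lam) + lam . target.

    feats:  list of feature vectors R*(g), one per enumerated state g.
    target: the signature R to be matched in expectation.
    """
    dim = len(target)
    lam = [0.0] * dim
    mean = [0.0] * dim
    for _ in range(steps):
        weights = [exp(-sum(l * r for l, r in zip(lam, f))) for f in feats]
        Z = sum(weights)
        # Expected features under the current Gibbs distribution (Eq. 4).
        mean = [sum(w * f[k] for w, f in zip(weights, feats)) / Z
                for k in range(dim)]
        # Gradient of F is (target - mean); step against it.
        lam = [l - lr * (t - m) for l, t, m in zip(lam, target, mean)]
    return lam, mean

# Toy stand-in for the set of graph signatures (hypothetical values).
feats = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
target = (0.8, 0.5)
lam, mean = solve_multipliers(feats, target)
print(lam, mean)  # mean should be close to target after convergence
```

A solution with finite λ exists only if the target lies strictly inside the convex hull of the feature vectors. For real graph ensembles the expectation would itself have to be estimated by sampling, which makes each gradient step noisy.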
Evaluating Z(λ) for a given λ requires summing over the exponentially
large G. We estimate Z using Monte Carlo integration, drawing M sample
graphs g_1, ..., g_M uniformly from G:
$$ Z \approx \frac{|G|}{M} \sum_{i=1}^{M} e^{-\lambda^T R^*(g_i)} \tag{7} $$
Choosing sufficiently large M gives a close approximation, with errors
decreasing as O(M^{-1/2}).
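The estimator in Eq. 7 can be sketched as follows, assuming G is the set of labeled simple graphs on n nodes, so that including each edge independently with probability 1/2 samples uniformly from G and |G| = 2^(n(n-1)/2). The helper names and the exact-distance definition of R∗(g) are illustrative assumptions. For a signature of length one, R∗(g) is the average degree 2|E|/n, and Z has a closed form that serves as a check.

```python
import math
import random

def signature(n, edges, max_d):
    """Average number of nodes at exact distance d = 1..max_d (assumed definition)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    totals = [0] * max_d
    for source in range(n):
        seen = {source}
        frontier = [source]
        for d in range(max_d):
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v)
                        nxt.append(v)
            totals[d] += len(nxt)
            frontier = nxt
    return [t / n for t in totals]

def estimate_Z(n, lam, M, seed=0):
    """Monte Carlo estimate of Z(lam) following Eq. 7."""
    rng = random.Random(seed)
    pairs = [(u, v) for u in range(n) for v in range(u + 1, n)]
    total = 0.0
    for _ in range(M):
        # Each edge present with probability 1/2: uniform over labeled graphs.
        edges = [p for p in pairs if rng.random() < 0.5]
        r = signature(n, edges, len(lam))
        total += math.exp(-sum(l * x for l, x in zip(lam, r)))
    return 2 ** len(pairs) * total / M

est = estimate_Z(4, (1.0,), M=20000)
# Closed form, valid only for a length-one signature: each of the 6 possible
# edges contributes a factor (1 + exp(-2 * lam / n)).
exact = (1 + math.exp(-2 * 1.0 / 4)) ** 6
print(est, exact)
```

For large n the prefactor |G| overflows floating point, so a practical implementation would accumulate log Z instead. Uniform sampling also becomes inefficient when the weights are dominated by a small fraction of graphs.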
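Verifying graph validity remains challenging. Reachability within a
single graph can be tested in polynomial time by breadth-first search,
but deciding whether a set of constraints can be satisfied at all is
NP-complete in general [2]. However, our signatures specify only average
reachability. Path lengths in small-world graphs are short and
concentrate around a typical value [3]; we approximate their
distribution as normal, allowing approximate verification.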
In summary, computational hardness arises in multiple stages but can be
mitigated through estimation and approximation techniques. Greater
efficiency would facilitate larger graph sizes. Next we demonstrate the
approach on a numerical example.
4 Numerical Example
We estimate graphs on N = 100 nodes with signature (50, 25, 15, 10, 8, 5) for
d = 1 to d = 6.
Solving Eq. 5 gives λ∗ = (0.021, 0.0068, 0.0047, 0.0042, 0.004, 0.0039).
Sampling 10^6 graphs yields Z(λ∗) ≈ 2000, so we estimate ≈ 2000 valid
graphs.
The distribution of sampled graph reachabilities is concentrated around
the signature, confirming consistency.
However, testing graph validity is prohibitive. Approximate tests show
12% of samples violate the signature, suggesting Z overestimates the count.
The low entropy of H ∗ = 7.6 also indicates significant constraints limiting
graph possibilities below typical random graph entropy.
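As a quick arithmetic check, and assuming natural logarithms in Eq. 1, the reported entropy and partition function agree under Eq. 6:

```latex
\ln Z(\lambda^*) \approx \ln 2000 = \ln 2 + 3 \ln 10
  \approx 0.693 + 6.908 = 7.60 \approx H^*
```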
Incorporating uncertainty in R could improve estimates. We discuss
extensions next.
5 Discussion
This work demonstrates the use of maximum entropy models to estimate
graph counts from signatures. Key limitations are computational hardness
and signature violation rates.
Possible extensions include:
• Flexible probability models handling signature uncertainty [4]
• Efficient approximate graph validation methods
• Alternative entropy representations e.g. Bethe approximation [5]
• Variational or message passing algorithms to estimate Z [6]
• Neural network models learning graph likelihoods [7]
The framework connects statistical physics, network science and graph
theory. Significant development could enable useful tools for large
graph analysis and sampling.
References
[1] Valiant, L. G. (1979). The complexity of enumeration and reliability
problems. SIAM Journal on Computing, 8(3), 410-421.
[2] Cook, S. A. (1971). The complexity of theorem-proving procedures.
Proceedings of the third annual ACM symposium on Theory of computing,
151-158.
[3] Travers, J., Milgram, S. (1969). An experimental study of the small
world problem. Sociometry, 425-443.
[4] Presse, S., Ghosh, K., Lee, J., Dill, K. A. (2013). Principles of
maximum entropy and maximum caliber in statistical physics. Reviews of
Modern Physics, 85(3), 1115.
[5] Yedidia, J. S., Freeman, W. T., Weiss, Y. (2005). Constructing free-
energy approximations and generalized belief propagation algorithms. IEEE
Transactions on information theory, 51(7), 2282-2312.
[6] Wainwright, M. J., Jordan, M. I. (2008). Graphical models,
exponential families, and variational inference. Foundations and Trends
in Machine Learning, 1(1–2), 1-305.
[7] You, J., Ying, R., Leskovec, J. (2019). Position-aware graph neural
networks. International Conference on Machine Learning, 7134-7143.
PMLR.
