MAINT: add test against generic fit method for vonmises distribution #18128
You were using `_assert_less_or_close_loglike` correctly. It's fine not to use `_reduce_func` like some of the other tests - I believe that's just the log-likelihood function with a penalty for out-of-bounds values. TBH, existing tests that use `_reduce_func` should probably be changed - out-of-bounds values should result in NaN. I will do that separately.

Instead, ideally this would use `nnlf` (which is an erroneous acronym for "negative log-likelihood function", NLLF). But I see why it can't just do that: `scale` is broken for the `vonmises` distribution, and it turns out that making the scale tiny results in a very good NLLF, giving the default implementation a huge advantage in the comparison. This wouldn't be fair, so your `negative_loglikehood` function ignores `scale`.

The best thing to do, though, would be to pass `fscale=1` to level the playing field. When I do that, this test fails: the default implementation is a little better than the override. I've made the test `xslow` and skipped CI, but can you check this out locally? The difference in log-likelihood is small, but all the other fit overrides pass this test, so I think this one should, too. It may just require tightening a root-solving tolerance?
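To make this concrete, here is a minimal sketch of the kind of comparison described above (my own illustration, not the PR's test code; the seed and sample size are arbitrary): fit with the scale pinned via `fscale=1` and score the result with scipy's `nnlf`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(12345)
rvs = stats.vonmises.rvs(1.0, size=100, random_state=rng)

# Fix the scale at 1 so that a tiny scale cannot game the comparison
params = stats.vonmises.fit(rvs, fscale=1)

# nnlf is scipy's negative log-likelihood function (the method name is a
# historical misspelling of "nllf"); lower values mean a better fit
nllf = stats.vonmises.nnlf(params, rvs)
```
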
Thanks for helping out with the likelihood function. I tried setting … What I did now was to add a relative tolerance parameter to the …
That's something that can be tested. If you find such a case, it would probably be considered a shortcoming, too.

For many distributions, the MLE of one parameter is strictly independent of the value of another. For instance, the MLE of the location parameter of the normal distribution is the sample mean, regardless of scale. The two parameters do not need to be optimized simultaneously to get machine-precision MLEs. Is this not the case for von Mises? (Or do you mean the equations say it is, but that's not what you're observing in practice?) One thing to test is to see what happens when the data is definitely not sampled from the von Mises distribution (e.g. bimodal). Do the results disagree even more?
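The normal-distribution claim is easy to verify numerically (a small self-contained check of my own, not code from this PR):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = stats.norm.rvs(loc=2.0, scale=3.0, size=500, random_state=rng)

# The MLE of the normal location is the sample mean, whatever the scale:
loc_free, scale_free = stats.norm.fit(x)      # both parameters free
loc_fixed, _ = stats.norm.fit(x, fscale=1.0)  # scale fixed at a wrong value

# loc_free, loc_fixed, and x.mean() all agree to machine precision
```
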
I've done quite a few …

- For 1000 sets of parameters, fitting with all parameters treated as free …
- For 1000 sets of parameters, fitting with the location fixed …
To further test this hypothesis (a problem in the math), I ran:

```python
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt

rng = np.random.default_rng(1638083107694713882823079058616272161)

n = 100
kappa, loc, scale = 1, 0, 1
dist_true = stats.vonmises(kappa, loc, scale)
rvs = dist_true.rvs(size=n, random_state=rng)

p_mle = stats.vonmises.fit(rvs)
loc_mle = p_mle[1]

n_flocs = 50
offsets = np.linspace(-1, 1, n_flocs)
nllf_overrides = np.zeros(n_flocs)
nllf_supers = np.zeros(n_flocs)
for i, offset in enumerate(offsets):
    print(i)
    p_override = stats.vonmises.fit(rvs, floc=loc_mle + offset)
    nllf_overrides[i] = stats.vonmises.nnlf(p_override, rvs)
    p_super = stats.vonmises.fit(rvs, floc=loc_mle + offset, fscale=1,
                                 superfit=True)
    nllf_supers[i] = stats.vonmises.nnlf(p_super, rvs)

diff = (nllf_overrides - nllf_supers)/nllf_supers
plt.plot(offsets, diff)
plt.xlabel('Offset of fixed location from MLE location')
plt.ylabel('Relative change in NLLF (positive is bad)')
```

The further the offset of …
I've studied this sort of thing. A summary of my findings for most distributions with overrides is at #11782 (comment). Some seem to be robust to everything I throw at them (e.g. the normal distribution, duh), in which case I've marked them "good". For the most part, I haven't merged a PR if I have observed any sign of regression, although there were some …

In this PR, I'd suggest just falling back to the generic fit method if either the location or shape is fixed. (If you are thinking of making any follow-up PRs, I'd recommend lining up a maintainer to review them first.)
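In outline, the suggested fallback could look something like this (a toy sketch of the pattern with illustrative names such as `fkappa` — not the actual scipy code):

```python
class GenericFit:
    """Stand-in for the generic rv_continuous.fit."""
    def fit(self, data, **kwds):
        return ("generic", sorted(kwds))

class VonMisesLike(GenericFit):
    """Override that defers to the generic method whenever the user
    fixes the location or the shape."""
    def fit(self, data, **kwds):
        if "floc" in kwds or "fkappa" in kwds:
            return super().fit(data, **kwds)
        return ("override", sorted(kwds))

dist = VonMisesLike()
print(dist.fit([0.1, 0.2]))            # ('override', [])
print(dist.fit([0.1, 0.2], floc=0.0))  # ('generic', ['floc'])
```
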
Thanks for this thorough analysis! It's getting late at my place, but from what I understand, the …
I'll go ahead and submit the fix to this branch if that sounds good.
Sounds good to me, and thanks again for your help. From what I saw, the case of fixed …
Yes, so much so that I thought about writing a book compiling distribution MLE results : )
```python
# location likelihood equation has a solution independent of kappa
loc = floc if floc is not None else find_mu(data)
# shape likelihood equation depends on location
shape = fshape if fshape is not None else find_kappa(data, loc)
```
This also allows the user to fix the shape.

Note that it is preferable to call the final values of the location and shape `loc` and `shape` rather than `floc` and `fshape`, since they were not actually *f*ixed.
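For context, the kappa-free solution of the location likelihood equation is the circular mean of the data - a standard von Mises result, sketched here with a hypothetical `circular_mean` helper (presumably `find_mu` in the diff does something equivalent):

```python
import numpy as np

def circular_mean(data):
    # MLE of the von Mises location: the angle of the mean resultant
    # vector; note that kappa does not appear anywhere
    return np.arctan2(np.sin(data).mean(), np.cos(data).mean())

angles = np.array([0.10, -0.20, 0.15, 0.05])
mu_hat = circular_mean(angles)  # close to the arithmetic mean for small angles
```
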
Something I'd suggest studying, @dschmitz89, is whether there is a relatively simple but more robust way of solving the `kappa` equation. For instance:

```python
from scipy import stats

rvs = [-0.92923506, -0.32498224, 0.13054989, -0.97252014, 2.79658071,
       -0.89110948, 1.22520295, 1.44398065, 2.49163859, 1.50315096,
       3.05437696, -2.73126329, -3.06272048, 1.64647173, 1.94509247,
       -1.14328023, 0.8499056 , 2.36714682, -1.6823179 , -0.88359996,
       1.3990757 , 0.50580527, 0.39606279, 0.59775193, -0.96931391,
       -2.94886662, 2.51401895, 1.85529066, -2.16612654, 0.69811142,
       1.55245103, 1.93724126, 1.04246362, 2.14759188, -2.59632106,
       1.44772926, 0.13118682, -0.70731962, -3.12222871, 1.1710668 ,
       -2.38427605, 1.177051 , -2.17754098, -0.15177223, 1.23153502,
       -2.17272573, 2.36591449, -2.70966423, 2.78704888, 0.04717811,
       -2.06487457, 1.99469452, 0.39622628, -2.84217529, -2.98719481,
       3.12200669, -1.51914078, 2.58646588, -1.64947704, -2.94559419,
       -3.00862607, -1.19894746, 1.46864545, -2.98512437, 2.02929986,
       -1.44114382, 2.02391158, 2.05532412, 2.97241311, 2.60627323,
       1.97437403, 2.8264543 , -0.86461338, 0.2307659 , 0.71714181,
       2.93683305, 0.05672313, 2.6922025 , 2.04911119, -2.10693874,
       1.3875065 , 2.97284149, -0.08860371, -2.91405584, -2.75601588,
       -2.46343408, 0.29537451, 0.57600184, -1.40230045, 1.73518165,
       -3.09919971, 2.55072157, 3.04286114, -2.32435821]
fixed = {'floc': 0.005191972335031776}
# ValueError: f(a) and f(b) must have different signs
stats.vonmises.fit(rvs, **fixed)
```
Take a look at how some other `fit` overrides find their starting brackets, for instance, or consider whether a gradient-based solver would work.

Typically when a `fit` override is prone to failure like this, I would fall back on the generic `fit` method. If we can't fix this before release, I think we'll want to do that. But this is much better than in `main`, so for now I'll go ahead and merge if tests pass.
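One simple version of the bracket-searching idea (a sketch under my own assumptions, not how any particular scipy override does it): grow the upper end of the bracket geometrically until the function changes sign, then hand the bracket to `brentq`.

```python
import numpy as np
from scipy import optimize

def solve_with_expanding_bracket(f, a=1e-4, b=1.0, grow=2.0, maxiter=60):
    """Expand the upper end of [a, b] geometrically until f changes
    sign, then solve with brentq.  (Only finds roots above `a`.)"""
    fa, fb = f(a), f(b)
    for _ in range(maxiter):
        if np.sign(fa) != np.sign(fb):
            return optimize.brentq(f, a, b)
        b *= grow  # push the upper bracket end out geometrically
        fb = f(b)
    raise RuntimeError("no sign change found; try a gradient-based solver")

# toy equation whose root is kappa = 3
root = solve_with_expanding_bracket(lambda k: np.log(k / 3))
```
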
Force-pushed (…rbitrary location; allow user to fix shape) from 309e758 to 5ee0c41.
I went ahead and force-pushed a rebase with each of our commits squashed, since there was a pretty clean separation. In case you need a copy of the original history, it's here. I'll merge when a few CI jobs finish so I know the rebase went OK.
For these data, the …

Defaulting to the generic …

Oh, that makes it easier.
Reference issue
Follow-up of #18013
What does this implement/fix?
In #18013, an additional test was requested to compare the overridden fit method of the vonmises distribution with the generic one. This adds that test.
Additional information
With this, the tests for vonmises take 5.5 seconds on my machine. I think marking the new tests as slow when the PR is accepted would be a good choice to save a little on CI.
I did not fully understand the `_assert_less_or_close_loglike` machinery; that's why I ended up implementing the log-likelihood function as the criterion. The methods used for the other distributions did not work, probably because `_fitstart` does not exist for vonmises.