
BUG/ENH: Removed non-standard scaling of the covariance matrix and added option to disable scaling completely. #11197

Merged: 1 commit merged into numpy:master on Nov 27, 2018

Conversation

@wummo (Contributor) commented May 30, 2018

Fixes #11196

As discussed in the bug report, polyfit uses a non-standard scaling factor for the covariance matrix; this PR corrects it.

Furthermore, an option is added to disable the scaling of the covariance matrix completely. This is useful when the weights are given by 1/sigma, with sigma the (known) standard errors of (Gaussian-distributed) data points; in that case the unscaled matrix is already a correct estimate of the covariance matrix.
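
To see the change concretely, here is a minimal sketch (hypothetical data, not the PR's actual diff) of how the two scaling factors relate. The base covariance comes from the inverse normal matrix of the least-squares fit; only the multiplicative factor changes:

```python
import numpy as np

# Hypothetical data for a straight-line fit.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=x.size)

deg = 1
A = np.vander(x, deg + 1)                # design matrix, columns [x, 1]
coef, resids, _, _ = np.linalg.lstsq(A, y, rcond=None)
Vbase = np.linalg.inv(A.T @ A)           # unscaled covariance shape

M, N = len(x), deg + 1                   # data points, fitted parameters
fac_old = resids[0] / (M - N - 2.0)      # old, non-standard factor (the "- 2")
fac_new = resids[0] / (M - N)            # standard chi2 / dof factor
print(Vbase * fac_old)                   # roughly what polyfit used to return
print(Vbase * fac_new)                   # what it returns after this fix
```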

@mattip (Member) commented May 30, 2018

Thanks for the pull request, and welcome!

Needs a mention in doc/release/1.15.0-notes.rst under the Changes section, as well as a mention in the Improvements section for the additional option.

Also tests are missing for the new option. It would be nice to have a "demonstration of desired behavior" type of test that simply demonstrates the power of the new option, as well as a test for any new error modes. For instance, what happens if w is None but absolute_weights is True?

Since we are changing default behavior, a heads-up to the numpy-discussion mailing list with a link to this commit is also necessary.

@wummo (Contributor, Author) commented Jun 1, 2018

Hi Matti,

thanks for the welcome.

I changed the release notes and added some notes in the sections "changes" and "improvements".
Also, I added a check for the case where absolute_weights is set to True but w is not given; a ValueError is then raised. Another possible clash of options is absolute_weights=True with cov=False, but I believe that can be neglected. Finally, I introduced two tests, one to check that the covariance matrix is calculated correctly and another to see that the ValueError is raised. Lastly, I wrote a short note to numpy-discussion about the change in default behavior that comes with this patch.

Best,
Andreas

@mhvk (Contributor) left a comment

Overall, definitely a good idea, but I think the name should reflect what is actually done more closely.

Also, definitely include the test!

@@ -423,6 +424,19 @@ def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False):
cov : bool, optional
Return the estimate and the covariance matrix of the estimate
If full is True, then cov is not returned.
absolute_weights: bool, optional
Contributor:

I realize it amounts to bike-shedding, but I find this name confusing, since I've never encountered this term. If we stick close to this, I'd very much prefer a relative_weights=True, since I think that more clearly indicates that there is something weird about the weights.

But really what this does is force the reduced chi2 to unity, so maybe that is what the parameter name should reflect? Indeed, in the actual code, the weights are not used at all. Now, force_redchi2_to_unity is a bit long... Maybe rescale_covariance=True? Or just cov_scale or scale_cov?

Reply:

Just to explain the current choice of absolute_weights: it was suggested by @josef-pkt on the mailing list. It is the analogue of scipy.optimize.curve_fit's absolute_sigma parameter, whose name was settled in a lengthy discussion spanning two earlier PRs. At least I somewhat like the analogy to the curve_fit terminology, but I don't know whether that really is a valid argument here.

Contributor:

OK, the comments in the first thread do argue specifically that scale_cov is bad ... I do think a bit of a mistake was made in scipy in not calling it relative_sigma, but on the other hand there is an advantage to newly introduced flags defaulting to False for the "old behaviour".

Let me try another suggestion, though: unlike scipy's curve_fit, right now we already have a flag to ask for the covariance matrix. Could we not broaden its purpose instead to also tell what type we want? If falsy, we do not return it as now, and if truthy, we do return it, but exactly what we return will depend on its value. Specifically, I suggest,

cov : bool or str, optional
         If given and not `False`, return not just the estimate but also its covariance matrix.
         By default, the covariance is scaled by chi2/sqrt(N-dof), i.e., the weights are presumed
         to be unreliable except in a relative sense and everything is scaled such that the reduced
         chi2 is unity. This scaling is omitted if ``cov='unscaled'``, as is relevant for the case that
         the weights are 1/sigma**2, with sigma known to be a reliable estimate of the uncertainty.
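
A short usage sketch of this suggested interface, which is what was eventually merged (the string value requires a NumPy release containing this PR, i.e. 1.16 or later); the data, sigma, and weights here are made up for illustration:

```python
import numpy as np

sigma = 0.5                                # assumed-known uncertainty
rng = np.random.default_rng(42)
x = np.linspace(0.0, 10.0, 50)
y = 3.0 * x - 1.0 + rng.normal(scale=sigma, size=x.size)
w = np.full(x.size, 1.0 / sigma)           # weights w = 1/sigma

# Default: covariance rescaled so that the reduced chi2 becomes unity.
coef, cov_scaled = np.polyfit(x, y, 1, w=w, cov=True)

# 'unscaled': trust the weights as absolute uncertainties.
coef, cov_unscaled = np.polyfit(x, y, 1, w=w, cov='unscaled')
```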

Contributor:

p.s. In the docstring proper, be careful with single back quotes - with those, there should be an actual link target; something like `False` works because it links to the Python API.

Contributor:

What about this suggestion of using cov to indicate whether or not scaling should be done?

Contributor Author:

To be honest, I like the simple relative_weights better. Like @jotasi already mentioned, it behaves like absolute_sigma in curve_fit. Still, if you insist, I can implement your proposal. In that case it would be nice if you could point me to another function with a similar parameter, so I can have a look at what kind of parameter check is performed.

@@ -552,6 +566,8 @@ def polyfit(x, y, deg, rcond=None, full=False, w=None, cov=False):
raise TypeError("expected 1D or 2D array for y")
if x.shape[0] != y.shape[0]:
raise TypeError("expected x and y to have same length")
if absolute_weights and (w is None):
Contributor:

I don't see why this is necessary: the rescaling could be done or omitted (as is arguably meaningful) independent of whether weights are present. I'd remove this.

Reply:

Wouldn't omitting the rescaling without specifying weights effectively mean that all points' standard deviations are considered 1, or did I misunderstand that?

Contributor:

Yes, for normal distributions that would be the case. But mostly I see no reason to force a user to pass on w=np.ones(y.shape[0]) when the flag is set. The default is not really no weights, but weight equals 1.

Contributor Author:

In the case of relative weights (scaling), having all weights set to one means that the errors on all data points are of equal magnitude. In the new case (absolute weights, no scaling), I'm not sure that is a sensible default. Why would a data point have the error sigma==1? I guess that would just be by coincidence, or in special cases where you draw from a distribution with known width (like your unit-test example)?

Contributor:

@wummo - it indeed implies sigma=1 - which I agree is not necessarily all that meaningful, but I don't see a reason to specifically forbid someone from entering it - it just makes the code longer and more complex for no benefit.

Contributor Author:

OK, here is my last argument: Isn't sigma==1 a detail of the implementation that might (maybe) change? Then, giving no weights is something like "undefined behavior".

Contributor:

But it isn't ;-) After all, this is a possibly weighted least-squares fit, not a chi2 one, and the meaning is clear without the weights (the meaning of the covariance admittedly less so, but I don't think one has to hand-hold people that much).

Contributor Author:

In my opinion, the test is useful, but having very little experience with numpy development guidelines, I followed your hints and removed the check and the corresponding unit test.
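
As a sanity check of the point argued above (that omitting w simply means unit weights, i.e. sigma = 1), here is a small sketch against the merged interface; the data are made up:

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 1.9, 4.2, 5.8, 8.1])

# No weights given: polyfit behaves exactly as with unit weights,
# and no ValueError is raised for the unscaled covariance.
c1, v1 = np.polyfit(x, y, 1, cov='unscaled')
c2, v2 = np.polyfit(x, y, 1, w=np.ones(x.size), cov='unscaled')
assert np.allclose(c1, c2) and np.allclose(v1, v2)
```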

else:
if len(x) <= order:
raise ValueError("the number of data points must exceed order "
"for estimate the covariance matrix")
Contributor:

This error message is not correct any more; it only needs to hold if rescaling is done. Maybe just replace "for estimate" (weird grammar anyway) with "to scale".

Contributor:

Still to be done: "for estimate" -> "to scale" (or "in order to be able to scale").

[0, 1, 3], [0, 1, 3], deg=0, cov=True)
[1], [1], deg=0, cov=True)

# Check exception when option absolute_weights is True, but no weights
Contributor:

This would need to be removed again...

"for Bayesian estimate the covariance matrix")
fac = resids / (len(x) - order - 2.0)
if absolute_weights:
fac = 1.
Member:

Best to just use 1 here - it cooperates better with Decimal, if that ever becomes supported

Contributor Author:

OK, I changed this.

"for estimate the covariance matrix")
# note, this used to be: fac = resids / (len(x) - order - 2.0)
# it was deciced that the "- 2" (originally justified by "Bayesian
# uncertainty analysis") is not was the user expects
Contributor:

Add "(see gh- and gh-11197)"

Contributor:

For some reason my comment was weird here: should have been "gh-11196 and gh-11197"

@mhvk (Contributor) commented Jun 6, 2018

Looks good, except for the two remaining issues:

  1. Do we want to forbid getting an unscaled covariance without weights? I feel this is unnecessary hand-holding.
  2. Do we want to fold absolute_weights into the cov boolean by allowing a string value? Personally, I find routines whose outputs depend on flags super-annoying to start with, and adding even more flags seems silly, so I'd prefer a single flag that tells what type of covariance to return (where False is "don't bother, I don't need them").

@mhvk (Contributor) commented Jun 6, 2018

@wummo - this function is a bit of a mess already... And IIRC it is in fact recommended to use np.polynomial.polynomial.polyfit, since it presents coefficients in a more logical order, but it does not even have cov. Adding another argument makes it deviate even more, but I'm certainly not stuck on sticking to one argument. Let me ping @charris, since he has much more experience with the polynomial stuff.

@josef-pkt:

One argument in favor of rolling it into a three-valued cov is that users only have to change one flag to get a fixed scale.

(I find it annoying in some cases in statsmodels, where we have one flag to switch away from the default and another flag to choose an option for the alternative. It's easy to forget to switch the first keyword, and I often have to correct my initial code to fix it.)

@wummo (Contributor, Author) commented Jun 8, 2018

@mhvk did you have a chance to contact @charris?

@wummo force-pushed the correct_covariance_scaling branch from 7e86c15 to 977a38a on June 18, 2018
@wummo force-pushed the correct_covariance_scaling branch from 977a38a to 3250479 on August 10, 2018
@wummo (Contributor, Author) commented Aug 10, 2018

I rebased the code to 1.16 to fix the merge conflicts. Is there anything I can do to move this pull request forward?

@wummo force-pushed the correct_covariance_scaling branch from 3250479 to 39efaf4 on August 23, 2018
@wummo (Contributor, Author) commented Sep 3, 2018

In light of PR #11733, is there still interest in fixing this issue here, or will this function eventually be dropped anyway, @mhvk @charris?

@jsdodge (Contributor) commented Oct 3, 2018

I just became aware of this issue in numpy.polyfit and would strongly recommend including this fix, at least until numpy.polynomial.polynomial.polyfit provides an option to return the covariance matrix. Otherwise, users will be stuck with a default that provides no easy way to determine parameter uncertainties.

Also, since @josef-pkt expressed confusion in the mailing-list discussion about why a user would want this feature, it's worth noting that it is common practice in physics (my field) to determine the measurement sample variance independently and then treat it as known when fitting a model to data from the same apparatus. Introductory textbooks typically focus on this case.

@mattip (Member) commented Nov 15, 2018

Maybe this would be more palatable as two PRs - one to remove the non-standard -2 and another to add the kwarg. Or is it ready to go in as-is? The -2 has been a long-standing issue (5 years).

@mhvk (Contributor) commented Nov 15, 2018

Sorry that this has slipped so far. I'd still like the opinion of @charris, because it would be good to move this to the polynomial classes. Absent that, I'm happy to merge the -2 removal, but would prefer the option not to add a new argument, but rather to use a string for the type of covariance one wants.

@wummo (Contributor, Author) commented Nov 16, 2018

@mhvk Do you think it makes sense to wait any longer, given that we have already waited half a year? If you decide that a string for the type of covariance is the better interface, I will implement it and the PR can be merged.

@mhvk (Contributor) commented Nov 16, 2018

@wummo - fair enough. Yes, please do the string interface and we'll merge this.

@mhvk (Contributor) left a comment

@wummo - thanks for making the changes. Only some small left-overs...

@@ -239,6 +239,15 @@ single elementary function for four related but different signatures,
The ``out`` argument to these functions is now always tested for memory overlap
to avoid corrupted results when memory overlap occurs.

New option ``absolute_weights`` in ``np.polyfit``
-------------------------------------------------
Like ``absolute_sigma`` in ``scipy.optimize.curve_fit`` a boolean option
Contributor:

Need to change the release notes as well...

weights are given by 1/sigma with sigma being the (known) standard errors of
(Gaussian distributed) data points, in which case the unscaled matrix is already
a correct estimate for the covariance matrix. In case ``absolute_weights`` is set
to true, but no weights are given, a ``ValueError`` is thrown.
Detailed docstrings for scalar numeric types
Contributor:

Note that the rebase has removed the empty line that should be here.

covariance matrix. Namely, rather than using the standard chisq/(M-N), it
scales it with chisq/(M-N-2) where M is the number of data points and N is the
number of parameters. This scaling is inconsistent with other fitting programs
such as e.g. ``scipy.optimize.curve_fit`` and was changed to chisq/(M-N).
Contributor:

And another empty line to be added back in.

except in a relative sense and everything is scaled such that the
reduced chi2 is unity. This scaling is omitted if ``cov='unscaled'``,
as is relevant for the case that the weights are 1/sigma**2, with
sigma known to be a reliable estimate of the uncertainty.
Contributor:

Very clear, thanks!
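
To make the docstring text above concrete, here is a small numerical check (made-up data) that the default covariance is just the unscaled one multiplied by the reduced chi2 - which is exactly what "scaled such that the reduced chi2 is unity" means:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 30)
w = np.full(x.size, 2.0)                     # weights 1/sigma with sigma = 0.5
y = 0.5 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

coef, cov_default = np.polyfit(x, y, 1, w=w, cov=True)
_, cov_unscaled = np.polyfit(x, y, 1, w=w, cov='unscaled')

resid = w * (y - np.polyval(coef, x))        # weighted residuals
chi2_red = np.sum(resid**2) / (len(x) - 2)   # dof = M - N, here N = deg + 1 = 2
assert np.allclose(cov_default, cov_unscaled * chi2_red)
```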

else:
if len(x) <= order:
raise ValueError("the number of data points must exceed order "
"for estimate the covariance matrix")
Contributor:

Still to be done: "for estimate" -> "to scale" (or "in order to be able to scale").

"for estimate the covariance matrix")
# note, this used to be: fac = resids / (len(x) - order - 2.0)
# it was deciced that the "- 2" (originally justified by "Bayesian
# uncertainty analysis") is not was the user expects
Contributor:

For some reason my comment was weird here: should have been "gh-11196 and gh-11197"

@mhvk (Contributor) commented Nov 19, 2018

p.s. While making the last changes, could you also rebase & squash the commits? Thanks again, and apologies that this has all taken so long.

@charris (Member) commented Nov 19, 2018

I've definitely considered adding a covariance computation to the polynomial package fitting functions; I've done it for myself in practice. I agree that the -2 is a bit too clever; in a case like this it is best to follow the conventions folks are used to. Note that for large data sets it won't make much difference, and for small data sets the estimated covariance will have a large error regardless, unless the variance of the measurement errors is known by other means.

For std and var, the default of the "ddof" parameter is 0, which is also non-standard but unlikely to change. There are reasons for that convention, as @rkern has explained, but it is surprising to many.
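
For reference, the std/var convention charris mentions:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0, 4.0])
print(np.var(a))           # 1.25     -> divides by N (default ddof=0)
print(np.var(a, ddof=1))   # 1.666... -> divides by N - 1 (sample variance)
```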

@wummo (Contributor, Author) commented Nov 20, 2018

@mhvk I fixed the problems with the documentation. If everything looks OK, I will do the rebase & squash.

-----------------------------------------------------------

A further possible value has been added to the ``cov`` parameter of the
``np.polyfit`` function. With ``cov=unscaled`` the scaling of the covariance
Contributor:

One last small thing: missing quotes around unscaled, i.e., cov='unscaled'

@mhvk (Contributor) commented Nov 20, 2018

Looks good modulo the missing quotes. Please go ahead and rebase/squash as well, and I'll merge.

@wummo force-pushed the correct_covariance_scaling branch from 632802a to 1837df7 on November 21, 2018
@wummo (Contributor, Author) commented Nov 27, 2018

@mhvk I did the rebase and squashing and just wanted to ask whether there is anything more that needs to be done.

@mhvk (Contributor) commented Nov 27, 2018

@wummo - I hadn't seen that the branch was pushed - now all is OK so I'll merge. Thanks for the contribution and more thanks for your patience!

@mhvk merged commit 1e2cb50 into numpy:master on Nov 27, 2018
@wummo (Contributor, Author) commented Nov 27, 2018

It's great that it got merged. Thanks a lot for your help.
