Robust Classification, Regression and Clustering. #70

Merged on Nov 8, 2020 (14 commits)

Conversation

Contributor

@TimotheeMathieu TimotheeMathieu commented Oct 4, 2020

This updates the robust module.

  • Fix the PEP8 violations. Closes #69 (PEP8 violation).
  • Replace RobustWeightedEstimator with three classes: RobustWeightedClassifier, RobustWeightedRegressor, and RobustWeightedKMeans. Each class does one task.
  • Update the examples, docs, and tests accordingly.

EDIT: I also added the Huber loss, because it is more robust than the squared loss and makes more sense for robust regression.
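
To illustrate the point about robustness, here is a minimal sketch comparing the two losses on a single residual. It is not the code added in this PR: the function names and the `epsilon` threshold (with an arbitrary default of 1.35) are assumptions made for the example. The Huber loss is quadratic for small residuals and linear beyond the threshold, so one large outlier contributes far less to the total loss.

```python
def squared_loss(residual):
    """Half the squared residual; grows quadratically with the error."""
    return 0.5 * residual ** 2


def huber_loss(residual, epsilon=1.35):
    """Huber loss: quadratic within +/- epsilon, linear outside.

    `epsilon` is a hypothetical parameter name chosen for this sketch.
    """
    abs_r = abs(residual)
    if abs_r <= epsilon:
        return 0.5 * residual ** 2
    # Linear branch, matched to the quadratic branch at abs_r == epsilon.
    return epsilon * (abs_r - 0.5 * epsilon)


# Small residuals give identical values; the outlier (10.0) is damped by Huber.
for r in (0.5, 1.0, 10.0):
    print(r, squared_loss(r), huber_loss(r))
```

For the residual of 10.0, the squared loss is 50.0 while the Huber loss stays close to 12.6, which is why a few outliers pull a Huber fit much less than a least-squares fit.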

@rth
Contributor

rth commented Nov 7, 2020

Thanks @TimotheeMathieu, and sorry for the slow review. I'll do a pass now. If you don't mind, I will merge upstream/master in, which should resolve the CI issues.

Contributor

@rth rth left a comment


Thanks @TimotheeMathieu! This is a nice improvement, and splitting RobustWeightedEstimator into separate regressor, classifier, and clusterer classes does indeed make them easier to understand. We can easily do that since RobustWeightedEstimator was not yet part of a release.

I have a few comments below. Also, it would be good to make sure the following parts are run in unit tests (either by adding new tests or by extending existing tests to cover them, e.g. using pytest.mark.parametrize):

  • weighting='mom' and k=None
  • check that a classifier with a loss other than log raises an error when predict_proba is called
    def test_robust_no_proba():
        ...
        est = RobustWeightedClassifier(loss='hinge').fit(X, y)
        msg = "Probability estimates are not available"
        with pytest.raises(AttributeError, match=msg):
            est.predict_proba(X)
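
For context on the first item, 'mom' stands for median-of-means. The sketch below shows that estimator on its own using only the standard library. It is not the library's weighting code: the function name and the `n_blocks` and `seed` parameters are assumptions made for the example, and `n_blocks` is not claimed to correspond to the estimator's `k` parameter.

```python
import random
import statistics


def median_of_means(values, n_blocks, seed=0):
    """Split the data into blocks, average each block, take the median.

    `n_blocks` and `seed` are hypothetical parameters for this sketch.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    shuffled = list(values)
    random.Random(seed).shuffle(shuffled)
    # Strided slicing yields n_blocks disjoint blocks of near-equal size.
    blocks = [shuffled[i::n_blocks] for i in range(n_blocks)]
    return statistics.median(statistics.mean(block) for block in blocks)


# Twenty inliers and one gross outlier: the outlier can corrupt only one block.
data = [1.0] * 20 + [1000.0]
print(statistics.mean(data))               # pulled far from 1.0 by the outlier
print(median_of_means(data, n_blocks=5))   # unaffected by the single outlier
```

Because the single outlier can fall into only one of the five blocks, at least four block means equal 1.0, so the median of the block means is 1.0 regardless of the shuffle.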

Finally, a few minor typo fixes: typos_diff.txt. Otherwise the changes look good, I think.

@TimotheeMathieu
Contributor Author

I followed your recommendations, @rth. Thank you for the review; thanks to you, the test coverage is now better.

Contributor

@rth rth left a comment


Thanks!

@rth rth merged commit 1f626fa into scikit-learn-contrib:master Nov 8, 2020
Successfully merging this pull request may close these issues.

PEP8 violation
2 participants