
Conversation

@ArturoAmorQ (Member) commented Aug 18, 2022

Reference Issues/PRs

Fixes #13546.
Fixes #24288.
Takes #24192 into account.

What does this implement/fix? Explain your changes.

As discussed in #13546, the current state of the roc_plot.py example gives a different macro-averaged AUC than roc_auc_score because they use different averaging strategies. This PR aims to clarify the different averaging strategies by turning the example into a tutorial-style walkthrough (a "tutorialization").
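To illustrate the two averaging strategies under discussion, here is a minimal sketch (the dataset, model, and grid size are my own choices for illustration, not taken from the PR): strategy 1 averages the per-class AUCs, as roc_auc_score does with average="macro"; strategy 2 interpolates the per-class ROC curves on a common FPR grid, averages the TPRs, and takes the AUC of the mean curve.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import label_binarize

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_score = LogisticRegression(max_iter=1000).fit(X_train, y_train).predict_proba(X_test)
y_onehot = label_binarize(y_test, classes=[0, 1, 2])

# Strategy 1: average the per-class one-vs-rest AUCs
# (this is what roc_auc_score computes with average="macro").
macro_auc = roc_auc_score(y_test, y_score, multi_class="ovr", average="macro")

# Strategy 2: interpolate each per-class ROC curve onto a common
# FPR grid, average the interpolated TPRs, then take the AUC of
# the resulting mean curve.
fpr_grid = np.linspace(0.0, 1.0, 1000)
mean_tpr = np.zeros_like(fpr_grid)
for i in range(3):
    fpr, tpr, _ = roc_curve(y_onehot[:, i], y_score[:, i])
    mean_tpr += np.interp(fpr_grid, fpr, tpr)
mean_tpr /= 3
interp_auc = auc(fpr_grid, mean_tpr)

print(macro_auc, interp_auc)
```

The two strategies generally give similar but not identical numbers, which is the discrepancy reported in #13546.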

Any other comments?

Side effect: Implements notebook style as intended in #22406.

@cmarmo (Contributor) left a comment

Hi @ArturoAmorQ, thanks for your work.
I've made some comments, mainly about the format.

@ogrisel (Member) left a comment

Here is a partial review. More to come in a few days hopefully.

@ArturoAmorQ ArturoAmorQ changed the title from "DOC Rework roc_plot.py example" to "DOC Rework plot_roc.py example" Aug 28, 2022
@glemaitre glemaitre self-requested a review September 13, 2022 08:51
@glemaitre (Member) left a comment

I think that we can state that the example is much better :)
I really like the narrative, I find it intuitive.

I left a few nitpicks that could clarify the code.

# Interpolate all ROC curves at these points
mean_tpr = np.zeros_like(fpr_grid)

for i in range(n_classes):
Member

If you don't want to use sp.interpolate.interp1d, then we should add a comment that np.interp does linear interpolation. Another downside of np.interp is that the x-coordinates need to be sorted in increasing order.

@ArturoAmorQ (Member Author)

I am hesitant about which is best, as we actually use np.interp to compute the tpr in the private function _binary_roc_auc_score. Also, switching to interp1d would add boilerplate code below. Maybe the easiest solution is indeed to add a comment on linear interpolation.
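The np.interp behavior discussed in this thread can be sketched with toy per-class ROC points (the FPR/TPR values below are invented for illustration): np.interp performs linear interpolation and requires its x-coordinates to be sorted in increasing order, which roc_curve already guarantees.

```python
import numpy as np

# Toy per-class ROC vertices (hypothetical values for illustration).
# np.interp requires the x-coordinates (FPR) to be sorted in
# increasing order; roc_curve returns them already sorted.
fpr_per_class = [
    np.array([0.0, 0.2, 1.0]),
    np.array([0.0, 0.5, 1.0]),
]
tpr_per_class = [
    np.array([0.0, 0.8, 1.0]),
    np.array([0.0, 0.6, 1.0]),
]

# Interpolate all ROC curves at common FPR points and average them.
fpr_grid = np.linspace(0.0, 1.0, 5)
mean_tpr = np.zeros_like(fpr_grid)
for fpr, tpr in zip(fpr_per_class, tpr_per_class):
    # np.interp draws straight lines between the given vertices,
    # i.e. it does *linear* interpolation.
    mean_tpr += np.interp(fpr_grid, fpr, tpr)
mean_tpr /= len(fpr_per_class)
```

Using interp1d instead would require constructing one interpolator object per class, which is the extra boilerplate mentioned above.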

ArturoAmorQ and others added 2 commits September 15, 2022 11:42
Co-authored-by: Guillaume Lemaitre <[email protected]>
Co-authored-by: Guillaume Lemaitre <[email protected]>
@ogrisel (Member) left a comment

Thanks @ArturoAmorQ for refactoring and polishing this example. It really looks good. Just a few more suggestions below:

@haiatn (Contributor) commented Oct 4, 2022

What is missing so this could be merged?

@ArturoAmorQ (Member Author)

> What is missing so this could be merged?

I still have to tweak the plot legends to reduce their overlap with the curves; I just haven't had the time. But (hopefully) this will be ready to merge soon ;) Thanks for your interest, @haiatn!

@ArturoAmorQ ArturoAmorQ requested a review from glemaitre October 6, 2022 14:36
@glemaitre glemaitre merged commit a8f0858 into scikit-learn:main Oct 13, 2022
@glemaitre (Member)
Thanks @ArturoAmorQ LGTM.

@ArturoAmorQ ArturoAmorQ deleted the plot_roc branch October 14, 2022 14:56
glemaitre added a commit to glemaitre/scikit-learn that referenced this pull request Oct 31, 2022
Co-authored-by: Olivier Grisel <[email protected]>
Co-authored-by: Chiara Marmo <[email protected]>
Co-authored-by: Guillaume Lemaitre <[email protected]>
Successfully merging this pull request may close these issues.

- Add AUC plotting tools to "plot_roc" example
- roc_plot.py example gives different macro-averaged AUC than roc_auc_score
5 participants