[MRG+1] QuantileTransformer #8363

glemaitre · 2017-02-15T17:14:46Z

Reference Issue

Cont'd of #2176

What does this implement/fix? Explain your changes.

Implementation of quantile normalizer

Any other comments?

glemaitre · 2017-02-15T17:15:48Z

@tguillemot @dengemann @raghavrv @ogrisel here we go

tguillemot · 2017-02-16T09:03:32Z

sklearn/preprocessing/tests/test_data.py

+    X_trans = normalizer.fit_transform(X)
+    # FIXME: one of those will drive to precision error
+    # in the interpolation
+    # assert_array_almost_equal(np.min(X_trans, axis=0), 0.)


I'm working on it.

I checked yesterday for a while and there is nothing wrong with our code.
f(min(X)) of the interpolated function does not return 0.
The issue should come from numpy.interp.

This is working on the toy example :D
I will try to sort out the CI error, which I think comes from a different numpy version.

It is indeed a precision problem with numpy.interp.

tguillemot · 2017-02-16T09:39:34Z

sklearn/preprocessing/data.py

+        self : object
+            Returns self
+        """
+        X = self._validate_X(X)


Just a nitpick: is it necessary to create a specific function?
When there are only a few lines I prefer not to create a function ;).

ogrisel · 2017-02-16T09:42:53Z

sklearn/preprocessing/tests/test_data.py

+    normalizer = QuantileNormalizer()
+    normalizer.fit(X)
+    X_trans = normalizer.fit_transform(X)
+    assert_array_almost_equal(np.min(X_trans, axis=0), 0.)


You can use assert_almost_equal when you compare scalar values.
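
As a small illustration of the distinction, assuming only NumPy's testing helpers: `assert_almost_equal` suits a single scalar, while `assert_array_almost_equal` compares arrays element-wise. The array `X_trans` below is made-up data, not output from the PR.

```python
import numpy as np
from numpy.testing import assert_almost_equal, assert_array_almost_equal

# Made-up transformed data: each column already spans [0, 1].
X_trans = np.array([[0.0, 0.5],
                    [0.5, 0.0],
                    [1.0, 1.0]])

# Scalar comparison: global minimum of the whole array.
assert_almost_equal(X_trans.min(), 0.0)

# Array comparison: per-feature minima and maxima.
assert_array_almost_equal(np.min(X_trans, axis=0), [0.0, 0.0])
assert_array_almost_equal(np.max(X_trans, axis=0), [1.0, 1.0])
```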

ogrisel · 2017-02-16T09:44:41Z

sklearn/preprocessing/tests/test_data.py

+    X_trans = normalizer.fit_transform(X)
+    assert_array_almost_equal(np.min(X_trans, axis=0), 0.)
+    assert_array_almost_equal(np.max(X_trans, axis=0), 1.)
+


Can you please add a check that extreme values are mapped to 0 or 1, e.g.

    X_test = np.array([
        [-1, 1, 0],
        [101, 11, 10],
    ])
    expected = np.array([
        [0, 0, 0],
        [1, 1, 1],
    ])
    assert_array_almost_equal(normalizer.transform(X_test), expected)
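
The requested check can be tried without the estimator. The following is a minimal NumPy-only sketch, not the PR's implementation: it assumes the per-feature mapping is a linear interpolation between fitted quantiles and reference levels, and the helper name `transform` is hypothetical.

```python
import numpy as np

# Toy training data from the test under review (features are columns).
X = np.array([[0, 25, 50, 75, 100],
              [2, 4, 6, 8, 10],
              [2.6, 4.1, 2.3, 9.5, 0.1]]).T

references = np.linspace(0, 1, 5)                       # target quantile levels
quantiles = np.percentile(X, references * 100, axis=0)  # shape (5, n_features)


def transform(X_new):
    """Map each column through its empirical CDF.

    np.interp returns the first/last reference for inputs outside the
    fitted range, so out-of-range values end up at 0 or 1.
    """
    X_new = np.asarray(X_new, dtype=float)
    return np.column_stack([
        np.interp(X_new[:, j], quantiles[:, j], references)
        for j in range(X_new.shape[1])
    ])


X_test = np.array([[-1, 1, 0],
                   [101, 11, 10]])
expected = np.array([[0, 0, 0],
                     [1, 1, 1]])
np.testing.assert_array_almost_equal(transform(X_test), expected)
```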

ogrisel · 2017-02-16T09:46:39Z

sklearn/preprocessing/data.py

+
+        for feat_idx, f in enumerate(func_transform):
+            Xt.data[Xt.indptr[feat_idx]:Xt.indptr[feat_idx + 1]] = f(
+                Xt.data[Xt.indptr[feat_idx]:Xt.indptr[feat_idx + 1]])


Actually you could factorize the slicing to make the code more readable:

    column_slice = slice(Xt.indptr[feat_idx], Xt.indptr[feat_idx + 1])
    Xt.data[column_slice] = f(Xt.data[column_slice])
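
To make the slicing concrete, here is a small self-contained sketch on a toy CSC matrix. The function `double` is only a stand-in for the per-feature transform; it is not part of the PR.

```python
import numpy as np
from scipy import sparse

X = sparse.csc_matrix(np.array([[0., 2., 0.],
                                [1., 0., 0.],
                                [3., 4., 5.]]))
Xt = X.copy()


def double(values):
    # Stand-in for a per-feature transform function.
    return 2.0 * values


for feat_idx in range(Xt.shape[1]):
    # In CSC format, indptr delimits the stored entries of each column.
    column_slice = slice(Xt.indptr[feat_idx], Xt.indptr[feat_idx + 1])
    Xt.data[column_slice] = double(Xt.data[column_slice])
```

Only the stored entries are touched, so implicit zeros stay zero. This also shows why the approach relies on a column-oriented (CSC) layout.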

ogrisel · 2017-02-16T09:47:28Z

sklearn/preprocessing/data.py

+        ----------
+        X : sparse matrix, shape (n_samples, n_features)
+            The data used to scale along the features axis. The sparse matrix
+            needs to be semi-positive.


You should make it explicit that it only works for CSC sparse matrices (I know this is not public API but it makes it easier to understand how the code works).

ogrisel · 2017-02-16T09:52:09Z

sklearn/preprocessing/data.py

+        # we only accept positive sparse matrix
+        if sparse.issparse(X) and X.min() < 0:
+            raise ValueError('QuantileNormalizer only accepts semi-positive'
+                             ' sparse matrices')


Not "semi-positive sparse matrix" but "sparse matrices with all non-negative entries".
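
A sketch of the check with the suggested wording, assuming a standalone helper (the name `check_non_negative_sparse` is hypothetical; in the PR the check lives inside the estimator's validation code):

```python
import numpy as np
from scipy import sparse


def check_non_negative_sparse(X):
    """Hypothetical helper: reject sparse input containing negative entries."""
    if sparse.issparse(X) and X.min() < 0:
        raise ValueError("QuantileNormalizer only accepts sparse matrices "
                         "with all non-negative entries.")
    return X


X_ok = sparse.csc_matrix(np.array([[0., 2.], [1., 0.]]))
X_bad = sparse.csc_matrix(np.array([[0., -2.], [1., 0.]]))

check_non_negative_sparse(X_ok)  # passes silently
try:
    check_non_negative_sparse(X_bad)
    raised = False
except ValueError:
    raised = True
```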

ogrisel · 2017-02-16T09:53:10Z

sklearn/preprocessing/tests/test_data.py

+def test_quantile_normalizer_error_neg_sparse():
+    X = np.array([[0, 25, 50, 75, 100],
+                  [-2, 4, 6, 8, 10],
+                  [2.6, 4.1, 2.3, 9.5, 0.1]]).T


You should insert more zero values in this matrix to make it sparser.

ogrisel · 2017-02-16T09:54:27Z

sklearn/preprocessing/tests/test_data.py

+
+    X = np.array([[0, 25, 50, 75, 100],
+                  [2, 4, 6, 8, 10],
+                  [2.6, 4.1, 2.3, 9.5, 0.1]]).T


You should insert more zero values in this matrix to make it sparser.

ogrisel · 2017-02-16T09:55:40Z

sklearn/preprocessing/tests/test_data.py

+    qn_ser = pickle.dumps(qn, pickle.HIGHEST_PROTOCOL)
+    qn2 = pickle.loads(qn_ser)
+    assert_array_almost_equal(qn.transform(iris.data),
+                              qn2.transform(iris.data))


You should also check that it can pickle correctly before fitting (even though it should trivially work).

dengemann · 2017-02-16T10:14:07Z

sklearn/preprocessing/data.py

+
+    The normalization is applied on each feature independently.
+    The cumulative density function of a feature is used to project the
+    original values.


Add something like:

Features of new/unseen data that fall below or above the fitted range will be mapped to 0 and 1, respectively.
Note that this transform is non-linear. It may remove correlations between variables measured at the same scale but renders variables measured at different scales more directly comparable.
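
The effect on correlations can be seen on a toy pair of features. This is an illustrative sketch, not the PR's code: `to_uniform` is a hypothetical helper that maps a feature through its own empirical CDF using five quantiles.

```python
import numpy as np

x = np.array([1., 2., 3., 4., 100.])
y = np.array([2., 1., 4., 3., 100.])


def to_uniform(v):
    """Map a feature through its own empirical CDF (5 quantiles, toy setting)."""
    references = np.linspace(0, 1, 5)
    quantiles = np.percentile(v, references * 100)
    return np.interp(v, quantiles, references)


# On the raw data the shared outlier dominates the Pearson correlation.
corr_raw = np.corrcoef(x, y)[0, 1]
# After the mapping, only the ordering of the values matters.
corr_mapped = np.corrcoef(to_uniform(x), to_uniform(y))[0, 1]
```

With these toy values the Pearson correlation goes from about 0.9997 on the raw data to 0.8 after the mapping.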

dengemann · 2017-02-16T10:16:05Z

sklearn/preprocessing/data.py

+    See also
+    --------
+    :class:`sklearn.preprocessing.StandardScaler` to perform standardization
+    that is faster, but less robust to outliers.


Add maybe

:class:`sklearn.preprocessing.RobustScaler` to perform robust standardization that removes the influence of outliers but does not put outliers and inliers on the same scale.

dengemann · 2017-02-16T10:16:59Z

sklearn/preprocessing/data.py

+                     bounds_error=False,
+                     fill_value=(min(quantiles_feat),
+                                 max(quantiles_feat)))
+            for quantiles_feat in self.quantiles_.T]


Is there any reason for these guys being lists, hence mutable?

Good point.

glemaitre · 2017-02-16T10:20:08Z

sklearn/preprocessing/data.py

+        self.references_ = np.linspace(0, 1, self.n_quantiles,
+                                       endpoint=True)
+        # FIXME: it does not take into account the zero in the computation
+        self.quantiles_ = np.array([np.percentile(


@ogrisel @tguillemot Here I am not really sure what the right way should be.
Assuming that the sparse matrix has a lot of zeros for a given feature, it will have a bad influence on the normalisation, won't it?
In fact, it could also be the case for dense data. That was the reason for including a quantile_range.

Can we modify the reference values to take the number of zeros into account?
Not sure it's what we want.

We can do that in np.percentile. We know the size of X_col and we can know the number of non-zeros. Therefore, we can add the zeros to the data to compute the percentiles.

Yes, we need to find a way to shift the percentile distribution efficiently. It is probably better to do the quantile computation ourselves: sort the subsampled column non-zero data, then consider the fraction of zeros that should be added at the beginning of that array (without actually materializing it), also taking the subsampling rate into account, and do the quantile lookups manually.

But then we also need to handle the linear interpolation...

Actually, no need to do that, let's do:

    column_nnz_data = X.data[X.indptr[feat]:X.indptr[feat + 1]]
    column_subsample = subsample * len(column_nnz_data) // X.shape[0]
    column_data = np.zeros(shape=subsample, dtype=X.dtype)
    column_data[:column_subsample] = rng.choice(column_nnz_data, column_subsample, replace=False)

and then proceed to extract the quantiles from column_data as usual.

Because subsample is going to be smallish and independent of X.shape[0] this is good enough and easier to maintain.
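
The suggestion above can be run end to end on a toy matrix. This is a sketch under assumptions: the matrix, the value of `subsample`, and the five reference levels are made up for illustration, and `rng` is taken to be a `numpy.random.RandomState`.

```python
import numpy as np
from scipy import sparse

rng = np.random.RandomState(0)

# Toy CSC matrix: 100 samples, one feature with 20 non-zero entries.
dense = np.zeros((100, 1))
dense[:20, 0] = np.arange(1, 21)
X = sparse.csc_matrix(dense)

subsample = 50  # assumed subsample size; must not exceed the number of samples
feat = 0

column_nnz_data = X.data[X.indptr[feat]:X.indptr[feat + 1]]
# Number of non-zeros to draw so that the fraction of zeros is preserved.
column_subsample = subsample * len(column_nnz_data) // X.shape[0]
column_data = np.zeros(shape=subsample, dtype=X.dtype)
column_data[:column_subsample] = rng.choice(column_nnz_data, column_subsample,
                                            replace=False)

references = np.linspace(0, 1, 5)
quantiles = np.percentile(column_data, references * 100)
```

Here 80% of the column is zero, so 10 of the 50 subsampled values are non-zero and the quantiles up to the 75th percentile are all 0.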

ogrisel · 2017-02-16T11:08:13Z

sklearn/preprocessing/data.py

+        # FIXME: it does not take into account the zero in the computation
+        self.quantiles_ = np.array([np.percentile(
+            X.data[X.indptr[feat]:X.indptr[feat + 1]], self.references_ * 100)
+                                    for feat in range(n_feat)]).T


Cosmetics: please use n_features and feature_idx.

glemaitre · 2017-02-16T13:58:01Z

sklearn/preprocessing/tests/test_data.py

+    # assert_array_almost_equal(np.min(X_trans, axis=0), 0.)
+    # assert_array_almost_equal(np.max(X_trans, axis=0), 1.)
+    X_trans_inv = normalizer.inverse_transform(X_trans)
+    assert_array_almost_equal(X, X_trans_inv)


Not directly related to this line but to the transform / inverse_transform round trip: the result will not be equal if X has out-of-bounds values, which will be clipped during transform and mapped to the minimum or maximum of the references_ during inverse transform.

There is no problem in that case (and it's a way to be sure the normalizer works correctly).
But what you say is indeed true in the general case.
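
The round-trip behaviour can be reproduced with plain interpolation. A minimal sketch, assuming the forward and inverse mappings are linear interpolations between fitted quantiles and reference levels (the helper names `forward` and `inverse` are hypothetical):

```python
import numpy as np

x_fit = np.array([0., 25., 50., 75., 100.])
references = np.linspace(0, 1, 5)
quantiles = np.percentile(x_fit, references * 100)


def forward(x):
    # Data space to [0, 1]; out-of-range inputs are clamped to 0 or 1.
    return np.interp(x, quantiles, references)


def inverse(u):
    # [0, 1] back to data space.
    return np.interp(u, references, quantiles)


x_in = np.array([10., 60., 100.])  # inside the fitted range
x_out = np.array([-5., 120.])      # outside the fitted range

round_trip_in = inverse(forward(x_in))    # recovers x_in
round_trip_out = inverse(forward(x_out))  # clipped to the fitted min and max
```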

glemaitre · 2017-02-16T16:20:09Z

sklearn/preprocessing/data.py

        if direction:
+            print(1)


@tguillemot That looks like debugging flags

glemaitre · 2017-02-16T16:20:25Z

sklearn/preprocessing/data.py

            func_transform = self.f_transform_
        else:
+            print(2)


@tguillemot That looks like debugging flags

Oops, indeed. Sorry.

glemaitre · 2017-02-16T18:19:46Z

@ogrisel I was checking the User guide for the preprocessing to see what to add.

I am having second thoughts about the naming of the class. From the description in the user guide, QuantileScaler would be more appropriate.

What is the reason to stick to normalizer?

ogrisel · 2017-02-16T18:24:11Z

The problem is that (feature-wise) scaling stands for dividing each feature by a scalar value. This is the case for StandardScaler and RobustScaler but not in our case. I prefer QuantileNormalizer or QuantileTransformer.

ogrisel · 2017-02-16T18:26:01Z

https://research.google.com/pubs/pub45530.html uses "quantile normalization" in the body of the article to describe what we do in this class. +1 for QuantileNormalizer.

ogrisel · 2017-02-16T18:28:24Z

sklearn/preprocessing/data.py

+    The cumulative density function of a feature is used to project the
+    original values. Features values of new/unseen data that fall below
+    or above the fitted range will be mapped to 0 and 1, respectively.
+    Note that this transform is non-linear. It may remove correlations between


"remove correlations" => "distort linear correlations"

ogrisel · 2017-02-16T18:30:00Z

sklearn/preprocessing/data.py

+    This Normalizer scales the features between 0 and 1, equalizing the
+    distribution of each feature to a uniform distribution. Therefore,
+    for a given feature, this normalization tends to spread out the most
+    frequent values.


It also reduces the impact of (marginal) outliers: this is therefore a robust preprocessing scheme.
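
A toy comparison makes the robustness point visible. This is an illustrative sketch with made-up data and five quantiles, not the PR's implementation:

```python
import numpy as np

x = np.array([1., 2., 3., 4., 100.])  # one marginal outlier

# Standardization: the outlier squeezes the inliers together.
z = (x - x.mean()) / x.std()

# Quantile mapping to a uniform distribution (toy version with 5 quantiles).
references = np.linspace(0, 1, 5)
quantiles = np.percentile(x, references * 100)
u = np.interp(x, quantiles, references)
```

After standardization the four inliers sit within less than 0.1 of each other, whereas the quantile mapping spreads all five values evenly over [0, 1].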

glemaitre · 2017-02-16T18:34:50Z

https://research.google.com/pubs/pub45530.html uses "quantile normalization" in the body of the article to describe what we do in this class. +1 for QuantileNormalizer.

Fair enough. The narration of the User guide needs to be changed to be coherent.

ogrisel · 2017-02-16T21:02:45Z

sklearn/preprocessing/data.py

+
+    f_inverse_transform_ : list of callable, shape (n_quantiles,)
+        The inverse of the cumulative density function used to project the
+        data.


I think we should keep the f_transform_ and f_inverse_transform_ attribute private (with a leading underscore).

GaelVaroquaux · 2017-06-09T09:39:10Z

I've removed the smoothing_noise

@jnothman : this should give us your 👍, no?

There is a failing test that I will address soon

Simplifies also the code, examples, and documentation

GaelVaroquaux · 2017-06-09T23:16:13Z

Merged. Whoot!

This is based on a 4-year old PR by Joseph Turian :)

dengemann · 2017-06-09T23:18:04Z

Champagne!!

agramfort · 2017-06-10T05:44:07Z

🍻

tguillemot · 2017-06-10T09:37:00Z

👍

raghavrv · 2017-06-10T14:49:04Z

Yohoo :D Thanks for the patience @glemaitre

jnothman · 2017-06-10T23:39:37Z

Nicely resolved, @GaelVaroquaux, and well done all!

* resurrect quantile scaler
* move the code in the pre-processing module
* first draft
* Add tests.
* Fix bug in QuantileNormalizer.
* Add quantile_normalizer.
* Implement pickling
* create a specific function for dense transform
* Create a fit function for the dense case
* Create a toy examples
* First draft with sparse matrices
* remove useless functions and non-negative sparse compatibility
* fix slice call
* Fix tests of QuantileNormalizer.
* Fix estimator compatibility
* List of functions became tuple of functions
* Check X consistency at transform and inverse transform time
* fix doc
* Add negative ValueError tests for QuantileNormalizer.
* Fix cosmetics
* Fix compatibility numpy <= 1.8
* Add n_features tests and correct ValueError.
* PEP8
* fix fill_value for early scipy compatibility
* simplify sampling
* Fix tests.
* removing last pring
* Change choice for permutation
* cosmetics
* fix remove remaining choice
* DOC
* Fix inconsistencies
* pep8
* Add checker for init parameters.
* hack bounds and make a test
* FIX/TST bounds are provided by the fitting and not X at transform
* PEP8
* FIX/TST axis should be <= 1
* PEP8
* ENH Add parameter ignore_implicit_zeros
* ENH match output distribution
* ENH clip the data to avoid infinity due to output PDF
* FIX ENH restraint to uniform and norm
* [MRG] ENH Add example comparing the distribution of all scaling preprocessor (scikit-learn#2)
* ENH Add example comparing the distribution of all scaling preprocessor
* Remove Jupyter notebook convert
* FIX/ENH Select feat before not after; Plot interquantile data range for all
* Add heatmap legend
* Remove comment maybe?
* Move doc from robust_scaling to plot_all_scaling; Need to update doc
* Update the doc
* Better aesthetics; Better spacing and plot colormap only at end
* Shameless author re-ordering ;P
* Use env python for she-bang
* TST Validity of output_pdf
* EXA Use OrderedDict; Make it easier to add more transformations
* FIX PEP8 and replace scipy.stats by str in example
* FIX remove useless import
* COSMET change variable names
* FIX change output_pdf occurence to output_distribution
* FIX partial fixies from comments
* COMIT change class name and code structure
* COSMIT change direction to inverse
* FIX factorize transform in _transform_col
* PEP8
* FIX change the magic 10
* FIX add interp1d to fixes
* FIX/TST allow negative entries when ignore_implicit_zeros is True
* FIX use np.interp instead of sp.interpolate.interp1d
* FIX/TST fix tests
* DOC start checking doc
* TST add test to check the behaviour of interp numpy
* TST/EHN Add the possibility to add noise to compute quantile
* FIX factorize quantile computation
* FIX fixes issues
* PEP8
* FIX/DOC correct doc
* TST/DOC improve doc and add random state
* EXA add examples to illustrate the use of smoothing_noise
* FIX/DOC fix some grammar
* DOC fix example
* DOC/EXA make plot titles more succint
* EXA improve explanation
* EXA improve the docstring
* DOC add a bit more documentation
* FIX advance review
* TST add subsampling test
* DOC/TST better example for the docstring
* DOC add ellipsis to docstring
* FIX address olivier comments
* FIX remove random_state in sparse.rand
* FIX spelling doc
* FIX cite example in user guide and docstring
* FIX olivier comments
* EHN improve the example comparing all the pre-processing methods
* FIX/DOC remove title
* FIX change the scaling of the figure
* FIX plotting layout
* FIX ratio w/h
* Reorder and reword the plot_all_scaling example
* Fix aspect ratio and better explanations in the plot_all_scaling.py example
* Fix broken link and remove useless sentence
* FIX fix couples of spelling
* FIX comments joel
* FIX/DOC address documentation comments
* FIX address comments joel
* FIX inline sparse and dense transform
* PEP8
* TST/DOC temporary skipping test
* FIX raise an error if n_quantiles > subsample
* FIX wording in smoothing_noise example
* EXA Denis comments
* FIX rephrasing
* FIX make smoothing_noise to be a boolearn and change doc
* FIX address comments
* FIX verbose the doc slightly more
* PEP8/DOC
* ENH: 2-ways interpolation to avoid smoothing_noise Simplifies also the code, examples, and documentation

lesteve · 2018-02-27T16:24:18Z

sklearn/preprocessing/data.py

+            output_distribution = self.output_distribution
+        output_distribution = getattr(stats, output_distribution)
+
+        # older version of scipy do not handle tuple as fill_value


@glemaitre I bumped into this, while trying to get rid of code related to old numpy/scipy versions that we don't support any more. Do you remember what this is about? I could not figure it out by just looking at the code and searching the PR comments ...

I think that it should have been removed. At first, I implemented the interpolation using scipy.interpolate.interp1d, which takes a fill_value parameter. In older versions, fill_value does not accept a tuple (min, max), which is what we need.

But right now we are using numpy.interp. We could move to the higher-level scipy interpolation function, but we would need to wait until fill_value accepts a tuple or array-like. I am also not sure it is worth spending time on it :)
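
For reference, the clamping that the `(min, max)` fill value was meant to provide is what `numpy.interp` does by default through its `left` and `right` arguments. A small sketch with made-up quantiles:

```python
import numpy as np

quantiles = np.array([2., 4., 6., 8., 10.])  # fitted feature quantiles (toy values)
references = np.linspace(0, 1, 5)

x_new = np.array([0., 5., 12.])

# Default behaviour: values outside [xp[0], xp[-1]] get fp[0] and fp[-1].
default_fill = np.interp(x_new, quantiles, references)

# The same bounds spelled out, analogous to a (min, max) fill_value tuple.
explicit_fill = np.interp(x_new, quantiles, references,
                          left=references[0], right=references[-1])
```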

jnothman mentioned this pull request Feb 16, 2017

[MRG+2] Add fixed width discretization to scikit-learn #7668

Closed

8 tasks

lesteve mentioned this pull request Feb 16, 2017

[MRG] Better handling of Circle CI artifact URLs scikit-learn/scikit-learn.github.io#6

Merged

tguillemot reviewed Feb 16, 2017

ogrisel reviewed Feb 16, 2017

dengemann reviewed Feb 16, 2017

glemaitre commented Feb 16, 2017

tguillemot force-pushed the quantile_scaler branch from 509882d to 2b89139 Compare February 16, 2017 10:53

ogrisel reviewed Feb 16, 2017

glemaitre commented Feb 16, 2017

glemaitre commented Feb 16, 2017

tguillemot force-pushed the quantile_scaler branch from 211bd64 to a803339 Compare February 16, 2017 16:27

ogrisel reviewed Feb 16, 2017

GaelVaroquaux force-pushed the quantile_scaler branch from f74666a to 7ac719a Compare June 9, 2017 09:35

GaelVaroquaux force-pushed the quantile_scaler branch 9 times, most recently from 05290f5 to 45a1548 Compare June 9, 2017 13:49

ENH: 2-ways interpolation to avoid smoothing_noise

7046a6d

Simplifies also the code, examples, and documentation

GaelVaroquaux force-pushed the quantile_scaler branch from 45a1548 to 7046a6d Compare June 9, 2017 17:36

GaelVaroquaux merged commit 26a1027 into scikit-learn:master Jun 9, 2017

amueller mentioned this pull request Jun 10, 2017

[MRG] Added code for sklearn.preprocessing.RankScaler #2176

Closed

lesteve reviewed Feb 27, 2018
