
IsolationForest(max_features=0.8).predict(X) fails input validation #5732


Closed
betatim opened this issue Nov 5, 2015 · 6 comments

@betatim
Member

betatim commented Nov 5, 2015

When subsampling features, IsolationForest fails input validation when calling predict().

from sklearn.ensemble import IsolationForest
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

# subsample 80% of the features (3 of the 4 iris columns)
clf = IsolationForest(max_features=0.8)
clf.fit(X, y)
clf.predict(X)  # raises the ValueError shown below

gives the following:

scikit-learn/sklearn/tree/tree.pyc in _validate_X_predict(self, X, check_input)
    392                              " match the input. Model n_features is %s and "
    393                              " input n_features is %s "
--> 394                              % (self.n_features_, n_features))
    395
    396         return X

ValueError: Number of features of the model must  match the input. Model n_features is 3 and  input n_features is 4

In predict, one of the individual fitted estimators is used for input validation: self.estimators_[0]._validate_X_predict(X, check_input=True), but it is passed the full X, which still has all the features. After looking into it a bit: bagging.py sub-samples the features itself, whereas forest.py delegates that to the underlying DecisionTree.
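
For reference, a minimal sketch (using the clf and X from the snippet above, and the estimators_ / estimators_features_ attributes that IsolationForest inherits from BaseBagging) makes the mismatch visible:

print(X.shape[1])                      # 4: the input keeps all iris features
print(clf.estimators_features_[0])     # the 3 column indices the first tree was fit on
print(clf.estimators_[0].n_features_)  # 3: what _validate_X_predict compares against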

@agramfort
Member

agramfort commented Nov 5, 2015 via email

@ngoix
Contributor

ngoix commented Nov 6, 2015

Yes, we can't apply X_test to the base estimators self.estimators_ because X_test has the full number of features, whereas the base estimators are built on a smaller subset of features.

The original Isolation Forest algorithm does not consider sub-sampling features. We kept this argument because it was supposed to be an easy extension. It seems that is not the case. I vote for removing the max_features argument.

@glouppe
Contributor

glouppe commented Nov 6, 2015

To me the issue comes from the fact that the validity of X is checked through one of its sub-estimators
(see X = self.estimators_[0]._validate_X_predict(X, check_input=True) at line 194). This is not the correct way to do it.

@ngoix
Contributor

ngoix commented Nov 6, 2015

Yes, but even if you replace this line with just X = check_array(X), you get an error in predict when doing (at line 202):

for i, tree in enumerate(self.estimators_):
    leaves_index = tree.apply(X)
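
(To be explicit, a small sketch with the clf and X from the report above: apply() re-validates its input against the tree's own feature count, so the same ValueError comes back.)

tree = clf.estimators_[0]
tree.apply(X)  # ValueError again: model has 3 features, input has 4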

@glouppe
Contributor

glouppe commented Nov 6, 2015

Yes, because you should not do that either. Since you inherit from BaseBagging, you need to iterate over zip(self.estimators_, self.estimators_features_) and then compute leaves_index = tree.apply(X[:, features]). Sorry for not catching that in the initial review...
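
A rough sketch of what that loop could look like (assuming only the estimators_ and estimators_features_ attributes from BaseBagging; the real change belongs inside IsolationForest.predict):

import numpy as np

def apply_trees(iforest, X):
    # apply each tree to the feature subset it was actually fit on
    leaves = []
    for tree, features in zip(iforest.estimators_, iforest.estimators_features_):
        leaves.append(tree.apply(X[:, features]))  # slice the columns first
    return np.column_stack(leaves)

# e.g. apply_trees(clf, X) with the clf and X from the report above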

@ngoix
Contributor

ngoix commented Nov 6, 2015

Ok great! I didn't know about self.estimators_features_. So it is indeed an easy extension, and I withdraw my vote for removing max_features ;)

@amueller modified the milestone: 0.19 Sep 29, 2016