This document demonstrates the scikit-learn 4-step modeling pattern for machine learning in Python. It shows how to (1) import an estimator class, (2) instantiate the model, (3) fit the model to training data, and (4) predict new observations. As an example, it loads the Iris dataset, fits a k-nearest neighbors model (with different values of K), and makes predictions on test data. It then repeats the process using a logistic regression model instead of k-NN.
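For quick reference, here is a compact, self-contained sketch of the same workflow that the document then walks through step by step (variable names are illustrative):

import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier   # Step 1: import the estimator class

iris = load_iris()
X, y = iris.data, iris.target

knn = KNeighborsClassifier(n_neighbors=1)             # Step 2: instantiate the estimator
knn.fit(X, y)                                         # Step 3: fit the model to the training data
knn.predict(np.array([[3, 5, 4, 2]]))                 # Step 4: predict for a new observation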
Loading the data
In [ ]: # import NumPy and the load_iris function from the datasets module
import numpy as np
from sklearn.datasets import load_iris
# save "bunch" object containing iris dataset and its attributes
iris = load_iris()
# store feature matrix in "X"
X = iris.data
# store response vector in "y"
y = iris.target

In [ ]: # print the shapes of X and y
print(X.shape)
print(y.shape)

scikit-learn 4-step modeling pattern

Step 1: Import the class you plan to use

In [ ]: from sklearn.neighbors import KNeighborsClassifier

Step 2: "Instantiate" the "estimator"

• "Estimator" is scikit-learn's term for model
• "Instantiate" means "make an instance of"

In [ ]: knn = KNeighborsClassifier(n_neighbors=1)

• Name of the object does not matter
• Can specify tuning parameters (aka "hyperparameters") during this step
• All parameters not specified are set to their defaults

In [ ]: print(knn)

Step 3: Fit the model with data (aka "model training")

• Model is learning the relationship between X and y
• Occurs in-place

In [ ]: knn.fit(X, y)

Step 4: Predict the response for a new observation

• New observations are called "out-of-sample" data
• Uses the information it learned during the model training process

In [ ]: X_new = np.array([3, 5, 4, 2]).reshape(1, 4)
knn.predict(X_new)

• Returns a NumPy array
• Can predict for multiple observations at once

In [ ]: X_new = np.array([[3, 5, 4, 2], [5, 4, 3, 2]])
knn.predict(X_new)

Using a different value for K

In [ ]: # instantiate the model (using the value K=5)
knn = KNeighborsClassifier(n_neighbors=5)
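# illustrative check (not part of the original cell): any parameter not specified
# above, such as weights or metric, keeps its default value; get_params() returns
# the estimator's full parameter dictionary
print(knn.get_params())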
# fit the model with data
knn.fit(X, y)
# predict the response for new observations
knn.predict(X_new)

Using a different classification model

In [ ]: # import the class
from sklearn.linear_model import LogisticRegression
# instantiate the model (using the default parameters)
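# sketch of how this truncated cell presumably continues, following the same
# 4-step pattern as above ("logreg" is an assumed variable name)
logreg = LogisticRegression()

# fit the model with data
logreg.fit(X, y)

# predict the response for the new observations
logreg.predict(X_new)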