Data Pre-Processing with Sklearn using Standard and Minmax
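from sklearn.preprocessing import MinMaxScaler

# create data
data = [[11, 2], [3, 7], [0, 10], [11, 8]]
# scale each feature (column) to the range [0, 1]
scaler = MinMaxScaler()
model = scaler.fit(data)
scaled_data = model.transform(data)
print(scaled_data)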
[[1.         0.        ]
 [0.27272727 0.625     ]
 [0.         1.        ]
 [1.         0.75      ]]
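The title also mentions standard scaling, which the example above does not show. The following is a minimal sketch, assuming the same data and scikit-learn's StandardScaler. It first checks the min-max result against the formula (x - min) / (max - min) computed per column, then standardizes each column to mean 0 and standard deviation 1. The variable names are illustrative only.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

data = np.array([[11, 2], [3, 7], [0, 10], [11, 8]], dtype=float)

# Min-max scaling by hand: (x - min) / (max - min), computed per column
col_min = data.min(axis=0)
col_max = data.max(axis=0)
manual_minmax = (data - col_min) / (col_max - col_min)

# The same result from scikit-learn
sk_minmax = MinMaxScaler().fit_transform(data)
print(np.allclose(manual_minmax, sk_minmax))

# Standard scaling: (x - mean) / std, computed per column
standardized = StandardScaler().fit_transform(data)
print(standardized.mean(axis=0))  # approximately 0 for each column
print(standardized.std(axis=0))   # approximately 1 for each column
```

Min-max scaling preserves the shape of each feature's distribution inside a fixed range, while standard scaling centers each feature and is less tied to the observed minimum and maximum.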
Binarizing the Data using Python Scikit-learn
• Binarization is a preprocessing technique that converts
numeric feature values into binary values (0 or 1). In
scikit-learn it is provided by the class
sklearn.preprocessing.Binarizer (used in the example here)
and by the equivalent function
sklearn.preprocessing.binarize().
• Binarizer has a threshold parameter: feature values less
than or equal to the threshold are replaced by 0, and
values above it are replaced by 1.
# Importing the necessary packages
import sklearn
import numpy as np
from sklearn import preprocessing

X = [[0.4, -1.8, 2.9], [2.5, 0.9, 0.3], [0., 1., -1.5], [0.1, 2.9, 5.9]]
Binarized_Data = preprocessing.Binarizer(threshold=0.5).transform(X)
print("\nThe Binarized data is:\n", Binarized_Data)
Output:
The Binarized data is:
[[0. 0. 1.]
[1. 1. 0.]
[0. 1. 0.]
[0. 1. 1.]]
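To make the threshold rule concrete, the sketch below applies it by hand with NumPy and compares the result with both the Binarizer class and the binarize() function. It assumes the same X and threshold as the example above; the names manual, from_class and from_function are illustrative.

```python
import numpy as np
from sklearn.preprocessing import Binarizer, binarize

X = np.array([[0.4, -1.8, 2.9],
              [2.5, 0.9, 0.3],
              [0.0, 1.0, -1.5],
              [0.1, 2.9, 5.9]])
threshold = 0.5

# The threshold rule by hand: 1.0 where value > threshold, else 0.0
manual = (X > threshold).astype(float)

# Class-based and function-based scikit-learn equivalents
from_class = Binarizer(threshold=threshold).fit_transform(X)
from_function = binarize(X, threshold=threshold)

print(np.array_equal(manual, from_class))
print(np.array_equal(manual, from_function))
```

The comparison is strict: a value exactly equal to the threshold maps to 0, not 1.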
Implementation of Logistic and Linear Regression Algorithms
Linear Regression:
Linear regression is a type of supervised machine
learning algorithm that computes the linear relationship
between a dependent variable and one or more
independent features.
When there is only one independent feature, it is known
as simple (univariate) linear regression; with more than
one feature, it is known as multiple linear regression
(sometimes loosely called multivariate linear regression).
• The goal of the algorithm is to find the best linear
equation that can predict the value of the dependent
variable based on the independent variables.
• The equation provides a straight line that represents the
relationship between the dependent and independent
variables.
• The slope of the line indicates how much the dependent
variable changes for a unit change in the independent
variable(s).
• Linear regression is used in many different fields,
including finance, economics, and psychology, to
understand and predict the behavior of a particular
variable. For example, in finance, linear regression
might be used to understand the relationship between a
company’s stock price and its earnings or to predict the
future value of a currency based on its past
performance.
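The points above about the best linear equation and the meaning of the slope can be sketched for a single feature. This is a minimal illustration, assuming hypothetical data that lies exactly on the line y = 2x + 1 and using the standard least-squares formulas for the slope and intercept; it is not the only way to fit a line.

```python
import numpy as np

# Hypothetical data lying exactly on the line y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Least-squares estimates for the line y = intercept + slope * x
x_mean, y_mean = x.mean(), y.mean()
slope = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
intercept = y_mean - slope * x_mean

# slope: change in y for a one-unit change in x
print(slope, intercept)
```

Here the recovered slope says that y increases by 2 for every unit increase in x, which is the interpretation given in the bullet on slope.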
• Regression is one of the most important supervised learning tasks.
Given a set of records with known X and Y values, the algorithm
learns a function from them; that learned function can then be used
to predict Y for a previously unseen X. Because Y is continuous in
regression, the function must predict a continuous value of Y from
the independent features X.
• Here Y is called the dependent or target variable, and X is called
the independent variable, also known as the predictor of Y. Many
types of functions or models can be used for regression; a linear
function is the simplest. X may be a single feature or multiple
features representing the problem.
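Learning a function from known records and using it on an unseen X might look like the sketch below. It assumes scikit-learn's LinearRegression and hypothetical records with two features, whose targets are generated as y = 3*x1 + 2*x2 + 5 so that the learned coefficients can be checked.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical records: two features per row, target built as y = 3*x1 + 2*x2 + 5
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
y = 3 * X[:, 0] + 2 * X[:, 1] + 5

# Learn the function from the known (X, y) records
model = LinearRegression().fit(X, y)
print(model.coef_, model.intercept_)

# Use the learned function to predict y for an unseen X
X_new = np.array([[6.0, 2.0]])
y_pred = model.predict(X_new)
print(y_pred)
```

With a single feature, X would simply have one column; the fit and predict calls stay the same.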
Linear regression predicts the value of a dependent
variable (y) from a given independent variable (x).
Because the relationship it fits between x and y is
linear, it is called linear regression. In the figure
above, X (input) is the work experience and Y (output) is
the salary of a person. The regression line is the
best-fit line for our model.
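One way to read "best-fit" is that the regression line has the smallest mean squared error of any straight line on the training data. The sketch below illustrates this, assuming scikit-learn's LinearRegression and made-up experience and salary numbers (they are not taken from the figure); the helper mse is hypothetical.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: years of work experience (X) and salary in thousands (Y)
experience = np.array([[1.0], [2.0], [3.0], [5.0], [8.0]])
salary = np.array([35.0, 42.0, 45.0, 60.0, 78.0])

model = LinearRegression().fit(experience, salary)
slope, intercept = model.coef_[0], model.intercept_

def mse(m, b):
    """Mean squared error of the line y = m*x + b on the data."""
    predictions = m * experience[:, 0] + b
    return np.mean((salary - predictions) ** 2)

best = mse(slope, intercept)
# Any other line, such as a steeper or shifted one, has a larger error
print(best < mse(slope + 0.5, intercept))
print(best < mse(slope, intercept + 2.0))
```

Both comparisons print True, since ordinary least squares chooses the slope and intercept that minimize this error.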