.. automodule:: Orange.regression.lasso
.. index:: regression
The lasso (least absolute shrinkage and selection operator) is a regularized version of least squares regression. It minimizes the sum of squared errors while also penalizing the :math:`L_1` norm (sum of absolute values) of the coefficients.
Concretely, the function that is minimized in Orange is:

.. math::

    \frac{1}{n}\|Xw - y\|_2^2 + \frac{\lambda}{m} \|w\|_1

where :math:`X` is an :math:`n \times m` data matrix, :math:`y` is the vector of class values, and :math:`w` is the vector of regression coefficients to be estimated.
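
The objective above can be evaluated directly. The following is a minimal NumPy sketch, not Orange's implementation: the function name ``lasso_objective`` and the parameter name ``lam`` (standing in for the regularization weight :math:`\lambda`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def lasso_objective(X, y, w, lam):
    """Evaluate (1/n) * ||Xw - y||_2^2 + (lam/m) * ||w||_1.

    Illustrative helper only; it is not part of Orange.
    """
    n, m = X.shape                      # n instances, m features
    residual = X @ w - y                # prediction errors
    squared_error = residual @ residual / n
    l1_penalty = lam / m * np.abs(w).sum()
    return squared_error + l1_penalty

# Toy data where w fits y exactly, so only the penalty term remains.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0])

print(lasso_objective(X, y, w, lam=0.0))  # 0.0: no error, no penalty
print(lasso_objective(X, y, w, lam=2.0))  # 3.0: (2/2) * (|1| + |2|)
```

With a perfect fit the squared-error term vanishes, so the second value shows the penalty alone: larger coefficients cost more, which is what pushes the fitted coefficients toward zero.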
.. autoclass:: LassoRegressionLearner
    :members:
    :show-inheritance:

.. autoclass:: LassoRegression
    :members:
    :show-inheritance:
.. autofunction:: get_bootstrap_sample
.. autofunction:: permute_responses
To fit the regression parameters on the housing data set, use the following code:

.. literalinclude:: code/lasso-example.py
    :lines: 9,10,11
To predict values of the response for the first five instances:
.. literalinclude:: code/lasso-example.py
    :lines: 15,16
Output::

    Actual: 24.00, predicted: 30.45
    Actual: 21.60, predicted: 25.60
    Actual: 34.70, predicted: 31.48
    Actual: 33.40, predicted: 30.18
    Actual: 36.20, predicted: 29.59

To see the fitted regression coefficients, print the model:
.. literalinclude:: code/lasso-example.py
    :lines: 19
Output::

    Variable     Coeff Est  Std Error      p
    Intercept       22.533
    CRIM            -0.023      0.024  0.050  .
    CHAS             1.970      1.331  0.040  *
    NOX             -4.226      2.944  0.010  *
    RM               4.270      0.934  0.000  ***
    DIS             -0.373      0.170  0.010  *
    PTRATIO         -0.798      0.117  0.000  ***
    B                0.007      0.003  0.020  *
    LSTAT           -0.519      0.102  0.000  ***
    Signif. codes:  0 *** 0.001 ** 0.01 * 0.05 . 0.1 empty 1

    For 5 variables the regression coefficient equals 0:
    ZN, INDUS, AGE, RAD, TAX

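Note that some of the regression coefficients are exactly 0. The :math:`L_1` penalty shrinks coefficients toward zero and can set some of them to zero, so the lasso also performs variable selection.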
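
To see why coefficients can become exactly zero, consider the special case of a single feature (:math:`m = 1`), where the objective has a closed-form minimizer given by soft-thresholding. The sketch below illustrates that special case only and is not how Orange fits the model; the function name ``lasso_1d`` and the parameter ``lam`` are hypothetical.

```python
import numpy as np

def lasso_1d(x, y, lam):
    """Minimize (1/n) * ||x*w - y||_2^2 + lam * |w| for one feature (m = 1).

    Setting the subgradient to zero gives the soft-thresholding rule
    w = sign(b) * max(|b| - n*lam/2, 0) / a, with a = x.x and b = x.y.
    Illustrative helper only; it is not part of Orange.
    """
    n = len(y)
    a = x @ x                     # squared norm of the feature
    b = x @ y                     # correlation of the feature with the response
    threshold = n * lam / 2.0     # penalty strength on the scale of b
    return np.sign(b) * max(abs(b) - threshold, 0.0) / a

x = np.ones(4)
y = np.ones(4)

print(lasso_1d(x, y, lam=0.0))  # 1.0: ordinary least squares solution
print(lasso_1d(x, y, lam=1.0))  # 0.5: coefficient shrunk toward zero
print(lasso_1d(x, y, lam=2.0))  # 0.0: penalty large enough to remove the feature
```

Once the penalty exceeds the threshold, the coefficient stays at exactly zero rather than merely becoming small, which matches the zero coefficients reported in the output above.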