Unit 3 7

The document outlines the process of using Polynomial Regression to fit a synthetic non-linear dataset defined by the equation y = x^2 + 2x + 3. It includes steps for generating the dataset, visualizing it, implementing Polynomial Regression, and comparing its performance against Simple Linear Regression. The results indicate that Polynomial Regression provides a significantly better fit for the non-linear data compared to the linear model.


Polynomial Regression for Non-Linear Data

Objective: Use Polynomial Regression to fit a non-linear dataset.


Dataset: Create a synthetic dataset with a non-linear relationship (e.g., y = x^2 + 2x + 3).
Tasks:
1. Generate and explore the dataset.
2. Visualize the data to confirm its non-linear nature.
3. Implement Polynomial Regression (e.g., degree=2) to fit the data.
4. Compare the performance with a Simple Linear Regression model.

import numpy as np
import pandas as pd

# Set random seed for reproducibility
np.random.seed(42)

# Generate synthetic dataset
X = np.linspace(-10, 10, 100).reshape(-1, 1) # Generate 100 values from -10 to 10
y = X**2 + 2*X + 3 + np.random.normal(0, 5, X.shape) # Quadratic equation with noise

# Convert to DataFrame
df = pd.DataFrame({'X': X.flatten(), 'y': y.flatten()})

# Display first few rows
print(df.head())

import matplotlib.pyplot as plt

# Scatter plot to visualize the dataset
plt.scatter(df['X'], df['y'], color='blue', label='Data Points')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Scatter Plot of Non-Linear Dataset')
plt.legend()
plt.show()

from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Create a polynomial regression model (degree=2)
poly_degree = 2
poly_model = make_pipeline(PolynomialFeatures(degree=poly_degree), LinearRegression())

# Train the model
poly_model.fit(X, y)

# Predictions
y_pred_poly = poly_model.predict(X)

# Plot Polynomial Regression Fit
plt.scatter(X, y, color='blue', label='Data Points')
plt.plot(X, y_pred_poly, color='red', label=f'Polynomial Regression (degree={poly_degree})')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Polynomial Regression Fit')
plt.legend()
plt.show()

# Train a simple linear regression model
linear_model = LinearRegression()
linear_model.fit(X, y)

# Predictions using Linear Regression
y_pred_linear = linear_model.predict(X)

# Compare Polynomial vs. Linear Regression
plt.scatter(X, y, color='blue', label='Data Points')
plt.plot(X, y_pred_linear, color='green', linestyle='dashed', label='Linear Regression')
plt.plot(X, y_pred_poly, color='red', label=f'Polynomial Regression (degree={poly_degree})')
plt.xlabel('X')
plt.ylabel('y')
plt.title('Comparison: Polynomial vs. Linear Regression')
plt.legend()
plt.show()
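Another way to check the quadratic fit is to read the learned coefficients off the pipeline and compare them with the true generating equation y = x^2 + 2x + 3. The sketch below refits on freshly generated data and passes include_bias=False (an assumption, not part of the original code) so that the constant term is carried only by LinearRegression's intercept and the coefficients map cleanly onto the x and x^2 terms:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

np.random.seed(42)
X = np.linspace(-10, 10, 100).reshape(-1, 1)
y = (X**2 + 2*X + 3).ravel() + np.random.normal(0, 5, 100)

# include_bias=False leaves the constant term to LinearRegression's intercept
model = make_pipeline(PolynomialFeatures(degree=2, include_bias=False),
                      LinearRegression())
model.fit(X, y)

lr = model.named_steps['linearregression']
print("intercept (true value 3):", lr.intercept_)
print("coefficients for [x, x^2] (true values [2, 1]):", lr.coef_)
```

With noise of standard deviation 5 on 100 points, the recovered intercept and coefficients should land close to 3, 2, and 1 respectively.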
The comparison between Polynomial Regression (degree=2) and Simple Linear Regression
shows that Polynomial Regression provides a much better fit for the given dataset. Linear
Regression, represented by the green dashed line, assumes a straight-line relationship and fails to
capture the quadratic pattern in the data, leading to higher error and poor predictive performance.
In contrast, Polynomial Regression (red line) effectively models the curvature, reducing the error
and improving accuracy. While Linear Regression underfits the data due to its simplicity,
Polynomial Regression balances flexibility and generalization, making it the better choice for
this dataset.
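The visual comparison can be backed with numbers. A minimal sketch, regenerating the same dataset and reporting MSE and R^2 for both models on the training data via sklearn.metrics; the exact values depend on the random noise, but the polynomial model's error should be far lower:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error, r2_score

np.random.seed(42)
X = np.linspace(-10, 10, 100).reshape(-1, 1)
y = (X**2 + 2*X + 3).ravel() + np.random.normal(0, 5, 100)

linear_model = LinearRegression().fit(X, y)
poly_model = make_pipeline(PolynomialFeatures(degree=2),
                           LinearRegression()).fit(X, y)

# Report goodness-of-fit for each model on the training data
for name, model in [("Linear", linear_model),
                    ("Polynomial (degree=2)", poly_model)]:
    pred = model.predict(X)
    print(f"{name}: MSE = {mean_squared_error(y, pred):.2f}, "
          f"R^2 = {r2_score(y, pred):.3f}")
```

For a fuller evaluation one would score on a held-out test split rather than the training data, but even this in-sample comparison makes the underfitting of the linear model concrete.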
