Task 2
Consider the dataset with the following attributes:
i) Height
ii) Weight
iii) Gender
Tool: Python
Aim
To implement a Linear Regression model that predicts a person's Age from the attributes Height,
Weight, and Gender, and to visualize the data using scatter plots and a line graph for better
understanding.
Procedure
1. Data Collection: We'll create a sample dataset with the attributes Height, Weight,
Gender, and Age.
2. Data Preprocessing: We'll handle categorical data (e.g., Gender) using encoding
techniques.
3. Data Visualization: We'll plot scatter plots for visualizing the relationships between
the features (Height, Weight, Gender) and the target (Age).
4. Linear Regression Model: We'll train a linear regression model to predict Age.
5. Visualization of Predictions: We'll use a line graph to show how the predicted age
changes with the input features.
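Step 2 does not name a specific encoding technique. As a minimal sketch, two common options for a two-valued column such as Gender are a fixed label mapping and one-hot encoding. The mapping Female = 0, Male = 1 is an assumption made for illustration, and the three rows are the first three rows of the sample dataset.

```python
import pandas as pd

# First three rows of the sample dataset (Height and Gender only)
df = pd.DataFrame({
    'Height': [160, 170, 180],
    'Gender': ['Male', 'Female', 'Female'],
})

# Option 1: label encoding with an explicit mapping (assumed: Female=0, Male=1)
label_encoded = df.copy()
label_encoded['Gender'] = label_encoded['Gender'].map({'Female': 0, 'Male': 1})

# Option 2: one-hot encoding; drop_first keeps a single 0/1 column (Gender_Male)
one_hot = pd.get_dummies(df, columns=['Gender'], drop_first=True, dtype=int)

print(label_encoded)
print(one_hot)
```

With only two categories, both options produce the same single 0/1 column. One-hot encoding becomes the safer choice when a feature has three or more unordered categories.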
Program
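# Import the required libraries
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
# Step 1: Create a sample dataset (you can replace this with your own dataset)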
data = {
'Height': [160, 170, 180, 165, 155, 190, 175, 168],
'Weight': [60, 70, 80, 65, 55, 90, 75, 68],
'Gender': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male', 'Female', 'Male'],
'Age': [25, 30, 35, 28, 22, 40, 33, 27]
}
# Convert data into a pandas DataFrame
df = pd.DataFrame(data)
# Step 3: Split the data into features (X) and target (y)
X = df[['Height', 'Weight', 'Gender']] # Features
y = df['Age'] # Target variable
# Step 4: Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Step 9: Visualization
plt.subplot(1, 2, 1)
plt.scatter(df['Height'], df['Age'], color='blue', label='Height vs Age')
plt.title('Height vs Age')
plt.xlabel('Height (cm)')
plt.ylabel('Age')
plt.grid(True)
plt.tight_layout()
plt.show()
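The listing moves from Step 4 directly to Step 9. A minimal sketch of the intervening steps (training, prediction, and evaluation) is given below, assuming scikit-learn's LinearRegression and the label mapping Female = 0, Male = 1. Both choices, and the new data point used at the end, are assumptions made for illustration and are not confirmed by the original program, so the printed values may differ from those in the Output section.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Same sample dataset as in the program
df = pd.DataFrame({
    'Height': [160, 170, 180, 165, 155, 190, 175, 168],
    'Weight': [60, 70, 80, 65, 55, 90, 75, 68],
    'Gender': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male',
               'Female', 'Male'],
    'Age': [25, 30, 35, 28, 22, 40, 33, 27],
})

# Encode Gender numerically (assumed mapping: Female=0, Male=1)
df['Gender'] = df['Gender'].map({'Female': 0, 'Male': 1})

X = df[['Height', 'Weight', 'Gender']]
y = df['Age']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Train the linear regression model on the training split
model = LinearRegression()
model.fit(X_train, y_train)

# Predict on the held-out rows and compute the evaluation metrics
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print('Model Coefficients:', model.coef_)
print('Model Intercept:', model.intercept_)
print('Mean Absolute Error (MAE):', mae)
print('Mean Squared Error (MSE):', mse)
print('R-squared (R²):', r2)

# Predict Age for one new person (assumed values, not given in the source)
new_data = pd.DataFrame({'Height': [170], 'Weight': [70], 'Gender': [0]})
print('Predicted Age for the new data:', model.predict(new_data)[0])
```

With eight rows and test_size=0.2, only two rows are held out, so the metrics rest on very few predictions and should be read with caution.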
Output
Model Coefficients: [ 0.25391986 0.25391986 -0.45557491]
Model Intercept: -30.940766550522664
Mean Absolute Error (MAE): 0.14939024390244704
Mean Squared Error (MSE): 0.04463488994646417
R-squared (R²): 0.9982146044021414
Predicted Age for the new data: 30.000000000000004
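To make the three reported metrics concrete, the sketch below computes MAE, MSE, and R² directly from their definitions. The actual and predicted ages are hypothetical values chosen for illustration, not the test rows behind the figures above.

```python
# Hypothetical actual and predicted ages (illustrative values only)
y_true = [30.0, 40.0]
y_pred = [30.5, 39.0]

n = len(y_true)
errors = [t - p for t, p in zip(y_true, y_pred)]

# MAE: mean of the absolute errors
mae = sum(abs(e) for e in errors) / n

# MSE: mean of the squared errors
mse = sum(e ** 2 for e in errors) / n

# R²: 1 - (residual sum of squares / total sum of squares around the mean)
mean_true = sum(y_true) / n
ss_res = sum(e ** 2 for e in errors)
ss_tot = sum((t - mean_true) ** 2 for t in y_true)
r2 = 1 - ss_res / ss_tot

print('MAE:', mae)
print('MSE:', mse)
print('R²:', r2)
```

For these values MAE is 0.75, MSE is 0.625, and R² is approximately 0.975. An R² close to 1 means the predictions explain almost all of the variation in the actual ages.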
Result
By following this procedure, you can implement a Linear Regression model in Python to predict the
age of a person based on the features Height, Weight, and Gender, and use scatter plots and line graphs
for visualizing the data and predictions.
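The line graph of predictions named in the Procedure does not appear in the program. A minimal sketch, assuming a LinearRegression model fitted on all rows and the label mapping Female = 0, Male = 1 (both assumptions made for illustration), plots predicted Age against Height with the rows sorted by Height. The output file name is also hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LinearRegression

# Same sample dataset as in the program
df = pd.DataFrame({
    'Height': [160, 170, 180, 165, 155, 190, 175, 168],
    'Weight': [60, 70, 80, 65, 55, 90, 75, 68],
    'Gender': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male',
               'Female', 'Male'],
    'Age': [25, 30, 35, 28, 22, 40, 33, 27],
})

# Encode Gender numerically (assumed mapping: Female=0, Male=1)
df['Gender'] = df['Gender'].map({'Female': 0, 'Male': 1})
features = ['Height', 'Weight', 'Gender']

# Fit on all rows here, purely to draw the trend (no train/test split)
model = LinearRegression().fit(df[features], df['Age'])

# Sort by Height so the line is drawn from left to right
plot_df = df.sort_values('Height').copy()
plot_df['Predicted Age'] = model.predict(plot_df[features])

fig, ax = plt.subplots()
ax.scatter(plot_df['Height'], plot_df['Age'], color='blue', label='Actual Age')
ax.plot(plot_df['Height'], plot_df['Predicted Age'], color='red',
        label='Predicted Age')
ax.set_title('Predicted Age vs Height')
ax.set_xlabel('Height (cm)')
ax.set_ylabel('Age')
ax.grid(True)
ax.legend()
fig.tight_layout()

# Save to a file (hypothetical name); use plt.show() for an interactive window
fig.savefig('predicted_age_vs_height.png')
```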