2. Random Forest Algorithm
How it Works: Random Forest is an ensemble learning algorithm that builds many decision trees. Each tree is trained on a
bootstrap sample of the data and considers a random subset of features at each split. For regression tasks such as
predicting the age of crabs, the forest's prediction is the average of the predictions of all trees.
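To make the averaging step concrete, here is a minimal sketch that fits a small forest and checks that its prediction matches the mean of the individual tree predictions. The data is synthetic (generated with make_regression) and only stands in for crab measurements. The settings n_estimators=50 and max_features="sqrt" are illustrative assumptions, not values from this document.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-in for crab measurements (hypothetical data, not the real dataset)
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

forest = RandomForestRegressor(
    n_estimators=50,      # number of trees in the ensemble (illustrative value)
    max_features="sqrt",  # random subset of features considered at each split
    bootstrap=True,       # each tree is trained on a bootstrap sample of the rows
    random_state=0,
)
forest.fit(X, y)

# Collect every tree's prediction for the first five rows, then average across trees
per_tree = np.stack([tree.predict(X[:5]) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)

# The forest's own prediction for the same rows
forest_prediction = forest.predict(X[:5])

print(np.allclose(manual_average, forest_prediction))
```

Each row of per_tree holds one tree's predictions, so averaging over axis 0 reproduces what the forest does internally for regression.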
Steps:
1. Collect Data: Gather crab data (e.g., size, weight, shell dimensions) and their ages.
2. Preprocess Data: Handle missing data and split the data into training and testing sets.
3. Train Model: Build a Random Forest model using the training data.
4. Evaluate: Use metrics like Mean Absolute Error (MAE) and R² to assess the model’s performance.
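The two metrics named in step 4 can be sketched by hand as follows. MAE is the mean of the absolute errors, and R² is one minus the ratio of the residual sum of squares to the total sum of squares. The helper names mae and r2 and the sample ages are hypothetical and serve only as illustration; in practice sklearn.metrics provides mean_absolute_error and r2_score.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the prediction errors."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def r2(y_true, y_pred):
    """R²: 1 - (residual sum of squares / total sum of squares)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical crab ages, for illustration only
actual_ages = [9, 11, 7, 10]
predicted_ages = [10, 10, 8, 10]

mae_value = mae(actual_ages, predicted_ages)  # (1 + 1 + 1 + 0) / 4 = 0.75
r2_value = r2(actual_ages, predicted_ages)    # 1 - 3 / 8.75

print(mae_value, r2_value)
```

A lower MAE means predictions are closer to the true ages on average, while an R² nearer to 1 means the model explains more of the variance in age.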
Advantages:
CODE
# Import necessary libraries
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV, cross_val_score
from sklearn.ensemble import RandomForestRegressor  # age prediction is a regression task
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.preprocessing import StandardScaler
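# Load your dataset (replace 'your_dataset.csv' with the actual file path)
dataset = pd.read_csv('your_dataset.csv')
# Separate features and target ('Age' is an assumed column name; adjust it to your data)
X = dataset.drop(columns=['Age'])
y = dataset['Age']
# Scale the features (assumes all feature columns are numeric)
X_scaled = StandardScaler().fit_transform(X)
# Split the data into training and testing sets (80% training, 20% testing)
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)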