Data Preprocessing and Cleaning
Key Steps:
- Handling missing values (`df.fillna()`, `df.dropna()`)
- Encoding categorical data (`pd.get_dummies()`, `LabelEncoder`)
- Scaling numerical features (`StandardScaler`, `MinMaxScaler`)
- Splitting data into training and test sets (`train_test_split`)
Example:
--------------------------------
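from sklearn.preprocessing import StandardScaler, LabelEncoder

# Assumes df is an existing pandas DataFrame with the columns
# 'category', 'feature1', and 'feature2'.
# Fill missing numeric values with each column's mean
df = df.fillna(df.mean(numeric_only=True))
# Encode the categorical column as integer labels
le = LabelEncoder()
df['category_encoded'] = le.fit_transform(df['category'])
# Standardize the numeric features to zero mean and unit variance
scaler = StandardScaler()
df[['feature1', 'feature2']] = scaler.fit_transform(df[['feature1', 'feature2']])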
--------------------------------
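The remaining steps from the list (`df.dropna()`, `pd.get_dummies()`, `MinMaxScaler`, `train_test_split`) can be sketched in the same way. This is a minimal illustration, not a definitive pipeline: the toy DataFrame, its column names, and the `target` column are assumptions made up for the example. It splits the data before scaling and fits the scaler on the training set only, which is one common way to keep test-set information out of the preprocessing step.

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "feature1": [1.0, 2.0, None, 4.0, 5.0, 6.0, 7.0, 8.0],
    "feature2": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
    "category": ["a", "b", "a", "c", "b", "a", "c", "b"],
    "target": [0, 1, 0, 1, 0, 1, 0, 1],
})

# Drop rows with missing values (the alternative to imputing with fillna).
df = df.dropna()

# One-hot encode the categorical column.
X = pd.get_dummies(df.drop(columns="target"), columns=["category"])
y = df["target"]

# Split first, so the scaler never sees the test rows.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)
X_train = X_train.copy()
X_test = X_test.copy()

# Fit the scaler on the training set only, then apply it to both sets.
num_cols = ["feature1", "feature2"]
scaler = MinMaxScaler()
X_train[num_cols] = scaler.fit_transform(X_train[num_cols])
X_test[num_cols] = scaler.transform(X_test[num_cols])
```

Because the scaler is fitted on the training rows only, scaled test values may fall slightly outside the [0, 1] range; that is expected and not an error.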