Assignment 1
String Manipulation:
The Iris dataset has no free-text columns, so there is little to clean here, but
this step is crucial for datasets with textual data.
Example (Python):
# Assuming a 'text_column' exists
iris['text_column'] = iris['text_column'].str.lower().str.strip()
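Since Iris has no such column, the same cleaning can be tried on a small made-up frame. This is only a sketch: the frame `df` and the values in `text_column` are hypothetical and exist purely for illustration.

```python
import pandas as pd

# Hypothetical data: 'text_column' and its values are invented for this example.
df = pd.DataFrame({"text_column": ["  Setosa ", "VERSICOLOR", " Virginica", None]})

# Lowercase, then trim surrounding whitespace.
# The .str accessor skips missing entries instead of raising an error.
df["text_column"] = df["text_column"].str.lower().str.strip()

cleaned = df["text_column"].tolist()
print(cleaned)
```

Chaining the `.str` methods keeps the cleaning vectorized, so no explicit loop over rows is needed.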
NumPy Operations:
import numpy as np
# Convert relevant columns to NumPy arrays
sepal_length_np = iris['sepal_length'].values
petal_width_np = iris['petal_width'].values
# Calculate basic statistics
print("Mean sepal length:", np.mean(sepal_length_np))
print("Median petal width:", np.median(petal_width_np))
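The statistics above extend naturally to spread and per-class summaries. The sketch below assumes scikit-learn's bundled copy of Iris as a stand-in for the `iris` DataFrame, so the column positions and integer class labels are those of that copy, not necessarily of the frame used in this assignment.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for the `iris` DataFrame.
data = load_iris()
sepal_length_np = data.data[:, 0]  # column 0: sepal length (cm)
petal_width_np = data.data[:, 3]   # column 3: petal width (cm)
species_np = data.target           # integer class labels 0, 1, 2

mean_sepal_length = np.mean(sepal_length_np)
median_petal_width = np.median(petal_width_np)
std_sepal_length = np.std(sepal_length_np, ddof=1)  # sample standard deviation

# Boolean masks select the rows of one class at a time, with no loop over rows.
mean_by_class = {
    str(data.target_names[k]): float(sepal_length_np[species_np == k].mean())
    for k in np.unique(species_np)
}

print("Mean sepal length:", mean_sepal_length)
print("Median petal width:", median_petal_width)
print("Std of sepal length:", std_sepal_length)
print("Mean sepal length by class:", mean_by_class)
```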
Data Splitting:
from sklearn.model_selection import train_test_split
# Split into features (X) and target (y)
X = iris[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
y = iris['species']
# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
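One optional variation on the split above is to pass `stratify=y`, which keeps the class proportions the same in the training and test sets. The sketch assumes scikit-learn's bundled Iris data, with its columns renamed to match the names used in this assignment; the renaming is an assumption made so the example is self-contained.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Assumption: load scikit-learn's copy of Iris and rename its columns
# to the names used in this assignment.
frame = load_iris(as_frame=True).frame
frame.columns = ['sepal_length', 'sepal_width', 'petal_length',
                 'petal_width', 'species']

X = frame[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]
y = frame['species']

# stratify=y preserves the share of each species in both subsets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

test_counts = y_test.value_counts().sort_index()
print("Training rows:", len(X_train), "Test rows:", len(X_test))
print(test_counts)
```

With a dataset this small, an unstratified split can leave one species under-represented in the test set, which makes the accuracy estimate noisier.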
Building a Model:
from sklearn.linear_model import LogisticRegression
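# Create a logistic regression model; max_iter is raised above the
# default of 100 so the solver has room to converge on unscaled features
model = LogisticRegression(max_iter=200)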
# Train the model
model.fit(X_train, y_train)
# Evaluate the model
accuracy = model.score(X_test, y_test)
print("Accuracy:", accuracy)
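Accuracy alone does not show which species get confused with each other. A possible extension, sketched below, adds a confusion matrix and a per-class report. It assumes scikit-learn's bundled Iris data, and it wraps the model in a scaling pipeline, which is a choice made for this example rather than part of the steps above.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris data stands in for the `iris` DataFrame.
iris_data = load_iris()
X, y = iris_data.data, iris_data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Standardizing the features first helps the solver converge.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=200))
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
accuracy = model.score(X_test, y_test)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
report = classification_report(
    y_test, y_pred, target_names=iris_data.target_names)

print("Accuracy:", accuracy)
print(cm)
print(report)
```

Off-diagonal entries in the confusion matrix count the misclassified test samples, so a fully diagonal matrix means every test sample was predicted correctly.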
Report:
Further Exploration
Feature Engineering: Consider creating new features based on existing
ones (e.g., ratios of sepal and petal measurements).
Model Selection: Experiment with other models like decision trees,
random forests, or support vector machines.
Hyperparameter Tuning: Tune model parameters (e.g., regularization
strength or tree depth) with a cross-validated search to improve
performance.
Visualization: Create plots to visualize the data and model predictions.
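Several of the ideas above can be combined in a short sketch: two ratio features, a cross-validated comparison of three models, and a small grid search. Everything specific here is an assumption made for illustration, including the feature names `sepal_ratio` and `petal_ratio`, the choice of models, and the parameter grid. The data again comes from scikit-learn's bundled copy of Iris with renamed columns.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: scikit-learn's copy of Iris, with columns renamed to match
# the names used in this assignment.
frame = load_iris(as_frame=True).frame
frame.columns = ['sepal_length', 'sepal_width', 'petal_length',
                 'petal_width', 'species']
X = frame.drop(columns='species').copy()
y = frame['species']

# Feature engineering: ratios of existing measurements (hypothetical names).
X['sepal_ratio'] = X['sepal_length'] / X['sepal_width']
X['petal_ratio'] = X['petal_length'] / X['petal_width']

# Model selection: mean 5-fold cross-validated accuracy per candidate.
candidates = {
    'decision_tree': DecisionTreeClassifier(random_state=42),
    'random_forest': RandomForestClassifier(n_estimators=100, random_state=42),
    'svm': make_pipeline(StandardScaler(), SVC()),
}
cv_scores = {
    name: cross_val_score(estimator, X, y, cv=5).mean()
    for name, estimator in candidates.items()
}

# Hyperparameter tuning: small illustrative grid over the SVM's C and gamma.
# make_pipeline names the SVC step 'svc', hence the 'svc__' prefix.
param_grid = {'svc__C': [0.1, 1, 10], 'svc__gamma': ['scale', 0.1]}
search = GridSearchCV(make_pipeline(StandardScaler(), SVC()), param_grid, cv=5)
search.fit(X, y)

print(cv_scores)
print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", search.best_score_)
```

In a full workflow the search would be fit on the training split only, with the held-out test set reserved for a final check, so that tuning does not leak information into the reported accuracy.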
By following these steps and exploring further, you can gain deeper insights
into the dataset and build more accurate machine learning models.