21. Create a Python program to analyze a dataset (e.g., customer reviews) and
demonstrate how AI can be used to extract insights. For example, use
sentiment analysis to categorize reviews as positive, negative, or neutral.
Python Code for Sentiment Analysis
import pandas as pd
import matplotlib.pyplot as plt
from textblob import TextBlob
# Load Dataset (Example: CSV File with "Review" Column)
file_path = "customer_reviews.csv" # Update this with your actual file path
df = pd.read_csv(file_path)
# Function to Get Sentiment Score
def analyze_sentiment(review):
    analysis = TextBlob(str(review))  # Convert to string to avoid errors
    polarity = analysis.sentiment.polarity
    # Categorizing the Sentiment
    if polarity > 0:
        return "Positive"
    elif polarity < 0:
        return "Negative"
    else:
        return "Neutral"
# Apply Sentiment Analysis to Each Review
df["Sentiment"] = df["Review"].apply(analyze_sentiment)
# Display the first few rows
print(df.head())
# Count Sentiment Categories
sentiment_counts = df["Sentiment"].value_counts()
# Plot Sentiment Distribution
plt.figure(figsize=(6,4))
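# value_counts() orders by frequency, so pick each bar's color from its label
color_map = {"Positive": "green", "Negative": "red", "Neutral": "gray"}
colors = [color_map[label] for label in sentiment_counts.index]
plt.bar(sentiment_counts.index, sentiment_counts.values, color=colors)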
plt.xlabel("Sentiment")
plt.ylabel("Number of Reviews")
plt.title("Customer Sentiment Distribution")
plt.show()
Steps to Run the Program:
1. Prepare a dataset (customer_reviews.csv) with at least one column named "Review".
2. Install the dependencies if they are not already installed:
   pip install pandas textblob matplotlib
3. Run the script. It will:
   o Categorize each review as Positive, Negative, or Neutral.
   o Display the first few categorized reviews.
   o Generate a bar chart showing the sentiment distribution.
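Step 1 needs a CSV file with a "Review" column. If no dataset is at hand, a small test file can be generated first. The following is only a sketch: the helper name write_sample_reviews is hypothetical, and the three rows are simply the reviews shown in the example output of this exercise.

```python
import csv

# Hypothetical sample rows; they mirror the reviews in this exercise's example output.
SAMPLE_REVIEWS = [
    "Great product, very useful!",
    "Worst service ever.",
    "It's okay, nothing special.",
]


def write_sample_reviews(path="customer_reviews.csv"):
    """Write a one-column CSV with a "Review" header and return the row count."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Review"])  # Header expected by the sentiment script
        for review in SAMPLE_REVIEWS:
            writer.writerow([review])
    return len(SAMPLE_REVIEWS)
```

Calling write_sample_reviews() creates customer_reviews.csv in the current directory, which the script above can then load with pd.read_csv.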
Example Output
Dataset Sample:
Review                           Sentiment
"Great product, very useful!"    Positive
"Worst service ever."            Negative
"It's okay, nothing special."    Neutral
22. Implement a simple chatbot using Python's NLTK library that can respond to
basic user queries.
If you haven’t installed NLTK, install it first:
pip install nltk
Python Code for a Simple Chatbot
import nltk
from nltk.chat.util import Chat, reflections
# Define chatbot responses using pairs of patterns and responses
pairs = [
    [r"hi|hello|hey", ["Hello!", "Hi there!", "Hey!"]],
    [r"how are you\s*\??",
     ["I'm a chatbot, I don't have feelings, but I'm here to help!",
      "I'm doing great!"]],
    [r"what is your name\s*\??",
     ["I am a simple chatbot.", "You can call me ChatBot!"]],
    [r"how can you help me\s*\??", ["I can answer basic questions!"]],
    [r"who created you\s*\??", ["I was created using Python and NLTK!"]],
    [r"bye|goodbye", ["Goodbye!", "See you later!", "Bye, have a great day!"]],
    # Fallback so that unmatched input gets a reply instead of None
    [r"(.*)", ["Sorry, I don't understand that. Could you rephrase?"]],
]
# Create chatbot
chatbot = Chat(pairs, reflections)
# Start chatbot interaction
print("Hello! I'm your chatbot. Type 'bye' to exit.")
while True:
    user_input = input("You: ")
    if user_input.lower() == "bye":
        print("Chatbot: Goodbye!")
        break
    response = chatbot.respond(user_input)
    print("Chatbot:", response)
Example Conversation
Hello! I'm your chatbot. Type 'bye' to exit.
You: hi
Chatbot: Hi there!
You: what is your name?
Chatbot: I am a simple chatbot.
You: bye
Chatbot: Goodbye!
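The chatbot above works by trying each regular-expression pattern in order and replying with a randomly chosen response for the first one that matches. The sketch below imitates that idea using only the standard library, to show the mechanism. It is not NLTK's actual implementation, and the names PAIRS, FALLBACK, and respond are hypothetical.

```python
import random
import re

# Hypothetical mini pattern/response table, modeled on the pairs above.
PAIRS = [
    (r"hi|hello|hey", ["Hello!", "Hi there!", "Hey!"]),
    (r"what is your name\??", ["I am a simple chatbot."]),
    (r"bye|goodbye", ["Goodbye!"]),
]
FALLBACK = "Sorry, I don't understand."


def respond(text, pairs=PAIRS):
    """Return a reply for the first pattern that matches the start of the text."""
    cleaned = text.strip()
    for pattern, replies in pairs:
        # re.match anchors at the start only, so "hi there" also matches "hi"
        if re.match(pattern, cleaned, flags=re.IGNORECASE):
            return random.choice(replies)
    return FALLBACK
```

Because matching is anchored only at the start, a word such as "history" would also trigger the greeting; adding a word boundary (\b) to the pattern is one way to tighten this.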
23. Write a Python program to analyse sentiment in each text using the TextBlob
library.
Install Required Library
If you haven’t installed TextBlob, install it first:
pip install textblob
Python Code for Sentiment Analysis
from textblob import TextBlob
# Function to analyze sentiment
def analyze_sentiment(text):
    blob = TextBlob(text)
    polarity = blob.sentiment.polarity  # Sentiment score between -1 and 1
    # Classifying sentiment
    if polarity > 0:
        sentiment = "Positive"
    elif polarity < 0:
        sentiment = "Negative"
    else:
        sentiment = "Neutral"
    return polarity, sentiment
# Take user input
text = input("Enter a text for sentiment analysis: ")
polarity, sentiment = analyze_sentiment(text)
# Display results
print(f"\nText: {text}")
print(f"Sentiment Score: {polarity}")
print(f"Overall Sentiment: {sentiment}")
Example Output
Input:
Enter a text for sentiment analysis: This product is amazing and works perfectly!
Output:
Text: This product is amazing and works perfectly!
Sentiment Score: 0.85
Overall Sentiment: Positive
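The classification above treats any nonzero polarity as positive or negative, so a score such as 0.01 counts as "Positive". One possible refinement, shown here only as a sketch, is a neutral band around zero. The function name label_polarity and the default threshold of 0.1 are assumptions for illustration, not values taken from TextBlob.

```python
def label_polarity(polarity, threshold=0.1):
    """Map a polarity score in [-1, 1] to a label, with a neutral band.

    Scores within +/- threshold of zero are treated as Neutral.
    The threshold value is an illustrative assumption.
    """
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"
```

With threshold=0.0 the function behaves like the original three-way rule.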
24. Create a machine learning model in Python to classify spam emails using the
Naive Bayes algorithm.
Install Required Libraries
If you haven’t installed the required libraries, install them first:
pip install pandas scikit-learn nltk
Python Code: Spam Email Classifier
import pandas as pd
import string
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.metrics import accuracy_score, classification_report
# Load Dataset (Example: spam.csv)
df = pd.read_csv("spam.csv", encoding="latin-1")[["v1", "v2"]]  # Select only the relevant columns
df.columns = ["label", "message"]
# Convert labels to numerical values (spam = 1, ham = 0)
df["label"] = df["label"].map({"ham": 0, "spam": 1})
# Text Preprocessing Function
def clean_text(text):
    text = text.lower()  # Convert to lowercase
    # Remove punctuation
    text = "".join(char for char in text if char not in string.punctuation)
    return text
df["message"] = df["message"].apply(clean_text)
# Split Data into Training and Testing Sets
X_train, X_test, y_train, y_test = train_test_split(
    df["message"], df["label"], test_size=0.2, random_state=42
)
# Create a Naive Bayes Model Pipeline
model = Pipeline([
    ("vectorizer", CountVectorizer()),  # Convert text to numerical vectors
    ("tfidf", TfidfTransformer()),      # Apply TF-IDF transformation
    ("classifier", MultinomialNB()),    # Train a Naive Bayes classifier
])
# Train the Model
model.fit(X_train, y_train)
# Predict on Test Data
y_pred = model.predict(X_test)
# Evaluate the Model
accuracy = accuracy_score(y_test, y_pred)
print(f"Model Accuracy: {accuracy:.2f}")
print("\nClassification Report:\n", classification_report(y_test, y_pred))
# Test the Model on a New Email
def predict_spam(email):
    # Apply the same cleaning used on the training messages
    prediction = model.predict([clean_text(email)])
    return "Spam" if prediction[0] == 1 else "Not Spam"
# Example Test
email = input("\nEnter an email to check if it's spam or not: ")
print("Prediction:", predict_spam(email))
Example Output
Model Accuracy: 0.98
Classification Report:
              precision    recall  f1-score   support
           0       0.99      0.99      0.99       965
           1       0.94      0.95      0.94       150
Enter an email to check if it's spam or not: Congratulations! You've won a free iPhone! Click here to claim.
Prediction: Spam
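To make the Naive Bayes step less of a black box, the sketch below implements a tiny multinomial Naive Bayes classifier with add-one (Laplace) smoothing using only the standard library. It works on raw word counts and skips the TF-IDF step, so it is a simplified illustration of the idea rather than a reproduction of scikit-learn's MultinomialNB. The function names and the four toy messages are made up for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data, invented purely for illustration.
TOY_MESSAGES = [
    "win a free prize now",
    "free cash click now",
    "are we meeting for lunch",
    "see you at the meeting",
]
TOY_LABELS = ["spam", "spam", "ham", "ham"]


def train_nb(messages, labels):
    """Collect per-class word counts, class counts, and the vocabulary."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    class_counts = Counter(labels)      # label -> number of messages
    vocab = set()
    for text, label in zip(messages, labels):
        words = text.lower().split()
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab


def predict_nb(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)  # Log prior
        for word in text.lower().split():
            if word not in vocab:
                continue  # Ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs above zero
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

For instance, predict_nb("free prize", train_nb(TOY_MESSAGES, TOY_LABELS)) picks "spam", because both words occur only in the spam messages of the toy data.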
25. Write a Python program to predict house prices using linear regression.
Install Required Libraries
If you haven’t installed the required libraries, install them first:
pip install pandas numpy scikit-learn matplotlib seaborn
Python Code: House Price Prediction Using Linear Regression
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
# Load Dataset (Example: housing.csv)
df = pd.read_csv("housing.csv") # Replace with your dataset
# Display first few rows
print(df.head())
# Check for missing values
print("\nMissing Values:\n", df.isnull().sum())
# Drop rows with missing values (if necessary)
df.dropna(inplace=True)
# Select Features (X) and Target Variable (y)
X = df[["area", "bedrooms", "bathrooms", "stories"]] # Choose relevant columns
y = df["price"]
# Split Data into Training and Testing Sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
# Train Linear Regression Model
model = LinearRegression()
model.fit(X_train, y_train)
# Predict on Test Data
y_pred = model.predict(X_test)
# Evaluate Model Performance
print("\nModel Performance:")
print(f"Mean Absolute Error (MAE): {mean_absolute_error(y_test, y_pred):.2f}")
print(f"Mean Squared Error (MSE): {mean_squared_error(y_test, y_pred):.2f}")
print(f"R-squared (R² Score): {r2_score(y_test, y_pred):.2f}")
# Plot Actual vs. Predicted Prices
plt.figure(figsize=(8,6))
sns.scatterplot(x=y_test, y=y_pred)
plt.xlabel("Actual Prices")
plt.ylabel("Predicted Prices")
plt.title("Actual vs Predicted House Prices")
plt.show()
# Function to Predict House Price Based on User Input
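def predict_house_price(area, bedrooms, bathrooms, stories):
    # Build a DataFrame with the training column names so that
    # scikit-learn does not warn about missing feature names
    input_data = pd.DataFrame([[area, bedrooms, bathrooms, stories]], columns=X.columns)
    prediction = model.predict(input_data)
    return prediction[0]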
# Example Test
area = float(input("\nEnter Area (sqft): "))
bedrooms = int(input("Enter Number of Bedrooms: "))
bathrooms = int(input("Enter Number of Bathrooms: "))
stories = int(input("Enter Number of Stories: "))
predicted_price = predict_house_price(area, bedrooms, bathrooms, stories)
print(f"\nPredicted House Price: ${predicted_price:,.2f}")
Example Output
Model Performance:
Mean Absolute Error (MAE): 35000.45
Mean Squared Error (MSE): 2500000000.67
R-squared (R² Score): 0.87
Enter Area (sqft): 1500
Enter Number of Bedrooms: 3
Enter Number of Bathrooms: 2
Enter Number of Stories: 2
Predicted House Price: $250,000.00
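The three metrics printed by the program can be computed by hand, which helps when interpreting them. The sketch below follows the standard definitions of MAE, MSE, and R² in plain Python. The function name regression_metrics is hypothetical, and the code assumes the actual values are not all identical (otherwise R² is undefined).

```python
def regression_metrics(y_true, y_pred):
    """Return (MAE, MSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n          # Mean absolute error
    mse = sum(e * e for e in errors) / n           # Mean squared error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)            # Residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # Total sum of squares
    r2 = 1 - ss_res / ss_tot                       # Assumes ss_tot > 0
    return mae, mse, r2
```

MAE is in the same unit as the price, while MSE is in squared units, which is why MSE values look so much larger.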
26. Develop a Python program to clean and preprocess a given dataset for a
machine learning model.
import pandas as pd
from sklearn.preprocessing import StandardScaler, LabelEncoder
# Load Dataset
df = pd.read_csv("data.csv") # Replace with your dataset
# Display initial info
print("Initial Data Info:")
print(df.info())
# Handle Missing Values
# Fill gaps in numeric columns with the column mean (text columns are skipped)
df.fillna(df.mean(numeric_only=True), inplace=True)
# Remove Duplicates
df.drop_duplicates(inplace=True)
# Encode Categorical Variables
encoder = LabelEncoder()
for col in df.select_dtypes(include=['object']).columns:
    # astype(str) turns any remaining missing values into their own category
    df[col] = encoder.fit_transform(df[col].astype(str))
# Scale Numerical Data
scaler = StandardScaler()
df[df.columns] = scaler.fit_transform(df)
# Save cleaned dataset
df.to_csv("cleaned_data.csv", index=False)
print("Dataset cleaned and saved as cleaned_data.csv")
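The scaling step rescales each column to zero mean and unit variance. As a rough illustration of what happens to a single column, the sketch below computes the same z-scores in plain Python using the population standard deviation. The function name standardize is hypothetical, and returning zeros for a constant column is a choice made for this sketch.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # Population standard deviation (divides by n)
    if sigma == 0:
        # Constant column: every value equals the mean, so map to 0.0
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
```

For the column [1, 2, 3], the mean is 2 and the middle value maps to 0.0, with the other two values placed symmetrically around it.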
27. Write a Python script to perform exploratory data analysis (EDA) on a dataset of
your choice.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Load Dataset
df = pd.read_csv("data.csv") # Replace with your dataset
# Display basic statistics
print(df.describe())
# Check missing values
print("\nMissing Values:\n", df.isnull().sum())
# Plot Correlation Heatmap
plt.figure(figsize=(10,6))
# numeric_only=True skips text columns, which would otherwise raise an error
sns.heatmap(df.corr(numeric_only=True), annot=True, cmap='coolwarm', fmt=".2f")
plt.title("Feature Correlation Heatmap")
plt.show()
# Histograms of All Numeric Features
df.hist(figsize=(10,8))
plt.show()
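Each cell of the correlation heatmap is a Pearson correlation coefficient between two columns (Pearson is the default method of df.corr()). The sketch below computes that coefficient for two plain Python lists. The function name pearson is hypothetical, and returning NaN for a constant column is a choice made for this sketch.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return float("nan")  # Undefined when a column is constant
    return cov / (spread_x * spread_y)
```

Values near 1 or -1 indicate a strong linear relationship; values near 0 indicate little linear relationship.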
28. Write a program to check the strength of a password.
import re
def check_password_strength(password):
    if len(password) < 8:
        return "Weak: Password should be at least 8 characters long."
    if not re.search(r"[A-Z]", password):
        return "Weak: Include at least one uppercase letter."
    if not re.search(r"\d", password):
        return "Weak: Include at least one digit."
    if not re.search(r"[!@#$%^&*]", password):
        return "Weak: Include at least one special character."
    return "Strong Password!"
password = input("Enter your password: ")
print(check_password_strength(password))
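The function above stops at the first failed rule, so the user sees only one problem at a time. A possible variant, sketched below, applies the same four rules but collects every unmet requirement in one pass. The names RULES and missing_requirements are hypothetical.

```python
import re

# Each rule pairs a check with a description of the requirement.
# The rules mirror the four checks in the program above.
RULES = [
    (lambda p: len(p) >= 8, "at least 8 characters"),
    (lambda p: re.search(r"[A-Z]", p) is not None, "an uppercase letter"),
    (lambda p: re.search(r"\d", p) is not None, "a digit"),
    (lambda p: re.search(r"[!@#$%^&*]", p) is not None,
     "a special character (!@#$%^&*)"),
]


def missing_requirements(password):
    """Return descriptions of all rules the password fails (empty if none)."""
    return [description for check, description in RULES if not check(password)]
```

An empty list means the password satisfies every rule; otherwise the list can be printed so all issues are fixed at once.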
29. Download a stock's price data from Yahoo Finance in Python, compute the Simple
Moving Average, and print the first 100 values along with the related graph.
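import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
# Download stock data
stock = yf.download('TATAMOTORS.NS', start='2019-01-01', end='2024-01-01')
# Some yfinance versions return two-level column names (field, ticker); flatten them
if isinstance(stock.columns, pd.MultiIndex):
    stock.columns = stock.columns.get_level_values(0)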
# Calculate Simple Moving Average (SMA)
stock['SMA_50'] = stock['Close'].rolling(window=50).mean()
# Print first 100 values
print(stock[['Close', 'SMA_50']].head(100))
# Plot SMA
plt.figure(figsize=(10,5))
plt.plot(stock['Close'], label="Close Price")
plt.plot(stock['SMA_50'], label="50-Day SMA", linestyle="dashed")
plt.legend()
plt.title("Stock Price and 50-Day SMA")
plt.show()
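A simple moving average is the mean of the most recent N closing prices, recomputed for each day. The sketch below shows the calculation on a plain list, without pandas. The function name simple_moving_average is hypothetical, and None stands in for the NaN values that rolling(...).mean() produces before a full window is available.

```python
def simple_moving_average(prices, window):
    """Return the moving average of prices; None until the window is full."""
    if window <= 0:
        raise ValueError("window must be a positive integer")
    result = []
    running_sum = 0.0
    for i, price in enumerate(prices):
        running_sum += price
        if i >= window:
            running_sum -= prices[i - window]  # Drop the value leaving the window
        if i >= window - 1:
            result.append(running_sum / window)
        else:
            result.append(None)  # Not enough data points yet
    return result
```

This also explains why the first 49 SMA_50 values in the printed table are empty: a 50-day average needs 50 days of prices.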
30. Download a stock's price data from Yahoo Finance in Python and draw the graphs
needed to present the Open, Close, High, Low, Adj. Close, and Volume for five years of data.
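import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
# Download stock data for the last 5 years
# auto_adjust=False keeps the separate 'Adj Close' column, which some
# yfinance versions omit by default
stock = yf.download('TATAMOTORS.NS', period='5y', auto_adjust=False)
# Some yfinance versions return two-level column names (field, ticker); flatten them
if isinstance(stock.columns, pd.MultiIndex):
    stock.columns = stock.columns.get_level_values(0)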
# Plot different stock parameters
plt.figure(figsize=(12,6))
plt.subplot(2, 3, 1)
plt.plot(stock['Open'], label='Open Price')
plt.title("Open Price")
plt.subplot(2, 3, 2)
plt.plot(stock['High'], label='High Price', color='g')
plt.title("High Price")
plt.subplot(2, 3, 3)
plt.plot(stock['Low'], label='Low Price', color='r')
plt.title("Low Price")
plt.subplot(2, 3, 4)
plt.plot(stock['Close'], label='Close Price', color='b')
plt.title("Close Price")
plt.subplot(2, 3, 5)
plt.plot(stock['Adj Close'], label='Adjusted Close', color='purple')
plt.title("Adj Close Price")
plt.subplot(2, 3, 6)
plt.bar(stock.index, stock['Volume'], color='gray')
plt.title("Trading Volume")
plt.tight_layout()
plt.show()