ADS Practical Exam Questions
Bowley's Skewness
Output:
Q1: 13.3475, Q2: 17.795, Q3: 24.127499999999998
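The three quartiles printed above are enough to evaluate Bowley's coefficient, (Q3 + Q1 - 2*Q2) / (Q3 - Q1). The following is a minimal sketch that assumes only those quartile values; the data set they came from is not shown here, and the function name bowley_skewness is illustrative.

```python
def bowley_skewness(q1, q2, q3):
    """Bowley's (quartile) skewness: (Q3 + Q1 - 2*Q2) / (Q3 - Q1)."""
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("Q1 equals Q3, so Bowley's skewness is undefined")
    return (q3 + q1 - 2 * q2) / iqr

# Quartiles copied from the output above (assumed to be Q1, Q2, Q3 of the exam data)
q1, q2, q3 = 13.3475, 17.795, 24.127499999999998
bowley = bowley_skewness(q1, q2, q3)
print(f"Bowley's Skewness: {bowley:.4f}")
```

A positive value means the upper quartile lies farther from the median than the lower quartile does, which indicates a right-skewed distribution.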
Output:
Skewness: 1.1262346334818638
Kurtosis: 1.1691681323851366
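The code that produced the skewness and kurtosis figures above is not included, so the following is only a sketch of one common way to compute them. It assumes moment-based (population) formulas and reports excess kurtosis (normal distribution = 0); whether the output above uses this convention is not stated. The sample list is hypothetical.

```python
def central_moment(data, k):
    """k-th central moment with a 1/n divisor (population moment)."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** k for x in data) / n

def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Excess kurtosis: m4 / m2**2 - 3."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical sample, NOT the data behind the output above
sample = [12, 15, 14, 10, 18, 45, 13, 16]
print(f"Skewness: {skewness(sample)}")
print(f"Kurtosis: {excess_kurtosis(sample)}")
```

Library routines may apply bias corrections or report Pearson kurtosis (normal = 3) instead, so results can differ from this sketch for the same data.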
6. Poisson Distribution
# The annual number of industrial accidents occurring in a particular manufacturing
# plant is known to follow a Poisson distribution with mean 12.
# a) What is the probability of observing exactly 5 accidents at this plant during
#    the coming year?
# b) What is the probability of observing not more than 12 accidents at this plant
#    during the coming year?
# c) What is the probability of observing at least 15 accidents at this plant during
#    the coming year?
# d) What is the probability of observing between 10 and 15 accidents (inclusive) at
#    this plant during the coming year?
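No solution code accompanies the Poisson question, so here is one possible sketch using only the standard library. The helper names poisson_pmf and poisson_cdf are illustrative; the only input taken from the question is the mean of 12.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam ** k / factorial(k)

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(i, lam) for i in range(k + 1))

lam = 12  # mean number of accidents per year, from the question

p_a = poisson_pmf(5, lam)                          # a) exactly 5
p_b = poisson_cdf(12, lam)                         # b) not more than 12
p_c = 1 - poisson_cdf(14, lam)                     # c) at least 15
p_d = poisson_cdf(15, lam) - poisson_cdf(9, lam)   # d) between 10 and 15 inclusive

print(f"a) P(X = 5): {p_a:.4f}")
print(f"b) P(X <= 12): {p_b:.4f}")
print(f"c) P(X >= 15): {p_c:.4f}")
print(f"d) P(10 <= X <= 15): {p_d:.4f}")
```

If SciPy is available, scipy.stats.poisson provides pmf, cdf and sf methods that should give the same probabilities.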
7. Z-Test
# Question: A soft drink company claims that the mean sugar content in its soda
# bottles is 40 grams.
# To test this claim, a random sample of 25 bottles is selected, and the sugar
# content is measured.
# The sample mean is found to be 38 grams with a standard deviation of 4 grams.
# Conduct a Z-test at a significance level of 0.05 to determine if there is enough
# evidence to reject the company's claim.
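The question above can be worked through with the standard library alone. This sketch assumes a two-tailed test (the claim is "mean equals 40") and, because the question asks for a Z-test, treats the stated standard deviation of 4 grams as if it were the population value. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 40          # claimed mean sugar content (grams)
sample_mean = 38  # observed sample mean (grams)
sd = 4            # standard deviation (grams), treated as known
n = 25            # sample size
alpha = 0.05      # significance level

# Test statistic: distance of the sample mean from the claim, in standard errors
z = (sample_mean - mu0) / (sd / sqrt(n))

# Two-tailed p-value and critical value from the standard normal distribution
p_value = 2 * NormalDist().cdf(-abs(z))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject = abs(z) > z_crit

print(f"Z statistic: {z:.2f}")
print(f"Critical value: {z_crit:.2f}")
print(f"p-value: {p_value:.4f}")
print("Reject the claim" if reject else "Fail to reject the claim")
```

Here z = (38 - 40) / (4 / 5) = -2.5, which lies beyond the critical value of about 1.96, so under these assumptions the claim of a 40 gram mean is rejected at the 0.05 level.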
8. Outlier Detection
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.neighbors import LocalOutlierFactor
iris = load_iris()
X = iris.data
y = iris.target
lof = LocalOutlierFactor(n_neighbors=20, contamination=0.1)
y_pred = lof.fit_predict(X)
plt.figure(figsize=(12, 6))
plt.scatter(X[:, 0], X[:, 1], c=y_pred, cmap='viridis')
plt.colorbar()
plt.title('Outlier Detection with Local Outlier Factor (LOF)')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Sepal Width (cm)')
plt.show()
9. Performance Evaluation
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             mean_absolute_percentage_error)
diabetes = load_diabetes()
X = diabetes.data
y = diabetes.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
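# RMSE as the square root of MSE (the squared=False argument is deprecated in
# recent scikit-learn releases)
rmse = mean_squared_error(y_test, y_pred) ** 0.5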
mape = mean_absolute_percentage_error(y_test, y_pred)
# MASE: model MAE scaled by the MAE of a naive "previous value" forecast on y_test
mase = mae / (sum(abs(y_test[i + 1] - y_test[i]) for i in range(len(y_test) - 1)) /
              (len(y_test) - 1))
print(f"Mean Absolute Error (MAE): {mae:.2f}")
print(f"Root Mean Square Error (RMSE): {rmse:.2f}")
print(f"Mean Absolute Percentage Error (MAPE): {mape:.2f}")
print(f"Mean Absolute Scaled Error (MASE): {mase:.2f}")
Output:
Mean Absolute Error (MAE): 42.79
Root Mean Square Error (RMSE): 53.85
Mean Absolute Percentage Error (MAPE): 0.37
Mean Absolute Scaled Error (MASE): 0.53
10. ARIMA
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_squared_error
from math import sqrt
from statsmodels.datasets import get_rdataset
air_passengers = get_rdataset('AirPassengers', 'datasets').data['value']
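# Monthly index starting January 1949 ('MS' = month start; the 'M' alias is
# deprecated in recent pandas releases)
air_passengers.index = pd.date_range(start='1949-01-01',
                                     periods=len(air_passengers), freq='MS')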
model = ARIMA(air_passengers, order=(1, 1, 1))
model_fit = model.fit()
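# Forecast the 12 months after the end of the series; the typ argument belonged
# to the older ARIMA API and is not needed here
predictions = model_fit.predict(start=len(air_passengers),
                                end=len(air_passengers) + 11)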
plt.plot(air_passengers.index, air_passengers, label='Actual')
plt.plot(predictions.index, predictions, label='Predicted')
plt.title('ARIMA Model Forecasting')
plt.xlabel('Date')
plt.ylabel('Passengers')
plt.legend()
plt.show()
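# Note: the forecasts cover the 12 months after the series ends, so this compares
# them with the last 12 observed months; it is not a hold-out evaluation.
rmse = sqrt(mean_squared_error(air_passengers.iloc[-12:], predictions))
print(f'Root Mean Squared Error (RMSE): {rmse:.2f}')
Output:
Root Mean Squared Error (RMSE): 76.29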