DataAnalytics Lab Manual

The document is a lab manual for the Data Analytics Lab course at Ellenki College of Engineering and Technology for the academic year 2024-25. It outlines the course objectives, outcomes, and a list of experiments covering data preprocessing, regression models, decision trees, random forests, ARIMA, and visualization techniques. Additionally, it includes programming examples and references for textbooks and software used in the course.


ELLENKI COLLEGE OF ENGINEERING AND TECHNOLOGY

(Autonomous Institution - UGC, Govt. of India)

(Sponsored by Ellenki Educational Society)
Patelguda, Sangareddy Dist. Hyderabad.
Approved by AICTE & Affiliated to JNTUH, Accredited by NAAC, Recognition of 2(f), UGC,
MSME-HI

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

(AI&ML)

III B.Tech II Semester

Subject Name: DATA ANALYTICS LAB

Lab Manual

Academic Year: 2024-25

AM605PC: DATA ANALYTICS LAB
B.Tech. III Year II Sem.
Course Objectives:
- To explore the fundamental concepts of data analytics.
- To learn the principles and methods of statistical analysis.
- To discover interesting patterns, analyze supervised and unsupervised models, and estimate the accuracy of the algorithms.
- To understand the various search methods and visualization techniques.
Course Outcomes:
- Understand linear regression and logistic regression.
- Understand the functionality of different classifiers.
- Implement visualization techniques using different graphs.
- Apply descriptive and predictive analytics for different types of data.

List of Experiments:
1. Data Preprocessing
a. Handling missing values
b. Noise detection and removal
c. Identifying data redundancy and elimination
2. Implement any one imputation model
3. Implement Linear Regression
4. Implement Logistic Regression
5. Implement Decision Tree Induction for classification
6. Implement Random Forest Classifier
7. Implement ARIMA on Time Series data
8. Object segmentation using hierarchical based methods
9. Perform Visualization techniques (types of maps - Bar, Column, Line, Scatter, 3D Cubes etc.)
10. Perform Descriptive analytics on healthcare data
11. Perform Predictive analytics on Product Sales data
12. Apply Predictive analytics for Weather forecasting
TEXT BOOKS:
1. Student’s Handbook for Associate Analytics – II, III.
2. Data Mining Concepts and Techniques, Han, Kamber, 3rd Edition, Morgan Kaufmann
Publishers.

REFERENCE BOOKS:
1. Introduction to Data Mining, Tan, Steinbach and Kumar, Addison Wesley, 2006.
2. Data Mining Analysis and Concepts, M. Zaki and W. Meira
3. Mining of Massive Datasets, Jure Leskovec (Stanford Univ.), Anand Rajaraman (Milliway Labs), Jeffrey D. Ullman (Stanford Univ.)

SOFTWARE:
1. Python IDLE
2. PyCharm
3. Visual Studio

1. Data Preprocessing
a. Handling missing values
b. Noise detection and removal
c. Identifying data redundancy and elimination
a. Handling missing values
PROGRAM:
import pandas as pd
import numpy as np
# Sample data with missing values
data = {
'A': [1, 2, np.nan, 4, 5],
'B': [np.nan, 2, 3, np.nan, 5],
'C': ['foo', 'bar', 'baz', np.nan, 'qux']
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Method 1: Remove rows with any missing values
df_dropna = df.dropna()
print("\nDataFrame after removing rows with any missing values:")
print(df_dropna)
# Method 2: Fill missing values with a specific value (e.g., 0)
df_fillna = df.fillna(0)
print("\nDataFrame after filling missing values with 0:")
print(df_fillna)

# Method 3: Fill missing values with the mean of the column (numerical columns only)
df_mean = df.copy()
df_mean['A'] = df_mean['A'].fillna(df_mean['A'].mean())
df_mean['B'] = df_mean['B'].fillna(df_mean['B'].mean())

print("\nDataFrame after filling missing values with the mean of the column:")
print(df_mean)
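# Method 4: Fill missing values using forward fill
# (fillna(method='ffill') is deprecated in recent pandas; ffill() is the equivalent call)
df_ffill = df.ffill()
print("\nDataFrame after forward fill:")
print(df_ffill)
# Method 5: Fill missing values using backward fill
df_bfill = df.bfill()
print("\nDataFrame after backward fill:")
print(df_bfill)
# Method 6: Interpolation for numerical columns only
# (recent pandas versions warn or raise when interpolating the text column 'C')
df_interp = df.copy()
num_cols = df_interp.select_dtypes(include='number').columns
df_interp[num_cols] = df_interp[num_cols].interpolate()
print("\nDataFrame after interpolation:")
print(df_interp)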

output:
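
Before choosing among the methods in the program above, it can help to see how much is missing in each column. The following is a minimal sketch on the same sample data; the names `missing_report` and `df_median` are illustrative and not part of the manual's program, and median filling is shown only as one possible alternative to the mean.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, np.nan, 4, 5],
    'B': [np.nan, 2, 3, np.nan, 5],
    'C': ['foo', 'bar', 'baz', np.nan, 'qux'],
})

# Count and percentage of missing values per column
missing_report = pd.DataFrame({
    'missing': df.isna().sum(),
    'percent': df.isna().mean() * 100,
})
print(missing_report)

# Median fill: less sensitive to extreme values than the mean
df_median = df.copy()
for col in ['A', 'B']:
    df_median[col] = df_median[col].fillna(df_median[col].median())
print(df_median)
```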

b. Noise detection and removal
PROGRAM:
import pandas as pd
import numpy as np
from sklearn.ensemble import IsolationForest
# Sample data with noise
data = {
'A': [1, 2, 3, 4, 100, 6, 7, 8, 9, 10],
'B': [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)
# Detect noise using Isolation Forest (random_state fixed so the result is reproducible)
iso_forest = IsolationForest(contamination=0.1, random_state=42)
df['anomaly'] = iso_forest.fit_predict(df[['A', 'B']])
# Remove noise
df_clean = df[df['anomaly'] == 1].drop(columns=['anomaly'])
print("\nDataFrame after noise removal:")
print(df_clean)

OUTPUT:
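
As a simpler point of comparison with Isolation Forest, the interquartile-range (IQR) rule flags values that lie far outside the middle half of a column. This is a sketch on the same sample data, not the manual's method; the 1.5 multiplier is the conventional choice and the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 100, 6, 7, 8, 9, 10],
    'B': [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
})

# IQR rule: flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] in each column
q1 = df.quantile(0.25)
q3 = df.quantile(0.75)
iqr = q3 - q1
is_outlier = (df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)

# Keep only rows in which no column is flagged
df_iqr_clean = df[~is_outlier.any(axis=1)]
print(df_iqr_clean)
```

Unlike Isolation Forest, this rule is deterministic and needs no contamination estimate, but it looks at each column separately.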

c. Identifying data redundancy and elimination

PROGRAM:
import pandas as pd
import numpy as np
# Sample data with redundant rows and columns
data = {
'A': [1, 2, 2, 4, 5],
'B': [10, 20, 20, 40, 50],
'C': [1, 2, 2, 4, 5] # Duplicate of column 'A'
}
df = pd.DataFrame(data)
print("Original DataFrame:")
print(df)

# Remove duplicate rows
df_no_duplicates = df.drop_duplicates()
print("\nDataFrame after removing duplicate rows:")
print(df_no_duplicates)
# Remove duplicate columns (columns with identical values, e.g. 'C' duplicates 'A').
# columns.duplicated() only compares column names, so compare values via the transpose.
df_no_duplicates = df_no_duplicates.loc[:, ~df_no_duplicates.T.duplicated()]
print("\nDataFrame after removing duplicate columns:")
print(df_no_duplicates)
# Calculate correlation matrix and drop highly correlated columns
correlation_matrix = df_no_duplicates.corr().abs()
upper = correlation_matrix.where(
    np.triu(np.ones(correlation_matrix.shape), k=1).astype(bool))
to_drop = [column for column in upper.columns if any(upper[column] > 0.9)]
df_no_redundant_columns = df_no_duplicates.drop(columns=to_drop)
print("\nDataFrame after removing highly correlated columns:")
print(df_no_redundant_columns)

output:

2. Implement any one imputation model
PROGRAM:
import pandas as pd
from sklearn.impute import SimpleImputer
# Sample DataFrame with missing values
data = {
'A': [1, 2, None, 4, 5],
'B': [None, 2, 3, 4, None],
'C': [1, 2, 3, 4, 5]
}
df = pd.DataFrame(data)
# Display original DataFrame
print("Original DataFrame:\n", df)
# Mean Imputation using SimpleImputer
mean_imputer = SimpleImputer(strategy='mean')
df_mean_imputed = pd.DataFrame(mean_imputer.fit_transform(df), columns=df.columns)
# Display DataFrame after Mean Imputation
print("\nDataFrame After Mean Imputation:\n", df_mean_imputed)
OUTPUT:
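
Mean imputation ignores the other columns in a row. As one alternative imputation model, scikit-learn's `KNNImputer` fills a missing value from the rows that are closest on the observed columns. This is a sketch on the same sample data; `n_neighbors=2` is an arbitrary choice for such a small table.

```python
import pandas as pd
from sklearn.impute import KNNImputer

df = pd.DataFrame({
    'A': [1, 2, None, 4, 5],
    'B': [None, 2, 3, 4, None],
    'C': [1, 2, 3, 4, 5],
})

# Each missing value is replaced by the average of that column
# over the 2 nearest rows (distance computed on the non-missing columns)
knn_imputer = KNNImputer(n_neighbors=2)
df_knn = pd.DataFrame(knn_imputer.fit_transform(df), columns=df.columns)
print(df_knn)
```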

3. Implement Linear Regression
PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Sample DataFrame
data = {
'Feature1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Feature2': [2, 4, 6, 8, 10, 12, 14, 16, 18, 20],
'Target': [2.5, 3.5, 6, 7, 8.5, 10, 11.5, 13, 14.5, 16]
}
df = pd.DataFrame(data)
# Define features and target variable
X = df[['Feature1', 'Feature2']]
y = df['Target']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)
# Make predictions on the testing set
y_pred = model.predict(X_test)
# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared:", r2)

# Visualize the results
plt.scatter(X_test['Feature1'], y_test, color='blue', label='Actual')
plt.scatter(X_test['Feature1'], y_pred, color='red', label='Predicted')
plt.xlabel('Feature1')
plt.ylabel('Target')
plt.title('Linear Regression Results')
plt.legend()
plt.show()
output:
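
To see what a linear regression fit computes, the least-squares line can be obtained directly with NumPy. In the sample data Feature2 is exactly twice Feature1, so it carries no extra information; this sketch therefore fits Target against Feature1 alone, using all ten rows rather than a train/test split.

```python
import numpy as np

x = np.arange(1, 11, dtype=float)  # Feature1
y = np.array([2.5, 3.5, 6, 7, 8.5, 10, 11.5, 13, 14.5, 16])  # Target

# Design matrix with a column of ones for the intercept
X = np.column_stack([np.ones_like(x), x])

# Least-squares solution of X @ [intercept, slope] ~= y
(intercept, slope), *_ = np.linalg.lstsq(X, y, rcond=None)

# R-squared: share of the variance in y explained by the fitted line
y_hat = intercept + slope * x
r2 = 1 - ((y - y_hat) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print("Intercept:", intercept)
print("Slope:", slope)
print("R-squared:", r2)
```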

4. Implement Logistic Regression
PROGRAM:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
# Sample DataFrame
data = {
'Hours_Studied': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50],
'Attendance': [1, 0, 1, 1, 0, 0, 1, 1, 0, 1],
'Passed': [0, 0, 1, 1, 0, 0, 1, 1, 0, 1] # 1: Passed, 0: Failed
}
df = pd.DataFrame(data)
# Define features and target variable
X = df[['Hours_Studied', 'Attendance']]
y = df['Passed']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)
# Make predictions on the testing set
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)

print("Classification Report:\n", class_report)
# Visualize the results (Optional)
import matplotlib.pyplot as plt
plt.scatter(df['Hours_Studied'], df['Passed'], color='blue', label='Actual')
plt.scatter(X_test['Hours_Studied'], y_pred, color='red', label='Predicted')
plt.xlabel('Hours Studied')
plt.ylabel('Passed')
plt.title('Logistic Regression Results')
plt.legend()
plt.show()

output:
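
Logistic regression turns a linear score into a probability through the sigmoid function. The sketch below fits the model on all ten sample rows (no train/test split, for illustration only) and checks that applying the sigmoid to `decision_function` gives the same values as `predict_proba`.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    'Hours_Studied': [5, 10, 15, 20, 25, 30, 35, 40, 45, 50],
    'Attendance': [1, 0, 1, 1, 0, 0, 1, 1, 0, 1],
    'Passed': [0, 0, 1, 1, 0, 0, 1, 1, 0, 1],
})
X = df[['Hours_Studied', 'Attendance']]
y = df['Passed']

model = LogisticRegression().fit(X, y)

# Linear score z = w . x + b for each row
z = model.decision_function(X)
# Sigmoid maps the score to P(Passed = 1)
p_manual = 1 / (1 + np.exp(-z))
p_model = model.predict_proba(X)[:, 1]
print(np.column_stack([p_manual, p_model]))
```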

5. Implement Decision Tree Induction for classification
PROGRAM:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt
from sklearn.tree import plot_tree
# Load the Iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='species')
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the Decision Tree model
model = DecisionTreeClassifier(random_state=42)
model.fit(X_train, y_train)
# Make predictions on the testing set
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", class_report)
# Visualize the Decision Tree
plt.figure(figsize=(15, 10))
# class_names is passed as a list; some scikit-learn versions reject a NumPy array here
plot_tree(model, feature_names=iris.feature_names,
          class_names=list(iris.target_names), filled=True)
plt.title('Decision Tree for Iris Classification')
plt.show()

output:
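
Besides the plotted tree, the learned rules can be printed as text with scikit-learn's `export_text`. This sketch trains on the full Iris data and limits the depth to 2 purely to keep the printout short; both choices are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rule set small
model = DecisionTreeClassifier(max_depth=2, random_state=42)
model.fit(iris.data, iris.target)

# Each indented line is a split condition; leaves show the predicted class
rules = export_text(model, feature_names=list(iris.feature_names))
print(rules)
```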

6. Implement Random Forest Classifier
PROGRAM:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.datasets import load_iris
# Load the Iris dataset
iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)
y = pd.Series(iris.target, name='species')
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the Random Forest model
model = RandomForestClassifier(random_state=42, n_estimators=100)
model.fit(X_train, y_train)
# Make predictions on the testing set
y_pred = model.predict(X_test)
# Evaluate the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
class_report = classification_report(y_test, y_pred)
print("Accuracy:", accuracy)
print("Confusion Matrix:\n", conf_matrix)
print("Classification Report:\n", class_report)

output:
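
A fitted random forest also reports how much each feature contributed to its splits through the `feature_importances_` attribute. The sketch below fits on the full Iris data (no split, for illustration only) and ranks the four features.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X, iris.target)

# Impurity-based importances; they are normalized to sum to 1
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```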

7. Implement ARIMA on Time Series data
PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf
# Sample time series data
data = {
'Date': pd.date_range(start='2022-01-01', periods=24, freq='MS'),  # month-start dates; the 'M' alias is deprecated in pandas 2.2+
'Value': [120, 130, 135, 140, 150, 160, 170, 175, 180, 190, 200, 210,
220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330]
}
df = pd.DataFrame(data)
df.set_index('Date', inplace=True)
# Plot the time series data
plt.figure(figsize=(10, 6))
plt.plot(df, label='Original Data')
plt.xlabel('Date')
plt.ylabel('Value')
plt.title('Time Series Data')
plt.legend()
plt.show()
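# Plot ACF and PACF
# Recent statsmodels versions only compute PACF for lags below 50% of the sample size,
# so lags=20 raises a ValueError with 24 observations; 10 lags are used for both plots.
fig, axes = plt.subplots(1, 2, figsize=(16, 6))
plot_acf(df['Value'], lags=10, ax=axes[0])
plot_pacf(df['Value'], lags=10, ax=axes[1])
plt.show()
# Fit ARIMA model
model = ARIMA(df['Value'], order=(2, 1, 2))
fit = model.fit()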
# Summary of the model
print(fit.summary())
# Forecasting
forecast = fit.forecast(steps=12)
# forecast is a Series already indexed by the next 12 dates. Wrapping it in a new
# DataFrame with a different index and column name would misalign it and give NaN values.
forecast_df = forecast.to_frame(name='Forecast')
# Plot the forecast
plt.figure(figsize=(10, 6))
plt.plot(df, label='Original Data')
plt.plot(forecast_df, label='Forecast', color='red')
plt.xlabel('Date')
plt.ylabel('Value')
plt.title('ARIMA Forecast')
plt.legend()
plt.show()

output:
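
The middle term of `order=(2, 1, 2)` is the differencing order d = 1: the model is fitted to month-to-month changes rather than to the raw values. The sketch below uses pandas only to show what first differencing does to the sample series; it is an illustration of the idea, not a formal stationarity test.

```python
import pandas as pd

values = [120, 130, 135, 140, 150, 160, 170, 175, 180, 190, 200, 210,
          220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330]
series = pd.Series(values,
                   index=pd.date_range(start='2022-01-01', periods=24, freq='MS'))

# First difference: y[t] - y[t-1]; the first entry has no predecessor and is dropped
first_diff = series.diff().dropna()

print("Std of original series:", series.std())
print("Std of differenced series:", first_diff.std())
print(first_diff.value_counts())
```

The raw series climbs steadily, while the differenced series stays within a narrow band, which is the behaviour ARIMA's AR and MA terms are meant to model.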

8. Object segmentation using hierarchical based methods
PROGRAM:
import matplotlib.pyplot as plt
from skimage import data
from skimage.segmentation import felzenszwalb
from skimage.color import label2rgb
# Load a sample image
image = data.astronaut()
# Apply Felzenszwalb's Graph-Based Segmentation
segments_fz = felzenszwalb(image, scale=100, sigma=0.5, min_size=50)
# Create an overlay of the original image and the segmented image
segmented_image = label2rgb(segments_fz, image, kind='avg')
# Plot the results
fig, ax = plt.subplots(1, 2, figsize=(15, 10), sharex=True, sharey=True)
ax[0].imshow(image)
ax[0].set_title('Original Image')
ax[0].axis('off')
ax[1].imshow(segmented_image)
ax[1].set_title('Felzenszwalb Segmentation')
ax[1].axis('off')
plt.tight_layout()
plt.show()
output:
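
The program above uses Felzenszwalb's graph-based segmentation, which merges neighbouring image regions. For comparison, the sketch below shows classical agglomerative (bottom-up) hierarchical clustering with SciPy on six made-up 2D points; the points, the Ward linkage, and the choice of two clusters are all assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical points forming two well-separated groups
points = np.array([
    [1.0, 1.0], [1.2, 0.9], [0.8, 1.1],
    [5.0, 5.0], [5.1, 4.8], [4.9, 5.2],
])

# Build the merge hierarchy bottom-up using Ward's criterion
Z = linkage(points, method='ward')

# Cut the hierarchy so that at most 2 clusters remain
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```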

9. Perform Visualization techniques (types of maps - Bar, Column, Line, Scatter, 3D Cubes etc.)
PROGRAM:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
# Sample data for bar and column charts
categories = ['A', 'B', 'C', 'D']
values = [10, 20, 15, 25]
# Sample data for line chart and scatter plot
x = [1, 2, 3, 4, 5]
y = [2, 3, 5, 7, 11]
# Sample data for 3D plot
z = [1, 4, 9, 16, 25]
# Plotting Bar Chart (horizontal bars, to distinguish it from the vertical column chart)
plt.figure(figsize=(14, 10))
plt.subplot(2, 2, 1)
plt.barh(categories, values, color='blue')
plt.xlabel('Values')
plt.ylabel('Categories')
plt.title('Bar Chart')
# Plotting Column Chart
plt.subplot(2, 2, 2)
plt.bar(categories, values, color='green')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Column Chart')
# Plotting Line Chart
plt.subplot(2, 2, 3)
plt.plot(x, y, marker='o', linestyle='-', color='red')
plt.xlabel('X-axis')

plt.ylabel('Y-axis')
plt.title('Line Chart')
# Plotting Scatter Plot
plt.subplot(2, 2, 4)
plt.scatter(x, y, color='purple')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.title('Scatter Plot')
plt.tight_layout()
plt.show()
# Plotting 3D Plot
fig = plt.figure(figsize=(10, 7))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(x, y, z, color='orange')
ax.set_xlabel('X-axis')
ax.set_ylabel('Y-axis')
ax.set_zlabel('Z-axis')
ax.set_title('3D Scatter Plot')
plt.show()

output:

10. Perform Descriptive analytics on healthcare data
PROGRAM:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Sample healthcare dataset
data = {
'PatientID': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
'Age': [25, 45, 35, 50, 65, 55, 40, 30, 70, 60],
'Gender': ['F', 'M', 'F', 'F', 'M', 'M', 'F', 'M', 'F', 'M'],
'BloodPressure': [120, 140, 130, 150, 160, 155, 135, 125, 165, 150],
'Cholesterol': [200, 220, 215, 250, 240, 230, 210, 205, 255, 245],
'Diabetes': ['No', 'Yes', 'No', 'No', 'Yes', 'Yes', 'No', 'No', 'Yes', 'Yes']
}
df = pd.DataFrame(data)
# Display basic statistics
print("Basic Statistics:")
print(df.describe())

# Count of gender
gender_count = df['Gender'].value_counts()
print("\nGender Count:")
print(gender_count)
# Count of diabetes status
diabetes_count = df['Diabetes'].value_counts()
print("\nDiabetes Count:")
print(diabetes_count)
# Plot Age distribution
plt.figure(figsize=(10, 6))
sns.histplot(df['Age'], bins=10, kde=True)
plt.xlabel('Age')
plt.title('Age Distribution')
plt.show()
# Plot Blood Pressure distribution by Gender
plt.figure(figsize=(10, 6))
sns.boxplot(x='Gender', y='BloodPressure', data=df)
plt.xlabel('Gender')
plt.ylabel('Blood Pressure')
plt.title('Blood Pressure Distribution by Gender')
plt.show()
# Plot Cholesterol levels by Diabetes status
plt.figure(figsize=(10, 6))
sns.boxplot(x='Diabetes', y='Cholesterol', data=df)
plt.xlabel('Diabetes')
plt.ylabel('Cholesterol')
plt.title('Cholesterol Levels by Diabetes Status')
plt.show()

output:
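
Descriptive analytics often compares groups numerically as well as visually. A minimal sketch on the same sample data: group means by diabetes status and the pairwise correlations of the numeric columns. The name `summary` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 45, 35, 50, 65, 55, 40, 30, 70, 60],
    'BloodPressure': [120, 140, 130, 150, 160, 155, 135, 125, 165, 150],
    'Cholesterol': [200, 220, 215, 250, 240, 230, 210, 205, 255, 245],
    'Diabetes': ['No', 'Yes', 'No', 'No', 'Yes', 'Yes', 'No', 'No', 'Yes', 'Yes'],
})

numeric_cols = ['Age', 'BloodPressure', 'Cholesterol']

# Mean of each numeric column within each diabetes group
summary = df.groupby('Diabetes')[numeric_cols].mean()
print(summary)

# Pairwise Pearson correlations between the numeric columns
print(df[numeric_cols].corr())
```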

11. Perform Predictive analytics on Product Sales data
PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

# Sample product sales data
data = {
    'Month': pd.date_range(start='2022-01-01', periods=12, freq='MS'),  # month-start dates; the 'M' alias is deprecated in pandas 2.2+
    'Sales': [1500, 1600, 1700, 1800, 1750, 1900, 2100, 2200, 2300, 2400, 2500, 2600]
}
df = pd.DataFrame(data)
# 'Month' is already a datetime column, so no conversion is needed
df['Month_Num'] = df['Month'].dt.month
# Plot the historical sales data
plt.figure(figsize=(10, 6))
plt.plot(df['Month'], df['Sales'], marker='o', linestyle='-', color='blue')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Historical Sales Data')
plt.grid(True)
plt.show()
# Define features and target variable
X = df[['Month_Num']]
y = df['Sales']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create and train the Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)
# Make predictions on the testing set
y_pred = model.predict(X_test)
# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Mean Squared Error:", mse)

print("R-squared:", r2)
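# Forecast future sales
future_months = pd.date_range(start='2023-01-01', periods=6, freq='MS')
# Continue the month counter past the 12 training months (13..18); reusing calendar
# month numbers 1..6 would only reproduce the fitted values for early 2022.
# A pandas Index has no reshape(), so build a DataFrame with the training column name.
future_X = pd.DataFrame({'Month_Num': np.arange(13, 19)})
future_sales = model.predict(future_X)
# Plot the forecasted sales
plt.figure(figsize=(10, 6))
plt.plot(df['Month'], df['Sales'], marker='o', linestyle='-', color='blue', label='Historical Sales')
plt.plot(future_months, future_sales, marker='o', linestyle='--', color='red',
         label='Forecasted Sales')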
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Sales Forecast')
plt.legend()
plt.grid(True)
plt.show()

output:
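
A random train/test split mixes past and future months. For forecasting, a chronological split is usually a fairer check: train on the earlier months and test on the latest ones. The sketch below shows this with `train_test_split(..., shuffle=False)` on the same sales figures; it is one possible evaluation setup, not the manual's.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    'Month_Num': range(1, 13),
    'Sales': [1500, 1600, 1700, 1800, 1750, 1900,
              2100, 2200, 2300, 2400, 2500, 2600],
})
X = df[['Month_Num']]
y = df['Sales']

# shuffle=False keeps row order, so the test set is the most recent months
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, shuffle=False)

print("Train months:", X_train['Month_Num'].tolist())
print("Test months:", X_test['Month_Num'].tolist())
```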

12. Apply Predictive analytics for Weather forecasting.
PROGRAM:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
# Sample weather dataset
data = {
'Day': pd.date_range(start='2022-01-01', periods=30, freq='D'),
'Temperature': [25, 26, 27, 25, 28, 29, 30, 28, 27, 26,
25, 24, 23, 26, 27, 28, 29, 27, 25, 26,
28, 29, 30, 31, 32, 31, 29, 28, 27, 26]
}
df = pd.DataFrame(data)
df['Day_Num'] = df['Day'].dt.dayofyear
# Plot the historical temperature data
plt.figure(figsize=(10, 6))

plt.plot(df['Day'], df['Temperature'], marker='o', linestyle='-', color='blue')
plt.xlabel('Day')
plt.ylabel('Temperature')
plt.title('Historical Temperature Data')
plt.grid(True)
plt.show()
# Define features and target variable
X = df[['Day_Num']]
y = df['Temperature']
# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and train the Linear Regression model
model = LinearRegression()
model.fit(X_train, y_train)
# Make predictions on the testing set
y_pred = model.predict(X_test)
# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print("Mean Squared Error:", mse)
print("R-squared:", r2)
# Forecast future temperatures
future_days = pd.date_range(start='2022-02-01', periods=7, freq='D')
# A pandas Index has no reshape(); build a DataFrame with the training column name instead
future_X = pd.DataFrame({'Day_Num': future_days.dayofyear})
future_temperatures = model.predict(future_X)
# Plot the forecasted temperatures
plt.figure(figsize=(10, 6))
plt.plot(df['Day'], df['Temperature'], marker='o', linestyle='-', color='blue',
         label='Historical Temperature')
plt.plot(future_days, future_temperatures, marker='o', linestyle='--', color='red',
         label='Forecasted Temperature')
plt.xlabel('Day')
plt.ylabel('Temperature')
plt.title('Temperature Forecast')
plt.legend()
plt.grid(True)
plt.show()

output:
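
Daily temperature does not follow a straight line in the day number, so a simple baseline is useful for judging the regression forecast. The sketch below computes a 7-day moving average on the same sample temperatures and uses its last value as a naive forecast; the window length and the name `baseline_forecast` are assumptions for illustration.

```python
import pandas as pd

temps = [25, 26, 27, 25, 28, 29, 30, 28, 27, 26,
         25, 24, 23, 26, 27, 28, 29, 27, 25, 26,
         28, 29, 30, 31, 32, 31, 29, 28, 27, 26]
series = pd.Series(temps,
                   index=pd.date_range(start='2022-01-01', periods=30, freq='D'))

# 7-day moving average; the first 6 days have no full window and are NaN
rolling_mean = series.rolling(window=7).mean()

# Naive baseline: assume the coming days look like the latest 7-day average
baseline_forecast = rolling_mean.iloc[-1]
print("Baseline forecast:", baseline_forecast)
```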

Manual Prepared by
Bhuvaneswari Beeram
Assistant Professor in ECET
