
CAT1

01 December 2024 17:38

1. Association Rule Mining – Apriori Algorithm


From <https://www.youtube.com/watch?v=43CMKRHdH30&ab_channel=MaheshHuddar>

1. Association Rule Mining – Apriori Algorithm - Numerical Example Solved by Mahesh Huddar

Confusion Matrix:

How to Find / Calculate F1 Score for Multi-Class Classification in Machine Learning by Dr. Mahesh Huddar

Confusion Matrix Solved Example Accuracy Precision Recall F1 Score Prevalence by Mahesh Huddar

// more than this is not taught in class



Confusion Matrix for Multiclass Classification Precision Recall Weighted F1 Score by Mahesh Huddar
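The videos above work these metrics out by hand. As a quick companion, a minimal sklearn sketch (my addition, with made-up labels, not the videos' numbers):

from sklearn.metrics import confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

y_true = [0, 1, 2, 2, 1, 0, 2, 1]  # hypothetical ground-truth labels
y_pred = [0, 2, 2, 2, 1, 0, 1, 1]  # hypothetical predictions

print(confusion_matrix(y_true, y_pred))  # rows = actual class, columns = predicted class
print(accuracy_score(y_true, y_pred))
# average='weighted' averages the per-class scores by class frequency,
# i.e. the weighted F1 discussed in the multiclass video above
print(precision_score(y_true, y_pred, average='weighted'))
print(recall_score(y_true, y_pred, average='weighted'))
print(f1_score(y_true, y_pred, average='weighted'))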

Linear Regression:

Linear Regression Algorithm – Solved Numerical Example in Machine Learning by Mahesh Huddar

Linear Regression Matrix Form | Solved Numerical Example using Matrix Method by Mahesh Huddar

Matrix method:

Multiple Linear Regression:

Multiple Linear Regression Solved Numerical Example in Machine Learning Data Mining by Mahesh Huddar
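A minimal numpy sketch of the matrix method above, i.e. the normal equation beta = (X^T X)^(-1) X^T y, on made-up data (my addition, not the videos' worked numbers):

import numpy as np

# Made-up data: 4 samples, 2 features
X = np.array([[1.0, 2.0],
              [2.0, 1.0],
              [3.0, 4.0],
              [4.0, 3.0]])
y = np.array([6.0, 5.0, 13.0, 12.0])

Xb = np.c_[np.ones(len(X)), X]  # prepend a column of ones for the intercept

# Normal equation: beta = (X^T X)^(-1) X^T y
beta = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y
print(beta)       # [intercept, coefficient 1, coefficient 2]
print(Xb @ beta)  # fitted values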

Feature Scaling:

1. Normalization:

   x' = (Xi - Xmin) / (Xmax - Xmin)

2. Standardization:

   x' = (Xi - Xmean) / s.d.

(s.d. = standard deviation of the feature; a code sketch of both scalers follows the preprocessing walkthrough below)

Codes:

This code is a step-by-step implementation of data preprocessing for a machine learning task using Python. Let me break it down:

1. Importing Libraries:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
○ numpy: Used for numerical computations.
○ pandas: Used for handling and analyzing datasets.
○ matplotlib.pyplot: Used for visualizing data.

2. Loading Data:
a = pd.read_csv('Data.csv')
x = a.iloc[:, :-1].values
y = a.iloc[:, -1].values
○ pd.read_csv('Data.csv'): Loads the dataset from a file named Data.csv.
○ x: Contains all columns except the last one (features).
○ y: Contains only the last column (target/output variable).

3. Handling Missing Data:


from sklearn.impute import SimpleImputer
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
imputer.fit(x[:, 1:3])
x[:, 1:3] = imputer.transform(x[:, 1:3])
○ SimpleImputer: Fills missing values in the dataset.
○ strategy='mean': Replaces missing values with the mean of the column.
○ This is applied to the 2nd and 3rd columns of x.

4. Encoding Categorical Data:


from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(y)
○ LabelEncoder: Converts categorical labels (in y) into numeric values.
○ Example: If y contains ['Yes', 'No', 'Yes'], it will be converted to [1, 0, 1].

5. Splitting the Dataset:


from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)
○ Splits the dataset into training and testing sets.
○ test_size=0.2: 20% of the data is used for testing.
○ random_state=1: Ensures the split is reproducible.

Summary:
• The code reads a dataset, cleans it (handles missing values), encodes categorical data, and splits it
into training and testing sets. These steps are essential for preparing data for machine learning
models.

From <https://chatgpt.com/c/674d069c-1af0-8007-b82d-198ffae1e8ab>
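Pulling the steps above together, plus the feature scaling formulas from earlier: a minimal end-to-end sketch, assuming a small made-up table in place of Data.csv (column names and values here are illustrative only):

import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.model_selection import train_test_split

# Made-up stand-in for Data.csv: two numeric features with gaps, a Yes/No target
a = pd.DataFrame({
    'Country': ['France', 'Spain', 'Germany', 'Spain', 'France', 'Germany'],
    'Age': [44.0, 27.0, np.nan, 38.0, 35.0, 48.0],
    'Salary': [72000.0, 48000.0, 54000.0, np.nan, 58000.0, 79000.0],
    'Purchased': ['No', 'Yes', 'No', 'No', 'Yes', 'Yes'],
})
x = a.iloc[:, :-1].values
y = a.iloc[:, -1].values

# Fill missing values in the 2nd and 3rd columns with the column mean
imputer = SimpleImputer(missing_values=np.nan, strategy='mean')
x[:, 1:3] = imputer.fit_transform(x[:, 1:3])

# Encode the Yes/No target as 1/0
y = LabelEncoder().fit_transform(y)

# Reproducible 80/20 split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)

# Standardization x' = (Xi - Xmean) / s.d., fit on the training set only;
# sklearn's MinMaxScaler would implement the normalization formula instead
sc = StandardScaler()
x_train[:, 1:3] = sc.fit_transform(x_train[:, 1:3])
x_test[:, 1:3] = sc.transform(x_test[:, 1:3])
print(x_train)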

This code snippet demonstrates how to preprocess a dataset (e.g., 50_Startups.csv) for machine learning. Below is a step-by-step explanation:

1. Importing Libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
• numpy: Used for numerical computations and matrix operations.
• matplotlib.pyplot: For data visualization (not used here, but imported).
• pandas: For handling datasets and data preprocessing.

2. Importing the Dataset


dataset = pd.read_csv('50_Startups.csv')
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values
• pd.read_csv: Reads the dataset from a CSV file named 50_Startups.csv.
• X: Independent variables (all columns except the last one).
• y: Dependent variable (the last column, assumed to be the output/target variable).
Example Values of X:
[[165349.2, 136897.8, 471784.1, 'New York'],
[162597.7, 151377.59, 443898.53, 'California'],
...]
• The last column ('New York', 'California', etc.) is categorical data.
Example Values of y:
[192261.83, 191792.06, ...]

3. Encoding Categorical Data


from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
ct = ColumnTransformer(
    transformers=[('encoder', OneHotEncoder(), [3])],
    remainder='passthrough'
)
X = np.array(ct.fit_transform(X))
Explanation:
• Categorical Data: The fourth column in X ([3]) contains city names ('New York', 'California', etc.),
which are categorical and need to be converted into numeric values for machine learning.
• OneHotEncoder: Converts categorical values into one-hot encoded vectors (binary columns).
○ Example:
'New York' -> [0, 0, 1]
'California' -> [1, 0, 0]
'Florida' -> [0, 1, 0]
• ColumnTransformer: Applies the transformation to the specified column(s) ([3]) while keeping the
rest of the dataset unchanged.
• remainder='passthrough': Leaves other columns as-is.
Example Output of Transformed X:
[[0.0, 0.0, 1.0, 165349.2, 136897.8, 471784.1],
[1.0, 0.0, 0.0, 162597.7, 151377.59, 443898.53],
[0.0, 1.0, 0.0, 153441.51, 101145.55, 487934.54],
...]
• The categorical column ('New York', 'California', etc.) is now represented as one-hot encoded
vectors, and the numerical columns remain unchanged.
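A self-contained version of the same transformation on three made-up rows (the numbers are the example values above), showing that the one-hot columns come first and the rest pass through unchanged:

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

X = np.array([[165349.2, 136897.8, 471784.1, 'New York'],
              [162597.7, 151377.59, 443898.53, 'California'],
              [153441.51, 101145.55, 487934.54, 'Florida']], dtype=object)

ct = ColumnTransformer(transformers=[('encoder', OneHotEncoder(), [3])],
                       remainder='passthrough')
X = np.array(ct.fit_transform(X))
print(X)
# Categories are sorted alphabetically, so California -> [1, 0, 0],
# Florida -> [0, 1, 0], New York -> [0, 0, 1], matching the output above.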

4. Key Observations
• Original Data Issues: There are some typos or incorrect values in the code snippet:
○ ydataset.iloc[:, -1].values should be y = dataset.iloc[:, -1].values.
○ The encoding output contains formatting errors (e.g., the 8.8 values appear to be typos for the 0.0 entries in the one-hot encoded data).
• The code correctly encodes the categorical column and prepares X for use in a machine learning
model.
If you'd like, I can clean this code further for better clarity. Let me know!

From <https://chatgpt.com/c/674d069c-1af0-8007-b82d-198ffae1e8ab>

The provided code implements Association Rule Learning using the Apriori algorithm to analyze
transactional data (e.g., market basket analysis). Here's an explanation of each part of the code:

1. Importing Libraries
!pip install apyori
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
• apyori: A library that implements the Apriori algorithm for discovering frequent itemsets and
association rules.
• numpy, pandas, matplotlib: Common Python libraries used for numerical calculations, data
manipulation, and visualization.

2. Data Preprocessing
dataset = pd.read_csv('Market_Basket_Optimisation.csv', header=None)
transactions = []
for i in range(0, 7501):
    transactions.append([str(dataset.values[i, j]) for j in range(0, 20)])
What this does:
• pd.read_csv:
○ Reads the dataset from the file Market_Basket_Optimisation.csv.
○ header=None indicates the dataset does not have column headers.
• Creating transactions:
○ Initializes an empty list to store all transactions.
○ Loops through 7501 rows (transactions) in the dataset.
○ For each row, loops through 20 columns (items in a transaction).
○ Appends each transaction as a list of items (converted to strings) to transactions.
Example Output of transactions:
[['milk', 'bread', 'butter'],
['tea', 'sugar', 'biscuits'],
...]
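One caveat worth noting (my addition): with the loop above, empty cells become the literal string 'nan'. A variant that drops them and avoids hardcoding the row and column counts:

transactions = []
for _, row in dataset.iterrows():
    # keep only real items; skip the NaN padding in short transactions
    transactions.append([str(item) for item in row if pd.notna(item)])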

3. Training the Apriori Model


from apyori import apriori
rules = apriori(transactions=transactions,
                min_support=0.003,
                min_confidence=0.2,
                min_lift=3,
                min_length=2,
                max_length=2)
Parameters Explained:
• transactions: The list of transactions from the preprocessing step.
• min_support=0.003:
○ An itemset must appear in at least 0.3% of transactions (7501 × 0.003 ≈ 23 transactions).
• min_confidence=0.2:
○ Rules must have at least 20% confidence (i.e., if X is bought, there's at least a 20% chance Y
is also bought).
• min_lift=3:
○ The rule's lift (strength compared to random chance) must be at least 3.
• min_length=2, max_length=2:
○ Only consider rules with exactly 2 items in the itemset.
Output:
• The rules object is a generator containing association rules that meet these criteria.

4. Visualizing the Results


results = list(rules)
results
• list(rules): Converts the generator into a list to view all discovered rules.
Example Output:
[RelationRecord(items=frozenset({'chicken', 'light cream'}),
support=0.00457, confidence=0.39, lift=4.84),
RelationRecord(items=frozenset({'olive oil', 'whole wheat pasta'}),
support=0.003, confidence=0.27, lift=3.12)]
Breakdown of a Rule:
RelationRecord(items=frozenset({'chicken', 'light cream'}),
support=0.00457, confidence=0.39, lift=4.84)
• Items: 'chicken' and 'light cream' are frequently bought together.
• Support: 0.00457 (i.e., ~0.457% of all transactions include both items).
• Confidence: 0.39 (i.e., if someone buys 'chicken', there's a 39% chance they also buy 'light cream').
• Lift: 4.84 (i.e., buying 'chicken' makes buying 'light cream' 4.84 times more likely than random
chance).
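In the real apyori output, confidence and lift live inside each record's ordered_statistics list (the printout above is simplified). A small sketch, assuming the results list from the previous step, to turn the rules into a readable table:

import pandas as pd

rows = []
for r in results:
    stat = r.ordered_statistics[0]  # first (LHS -> RHS) statistic of the rule
    rows.append({'lhs': ', '.join(stat.items_base),
                 'rhs': ', '.join(stat.items_add),
                 'support': r.support,
                 'confidence': stat.confidence,
                 'lift': stat.lift})

df = pd.DataFrame(rows).sort_values('lift', ascending=False)
print(df.head(10))  # strongest rules first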

Key Insights
• The Apriori algorithm helps identify frequent itemsets and association rules in the dataset.
• Businesses can use this analysis for:
○ Cross-selling (e.g., "People who bought this also bought that").
○ Optimized product placement in stores.
○ Recommendation systems in e-commerce.
Let me know if you'd like clarification on any part!

From <https://chatgpt.com/c/674d069c-1af0-8007-b82d-198ffae1e8ab>

