
Association Rules

Instructions:
Please share your answers filled in-line in the word document. Submit code separately wherever
applicable.

Please ensure you update all the details:


Name: Vinutha N Batch ID: DSWDE24062021
Topic: Association Rules

Grading Guidelines:
1. An assignment submission is considered complete only when correct and executable code(s) are
submitted along with the documentation explaining the method and results. Failing to submit either of
those will be considered an invalid submission and will not be considered for evaluation.
2. Assignments submitted after the deadline will affect your grades.

Grading:

Grade | Score | Ans (On time)          | Ans (Late)
A     | 100   | Correct                | -
B     | 85    | 80% & above            | Correct
C     | 75    | 50% & above            | 80% & above
D     | 65    | 50% & below            | 50% & above
E     | 55    | -                      | 50% & below
F     | 45    | Copied / No submission

● Grade A: (>= 90): When all assignments are submitted on or before the given deadline.
● Grade B: (>= 80 and < 90):
o When assignments are submitted on time but less than 80% of problems are completed.
(OR)
o All assignments are submitted after the deadline.

● Grade C: (>= 70 and < 80):


o When assignments are submitted on time but less than 50% of the problems are completed.
(OR)
o Less than 80% of problems in the assignments are submitted after the deadline.

● Grade D: (>= 60 and < 70):


o Assignments submitted after the deadline and with 50% or less problems.

● Grade E: (>= 50 and < 60):


o Less than 30% of problems in the assignments are submitted after the deadline.
(OR)
o Less than 30% of problems in the assignments are submitted before the deadline.

● Grade F: (< 50): No submission (or) malpractice.


Hints:

1. Business Problem
1.1. What is the business objective?
1.2. Are there any constraints?
2. Work on each feature of the dataset to create a data dictionary (feature name, description, type, relevance), as in the Data Dictionaries section at the end of this document.

3. Data Pre-processing
3.1 Data Cleaning, Feature Engineering, etc.
4. Model Building
4.1 Application of Apriori Algorithm
4.2 Build most frequent item sets and plot the rules
4.3 Work on Codes
5. Deployment
5.1 Deploy solutions
6. Write about the benefits/impact of the solution - in what way does the business
(client) benefit from the solution provided?
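
The rules in the solutions below are evaluated with three metrics: support (how often an itemset occurs), confidence (how often the consequent occurs given the antecedent) and lift (confidence relative to the consequent's baseline support). A quick illustration on made-up toy transactions (not from any of the assignment datasets):

# Toy illustration of support, confidence and lift for the rule {A} -> {B}.
# The five transactions below are invented for this example only.
transactions = [{"A", "B"}, {"A", "B", "C"}, {"A"}, {"B"}, {"C"}]
n = len(transactions)

support_A  = sum("A" in t for t in transactions) / n         # 3/5 = 0.6
support_B  = sum("B" in t for t in transactions) / n         # 3/5 = 0.6
support_AB = sum({"A", "B"} <= t for t in transactions) / n  # 2/5 = 0.4

confidence = support_AB / support_A   # P(B | A) = 2/3
lift = confidence / support_B         # ~1.11: A and B co-occur slightly
                                      # more often than by chance
print(support_AB, confidence, lift)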

Problem Statement: -
Kitabi Duniya, a famous bookstore in India established before Independence, grew incrementally year after year, but due to online book selling and widespread Internet access its annual growth collapsed into sharp declines. As a Data Scientist, help this heritage bookstore regain its popularity and increase customer footfall, and provide ways the business can improve exponentially: apply the Association Rules algorithm, explain the rules, and visualize the graphs for a clear understanding of the solution.
1.) Books.csv


Python code:

# import libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# loading the data set as raw text: each line is one transaction
with open("C:/Users/Vinutha/Downloads/Datasets(6)/book.csv") as f:
    book = f.read()

# splitting the data: one transaction per line, then one item per comma
book = book.split("\n")
book_list = []
for i in book:
    book_list.append(i.split(","))

# flatten the transactions into a single list of all items
all_book_list = [i for item in book_list for i in item]

# Counter gives the number of occurrences of each item
from collections import Counter

item_frequencies = Counter(all_book_list)

# sort items by frequency (ascending)
item_frequencies = sorted(item_frequencies.items(), key=lambda x: x[1])

# separate frequencies and item names, most frequent first
frequencies = list(reversed([i[1] for i in item_frequencies]))
items = list(reversed([i[0] for i in item_frequencies]))

# bar plot of the top 11 items
import matplotlib.pyplot as plt

plt.bar(height=frequencies[0:11], x=list(range(0, 11)), color=list("rgbkymc"))
plt.xticks(list(range(0, 11)), items[0:11])
plt.xlabel("items")
plt.ylabel("Count")
plt.show()

# creating a data frame of the transactions data
book_series = pd.DataFrame(pd.Series(book_list))
book_series = book_series.iloc[:2000, :]
book_series.columns = ["trans"]

# creating a dummy column for each item in each transaction, using the
# item names as column names
X = book_series["trans"].str.join(sep="*").str.get_dummies(sep="*")
frequent_itemsets = apriori(X, min_support=0.0075, max_len=4, use_colnames=True)

# most frequent item sets based on support
frequent_itemsets.sort_values("support", ascending=False, inplace=True)

plt.bar(x=list(range(0, 11)), height=frequent_itemsets.support[0:11], color=list("rgmyk"))
plt.xticks(list(range(0, 11)), frequent_itemsets.itemsets[0:11])
plt.xlabel("item-sets")
plt.ylabel("support")
plt.show()

rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head(10)
rules.sort_values("lift", ascending=False).head(10)
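
One optional post-processing step, not part of the original submission: association_rules reports mirrored rules (X -> Y and Y -> X) separately, so the list can be de-duplicated by keying each rule on the unordered set of items it involves. A sketch, assuming the rules data frame above:

# drop mirrored duplicates by keying each rule on the unordered set of all
# items it involves, keeping the variant with the highest lift
rules["itemset_key"] = rules.apply(
    lambda r: frozenset(r["antecedents"]) | frozenset(r["consequents"]), axis=1)
rules_unique = (rules.sort_values("lift", ascending=False)
                     .drop_duplicates("itemset_key")
                     .drop(columns="itemset_key"))
print(len(rules), "->", len(rules_unique), "rules after de-duplication")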

Exploratory Data Analysis:

1. Scatterplot of the rules with three different variables on the plot.

Insights:

· From the books dataset, 164 rules were created using the Apriori algorithm.

· Three different variables are plotted in the scatterplot: confidence, support and lift.

· From the visualization we can see that the lift ratio for the rules ranges from approximately 0 to 14.5.

· The rules are widely scattered, and more than 50 percent of them lie within a support range of up to 0.03.

· The rules with the highest lift lie within a support range of 0.025 to 0.03 and a confidence range of 0.6 to 0.7.
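
The scatterplot itself is not reproduced here. A minimal sketch that produces an equivalent plot from the rules data frame built above, assuming only pandas and matplotlib (support on the x-axis, confidence on the y-axis, lift mapped to colour):

import matplotlib.pyplot as plt

# support vs. confidence for every rule, with lift encoded as colour
sc = plt.scatter(rules["support"], rules["confidence"], c=rules["lift"], cmap="viridis")
plt.colorbar(sc, label="lift")
plt.xlabel("support")
plt.ylabel("confidence")
plt.title("Rules: support vs. confidence (colour = lift)")
plt.show()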

2. Graph plot for random rules

Insights:

· We have a graph plot of 3 rules amongst all the rules generated.

· The highest lift ratio is for the rule with the items CookBks, DoItYBks, ArtBks and ItalCook.

· The lowest lift ratio is for the rule with the items ItalArt, ArtBks, CookBks and DoItYBks.
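
The graph plot is not reproduced here. A rough sketch of how such a plot can be drawn in Python, assuming the rules data frame above and the third-party networkx package (not used in the original submission):

import networkx as nx
import matplotlib.pyplot as plt

# draw the top-3 rules by lift as a directed graph: antecedent items point
# into a rule node, which points out to the consequent items
G = nx.DiGraph()
for idx, row in rules.sort_values("lift", ascending=False).head(3).iterrows():
    rule_node = "R%d" % idx
    for item in row["antecedents"]:
        G.add_edge(item, rule_node)
    for item in row["consequents"]:
        G.add_edge(rule_node, item)

pos = nx.spring_layout(G, seed=42)
nx.draw(G, pos, with_labels=True, node_color="lightblue", font_size=8)
plt.show()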

3. Grouped chart for all 164 rules

Insights:

· Above is a grouped chart of all 164 rules.

· We have 11 rules with the highest lift value, and the items include ItalCook, ArtBks and 4 other items.

· There are 6 rules which have the highest support, and the items include DoItYBks, GeogBks and 4 other items.
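
Grouped charts of this kind come from R's arulesViz package; a rough Python approximation, assuming the rules data frame above, is to aggregate the rules by antecedent set and compare their mean support and lift:

# group rules by their (sorted) antecedent items and average the metrics
grouped = (
    rules.assign(antecedent=rules["antecedents"].apply(lambda s: ", ".join(sorted(s))))
         .groupby("antecedent")[["support", "lift"]]
         .mean()
         .sort_values("lift", ascending=False)
)
print(grouped.head(10))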
Problem Statement: -
The Departmental Store has gathered the data of the products it sells on a daily basis.
Using Association Rules concepts, provide insights on the rules and the plots.
2.) Groceries.csv

Python code:

# import libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# loading the data set as raw text: each line is one transaction
with open("C:/Users/Vinutha/Downloads/Datasets(6)/groceries.csv") as f:
    groceries = f.read()

# splitting the data: one transaction per line, then one item per comma
groceries = groceries.split("\n")
groceries_list = []
for i in groceries:
    groceries_list.append(i.split(","))

# flatten the transactions into a single list of all items
all_groceries_list = [i for item in groceries_list for i in item]

# Counter gives the number of occurrences of each item
from collections import Counter

item_frequencies = Counter(all_groceries_list)

# sort items by frequency (ascending)
item_frequencies = sorted(item_frequencies.items(), key=lambda x: x[1])

# separate frequencies and item names, most frequent first
frequencies = list(reversed([i[1] for i in item_frequencies]))
items = list(reversed([i[0] for i in item_frequencies]))

# bar plot of the top 11 items
import matplotlib.pyplot as plt

plt.bar(height=frequencies[0:11], x=list(range(0, 11)), color=list("rgbkymc"))
plt.xticks(list(range(0, 11)), items[0:11])
plt.xlabel("items")
plt.ylabel("Count")
plt.show()

# creating a data frame of the transactions data
groceries_series = pd.DataFrame(pd.Series(groceries_list))
groceries_series = groceries_series.iloc[:2000, :]
groceries_series.columns = ["trans"]

# creating a dummy column for each item in each transaction, using the
# item names as column names
X = groceries_series["trans"].str.join(sep="*").str.get_dummies(sep="*")
frequent_itemsets = apriori(X, min_support=0.0075, max_len=4, use_colnames=True)

# most frequent item sets based on support
frequent_itemsets.sort_values("support", ascending=False, inplace=True)

plt.bar(x=list(range(0, 11)), height=frequent_itemsets.support[0:11], color=list("rgmyk"))
plt.xticks(list(range(0, 11)), frequent_itemsets.itemsets[0:11])
plt.xlabel("item-sets")
plt.ylabel("support")
plt.show()

rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head(10)
rules.sort_values("lift", ascending=False).head(10)
Exploratory Data Analysis:

1. Two-key plot for 39 rules

Insights:

· Above is the two-key plot of the 39 rules from the groceries dataset, plotted on the basis of the order of the rules (the number of items per rule), with 3 different variables shown.

· From the visualization we can infer that rules with 4 items are more numerous than rules with 3 or 5 items.

· The majority of the rules with 4 items lie within a support range of 0.002 to 0.0032 and a confidence range of 0.75 to 0.78.

· There is a rule with 5 items which has the highest confidence and a support value of approximately 0.0032.
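
A minimal sketch of a two-key plot in Python, assuming the rules data frame built above: support against confidence, coloured by the order of each rule.

import matplotlib.pyplot as plt

# order of a rule = number of items on both sides combined
order = rules["antecedents"].apply(len) + rules["consequents"].apply(len)
sc = plt.scatter(rules["support"], rules["confidence"], c=order, cmap="tab10")
plt.colorbar(sc, label="order (items per rule)")
plt.xlabel("support")
plt.ylabel("confidence")
plt.show()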

2. Graph chart

Insights:

· Above is the graph chart of 4 rules.

· Amongst the 4 rules there is a rule with the highest lift ratio, and its items include whipped/sour cream, onions, whole milk and other vegetables.

· The item "other vegetables" is common to 3 different rules.

· There is a rule which has the highest support value, and its items include tropical fruit, frozen vegetables, root vegetables and whole milk. Whole milk is likewise common to 3 different rules.

3. Grouped chart

Insights:

· Above is the grouped chart for the 39 rules.

· There is one rule which has the maximum lift ratio, and its items include citrus fruit, whole milk and 2 other items.

· There is one rule which has the highest support and a good lift value, and its items include citrus fruit, tropical fruit and 1 other item.

Problem Statement: -

A film distribution company wants to target its audience based on their likes and dislikes. As Chief Data Scientist, analyze the data and come up with different rules over the movie list so that the business objective is achieved.
3.) my_movies.csv

Python code:

# import libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# loading the data set as raw text: each line is one transaction
with open("C:/Users/Vinutha/Downloads/Datasets(6)/my_movies.csv") as f:
    movie = f.read()

# splitting the data: one transaction per line, then one item per comma
movie = movie.split("\n")
movie_list = []
for i in movie:
    movie_list.append(i.split(","))

# flatten the transactions into a single list of all items
all_movie_list = [i for item in movie_list for i in item]

# Counter gives the number of occurrences of each item
from collections import Counter

item_frequencies = Counter(all_movie_list)

# sort items by frequency (ascending)
item_frequencies = sorted(item_frequencies.items(), key=lambda x: x[1])

# separate frequencies and item names, most frequent first
frequencies = list(reversed([i[1] for i in item_frequencies]))
items = list(reversed([i[0] for i in item_frequencies]))

# bar plot of the top 11 items
import matplotlib.pyplot as plt

plt.bar(height=frequencies[0:11], x=list(range(0, 11)), color=list("rgbkymc"))
plt.xticks(list(range(0, 11)), items[0:11])
plt.xlabel("items")
plt.ylabel("Count")
plt.show()

# creating a data frame of the transactions data
movie_series = pd.DataFrame(pd.Series(movie_list))
movie_series = movie_series.iloc[:2000, :]
movie_series.columns = ["trans"]

# creating a dummy column for each item in each transaction, using the
# item names as column names
X = movie_series["trans"].str.join(sep="*").str.get_dummies(sep="*")
frequent_itemsets = apriori(X, min_support=0.0075, max_len=4, use_colnames=True)

# most frequent item sets based on support
frequent_itemsets.sort_values("support", ascending=False, inplace=True)

plt.bar(x=list(range(0, 11)), height=frequent_itemsets.support[0:11], color=list("rgmyk"))
plt.xticks(list(range(0, 11)), frequent_itemsets.itemsets[0:11])
plt.xlabel("item-sets")
plt.ylabel("support")
plt.show()

rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head(10)
rules.sort_values("lift", ascending=False).head(10)

Exploratory Data Analysis:

1. Graph chart for 6 rules

Insights:

· Above is the graph chart for 6 rules.

· From the chart we can infer that there is a rule which has a higher lift than the others but low support; the movies involved are LOTR, Green Mile and Gladiator.

· The rule with the movies Gladiator and Patriot has the highest support value.

· The movies Gladiator and Sixth Sense are part of other rules as well.

2. Grouped chart

Insights:

· Above is the grouped chart of 30 rules.

· There is a rule which has the highest lift ratio, involving the movies Gladiator and Green Mile.

· The rule with higher support than the others involves the movies Gladiator and Patriot.


Problem Statement: -
A mobile phone manufacturing company wants to launch three brand new phones into the market, but before going with its traditional marketing approach, this time it wants to analyze the sales data of its previous models in different regions. You have been hired as a Data Scientist to help them out: use the Association Rules concept and provide your insights to the company's marketing team to improve its sales.
4.) myphonedata.csv

Python code:

# import libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from collections import Counter
import matplotlib.pyplot as plt

# loading the data set as raw text: each line is one transaction
with open("C:/Users/Vinutha/Downloads/Datasets(6)/myphonedata.csv") as f:
    data = f.read()

# splitting the data: one transaction per line, then one item per comma
data = data.split("\n")
data_list = []
for i in data:
    data_list.append(i.split(","))

# flatten the transactions into a single list of all items
all_data_list = [i for item in data_list for i in item]

# count the number of occurrences of each item
item_frequencies = Counter(all_data_list)

# sort items by frequency (ascending)
item_frequencies = sorted(item_frequencies.items(), key=lambda x: x[1])

# separate frequencies and item names, most frequent first
frequencies = list(reversed([i[1] for i in item_frequencies]))
items = list(reversed([i[0] for i in item_frequencies]))

# bar plot of the top 11 items
plt.bar(height=frequencies[0:11], x=list(range(0, 11)), color=list("rgbkymc"))
plt.xticks(list(range(0, 11)), items[0:11], rotation=30)
plt.xlabel("items")
plt.ylabel("Count")
plt.show()

# creating a data frame of the transactions data
data_series = pd.DataFrame(pd.Series(data_list))
data_series.columns = ["trans"]

# creating a dummy column for each item in each transaction, using the
# item names as column names
X = data_series["trans"].str.join(sep="*").str.get_dummies(sep="*")
frequent_itemsets = apriori(X, min_support=0.0075, max_len=4, use_colnames=True)

# most frequent item sets based on support
frequent_itemsets.sort_values("support", ascending=False, inplace=True)

plt.bar(x=list(range(0, 5)), height=frequent_itemsets.support[0:5], color=list("rgmyk"))
plt.xticks(list(range(0, 5)), frequent_itemsets.itemsets[0:5], rotation=15)
plt.xlabel("item-sets")
plt.ylabel("support")
plt.show()

rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head(5)
rules.sort_values("lift", ascending=False).head(5)


Exploratory Data Analysis:

1. Graph chart for the 12 rules generated

Insights:

· Above is the graph chart for the 12 rules.

· Amongst the 12 rules there is a rule with the highest lift ratio, and its items (the colors of the phones) are green, white and red.

· There is a rule with the highest support value, and it involves only the color white.

· From the graph we can infer that white and red are part of most of the rules, hence we can go ahead with both of those colors.

2. Grouped chart

Insights:

· Above is a grouped chart of the 12 rules.

· From the chart we can infer that there are 6 rules which have the highest support, and the colors in those rules are blue, red and white.

· We have a rule with the highest lift ratio amongst all the rules; it involves the colors white and green, but its support is lower than that of the other rules.

Problem Statement: -
A retail store in India has its transaction data and would like to know the buying patterns of the consumers in its locality. You have been assigned the task of providing the manager with rules on how products should be placed on the shelves so that the store can improve the buying patterns of consumers and increase customer footfall.
5.) transaction_retail.csv

Python code:

# import libraries
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# loading the data set as raw text: each line is one transaction
with open("C:/Users/Vinutha/Downloads/Datasets(6)/transaction_retail1.csv") as f:
    tr = f.read()

# splitting the data: one transaction per line, then one item per comma
tr = tr.split("\n")
tr_list = []
for i in tr:
    tr_list.append(i.split(","))

# flatten the transactions into a single list of all items
all_tr_list = [i for item in tr_list for i in item]

# Counter gives the number of occurrences of each item
from collections import Counter

item_frequencies = Counter(all_tr_list)

# sort items by frequency (ascending)
item_frequencies = sorted(item_frequencies.items(), key=lambda x: x[1])

# separate frequencies and item names, most frequent first
frequencies = list(reversed([i[1] for i in item_frequencies]))
items = list(reversed([i[0] for i in item_frequencies]))

# bar plot of the top 5 items
import matplotlib.pyplot as plt

plt.bar(height=frequencies[0:5], x=list(range(0, 5)), color=list("rgbkymc"))
plt.xticks(list(range(0, 5)), items[0:5], rotation=30)
plt.xlabel("items")
plt.ylabel("Count")
plt.show()

# creating a data frame of the transactions data
tr_series = pd.DataFrame(pd.Series(tr_list))
tr_series = tr_series.iloc[:2000, :]
tr_series.columns = ["trans"]

# creating a dummy column for each item in each transaction, using the
# item names as column names
X = tr_series["trans"].str.join(sep="*").str.get_dummies(sep="*")
frequent_itemsets = apriori(X, min_support=0.0075, max_len=4, use_colnames=True)

# most frequent item sets based on support
frequent_itemsets.sort_values("support", ascending=False, inplace=True)

plt.bar(x=list(range(0, 5)), height=frequent_itemsets.support[0:5], color=list("rgmyk"))
plt.xticks(list(range(0, 5)), frequent_itemsets.itemsets[0:5], rotation=15)
plt.xlabel("item-sets")
plt.ylabel("support")
plt.show()

rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head(20)
rules.sort_values("lift", ascending=False).head(10)


Exploratory Data Analysis

1. Graph plot for 6 rules

Insights:

· Above we have a graph plot of 6 rules.

· From the chart we can infer that there is a rule which has a higher support value than the other rules, and its items (tokens from the product descriptions) are "wicker", "heart" and "of". "Wicker" and "of" are also part of a different rule.

· There are other rules with approximately similar lift values: one involves the items paper, Christmas and kit, and another the items wooden, frame and white.

2. Grouped plot:

Insights:

· Above is the grouped chart for the 204 rules.

· There is a rule which has the highest lift ratio amongst the others, and its items include kneeling and pad.

· We have a rule with a higher support value than the other rules, and its items include jam and set.

Data Dictionaries:

1. Books dataset

Name of Feature | Description                                  | Type    | Relevance
ChildBks        | Category of book is children's books         | Nominal | Relevant
YouthBks        | Category of book is youth books              | Nominal | Relevant
CookBks         | Category of book is cooking recipes          | Nominal | Relevant
DoItYBks        | Category of book is do-it-yourself books     | Nominal | Relevant
RefBks          | Category of book is reference books          | Nominal | Relevant
ArtBks          | Category of book is arts                     | Nominal | Relevant
GeogBks         | Category of book is geography                | Nominal | Relevant
ItalCook        | Category of book is Italian cooking recipes  | Nominal | Relevant
ItalAtlas       | Category of book is Italian atlas            | Nominal | Relevant
ItalArt         | Category of book is Italian arts             | Nominal | Relevant
Florence        | Category of book is Florence                 | Nominal | Relevant

2. Groceries dataset

It is a transaction dataset and does not have specific columns. For our analysis we convert the items in the transactions into the columns of the dataset.

3. My movies dataset

It is a dataset in which the columns are in encoded format, apart from those mentioned below.

Name of Feature | Description             | Type    | Relevance
V1              | Consists of movie names | Nominal | Relevant
V2              | Consists of movie names | Nominal | Relevant
V3              | Consists of movie names | Nominal | Relevant
V4              | Consists of movie names | Nominal | Relevant
V5              | Consists of movie names | Nominal | Relevant

4. My phone data

It is a dataset in which the columns are in encoded format, apart from those mentioned below.

Name of Feature | Description              | Type                 | Relevance
V1              | Consists of phone colors | Nominal, Qualitative | Relevant
V2              | Consists of phone colors | Nominal, Qualitative | Relevant
V3              | Consists of phone colors | Nominal, Qualitative | Relevant

5. Transaction retail 1 dataset

It is a transaction dataset and does not have specific columns. For our analysis we convert the items in the transactions into the columns of the dataset.
