Data Mining Lab Manual

Uploaded by

Cruzer Jones
Copyright
© All Rights Reserved
Exercise-1 Preprocessing

1. a) Data Type Conversion

Aim
To convert a numeric attribute to a nominal attribute.

Procedure
Step 1: Open the Weka Explorer window.
Step 2: Click the Preprocess tab.
Step 3: Click the Open file button and open a dataset (e.g. weather.arff) from the data folder that is created when Weka is installed (C:\Program Files\Weka-3-8\data).
Step 4: Observe the data types of the attributes (features) in the selected dataset.
Step 5: To convert a numeric attribute to a nominal attribute, click the Choose button in the Filter panel, select filters -> unsupervised -> attribute -> NumericToNominal, and click the Apply button to apply the change to the dataset.
Step 6: Save and close the window.

Screen shot: [not reproduced]

1. b) Data Transformation

Aim
To apply the NumericTransform filter in Weka.

Procedure
Step 1: Open Weka Explorer.
Step 2: Click the Open file button and open the iris.arff file. The numeric attributes of this dataset hold floating-point values, as the statistics in the Selected attribute panel show.
Step 3: Click the Choose button and select the NumericTransform filter: weka -> filters -> unsupervised -> attribute -> NumericTransform.
Step 4: Click the text box next to the Choose button, where NumericTransform appears.
Step 5: Enter the index of the attribute to transform. Here 1 is the index of the first attribute, sepallength, whose minimum and maximum values are 4.3 and 7.9 respectively, so enter 1 as the index.
Step 6: Enter the method name for the transformation. It can be any valid method of the java.lang.Math class, such as floor. Enter floor, click OK, and then click Apply.
Step 7: Observe the transformed values in the Selected attribute panel.
Step 8: Save and close the window.

Screen shot: [not reproduced]
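The effect of the floor transformation in Exercise 1. b) can be sketched outside Weka. The Python snippet below is a minimal illustration, not Weka's implementation: the helper name numeric_transform and the three sample rows are assumptions made for this example, and only the 4.3 and 7.9 bounds come from the exercise.

```python
import math

def numeric_transform(rows, index, method=math.floor):
    """Apply `method` to the attribute at 1-based `index` in every row.

    Hypothetical helper for illustration only; in Weka the NumericTransform
    filter is given a java.lang.Math method name instead of a function.
    """
    col = index - 1  # Weka attribute indices start at 1
    return [
        row[:col] + [float(method(row[col]))] + row[col + 1:]
        for row in rows
    ]

# Illustrative rows (sepallength, sepalwidth, petallength, petalwidth).
# 4.3 and 7.9 are the minimum and maximum of sepallength quoted in Step 5.
rows = [
    [4.3, 3.0, 1.1, 0.1],
    [5.1, 3.5, 1.4, 0.2],
    [7.9, 3.8, 6.4, 2.0],
]

for row in numeric_transform(rows, index=1):
    print(row)
```

Only the first attribute changes; the other attributes and the original rows are left as they were, which matches what Step 7 asks you to observe.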
Exercise-2 Filters

2. a) Replace Missing Values

Aim
To replace the missing values in a dataset before applying data mining algorithms.

Procedure
Step 1: Create an Excel file that has some missing values, as shown in Figure 1, and save it with the .csv extension.
Step 2: Open this file in Weka Explorer.
Step 3: Choose the filter Weka -> filters -> unsupervised -> attribute -> ReplaceMissingValues and click Apply.
Step 4: Click the Edit button to see the filled values.
Step 5: Save and close the window.

Screen shot: [not reproduced]
Figure 1: [spreadsheet with a Product column and yearly columns (2019, 2018, 2017, ...), some cells empty; not reproduced]

2. b) Add Expressions

Aim
To create an Excel file with three attributes, Product, X and Y, as shown in Figure 1, save it with the .csv extension, and add a new attribute Z defined as (X+Y)*5.

Procedure
Step 1: Open the .csv file in Weka Explorer.
Step 2: Click the Choose button and select the AddExpression filter: Weka -> filters -> unsupervised -> attribute -> AddExpression.
Step 3: Click the text box next to the Choose button, where AddExpression appears. Attribute X is identified by a2 and attribute Y by a3.
Step 4: Type the expression (X+Y)*5 as (a2+a3)*5.
Step 5: Enter Z as the name, click OK, and then click Apply.
Step 6: Observe that a new attribute Z has been added, click the Save button, and close the window.

Screen shot: [not reproduced]
Figure 1: [spreadsheet with columns Product, X and Y; not reproduced]

Exercise-3 Feature Selection - Select Attributes

3. a) Filter

Aim
To select the attributes of a given dataset using Weka.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the Choose button -> weka -> filters -> supervised -> attribute -> AttributeSelection.
Step 5: Open the generic object editor by clicking the text box where AttributeSelection appears.
Step 6: Choose ClassifierSubsetEval as the evaluator and select the search method (BestFirst is the default).
Step 7: Click ClassifierSubsetEval, select NaiveBayes as the classifier, and click OK.
Step 8: Click the Apply button in the Filter panel to display the selected attributes in the Attributes panel.
Step 9: Save and close the window.

Screen shot: [not reproduced]

3. b) Wrapper

Aim
To select the attributes of a given dataset using Weka.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the “Select attributes” tab and choose the attribute evaluator by clicking the Choose button -> weka -> attributeSelection -> WrapperSubsetEval.
Step 5: Open the generic object editor by clicking the text box where WrapperSubsetEval appears.
Step 6: Choose NaiveBayes as the classifier and click OK.
Step 7: Select the search method (BestFirst is the default).
Step 8: Click the Start button to display the selected attributes.
Step 9: Save and close the window.

Screen shot: [not reproduced]

3. c) Dimensionality Reduction

Aim
To select the attributes (dimensionality reduction) of a given dataset using Weka.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the “Select attributes” tab and choose the attribute evaluator by clicking the Choose button -> weka -> attributeSelection -> PrincipalComponents.
Step 5: Select the search method (Ranker is the default).
Step 6: Click the Start button to display the selected attributes.
Step 7: Save and close the window.

Screen shot: [not reproduced]

Exercise-4 Supervised Technique - Classifier

4. a) Function - Multilayer Perceptron

Aim
To implement the Multilayer Perceptron algorithm to solve a classification (supervised learning) problem.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the “Classify” tab to open the classification tab.
Step 5: Select the Multilayer Perceptron: weka -> classifiers -> functions -> MultilayerPerceptron.
Step 6: Select the “Cross-validation” test option (it should be selected by default).
Step 7: Open the generic object editor, set GUI to True, and then enter a value for the hidden layers.
Step 8: Click the “Start” button to evaluate the algorithm on the dataset.

Screen shot: [not reproduced]

4. b) Tree (J48)

Aim
To implement the J48 algorithm to solve a classification (supervised learning) problem.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the “Classify” tab to open the classification tab.
Step 5: Select J48: weka -> classifiers -> trees -> J48.
Step 6: Select the “Cross-validation” test option (it should be selected by default).
Step 7: Click the “Start” button to evaluate the algorithm on the dataset.

Screen shot: [not reproduced]

Exercise-5 Classifier

5. a) Bayes Rule

Aim
To implement the Naive Bayes algorithm to solve a classification (supervised learning) problem.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the “Classify” tab to open the classification tab.
Step 5: Select Naive Bayes: weka -> classifiers -> bayes -> NaiveBayes.
Step 6: Select the “Cross-validation” test option (it should be selected by default).
Step 7: Click the “Start” button to evaluate the algorithm on the dataset.

Screen shot: [not reproduced]

5. b) Zero R

Aim
To implement the ZeroR algorithm to solve a classification (supervised learning) problem.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the iris dataset (iris.arff file).
Step 4: Click the “Classify” tab to open the classification tab.
Step 5: Select the ZeroR algorithm (it should be selected by default).
Step 6: Select the “Cross-validation” test option (it should be selected by default).
Step 7: Click the “Start” button to evaluate the algorithm on the dataset.

Screen shot: [not reproduced]

Exercise-6 Unsupervised Techniques

6. a) Partitioned Clustering (K-means) Algorithm

Aim
To use the K-means clustering algorithm to group (cluster) the instances in a dataset (weather.arff).

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the weather dataset (weather.arff file).
Step 4: Click the “Cluster” tab to perform the clustering.
Step 5: Select the K-means algorithm: Weka -> clusterers -> SimpleKMeans.
Step 6: Click the “Start” button to evaluate the algorithm on the dataset.

Screen shot: [not reproduced]

6. b) Hierarchical Clustering Algorithm

Aim
To use the hierarchical clustering algorithm to group (cluster) the instances in a dataset (weather.arff).

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Explorer” button to open the Weka Explorer interface.
Step 3: Load the weather dataset (weather.arff file).
Step 4: Click the “Cluster” tab to perform the clustering.
Step 5: Select HierarchicalClusterer.
Step 6: Click the “Start” button to evaluate the algorithm on the dataset.

Screen shot: [not reproduced]

Exercise-7 Association Rules Mining - Apriori Algorithm

Aim
To use the Apriori algorithm to find the associations among the items.

Procedure
Step 1: Create an Excel file that has only nominal attributes, as shown in Figure 1, and save it with the .csv extension. (The Apriori algorithm does not support numeric or any other data type.)
Step 2: Open the .csv file in Weka Explorer.
Step 3: Click the Associate tab and open the generic object editor by clicking the Associator text box, where the Apriori algorithm is displayed.
Step 4: In the generic object editor, set the minimum support to 0.3 and the minimum confidence (min metric) to 0.5.
Step 5: Set numRules to 50 and outputItemSets to True.
Step 6: Click Start and observe the result.

Screen shot: [not reproduced]
Figure 1: [spreadsheet with columns BREAD, JELLY, BUTTER, MILK and SUGAR; each row marks the items present with YES and leaves the other cells empty; not reproduced]

Exercise-8 Experimenter

Aim
To test several algorithms on a dataset and compare them against a test base in the Weka Experimenter.

Procedure
Step 1: In the Weka GUI Chooser, click the “Experimenter” button to open the Weka Experimenter interface.
Step 2: Click the “New” button to create a new experiment configuration.
Step 3: Load the weather dataset (weather.arff file) from the data directory by clicking the “Add new...” button in the “Datasets” panel.
Step 4: Choose three algorithms to run on the dataset, as described in Steps 5 to 7.
Step 5: Click “Add new...” in the “Algorithms” section. Click the “Choose” button. Click “ZeroR” under the “rules” selection.
Step 6: Click “Add new...” in the “Algorithms” section. Click the “Choose” button. Click “OneR” under the “rules” selection.
Step 7: Click “Add new...” in the “Algorithms” section.
Click the “Choose” button. Click “J48” under the “trees” selection.
Step 8: Click the “Run” tab at the top of the screen and click the Start button to start the experiment.
Step 9: Click the “Analyse” tab at the top of the screen to open the experiment results analysis panel.
Step 10: Click the “Experiment” button in the “Source” section to load the results from the current experiment.
Step 11: Click the “Select” button for the “Test base”, choose “Ranking”, and then click the “Perform test” button.
Step 12: Click the “Select” button for the “Test base”, choose the “ZeroR” algorithm in the list, and click the “Select” button.
Step 13: Click the check box next to “Show std. deviations” and then click the “Perform test” button.
Step 14: Click the “Select” button for the “Test base”, choose the “J48” algorithm in the list, click the “Select” button, and click the “Perform test” button to compare the three algorithms.
Step 15: Save the output and close the window.

Screen shot: [not reproduced]

Exercise-9 KNOWLEDGE FLOW

9. a) Feature Selection

Aim
To select the features of a given dataset to improve the model in the Weka Knowledge Flow.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Knowledge flow” button to open the Weka Knowledge Flow interface.
Step 3: Click DataSources -> ArffLoader.
Step 4: Click Evaluation -> ClassAssigner.
Step 5: Click Filters -> AttributeSelection.
Step 6: Click Visualization -> TextViewer.
Step 7: Place these components on the working area and connect them to one another by right-clicking each one: right-click ArffLoader, select dataset, and connect it to ClassAssigner; connect ClassAssigner (dataset) to AttributeSelection; connect AttributeSelection (dataset) to TextViewer.
Step 8: Load the data by right-clicking ArffLoader -> Configure, browsing for the file name, and setting both “retain string vals” and “use relative path” to True.
Step 9: Right-click AttributeSelection and choose the evaluator and the search method. Open the generic object editor by clicking the Evaluator text box, choose the classifier, and click OK.
Step 10: Run the process by clicking the play button at the top left of the window.
Step 11: Right-click the TextViewer to display the result.

Screen shot: [not reproduced]

9. b) Clustering

Aim
To implement a clustering algorithm on a given dataset in the Weka Knowledge Flow.

Procedure
Step 1: Start the Weka GUI Chooser.
Step 2: Click the “Knowledge flow” button to open the Weka Knowledge Flow interface.
Step 3: Click DataSources -> ArffLoader.
Step 4: Click Evaluation -> CrossValidationFoldMaker.
Step 5: Click Clusterers -> SimpleKMeans.
Step 6: Click Evaluation -> ClustererPerformanceEvaluator.
Step 7: Click Visualization -> TextViewer.
Step 8: Place these components on the working area and connect them to one another by right-clicking each one: right-click ArffLoader, select dataset, and connect it to CrossValidationFoldMaker; (ClassAssigner (dataset)); connect CrossValidationFoldMaker (test set and training set) to SimpleKMeans; connect SimpleKMeans (batch clusterer) to ClustererPerformanceEvaluator; connect ClustererPerformanceEvaluator (text) to TextViewer.
Step 9: Load the data by right-clicking ArffLoader -> Configure and browsing for the file.
Step 10: Run the process by clicking the play button at the top left of the window.
Step 11: Right-click the TextViewer to display the result.

Screen shot: [not reproduced]

9. c) [title, Aim, and Steps 1-2 lost in extraction]

Step 3: Click DataSources -> ArffLoader.
Step 4: Click Evaluation -> ClassAssigner and CrossValidationFoldMaker.
Step 5: Click Classifiers -> trees -> J48.
Step 6: Click Evaluation -> ClassifierPerformanceEvaluator.
Step 7: Click Visualization -> TextViewer.
Step 8: Place these components on the working area and connect them to one another by right-clicking each one: right-click ArffLoader, select dataset, and connect it to ClassAssigner; connect ClassAssigner (dataset) to CrossValidationFoldMaker; connect CrossValidationFoldMaker (test set and training set) to J48; connect J48 (batch classifier) to ClassifierPerformanceEvaluator; connect ClassifierPerformanceEvaluator (text) to TextViewer.
Step 9: Load the data by right-clicking ArffLoader -> Configure and setting both “retain string vals” and “use relative path” to True.
Step 10: Choose the class attribute by right-clicking the ClassAssigner. (By default, every dataset used for a classification problem has a class attribute.)
Step 11: Run the process by clicking the play button at the top left of the window.
Step 12: Right-click the TextViewer to display the result.

Screen shot: [not reproduced]
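The CrossValidationFoldMaker component in this flow hands J48 a series of training and test sets. The idea behind that split can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not Weka's implementation: the function name make_folds is hypothetical, the instances are placeholder ids, and any shuffling or stratification that Weka may apply is left out.

```python
def make_folds(instances, k):
    """Return k (train, test) pairs in which each instance is tested once.

    Hypothetical helper for illustration only; it splits deterministically
    and does not shuffle or stratify the data.
    """
    if not 2 <= k <= len(instances):
        raise ValueError("k must be between 2 and the number of instances")
    folds = []
    for i in range(k):
        # Fold i tests every k-th instance, starting at position i.
        test = instances[i::k]
        # All remaining instances form the training set for this fold.
        train = [x for j, x in enumerate(instances) if j % k != i]
        folds.append((train, test))
    return folds

# Ten placeholder instance ids split into 5 folds.
for train, test in make_folds(list(range(10)), k=5):
    print("train:", train, "test:", test)
```

Each classifier built on a training set is evaluated on the matching test set, and the evaluator combines the per-fold results into the report shown in the TextViewer.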
