Dataware Practical 5
1. Open Weka - Download and install Weka from Weka’s official website if you haven’t already.
2. Open Dataset - Click on Open file… to load a dataset. Weka supports .arff, .csv, and other
formats.
o Sample Datasets - If you don’t have a dataset, Weka includes sample datasets. You
can find these under the data directory within your Weka installation. Popular
examples include iris.arff or weather.arff.
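To make the .arff format less opaque, here is a minimal Python sketch that reads a small ARFF text. The sample text is a hand-written, abridged excerpt modelled on the weather data, not the file shipped with Weka, and the function name parse_arff is only illustrative. It handles plain unquoted values and is not a complete ARFF parser.

```python
# Minimal ARFF reader: a sketch for illustration, not a full parser.
# Assumption: names and values are unquoted and contain no commas.

SAMPLE_ARFF = """\
% abridged, hand-written sample (not the file shipped with Weka)
@relation weather.symbolic
@attribute outlook {sunny, overcast, rainy}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,FALSE,no
overcast,FALSE,yes
rainy,TRUE,no
"""

def parse_arff(text):
    """Return (relation, attributes, rows) from simple ARFF text."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            # After @data, every line is one comma-separated instance.
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type or {nominal values}>"
            _, name, spec = line.split(None, 2)
            attributes.append((name, spec))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows

relation, attributes, rows = parse_arff(SAMPLE_ARFF)
print(relation, len(attributes), "attributes,", len(rows), "instances")
```

The header lines correspond to what the Preprocess tab lists as attributes, and each line after @data is one instance.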
1. Examine the Dataset - In the Preprocess tab, you can view attributes, instances, and
summary statistics of the dataset.
2. Filter and Edit (if needed) - You can use filters to remove or transform attributes, but for
building a basic decision tree, this step is optional.
o Decision Tree - Under trees, choose J48 (the Weka implementation of the C4.5
algorithm for decision trees). This is Weka’s standard decision tree learner.
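C4.5 is usually described as choosing splits by gain ratio. The Python sketch below computes information gain and gain ratio for the outlook attribute of the nominal weather data, using the class counts commonly quoted for that dataset (14 instances, 9 yes and 5 no). It illustrates the criterion only and is not how J48 is implemented; J48 also handles numeric thresholds, missing values and pruning.

```python
from collections import Counter
from math import log2

# (outlook, play) pairs for the 14 weather instances (assumed counts).
DATA = (
    [("sunny", "yes")] * 2 + [("sunny", "no")] * 3
    + [("overcast", "yes")] * 4
    + [("rainy", "yes")] * 3 + [("rainy", "no")] * 2
)

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def gain_and_ratio(pairs):
    """Return (information gain, gain ratio) for splitting on the first field."""
    class_labels = [y for _, y in pairs]
    groups = {}
    for x, y in pairs:
        groups.setdefault(x, []).append(y)
    n = len(pairs)
    # Weighted entropy of the class after the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(class_labels) - remainder
    # Split information: entropy of the attribute's own value distribution.
    split_info = entropy([x for x, _ in pairs])
    return info_gain, info_gain / split_info

gain, ratio = gain_and_ratio(DATA)
print(f"gain = {gain:.3f}, gain ratio = {ratio:.3f}")  # gain = 0.247, gain ratio = 0.156
```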
1. Set Parameters - After selecting J48, you’ll see its default settings in the classifier box. You
can change settings by clicking on the text box next to J48.
o minNumObj - Sets the minimum number of instances per leaf. Higher values result
in smaller trees.
o unpruned - If selected, the tree will not be pruned, which may increase complexity
and overfitting.
2. Confirm Settings - Once you’ve configured the parameters (or left them at default), click OK.
Step 6: Choose the Class Attribute
1. Select Class Attribute - At the bottom right, make sure the Class attribute (the target
variable) is correctly set. This is typically the last attribute in the dataset, but you can change
it if necessary.
o Alternatively, you can choose Percentage split (e.g., 66% training, 34% testing) or
Use training set to evaluate the model.
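As a rough illustration of what a 66% split means in instance counts, the Python sketch below shuffles a dataset and cuts it at the requested percentage. The function name, the seed, and the rounding rule are assumptions made for this example; Weka’s own shuffling will select different instances.

```python
import random

def percentage_split(instances, train_percent=66, seed=1):
    """Shuffle a copy of the data and split it into (train, test)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed so the split is repeatable
    train_size = round(len(shuffled) * train_percent / 100)
    return shuffled[:train_size], shuffled[train_size:]

# With 150 instances (the size of the iris data), 66% gives 99 training
# and 51 testing instances.
train, test = percentage_split(range(150))
print(len(train), len(test))  # 99 51
```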
1. Click Start - Once everything is set up, click Start to build the model. Weka will train the
Decision Tree on the dataset and display the results.
1. View Model Output - After training, Weka will display results in the Classifier output section.
o Summary - You’ll see evaluation metrics such as accuracy, precision, recall, F1 score,
and confusion matrix.
o Decision Tree Visualization - To see the tree, right-click the model in the Result list
and choose Visualize tree. This gives a graphical representation of the decision tree.
1. Save Model - To save your model, right-click on the model in the Result list on the left pane,
then select Save model.
2. Save Predictions (Optional) - You can also save the text shown in the Classifier output,
including any predictions it lists, by right-clicking on the model and selecting Save result buffer.
1. Load New Data - You can use the Supplied test set option in the Classify tab to load a new
dataset for prediction.
2. Predict - Once the new data is loaded, click Start to classify instances in the new dataset
using the trained decision tree.
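Classifying a new instance with a decision tree amounts to following one path of tests from the root to a leaf. The Python sketch below hard-codes a tree of the shape often shown for the nominal weather data; this tree is an assumption for illustration, and the tree in your own Visualize tree window may differ.

```python
def classify(instance):
    """Follow an assumed weather tree from the root to a leaf label."""
    outlook = instance["outlook"]
    if outlook == "overcast":
        return "yes"                       # leaf reached directly
    if outlook == "sunny":
        # Under sunny, the assumed tree tests humidity next.
        return "yes" if instance["humidity"] == "normal" else "no"
    # Remaining branch (rainy): the assumed tree tests windy.
    return "no" if instance["windy"] == "TRUE" else "yes"

new_instances = [
    {"outlook": "sunny", "humidity": "high", "windy": "FALSE"},
    {"outlook": "rainy", "humidity": "normal", "windy": "FALSE"},
]
for inst in new_instances:
    print(inst["outlook"], "->", classify(inst))
```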
Practical 5b.
Step 1: Launch Weka
1. Open File - In the Explorer, go to the Preprocess tab and click on Open file… to load your
dataset.
2. Supported Formats - Weka supports .arff, .csv, and other file formats. You can select any
dataset you want to classify using Naïve Bayes.
o Sample Datasets - Weka includes sample datasets. You can use iris.arff or
weather.arff (available in Weka’s data folder) if you don’t have a dataset.
1. Check Attributes - In the Preprocess tab, you can view all the attributes and their data types.
2. Filter Data - If you need to clean or transform data, you can use Filters here. This is optional
for a basic Naïve Bayes model.
1. Go to the Classify Tab - Click on the Classify tab to move to the classifier section.
o NaiveBayesMultinomial - You might see options like NaiveBayesMultinomial, which
is suited for text classification, where the attributes are word counts. For a general
numeric and nominal dataset, choose the standard NaiveBayes option.
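The idea behind Naïve Bayes can be sketched in a few lines of Python: multiply the class prior by the per-attribute likelihoods, then normalise. The sketch below covers nominal attributes only and uses add-one (Laplace) smoothing. The function names and the tiny dataset are made up for illustration, and this is not Weka’s implementation, which also handles numeric attributes.

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Count classes and attribute values per class."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values = defaultdict(set)            # attribute index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, value_counts, values

def predict_nb(model, row):
    """Return a dict of normalised class probabilities for one instance."""
    class_counts, value_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # class prior
        for i, v in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            score *= (value_counts[(i, label)][v] + 1) / (count + len(values[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

# Made-up toy data: (outlook, humidity) -> play
rows = [("sunny", "high"), ("sunny", "normal"), ("rainy", "high"), ("overcast", "normal")]
labels = ["no", "yes", "no", "yes"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "high")))  # "no" gets probability 0.75
```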
1. Choose Evaluation Technique - By default, Weka uses 10-fold Cross-Validation, which is
often effective for classification tasks.
o Percentage Split - Alternatively, you can select Percentage split (e.g., 66% training,
34% testing).
o Use Training Set - You can also evaluate the model on the entire training dataset,
although this approach doesn’t give a realistic estimate of performance on unseen
data.
2. Click Start - With everything set, click Start to train and evaluate the Naïve Bayes model on
your data.
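To show what 10-fold cross-validation does with the data, here is a minimal Python sketch that produces the train and test index sets for each fold. The function name is hypothetical and the folds are assigned in a simple round-robin order; Weka additionally randomises the data and stratifies folds by class, which this sketch leaves out.

```python
def k_fold_indices(n_instances, k=10):
    """Yield (train_idx, test_idx) pairs; each instance is tested exactly once."""
    indices = list(range(n_instances))
    for fold in range(k):
        test_idx = indices[fold::k]        # every k-th instance, offset by fold
        held_out = set(test_idx)
        train_idx = [i for i in indices if i not in held_out]
        yield train_idx, test_idx

# With 150 instances, each of the 10 folds trains on 135 and tests on 15.
for train_idx, test_idx in k_fold_indices(150):
    assert len(train_idx) == 135 and len(test_idx) == 15
print("10 folds, 15 test instances each")
```

The reported accuracy is then based on all test folds together, so every instance contributes to the estimate exactly once.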
1. View Model Output - After running the model, Weka will display results in the Classifier
output section.
o Summary - Check key metrics such as accuracy, precision, recall, F1 score, and the
confusion matrix to understand the performance of the classifier.
o Detailed Accuracy by Class - This provides precision, recall, F-measure, and ROC area
for each class.
o Confusion Matrix - This shows the number of correct and incorrect predictions for
each class.
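The per-class figures can be derived from the confusion matrix by hand. The Python sketch below assumes rows are actual classes and columns are predicted classes; the matrix values are invented for illustration and are not real Weka output.

```python
def accuracy(matrix):
    """Fraction of instances on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

def per_class_metrics(matrix, class_names):
    """matrix[i][j] = instances of actual class i predicted as class j."""
    results = {}
    for k, name in enumerate(class_names):
        tp = matrix[k][k]
        actual = sum(matrix[k])                    # row total
        predicted = sum(row[k] for row in matrix)  # column total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        f_measure = (2 * precision * recall / (precision + recall)
                     if precision + recall else 0.0)
        results[name] = {"precision": precision, "recall": recall,
                         "f_measure": f_measure}
    return results

# Invented 3-class example (150 instances).
names = ["setosa", "versicolor", "virginica"]
matrix = [
    [50, 0, 0],
    [0, 47, 3],
    [0, 2, 48],
]
print(f"accuracy = {accuracy(matrix):.4f}")
for name, m in per_class_metrics(matrix, names).items():
    print(name, {key: round(value, 3) for key, value in m.items()})
```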
1. Visualize Errors - To see the misclassified instances, right-click the model in the Result
list and choose Visualize classifier errors.
2. Plot Results - You can also generate 2D visualizations of attribute distributions or
classification boundaries by selecting Visualize.
1. Save Model - Right-click on your model in the Result list (left side of the Classify tab) and
choose Save model to save the Naïve Bayes model for future use.
1. Load New Data - If you want to classify a new dataset with your saved Naïve Bayes model,
load the new data in the Preprocess tab.
2. Classify New Instances - Go to the Classify tab, load your saved model (right-click in the
Result list and choose Load model), and classify the new data.
Using these steps, you can apply Naïve Bayes on any dataset in Weka to perform
classification.