
DATE: 02-02-2024

WEEK-1

AIM: Basics of WEKA tool


A) Investigate the application interfaces.
B) Explore the default datasets in WEKA.

A) WEKA stands for Waikato Environment for Knowledge Analysis. It is an open-source tool
developed at the University of Waikato that comes with a large collection of built-in machine
learning algorithms and is used to solve real-life problems with data mining techniques. The
tool was developed in the Java programming language, so it is platform independent.
The tool itself ships with sample datasets in the data folder of the installation, which we can
use to try out our algorithms. Two broad kinds of models apply to these datasets: prediction
(regression) models predict continuous-valued functions, while classification models predict
categorical class labels. The GUI Chooser application lets you run five different interfaces
for exploring, analyzing, and manipulating data, as listed here:

Explorer: Become a data detective, uncovering patterns and relationships through
visualizations and statistics. Prepare your data for analysis by cleaning and refining it with
filters. Experiment with various machine learning algorithms, fine-tuning them for optimal
performance.

K SAMPATH VISHAL 323103383L02



Experimenter: Conduct rigorous comparisons between algorithms, meticulously evaluating
their performance using diverse metrics. Delve beyond surface-level scores, analyzing
intricate details to make informed choices. Choose the algorithm that aligns best with your
project goals.
KnowledgeFlow: Design complex data pipelines visually, connecting processing steps, feature
selection, and algorithms. Automate repetitive tasks for efficiency and consistency. Share your
expertise with others by exporting these workflows, fostering collaboration and innovation.
Workbench: The Weka Workbench is a comprehensive environment that integrates various
components of Weka, including the Explorer, Experimenter, and Knowledge Flow, into a
unified interface. It provides a centralized platform for users to access and utilize Weka's
functionalities seamlessly. The Workbench offers a cohesive environment for conducting data
analysis, experimentation, and workflow development.
SimpleCLI: Using the SimpleCLI interface, we can type commands directly, instructing
WEKA to perform particular tasks. The fine-grained control this interface offers is ideal for
seasoned users who enjoy scripting and automating their data analysis activities.

B) Locate the datasets:


▪ Open WEKA.
▪ Click on the "Choose" button next to the "Open file" option in the Explorer panel.
▪ This will open a file selection dialog.
▪ By default, WEKA installs sample datasets in a folder called "data" within its
installation directory (e.g., "Weka-3-8-6/data").
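The steps above can be mimicked programmatically. The sketch below lists the ".arff" files in a data folder the way the Explorer's file dialog would; the install path and file names are stand-ins created by the script itself, since the real location varies by machine.

```python
from pathlib import Path

# Hypothetical install location -- adjust to your own WEKA directory.
data_dir = Path("Weka-3-8-6") / "data"

# Create a stand-in folder with a couple of files so the sketch is runnable.
data_dir.mkdir(parents=True, exist_ok=True)
(data_dir / "iris.arff").write_text("@relation iris\n")
(data_dir / "weather.nominal.arff").write_text("@relation weather\n")

# List the sample datasets, just as the Explorer's file dialog would show them.
arff_files = sorted(p.name for p in data_dir.glob("*.arff"))
print(arff_files)  # ['iris.arff', 'weather.nominal.arff']
```

On a real installation you would point `data_dir` at the actual WEKA folder and skip the file-creation step.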


In WEKA, several datasets come pre-installed with the software, covering domains such as
classification, regression, clustering, and association rule mining. Some of the datasets are:
1. Iris: Measurements of iris flowers for classification into three species.
2. German Credit (credit-g): Attributes of loan applicants for predicting good or bad
credit risk.
3. Weather: A small toy dataset of weather conditions (outlook, temperature, humidity,
wind) for predicting whether to play outside.
".arff" is the extension of the data files used in WEKA; it stands for Attribute-Relation
File Format. It is a plain-text format specifically designed for storing data used in machine
learning and data mining tasks.
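To make the format concrete, here is a minimal sketch that reads the two key parts of an ARFF file, the @attribute header lines and the @data rows, using only the Python standard library. The relation, attribute names, and values below are invented for illustration; a full ARFF reader (as built into WEKA) handles more cases, such as quoted values and sparse data.

```python
# A minimal illustration of the ARFF layout (invented example data).
arff_text = """\
@relation play_example
@attribute outlook {sunny, rainy}
@attribute temperature numeric
@attribute play {yes, no}
@data
sunny,85,no
rainy,65,yes
"""

attributes, rows = [], []
for line in arff_text.splitlines():
    line = line.strip()
    if line.lower().startswith("@attribute"):
        # "@attribute name type" -> keep just the attribute name
        attributes.append(line.split()[1])
    elif line and not line.startswith(("@", "%")):  # % starts a comment in ARFF
        rows.append(dict(zip(attributes, line.split(","))))

print(attributes)  # ['outlook', 'temperature', 'play']
print(rows[0])     # {'outlook': 'sunny', 'temperature': '85', 'play': 'no'}
```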

The German Credit (credit-g) dataset is used to build statistical models that predict the
likelihood of an individual defaulting on a loan. It contains historical creditworthiness data
from Germany. A model trained on it assigns a score to each individual representing their
estimated risk of default; lenders can then use this score to decide whether or not to
approve a loan application.
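The final thresholding step can be illustrated with a toy sketch. The threshold and scores here are hypothetical, not taken from the credit-g dataset or any real model:

```python
# Hypothetical illustration: a lender turning a model's risk score into a decision.
# The threshold value 0.5 is an arbitrary assumption for the sketch.
def loan_decision(risk_score, threshold=0.5):
    return "deny" if risk_score > threshold else "approve"

print([loan_decision(s) for s in [0.2, 0.7]])  # ['approve', 'deny']
```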
Various kinds of tools can be applied to a dataset.
Attributes are essentially the features or characteristics that describe each data point
(instance) in the dataset; they capture different aspects of the data and contribute to the
overall insights you can extract.


▪ Preprocessing: Filters clean and prepare your data (discretization, normalization,
missing value handling, etc.).
▪ Classification: Classifiers build models to predict class labels of new data (Naive Bayes,
Decision Tree, KNN, SVM, etc.).
▪ Clustering: Algorithms group data based on similarities (K-Means, Hierarchical,
DBSCAN, etc.).
▪ Association Rule Learning: Tools discover relationships between attributes, identifying
patterns.
▪ Visualization: Explore data and analysis results with scatter plots, histograms,
confusion matrices, etc.
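As a concrete example of the preprocessing step, the sketch below shows what a normalization filter (conceptually like WEKA's unsupervised Normalize filter) does: rescale a numeric attribute into the range [0, 1] via min-max normalization. The sample values are made up for illustration.

```python
# Min-max normalization: the idea behind a normalization preprocessing filter.
def min_max_normalize(values):
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:                      # constant attribute: map everything to 0
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]

ages = [20, 30, 50, 60]
print(min_max_normalize(ages))  # [0.0, 0.25, 0.75, 1.0]
```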
