Week 1
WEKA comes with several sample datasets that cover tasks such as classification, regression,
clustering, and association rule mining. Some datasets commonly used with WEKA are:
1. Iris: Contains measurements of iris flowers for classification into three species.
2. Titanic: Passenger information from the Titanic for survival prediction (usually obtained
separately rather than from WEKA's bundled data folder).
3. Weather: A small set of weather conditions (outlook, temperature, humidity, wind) for
predicting whether an outdoor game is played.
WEKA data files use the ".arff" extension, which stands for Attribute-Relation File Format. It
is a plain-text format designed for storing data used in machine learning and data mining tasks:
a header declares the relation and its attributes, and a data section lists the instances.
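To make the file layout concrete, here is a minimal sketch of reading a small ARFF text in Python.
The sample is a hand-written excerpt in the style of WEKA's weather data, and parse_arff is a
hypothetical helper that handles only simple comma-separated files (no quoted values, sparse rows,
or date attributes). It is not a replacement for WEKA's own loader.

```python
# Minimal ARFF reader sketch for simple files only (assumption: no quoted
# values, no sparse rows, no date attributes).

SAMPLE_ARFF = """\
% Hand-written excerpt in the style of WEKA's weather data
@relation weather.symbolic

@attribute outlook {sunny, overcast, rainy}
@attribute temperature {hot, mild, cool}
@attribute humidity {high, normal}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,hot,high,FALSE,no
overcast,hot,high,FALSE,yes
rainy,mild,high,FALSE,yes
"""


def parse_arff(text):
    """Return (relation, attributes, rows) from a simple ARFF string."""
    relation = None
    attributes = []  # list of (name, type string or list of nominal values)
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
            continue
        keyword = line.split(None, 1)[0].lower()
        if keyword == "@relation":
            relation = line.split(None, 1)[1]
        elif keyword == "@attribute":
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # Nominal attribute: keep the list of allowed values.
                spec = [v.strip() for v in spec.strip("{}").split(",")]
            attributes.append((name, spec))
        elif keyword == "@data":
            in_data = True  # everything after this line is an instance
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE_ARFF)
print(relation)
print([name for name, _ in attributes])
print(len(rows), "instances")
```

The header lines starting with "@" describe the data, and every line after "@data" is one instance
with its values in the same order as the declared attributes.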
The German credit dataset (credit-g.arff in WEKA) is used to build statistical models that predict
the likelihood of an individual defaulting on a loan. It contains historical creditworthiness data
from Germany. A model trained on this data assigns a score to each individual, which represents
their estimated risk of default. Lenders can then use this score to decide whether or not to
approve a loan application.
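The score-then-decide step described above can be sketched as follows. This is an illustration
only: the weights, bias, attribute names, and the 0.5 threshold are made-up assumptions, not values
fitted to the German credit data, and the logistic form is just one possible choice of model.

```python
import math

# Hypothetical weights for illustration only; they are NOT fitted to the
# German credit data, and the attribute names are simplified stand-ins.
WEIGHTS = {
    "duration_months": 0.03,
    "credit_amount_thousands": 0.10,
    "has_savings": -0.80,
}
BIAS = -1.5


def default_risk(applicant):
    """Map an applicant's attributes to an estimated default probability."""
    z = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function keeps the score in (0, 1)


def decide(applicant, threshold=0.5):
    """Approve the loan only if the estimated risk is below the threshold."""
    return "approve" if default_risk(applicant) < threshold else "reject"


low_risk = {"duration_months": 12, "credit_amount_thousands": 2, "has_savings": 1}
high_risk = {"duration_months": 60, "credit_amount_thousands": 15, "has_savings": 0}
print(decide(low_risk), decide(high_risk))
```

Lowering the threshold makes the lender more cautious, since fewer applicants fall below it.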
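WEKA offers several kinds of tools that can be used on the dataset, including preprocessing
filters, classifiers, clustering algorithms, association rule miners, and attribute selection.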
Attributes are the features or characteristics that describe each data point (instance) in the
dataset. Each attribute captures a different aspect of the data and contributes to the insights
you can extract.
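As a small illustration of the relationship between instances and attributes, the sketch below
stores each instance as a Python dict whose keys are the attribute names, then counts how often
each value occurs per attribute. The rows are hand-written for illustration, and value_counts is a
hypothetical helper, not part of WEKA.

```python
from collections import Counter

# Hand-written rows for illustration: each dict is one instance,
# and each key is an attribute describing that instance.
instances = [
    {"outlook": "sunny", "humidity": "high", "play": "no"},
    {"outlook": "overcast", "humidity": "high", "play": "yes"},
    {"outlook": "rainy", "humidity": "normal", "play": "yes"},
    {"outlook": "sunny", "humidity": "normal", "play": "yes"},
]

attribute_names = list(instances[0])


def value_counts(rows, attribute):
    """Count how often each value of one attribute occurs across instances."""
    return Counter(row[attribute] for row in rows)


for name in attribute_names:
    print(name, dict(value_counts(instances, name)))
```

A per-attribute summary like this is a quick first check of how the values are distributed before
applying any learning algorithm.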