Top 10 Open Source Data Mining Tools: A Brief Look at Mining Tasks

This document discusses various data mining tools and techniques. It describes preprocessing tasks like removing noise and filling in missing values. It also outlines common mining tasks like clustering, classification, outlier analysis, associative analysis, and regression. Clustering partitions data into related subgroups, classification assigns categories to data, and associative analysis finds hidden relationships. The document lists proprietary tools like Sisense and SSDT and open source tools like Weka, RapidMiner, Orange, Apache Mahout, R, and Rattle for performing data mining.

Uploaded by

Malathi T
Copyright
© All Rights Reserved

Top 10 open source data mining tools

A brief look at mining tasks

Pre-processing: This covers the preliminary tasks that prepare data for the actual mining tasks.
Typical steps are removing anomalies and noise from the data that is about to be mined, filling
in missing values, normalising the data, and compressing it using techniques like generalisation
and aggregation.
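Two of the pre-processing steps named above, filling in missing values and normalising, can be sketched in a few lines. This is only an illustration: the function names, the choice of mean imputation and min-max scaling, and the `ages` column are assumptions, not something the article prescribes.

```python
from statistics import mean


def fill_missing(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalise(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map everything to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


ages = [20, None, 40, 60]  # hypothetical column with one missing value
filled = fill_missing(ages)
scaled = min_max_normalise(filled)
print(filled)
print(scaled)
```

Mean imputation is the simplest choice; a median or a model-based estimate may suit skewed data better.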

Clustering: This partitions a large data set into groups (clusters) of related items, without
relying on predefined labels.
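As a rough sketch of clustering, the following uses k-means (Lloyd's algorithm), one common partitioning method among many. The article does not name a specific algorithm, so the technique, the `k_means` helper, the sample points and the starting centroids are all assumptions made for illustration.

```python
import math


def k_means(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical 2-D data with two visibly separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
final_centroids, clusters = k_means(points, [(1, 1), (8, 8)])
print(final_centroids)
print(clusters)
```

The starting centroids are fixed here to keep the run deterministic; in practice they are usually chosen at random or with a seeding heuristic.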


Classification: This assigns data items to user-defined categories.
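To make classification concrete, here is a minimal k-nearest-neighbour sketch. The article does not single out a classifier, so k-NN, the `knn_classify` helper, and the "small"/"large" categories are assumptions chosen for brevity.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label `query` by majority vote among its k nearest training items."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical labelled examples: (features, user-defined category).
train = [
    ((1.0, 1.0), "small"), ((1.5, 2.0), "small"), ((2.0, 1.0), "small"),
    ((8.0, 8.0), "large"), ((9.0, 8.5), "large"), ((8.5, 9.0), "large"),
]
label_a = knn_classify(train, (1.2, 1.4))
label_b = knn_classify(train, (8.7, 8.2))
print(label_a, label_b)
```

Unlike clustering, the categories here are supplied up front through the labelled training items.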

Outlier analysis: This identifies data elements that deviate from, or lie far away from, the
rest of the elements in a data set. It is useful for anomaly detection.
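One simple way to flag such deviant elements is a z-score test, sketched here. The threshold of 2.0, the `z_score_outliers` name and the sample readings are assumptions; the article does not commit to a particular outlier method.

```python
from statistics import mean, pstdev


def z_score_outliers(values, threshold=2.0):
    """Return the values lying more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]


readings = [10, 12, 11, 13, 12, 11, 50]  # hypothetical sensor readings
outliers = z_score_outliers(readings)
print(outliers)
```

Because the mean and standard deviation are themselves pulled by extreme values, median-based variants are often preferred on small samples.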

Associative analysis: This brings out hidden relationships among data items in a large data
set. It can help predict the occurrence of a particular item in a transaction or an event
whenever some other item is present. You can think of this as a conditional probability: the
confidence of a rule A -> B is an estimate of P(B | A).
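The conditional-probability view can be illustrated by computing the support and confidence of a single rule over a handful of transactions. The basket contents and helper names below are made up for the example.

```python
def count(transactions, items):
    """Number of transactions containing every item in `items`."""
    return sum(1 for t in transactions if items <= t)


def support(transactions, items):
    """Fraction of all transactions that contain `items`."""
    return count(transactions, items) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) from transaction counts."""
    n_antecedent = count(transactions, antecedent)
    if n_antecedent == 0:
        return 0.0
    return count(transactions, antecedent | consequent) / n_antecedent


# Hypothetical market-basket data.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"milk"},
    {"bread", "butter", "jam"},
]
rule_support = support(transactions, {"bread", "butter"})
rule_confidence = confidence(transactions, {"bread"}, {"butter"})
print(rule_support, rule_confidence)
```

Here bread appears in four baskets and three of those also contain butter, so the rule bread -> butter has a confidence of 3/4.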

Regression: This predicts the values of a dependent variable by constructing a model, that is,
a mathematical function of the independent variables.
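For a single independent variable, the simplest such model is a straight line fitted by ordinary least squares. The sketch below assumes that setting; the `fit_line` name and the data points are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4]  # independent variable (hypothetical)
ys = [3, 5, 7, 9]  # dependent variable (hypothetical)
slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept  # predict y for an unseen x = 5
print(slope, intercept, prediction)
```

The fitted function can then be evaluated at values of the independent variable that were not in the data, which is the prediction step the text describes.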

Summarisation: This produces a compact description of the whole data set.
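A handful of descriptive statistics is perhaps the most basic form of such a compact description. The choice of statistics, the `summarise` helper and the sample values are assumptions for illustration.

```python
from statistics import mean, median


def summarise(values):
    """Describe a numeric column with a few descriptive statistics."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }


summary = summarise([4, 8, 15, 16, 23, 42])  # hypothetical column
print(summary)
```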

Data mining combines various techniques such as pattern recognition, statistics and machine
learning. Machine learning and data mining overlap considerably, since machine learning
algorithms are used for mining data, but this article restricts itself to tools specialised
for data mining.

https://www.softwaretestinghelp.com/data-mining-tools/

https://opensourceforu.com/2017/03/top-10-open-source-data-mining-tools/

Data Mining Tools

1. Sisense: licensed
2. SSDT (SQL Server Data Tools): licensed
3. Oracle Data Mining: proprietary license
4. IBM Cognos: proprietary license
5. IBM SPSS Modeler: proprietary license
6. SAS Data Mining: proprietary license

1. Weka: free software
2. RapidMiner: open source
3. Orange: open source
4. Apache Mahout: open source
5. R (data mining): free software
6. Rattle (R Analytical Tool To Learn Easily): open source
7. KNIME: open source

1. MOA (Massive Online Analysis)
2. KEEL (Knowledge Extraction based on Evolutionary Learning)
