CLUSTERING HIGH DIMENSIONAL DATA USING SIGNATURES WITH HASHING TECHNIQUE
A PROJECT REPORT
Submitted by
ANUSHA S 1305066
KIRUTHIKA M 1305084
PRATHEEBA R 1305097
SWETHA M 1305116
BACHELOR OF ENGINEERING
in
APRIL 2017
COIMBATORE INSTITUTE OF TECHNOLOGY
(A Govt. Aided Autonomous Institution Affiliated to Anna University)
COIMBATORE - 641014
BONAFIDE CERTIFICATE
Certified that this project, CLUSTERING HIGH DIMENSIONAL DATA USING SIGNATURES
WITH HASHING TECHNIQUE, is the bonafide work of ANUSHA S (1305066), KIRUTHIKA M
(1305084), PRATHEEBA R (1305097) and SWETHA M (1305116), carried out under my
supervision during the academic year 2016-2017.
Certified that the candidates were examined by us in the project work viva-voce
examination held on
Place:
Date:
ACKNOWLEDGEMENT
We express our sincere thanks to our Secretary Dr.R.Prabhakar and our Principal
Dr.V.Selladurai for providing us a great opportunity to carry out our work. Words are rather meagre to
express our gratitude to them; this work is the outcome of their inspiration and support.
We thank the Head of the Department of Computer Science and Engineering & Information
Technology for his encouragement during this tenure.
We equally tender our sincere thankfulness to our project guide Mrs.S.Priya, Department of
Computer Science and Engineering & Information Technology, for her valuable suggestions and guidance.
During the period of study, the entire staff of the Department of Computer Science and
Engineering & Information Technology offered ungrudging help. It is also a matter of great pleasure to
thank our parents and family members for their constant support.
LIST OF SYMBOLS
DB     - Database of points
D      - Set of attributes / dimensions
S      - Subspace
N      - Neighbourhood of a point in a subspace S
C      - Cluster
CS     - Core set
U      - Dense unit
H      - Hash signature of a dense unit
L      - Large integer
HTable - Hash table
PCA    - Principal Component Analysis
1. INTRODUCTION
Data mining is the computational process of discovering patterns in large relational databases and
summarizing them into useful information. The overall goal of the data mining process is to extract
information from a data set and transform it into an understandable structure for further use. A typical
workflow first extracts, transforms, and loads transaction data onto the data warehouse system before
mining begins.
1.2 Data Mining Techniques
1.2.1 Association
Association enables the discovery of interesting relations between different variables in large
databases. Association rule learning uncovers hidden patterns in the data that can be used to identify
variables within the data and the co-occurrences of different variables that appear with the greatest
frequencies.
1.2.2 Clustering
Clustering is a data mining technique that automatically groups objects with similar
characteristics into meaningful or useful clusters.
1.2.3 Classification
Classification is used to assign each item in a data set to one of a predefined set of classes
or groups. Classification methods make use of mathematical techniques such as decision trees, linear
programming, neural networks and statistics.
1.2.4 Prediction
Prediction is a data mining technique that discovers the relationship between
independent variables, as well as the relationship between dependent and independent variables.
1.2.5 Sequential Patterns
Sequential pattern analysis seeks to discover or identify similar patterns, regular events or trends
in transaction data over a business period.
1.3 Clustering
Clustering is the task of grouping a set of objects in such a way that objects in the same group are
more similar to each other than to those in other groups; a cluster is thus a group of objects that belong
to the same class. Clustering methods can be classified into the following categories:
Partitioning Method
Hierarchical Method
Density-based Method
Grid-based Method
Constraint-based Method
1.3.1 Partitioning Method
For a database of n objects, the partitioning method constructs k partitions of the data, where
each partition represents a cluster and k is less than or equal to n. That is, it classifies the
data into k groups such that each group contains at least one object and each object belongs to
exactly one group.
1.3.2 Hierarchical Method
This method creates a hierarchical decomposition of the given set of data objects. There are two
approaches:
Agglomerative Approach
Divisive Approach
1.3.2.1 Agglomerative Approach
This method starts with each object forming a separate group. It keeps merging the
objects or groups that are close to one another, and continues until all of the groups are
merged into one or until the termination condition holds.
1.4 Applications of Clustering
Clustering can also help marketers discover distinct groups in their customer base and
characterize those groups based on their purchasing patterns.
Clustering also helps in classifying documents on the web for information discovery.
Clustering is also used in outlier detection applications such as detection of credit card fraud.
1.5 Hashing
A hash function is used to index the original value or key, and is then used each time the
data associated with that value or key is to be retrieved. A hash table uses a hash function to compute an
index into an array of buckets or slots, from which the desired value can be found.
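As a minimal illustration in Java (the project's implementation platform), a java.util.HashMap computes the bucket index from the key's hash code internally; the key and value names here are made up for the example:

    import java.util.HashMap;
    import java.util.Map;

    public class HashLookup {
        public static void main(String[] args) {
            // The map hashes each key to a bucket index internally,
            // so insertion and lookup run in expected constant time.
            Map<String, Integer> table = new HashMap<>();
            table.put("patientA", 42);
            table.put("patientB", 17);

            // Retrieval re-hashes the key and probes the same bucket.
            System.out.println(table.get("patientA")); // prints 42
        }
    }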
1.5.1 Advantage
The main advantage of hash tables over other table data structures is speed, especially when the
number of entries is large. If the set of key-value pairs is fixed and known ahead of time, the average
lookup cost can be reduced by a careful choice of the hash function, bucket table size, and internal data
structures.
2. SYSTEM SPECIFICATION
The hardware and software for the system are selected by considering factors such as CPU processing
speed, peripheral channel speed, printer speed, seek time and rotational delay of the hard disk, and
communication speed. The hardware and software specifications are as follows.
3.1 EXISTING SYSTEM
DESCRIPTION:
In ordered subspace clustering, the number of clusters to be formed is not known a priori. Each
data point in a union of subspaces can always be written as a linear combination of all the other points. A
block diagonal matrix is formed from the data. The columns within a segment will be the zero
vector or very close to it, because columns from the same subspace share similarity. Columns that greatly
deviate from the zero vector indicate the boundary of a segment, as the similarity there is low.
DESCRIPTION:
CLIQUE partitions each dimension into equal-width intervals and identifies the dense units among
them. Dense units from lower-dimensional subspaces are then joined, level by level, to generate
candidate dense units in higher-dimensional subspaces.
LIMITATIONS:
The CLIQUE algorithm does not perform well when the number of dimensions increases.
DESCRIPTION:
The data set is projected onto each dimension and histograms are constructed over the dimensions.
Sparse histograms do not contribute to a cluster. Dimensions having dense histograms are combined
recursively, and a depth-first search (DFS) is applied to find the maximal region that represents a cluster.
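A rough sketch of this histogram step over a single dimension, assuming equal-width bins; binCount and denseThreshold are illustrative parameters, not values taken from the project:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class HistogramStep {
        // Builds a 1-D histogram over one dimension and keeps only the
        // dense bins; sparse bins are dropped from further consideration.
        static List<Integer> denseBins(double[] values, int binCount, int denseThreshold) {
            double min = Arrays.stream(values).min().getAsDouble();
            double max = Arrays.stream(values).max().getAsDouble();
            double width = (max - min) / binCount;
            int[] counts = new int[binCount];
            for (double v : values) {
                // Clamp the top edge into the last bin.
                int bin = Math.min((int) ((v - min) / width), binCount - 1);
                counts[bin]++;
            }
            List<Integer> dense = new ArrayList<>();
            for (int b = 0; b < binCount; b++)
                if (counts[b] >= denseThreshold) dense.add(b);
            return dense;
        }
    }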
3.2 PROPOSED SYSTEM
The proposed system uses the commonality of data points across dimensions as the key step in
cluster generation, thereby bypassing the generation of clusters across increasing combinations of
dimensions. Clusters generated in 1-D that have the same signatures imply a maximal subspace cluster,
where the subspace is the set of those particular dimensions. However, same signatures in 1-D do not
always imply a maximal subspace cluster, because a cluster in 1-D can have interleaved dense units of
clusters from other dimensions. So a dense unit in 1-D is split into combinations of dense units of
comparatively smaller size tau+1. These combinations of dense units form the core set. Dense units
from the core set can then be tested for same signatures to find maximal subspace clusters. Finally,
DBSCAN can be run across all the maximal subspace clusters so formed to find the final maximal
clusters.
For the dense sets of points in each of the 1-dimensional projections of the attribute set of the
given data, the sufficiently common points among these 1-dimensional sets lead to the dense points in
the higher-dimensional subspaces.
Each row of the database is assigned a point ID. A random number is generated for each point
ID and maintained in a table so that it can be used during the assignment of signatures. Every point is
added to a point vector, and all the point vectors are added to a data matrix.
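A minimal sketch of this labelling step, assuming point IDs are simply row indices of the data matrix; the class and method names are illustrative:

    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.Map;
    import java.util.Random;
    import java.util.Set;

    public class PointLabels {
        // Assigns a unique large random integer label to each point ID,
        // to be used later when computing dense-unit signatures.
        static Map<Integer, Long> assignLabels(int pointCount, long seed) {
            Random rnd = new Random(seed);
            Map<Integer, Long> labels = new HashMap<>();
            Set<Long> used = new HashSet<>();
            for (int id = 0; id < pointCount; id++) {
                long label;
                do {
                    // Mask to a non-negative 63-bit integer.
                    label = rnd.nextLong() & Long.MAX_VALUE;
                } while (!used.add(label));  // enforce uniqueness
                labels.put(id, label);
            }
            return labels;
        }
    }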
A point is said to be dense if it has at least tau points within its epsilon neighborhood, where
epsilon is a distance measure between two points. These dense units can be connected together to form a cluster.
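Since the points are sorted before dense units are found (see Section 6.1), the density test in one dimension reduces to a sliding window over the sorted values. A sketch under that assumption, with illustrative names:

    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class DensePoints {
        // Returns positions (in sorted order) of points that have at least
        // tau other points within epsilon in this one dimension.
        static List<Integer> densePoints(double[] dim, double epsilon, int tau) {
            double[] v = dim.clone();
            Arrays.sort(v);                  // sort once per dimension
            List<Integer> dense = new ArrayList<>();
            int lo = 0, hi = 0;
            for (int i = 0; i < v.length; i++) {
                // Shrink the left edge until it is within epsilon of v[i].
                while (v[i] - v[lo] > epsilon) lo++;
                // Grow the right edge while it stays within epsilon of v[i].
                while (hi < v.length - 1 && v[hi + 1] - v[i] <= epsilon) hi++;
                int neighbours = hi - lo;    // window size minus the point itself
                if (neighbours >= tau) dense.add(i);
            }
            return dense;
        }
    }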
A dense unit in 1-D is split into combinations of dense units of comparatively smaller size
tau+1. These combinations of dense units form the core set. Dense units from the core set can then be
tested for same signatures to find maximal subspace clusters.
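A sketch of generating these (tau+1)-sized combinations; the recursion and all names are illustrative, not the project's exact code:

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    public class CoreSet {
        // Emits every k-sized subset of a dense unit's point IDs; with
        // k = tau+1 these smaller dense units form the core set.
        static void combinations(int[] unit, int k, int start,
                                 Deque<Integer> current, List<List<Integer>> out) {
            if (current.size() == k) {
                out.add(new ArrayList<>(current));
                return;
            }
            for (int i = start; i < unit.length; i++) {
                current.addLast(unit[i]);
                combinations(unit, k, i + 1, current, out);
                current.removeLast();
            }
        }

        public static void main(String[] args) {
            int[] denseUnit = {1, 2, 3, 4};   // point IDs, illustrative
            int tau = 2;                      // subsets of size tau+1 = 3
            List<List<Integer>> coreSet = new ArrayList<>();
            combinations(denseUnit, tau + 1, 0, new ArrayDeque<>(), coreSet);
            // Prints the four 3-subsets: [1,2,3], [1,2,4], [1,3,4], [2,3,4]
            System.out.println(coreSet);
        }
    }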
3.2.5 Hashing
Signatures are assigned to each of these 1-D dense units in order to avoid comparing
the individual points among all dense units when deciding whether they contain exactly the same points.
We can hash the signatures of these 1-D dense units from all k dimensions.
The hash table is then checked for collisions. The resulting collisions lead to the
maximal subspace dense units, which are given as input to the DBSCAN algorithm to obtain
clusters.
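A sketch of this signature-and-collision step, assuming dense units are arrays of point IDs and the labels come from the random mapping described earlier; all names are illustrative:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class SignatureHash {
        // Maps each dense unit's signature (the sum of its points' labels)
        // to the dimensions in which a unit with that signature was found.
        // Entries listing more than one dimension reveal maximal subspace
        // dense units.
        static Map<Long, List<Integer>> hashSignatures(
                Map<Integer, List<int[]>> denseUnitsPerDim, // dim -> dense units
                Map<Integer, Long> labels) {                // point ID -> label
            Map<Long, List<Integer>> hTable = new HashMap<>();
            for (Map.Entry<Integer, List<int[]>> e : denseUnitsPerDim.entrySet()) {
                int dim = e.getKey();
                for (int[] unit : e.getValue()) {
                    long signature = 0;
                    for (int p : unit) signature += labels.get(p);
                    hTable.computeIfAbsent(signature, s -> new ArrayList<>())
                          .add(dim);        // a collision appends another dim
                }
            }
            return hTable;
        }
    }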
Block Diagram:
3.3 Features
It involves a bottom-up approach, since the number of clusters and the number of subspaces
are not known a priori.
This algorithm gives only non-redundant information, i.e., it assigns signatures to the
data points, thereby generating non-redundant clusters.
4. DESIGN
Input is obtained as a database containing information about patients with diabetes. The
database is normalized using WEKA and stored in .csv format.
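A minimal sketch of loading such a normalized .csv file into the data matrix; the class name and the file layout (one header row followed by numeric rows) are assumptions for the example:

    import java.io.BufferedReader;
    import java.io.FileReader;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    public class CsvLoader {
        // Reads a WEKA-normalized CSV into a list of point vectors.
        // The row index doubles as the point ID.
        static List<double[]> load(String path) throws IOException {
            List<double[]> matrix = new ArrayList<>();
            try (BufferedReader in = new BufferedReader(new FileReader(path))) {
                in.readLine();                       // skip the header row
                String line;
                while ((line = in.readLine()) != null) {
                    String[] cells = line.split(",");
                    double[] point = new double[cells.length];
                    for (int j = 0; j < cells.length; j++)
                        point[j] = Double.parseDouble(cells[j].trim());
                    matrix.add(point);
                }
            }
            return matrix;
        }
    }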
Each row of the database is assigned a point ID. A random number is generated for each point
ID and maintained in a table. A point is said to be dense in the data matrix if it has at least tau points
within its epsilon neighborhood in a single dimension, where epsilon is a distance measure between two
points. These dense units can be connected together to form a cluster.
A dense unit in 1-D is split into combinations of dense units of comparatively smaller size
tau+1. These combinations of dense units form the core set, and the dense units from the core set can
then be tested for same signatures to find the maximal subspace clusters.
Signatures are assigned to each of these 1-D dense units to avoid comparing the individual
points among all dense units when deciding whether they contain exactly the same points. We hash the
signatures of these 1-D dense units from all k dimensions.
The resulting collisions in the hash table lead to the maximal subspace dense units.
These dense units are given as input to the DBSCAN algorithm to obtain clusters.
The output is a set of maximal subspace clusters from all the dimensions.
5. SOFTWARE DESCRIPTION
NetBeans IDE is the official IDE for Java 8. With its editors, code analyzers, and
converters, you can quickly and smoothly upgrade your applications to use new Java 8 language
constructs, such as lambdas, functional operations, and method references. Batch analyzers and
converters are provided to search through multiple applications at the same time, matching patterns
for conversion to new Java 8 language constructs. With its constantly improving Java Editor, many
rich features and an extensive range of tools, templates and samples, NetBeans IDE sets the
standard for developing with cutting-edge technologies out of the box.
The IDE provides wizards and templates to let you create Java EE, Java SE, and Java ME
applications. A variety of technologies and frameworks are supported out of the box. For example, you
can use wizards and templates to create applications that use the OSGi framework or the NetBeans module
system as the basis of modular applications. The language-aware NetBeans editor detects errors while
you type and assists you with documentation popups and smart code completion, all with the speed and
simplicity of your favorite lightweight text editor.
5.3 Building
Out of the box, the IDE provides support for the Maven and Ant build systems. In the New
Project wizard, when you choose to create a new application, you can choose to create Maven-based or
Ant-based applications. You can open Maven-based applications into the IDE without an import process
because the IDE reads project settings from the Maven POM file. In addition, tools are provided for
importing Ant-based projects that were not created in the IDE. The IDE includes a Maven Repository
Browser, as well as graphs for analyzing Maven dependencies.
To identify and solve problems in your applications, such as deadlocks and memory leaks, the
IDE provides a feature-rich debugger and profiler.
When you are testing your applications, the IDE provides tools for using JUnit and TestNG, as
well as code analyzers and, in particular, integration with the popular open source FindBugs tool.
Design GUIs for Java SE, HTML5, Java EE, PHP, C/C++, and Java ME applications quickly and
smoothly by using editors and drag-and-drop tools in the IDE. For Java SE applications, the NetBeans
GUI Builder automatically takes care of correct spacing and alignment, while supporting in-place editing,
as well. The GUI builder is so easy to use and intuitive that it has been used to prototype GUIs live at
customer presentations.
6. IMPLEMENTATION
6.1 Function
Assign a set of random integers to the data points in the database. Sort the points in the database
so that the whole database need not be scanned each time: once even the first two values are not within
epsilon distance of each other, the scan of that dimension can stop early. Find the core set by using
epsilon as a distance measure between the points in each dimension, such that there are at least tau
points in the neighborhood.
A set K of n large integers is randomly generated and used as a one-to-one mapping M : DB →
K to assign a unique label to each point in the database. The signature H of a dense unit U is given by the
sum of the labels of the points in it. These 1-D signatures across different dimensions can be matched
without checking the individual points contained in the dense units. The signature sums are hashed
into a hash table. If the sums collide, then these dense units are the same (with very high probability) and
exist in the subspace {d1, d2, ..., dm}. Thus, the final collisions after hashing all dense units in all
dimensions generate dense units in the relevant maximal subspaces. These dense units are combined to
get the final clusters in their respective subspaces.
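For instance, suppose the points P1, P2 and P3 carry the illustrative labels K1 = 907, K2 = 641 and K3 = 283. If {P1, P2, P3} is dense in dimension d1 and also in dimension d5, both dense units hash to the same signature 907 + 641 + 283 = 1831, and this collision reveals, without comparing any individual points, that the dense unit exists in the subspace {d1, d5}.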
PROCEDURE:
The maximal subspace cluster is generated using this procedure.
STEP 1:
Consider a set K of very large, unique and random positive integers {K1, K2, ..., Kn}. We
define M as a one-to-one mapping function, M : DB → K. Each point Pi ∈ DB is assigned a unique
random integer Ki from the set K.
STEP 2:
In each single dimension, the sorted 1-D projection is scanned to find the dense units, and each
dense unit is split into its (tau+1)-sized combinations to form the core set.
STEP 3:
Then, we create a hash table hTable as follows. In each dimension j, for every dense unit Ua we
calculate the sum of its elements, called its signature Ha, and hash this signature into hTable. If Ha
collides with another signature Hb from dimension k, then the dense unit Ua exists in subspace {j, k}
with extremely high probability. After repeating this process in all single dimensions, each entry of this
hash table will contain a dense unit in the maximal subspace, as we can store the colliding dimensions
against each signature Hi in hTable.
STEP 4:
The dense units in all possible maximal subspaces are processed to create density-reachable sets
and hence, maximal clusters. We use DBSCAN in each found subspace for the clustering process, and
the values of epsilon and tau can be adapted differently as per the dimensionality of the subspace to deal
with the curse of dimensionality.
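A compact sketch of DBSCAN as it would be applied within one found subspace; pts holds the points projected onto that subspace, and eps/minPts play the roles of epsilon and tau above (all names are illustrative):

    import java.util.ArrayDeque;
    import java.util.ArrayList;
    import java.util.Deque;
    import java.util.List;

    public class Dbscan {
        static final int NOISE = -1, UNVISITED = 0;

        // Labels each point with a cluster ID (1..k), or NOISE.
        static int[] cluster(double[][] pts, double eps, int minPts) {
            int[] label = new int[pts.length];       // 0 = unvisited
            int clusterId = 0;
            for (int i = 0; i < pts.length; i++) {
                if (label[i] != UNVISITED) continue;
                List<Integer> seeds = regionQuery(pts, i, eps);
                if (seeds.size() < minPts) { label[i] = NOISE; continue; }
                clusterId++;                         // i is a core point
                label[i] = clusterId;
                Deque<Integer> queue = new ArrayDeque<>(seeds);
                while (!queue.isEmpty()) {
                    int q = queue.poll();
                    if (label[q] == NOISE) label[q] = clusterId; // border point
                    if (label[q] != UNVISITED) continue;
                    label[q] = clusterId;
                    List<Integer> qSeeds = regionQuery(pts, q, eps);
                    if (qSeeds.size() >= minPts) queue.addAll(qSeeds);
                }
            }
            return label;
        }

        // All points within eps (Euclidean distance) of point i, itself included.
        static List<Integer> regionQuery(double[][] pts, int i, double eps) {
            List<Integer> out = new ArrayList<>();
            for (int j = 0; j < pts.length; j++) {
                double d2 = 0;
                for (int k = 0; k < pts[i].length; k++) {
                    double diff = pts[i][k] - pts[j][k];
                    d2 += diff * diff;
                }
                if (Math.sqrt(d2) <= eps) out.add(j);
            }
            return out;
        }
    }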
7.1 Advantages
All the clustering algorithms find the common points in each dimension, but this approach also
finds the interleaved dense units.
Unlike other algorithms, it uses a method of assigning signature so that repeated clusters are not
formed.
We need to scan the database only once, unlike other algorithms which require n database scans
for n dimensions.
Since the data points are sorted before finding dense units, there is no need to scan all the data
points in a dimension when even the first two points do not contribute to the cluster.
This approach finds dense units only in single dimensions, which can then be combined to form
dense units in multiple dimensions.
The number of database scans and the number of clusters to be generated need not be estimated
before runtime.
8. CONCLUSION
The generation of large and high-dimensional data in the last few years has overwhelmed the
data mining community. This approach efficiently finds quality subspace clusters without expensive
database scans or generating trivial clusters in between. We have compared the SUBSCALE algorithm
against recent subspace clustering algorithms, and our proposed algorithm has performed far better when
it comes to handling high-dimensional datasets. However, the main cost in the SUBSCALE algorithm is
the computation of the candidate 1-dimensional dense units. In addition to splitting the hash table
computation, SUBSCALE has a high degree of parallelism, as there is no dependency in computing dense
units across multiple dimensions. We plan to implement a parallel version of our algorithm on General
Purpose Graphics Processing Units (GPGPU) in the near future.
9. REFERENCES
2. Tierney S, Gao J, Guo Y (2014) Subspace clustering for sequential data. In: Computer Vision and
Pattern Recognition (CVPR), 2014 IEEE Conference on. IEEE. pp 1019-1026.
4. Vidal R (2011) Subspace clustering. IEEE Signal Processing Magazine 28(2):52-68.