Introduction to Data

Outline

Attributes and Objects
Types of Data
Data Quality
Similarity and Distance

01/27/2021  Introduction to Data Mining, 2nd Edition
Tan, Steinbach, Karpatne, Kumar
What is Data?

A collection of data objects and their attributes.

An attribute is a property or characteristic of an object
– Examples: eye color of a person, temperature, etc.
– Attribute is also known as variable, field, characteristic, dimension, or feature

A collection of attributes describes an object
– Object is also known as record, point, case, sample, entity, or instance

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Attribute Values

Attribute values are numbers or symbols assigned to an attribute for a particular object.

Distinction between attributes and attribute values:
– The same attribute can be mapped to different attribute values
   Example: height can be measured in feet or meters
– Different attributes can be mapped to the same set of values
   Example: attribute values for ID and age are integers
– But the properties of an attribute can be different from the properties of the values used to represent the attribute
Attribute Types

Nominal: categories, states, or “names of things”
– Hair_color = {auburn, black, blond, brown, grey, red, white}
– marital status, occupation, ID numbers, zip codes

Binary: nominal attribute with only 2 states (0 and 1)
– Symmetric binary: both outcomes equally important
   e.g., gender
– Asymmetric binary: outcomes not equally important
   e.g., medical test (positive vs. negative)
   Convention: assign 1 to the more important outcome (e.g., HIV positive)

Ordinal
– Values have a meaningful order (ranking), but the magnitude between successive values is not known
– Size = {small, medium, large}, grades, army rankings
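A minimal sketch of the nominal/ordinal distinction: nominal values support only equality tests, while ordinal values also support ranking (the rank encoding below is illustrative, not part of the slides).

```python
# Ordinal attribute: order is meaningful, so we can encode a rank.
size_rank = {"small": 0, "medium": 1, "large": 2}

def ordinal_less(a, b):
    """Compare two ordinal size values by their rank."""
    return size_rank[a] < size_rank[b]

# Nominal attribute (hair color): only equality is meaningful.
print("blond" == "red")               # False
print(ordinal_less("small", "large")) # True
# The magnitude of the gap is still unknown: rank(large) - rank(small) = 2
# says nothing about "how much" larger a large item is.
```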
Numeric Attribute Types

Interval
– Measured on a scale of equal-sized units
– Values have order
   E.g., temperature in C˚ or F˚, calendar dates
– No true zero-point

Ratio
– Inherent zero-point
– We can speak of values as being an order of magnitude larger than the unit of measurement (10 K˚ is twice as high as 5 K˚)
– E.g., length, counts, monetary quantities

https://www.graphpad.com/support/faq/what-is-the-difference-between-ordinal-interval-and-ratio-variables-why-should-i-care/
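A quick numeric sketch of why ratios are meaningless on an interval scale: a ratio of Celsius values is not preserved under a change of interval scale (Celsius to Fahrenheit), whereas Kelvin has a true zero.

```python
def c_to_f(c):
    """Convert Celsius to Fahrenheit (an interval-scale change of units)."""
    return c * 9 / 5 + 32

# 10 C is NOT "twice as hot" as 5 C: the ratio changes under rescaling.
ratio_c = 10 / 5                  # 2.0
ratio_f = c_to_f(10) / c_to_f(5)  # 50 / 41, about 1.22

# Kelvin has a true zero-point, so 10 K really is twice 5 K,
# and that ratio survives rescaling by any positive constant.
ratio_k = 10.0 / 5.0
print(ratio_c, ratio_f, ratio_k)
```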
Discrete and Continuous Attributes

Discrete Attribute
– Has only a finite or countably infinite set of values
– Examples: zip codes, counts, or the set of words in a collection of documents
– Often represented as integer variables
– Note: binary attributes are a special case of discrete attributes

Continuous Attribute
– Has real numbers as attribute values
– Examples: temperature, height, or weight
– Practically, real values can only be measured and represented using a finite number of digits
– Continuous attributes are typically represented as floating-point variables
Basic Statistical Descriptions of Data

Motivation
– To better understand the data: central tendency, variation, and spread

Data dispersion characteristics
– median, max, min, quantiles, outliers, variance, etc.

Numerical dimensions correspond to sorted intervals
– Data dispersion: analyzed with multiple granularities of precision
– Boxplot or quantile analysis on sorted intervals

Dispersion analysis on computed measures
– Folding measures into numerical dimensions
– Boxplot or quantile analysis on the transformed cube
Measuring the Central Tendency

Mean (algebraic measure) (sample vs. population):
   sample:      x̄ = (1/n) Σᵢ xᵢ
   population:  μ = (1/N) Σᵢ xᵢ
   Note: n is the sample size and N is the population size.
– Weighted arithmetic mean:  x̄ = (Σᵢ wᵢ xᵢ) / (Σᵢ wᵢ)
– Trimmed mean: chopping extreme values before averaging

Median:
– Middle value if odd number of values, or average of the middle two values otherwise
– Estimated by interpolation (for grouped data):

   median = L₁ + ( (n/2 − (Σ freq)ₗ) / freq_median ) × width

  where L₁ is the lower boundary of the median interval, (Σ freq)ₗ is the sum of frequencies of the intervals below it, freq_median is the frequency of the median interval, and width is the interval width.

Mode
– Value that occurs most frequently in the data
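The three measures above can be computed directly with Python's standard library; a small sketch on made-up data:

```python
import statistics

data = [1, 2, 2, 3, 4, 7, 9]

print(statistics.mean(data))    # 4.0  (sum / n)
print(statistics.median(data))  # 3    (middle value; odd n here)
print(statistics.mode(data))    # 2    (most frequent value)

# A trimmed mean chops extreme values before averaging:
trimmed = sorted(data)[1:-1]    # drop the min and the max
print(statistics.mean(trimmed)) # 3.6
```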
Symmetric vs. Skewed Data

Median, mean, and mode of symmetric, positively skewed, and negatively skewed data.
Measuring the Dispersion of Data

Quartiles, outliers, and boxplots
– Quartiles: Q1 (25th percentile), Q3 (75th percentile)
– Inter-quartile range: IQR = Q3 − Q1
– Five-number summary: min, Q1, median, Q3, max
– Boxplot: ends of the box are the quartiles; median is marked; add whiskers, and plot outliers individually
– Outlier: usually, a value more than 1.5 × IQR below Q1 or above Q3

Variance and standard deviation (sample: s, population: σ)
– Variance (algebraic, scalable computation):

   sample:      s² = (1/(n−1)) Σᵢ (xᵢ − x̄)²  =  (1/(n−1)) [ Σᵢ xᵢ² − (1/n)(Σᵢ xᵢ)² ]
   population:  σ² = (1/N) Σᵢ (xᵢ − μ)²  =  (1/N) Σᵢ xᵢ² − μ²

– Standard deviation s (or σ) is the square root of the variance s² (or σ²)
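A sketch of the five-number summary and the 1.5 × IQR outlier rule on invented data (using the standard library's `statistics.quantiles`, which defaults to the exclusive quantile method):

```python
import statistics

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 40]

q1, q2, q3 = statistics.quantiles(data, n=4)  # quartiles
iqr = q3 - q1

# Usual outlier rule: outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in data if x < lo or x > hi]

print(min(data), q1, q2, q3, max(data))  # five-number summary
print(iqr, outliers)
```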
Boxplot Analysis

Five-number summary of a distribution
– Minimum, Q1, Median, Q3, Maximum

Boxplot
– Data is represented with a box
– The ends of the box are at the first and third quartiles, i.e., the height of the box is the IQR
– The median is marked by a line within the box
– Outliers: points beyond a specified outlier threshold, plotted individually
Graphic Displays of Basic Statistical Descriptions

Boxplot: graphic display of the five-number summary
Histogram: x-axis shows values, y-axis represents frequencies
Quantile plot: each value xᵢ is paired with fᵢ, indicating that approximately 100·fᵢ% of the data are ≤ xᵢ
Quantile-quantile (q-q) plot: graphs the quantiles of one univariate distribution against the corresponding quantiles of another
Scatter plot: each pair of values is a pair of coordinates and plotted as points in the plane
Histogram Analysis

Histogram: graph display of tabulated frequencies, shown as bars
– Shows what proportion of cases fall into each of several categories
– Differs from a bar chart in that it is the area of the bar that denotes the value, not the height as in bar charts; a crucial distinction when the categories are not of uniform width
– The categories are usually specified as non-overlapping intervals of some variable; the categories (bars) must be adjacent
Histogram vs. Bar Graph

Histograms Often Tell More than Boxplots

The two histograms shown on the left may have the same boxplot representation
– The same values for: min, Q1, median, Q3, max
But they have rather different data distributions
Quantile Plot

Displays all of the data (allowing the user to assess both the overall behavior and unusual occurrences)
Plots quantile information
– For data xᵢ sorted in increasing order, fᵢ indicates that approximately 100·fᵢ% of the data are below or equal to the value xᵢ
Quantile-Quantile (Q-Q) Plot

Graphs the quantiles of one univariate distribution against the corresponding quantiles of another
View: is there a shift in going from one distribution to another?
Example shows the unit price of items sold at Branch 1 vs. Branch 2 for each quantile. Unit prices of items sold at Branch 1 tend to be lower than those at Branch 2.
Scatter Plot

Provides a first look at bivariate data to see clusters of points, outliers, etc.
Each pair of values is treated as a pair of coordinates and plotted as points in the plane
Positively and Negatively Correlated Data

The left half of the figure is positively correlated
The right half is negatively correlated
Uncorrelated Data

Important Characteristics of Data

– Dimensionality (number of attributes)
   High-dimensional data brings a number of challenges
– Sparsity
   Only presence counts
– Resolution
   Patterns depend on the scale
– Size
   The type of analysis may depend on the size of the data
Types of data sets

Record
– Data Matrix
– Document Data
– Transaction Data
Graph
– World Wide Web
– Molecular Structures
Ordered
– Spatial Data
– Temporal Data
– Sequential Data
– Genetic Sequence Data
Record Data

Data that consists of a collection of records, each of which consists of a fixed set of attributes

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
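As a sketch, the record data set above can be held as a list of records with a fixed set of attributes (plain dicts here; in practice a pandas DataFrame would be typical):

```python
records = [
    {"Tid": 1,  "Refund": "Yes", "Marital Status": "Single",   "Taxable Income": 125, "Cheat": "No"},
    {"Tid": 2,  "Refund": "No",  "Marital Status": "Married",  "Taxable Income": 100, "Cheat": "No"},
    {"Tid": 3,  "Refund": "No",  "Marital Status": "Single",   "Taxable Income": 70,  "Cheat": "No"},
    {"Tid": 4,  "Refund": "Yes", "Marital Status": "Married",  "Taxable Income": 120, "Cheat": "No"},
    {"Tid": 5,  "Refund": "No",  "Marital Status": "Divorced", "Taxable Income": 95,  "Cheat": "Yes"},
    {"Tid": 6,  "Refund": "No",  "Marital Status": "Married",  "Taxable Income": 60,  "Cheat": "No"},
    {"Tid": 7,  "Refund": "Yes", "Marital Status": "Divorced", "Taxable Income": 220, "Cheat": "No"},
    {"Tid": 8,  "Refund": "No",  "Marital Status": "Single",   "Taxable Income": 85,  "Cheat": "Yes"},
    {"Tid": 9,  "Refund": "No",  "Marital Status": "Married",  "Taxable Income": 75,  "Cheat": "No"},
    {"Tid": 10, "Refund": "No",  "Marital Status": "Single",   "Taxable Income": 90,  "Cheat": "Yes"},
]

# Every record has the same fixed set of attributes:
assert all(r.keys() == records[0].keys() for r in records)

cheaters = [r["Tid"] for r in records if r["Cheat"] == "Yes"]
print(cheaters)  # [5, 8, 10]
```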
Data Matrix

If data objects have the same fixed set of numeric attributes, then the data objects can be thought of as points in a multi-dimensional space, where each dimension represents a distinct attribute.

Such a data set can be represented by an m by n matrix, where there are m rows, one for each object, and n columns, one for each attribute.

Projection of x Load  Projection of y Load  Distance  Load  Thickness
10.23                 5.27                  15.22     2.7   1.2
12.65                 6.25                  16.22     2.2   1.1
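The two-object, five-attribute data set above as an m × n matrix; plain nested lists here for a self-contained sketch, though in practice a NumPy array would be typical:

```python
X = [
    [10.23, 5.27, 15.22, 2.7, 1.2],   # object 1 (one row per object)
    [12.65, 6.25, 16.22, 2.2, 1.1],   # object 2
]
m, n = len(X), len(X[0])
print(m, n)                    # 2 5: m objects, n attributes

col0 = [row[0] for row in X]   # one attribute (column) across all objects
print(col0)                    # [10.23, 12.65]
```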
Document Data

Each document becomes a ‘term’ vector
– Each term is a component (attribute) of the vector
– The value of each component is the number of times the corresponding term occurs in the document

             team  coach  play  ball  score  game  win  lost  timeout  season
Document 1    3     0      5     0     2      6     0    2     0        2
Document 2    0     7      0     2     1      0     0    3     0        0
Document 3    0     1      0     0     1      2     2    0     3        0
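A minimal sketch of turning raw documents into term vectors (word counts over a shared vocabulary); the two mini-documents are invented for illustration:

```python
from collections import Counter

docs = [
    "team play play score team win",
    "coach game season win",
]
# Shared vocabulary: every distinct term becomes a vector component.
vocab = sorted(set(w for d in docs for w in d.split()))

def term_vector(doc):
    """Count how often each vocabulary term occurs in the document."""
    counts = Counter(doc.split())
    return [counts[t] for t in vocab]

vectors = [term_vector(d) for d in docs]
print(vocab)
print(vectors)
```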
Transaction Data

A special type of record data, where
– Each transaction involves a set of items
– For example, consider a grocery store. The set of products purchased by a customer during one shopping trip constitutes a transaction, while the individual products that were purchased are the items.
– Can represent transaction data as record data

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk
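The "represent transaction data as record data" point can be sketched by converting each item set into an asymmetric binary record (1 = item present in the transaction):

```python
transactions = {
    1: {"Bread", "Coke", "Milk"},
    2: {"Beer", "Bread"},
    3: {"Beer", "Coke", "Diaper", "Milk"},
    4: {"Beer", "Bread", "Diaper", "Milk"},
    5: {"Coke", "Diaper", "Milk"},
}
# One binary attribute per distinct item, in a fixed order.
items = sorted(set().union(*transactions.values()))

binary = {tid: [int(i in basket) for i in items]
          for tid, basket in transactions.items()}

print(items)      # ['Beer', 'Bread', 'Coke', 'Diaper', 'Milk']
print(binary[3])  # [1, 0, 1, 1, 1]
```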
Graph Data

Examples: generic graph, a molecule, and webpages

Benzene molecule: C6H6
Ordered Data

Sequences of transactions: each element of the sequence is a set of items/events
Ordered Data

Genomic sequence data

GGTTCCGCCTTCAGCCCCGCGCC
CGCAGGGCCCGCCCCGCGCCGTC
GAGAAGGGCCCGCCTGGCGGGCG
GGGGGAGGCGGGGCCGCCCGAGC
CCAACCGAGTCCGACCAGGTGCC
CCCTCTGCTCGGCCTAGACCTGA
GCTCATTAGGCGGCAGCGGACAG
GCCAAGTAGAACACGCGAAGCGC
TGGGCTGCCTGCTGCGACCAGGG

Ordered Data

Spatio-Temporal Data

Average Monthly
Temperature of
land and ocean

Similarity and Dissimilarity Measures

Similarity measure
– Numerical measure of how alike two data objects are
– Higher when objects are more alike
– Often falls in the range [0, 1]
Dissimilarity measure
– Numerical measure of how different two data objects are
– Lower when objects are more alike
– Minimum dissimilarity is often 0
– Upper limit varies
Proximity refers to either a similarity or a dissimilarity
Similarity/Dissimilarity for Simple Attributes

The following table shows the similarity and dissimilarity between two objects, x and y, with respect to a single, simple attribute.

Attribute Type     Dissimilarity                       Similarity
Nominal            d = 0 if x = y, d = 1 if x ≠ y      s = 1 if x = y, s = 0 if x ≠ y
Ordinal            d = |x − y| / (n − 1)               s = 1 − d
                   (values mapped to integers 0 to n−1, where n is the number of values)
Interval or Ratio  d = |x − y|                         s = −d, s = 1/(1 + d), or
                                                       s = 1 − (d − min_d)/(max_d − min_d)
Euclidean Distance

   d(x, y) = sqrt( Σₖ₌₁ⁿ (xₖ − yₖ)² )

where n is the number of dimensions (attributes) and xₖ and yₖ are, respectively, the kth attributes (components) of data objects x and y.

Standardization is necessary if scales differ.
Euclidean Distance

point  x  y
p1     0  2
p2     2  0
p3     3  1
p4     5  1

       p1     p2     p3     p4
p1     0      2.828  3.162  5.099
p2     2.828  0      1.414  3.162
p3     3.162  1.414  0      2
p4     5.099  3.162  2      0

Distance Matrix
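The distance matrix above can be reproduced directly from the definition; a short sketch:

```python
import math

points = {"p1": (0, 2), "p2": (2, 0), "p3": (3, 1), "p4": (5, 1)}

def euclidean(x, y):
    """d(x, y) = sqrt(sum over k of (x_k - y_k)^2)."""
    return math.sqrt(sum((xk - yk) ** 2 for xk, yk in zip(x, y)))

# Reproduce the distance matrix, rounded to 3 decimals:
for a in points:
    row = [round(euclidean(points[a], points[b]), 3) for b in points]
    print(a, row)
```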
Minkowski Distance

Minkowski Distance is a generalization of Euclidean Distance:

   d(x, y) = ( Σₖ₌₁ⁿ |xₖ − yₖ|ʳ )^(1/r)

where r is a parameter, n is the number of dimensions (attributes), and xₖ and yₖ are, respectively, the kth attributes (components) of data objects x and y.
Minkowski Distance: Examples

r = 1. City block (Manhattan, taxicab, L1 norm) distance
– A common example of this for binary vectors is the Hamming distance, which is just the number of bits that are different between two binary vectors

r = 2. Euclidean distance

r → ∞. “Supremum” (Lmax norm, L∞ norm) distance
– This is the maximum difference between any component of the vectors

Do not confuse r with n; all of these distances are defined for any number of dimensions.
Minkowski Distance

point  x  y
p1     0  2
p2     2  0
p3     3  1
p4     5  1

L1     p1  p2  p3  p4
p1     0   4   4   6
p2     4   0   2   4
p3     4   2   0   2
p4     6   4   2   0

L2     p1     p2     p3     p4
p1     0      2.828  3.162  5.099
p2     2.828  0      1.414  3.162
p3     3.162  1.414  0      2
p4     5.099  3.162  2      0

L∞     p1  p2  p3  p4
p1     0   2   3   5
p2     2   0   1   3
p3     3   1   0   2
p4     5   3   2   0

Distance Matrix
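A sketch of the general Minkowski definition, checked against the p1/p2 entries of the three matrices above:

```python
def minkowski(x, y, r):
    """d(x, y) = (sum over k of |x_k - y_k|^r)^(1/r); r = inf gives the supremum."""
    if r == float("inf"):  # supremum (L_max) distance
        return max(abs(xk - yk) for xk, yk in zip(x, y))
    return sum(abs(xk - yk) ** r for xk, yk in zip(x, y)) ** (1 / r)

p1, p2 = (0, 2), (2, 0)
print(minkowski(p1, p2, 1))             # 4.0   (city block, L1)
print(round(minkowski(p1, p2, 2), 3))   # 2.828 (Euclidean, L2)
print(minkowski(p1, p2, float("inf")))  # 2     (supremum, L_inf)
```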
Common Properties of a Distance

Distances, such as the Euclidean distance, have some well-known properties.

1. d(x, y) ≥ 0 for all x and y, and d(x, y) = 0 if and only if x = y. (Positivity)
2. d(x, y) = d(y, x) for all x and y. (Symmetry)
3. d(x, z) ≤ d(x, y) + d(y, z) for all points x, y, and z. (Triangle Inequality)

where d(x, y) is the distance (dissimilarity) between points (data objects) x and y.

A distance that satisfies these properties is a metric.
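The three properties can be spot-checked numerically; a sketch using the four points from the earlier distance-matrix example:

```python
import itertools
import math

points = [(0, 2), (2, 0), (3, 1), (5, 1)]

def d(x, y):
    return math.dist(x, y)  # Euclidean distance (Python 3.8+)

# 1. Positivity: d >= 0, and d = 0 exactly when the points coincide.
assert all(d(x, y) > 0 for x, y in itertools.combinations(points, 2))
assert all(d(x, x) == 0 for x in points)
# 2. Symmetry.
assert all(d(x, y) == d(y, x) for x, y in itertools.combinations(points, 2))
# 3. Triangle inequality over all triples (tiny tolerance for floating point).
assert all(d(x, z) <= d(x, y) + d(y, z) + 1e-12
           for x, y, z in itertools.product(points, repeat=3))
print("Euclidean distance behaves as a metric on these points")
```

This only verifies the properties on a finite sample, of course; the Euclidean distance satisfies them in general.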
Common Properties of a Similarity

Similarities also have some well-known properties.

1. s(x, y) = 1 (or maximum similarity) only if x = y. (does not always hold, e.g., cosine)
2. s(x, y) = s(y, x) for all x and y. (Symmetry)

where s(x, y) is the similarity between points (data objects) x and y.
Similarity Between Binary Vectors

A common situation is that objects, x and y, have only binary attributes.

Compute similarities using the following quantities:
f01 = the number of attributes where x was 0 and y was 1
f10 = the number of attributes where x was 1 and y was 0
f00 = the number of attributes where x was 0 and y was 0
f11 = the number of attributes where x was 1 and y was 1

Simple Matching Coefficient (SMC)
– counts both presences and absences equally; normally used for symmetric binary attributes

SMC = number of matches / number of attributes
    = (f11 + f00) / (f01 + f10 + f11 + f00)
Similarity Between Binary Vectors

Jaccard Coefficient
– counts only presences; frequently used for asymmetric binary attributes

J = number of 11 matches / number of non-zero attributes
  = f11 / (f01 + f10 + f11)
SMC versus Jaccard: Example

x = 1 0 0 0 0 0 0 0 0 0
y = 0 0 0 0 0 0 1 0 0 1

f01 = 2 (the number of attributes where x was 0 and y was 1)
f10 = 1 (the number of attributes where x was 1 and y was 0)
f00 = 7 (the number of attributes where x was 0 and y was 0)
f11 = 0 (the number of attributes where x was 1 and y was 1)

SMC = (f11 + f00) / (f01 + f10 + f11 + f00) = (0 + 7) / (2 + 1 + 0 + 7) = 0.7

J = f11 / (f01 + f10 + f11) = 0 / (2 + 1 + 0) = 0
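Both coefficients are easy to compute from the definitions; a sketch reproducing the example above:

```python
def smc(x, y):
    """Simple matching coefficient: matches (f11 + f00) over all attributes."""
    matches = sum(1 for a, b in zip(x, y) if a == b)
    return matches / len(x)

def jaccard(x, y):
    """Jaccard: f11 over the non-zero attributes (f01 + f10 + f11)."""
    f11 = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    nonzero = sum(1 for a, b in zip(x, y) if a == 1 or b == 1)
    return f11 / nonzero if nonzero else 0.0

x = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
print(smc(x, y))      # 0.7
print(jaccard(x, y))  # 0.0
```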
Cosine Similarity

If d1 and d2 are two document vectors, then
   cos(d1, d2) = <d1, d2> / (||d1|| ||d2||),
where <d1, d2> indicates the inner (dot) product of the vectors d1 and d2, and ||d|| is the length of vector d.

Example:
d1 = 3 2 0 5 0 0 0 2 0 0
d2 = 1 0 0 0 0 0 0 1 0 2
<d1, d2> = 3·1 + 2·0 + 0·0 + 5·0 + 0·0 + 0·0 + 0·0 + 2·1 + 0·0 + 0·2 = 5
||d1|| = (3² + 2² + 0² + 5² + 0² + 0² + 0² + 2² + 0² + 0²)^0.5 = 42^0.5 = 6.481
||d2|| = (1² + 0² + 0² + 0² + 0² + 0² + 0² + 1² + 0² + 2²)^0.5 = 6^0.5 = 2.449
cos(d1, d2) = 0.315
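The worked example above, as a short sketch:

```python
import math

def cosine(d1, d2):
    """cos(d1, d2) = <d1, d2> / (||d1|| * ||d2||)."""
    dot = sum(a * b for a, b in zip(d1, d2))
    n1 = math.sqrt(sum(a * a for a in d1))
    n2 = math.sqrt(sum(b * b for b in d2))
    return dot / (n1 * n2)

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine(d1, d2), 3))  # 0.315
```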
Correlation

Correlation measures the linear relationship between objects. For two data objects x and y with n attributes:

   corr(x, y) = covariance(x, y) / (std(x) · std(y)) = [ Σₖ (xₖ − x̄)(yₖ − ȳ) / (n − 1) ] / (sₓ s_y)
Drawback of Correlation (Non-linear Data)

x = (−3, −2, −1, 0, 1, 2, 3)
y = (9, 4, 1, 0, 1, 4, 9)

yᵢ = xᵢ²

mean(x) = 0, mean(y) = 4
std(x) = 2.16, std(y) = 3.74

corr = [(−3)(5) + (−2)(0) + (−1)(−3) + (0)(−4) + (1)(−3) + (2)(0) + (3)(5)] / (6 × 2.16 × 3.74) = 0

So y is completely determined by x, yet their correlation is 0: correlation captures only linear relationships.
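The zero numerator in the example above can be checked directly; a sketch:

```python
# Pearson correlation is 0 for this perfectly (but non-linearly)
# related pair: y is fully determined by x, yet corr(x, y) = 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [xi ** 2 for xi in x]   # y_i = x_i^2

mean_x = sum(x) / len(x)    # 0
mean_y = sum(y) / len(y)    # 4

# Numerator of the correlation (the covariance sum):
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
print(cov)  # 0, hence corr = 0 regardless of the standard deviations
```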
Correlation vs. Cosine vs. Euclidean Distance

Choice of the right proximity measure depends on the domain.

What is the correct choice of proximity measure for the following situations?
– Comparing documents using the frequencies of words
   Documents are considered similar if the word frequencies are similar
– Comparing the temperature in Celsius of two locations
   Two locations are considered similar if the temperatures are similar in magnitude
– Comparing two time series of temperature measured in Celsius
   Two time series are considered similar if their “shape” is similar, i.e., they vary in the same way over time, achieving minimums and maximums at similar times, etc.
