
Data Preprocessing

The document discusses data mining and the data pre-processing steps involved. It explains that data mining involves extracting knowledge from large amounts of data, rather than just extracting data. The data pre-processing steps include data cleaning, integration, selection, transformation and mining patterns. Data cleaning aims to fill in missing values, smooth out noise and correct inconsistencies. Common techniques for handling missing values include ignoring values, manual filling, and using statistical measures like mean/median.

Uploaded by

salman khan


Data Mining

By :
Parul Chauhan
Assistant Prof.
Data Mining
▶ Huge amounts of data are added every day to our computer
networks, the World Wide Web, and various storage
devices, from sources such as media, Facebook, and science.

▶ Example:
▶ Walmart handles hundreds of millions of transactions
per week at thousands of branches.

Parul Chauhan (Assistant Prof.) RTU,Kota

▶ Solution:

▶ Powerful and versatile tools are badly needed to
automatically uncover valuable information from the
tremendous amounts of data and to transform such
data into organized knowledge.

▶ This led to the birth of data mining.


KDD
▶ Data mining does not mean merely extracting data from
huge amounts of data.

▶ It actually means extracting knowledge from huge
amounts of data.

▶ It is therefore also known as knowledge discovery from
data (KDD).

▶ Knowledge discovery is the bigger process, and data
mining is just one part of it.


Figure: Data mining as a step in the process of knowledge discovery
KDD
▶ 1. Data cleaning- To remove noise and inconsistent data.
Performed on the various sources.
▶ 2. Data Integration- Where multiple data sources may be
combined.
▶ 3. Data Selection- Where data relevant to the analysis
task are retrieved from the database.
▶ 4. Data Transformation- Where data are transformed
into forms appropriate for mining, for example by performing
aggregation operations.


KDD
▶ 5. Data Mining - Process where intelligent methods are
applied to extract data patterns.
▶ 6. Pattern evaluation - To identify the truly interesting
patterns representing knowledge based on interestingness
measures.
▶ 7. Knowledge Representation - Where visualization and
knowledge representation techniques are used to present
mined knowledge to users.
▶ Steps 1-4 are different forms of pre-processing, where data
are prepared for mining.


What kinds of data can be mined?
1. Relational database- A collection of tables, each
of which is assigned a unique name. Each table
consists of a set of attributes (columns or fields) and
usually stores a large set of tuples (records or rows).

▶ It can be accessed by database queries written in a
language such as SQL.

▶ Example- How many people with income up to
100k are defaulted borrowers?
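
The example query above can be sketched with Python's built-in sqlite3 module. The table name `borrowers`, its columns, and the four rows are hypothetical and invented only for illustration; the slides do not define a schema.

```python
import sqlite3

# Hypothetical schema and rows, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE borrowers (name TEXT, income INTEGER, defaulted INTEGER)"
)
conn.executemany(
    "INSERT INTO borrowers VALUES (?, ?, ?)",
    [
        ("A", 40_000, 1),
        ("B", 85_000, 0),
        ("C", 95_000, 1),
        ("D", 150_000, 1),
    ],
)

# Count the defaulted borrowers whose income is at most 100k.
(defaulted_count,) = conn.execute(
    "SELECT COUNT(*) FROM borrowers WHERE income <= 100000 AND defaulted = 1"
).fetchone()
print(defaulted_count)
conn.close()
```

Here each tuple is one borrower and each attribute is a column, so the question becomes a single filtered count.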


Relational Database


2. Data Warehouse- A repository of information
collected from multiple sources, stored under a
unified schema, and usually residing at a single site.

The data in the data warehouse are organized around
major subjects, such as customer, item, supplier, and
activity. The data are stored to provide information
from a historical perspective (such as from the past 5-10
years) and are typically summarized.


▶ Example-

◦ An international company has branches all over the world.
◦ Each branch has its own set of databases.
◦ The owner has asked you to provide an analysis of the
company’s sales per item per branch for the third
quarter.
◦ This is difficult because the data are spread out over several databases.
◦ If the company had a data warehouse, this task would be easy.

▶ Rather than storing the details of each sales transaction,
the data warehouse may store a summary of the
transactions per item type for each store.


3. Transactional Databases- Each record represents a
transaction.

A transaction includes a unique transaction identity
number (trans_ID) and a list of the items making up
the transaction.

So as an analyst we can ask: Which items sold well
together?

This kind of analysis is called Market Basket Analysis.

Such analysis can be used to boost sales.
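
A minimal sketch of the idea behind market basket analysis: count how often each pair of items appears in the same transaction. The transaction IDs and item names below are made up for illustration, and counting pairs is only the simplest form of this analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: trans_ID -> list of items (made-up data).
transactions = {
    "T100": ["bread", "milk", "butter"],
    "T200": ["bread", "milk"],
    "T300": ["milk", "eggs"],
    "T400": ["bread", "butter"],
}

pair_counts = Counter()
for items in transactions.values():
    # Sort and deduplicate so each pair is counted once per transaction.
    for pair in combinations(sorted(set(items)), 2):
        pair_counts[pair] += 1

# Pairs with the highest counts are the items most often sold together.
for pair, count in pair_counts.most_common():
    print(pair, count)
```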

Data Pre-processing
▶ Today’s real-world databases are highly susceptible to
noisy, missing, and inconsistent data due to their
typically huge size and their likely origin from multiple,
heterogeneous sources.

▶ Low-quality data will lead to low-quality mining results.

“How can the data be preprocessed in order to help improve
the quality of the data and, consequently, of the mining
results?”


Data Quality: Why Preprocess the Data?
▶ Imagine that you are a manager at AllElectronics and
have been charged with analyzing the company’s data
with respect to your branch’s sales.

▶ You carefully inspect the company’s database and
data warehouse, identifying and selecting the
attributes (e.g., item, price, and units sold) to be
included in your analysis.

▶ You notice that several of the attributes for various
tuples have no recorded value.
▶ For your analysis, you would like to include information as to
whether each item purchased was advertised as on sale, yet
you discover that this information has not been recorded.

▶ Furthermore, users of your database system have reported
errors, unusual values, and inconsistencies in the data
recorded for some transactions.

▶ In other words, the data you wish to analyze by data mining
techniques are incomplete (lacking attribute values or certain
attributes of interest, or containing only aggregate data);
inaccurate or noisy (containing errors, or values that deviate
from the expected); and inconsistent (e.g., containing
discrepancies in the department codes used to categorize items).


▶ This scenario illustrates three of the elements defining
data quality: accuracy, completeness, and consistency.
▶ There are many possible reasons for inaccurate data:
a) The data collection instruments used may be faulty.
b) There may have been human or computer errors occurring at data entry.
c) Users may purposely submit incorrect data values for mandatory
fields when they do not wish to submit personal information.


Incomplete data

a) Attributes of interest may not always be available.
b) Data may not be included simply because they were
not considered important at the time of entry.
c) Relevant data may not be recorded due to a
misunderstanding or because of equipment
malfunctions.


Inconsistent data

▶ Data that were inconsistent with other recorded data
may have been deleted.

▶ The recording of the data history or modifications may
have been overlooked.


Two other factors affecting data quality are believability
and interpretability.

▶ Believability reflects how much the data are trusted
by users.

▶ Interpretability reflects how easily the data are
understood.


Major Tasks in Data Preprocessing


1. DATA CLEANING

▶ Real-world data tend to be incomplete, noisy, and
inconsistent.

▶ Data cleaning (or data cleansing) routines attempt to fill in
missing values, smooth out noise while identifying outliers,
and correct inconsistencies in the data.

▶ A) Missing values
▶ B) Noisy data


A) Missing Values
▶ Imagine that you need to analyze AllElectronics sales
and customer data. You note that many tuples have
no recorded value for several attributes such as
customer income.

▶ How can you go about filling in the missing values for
this attribute?

▶ The methods include:


1. Ignore the tuple: This method is not very effective, unless the
tuple contains several attributes with missing values.
By ignoring the tuple, we do not make use of the remaining
attributes’ values in the tuple. Such data could have been useful to
the task at hand.

2. Fill in the missing value manually: This approach is time
consuming and may not be feasible given a large data set with
many missing values.

3. Use a measure of central tendency for the
attribute to fill in the missing value:
▶ For example, suppose that the data distribution regarding the
income of AllElectronics customers is symmetric and that the
mean income is $56,000. Use this value to replace the missing
value for income.


4. Use a global constant to fill in the missing value.

5. Use the attribute mean or median for all samples
belonging to the same class as the given tuple.

6. Use the most probable value to fill in the missing
value.
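
Methods 3 and 5 can be sketched as follows. The customer tuples, the `risk` class attribute, and the function names are hypothetical and chosen only for illustration; `None` stands for a missing income.

```python
from statistics import mean

# Hypothetical customer tuples; None marks a missing income.
customers = [
    {"risk": "low", "income": 60_000},
    {"risk": "low", "income": 70_000},
    {"risk": "low", "income": None},
    {"risk": "high", "income": 30_000},
    {"risk": "high", "income": 40_000},
    {"risk": "high", "income": None},
]


def fill_with_overall_mean(rows):
    """Method 3: replace missing incomes with the mean of all known incomes."""
    overall = mean(r["income"] for r in rows if r["income"] is not None)
    return [
        dict(r, income=overall if r["income"] is None else r["income"])
        for r in rows
    ]


def fill_with_class_mean(rows, class_key="risk"):
    """Method 5: replace missing incomes with the mean of the tuple's own class.

    Assumes every class has at least one known income.
    """
    known = {}
    for r in rows:
        if r["income"] is not None:
            known.setdefault(r[class_key], []).append(r["income"])
    class_means = {label: mean(values) for label, values in known.items()}
    return [
        dict(r, income=class_means[r[class_key]] if r["income"] is None else r["income"])
        for r in rows
    ]


overall_filled = fill_with_overall_mean(customers)
class_filled = fill_with_class_mean(customers)
```

The class-based fill uses more of the information in the tuple, which is why it usually gives a closer estimate than a single overall mean.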


Note:
▶ In some cases, a missing value may not imply an error
in the data!
For example, when applying for a credit card, candidates
may be asked to supply their driver’s license number.
Candidates who do not have a driver’s license may
naturally leave this field blank.
▶ Forms should allow respondents to specify values
such as “not applicable.”


B) Noisy data
▶ Noise is a random error or variance in a measured
variable.
▶ Let’s look at the following data smoothing techniques:

i. Binning: smooths a sorted data value by consulting
its “neighborhood,” that is, the values around it.
ii. Regression
iii. Clustering / outlier analysis


i) Binning
▶ The sorted values are distributed into a number of
“buckets,” or bins.
▶ Because binning methods consult the neighborhood of
values, they perform local smoothing.
▶ The original data values are divided into small
intervals known as bins and then they are replaced by
a general value calculated for that bin.


Steps for Binning:

1. Sort the given data set.

2. Divide the sorted values into N bins, each containing
approximately the same number of samples (equal-depth
partitioning).

3. Replace every value in each bin with the bin mean, the bin
median, or the closest bin boundary.


Sorted data for price
4, 8, 15, 21, 21, 24, 25, 28, 34

Partition into bins

Bin 1: 4, 8, 15
Bin 2: 21,21,24
Bin 3: 25,28,34

A) Smoothing by bin means:
Bin 1: 9,9,9
Bin 2: 22,22,22
Bin 3: 29,29,29


4, 8, 15, 21, 21, 24, 25, 28, 34
B) Smoothing by bin boundaries:
Bin 1: 4,4,15
Bin 2: 21,21,24
Bin 3: 25,25,34

C) Smoothing by bin median:

Bin 1: 8,8,8
Bin 2: 21,21,21
Bin 3: 28,28,28
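
The three smoothing variants worked through above can be sketched in Python as follows. The function names are hypothetical, bin means are rounded to whole numbers to match the slides, and a value exactly halfway between the two boundaries is sent to the lower one (an assumption, since the slides do not cover ties).

```python
from statistics import mean, median


def equal_depth_bins(values, n_bins):
    """Sort the values and split them into n_bins bins of (nearly) equal size."""
    data = sorted(values)
    size, extra = divmod(len(data), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        bins.append(data[start:end])
        start = end
    return bins


def smooth_by_means(bins):
    # Every value in a bin is replaced by the (rounded) bin mean.
    return [[round(mean(b))] * len(b) for b in bins]


def smooth_by_medians(bins):
    # Every value in a bin is replaced by the bin median.
    return [[median(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    # Each value is replaced by the closer of the bin minimum and maximum.
    return [[b[0] if v - b[0] <= b[-1] - v else b[-1] for v in b] for b in bins]


prices = [4, 8, 15, 21, 21, 24, 25, 28, 34]
bins = equal_depth_bins(prices, 3)
print(smooth_by_means(bins))
print(smooth_by_boundaries(bins))
print(smooth_by_medians(bins))
```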


Sorted data:
410, 451, 492, 533, 533, 575, 615, 656, 697,
738, 779, 820

Partition into bins

Bin 1: 410, 451, 492
Bin 2: 533, 533, 575
Bin 3: 615, 656, 697
Bin 4: 738, 779, 820


410, 451, 492, 533, 533, 575, 615, 656, 697,
738, 779, 820

A) Smoothing by bin means:
Bin 1: 451, 451, 451
Bin 2: 547, 547, 547
Bin 3: 656, 656, 656
Bin 4: 779, 779, 779

Cons of data smoothing

▶ Data smoothing doesn’t always provide a clear
explanation of the patterns in the data.

▶ Certain data points may be ignored because the
smoothing focuses on the other data points.


ii) Regression
▶ Regression is a data mining function that predicts a
number. Age, weight, distance, temperature, income, or
sales could all be predicted using regression techniques.

▶ For example, a regression model could be used to predict
children's height, given their age, weight, and other factors.

▶ Data can be smoothed by fitting the data to a function, such
as with regression. (Linear regression finds the best line to fit
two attributes, so that one attribute can be used to predict the
other.)
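
A minimal sketch of smoothing by linear regression, assuming an ordinary least-squares fit of one attribute against another. The sample values are hypothetical, and the helper names `fit_line` and `smooth` are illustrative.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def smooth(xs, ys):
    """Replace each observed y by the value the fitted line predicts for its x."""
    slope, intercept = fit_line(xs, ys)
    return [slope * x + intercept for x in xs]


# Hypothetical noisy measurements scattered around a straight line.
xs = [1, 2, 3, 4, 5]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
print(smooth(xs, ys))
```

The smoothed values all lie on the fitted line, so the random scatter in the original measurements is removed.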


iii) Clustering

▶ Outliers may be detected by clustering, where similar
values are organized into groups or clusters.

▶ Values that fall outside of the set of clusters may be
considered outliers.
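
As a deliberately simple illustration, the sketch below clusters one-dimensional values by the gaps between sorted neighbours and flags values in very small clusters as outliers. This gap rule, the `max_gap` and `min_size` parameters, and the data are assumptions made for the example; real clustering algorithms are more elaborate.

```python
def gap_clusters(values, max_gap):
    """Group sorted 1-D values; a new cluster starts when the gap exceeds max_gap."""
    data = sorted(values)
    clusters = [[data[0]]]
    for v in data[1:]:
        if v - clusters[-1][-1] <= max_gap:
            clusters[-1].append(v)
        else:
            clusters.append([v])
    return clusters


def find_outliers(values, max_gap, min_size=2):
    """Values that end up in clusters smaller than min_size are treated as outliers."""
    return [
        v
        for cluster in gap_clusters(values, max_gap)
        if len(cluster) < min_size
        for v in cluster
    ]


# Hypothetical measurements: two tight groups and one isolated value.
values = [10, 11, 12, 13, 50, 51, 52, 200]
print(gap_clusters(values, max_gap=5))
print(find_outliers(values, max_gap=5))
```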


2. Data Integration
▶ Data integration—the merging of data from multiple data
stores.

▶ Careful integration can help reduce and avoid
redundancies and inconsistencies in the resulting data set.
▶ The heterogeneity and structure of data pose great
challenges in data integration.

a) Entity Identification Problem

b) Redundancy
c) Data Value Conflict Detection and Resolution


1. Entity Identification Problem

▶ For example, how can the data analyst or the computer be sure
that customer_id in one database and cust_number in
another refer to the same attribute?


2. Redundancy
▶ Redundancy is another important issue in data
integration.
▶ An attribute (such as annual revenue, for instance)
may be redundant if it can be “derived” from another
attribute or set of attributes.
▶ Some redundancies can be detected by correlation
analysis.
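
One common form of correlation analysis for numeric attributes is the Pearson correlation coefficient. The sketch below is a minimal illustration: the attribute names and figures are hypothetical, and a coefficient close to +1 or -1 only suggests that one attribute may be redundant.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical attributes: annual revenue is derived from monthly revenue,
# so the two attributes carry the same information.
monthly_revenue = [10, 12, 9, 15]
annual_revenue = [12 * m for m in monthly_revenue]

r = pearson(monthly_revenue, annual_revenue)
print(round(r, 3))
```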


3. Data Value Conflict Detection and
Resolution

▶ For example, for the same real-world entity,
attribute values from different sources may differ.

▶ This may be due to differences in representation,
scaling, or encoding.

▶ For instance, a weight attribute may be stored in
metric units in one system and British imperial units in
another.


3. Data Reduction
▶ Imagine that you have selected data from the
AllElectronics data warehouse for analysis.

▶ The data set will likely be huge! Complex data analysis and
mining on huge amounts of data can take a long time,
making such analysis impractical or infeasible.

▶ Data reduction techniques can be applied to obtain a
reduced representation of the data set that is much
smaller in volume, yet closely maintains the integrity of
the original data.


Data reduction strategies include:

a) Dimensionality reduction,

b) Numerosity reduction, and

c) Data compression.


i) Dimensionality reduction

▶ Dimensionality reduction is the process of reducing
the number of random variables or attributes under
consideration.

▶ Dimensionality reduction methods include wavelet
transforms, which transform or project the original
data onto a smaller space.


ii) Numerosity reduction
▶ Numerosity reduction techniques replace the original
data volume by alternative, smaller forms of data
representation. These techniques may be parametric or
nonparametric.

▶ For parametric methods, a model is used to estimate the
data, so that typically only the data parameters need to be
stored, instead of the actual data. Example: regression.

▶ Nonparametric methods for storing reduced
representations of the data include histograms, clustering,
sampling, and data cube aggregation.


a) Data Cube Aggregation
▶ Imagine that you have collected the data for your analysis.
These data consist of the AllElectronics sales per quarter,
for the years 2008 to 2010.

▶ You are, however, interested in the annual sales (total per
year), rather than the total per quarter.

▶ Thus, the data can be aggregated so that the resulting
data summarize the total sales per year instead of per
quarter.
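
The quarterly-to-annual roll-up described above can be sketched as a simple group-and-sum. The sales figures are hypothetical placeholders, since the slides do not give the actual amounts.

```python
from collections import defaultdict

# Hypothetical quarterly sales, keyed by (year, quarter).
quarterly_sales = {
    (2008, "Q1"): 100, (2008, "Q2"): 150, (2008, "Q3"): 120, (2008, "Q4"): 130,
    (2009, "Q1"): 110, (2009, "Q2"): 160, (2009, "Q3"): 140, (2009, "Q4"): 190,
    (2010, "Q1"): 200, (2010, "Q2"): 180, (2010, "Q3"): 170, (2010, "Q4"): 150,
}

# Aggregate away the quarter dimension, keeping one total per year.
annual_sales = defaultdict(int)
for (year, _quarter), amount in quarterly_sales.items():
    annual_sales[year] += amount

print(dict(annual_sales))
```

Twelve stored values shrink to three, with no loss of the information needed for an annual analysis.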

▶ Data cubes store multidimensional aggregated
information.


b) Histograms
▶ Histograms use binning to approximate data distributions and are a
popular form of data reduction.

▶ A histogram for an attribute, A, partitions the data distribution of A
into subsets, referred to as buckets or bins.

▶ The following data are a list of AllElectronics prices for commonly sold
items (rounded to the nearest dollar). The numbers have been sorted:

1, 1, 5, 5, 5, 5, 5, 8, 8, 10, 10, 10, 10, 12, 14, 14, 14, 15, 15, 15, 15, 15, 15, 18, 18, 18,
18, 18, 18, 18, 18, 20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21, 25, 25, 25, 25, 25, 28,
28, 30, 30, 30.


▶ If each bucket represents only a single
attribute–value/frequency pair, the buckets are called
singleton buckets.


▶ To further reduce the data, it is common to have each
bucket denote a continuous value range for the given
attribute.
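
Both kinds of bucket can be computed from the price list given earlier. In this sketch the bucket width of 10 (ranges 1-10, 11-20, 21-30) is an assumed choice for illustration, and `range_label` is a hypothetical helper.

```python
from collections import Counter

prices = [
    1, 1, 5, 5, 5, 5, 5, 8, 8, 10, 10, 10, 10, 12, 14, 14, 14,
    15, 15, 15, 15, 15, 15, 18, 18, 18, 18, 18, 18, 18, 18,
    20, 20, 20, 20, 20, 20, 20, 21, 21, 21, 21,
    25, 25, 25, 25, 25, 28, 28, 30, 30, 30,
]

# Singleton buckets: one bucket per distinct price, holding its frequency.
singleton_buckets = Counter(prices)


def range_label(price, width=10):
    """Label of the equal-width bucket containing the price, e.g. '11-20'."""
    low = ((price - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


# Equal-width buckets: each bucket covers a continuous range of prices.
range_buckets = Counter(range_label(p) for p in prices)

print(dict(singleton_buckets))
print(dict(range_buckets))
```

Thirteen singleton buckets collapse into three range buckets, which is the further reduction the slide describes.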


c) Clustering
▶ Clustering techniques partition the objects into groups, or
clusters, so that objects within a cluster are “similar” to one
another and “dissimilar” to objects in other clusters.

▶ The “quality” of a cluster may be represented by its
diameter, the maximum distance between any two
objects in the cluster.
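
The diameter definition above can be sketched directly, assuming Euclidean distance between points (the slides do not name a distance measure) and a hypothetical cluster of 2-D points.

```python
from itertools import combinations
from math import dist


def diameter(points):
    """Maximum Euclidean distance between any two points of the cluster."""
    return max((dist(p, q) for p, q in combinations(points, 2)), default=0.0)


# Hypothetical cluster of 2-D objects.
cluster = [(0, 0), (3, 4), (1, 1)]
print(diameter(cluster))
```

A smaller diameter means the objects in the cluster lie closer together.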


iii) Data compression
▶ In data compression, transformations are applied so as to obtain
a reduced or “compressed” representation of the original data.

▶ If the original data can be reconstructed from the compressed
data without any information loss, the data reduction is called
lossless.

▶ If, instead, we can reconstruct only an approximation of the
original data, then the data reduction is called lossy.
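
The lossless and lossy cases can be contrasted with a small sketch. It uses Python's standard zlib module for the lossless case and simple rounding as a stand-in for a lossy scheme; the readings are hypothetical, and rounding is only an illustrative choice.

```python
import zlib

# Hypothetical sensor readings, repeated to give the compressor some redundancy.
readings = [20.13, 20.17, 20.21, 20.18] * 50
raw = ",".join(f"{r:.2f}" for r in readings).encode()

# Lossless: decompressing returns exactly the original bytes.
compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)
print(len(raw), len(compressed), restored == raw)

# Lossy (illustrative): keeping one decimal place needs less storage,
# but only an approximation of each reading can be recovered.
lossy = [round(r, 1) for r in readings]
print(readings[:4], lossy[:4])
```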

▶ Dimensionality reduction and numerosity reduction techniques
can also be considered forms of data compression.


4. Data Transformation
▶ In data transformation, the data are transformed or
consolidated into forms appropriate for mining.

▶ Normalization methods include:

a) Min-max normalization

b) Z-score normalization

c) Normalization by decimal scaling


Data Transformation Strategies
1. Smoothing, which works to remove noise from the data.
Techniques include binning, regression, and clustering.
2. Attribute construction (or feature construction), where
new attributes are constructed and added from the given set
of attributes to help the mining process.
3. Aggregation, where summary or aggregation operations
are applied to the data. For example, the daily sales data
may be aggregated so as to compute monthly and annual
total amounts. This step is typically used in constructing a
data cube for data analysis at multiple abstraction levels.


4. Normalization, where the attribute data are scaled
so as to fall within a smaller range, such as -1.0 to 1.0, or
0.0 to 1.0.
5. Discretization, where the raw values of a numeric
attribute (e.g., age) are replaced by interval labels (e.g.,
0–10, 11–20, etc.) or conceptual labels (e.g., youth, adult,
senior). The labels, in turn, can be recursively organized
into higher-level concepts, resulting in a concept
hierarchy for the numeric attribute.
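
Discretization of age can be sketched with two small mapping functions. The interval width follows the 0-10, 11-20 labels in the text; the cut-offs for youth, adult, and senior are assumptions chosen for illustration, since the slides do not give them.

```python
def interval_label(age, width=10):
    """Map an age to an interval label such as '0-10' or '11-20'."""
    if age <= width:
        return f"0-{width}"
    low = ((age - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def concept_label(age, adult_from=21, senior_from=61):
    """Map an age to a conceptual label; the cut-off ages are hypothetical."""
    if age < adult_from:
        return "youth"
    if age < senior_from:
        return "adult"
    return "senior"


for age in (7, 13, 35, 70):
    print(age, interval_label(age), concept_label(age))
```

Applying `concept_label` on top of the interval labels is one way to build the two-level concept hierarchy mentioned above.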


6. Concept hierarchy generation for nominal data
where attributes such as street can be generalized to
higher-level concepts, like city or country. Many
hierarchies for nominal attributes are implicit within the
database schema and can be automatically defined at
the schema definition level.



▶ Suppose that the mean and standard deviation of the
values for the attribute income are $54,000 and
$16,000, respectively.

▶ With z-score normalization, a value of $73,600 for
income is transformed to (73,600 - 54,000) / 16,000 = 1.225.



▶ Suppose that the recorded values of A range from
-986 to 917. The maximum absolute value of A is 986.
To normalize by decimal scaling, we therefore divide
each value by 1000 (i.e., j =3) so that -986 normalizes
to -0.986 and 917 normalizes to 0.917.
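
The decimal scaling example above can be sketched as follows, assuming j is the smallest integer for which every scaled value has absolute value below 1. The function name `decimal_scale` is illustrative.

```python
def decimal_scale(values):
    """Return (j, scaled values), where each value is divided by 10**j.

    j is the smallest non-negative integer such that the largest
    absolute scaled value is below 1.
    """
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return j, [v / 10 ** j for v in values]


j, scaled = decimal_scale([-986, 917])
print(j, scaled)
```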


Example
▶ Consider the following data (in increasing order) for the
attribute age: 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25,
25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70.
▶ (a) Use min-max normalization to transform the value
35 for age onto the range [0.0, 1.0].
▶ (b) Use z-score normalization to transform the value
35 for age, where the standard deviation of age is
12.94 years.
▶ (c) Use normalization by decimal scaling to transform
the value 35 for age.


Solution
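
The worked solution is not present in this copy of the slides. One way to compute the three answers is sketched below, using the standard formulas for each method and the standard deviation of 12.94 given in the exercise; the variable names are illustrative.

```python
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]
value = 35

# (a) Min-max normalization onto [0.0, 1.0]:
#     (v - min) / (max - min) * (new_max - new_min) + new_min
min_max = (value - min(ages)) / (max(ages) - min(ages)) * (1.0 - 0.0) + 0.0

# (b) Z-score normalization: (v - mean) / standard deviation,
#     with the standard deviation taken from the exercise statement.
mean_age = sum(ages) / len(ages)
z_score = (value - mean_age) / 12.94

# (c) Decimal scaling: the largest absolute age is 70, so j = 2
#     and every value is divided by 10**2.
decimal_scaled = value / 10 ** 2

print(round(min_max, 3), round(z_score, 3), decimal_scaled)
```

Rounded to three decimal places, this gives about 0.386 for (a), about 0.389 for (b), and 0.35 for (c).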
