Dm-Unit I

Uploaded by

dharshutae
Data Mining - Unit I (MBCA)

Semester: V

What is Data Mining? Introduction

The process of extracting information to identify patterns, trends and useful data that would allow a business to take data-driven decisions from huge sets of data is called data mining.

In other words, data mining is the process of investigating hidden patterns of information from various perspectives for categorization into useful data, which is collected and assembled in particular areas such as data warehouses, for efficient analysis with data mining algorithms.

> Data mining is one of the most useful techniques.
> Data mining is also known as Knowledge Discovery in Databases (KDD).
> The knowledge discovery process includes data selection, data cleaning, data integration, data transformation, data mining, pattern evaluation and knowledge presentation.
Basic Data Mining Tasks:

Data mining tasks can generally be classified into two types, based on what a specific task tries to achieve:
> Descriptive tasks
> Predictive tasks

* Descriptive tasks characterize the general properties of the data.
* Predictive data mining tasks perform inference on the data in order to make predictions.

Different data mining tasks:
* There are a number of data mining tasks, such as classification, prediction, time-series analysis, association, clustering and summarization.
* All these tasks are either predictive data mining tasks or descriptive data mining tasks.
* A data mining system can execute one or more of these tasks.
[Figure: data mining tasks]
Predictive: classification, prediction, time-series analysis
Descriptive: association, clustering, summarization
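(1) Predictive Data Mining Tasks:

Predictive data mining tasks deal with developing models that can predict future behaviour or outcomes based on historical data.

Classification: classification determines the class of an object based on its attributes.
* Classification can be used in direct marketing, for example to target a set of customers who are likely to buy a new product.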
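As a rough illustration of classification, the sketch below gives a new customer the class of the most similar known customer (a 1-nearest-neighbour rule). The rule itself, the name `classify_nearest` and the customer figures are assumptions made for this example; the notes do not prescribe a particular classifier.

```python
from math import dist


def classify_nearest(sample, labelled_points):
    """Return the label of the labelled point closest to `sample`."""
    _, label = min(labelled_points, key=lambda pair: dist(pair[0], sample))
    return label


# Hypothetical customers: (age, purchases per year) -> bought the last new product?
customers = [
    ((25, 2), "unlikely"),
    ((30, 3), "unlikely"),
    ((45, 20), "likely"),
    ((50, 25), "likely"),
]

# A new customer whose attributes sit close to the "likely" group.
print(classify_nearest((48, 22), customers))
```

The attributes decide the class here only through distance, so attributes on very different scales would normally be rescaled first.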
Prediction: prediction involves developing a model based on the available data; the model is then used to predict future values of a new data set of interest.
Time series: a time series is a sequence of events in which the next event is determined by one or more of the preceding events.
* Time series analysis includes methods to analyse time-series data in order to extract useful patterns, trends, rules and statistics.
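One simple way to expose a trend in time-series data is a moving average. The sketch below is only an illustration under that assumption; the sales figures and the name `moving_average` are made up for the example.

```python
def moving_average(values, window):
    """Average each run of `window` consecutive values to smooth out noise."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical monthly sales figures.
monthly_sales = [10, 12, 11, 15, 18, 17, 21]
trend = moving_average(monthly_sales, 3)

# The smoothed series rises even though the raw series dips in places.
is_rising = all(later >= earlier for earlier, later in zip(trend, trend[1:]))
print(trend, is_rising)
```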
(2) Descriptive Data Mining Tasks:

Descriptive data mining tasks usually find data describing patterns and come up with new, significant information from the available data set.
* A retailer trying to identify products that are purchased together can be considered a descriptive data mining task.
Association: association discovers the association or connection among a set of items.
* Association identifies the relationships between objects.
* Association analysis is used in management, advertising, catalog design, direct marketing, etc.
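The strength of an association between items is commonly measured with support and confidence. These two measures are not named in the notes, so the sketch below is an assumption about how "products purchased together" might be quantified; the baskets are hypothetical.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= set(t) for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

print(support(baskets, {"bread", "milk"}))       # share of baskets with both items
print(confidence(baskets, {"bread"}, {"milk"}))  # how often bread buyers also buy milk
```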
Clustering: clustering is used to identify data objects that are similar to one another.
* Similarity can be decided based on a number of factors such as purchase behaviour, responsiveness to certain actions, location and so on.
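A minimal sketch of clustering on a single factor, assuming the k-means procedure (which the notes do not name) and hypothetical spending figures. The starting centres are arbitrary guesses.

```python
def k_means_1d(values, centres, rounds=10):
    """Tiny k-means on one numeric feature; `centres` are the starting guesses."""
    for _ in range(rounds):
        clusters = [[] for _ in centres]
        for value in values:
            # Put each value in the cluster whose centre is closest.
            nearest = min(range(len(centres)), key=lambda i: abs(value - centres[i]))
            clusters[nearest].append(value)
        # Move each centre to the mean of its cluster (unchanged if the cluster is empty).
        centres = [
            sum(cluster) / len(cluster) if cluster else centres[i]
            for i, cluster in enumerate(clusters)
        ]
    return centres, clusters


# Hypothetical monthly spend of six customers.
spend = [5, 7, 6, 50, 55, 52]
centres, clusters = k_means_1d(spend, centres=[0, 60])
print(clusters)
```

Customers with similar spend end up in the same group without any class labels being supplied, which is what separates clustering from classification.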
Summarization: summarization is the generalization of data.
* A set of relevant data is summarized, which results in a smaller set that gives aggregated information about the data.
* Data can be summarized at different abstraction levels and from different angles.
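The idea of summarizing the same data at different abstraction levels can be sketched as below. The record layout (`store`, `region`, `amount`) and the name `summarize` are assumptions made for this example.

```python
from collections import defaultdict


def summarize(records, level):
    """Total the `amount` field for each distinct value of the `level` field."""
    totals = defaultdict(float)
    for record in records:
        totals[record[level]] += record["amount"]
    return dict(totals)


# Hypothetical sales records.
sales = [
    {"store": "S1", "region": "North", "amount": 100.0},
    {"store": "S2", "region": "North", "amount": 50.0},
    {"store": "S3", "region": "South", "amount": 70.0},
    {"store": "S1", "region": "North", "amount": 30.0},
]

print(summarize(sales, "store"))   # lower abstraction level
print(summarize(sales, "region"))  # higher abstraction level
```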
Data Mining versus Knowledge Discovery in Databases (KDD)

* Data mining is one part of KDD, which is a more complex process.
* Data mining and KDD are crucial elements in data analysis and knowledge extraction, yet they have distinct functions and objectives.
* KDD: KDD stands for Knowledge Discovery in Databases.
* The KDD method is a complex and iterative approach to knowledge extraction from big data.
* It includes utilizing a variety of algorithms and statistical methods to sort through large amounts of data and identify relevant and valuable data.
Process of KDD:
* Data cleaning
* Data integration
* Data selection
* Data transformation
* Data mining
* Pattern evaluation
* Knowledge presentation
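The steps listed above can be sketched as a chain of small functions. This is only an illustration: the record fields, the spend threshold of 100 and every function name are assumptions, and a real KDD process is iterative rather than a single pass.

```python
from collections import Counter


def clean(rows):
    """Data cleaning: drop rows that have a missing (None) value."""
    return [row for row in rows if None not in row.values()]


def integrate(*sources):
    """Data integration: combine rows from several sources into one list."""
    return [row for source in sources for row in source]


def select(rows, fields):
    """Data selection: keep only the fields relevant to the analysis."""
    return [{field: row[field] for field in fields} for row in rows]


def transform(rows):
    """Data transformation: turn the numeric amount into a spend category."""
    return [{**row, "spend": "high" if row["amount"] >= 100 else "low"} for row in rows]


def mine(rows):
    """Data mining: count how often each (segment, spend) combination occurs."""
    return Counter((row["segment"], row["spend"]) for row in rows)


def evaluate(patterns, min_count):
    """Pattern evaluation: keep only patterns seen at least `min_count` times."""
    return {pattern: n for pattern, n in patterns.items() if n >= min_count}


# Hypothetical records from two branches of a shop.
branch_a = [
    {"id": 1, "segment": "student", "amount": 120},
    {"id": 2, "segment": "student", "amount": None},
]
branch_b = [
    {"id": 3, "segment": "student", "amount": 150},
    {"id": 4, "segment": "retired", "amount": 40},
]

rows = integrate(clean(branch_a), clean(branch_b))
rows = transform(select(rows, ["segment", "amount"]))
interesting = evaluate(mine(rows), min_count=2)

# Knowledge presentation: report the surviving patterns in readable form.
for (segment, spend), count in interesting.items():
    print(f"{count} {segment} customers are {spend} spenders")
```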
Applications of KDD:
* Business and marketing: user analysis, market prediction, segmenting clients and focused advertising are all examples of business and marketing database applications.
* Manufacturing: predictive maintenance and quality control.
* Finance: fraud, stock market research, credit risk and other factors can be analysed using the KDD method in the finance sector.
* Healthcare: drug research, patient monitoring and disease diagnosis from large sets of patient data.
* Scientific research: identifying patterns in massive scientific databases such as genetics, astronomy and climate.
Data mining: identifying data patterns and extracting details about big data, using machine learning and database systems, is called "data mining".

Applications:
* Data mining can be used to evaluate the probability of an occurrence, such as a client's interest in a product.
* Recommendation: data mining can base product suggestions to customers on their history or browsing patterns.
* Fraud detection: data mining can spot patterns of shady behaviour, such as irregular usage of a credit card or inflating a policy.
* Medical diagnosis: data mining can be utilized to diagnose medical disorders by seeing patterns in medical records.
Data Mining vs KDD

Key feature: Basic definition
  Data mining: identifying patterns and extracting details about big data sets using intelligent methods.
  KDD: an iterative approach to knowledge extraction from big data.

Key feature: Goal
  Data mining: to extract patterns from datasets.
  KDD: to discover knowledge from datasets.

Key feature: Scope
  Data mining: in the KDD method, one step is called data mining.
  KDD: a broad method that includes data mining.

Key feature: Techniques used
  Data mining: classification, decision trees, dimensionality reduction, neural networks, regression.
  KDD: data cleaning, data integration, data selection, data transformation, data mining, pattern evaluation, knowledge presentation.

Key feature: Example
  Data mining: clustering groups of data elements based on how similar they are.
  KDD: data analysis to find patterns and links.
Data Mining Issues:
* Data mining is not an easy task, as the algorithms used can get very complex and the data is not always available in one place.
* It needs to be integrated from various heterogeneous data sources.
* Some of the major issues are:
> Mining methodology and user interaction
> Performance issues
> Diverse data types

Mining Methodology and User Interaction:
> Mining different kinds of knowledge in databases
* Different users may be interested in different kinds of knowledge. Therefore it is necessary for data mining to cover a broad range of knowledge discovery tasks.
> Interactive mining of knowledge at multiple levels of abstraction
* The data mining process needs to be interactive because it allows users to focus the search for patterns, providing and refining data mining requests based on the returned results.
> Incorporation of background knowledge
* Background knowledge can be used to guide the discovery process and to express the discovered patterns.
* Background knowledge may be used to express the discovered patterns not only in concise terms but at multiple levels of abstraction.
> Presentation and visualization of data mining results
* Once the patterns are discovered, they need to be expressed in high-level languages and visual representations. These representations should be easily understandable.
> Handling noisy or incomplete data
* Data cleaning methods are required to handle noise and incomplete objects while mining the data regularities.
* If data cleaning methods are not there, the accuracy of the discovered patterns will be poor.
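One common way to handle incomplete data is to replace missing values with the mean of the known ones. The notes do not name a specific cleaning method, so mean imputation and the name `fill_missing_with_mean` are assumptions for this sketch.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the known values.

    Assumes at least one value is known; an all-missing column would need
    a different strategy.
    """
    known = [value for value in values if value is not None]
    mean = sum(known) / len(known)
    return [mean if value is None else value for value in values]


# Hypothetical customer ages with two missing entries.
ages = [20, None, 40, 30, None]
print(fill_missing_with_mean(ages))
```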
> Pattern evaluation
* The patterns discovered should be interesting; a pattern that only represents common knowledge or lacks novelty is not.

Performance Issues:
> Efficiency and scalability of data mining algorithms
* In order to effectively extract information from the huge amount of data in databases, data mining algorithms must be efficient and scalable.
> Parallel, distributed and incremental mining algorithms
* Factors such as the huge size of databases, the wide distribution of data and the complexity of data mining methods motivate the development of parallel and distributed data mining algorithms.
* These algorithms divide the data into partitions, which are processed in parallel.
* Then the results from the partitions are merged.
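The partition, process and merge idea can be sketched with item counting as a stand-in for the mining step. The use of a thread pool and the names `partition` and `parallel_count` are assumptions; a real system would more likely distribute partitions across processes or machines.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Split `items` into at most `parts` consecutive chunks of near-equal size."""
    if not items:
        return []
    size = -(-len(items) // parts)  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_count(items, parts=2):
    """Count items per partition concurrently, then merge the partial counts."""
    merged = Counter()
    with ThreadPoolExecutor(max_workers=parts) as pool:
        for partial in pool.map(Counter, partition(items, parts)):
            merged.update(partial)
    return merged


# Hypothetical purchase log.
purchases = ["milk", "bread", "milk", "tea", "bread", "milk"]
print(parallel_count(purchases, parts=3))
```

Counting merges cleanly because the partial results simply add up; mining tasks whose results do not combine this way need a more careful merge step.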
Diverse Data Types:
> Handling of relational and complex types of data
* The databases may contain complex data objects, multimedia data objects, spatial data, temporal data, etc.
* It is not possible for one system to mine all these kinds of data.
> Mining information from heterogeneous databases and global information systems
* The data is available at different data sources on a LAN or WAN.
* These data sources may be structured, semi-structured or unstructured.
* Therefore mining the knowledge from them adds challenges to data mining.
Data Mining Metrics

Data mining is a form of artificial intelligence that uses perception models, analytical models and multiple algorithms to simulate the techniques of the human brain.
* Data mining supports machines in taking human decisions and making human choices.
* The user of the data mining tools will have to direct the machine with rules, preferences and even experiences to obtain decision support.

Data mining metrics are as follows:
> Usefulness
* Usefulness involves several metrics that tell us whether the model provides useful data.
* For instance, a data mining model that correlates store location with sales can be both accurate and reliable, but it cannot be useful, because one cannot generalize that result by opening more stores at the same location.
> Return on Investment (ROI)
* Data mining tools will find interesting patterns buried inside the data and develop predictive models.
* These models will have several measures denoting how well they fit the records.
* It is not clear how to make a decision based on some of the measures reported as an element of data mining analyses.
> Access financial information during data mining
* The simplest way to frame decisions in financial terms is to augment the raw information that is generally mined so that it also contains financial data.
* Some organizations are investing in and developing data warehouses and data marts.
* The design of a warehouse or mart contains considerations about the types of analyses and data needed for expected queries.
* A warehouse designed this way allows access to financial information along with access to more typical data on product attributes.
> Converting data mining metrics into financial terms
* A general data mining metric is the measure of "Lift".
* Lift measures what is achieved by using the model relative to a base rate in which the model is not used.
* High values mean much is achieved.
* It can seem, then, that one can simply make a decision based on Lift.
> Accuracy
* Accuracy is a measure of how well the model correlates results with the attributes in the data that has been supplied.
* There are several measures of accuracy, but all measures of accuracy depend on the information that is used.
* In reality, values can be missing or approximate, or the data can have been changed by several processes.
* One can decide to accept a specific amount of error in the data.
* For example, a model that predicts sales for a specific store based on past sales can be powerfully correlated and very accurate, even if that store consistently used the wrong accounting techniques.
* Thus, measurements of accuracy should be balanced by assessments of reliability.
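The Lift and Accuracy metrics described above can be computed roughly as follows. The campaign numbers, the labels and both function signatures are assumptions made for this sketch; accuracy is taken here in its simplest form, the fraction of correct predictions.

```python
def lift(model_hits, model_total, base_hits, base_total):
    """Response rate achieved with the model divided by the base rate without it."""
    return (model_hits / model_total) / (base_hits / base_total)


def accuracy(predicted, actual):
    """Fraction of predictions that match the recorded outcome."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical campaign: 40 of the 200 customers picked by the model responded,
# against 50 responders among all 1000 customers.
campaign_lift = lift(40, 200, 50, 1000)

# Hypothetical predictions compared with what customers actually did.
predicted = ["buy", "skip", "buy", "buy"]
actual = ["buy", "skip", "skip", "buy"]
score = accuracy(predicted, actual)

print(campaign_lift, score)
```

A lift above 1 means the model selects responders better than the base rate, but, as the notes point out, neither number says anything about whether the underlying records are reliable.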

Data mining is the process of finding useful new correlations, patterns and trends by sifting through a large amount of data stored in repositories, using pattern recognition technologies including statistical and mathematical techniques.
* Data mining systems are designed to promote the identification and classification of individuals into segments.

These are various social implications of data mining:
> Privacy
* In recent years privacy concerns have taken on a more important role in American society, as merchants, insurance companies and government agencies amass warehouses containing personal records.
* The concerns that people have over the collection of this data will generally extend to any analytic capabilities applied to the data.
* Users of data mining should start thinking about how their use of this technology will be affected by privacy issues.
> Profiling
* Data mining and profiling is a developing field that attempts to organize, understand, analyse, reason about and use the explosion of data in this information age.
* The process involves using algorithms and experience to extract patterns or anomalies that are very complex, difficult or time-consuming to recognize.
* The founder of Microsoft's exploration team used complex data mining algorithms to solve an issue that had haunted astronomers for some years: the problem of reviewing, describing and categorizing the sky objects that had been recorded.
* The algorithms were able to extract the features that characterized sky objects as stars or galaxies.
* This developing field of data mining and profiling has several frontiers where it can be used.
* Trends obtained through data mining for ethical goals can be misused.
* Unethical businesses or people can use the data obtained through data mining to take advantage of vulnerable people or discriminate against a group of people.
* Furthermore, the data mining technique is not 100% accurate; thus mistakes do appear, which can have serious results.

V) Data Mining from a Database Perspective

* Mining information and knowledge from large databases has been recognized by many researchers as a key research topic in database systems and machine learning, and by many industrial companies as an important area with an opportunity for major revenues.
* Researchers in many different fields have shown great interest in data mining.
* Several emerging applications in information-providing services, such as data warehousing and online services over the Internet, also call for various data mining techniques to better understand user behaviour, to improve the service provided and to increase business opportunities.
* In response to such a demand, this article provides a survey, from a database researcher's point of view, of the data mining techniques developed recently.
* A classification of the available data mining techniques is provided, and a comparative study of such techniques is presented.

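Data mining from a database perspective:
> Scalability
> Real-world data
> Updates
> Ease of use

> Scalability: to efficiently extract information from the huge amount of data in databases, knowledge discovery algorithms must be efficient and scalable to large databases.
* The running time of a data mining algorithm must be predictable and acceptable in large databases.
* Algorithms with high polynomial (or worse) complexity will not be of practical use.
> Real-world data: noisy data and missing attribute values are problems.
> Updates: the data keeps changing, whereas many algorithms assume static data sets.
> Ease of use: although some algorithms may work well, they may not be easy to use or understand.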