DSBDA Insem Solution

The document discusses various techniques for dimensionality reduction in data analysis, highlighting the benefits of improving performance and simplifying visualization. It covers methods such as Principal Component Analysis and Wavelet Analysis, emphasizing their role in enhancing data mining and analysis. Additionally, it addresses the importance of data cleaning and integration to facilitate effective decision-making in various applications.

Uploaded by 943-Rohit Garje

Qu. 1 a) What is dimensionality reduction? What are its benefits and techniques?

Dimensionality reduction transforms a high-dimensional dataset into a lower-dimensional one while retaining as much of the information in the data as possible.

Benefits of dimensionality reduction:
1) Improved model performance: reduced data needs less time and space to process.
2) Reduced complexity: it eliminates noise and irrelevant data.
3) Simplified visualization: data in fewer dimensions is easier to visualize.
4) It helps prevent overfitting.

Techniques for dimensionality reduction:
1) Principal Component Analysis
2) Wavelet analysis

These techniques are beneficial for data mining and analysis.

Qu. 1 b) Data wrangling

Data wrangling is the process of cleaning raw data and transforming it into an accessible format, so that analysts can make prompt decisions. It involves:
1) Merging: integrating data from multiple sources into one central location.
2) Cleaning: handling missing elements and removing noise and flawed information.
3) Mapping: converting a variety of data types and formats so that they are compatible and usable.
Some of these steps can be automated.
Qu. 1 c) Differentiate: data science, machine learning and AI

Data science:
1) It involves operations such as data cleaning, data integration, data transformation, data reduction and data normalization (the data wrangling stages).
2) It works with structured and unstructured data to reach a conclusion.

Machine learning:
1) It is a subset of AI that uses statistical models built from data to make predictions and decisions.
2) Applications include weather prediction, face recognition, fraud detection, recommendation and healthcare.

AI:
1) It makes machines think and act like humans.
2) Examples include chatbots and voice assistants.
Qu. 2 Regression: linear regression and logistic regression

Regression is a technique for predicting a dependent (target) variable from one or more independent (input) variables, based on the relationship between them.

Linear regression:
1) It is used to solve regression problems, where the output is a continuous number.
2) The relationship between the dependent and independent variables is a straight line.
3) Examples: predicting temperature, price, sales or demand from historical data.

Logistic regression:
1) It is used to solve classification problems, where the output is a category.
2) It uses the sigmoid function, which gives an S-shaped curve.
3) Examples: predicting whether a patient is diabetic or not, or whether a person is male or female.
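The contrast between a straight-line prediction and a sigmoid-based class prediction can be illustrated with a minimal sketch. The function names and the slope, intercept and threshold values are hypothetical and chosen only for illustration; they are not taken from the notes.

```python
import math

def linear_predict(x, slope, intercept):
    """Linear regression prediction: a continuous value on a straight line."""
    return slope * x + intercept

def sigmoid(z):
    """Sigmoid function: maps a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_predict(x, slope, intercept, threshold=0.5):
    """Logistic regression prediction: a class label, 0 or 1."""
    probability = sigmoid(slope * x + intercept)
    return 1 if probability >= threshold else 0

# Hypothetical coefficients, for illustration only.
print(linear_predict(3.0, 2.0, 1.0))     # continuous output
print(sigmoid(0.0))                      # midpoint of the S-shaped curve
print(logistic_predict(3.0, 2.0, -4.0))  # categorical output
```

The linear model returns any real number, while the logistic model squeezes the same linear score through the sigmoid and then thresholds it into a class.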
Qu. 2 What is data discretization? Explain its types and techniques.

Data discretization is the method of converting a huge number of continuous attribute values into a smaller number of intervals, which makes the data easier to evaluate. For example, continuous values of the attribute age can be replaced by interval labels such as child, young and so on.

Types of discretization:
1) Supervised discretization: the class data is used.
2) Unsupervised discretization: the class data is not used. It depends on the strategy followed:
   - Top-down (splitting) strategy: first split the whole range of values, then keep splitting the resulting intervals.
   - Bottom-up (merging) strategy: first treat the values as separate intervals, then merge neighbouring intervals.

Techniques of discretization:
1) Histogram analysis: a plot that shows the frequency distribution of the values in disjoint (non-overlapping) ranges.
2) Binning: values are grouped into bins; it is also used for smoothing.
3) Cluster analysis: a clustering algorithm divides the values into clusters.
4) Decision tree analysis
5) Correlation analysis
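Binning, one of the discretization techniques named in this answer, can be sketched as equal-width binning. This is only one common variant, and the function name, the ages and the interval labels below are hypothetical.

```python
def equal_width_bins(values, k):
    """Assign each value to one of k equal-width intervals, numbered 0 .. k-1."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        # All values are identical, so everything falls into the first bin.
        return [0 for _ in values]
    # The maximum value would compute to bin k, so clamp it into the last bin.
    return [min(int((v - lo) / width), k - 1) for v in values]

# Hypothetical ages and interval labels, for illustration only.
ages = [2, 8, 15, 23, 37, 41, 58, 66, 90]
labels = ["young", "middle-aged", "senior"]
bins = equal_width_bins(ages, 3)
print([labels[b] for b in bins])
```

Equal-width binning is unsupervised: it looks only at the range of the attribute, never at a class label.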
Qu. 3 a) Explain the contingency table with an example.

A contingency table (also called a cross tabulation) displays the frequencies for combinations of two categorical variables. The rows represent the categories of one variable, the columns represent the categories of the other, and each cell holds the frequency of that combination. The table also shows the row totals and column totals. Analysts use it to summarize the relationship between the variables.

Example: the table below displays computer store sales by gender and the type of computer purchased.

Gender    PC    Mac   Total
Male      66    40    106
Female    30    87    117
Total     96    127   223
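Building a contingency table from raw records amounts to counting each pair of categories and then summing along rows and columns. A minimal sketch follows; the function name and the six purchase records are hypothetical and are not the data from the notes.

```python
from collections import Counter

def contingency_table(records):
    """Count how often each (row_value, column_value) pair occurs."""
    counts = Counter(records)
    rows = sorted({r for r, _ in records})
    cols = sorted({c for _, c in records})
    table = {r: {c: counts[(r, c)] for c in cols} for r in rows}
    row_totals = {r: sum(table[r].values()) for r in rows}
    col_totals = {c: sum(table[r][c] for r in rows) for c in cols}
    return table, row_totals, col_totals

# Hypothetical (gender, computer type) purchase records.
records = [("male", "PC"), ("male", "Mac"), ("female", "Mac"),
           ("female", "Mac"), ("male", "PC"), ("female", "PC")]
table, row_totals, col_totals = contingency_table(records)
print(table)
print(row_totals, col_totals)
```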
Qu. 3 b) Explain Bayes' theorem with an example.

Bayes' theorem is a method for determining the conditional probability of an event, based on information about events that have already occurred. It is heavily used to make accurate predictions.

P(H|X) = P(X|H) P(H) / P(X)

Here X is the given data (the evidence) and H is a hypothesis.

Example: X is a customer whose age is 26 and whose income is $2000. H is the hypothesis that the customer will buy a book.

1) P(H|X) is the posterior probability: the probability that the customer will buy the book, given that we know the age is 26 and the income is $2000.
2) P(H) is the prior probability: the probability that any customer will buy the book, regardless of age and income.
3) P(X|H) is the likelihood: the probability that a customer is 26 years old and earns $2000, given that the customer buys the book.
4) P(X) is the evidence: the probability that a customer from the dataset is 26 years old and earns $2000.
Pearson correlation for the following dataset of Hours (X) and Marks (Y)

The working table has the columns: Sample, Hours (X), Marks (Y), X^2, Y^2, XY.

r = [ n(ΣXY) - (ΣX)(ΣY) ] / sqrt( [ n(ΣX^2) - (ΣX)^2 ] [ n(ΣY^2) - (ΣY)^2 ] )

where n is the number of samples.
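The Pearson formula built from the sums n, ΣX, ΣY, ΣXY, ΣX^2 and ΣY^2 can be sketched as follows. The hours and marks in the example are hypothetical values chosen for illustration, not the dataset from the question.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from the raw sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Hypothetical data, for illustration only.
hours = [2, 4, 6, 8]
marks = [50, 55, 70, 65]
print(round(pearson_r(hours, marks), 4))
```

A value close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little or no linear relationship.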
Qu. 4 What is population and sample?

Population:
1) The population is the entire group about which we want to draw conclusions.
2) Example: all the students at a school.

Sample:
1) A sample is a specific subset of the population from which data is collected.
2) The size of the sample is always less than the size of the population.
3) The sample should be unbiased and should represent the whole population.
4) Example: the group of students at the school who were actually studied.
Qu. Explain one-tailed and two-tailed tests.

One-tailed test:
1) The critical region lies on only one side of the distribution.
2) It is a directional test: it tests for the possibility of a relationship in one direction only (left-tailed or right-tailed).

Two-tailed test:
1) The critical regions lie on both sides of the distribution.
2) It tests for the possibility of a relationship in both directions.

Example: a manufacturer claims that the mean lifetime of an energy saving light bulb is 60 days.

Two-tailed test:
H0: the mean lifetime of the energy saving bulb is 60 days.
H1: the mean lifetime of the energy saving bulb is not 60 days.

Right-tailed test:
H0: the mean lifetime of the energy saving bulb is 60 days.
H1: the mean lifetime of the energy saving bulb is greater than 60 days.

Left-tailed test:
H0: the mean lifetime of the energy saving bulb is 60 days.
H1: the mean lifetime of the energy saving bulb is less than 60 days.
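The difference between left-tailed, right-tailed and two-tailed tests shows up in how the p-value is read off the distribution. The sketch below assumes a z-test with a known standard deviation, which the notes do not state. The function name and the sample figures are hypothetical.

```python
from statistics import NormalDist

def z_test_p_values(sample_mean, claimed_mean, std_dev, n):
    """p-values for the three alternative hypotheses of a z-test."""
    std_error = std_dev / n ** 0.5
    z = (sample_mean - claimed_mean) / std_error
    cdf = NormalDist().cdf(z)
    return {
        "z": z,
        "left_tailed": cdf,                   # H1: mean < claimed mean
        "right_tailed": 1 - cdf,              # H1: mean > claimed mean
        "two_tailed": 2 * min(cdf, 1 - cdf),  # H1: mean != claimed mean
    }

# Hypothetical sample: 25 bulbs, mean lifetime 58 days, standard deviation 5 days.
result = z_test_p_values(sample_mean=58.0, claimed_mean=60.0, std_dev=5.0, n=25)
print(result)
```

For the same data the two-tailed p-value is twice the smaller one-tailed p-value, which is why a one-tailed test rejects H0 more readily in its chosen direction.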
Qu. Describe the chi-square test.

The chi-square test is a statistical hypothesis test. It helps to find out whether the difference between the observed values and the expected values is due to chance, or whether there is a relationship between the categorical variables. It is used to test the independence of two categorical variables.

chi-square = Σ (O - E)^2 / E

where O is the observed value and E is the expected value.

The computed chi-square value is compared with the critical value to decide whether to reject the hypothesis.

Example: suppose a movie theatre wants to know whether the genre of the movie people came to watch has an impact on snack sales. The hypothesis to test is that the movie genre and the snacks bought are independent.
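The chi-square statistic for a test of independence compares each observed count with the count expected if the two variables were unrelated. A minimal sketch follows; the function name and the 2x2 table of counts are hypothetical.

```python
def chi_square_statistic(observed):
    """Chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (o - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, dof

# Hypothetical counts: rows are movie genres, columns are snack bought or not.
observed = [[30, 20],
            [20, 30]]
chi2, dof = chi_square_statistic(observed)
print(chi2, dof)
```

The statistic is then compared with the critical value for the given degrees of freedom; at the 0.05 level with 1 degree of freedom the critical value is 3.841.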
Using Bayes' theorem, we have to find the probability that the patient has the disease given that the test is positive.

Let us denote:
P(D) = probability that the patient has the disease
P(~D) = probability that the patient does not have the disease
P(Pos|D) = probability that the test is positive given that the patient has the disease
P(Pos|~D) = probability that the test is positive given that the patient does not have the disease

P(Pos) = P(Pos|D) * P(D) + P(Pos|~D) * P(~D)

P(D|Pos) = P(Pos|D) * P(D) / P(Pos)
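The two steps of this calculation, first P(Pos) by total probability and then P(D|Pos) by Bayes' theorem, can be sketched as below. The function name and the three input probabilities are hypothetical and are not the values given in the question.

```python
def posterior_disease_given_positive(p_disease, p_pos_given_disease,
                                     p_pos_given_no_disease):
    """P(D|Pos) from P(D), P(Pos|D) and P(Pos|~D)."""
    p_no_disease = 1 - p_disease
    # Total probability of a positive test.
    p_pos = (p_pos_given_disease * p_disease
             + p_pos_given_no_disease * p_no_disease)
    # Bayes' theorem.
    return p_pos_given_disease * p_disease / p_pos

# Hypothetical inputs: P(D) = 0.01, P(Pos|D) = 0.95, P(Pos|~D) = 0.05.
posterior = posterior_disease_given_positive(0.01, 0.95, 0.05)
print(round(posterior, 4))
```

With a rare disease, the posterior stays well below the test's sensitivity because most positive results come from the much larger healthy group.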


Qu. Skewness and kurtosis

Skewness:
Skewness is a measure of the asymmetry of a probability distribution. In a symmetrical (normal) distribution the mean, median and mode are equal. When the data is not symmetrical, the distribution is skewed.

Types of skewness:
1) Positively skewed distribution: the data is concentrated on the left side and the tail is on the right side of the graph. Here mean > median > mode.
2) Negatively skewed distribution: the data is concentrated on the right side and the tail is on the left side of the graph. Here mean < median < mode.

Fig.: Positively skewed distribution (y-axis: number of students; mode, median and mean marked from left to right)

Kurtosis:
Kurtosis describes how heavy the tails of a distribution are and how sharp its peak is compared with the normal distribution. It is like pinching or squishing the distribution: the data points move towards the peak and the tails, or the curve becomes flatter.

Excess kurtosis = kurtosis - 3, because the kurtosis of the normal distribution is 3.

Types of kurtosis:
1) Mesokurtic: kurtosis = 3 (excess kurtosis is 0). This is the normal distribution.
2) Leptokurtic: kurtosis is greater than 3 (excess kurtosis is positive). The distribution is heavy-tailed with a sharp peak.
3) Platykurtic: kurtosis is less than 3 (excess kurtosis is negative). The distribution has a flatter peak.