Ai CH-5 Assignment

Uploaded by

hardik.nankar
Date: 08/05/24
Unit-5: Data Analysis and Processing

1. Explain types of data in detail.
Ans: Data can be classified into different types based on its nature.

Numerical Data: It represents quantitative values and can be further categorized into two subtypes:

1. Continuous data: It can take any value within a specific range and can have decimal or fractional values. It is typically measured on a continuous scale.

2. Discrete data: It consists of whole numbers or distinct values that cannot be subdivided further. It represents countable or categorical data.

Categorical Data: It includes distinct categories or groups without any inherent numerical meaning.

1. Nominal data: Categories with no inherent order or hierarchy, e.g. gender, eye color.

2. Ordinal data: Categories with a specific order or ranking. The categories have a relative position or rank but may not have a fixed numerical difference between them.
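
The difference between nominal and ordinal data can be sketched in pandas. This is only an illustration: the column names and survey values below are made up, and pandas is an assumed choice of library.

```python
import pandas as pd

# Hypothetical survey responses, invented purely for illustration.
df = pd.DataFrame({
    "eye_color": ["brown", "blue", "green", "brown"],   # nominal
    "satisfaction": ["low", "high", "medium", "low"],   # ordinal
})

# Nominal: categories only, with no ordering between them.
df["eye_color"] = pd.Categorical(df["eye_color"])

# Ordinal: categories with an explicit ranking low < medium < high.
df["satisfaction"] = pd.Categorical(
    df["satisfaction"], categories=["low", "medium", "high"], ordered=True
)

# Ordered categories support comparisons and max(); nominal ones do not.
above_low = (df["satisfaction"] > "low").tolist()
highest = df["satisfaction"].max()
print(above_low)   # [False, True, True, False]
print(highest)     # high
```

Note that the ranking says "medium" sits between "low" and "high", but it does not say by how much, which matches the point that ordinal categories need not have a fixed numerical difference.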

Textual Data: Represents unstructured or semi-structured textual information. It includes documents, articles, social media posts, or any other form of textual content.

Spatial Data: Represents information about geographic location or features on the earth's surface. It includes coordinates, maps, or any data associated with a specific location, e.g. a GPS system.

Binary Data: Consists of only two possible values, typically represented as 0 and 1. It is often used in computer science and digital systems, representing on/off states, true/false conditions, or the presence or absence of certain characteristics.

2. Explain Exploratory Data Analysis.
Ans: It refers to the process of inspecting and transforming data to discover useful information and draw conclusions. It involves several steps, including data collection, data cleaning, and data modeling.

1. Data collection: Gathering relevant data for analysis from various sources such as databases and surveys.

2. Data cleaning: Preparing the data for analysis by addressing missing values, handling outliers, and standardizing the format.

3. Data transformation: Converting data into a suitable format for analysis, which may involve reshaping the data or creating new variables.

4. Data modeling: Applying statistical and machine learning techniques to build models that can make predictions.

Key aspects of data analysis:

1. Descriptive Statistics: It provides a summary of the main characteristics of a data set. This includes measures such as mean, mode, variance, and percentiles.

2. Inferential Statistics: Allows us to make inferences and draw conclusions about a larger population based on a sample.

3. Data Mining: The process of discovering patterns and relationships in large datasets. It involves applying statistical algorithms and machine learning techniques.

4. Predictive Analytics: Uses historical data to make predictions or forecasts about future events or outcomes.

5. Text and Sentiment Analysis: Involves extracting information and insights from textual data. It includes techniques such as text mining and NLP.

6. Data Wrangling: Also known as data munging or data pre-processing, it involves cleaning and reshaping the data to make it suitable for analysis.

7. Data Integration: Combining data from multiple sources into a single unified dataset. It may require merging or joining data to create a comprehensive view for analysis.

8. Data Interpretation: The process of making sense of the analyzed data and drawing meaningful insights and conclusions. It involves critically analyzing the results,
considering the context, and making data-driven decisions.

3. Explain data pre-processing in detail.
Ans: It is a crucial step in the data analysis pipeline. It involves preparing raw data to ensure it is in a suitable format for analysis. It aims to address common issues such as missing values, outliers, and noise, among others.

Key steps for data pre-processing:

1. Data cleaning:
a) Handling missing values: These occur due to various reasons such as data collection errors or incomplete records. Common strategies for handling missing values include:
   1. Deleting rows or columns with missing values
   2. Imputing missing values
b) Dealing with outliers: Outliers are data points that significantly deviate from the normal data distribution. They can be addressed by:
   1. Removing outliers
   2. Transforming outliers

2. Data integration:
Involves combining data from multiple sources to create a unified dataset. This step is essential when dealing with data collected from different databases, files, or formats.

3. Data transformation:
Normalization: It ensures that numerical features are on a similar scale, preventing some features from dominating the analysis due to their larger values.
   1. Z-score normalization
   2. Feature scaling
   3. Min-max scaling

Standardization:
    from sklearn.preprocessing import StandardScaler

    # df_filled is assumed to be the DataFrame left after missing values were handled
    scaler = StandardScaler()
    # Rescale Age and Income to zero mean and unit variance
    df_filled[['Age', 'Income']] = scaler.fit_transform(df_filled[['Age', 'Income']])
    print("Data frame after standardization")
    print(df_filled)

Normalization:
    from sklearn.preprocessing import MinMaxScaler

    min_max_scaler = MinMaxScaler()
    # Rescale Age and Income to the range [0, 1]
    df_filled[['Age', 'Income']] = min_max_scaler.fit_transform(df_filled[['Age', 'Income']])
    print("Data frame after min-max scaling")
    print(df_filled)

One-hot encoding: Representing each category as a binary feature column.

Label encoding: Assigning a numerical label to each category.
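
The cleaning, integration, and encoding steps described above can be sketched with pandas. This is a minimal illustration under stated assumptions: the two small tables, the `id` key, and the `City` column are hypothetical, and median imputation and IQR clipping are just one choice for each strategy.

```python
import pandas as pd

# Hypothetical records, invented for illustration; None marks a missing value.
people = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "Age": [25, 32, None, 41, 29],
    "Income": [30000, 42000, 38000, 500000, 35000],  # 500000 is an outlier
})
cities = pd.DataFrame({
    "id": [1, 2, 3, 4, 5],
    "City": ["Pune", "Mumbai", "Pune", "Delhi", "Mumbai"],
})

# Data cleaning: impute the missing Age with the column median
# (deleting the row with people.dropna() is the other common strategy).
df_filled = people.fillna({"Age": people["Age"].median()})

# Outliers: clip Income to the 1.5 * IQR fences (one way of transforming outliers).
q1, q3 = df_filled["Income"].quantile([0.25, 0.75])
iqr = q3 - q1
df_filled["Income"] = df_filled["Income"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

# Data integration: join the two sources on their shared key.
merged = df_filled.merge(cities, on="id", how="inner")

# Label encoding: one integer code per category (codes follow sorted category order).
merged["City_label"] = merged["City"].astype("category").cat.codes

# One-hot encoding: one binary column per category.
encoded = pd.get_dummies(merged, columns=["City"])
print(encoded)
```

Label encoding keeps a single column but implies an ordering between the codes, while one-hot encoding avoids that at the cost of one extra column per category.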

Principal Component Analysis: Transforming the original features into a lower-dimensional space using linear combinations of the original variables.
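
A minimal sketch of this idea, assuming NumPy and a small made-up data matrix, computes the principal components from the singular value decomposition of the centred data. It is meant as an illustration of the technique, not a production implementation.

```python
import numpy as np

# Hypothetical data: 6 samples, 3 correlated features (values made up).
X = np.array([
    [2.5, 2.4, 1.2],
    [0.5, 0.7, 0.3],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 0.9],
    [3.1, 3.0, 1.6],
    [2.3, 2.7, 1.0],
])

# Centre each feature, then take the SVD of the centred data.
X_centered = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)

# Rows of Vt are the principal directions; keep the first two.
components = Vt[:2]

# Each new feature is a linear combination of the original ones.
X_reduced = X_centered @ components.T

# Fraction of total variance captured by each component.
explained = (S ** 2) / np.sum(S ** 2)
print(X_reduced.shape)   # (6, 2)
print(explained[:2])
```

In practice, `sklearn.decomposition.PCA(n_components=2)` wraps the same computation behind a `fit_transform` call.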

Data Discretization:
It involves converting continuous data into discrete intervals or bins. It can be useful when working with algorithms that require categorical data. Techniques like equal-width binning and entropy-based binning can be used.
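
Equal-width binning can be sketched with `pandas.cut`. The ages and the three bin labels below are hypothetical. Entropy-based binning is not shown, since it also needs class labels to choose the cut points.

```python
import pandas as pd

# Hypothetical ages, invented for illustration.
ages = pd.Series([18, 22, 25, 31, 38, 45, 52, 60, 67, 78])

# Equal-width binning: split the range 18..78 into 3 intervals of width 20.
# Intervals are closed on the right, so 38 falls in the first bin.
age_bins = pd.cut(ages, bins=3, labels=["young", "middle", "senior"])
print(age_bins.value_counts().sort_index())
```

Equal-width bins are simple to compute, but they can end up unevenly filled when the data are skewed.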
