
EDA Notes Part_1

Uploaded by

soulayush28

EDA

Pandas

pd.__version__ -> version check

1) Importing the data (Series, DataFrame)

2) Data cleaning
-> Detect / remove duplicates
-> Detect and either remove or impute missing values
-> Change data types, e.g. astype(), pd.to_numeric(), pd.to_datetime()

3) Data exploration
info()
describe()
columns
shape

4) Data manipulation
merge is more powerful in pandas than join

5) Feature engineering
Samples -> data points
Columns -> features

6) Statistical analysis
Correlation, var, mean, median

7) Data aggregation and summarisation
pivot

8) Visualization
Bar, pie, scatter, heat map, box plot

9) Analysis and storytelling
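
The data cleaning step of the outline can be sketched roughly as below. The toy data and the column names (id, price, date) are made up for illustration, and imputing with the mean is just one possible choice.

```python
import pandas as pd

# Hypothetical toy data; the column names are assumptions for this sketch.
df = pd.DataFrame({
    "id": [1, 1, 2, 3],
    "price": ["10", "10", "x", "30"],
    "date": ["2024-01-01", "2024-01-01", "2024-01-02", "2024-01-03"],
})

df = df.drop_duplicates()                                  # remove duplicate rows
df["price"] = pd.to_numeric(df["price"], errors="coerce")  # "x" becomes NaN
df["price"] = df["price"].fillna(df["price"].mean())       # impute the missing value
df["date"] = pd.to_datetime(df["date"])                    # strings -> datetime
df["id"] = df["id"].astype("int32")                        # change the data type

print(df.dtypes)
```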

data.gov.in -> Indian dataset website


pip list -> command (run in cmd) to check
which modules are installed in our system.

Series
A Series is similar to a 1-D array.
It contains scalar values of the same type.

Pandas = Panel Data

1-D (vector) -> Series
2-D (matrix) -> DataFrame (data structure)
3-D (tensor) -> Panel

- Pandas is built on top of NumPy.


oump is tekea
due tto it

Why pandas is used to do analysis on data:

we can do all editing, like deleting and adding.

Features of Series

-> One-dimensional container
-> Homogeneous data
-> Indexed / labelled
-> Flexible indexing: supports both integer-based and label-based indexing
-> A Series can have duplicate index values
-> Vectorized operations: supports element-wise operations

shape af seies’(A
shape is change
But size of a Series is mutable
-> Statistical operations
Colum

pd.Series(elements)
pd.Series([10, 11, 12])

0    10
1    11
2    12
dtype: int64

Changing the index values:

pd.Series(data, index=[...])

But if you give the same index value for more than one element,
then when you access that index it will give
all of those values.
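
A small sketch of the point above, with made-up values and labels: a custom index may contain duplicates, and accessing a duplicated label returns every matching value.

```python
import pandas as pd

# Illustrative data; the labels "a" and "b" are assumptions for this sketch.
s = pd.Series([10, 11, 12], index=["a", "b", "a"])

dup = s["a"]      # label "a" appears twice -> a Series with both values
one = s["b"]      # label "b" appears once  -> a single scalar
doubled = s * 2   # vectorized, element-wise operation

print(dup)
print(one)
```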
Creating a Series from a tuple:
pd.Series(data=values, index=...)

Creating a Series from a dict:
pd.Series(data={101: "yash", 102: "abc"})

The keys will be the index values.
dtype will be object because
string dtype is considered object in
pandas.
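
A minimal sketch of creating a Series from a dict, using the same kind of data as above. The dtype is printed rather than asserted, because the exact dtype reported for strings can depend on the installed pandas version.

```python
import pandas as pd

# Dict keys become the index labels, dict values become the data.
s = pd.Series(data={101: "yash", 102: "abc"})

print(s.index.tolist())  # the keys of the dict
print(s.dtype)           # dtype pandas chose for the string values
print(s[101])            # access by index label
```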

Series using a set: a set cannot be passed directly because
it is unordered, but it can be converted to a list first.
Duplicate values will already have been removed by the
set.

From NumPy:
pd.Series(np.array([...]))

2-D:
we cannot create a Series using a 2-D array.
-> We can also name the Series which we create
by using the name parameter while creating
the Series.

But if we convert that 2-D array to a list,
so that each element is itself a list,
then a Series will be created:

pd.Series(data=...)

dtype will be object for the above.

In theory, can we create a Series using a
0-dimensional array? Answer is yes.
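
The 2-D case can be checked with a short sketch. The array contents and the name "rows" are assumptions for illustration; converting with tolist() is one way to get a list of lists.

```python
import numpy as np
import pandas as pd

arr = np.array([[1, 2], [3, 4]])

# Passing the 2-D array directly is rejected by pandas.
try:
    pd.Series(arr)
    created = True
except ValueError:
    created = False

# A list of lists works: each element of the Series is itself a list.
rows = pd.Series(arr.tolist(), name="rows")

print(created)
print(rows)
```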

Series Attributes

index -> series_name.index gives the index of the Series
values -> series_name.values gives the values
name -> series_name.name gives the name of the Series
empty -> series_name.empty returns a boolean: True if the Series is empty
dtype -> series_name.dtype gives the type of the data; we can also specify
the dtype while creating, e.g. pd.Series(data, dtype="int32")
shape -> series_name.shape gives the shape, i.e. (n,)
size -> series_name.size gives the number of elements
ndim -> series_name.ndim gives the number of dimensions
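
The attributes can be tried out on a small example. The values and the name "marks" are made up for this sketch.

```python
import pandas as pd

s = pd.Series([1, 2, 3], name="marks", dtype="int32")

print(s.index)    # the index of the Series
print(s.values)   # the underlying values
print(s.name)     # the name given at creation
print(s.empty)    # False, because the Series has elements
print(s.dtype)    # the dtype specified at creation
print(s.shape)    # one-element tuple: (number of rows,)
print(s.size)     # number of elements
print(s.ndim)     # number of dimensions

empty = pd.Series([], dtype="float64")
```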
sbt indenirg
io lat
we
ten
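Few popular functions

-> head(): by default it returns the top 5 rows

series_name.head(-3) returns all rows except the last 3

-> tail(): by default it returns the last 5 rows

series_name.tail(-3) returns all rows

except the first 3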
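
A quick sketch of head() and tail() with positive and negative arguments, on a made-up Series holding 0 to 9.

```python
import pandas as pd

s = pd.Series(range(10))

top5 = s.head()             # first 5 rows by default
all_but_last3 = s.head(-3)  # everything except the last 3 rows
last5 = s.tail()            # last 5 rows by default
all_but_first3 = s.tail(-3) # everything except the first 3 rows

print(all_but_last3.tolist())
print(all_but_first3.tolist())
```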

-> Whatever things are apart from the basic datatypes
come under the object datatype:
strings, lists, tuples, etc.

Indexing / Slicing
For indexing we can either index directly or use
loc,
but we should not index the values
like in Python without loc.
import warnings
warnings.filterwarnings("ignore")
The above code suppresses warnings.

If we create a Series without giving index values,
we can access
the values by normal indexing,
but if we create a Series by giving index
values, we cannot rely on normal
indexing for every index.

[Diagram: example Series shown with their given index and the normal (positional) index]


Slicing

Series['a':'e']        Series[101:103]

Output -> the elements from 'a' to 'e'        Output -> empty Series

For str-based indexing, slicing is inclusive of
both the indices.

When we have an int-based index, normal
slicing does not select by label;
use loc for label-based slicing.

[Worked example: slices of an int-indexed Series taken with loc]
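
The slicing behaviour can be sketched as follows. The values and labels are made up, and loc / iloc are used explicitly so the intent (label-based versus position-based) does not depend on the pandas version.

```python
import pandas as pd

s_str = pd.Series([20, 22, 25], index=["a", "b", "c"])
s_int = pd.Series([20, 22, 25], index=[101, 102, 103])

by_str_label = s_str["a":"b"]       # label slice: both ends included
by_int_label = s_int.loc[101:102]   # label slice with loc: both ends included
by_position = s_int.iloc[0:2]       # positional slice: end excluded

print(by_str_label.tolist())
print(by_int_label.tolist())
print(by_position.tolist())
```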

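Adding new values, i.e. appending

series_name[new_index] = value

This index should not already be in the Series,
otherwise the existing value is overwritten.
The new value is added at the end.

If all the index values are integers
and we give the new index as a string,
then it does not throw an error:

[Example: a Series with integer index labels, to which an element with the string label 'a' is appended]

The index dtype becomes object.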
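
A minimal sketch of appending by assigning to a new label. The labels and values are illustrative, and loc is used here to make the label-based intent explicit.

```python
import pandas as pd

s = pd.Series([1, 2], index=[101, 102])

s.loc[103] = 3    # new integer label -> value is added at the end
s.loc["x"] = 4    # new string label  -> index now mixes int and str
s.loc[101] = 10   # existing label    -> value is overwritten, not appended

print(s)
print(s.index.dtype)
```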
Data frames (2-dimensional array)
-> Data frames are a collection of Series
-> Indexed / labelled
-> Shape is mutable
-> Size is also mutable
-> Heterogeneous data: each column can contain
a different type of data
Both ns

Syntax to create:

df = pd.DataFrame(list / Series / NumPy array / dict)

Can it contain different
datatypes, i.e. is it heterogeneous?
Yes,

but each column is individually homogeneous.

Data frames are a list of Series.

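While creating dataframes we can pass lists

as well as tuples.

columns
If the data already has column names and we
give a columns parameter
to that dataframe, a column which
is not in the data is filled with
NaN values.
pd.DataFrame({"Abc": ...},
columns=["abc"])
The above code creates a dataframe in which
column "abc" holds no data from "Abc".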
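
The effect of the columns parameter can be seen in a small sketch. The dict keys "Abc" and "xyz" and their values are assumptions for illustration.

```python
import pandas as pd

data = {"Abc": [1, 2], "xyz": [3, 4]}

# Only the requested columns are kept; "abc" is not a key of the
# dict (names are case-sensitive), so it is filled with NaN.
df = pd.DataFrame(data, columns=["Abc", "abc"])

print(df)
```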

-> If we create a dataframe using a NumPy array, the

datatype of all data will be the
same, i.e. the highest-order datatype.

But for normal creation, e.g. from a list passed to the
dataframe, this does not happen.

butf we

Nemain all He datteeA in


datatgee
name attribute of Series

name = "food"
pd.DataFrame(data={series_1.name: series_1,
series_2.name: series_2})

[Output: a dataframe whose columns are named after the Series, e.g. food]

Here the name of the Series becomes the column name.
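
A short sketch of building a dataframe from named Series. The Series names "food" and "qty" and their values are made up for illustration.

```python
import pandas as pd

food = pd.Series(["Ab", "Cd"], name="food")
qty = pd.Series([20, 30], name="qty")

# Use each Series' name as its column name.
df = pd.DataFrame({food.name: food, qty.name: qty})

# pd.concat along axis=1 picks up the names automatically.
df2 = pd.concat([food, qty], axis=1)

print(df)
```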

pandas.core.frame.DataFrame

From here pd.DataFrame is fetched.

We can also create a Series / dataframe by

another method:

pd.core.series.Series(data=[...])

pd.core.frame.DataFrame(data=[...])

The above is the backend used to create them.


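DataFrame Attributes

size -> df.size gives the number of elements
shape -> df.shape gives the shape
ndim -> df.ndim gives the number of dimensions
values -> df.values gives the values
index -> df.index gives the row index
columns -> df.columns gives the column index
dtypes -> df.dtypes gives the datatypes of
each column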
each colum

Type conversion
using astype
Using astype we can convert the datatype of
already created dataframe columns:

df['Age'].astype(np.uint)

But to save this we have to reassign the
whole column.

uint -> unsigned (positive) integer
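
A minimal sketch of the reassignment point: astype returns a new object, so the column only changes once the result is assigned back. The "Age" values are made up, and "uint8" is just one unsigned type chosen for the example.

```python
import pandas as pd

df = pd.DataFrame({"Age": [25, 30, 41]})

df["Age"].astype("uint8")        # result is discarded, df is unchanged
before = str(df["Age"].dtype)

df["Age"] = df["Age"].astype("uint8")  # reassign to save the conversion
after = str(df["Age"].dtype)

print(before, "->", after)
```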


Functions of DataFrame

head()

Sorting: the sort method gives the values in ascending order;

it will give ascending order, not descending
order, by default.
tail(): last 5 rows
Reading
pd.read_csv("file_path")

info(): gives all information about the

dataframe, like
shape, size, index, dtypes,
memory usage,
column names.

Parameters
sep / delimiter
Both parameters are the same:

they specify the separator (delimiter) used in the file.

For read_csv, by default
both are equal to ",".

header ->
This is used to specify the row
that has to be treated as the
header,

e.g. header=3.
names ->
Useful to give column names to the
dataframe, i.e. if a header is not present,
give the header names by this.
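
These parameters can be tried without a file by reading from an in-memory string. The data, the ";" separator and the column names are assumptions for this sketch.

```python
import io
import pandas as pd

# The header sits on the second line (row index 1); the line before it is skipped.
raw = "junk line\nrollno;age\n1;30\n2;25\n"
df = pd.read_csv(io.StringIO(raw), sep=";", header=1)

# No header line in the data: supply the column names with `names`.
no_header = "1;30\n2;25\n"
df2 = pd.read_csv(io.StringIO(no_header), sep=";", header=None,
                  names=["rollno", "age"])

print(df)
print(df2)
```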

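pd.read_csv(...)

index_col
Suppose we have to make a column the index:

pd.read_csv(..., index_col=col_name)

[Example table: columns rollno, age, height; rollno is used as the index]

Above, rollno was a column initially; then it became the index.

-> We can give more than one column.

But then, for accessing, we have to give both
index values to access that row, so it is more work
than a single index_col.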
skiprows
Skips rows by index, so we can use a list,
e.g. [1, 3]; the rows given in
skiprows will be skipped.

dtype
Takes a dict as argument; the values should
be dtypes.
pd.read_csv(..., dtype={...})
Specifying the datatypes of the columns saves memory
when the data is loaded into the dataframe.

If we want to get alternate rows:

pd.read_csv("...",
skiprows=range(2, 20, 2))

skipfooter
pd.read_csv(..., skipfooter=n)

Then it will not take the last n rows into the

dataframe.
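
A sketch of index_col, dtype, skiprows and skipfooter on an in-memory csv. The column names and values are made up; engine="python" is passed for skipfooter because the default C parser does not support it.

```python
import io
import pandas as pd

raw = "rollno,age,height\n1,30,160\n2,25,170\n3,28,165\n4,31,180\n"

# Use rollno as the index and store age in a smaller integer type.
df = pd.read_csv(io.StringIO(raw), index_col="rollno", dtype={"age": "int32"})

# Skip file lines 2 and 4 (line 0 is the header) -> alternate rows remain.
alt = pd.read_csv(io.StringIO(raw), skiprows=range(2, 5, 2))

# Drop the last line of the file.
no_footer = pd.read_csv(io.StringIO(raw), skipfooter=1, engine="python")

print(df)
print(alt)
```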

na_values
If we have corrupted values in
the columns, we can place NaN there
so that they can be
removed,
e.g. if we have 99
in columns, then

pd.read_csv(...,
na_values=[99, 999])
But the datatype of a column
in which a NaN value has been placed will become float,
because NaN is a float value.
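
A small sketch of na_values, assuming 99 is the placeholder for a corrupted entry in a made-up marks column.

```python
import io
import pandas as pd

raw = "name,marks\nA,80\nB,99\nC,70\n"

# Treat 99 as missing; the marks column becomes float because NaN is a float.
df = pd.read_csv(io.StringIO(raw), na_values=[99])

print(df)
print(df["marks"].dtype)
```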

Converters

Convert e.g. Female to F and
Male to M at the time of reading, using
converters:

pd.read_csv(...,

converters={"Gender": lambda x: "F" if x == "Female" else "M"})
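
The converter idea can be sketched on an in-memory csv. The data is made up, and the helper name shorten_gender is hypothetical.

```python
import io
import pandas as pd

raw = "name,Gender\nA,Female\nB,Male\n"

def shorten_gender(value):
    # Hypothetical helper: map the full word to a single letter.
    return "F" if value == "Female" else "M"

# The converter is applied to every value of the Gender column while reading.
df = pd.read_csv(io.StringIO(raw), converters={"Gender": shorten_gender})

print(df)
```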

Another use case is that we can replace the values with

this.

Reading Excel
pd.read_excel(io, sheet_name=0)
Parameters:
sheet_name -> takes the name of the specified sheet
Remaining parameters are the same as before.
Similar indexing: we can also do the access with indexing.

Mathematical operations with Series
(A) Series
[Example: arithmetic between two Series is applied element by element]
Suppose we have to add symbols:
Series is
0    76.0
1    84.0
2    85.4

Series = Series.astype(str)

Series + "%"
0    76.0%
1    84.0%
2    85.4%
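
A minimal sketch of adding a symbol to numeric values: the numbers are first converted to strings, then the symbol is concatenated element-wise. The "%" symbol is an assumption chosen for illustration.

```python
import pandas as pd

s = pd.Series([76.0, 84.0, 85.4])

# Numbers cannot be concatenated with text directly, so convert first.
labelled = s.astype(str) + "%"

print(labelled.tolist())
```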

(B) Data Frames

We can also do it like this:

pd.read_csv(...)[col_1]
The above code will give the data containing only that column.

pd.read_csv(...) + 100

This will
add 100 to the values of the
columns.
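Parameters

We have to set index=False

because we do not need the index
values of the dataframe in our output csv:

df.to_csv("...", index=False)

[Example: the output csv with and without the index column; both have a Name column with the rows Ab and cd]

columns -> specify the columns which

have to be exported; only those columns are
saved.

df.to_csv("...")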
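
A short sketch of the two to_csv parameters above. The dataframe is made up, and no path is passed so that to_csv returns the csv text instead of writing a file.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Ab", "Cd"], "Age": [20, 30]})

with_index = df.to_csv()                             # index written as first column
no_index = df.to_csv(index=False)                    # index left out
only_name = df.to_csv(index=False, columns=["Name"]) # export selected columns only

print(with_index)
print(no_index)
print(only_name)
```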
