Python Jupyter

jupyter notes

Uploaded by

khan
Copyright: © All Rights Reserved
In [ ]: import pandas as pd
        dfT = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\titanic_train.csv")

In [ ]: dfT.shape  # to observe the number of rows and columns
Out[ ]: (891, 12)

In [ ]: dfT.head(10)  # to show the first 10 rows of the dataset
Out[ ]: [table: first 10 rows; columns PassengerId, Survived, Pclass, Name, Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked]

In [ ]: dfT.info()  # detailed information about the dataset
        Data columns (total 12 columns):
         #   Column       Non-Null Count  Dtype
         0   PassengerId  891 non-null    int64
         1   Survived     891 non-null    int64
         2   Pclass       891 non-null    int64
         3   Name         891 non-null    object
         4   Sex          891 non-null    object
         5   Age          714 non-null    float64
         6   SibSp        891 non-null    int64
         7   Parch        891 non-null    int64
         8   Ticket       891 non-null    object
         9   Fare         891 non-null    float64
         10  Cabin        204 non-null    object
         11  Embarked     889 non-null    object
        dtypes: float64(2), int64(5), object(5)
        memory usage: 83.7+ KB

In [ ]: dfT.isnull().sum()  # to count the missing values in each column
Out[ ]: [null count per column]

In [ ]: dfT1 = dfT.dropna(subset=["Embarked", "Age"])
        dfT1.shape

In [87]: dfT1.isnull().sum()
Out[87]: [null count per column]

In [18]: dfT.head()
Out[18]: [table: first 5 rows, all 12 columns]

In [ ]: dfT.drop("PassengerId", axis=1, inplace=True)  # drop PassengerId because it duplicates the index and is not useful

In [21]: dfT.head()
Out[21]: [table: first 5 rows without PassengerId]

In [22]: dfT.isnull().sum()  # to show the missing values count
Out[22]: [null count per column]

In [ ]: Temp = dfT.dropna()  # drop every row containing a NaN, regardless of column
        Temp.shape

In [ ]: Tsub = dfT.dropna(subset=["Embarked", "Age"])  # drop rows with missing values in Embarked or Age and store the result in a new data frame

In [ ]: data.info()  # detailed information about the employees dataset
        Data columns (total 8 columns):
         #   Column             Non-Null Count  Dtype
         0   First Name         933 non-null    object
         1   Gender             855 non-null    object
         2   Start Date         1000 non-null   object
         3   Last Login Time    1000 non-null   object
         4   Salary             1000 non-null   int64
         5   Bonus %            1000 non-null   float64
         6   Senior Management  933 non-null    object
         7   Team               957 non-null    object
        dtypes: float64(1), int64(1), object(6)
        memory usage: 62.6+ KB

In [ ]: # creating a bool series that is True for NaN values
        bool_series = pd.isnull(data["Gender"])
        # filtering data: displaying only the rows where Gender is NaN
        data[bool_series]
Out[ ]: [table: 145 rows × 8 columns]

In [22]: data[~bool_series]
Out[22]: [table: rows where Gender is present]

Code #3

In [ ]: import pandas as pd
        # importing numpy as np
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, 90, np.nan, 95],
                'Second Score': [30, 45, 56, np.nan],
                'Third Score': [np.nan, 40, 80, 98]}
        # creating a dataframe using the dictionary
        df = pd.DataFrame(dict)
        # using notnull() function
        df.notnull()

In [ ]: df.notnull().sum()
Out[ ]: First Score     3
        Second Score    3
        Third Score     3
        dtype: int64

In [28]: import pandas as pd
         # making data frame from csv file
         data = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\employees.csv")
         # creating a bool series that is True for non-NaN values
         bool_series = pd.notnull(data["Gender"])
         # filtering data
         data[bool_series]  # same rows as data[~bool_series] with the earlier isnull() mask
Out[28]: [table: rows where Gender is present]

Code #4: Dropping rows with at least 1 null value.
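The dropping options used in this part of the notes differ only in which nulls trigger the drop. The following is a minimal sketch on a made-up frame (the column names `a` to `d` and the variable names are illustrative, not taken from the notebook) that compares the default behaviour with `how`, `subset` and `axis`.

```python
import numpy as np
import pandas as pd

# Made-up frame: row 1 is null in a, b, c; row 2 is null only in a; d is complete.
scores = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [1.0, np.nan, 3.0, 4.0],
    "c": [1.0, np.nan, 3.0, 4.0],
    "d": [1.0, 2.0, 3.0, 4.0],
})

any_null = scores.dropna()                    # drop rows with at least one null
all_null = scores.dropna(how="all")           # drop rows only if every value is null
by_subset = scores.dropna(subset=["b", "c"])  # look for nulls only in b and c
by_column = scores.dropna(axis=1)             # drop columns that contain a null

print(len(any_null), len(all_null), len(by_subset), list(by_column.columns))
```

None of these calls changes `scores` itself; each returns a new frame, which is why the notebook either assigns the result to a new name or passes `inplace=True`.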
In [ ]: import pandas as pd
        # importing numpy as np
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, 90, np.nan, 95],
                'Second Score': [30, np.nan, 45, 56],
                'Third Score': [52, 40, 80, 98],
                'Fourth Score': [np.nan, np.nan, np.nan, 65]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)
        df

Now we drop rows with at least one NaN value (null value).

In [ ]: import pandas as pd
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, 90, np.nan, 95],
                'Second Score': [30, np.nan, 45, 56],
                'Third Score': [52, 40, 80, 98],
                'Fourth Score': [np.nan, np.nan, np.nan, 65]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)
        # using dropna() function
        df.dropna()  # only for display

In [ ]: df.dropna(inplace=True)  # permanent dropping from the dataframe

Code #5: Dropping rows if all values in that row are missing.

In [ ]: import pandas as pd
        # importing numpy as np
        import numpy as np
        dict = {'First Score': [100, np.nan, np.nan, 95],
                'Second Score': [30, np.nan, 45, 56],
                'Third Score': [52, np.nan, 80, 98],
                'Fourth Score': [np.nan, np.nan, np.nan, 65]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)
        df.dropna(how='all')  # drop a row only if the complete row is NaN

In [ ]: df.dropna(how='all', inplace=True)

Code #6: Dropping columns with at least 1 null value.

In [ ]: # importing pandas as pd
        import pandas as pd
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, np.nan, np.nan, 95],
                'Second Score': [30, np.nan, 45, 56],
                'Third Score': [52, np.nan, 80, 98],
                'Fourth Score': [60, 67, 68, 65]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)
        df

Now we drop the columns that have at least 1 missing value.

In [ ]: import pandas as pd
        # importing numpy as np
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, np.nan, np.nan, 95],
                'Second Score': [30, np.nan, 45, 56],
                'Third Score': [52, np.nan, 80, 98],
                'Fourth Score': [60, 67, 68, 65]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)
        # using dropna() function
        df.dropna(axis=1)

In [ ]: df.dropna(axis=1, inplace=True)  # for permanent dropping

Dropping NaN with respect to specific columns

In [ ]: import pandas as pd
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, np.nan, np.nan, 95],
                'Second Score': [30, np.nan, 45, 56],
                'Third Score': [52, np.nan, 80, 98],
                'Fourth Score': [60, 67, np.nan, 65]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)

In [34]: df.dropna(subset=['Fourth Score', 'Third Score'])
Out[34]: [table: rows with no null in Fourth Score or Third Score]

In [24]: data = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\employees.csv")

In [37]: data.isnull().sum()
Out[37]: [null count per column]

Code #7: Dropping rows with at least 1 null value in a CSV file

In [32]: # importing pandas module
         import pandas as pd
         # making data frame from csv file
         data = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\employees.csv")

In [33]: data.head()
Out[33]: [table: First Name, Gender, Start Date, Last Login Time, Salary, Bonus %, Senior Management, Team]

In [34]: # making a new data frame with the NA rows dropped
         new_data = data.dropna(axis=0, how='any')
         new_data
Out[34]: [table: rows with no null value, 8 columns]

Now we compare the sizes of the data frames to find out how many rows had at least 1 null value.

In [35]: print("Old data frame length:", len(data))
         print("New data frame length:", len(new_data))
         print("Number of rows with at least 1 NA value: ", (len(data) - len(new_data)))
         Old data frame length: 1000
         [remaining two output lines lost in extraction]

Fillna according to the requirement

Code #9

In [ ]: import pandas as pd
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, 90, np.nan, 95],
                'Second Score': [30, 45, 56, np.nan],
                'Third Score': [np.nan, 40, 80, 98]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)

In [107]: df['First Score'].fillna(method='ffill', inplace=True)
          # method= is deprecated in pandas 2.1+; df['First Score'].ffill() is the current spelling

In [108]: df
Out[108]: [table: First Score, Second Score, Third Score]

In [104]: df.fillna(method='bfill')
Out[104]: [table: First Score, Second Score, Third Score]

In [109]: # filling missing values with a constant using fillna()
          df['First Score'] = df['First Score'].fillna("abs")

Code #10: Filling null values with the previous ones

In [ ]: import pandas as pd
        # importing numpy as np
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, 90, np.nan, 95],
                'Second Score': [30, 45, 56, np.nan],
                'Third Score': [np.nan, 40, 80, 98]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)

In [ ]: df
Out[ ]: [table: First Score, Second Score, Third Score]

In [ ]: # filling a missing value with the previous one
        df.fillna(method='pad')

In [ ]: # filling a missing value with the previous one
        df.fillna(method='ffill')
Out[ ]: [table: First Score, Second Score, Third Score]

'pad' and forward fill are the same.

Code #11: Filling null values with the next ones (backward fill)

In [ ]: import pandas as pd
        import numpy as np
        # dictionary of lists
        dict = {'First Score': [100, 90, np.nan, 95],
                'Second Score': [30, 45, 56, np.nan],
                'Third Score': [np.nan, 40, 80, 98]}
        # creating a dataframe from the dictionary
        df = pd.DataFrame(dict)
        # filling null values using the fillna() function
        df.fillna(method='bfill')

In [ ]: df['Second Score'] = df['Second Score'].fillna(method='ffill')

In [ ]: df
Out[ ]: [table: First Score, Second Score, Third Score]

Code #12: Filling null values in a CSV file - Employee dataset

In [24]: # importing pandas package
         import pandas as pd
         # making data frame from csv file
         data = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\employees.csv")
         # printing rows 10 to 24 of the data frame for visualization
         data[10:25]
Out[24]: [table: rows 10 to 24]

In [26]: data.isnull().sum()
Out[26]: [null count per column: First Name, Gender, Start Date, Last Login Time, Salary, Bonus %, Senior Management, Team]

Now we are going to fill all the null values in the Gender column with "No Gender".

In [39]: # importing pandas package
         # making data frame from csv file
         # data = pd.read_csv("employees.csv")
         # filling null values using fillna()
         data["Gender"].fillna("No Gender", inplace=True)
         data
Out[39]: [table: all rows, Gender nulls replaced by "No Gender"]

Code #13: Filling null values using the replace() method

In [58]: # importing pandas package
         import pandas as pd
         # making data frame from csv file
         data = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\employees.csv")
         # printing rows 10 to 24 of the data frame for visualization
         data[10:25]
Out[58]: [table: rows 10 to 24]

Now we are going to replace all the NaN values in the data frame with the value -99.

In [ ]: import pandas as pd
        import numpy as np  # needed for np.nan below
        # making data frame from csv file
        # data = pd.read_csv("employees.csv")
        # replacing NaN values in the dataframe with -99
        data.replace(to_replace=np.nan, value=-99)
Out[ ]: [table: 1000 rows × 8 columns]

Code #14: Using the interpolate() function to fill the missing values using the linear method.

In [ ]: import pandas as pd
        # creating the dataframe
        df = pd.DataFrame({"A": [12, 4, 5, None, 1],
                           "B": [None, 2, 54, 3, None],
                           "C": [20, 16, None, 3, 8],
                           "D": [14, 3, None, None, 6]})
        # print the dataframe
        df
Out[ ]: [table: columns A, B, C, D]

Let's interpolate the missing values using the linear method. Note that the linear method ignores the index and treats the values as equally spaced.
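To see what "ignores the index and treats the values as equally spaced" means in practice, here is a minimal sketch on a made-up series with an uneven index (the values and variable names are illustrative). It contrasts `method="linear"` with `method="index"`, and shows that a leading NaN stays unfilled when filling only goes forward.

```python
import numpy as np
import pandas as pd

# Uneven index on purpose: the last label is far from the others.
s = pd.Series([10.0, np.nan, np.nan, 40.0], index=[0, 1, 2, 30])

linear = s.interpolate(method="linear")   # equal steps between known values
by_index = s.interpolate(method="index")  # steps proportional to index distance

# A NaN in the first position has no earlier value to interpolate from.
leading = pd.Series([np.nan, 2.0, np.nan, 4.0]).interpolate(
    method="linear", limit_direction="forward"
)

print(linear.tolist())
print(by_index.tolist())
print(leading.tolist())
```

With `linear` the two gaps become 20 and 30, while `index` keeps them close to 10 because labels 1 and 2 sit near label 0.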
In [22]: # to interpolate the missing values
         df.interpolate(method='linear', limit_direction='forward')
Out[22]: [table: columns A, B, C, D]

In [ ]: # As the output shows, the values in the first row could not be filled, because the
        # direction of filling is forward and there is no earlier value to interpolate from.

Filling with aggregation: mean, median, mode

Code #15: Use weather_data.csv

In [9]: import pandas as pd
        dfw = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\weather_data_missing.csv")

In [8]: dfw.head(10)
Out[8]: [table: day, temperature, windspeed, event]

In [5]: dfw.isnull().sum()
Out[5]: [null count per column]

In [6]: dfw.temperature.mean()
Out[6]: [mean temperature]

In [7]: dfw.temperature.fillna(dfw.temperature.mean(), inplace=True)  # filling in temperature using the mean

In [26]: dfw.event.fillna(dfw.event.mode()[0], inplace=True)

In [27]: dfw
Out[27]: [table: day, temperature, windspeed, event]

In [ ]: # filling in temperature using linear interpolation
        dfw.temperature.interpolate(method='linear', limit_direction='forward', inplace=True)

In [23]: dfw
Out[23]: [table: day, temperature, windspeed, event]

In [80]: dfw.windspeed.fillna(dfw.windspeed.mean(), inplace=True)

In [81]: dfw
Out[81]: [table: day, temperature, windspeed, event]

In [42]: import statistics as st

In [ ]: dfw.event.mode()

In [43]: st.mode(dfw.event)
Out[43]: 'Sunny'

In [44]: dfw.event.fillna(st.mode(dfw.event), inplace=True)  # fill by mode

In [ ]: dfw
Out[ ]: [table: day, temperature, windspeed, event]

Filling by interpolation

Code #16

In [106]: dfw = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\weather_data_missing.csv")

In [107]: dfw
Out[107]: [table: day, temperature, windspeed, event]

In [109]: dfw.temperature.fillna(dfw.temperature.mean(), inplace=True)  # filling with the mean

In [110]: dfw
Out[110]: [table: day, temperature, windspeed, event]

In [96]: dfw['day'] = pd.to_datetime(dfw['day'])

In [97]: dfw.info()
         [Output: RangeIndex starting at 0 and ending at 23; 4 columns: day (datetime64[ns]), temperature (float64), windspeed (float64), event (object); non-null counts garbled in extraction]

In [102]: dfw.set_index('day', inplace=True)
          # to interpolate with respect to time, first make the day (time) column the index

In [103]: dfw.temperature.interpolate(method='time', inplace=True)  # filling with interpolation

In [105]: dfw
Out[105]: [table: temperature, windspeed, event, indexed by day]

In [ ]: # There is a clear difference between mean filling and interpolation filling:
        # for data ordered in time, interpolation gives the better fill.

In [45]: data.head()
Out[45]: [table: first 5 rows of the employees data]

In [46]: import seaborn as sns

In [49]: sns.boxplot(data['Bonus %'])  # passing x=data['Bonus %'] avoids the warning below
         ...\seaborn\_decorators.py:36: FutureWarning: Pass the following variable as a keyword arg: x. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
Out[49]: <AxesSubplot:xlabel='Bonus %'>
         [figure: box plot of Bonus %]

In [50]: dfT = pd.read_csv("D:\\Users\\ce\\BigDataAnalytics\\dataset\\titanic_train.csv")

In [51]: dfT.columns
Out[51]: Index(['PassengerId', 'Survived', 'Pclass', 'Name', 'Sex', 'Age', 'SibSp',
                'Parch', 'Ticket', 'Fare', 'Cabin', 'Embarked'],
               dtype='object')

In [52]: sns.boxplot(dfT.Age)
         [Output: the same seaborn FutureWarning about positional arguments]
Out[52]: <AxesSubplot:xlabel='Age'>
         [figure: box plot of Age]

In [58]: sns.boxplot(dfT.Fare)
         [Output: the same seaborn FutureWarning about positional arguments]
Out[58]: <AxesSubplot:xlabel='Fare'>
         [figure: box plot of Fare]

Assignment - 1

In [2]: import pandas as pd

In [89]: # use 3 decimal places in output display
         pd.set_option("display.precision", 3)
         # don't wrap repr(DataFrame) across additional lines
         pd.set_option("display.expand_frame_repr", False)
         # set max rows displayed in output to 25
         pd.set_option("display.max_rows", 25)
         pd.set_option("display.max_columns", 25)

In [134]: df = pd.read_csv("G:\\Teaching Subject\\Big Data Analytics\\I0A Diploma Prog\\60R-8\\Wrangling\\Assignment\\Dataset\\Mobile Price_…   [path truncated in the source]

In [127]: df.shape
Out[127]: (2000, 22)

In [135]: df.isnull().sum()
Out[135]: [null count per column]

In [136]: df.head()
Out[136]: [table: first 5 rows of the mobile price data]
In [ ]: df.blue.value_counts()
Out[ ]: [counts of each value in blue]

In [ ]: df.blue.replace(to_replace=[0, 1], value=['False', 'True'], inplace=True)

In [ ]: import numpy as np

In [ ]: df.battery_power = df.battery_power.astype(str)

In [ ]: # turn every battery_power value that matches the pattern into NaN
        df.battery_power.replace(to_replace='[6][0-9][5]', value=np.nan, regex=True, inplace=True)

In [ ]: df.battery_power = df.battery_power.astype('float64')

In [ ]: df.info()

In [126]: df.head()
Out[126]: [table: first 5 rows]

In [127]: df.four_g.replace(to_replace=[0, 1], value=['No', 'Yes'], inplace=True)

In [ ]: df.head()
Out[ ]: [table: first 5 rows]

In [128]: df.ram = df.ram.astype(str)

In [130]: df.ram.replace(to_replace='[2][0-9][2]', value=np.nan, regex=True, inplace=True)

In [131]: df.ram = df.ram.astype('float64')

In [132]: df.info()
          [Output: 2000 entries; battery_power 1967 non-null float64; ram 1787 non-null float64; blue and four_g are object; all other columns 2000 non-null; dtypes: float64(4), int64, object(2)]

In [133]: df.to_csv('Mobile Price classification train missing.csv')

--- outliers ---

In [3]: import pandas as pd
        import numpy as np
        import matplotlib.pyplot as plt
        import seaborn as sns
        from sklearn.datasets import load_boston  # unused below; load_boston was removed in scikit-learn 1.2

In [ ]: df = pd.read_csv("/dataset/boston_train.csv")

In [ ]: df.head()
Out[ ]: [table: first 5 rows of the Boston training data]

In [ ]: sns.boxplot(df.rm)
        ...\seaborn\_decorators.py:36: FutureWarning: Pass the following variable as a keyword arg: x. From version 0.12, the only valid positional argument will be `data`, and passing other arguments without an explicit keyword will result in an error or misinterpretation.
Out[ ]: <AxesSubplot:xlabel='rm'>
        [figure: box plot of rm]

In [ ]: df5 = df[['lstat', 'rm', 'crim']]  # adding .copy() here avoids SettingWithCopyWarning on later assignments

In [8]: df5.columns = ['LSTAT', 'RM', 'CRIM']

In [ ]: df5
Out[ ]: [table: 333 rows × 3 columns]

In [30]: sns.distplot(df5['RM'])
         ...\seaborn\distributions.py:2619: FutureWarning: `distplot` is a deprecated function and will be removed in a future version. Please adapt your code to use either `displot` (a figure-level function with similar flexibility) or `histplot` (an axes-level function for histograms).
Out[30]: <AxesSubplot:xlabel='RM', ylabel='Density'>
         [figure: distribution of RM]

In [12]: sns.boxplot(df5['RM'])
         [Output: the seaborn FutureWarning about positional arguments]
Out[12]: <AxesSubplot:xlabel='RM'>
         [figure: box plot of RM]

--- removing the outliers ---

--- outlier boundaries function ---

In [22]: def find_boundaries(df, variable, distance):
             # quartiles and inter-quartile range of the chosen column
             Q1 = df[variable].quantile(0.25)
             Q3 = df[variable].quantile(0.75)
             IQR = Q3 - Q1
             lower_boundary = Q1 - (IQR * distance)
             upper_boundary = Q3 + (IQR * distance)
             return upper_boundary, lower_boundary

In [23]: sns.boxplot(df5.RM)
         [Output: the seaborn FutureWarning about positional arguments]
Out[23]: <AxesSubplot:xlabel='RM'>
         [figure: box plot of RM]

In [24]: RM_upper_limit, RM_lower_limit = find_boundaries(df5, 'RM', 1.5)

In [25]: RM_upper_limit, RM_lower_limit
Out[25]: approximately (7.6815, 4.8175)

In [26]: df5.RM.min()
Out[26]: 3.561

Let's create a Boolean vector to flag the outliers in RM:

In [27]: outliers_RM = np.where(df5['RM'] > RM_upper_limit, True,
                                np.where(df5['RM'] < RM_lower_limit, True, False))

In [28]: outliers_RM
Out[28]: array([False, False, False, ..., False, False, False])
         [full Boolean array elided]

In [29]: df5.shape
Out[29]: (333, 3)

In [ ]: df5[outliers_RM].count()  # count the outliers

--- Finally, let's remove the outliers from the dataset:

In [ ]: df5_trimmed = df5.loc[~(outliers_RM)]
        df5_trimmed.shape

In [ ]: df5_trimmed.head()

In [ ]: df5.min()

In [26]: sns.boxplot(df5_trimmed.RM)
         [Output: the seaborn FutureWarning about positional arguments]
Out[26]: <AxesSubplot:xlabel='RM'>
         [figure: box plot of RM after trimming]

In [28]: sns.distplot(df5_trimmed['RM'])
         [Output: the seaborn FutureWarning that `distplot` is deprecated]
Out[28]: <AxesSubplot:xlabel='RM', ylabel='Density'>
         [figure: distribution of RM after trimming]

In [37]: sns.boxplot(df5_trimmed['RM'])
Out[37]: <AxesSubplot:xlabel='RM'>

In [ ]: RM_UB, RM_LB = find_boundaries(df5, 'RM', 1.5)

In [ ]: df5['RM_alt_trim'] = df5.RM[(df5.RM < RM_UB) & (df5.RM > RM_LB)]  # this trims the outliers
        SettingWithCopyWarning:
        A value is trying to be set on a copy of a slice from a DataFrame.
        Try using .loc[row_indexer,col_indexer] = value instead
        See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

In [ ]: df5.isnull().sum()
Out[ ]: [null count per column; only RM_alt_trim has nulls]

In [ ]: df5.info()
        [Output: RangeIndex: 333 entries, 0 to 332; 4 columns; LSTAT, RM and CRIM 333 non-null, RM_alt_trim 312 non-null; dtypes: float64(4)]

In [36]: df5.dropna(inplace=True)
         [Output: SettingWithCopyWarning, because df5 is a slice of df]

In [37]: df5.shape

--- 2. Making NaN outliers ---

In [ ]: df5_new_NaN = df5.RM[(df5.RM < RM_upper_limit) & (df5.RM > RM_lower_limit)]

In [ ]: df5['RM_new'] = df5_new_NaN
        [Output: SettingWithCopyWarning, because df5 is a slice of df]

In [ ]: df5.info()
        [Output: RangeIndex: 333 entries, 0 to 332; 4 columns; RM_new has fewer non-null values than RM; dtypes: float64(4)]

In [63]: df5.RM.min()

In [64]: df5.RM.max()
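The boundary computation and the two treatments above (trimming the rows, or keeping the rows and masking the outliers to NaN) can be put together as a minimal sketch. The data is made up, and `iqr_bounds`, `rooms`, `trimmed` and `masked` are illustrative names rather than the notebook's; working on explicit copies is one way to avoid the SettingWithCopyWarning shown in the outputs.

```python
import numpy as np
import pandas as pd

def iqr_bounds(series, distance=1.5):
    """Return (lower, upper) limits at `distance` IQRs beyond the quartiles."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - distance * iqr, q3 + distance * iqr

# Made-up room counts with one low and one high outlier.
rooms = pd.DataFrame({"RM": [5.9, 6.0, 6.1, 6.2, 6.3, 6.4, 3.5, 8.9]})

lower, upper = iqr_bounds(rooms["RM"])
is_outlier = (rooms["RM"] < lower) | (rooms["RM"] > upper)

# Treatment 1: trim, i.e. drop the outlier rows (copy so later edits are safe).
trimmed = rooms.loc[~is_outlier].copy()

# Treatment 2: keep every row but turn outliers into NaN in a new column.
masked = rooms.copy()
masked["RM_new"] = masked["RM"].where(~is_outlier)

print(is_outlier.sum(), len(trimmed), masked["RM_new"].isna().sum())
```

Masking keeps the row count unchanged, so the NaN values can later be filled with any of the imputation methods from the earlier sections.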
--- 3. Binning ---

In [ ]: df5['RM_Bin_2'] = pd.cut(df5['RM'], 4, labels=[1, 2, 3, 4])  # fixed binning: 4 equal-width bins
        [Output: SettingWithCopyWarning, because df5 is a slice of df]

In [ ]: df5.head()
Out[ ]: [table: LSTAT, RM, CRIM and the derived RM columns]

In [ ]: df5.RM_Bin_2.value_counts()
Out[ ]: [count per bin]

In [72]: df5['RM_Bin_3'] = pd.cut(df5['RM'], [2, 3.8, 5.2, 6.8, 8.8], labels=[1, 2, 3, 4], include_lowest=True)  # variable binning: custom edges
         [Output: SettingWithCopyWarning, because df5 is a slice of df]

In [73]: df5.RM_Bin_3.value_counts()
Out[73]: [count per bin]
         Name: RM_Bin_3, dtype: int64

In [68]: df5.RM_Bin_2.value_counts()
Out[68]: [count per bin]
         Name: RM_Bin_2, dtype: int64

In [74]: df5['RM_Bin_2'] = pd.qcut(df5['RM'], 5, labels=[0, 1, 2, 3, 4])  # quantile binning: 5 bins with roughly equal counts
         [Output: SettingWithCopyWarning, because df5 is a slice of df]
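The three binning calls above differ in how the bin edges are chosen. A minimal sketch on a made-up series (the values and the names `equal_width`, `custom_edges` and `equal_count` are illustrative) shows equal-width bins, hand-picked edges, and quantile bins side by side.

```python
import pandas as pd

# Ten made-up RM values, all distinct.
rm = pd.Series([3.6, 4.9, 5.5, 5.9, 6.1, 6.3, 6.8, 7.4, 8.0, 8.7], name="RM")

# Fixed binning: 4 bins of equal width across the observed range.
equal_width = pd.cut(rm, 4, labels=[1, 2, 3, 4])

# Variable binning: the caller supplies the edges; include_lowest closes the first bin on the left.
custom_edges = pd.cut(rm, [2, 3.8, 5.2, 6.8, 8.8], labels=[1, 2, 3, 4], include_lowest=True)

# Quantile binning: edges are chosen so each bin holds about the same number of rows.
equal_count = pd.qcut(rm, 5, labels=[0, 1, 2, 3, 4])

print(equal_width.value_counts().sort_index().tolist())
print(custom_edges.value_counts().sort_index().tolist())
print(equal_count.value_counts().sort_index().tolist())
```

`pd.cut` can leave bins nearly empty when the data is skewed, whereas `pd.qcut` balances the counts at the cost of unequal bin widths.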
In [75]: df5.RM_Bin_2.value_counts()
Out[75]: [count per bin]
         Name: RM_Bin_2, dtype: int64

--- Practice on selection of values ---

In [66]: df5.head()

In [67]: df5['RM_Bin'] = pd.cut(df5['RM'], 4, labels=[1, 2, 3, 4], include_lowest=True)
         [Output: SettingWithCopyWarning, because df5 is a slice of df]

In [ ]: df.info()
        [Output: RangeIndex: 2000 entries, 0 to 1999; 22 columns including Unnamed: 0; dtypes: bool(1), float64(4), int64(16), object(1); memory usage: 330.2+ KB]

In [82]: df.info()
         [Output: same frame; battery_power has 1967 non-null values, four_g is object, the other columns listed are 2000 non-null; dtypes: bool(1), float64(4), int64(16), object(1); memory usage: 330.2+ KB]

In [93]: df["battery_power"].mean()
Out[93]: [mean of battery_power; pandas skips the NaN values]

In [98]: import statistics as st

In [99]: st.mean(df.battery_power)
Out[99]: nan

In [123]: df.groupby(['n_cores', 'dual_sim'])["battery_power"].mean()
Out[123]: [mean battery_power for each (n_cores, dual_sim) pair]
          Name: battery_power, dtype: float64

In [122]: df.groupby(['dual_sim', 'n_cores'])["battery_power"].mean()
Out[122]: [mean battery_power for each (dual_sim, n_cores) pair]
          Name: battery_power, dtype: float64

In [124]: # fill each missing battery_power with the mean of its n_cores group
          df["battery_power"].fillna(df.groupby('n_cores')["battery_power"].transform('mean'), inplace=True)

In [125]: df.head(10)
Out[125]: [table: first 10 rows of the mobile price data]
In [ ]: def sy2(name):
            print(name)

In [ ]: sy2("Faizan")
Faizan

[The remaining cells in this group (a function that prints a fixed string, a two-argument function, and the mode of a NumPy array computed with DataFrame.mode() and st.mode()) did not survive extraction.]

--- metacharacter [] ---

In [ ]: # Find all lower case characters alphabetically between "a" and "m":
        x = re.findall("[a-m]", txt)
        print(x)
['h', 'e', 'a', 'i', 'i', 'a', 'i']

In [ ]: x = re.search("ai", txt)
        print(x)
<re.Match object; span=(5, 7), match='ai'>

In [ ]: x = re.match("The", txt)
        print(x)
<re.Match object; span=(0, 3), match='The'>

In [ ]: x = re.split(r"\s", txt)
        print(x)
['The', 'rain', 'in', 'Spain']

In [ ]: x = re.sub(r"\s", "9", txt)
        print(x)

--- one or more: + and * ---

In [ ]: # Find all runs of lower case characters alphabetically between "a" and "m":
        x = re.findall("[a-m]+", txt)
        print(x)
['he', 'ai', 'i', 'ai']

In [ ]: txt = "The rain in Spain"
        # Find all runs of lower case characters alphabetically between "a" and "m":
        x = re.findall("[a-m]*", txt)
        print(x)

--- metacharacter \ ---

In [ ]: txt = "That will be 59 do789ll36rs"
        x = re.findall(r"\d+", txt)
        print(x)
['59', '789', '36']

In [ ]: txt = "That will be 59 dollars"
        x = re.findall(r"\D+", txt)
        print(x)
['That will be ', ' dollars']

In [ ]: # Find all digit characters:
        x = re.findall(r"\d", txt)
        print(x)
['5', '9']

--- metacharacter '.' ---
In [ ]: # Search for a sequence that starts with "he", followed by two (any) characters, and an "o":
        x = re.findall("he..o", txt)
        print(x)

--- metacharacter '^' start with given character ---

In [ ]: # Check if the string starts with 'hello':
        x = re.findall("^hello", txt)
        if x:
            print("Yes, the string starts with 'hello'")
        else:
            print("No match")

[A further cell applies re.findall with a digit character class to a mixed letter-and-digit string; its text did not survive extraction.]

In [ ]: txt = "hello planet hello world"
        # Check if the string starts with 'hello world':
        x = re.findall("^hello world", txt)
        if x:
            print("Yes, the string starts with 'hello'")
        else:
            print("No match")
No match

--- metacharacter '^' with [] start with given character ---

[In [48]: an attempt to combine ^ with a character class ended in "SyntaxError: invalid syntax"; the pattern did not survive extraction.]

In [ ]: # --- metacharacter $ ---
        txt = "hello planet"
        # Check if the string ends with 'planet':
        x = re.findall("planet$", txt)
        if x:
            print("Yes, the string ends with 'planet'")
        else:
            print("No match")
Yes, the string ends with 'planet'

In [ ]: txt = "hello planet world"
        # Check if the string ends with 'planet':
        x = re.findall("planet$", txt)
        if x:
            print("Yes, the string ends with 'planet'")
        else:
            print("No match")
No match

In [ ]: txt = "hello planet eo helo"
        # Search for a sequence that starts with "he", followed by 0 or more lower case letters, and an "o":
        x = re.findall("he[a-z]*o", txt)
        print(x)
['hello', 'helo']

In [57]: # --- metacharacter + ---
         txt = "hello planet"
         # Search for a sequence that starts with "he", followed by 1 or more (any) characters, and an "o":
         x = re.findall("he.+o", txt)
         print(x)
['hello']

In [59]: # --- metacharacter ? ---
         txt = "hello planet"
         # Search for a sequence that starts with "he", followed by 0 or 1 (any) character, and an "o":
         x = re.findall("he.?o", txt)
         print(x)

[In [75]: a second cell on the ? metacharacter; its pattern and output did not survive extraction.]

In [63]: # --- metacharacter {} ---
         txt = "hello planet"
         # Search for a sequence that starts with "he", followed by exactly 2 (any) characters, and an "o":
         x = re.findall("he.{2}o", txt)
         print(x)
['hello']

In [71]: # --- metacharacter {} ---
         txt = "hello planet"
         # Find every pair of consecutive lower case letters:
         x = re.findall("[a-z]{2}", txt)
         print(x)
['he', 'll', 'pl', 'an', 'et']

In [76]: # --- metacharacter | ---
         txt = "The rain in Spain falls mainly in the plain!"
         # Check if the string contains either "falls" or "stays":
         x = re.findall("falls|stays", txt)
         print(x)
         if x:
             print("Yes, there is at least one match!")
         else:
             print("No match")
['falls']
Yes, there is at least one match!

In [ ]: # --- special sequence \A ---
        txt = "The rain in Spain"
        # Check if the string starts with "The":
        x = re.findall(r"\AThe", txt)
        print(x)
        if x:
            print("Yes, there is a match!")
        else:
            print("No match")
['The']
Yes, there is a match!
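The quantifier and anchor cells above can be checked side by side in one short script. This is a minimal sketch using only the standard `re` module and the sample strings from the notebook ("hello planet" and the "falls|stays" sentence); the variable names are made up for illustration.

```python
import re

txt = "hello planet"

# '.' matches any single character; the quantifier after it says how many.
two_any = re.findall(r"he..o", txt)        # exactly two characters in between
one_or_more = re.findall(r"he.+o", txt)    # at least one character in between
zero_or_one = re.findall(r"he.?o", txt)    # at most one: "ll" is too long
exactly_two = re.findall(r"he.{2}o", txt)  # same as "he..o", written with {}

# Anchors match positions, not characters.
starts = bool(re.findall(r"^hello", txt))                   # start of string
ends = bool(re.findall(r"planet$", txt))                    # end of string
ends_longer = bool(re.findall(r"planet$", txt + " world"))  # no longer at the end

# '|' tries the left alternative, then the right one.
either = re.findall(r"falls|stays", "The rain in Spain falls mainly in the plain!")

print(two_any, one_or_more, zero_or_one, exactly_two)
print(starts, ends, ends_longer, either)
```

The `he.?o` case returns an empty list because there are two characters between "he" and "o", while `?` allows at most one.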
In [ ]: # --- special sequence \b ---
        txt = "The rain in Spain"
        # Check if "ain" is present at the end of a WORD:
        x = re.findall(r"ain\b", txt)
        print(x)
        if x:
            print("Yes, there is at least one match!")
        else:
            print("No match")
['ain', 'ain']
Yes, there is at least one match!

In [85]: # --- special sequence \b ---
         txt = "The rain in Spain"
         # Check if "ain" is present at the beginning of a WORD:
         x = re.findall(r"\bain", txt)
         print(x)
         if x:
             print("Yes, there is at least one match!")
         else:
             print("No match")
[]
No match

In [86]: # --- special sequence \b ---
         txt = "The rain in Spain"
         # Check if "ain" is present at the beginning of a WORD:
         x = re.findall("\bain", txt)
         print(x)
         if x:
             print("Yes, there is at least one match!")
         else:
             print("No match")
[]
No match

In [87]: # --- special sequence \b: beginning (or end) of word ---
         txt = "The rain in Spain"
         # Check if "Spain" is present at the beginning of a WORD:
         x = re.findall(r"\bSpain", txt)
         print(x)
         if x:
             print("Yes, there is at least one match!")
         else:
             print("No match")
['Spain']
Yes, there is at least one match!
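The word-boundary cells above depend on two details that are easy to miss: which side of the pattern `\b` sits on, and whether the pattern is a raw string. The sketch below uses only the standard `re` module and the notebook's sample sentence; `\B` is included as the complement of `\b` for comparison, and the variable names are made up for illustration.

```python
import re

txt = "The rain in Spain"

# \b after the pattern: "ain" must be followed by a word boundary.
end_of_word = re.findall(r"ain\b", txt)

# \b before the pattern: "ain" must begin a word, which never happens here.
start_of_word = re.findall(r"\bain", txt)

# \B is the opposite of \b: "ain" must NOT begin a word.
inside_word = re.findall(r"\Bain", txt)

# Without the r prefix, Python turns "\b" into a backspace character before
# the re module ever sees it, so the pattern looks for backspace + "ain".
not_raw = re.findall("\bain", txt)

print(end_of_word, start_of_word, inside_word, not_raw)
```

Writing every pattern that contains a backslash as a raw string avoids the last case entirely.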
