Prog Found Final
Name: _____

For all the problems that follow, assume the following statements have already been executed:

import numpy as np
import pandas as pd

Ryan is working on a project to analyze salary by college type. The data is stored in a csv file under the same directory as the Python script, with the name salaries-by-college-type.csv. The first five rows of the dataset are displayed below.
   School Name                                   School Type  Starting Median Salary  Mid-Career Median Salary  Mid-Career 10th Percentile Salary  Mid-Career 25th Percentile Salary  Mid-Career 75th Percentile Salary  Mid-Career 90th Percentile Salary
0  Massachusetts Institute of Technology (MIT)   Engineering  $72,200.00              $126,000.00               $76,800.00                         $99,200.00                         $168,000.00                        $220,000.00
1  California Institute of Technology (CIT)      Engineering  $75,500.00              $123,000.00               NaN                                $104,000.00                        $161,000.00                        NaN
2  Harvey Mudd College                           Engineering  $71,800.00              $122,000.00               NaN                                $96,000.00                         $180,000.00                        NaN
3  Polytechnic University of New York, Brooklyn  Engineering  $62,400.00              $114,000.00               $66,800.00                         $94,300.00                         $143,000.00                        $190,000.00
4  Cooper Union                                  Engineering  $62,200.00              $114,000.00               NaN                                $80,200.00                         $142,000.00                        NaN

1. Which of the following statements correctly loads the dataset into Pandas as a DataFrame with the variable name df?
A. df = pd.read_csv('salaries-by-college-type.csv')
B. df = pd.load_csv('salaries-by-college-type.csv')
C. df = pd.read_csvfile('salaries-by-college-type.csv')
D. df = pd.load_csvfile('salaries-by-college-type.csv')

2. To get a list of the column names from the DataFrame obtained in Problem 1, which of the following statements should be executed?
A. df.index
B. df.indexes
C. df.column
D. df.columns

3. Which of the following statements should be used to find how many different school types the dataset has?
A. df['School Type'].unique()
B. df['School Type'].nunique()
C. df['School Type'].sum()
D. df['School Type'].total()
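
The operations named in Problems 1 to 3 can be tried on a tiny in-memory table. The following is only a sketch: the CSV text, the school names, and the variable names (csv_text, frame, and so on) are made up for illustration and are not the exam's dataset.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk.
csv_text = """School Name,School Type
Alpha Tech,Engineering
Beta College,Liberal Arts
Gamma Tech,Engineering
"""

# read_csv accepts a file path or, as here, a file-like object.
frame = pd.read_csv(io.StringIO(csv_text))

column_names = list(frame.columns)               # column labels, in order
distinct_types = frame["School Type"].unique()   # the distinct values themselves
n_types = frame["School Type"].nunique()         # how many distinct values exist

print(column_names)
print(distinct_types)
print(n_types)
```

Note that unique() returns the values while nunique() returns a single count.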

4. Which of the following statements should be used to find out the actual data type of each item in the Mid-Career Median Salary column?
A. type(df['Mid-Career Median Salary'])
B. type(df['Mid-Career Median Salary'].dtype)
C. type(df['Mid-Career Median Salary'].dtypes)
D. type(df['Mid-Career Median Salary'][0])

5. From Problem 4, the type of the data points in the Mid-Career Median Salary column is string. Which of the following statements should be used to remove the $ sign and convert the data points from string to floating point number?
A. df['Mid-Career Median Salary'].replace('[^.0-9]', '').astype(float)
B. df['Mid-Career Median Salary'].str.replace('[^.0-9]', '').astype(float)
C. df['Mid-Career Median Salary'].sub('[^.0-9]', '').astype(float)
D. df['Mid-Career Median Salary'].str.sub('[^.0-9]', '').astype(float)
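
A minimal sketch of the string-to-float conversion that Problem 5 describes, assuming a made-up Series named raw with values formatted like the dataset's salaries. The explicit regex=True is an addition not shown in the options: recent pandas versions treat the pattern as a literal string unless it is set.

```python
import pandas as pd

# Hypothetical salary strings in the same "$72,200.00" style as the dataset.
raw = pd.Series(["$72,200.00", "$126,000.00", float("nan")])

# Drop every character that is not a digit or a decimal point, then convert.
# Missing entries stay missing through both steps.
cleaned = raw.str.replace(r"[^.0-9]", "", regex=True).astype(float)

print(cleaned)
```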

6. In order to obtain the information shown below about the dataset, which of the following statements should be executed?

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 269 entries, 0 to 268
Data columns (total 8 columns):
 #   Column                             Non-Null Count  Dtype
---  ------                             --------------  -----
 0   School Name                        269 non-null    object
 1   School Type                        269 non-null    object
 2   Starting Median Salary             269 non-null    object
 3   Mid-Career Median Salary           269 non-null    object
 4   Mid-Career 10th Percentile Salary  231 non-null    object
 5   Mid-Career 25th Percentile Salary  269 non-null    object
 6   Mid-Career 75th Percentile Salary  269 non-null    object
 7   Mid-Career 90th Percentile Salary  231 non-null    object
dtypes: object(8)
memory usage: 16.9+ KB

A. df.dtype
B. df.dtypes
C. df.info
D. df.info()

7. Based on the screenshot shown in Problem 6, which of the following statements is incorrect?
A. The dataset has 269 rows and 8 columns
B. The information contained in the Dtype column most likely is not accurate
C. The Mid-Career 90th Percentile Salary column has 30 missing values
D. Two columns have missing values
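
The summary in Problem 6 relates row counts, non-null counts, and missing values: the number of missing entries in a column is the number of rows minus its non-null count. A small sketch on a made-up three-row frame (the names and values are hypothetical):

```python
import pandas as pd

# Hypothetical three-row frame with one missing salary.
frame = pd.DataFrame({
    "School Name": ["A", "B", "C"],
    "Starting Median Salary": ["$50,000.00", None, "$60,000.00"],
})

frame.info()                             # prints index range, non-null counts, dtypes
n_rows, n_cols = frame.shape             # (rows, columns)
missing_per_column = frame.isna().sum()  # missing entries in each column

print(missing_per_column)
```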

8. Which of the following statements should be used to find the average Mid-Career Median Salary for each college type, respectively?
A. df['School Type']['Mid-Career Median Salary'].mean()
B. df['Mid-Career Median Salary'].mean()
C. df.groupby['School Type']['Mid-Career Median Salary'].mean()
D. df.groupby('School Type')['Mid-Career Median Salary'].mean()
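
A per-group average of the kind Problem 8 asks about can be sketched as follows. The frame and its numbers are made up, and the salary column is assumed to be numeric already.

```python
import pandas as pd

# Hypothetical data with salaries already converted to floats.
frame = pd.DataFrame({
    "School Type": ["Engineering", "Engineering", "State", "State"],
    "Mid-Career Median Salary": [126000.0, 114000.0, 80000.0, 90000.0],
})

# groupby is a method call; selecting a column and calling mean()
# yields one average per group.
mean_by_type = frame.groupby("School Type")["Mid-Career Median Salary"].mean()

print(mean_by_type)
```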

9. Which of the following statements should be used to find out which university has the highest Mid-Career Median Salary?

[Question stem and DataFrame tables not recoverable from the scan; the surviving answer options are listed below.]
A. df.dropna(thresh = 1)
B. df.dropna(thresh = 2)
C. df.dropna(thresh = 3)
D. None of the above

15. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?
[Left and right DataFrames, each with columns 0 to 3, not recoverable from the scan.]
A. df.fillna(method = 'ffill')
B. df.fillna(method = 'bfill')
C. df.fillna(0)
D. df.fillna({1: 0.3, 3: 0})
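
The four filling strategies listed in Problem 15 behave differently on the same data. The sketch below uses a made-up frame, not the one from the exam. It uses ffill() and bfill() rather than the fillna(method=...) spelling in the options, since recent pandas releases deprecate that argument.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with gaps in both columns.
frame = pd.DataFrame([[1.0, np.nan], [np.nan, 4.0], [5.0, np.nan]])

forward = frame.ffill()                        # carry the last seen value downward
backward = frame.bfill()                       # pull the next seen value upward
constant = frame.fillna(0)                     # one value for every gap
per_column = frame.fillna({0: -1.0, 1: 0.5})   # a different value per column label

print(forward, backward, constant, per_column, sep="\n\n")
```

Forward filling leaves a leading gap untouched and backward filling leaves a trailing gap untouched, because there is no neighbouring value to copy.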

16. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?
[Left and right DataFrames not recoverable from the scan; surviving labels: date, item, realgdp, infl, unemp.]
A. df.unstack()
B. df.stack()
C. df.reindex()
D. df.set_index()

17. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?
[Tables only partly recoverable from the scan. Left: a DataFrame with a default integer index and the columns date, item, value, value2; its first row reads 1959-03-31 23:59:59.999999999, realgdp, 2710.349, 1.057270. Right: the same data with date and item as the index.]
A. df.set_index(['date', 'item'])
B. df.reindex(['date', 'item'])
C. df.reset_index(['date', 'item'])
D. None of the above
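
The index manipulations named in Problems 16 and 17 can be sketched on a small long-format table. The dates, item names, and values below are placeholders chosen for illustration, not the exam's data.

```python
import pandas as pd

# Hypothetical long-format table: one row per (date, item) pair.
frame = pd.DataFrame({
    "date": ["1959-03-31", "1959-03-31", "1959-06-30"],
    "item": ["realgdp", "infl", "realgdp"],
    "value": [1.0, 2.0, 3.0],
})

indexed = frame.set_index(["date", "item"])   # columns become a two-level index
restored = indexed.reset_index()              # index levels become columns again
wide = indexed["value"].unstack("item")       # one column per item, NaN where absent

print(indexed)
print(wide)
```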

[Table residue from a question whose text did not survive the scan: DataFrames with the columns A, B, C, D and cell labels such as A1, B2, C3, D4.]

25. Consider the following code and output for the Digits dataset's predictions:
[Code and classification report not recoverable from the scan. The report has the columns precision, recall, f1-score, and support, with one row per digit 0 to 9.]
Which of the following statements about the report is [illegible]?
A. The precision column shows the total number of correct predictions for a given digit divided by the total number of predictions for that digit. You can confirm the precision by looking at each column in the confusion matrix.
B. The f1-score column is the average of the precision and recall, and the support column is the number of samples with a given expected value. For example, 50 samples were labeled as 4s, and 38 samples were labeled as 5s.
C. The recall column is the total number of correct predictions for a given digit divided by the total number of samples that should have been predicted as that digit. You can confirm the recall by looking at each row in the confusion matrix.
D. Only statements A and B are correct.
A. The machine-learning parameters which the estimator calculates as it learns from the data are called hyperparameters; in the k-nearest neighbors algorithm, k is a hyperparameter.
B. There are two parameter types in machine learning: those the estimator calculates as it learns from the data you provide and those you specify in advance when you create the scikit-learn estimator object that represents the model.
C. In machine learning, a model implements a machine-learning algorithm. In scikit-learn, models are called estimators.
D. For simplicity, we use scikit-learn's default hyperparameter values. In real-world machine-learning studies, you'll want to experiment with different values of k to produce the best possible models for your studies. This process is called hyperparameter tuning.
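
[Sample row of the DataFrame data, only partly recoverable from the scan. Surviving column labels: timestamp, gender, age, occupation, zip. Surviving values: 1193, 5, 2000-12-31 22:12:40, F, Under 18.]
In order to display users' age distribution like the one shown below, which statement should be used?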
25-34 395556
35-44 199003
18-24 183536
45-49 83633
50-55 72490
56+ 38780
Under 18 27211
Name: age, dtype: int64
A. data['age'].describe()
B. data['age'].count()
C. data['age'].size()
D. data['age'].value_counts()
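
A frequency table like the age distribution shown above lists each distinct value with its number of occurrences, most frequent first. A minimal sketch on a made-up Series (the six entries are hypothetical):

```python
import pandas as pd

# Hypothetical age-group labels, one per user record.
ages = pd.Series(
    ["25-34", "18-24", "25-34", "Under 18", "25-34", "18-24"], name="age"
)

# value_counts tallies each distinct label and sorts by descending count.
counts = ages.value_counts()

print(counts)
```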

28. In order to obtain the figure below for the rating's distribution, which statement should be used? X-axis represents the rating and y-axis represents the total number of ratings.
[Figure: distribution of ratings. X-axis: rating; y-axis: total number of ratings.]
A. data['rating'].plot.box()
B. data['rating'].value_counts().plot.box()
C. data['rating'].value_counts().plot.bar()
D. data['rating'].plot.bar()

29. To get average movie ratings for each film grouped by gender like the one shown below, which statement should be used?

gender                            F         M
title
$1,000,000 Duck (1971)     3.375000  2.761905
'Night Mother (1986)       3.388889  3.352941
'Til There Was You (1997)  2.675676  2.733333
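
A table with one row per title and one column per gender, as in the output above, is what a pivot table produces. The following sketch uses pivot_table with a mean aggregation on made-up ratings; the film names and scores are hypothetical, and other approaches such as groupby followed by unstack would give a similar layout.

```python
import pandas as pd

# Hypothetical ratings: one row per individual rating.
data = pd.DataFrame({
    "title": ["Film A", "Film A", "Film A", "Film B", "Film B"],
    "gender": ["F", "M", "M", "F", "F"],
    "rating": [4, 3, 5, 2, 4],
})

# Rows are titles, columns are genders, cells hold the mean rating.
mean_ratings = data.pivot_table(
    values="rating", index="title", columns="gender", aggfunc="mean"
)

print(mean_ratings)
```

A title with no ratings from one gender gets NaN in that cell.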