Prog Found Final

The document appears to be a programming exercise related to data analysis using Python and pandas, focusing on a dataset of salaries by college type. It includes various problems and statements that require the reader to manipulate and analyze the data, such as loading CSV files, calculating averages, and filtering data. The document contains numerous code snippets and questions aimed at testing understanding of data handling in Python.

Uploaded by

luffyzoroonep365


CS 5000 Final

Name: ____________

For all the problems in the following, assume that the following statements have already been executed.

import numpy as np
import pandas as pd

Ryan is working on a project to analyze salary by college type. The data is stored in a csv file under the same directory as the Python script, with the name salaries-by-college-type.csv. The first five rows of the dataset are displayed below.

   School Name                                   School Type  Starting Median Salary  Mid-Career Median Salary  Mid-Career 10th Percentile Salary  Mid-Career 25th Percentile Salary  Mid-Career 75th Percentile Salary  Mid-Career 90th Percentile Salary
0  Massachusetts Institute of Technology (MIT)   Engineering  $72,200.00              $126,000.00               $76,800.00                         $99,200.00                         $168,000.00                        $220,000.00
1  California Institute of Technology (CIT)      Engineering  $75,500.00              $123,000.00               NaN                                $104,000.00                        $161,000.00                        NaN
2  Harvey Mudd College                           Engineering  $71,800.00              $122,000.00               NaN                                $96,000.00                         $180,000.00                        NaN
3  Polytechnic University of New York, Brooklyn  Engineering  $62,400.00              $114,000.00               $66,800.00                         $94,300.00                         $143,000.00                        $190,000.00
4  Cooper Union                                  Engineering  $62,200.00              $114,000.00               NaN                                $80,200.00                         $142,000.00                        NaN

1. Which of the following statements is correct to load the dataset into Pandas as a DataFrame with a variable name df?
A. df = pd.read_csv('salaries-by-college-type.csv')
B. df = pd.load_csv('salaries-by-college-type.csv')
C. df = pd.read_csvfile('salaries-by-college-type.csv')
D. df = pd.load_csvfile('salaries-by-college-type.csv')

2. To get a list of the column names from the DataFrame obtained in Problem 1, which of the following statements should be executed?
A. df.index
B. df.indexes
C. df.column
D. df.columns

3. Which of the following statements should be used to find how many different school types the dataset has?
A. df['School Type'].unique()
B. df['School Type'].nunique()
C. df['School Type'].sum()
D. df['School Type'].total()
4. Which of the following statements should be used to find out the actual data type of each item in the Mid-Career Median Salary column?
A. type(df['Mid-Career Median Salary'])
B. type(df['Mid-Career Median Salary'].dtype)
C. type(df['Mid-Career Median Salary'].dtypes)
D. type(df['Mid-Career Median Salary'][0])
5. From Problem 4, the type of the data points in the Mid-Career Median Salary column is string. Which of the following statements should be used to remove the $ sign and convert the data points from string to floating point number?
A. df['Mid-Career Median Salary'].replace('[^.0-9]', '').astype(float)
B. df['Mid-Career Median Salary'].str.replace('[^.0-9]', '').astype(float)
C. df['Mid-Career Median Salary'].sub('[^.0-9]', '').astype(float)
D. df['Mid-Career Median Salary'].str.sub('[^.0-9]', '').astype(float)
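
The string-to-float conversion described in Problem 5 can be tried on a few made-up values. This is only a minimal sketch: the sample strings are hypothetical, and `regex=True` is passed explicitly because recent pandas versions treat the pattern as a literal string unless told otherwise.

```python
import pandas as pd

# Hypothetical sample values in the same "$72,200.00" style as the dataset.
salaries = pd.Series(["$72,200.00", "$126,000.00", "$62,400.00"])

# Remove every character that is not a digit or a decimal point,
# then convert the remaining text to floating point numbers.
cleaned = salaries.str.replace(r"[^.0-9]", "", regex=True).astype(float)

print(cleaned.tolist())
```

The `.str` accessor applies the replacement to each string in the column; calling `replace` directly on the Series matches whole cell values by default rather than substrings.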
6. In order to obtain the information shown below about the dataset, which of the following statements should be executed?

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 269 entries, 0 to 268
Data columns (total 8 columns):
 #   Column                             Non-Null Count  Dtype
---  ------                             --------------  -----
 0   School Name                        269 non-null    object
 1   School Type                        269 non-null    object
 2   Starting Median Salary             269 non-null    object
 3   Mid-Career Median Salary           269 non-null    object
 4   Mid-Career 10th Percentile Salary  231 non-null    object
 5   Mid-Career 25th Percentile Salary  269 non-null    object
 6   Mid-Career 75th Percentile Salary  269 non-null    object
 7   Mid-Career 90th Percentile Salary  231 non-null    object
dtypes: object(8)
memory usage: 16.9+ KB

A. df.dtype
B. df.dtypes
C. df.info
D. df.info()

7. Based on the screenshot shown in Problem 6, which of the following statements is incorrect?
A. The dataset has 269 rows and 8 columns
B. The information contained in the Dtype column most likely is not accurate
C. The Mid-Career 90th Percentile Salary column has 30 missing values
D. Two columns have missing values
8. Which of the following statements should be used to find the average Mid-Career Median Salary for each college type, respectively?
A. df['School Type']['Mid-Career Median Salary'].mean()
B. df['Mid-Career Median Salary'].mean()
C. df.groupby['School Type']['Mid-Career Median Salary'].mean()
D. df.groupby('School Type')['Mid-Career Median Salary'].mean()
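
As a small illustration of per-group averaging of the kind Problem 8 asks about, the sketch below uses a toy DataFrame. The school types and salary figures are invented for the example and are not taken from the dataset.

```python
import pandas as pd

# Toy data; the values are made up for illustration only.
df_demo = pd.DataFrame({
    "School Type": ["Engineering", "Engineering", "Liberal Arts", "Liberal Arts"],
    "Mid-Career Median Salary": [126000.0, 114000.0, 90000.0, 100000.0],
})

# Split the rows by school type, then average the salary within each group.
avg_by_type = df_demo.groupby("School Type")["Mid-Career Median Salary"].mean()

print(avg_by_type)
```

The result is a Series indexed by school type, with one mean per group.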
9. Which of the following statements should be used to find out which university has the highest Mid-Career Median Salary?
A. df.sort_values(by = 'Mid-Career Median Salary', ascending = False)
B. df.sort_values(by = 'Mid-Career Median Salary', ascending = False).iloc[0]
C. df.sort_values(by = 'Mid-Career Median Salary', ascending = True)
D. df.sort_values(by = 'Mid-Career Median Salary', ascending = True).iloc[0]
10. Which of the following statements should be used to find out the total number of universities whose Mid-Career Median Salary is above $100,000?
A. df[df['Mid-Career Median Salary'] > 100000]['School Name'].count()
B. df[df['Mid-Career Median Salary'] > 100000]['School Name'].sum()
C. df[df['Mid-Career Median Salary'] > 100000]['School Name'].size()
D. df[df['Mid-Career Median Salary'] > 100000].sum()
11. Which of the following statements should be used to find out the total number of university names which contain the word 'State'?
A. df['School Name'].str.contains('State').count()
B. df['School Name'].str.contains('State').sum()
C. df['School Name'].contains('State').count()
D. df['School Name'].contains('State').sum()
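
The difference between counting and summing a boolean result, which Problems 10 and 11 rely on, can be seen on a short hypothetical list of names. This is a sketch with invented values, not the exam's data.

```python
import pandas as pd

# Hypothetical school names, used only to show the behaviour.
names = pd.Series(["Ohio State University", "Cooper Union", "Utah State University"])

# str.contains returns one True/False value per name.
mask = names.str.contains("State")

# sum() adds up the True values (True counts as 1), while
# count() reports how many non-missing entries exist, True or False.
n_state = int(mask.sum())
n_entries = int(mask.count())

print(n_state, n_entries)
```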

12. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df):

subject      Bob         Guido       Sue
type         HR    Temp  HR    Temp  HR    Temp
year  visit
2013  1      47.0  39.1  29.0  36.3  63.0  34.8
      2      49.0  38.4  51.0  37.0  27.0  39.1
      3      34.0  36.9  40.0  37.4  33.0  36.3
2014  1      21.0  39.0  28.0  37.0  56.0  38.3
      2      30.0  36.7  36.0  37.7  38.0  36.7
      3      35.0  37.1  41.0  36.6  38.0  37.8
2015  1      37.0  37.5  46.0  37.8  33.0  37.8
      2      44.0  38.2  54.0  37.7  31.0  36.8
      3      46.0  37.8  40.0  37.2  35.0  36.1
2016  1      34.0  35.8  29.0  35.1  31.0  35.1
      2      50.0  37.5  39.0  36.3  47.0  35.1
      3      43.0  38.2  25.0  37.1  41.0  36.4

Right (result):

subject      Bob   Guido  Sue
type         HR    HR     HR
year  visit
2013  1      47.0  29.0   63.0
2014  1      21.0  28.0   56.0
2015  1      37.0  46.0   33.0
2016  1      34.0  29.0   31.0

A. df.loc[(slice(None), 1), (slice(None), 'HR')]
B. df.loc[(:, 1), (:, 'HR')]
C. df.loc[[:, 1], [:, 'HR']]
D. df.loc[1, 'HR']
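
Selecting one level value on both axes of a hierarchically indexed DataFrame, as in Problem 12, can be sketched on a smaller frame. The frame below is an assumption built for illustration (two years, two visits, two subjects, filled with consecutive numbers); only the level names mirror the exam's table.

```python
import numpy as np
import pandas as pd

# A small stand-in frame with two-level row and column indexes.
index = pd.MultiIndex.from_product([[2013, 2014], [1, 2]], names=["year", "visit"])
columns = pd.MultiIndex.from_product([["Bob", "Sue"], ["HR", "Temp"]],
                                     names=["subject", "type"])
health = pd.DataFrame(np.arange(16.0).reshape(4, 4), index=index, columns=columns)

# slice(None) stands for "every value of this level"; a bare ':' is not
# valid Python inside a tuple, which is why slice(None) is needed here.
first_visit_hr = health.loc[(slice(None), 1), (slice(None), "HR")]

# pd.IndexSlice offers the same selection with the familiar ':' notation.
idx = pd.IndexSlice
same_selection = health.loc[idx[:, 1], idx[:, "HR"]]

print(first_visit_hr)
```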

14. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df):

     0    1  2    3
0  1.0  NaN  2  NaN
1  2.0  3.0  5  NaN
2  NaN  4.0  6  NaN

Right (result):

     0    1  2    3
1  2.0  3.0  5  NaN

A. df.dropna(thresh = 1)
B. df.dropna(thresh = 2)
C. df.dropna(thresh = 3)
D. None of the above
15. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df):

     0    1  2    3
0  1.0  NaN  2  NaN
1  2.0  3.0  5  NaN
2  NaN  4.0  6  NaN

Right (result):

     0    1  2    3
0  1.0  0.3  2  0.0
1  2.0  3.0  5  0.0
2  NaN  4.0  6  0.0

A. df.fillna(method = 'ffill')
B. df.fillna(method = 'bfill')
C. df.fillna(0)
D. df.fillna({1: 0.3, 3: 0})
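
The two missing-data operations used in Problems 14 and 15 can be checked on the small frame shown in those problems. The sketch below rebuilds that frame by hand; the variable names are chosen here for illustration.

```python
import numpy as np
import pandas as pd

# The 3x4 frame from Problems 14 and 15, rebuilt by hand.
df_demo = pd.DataFrame([[1, np.nan, 2, np.nan],
                        [2, 3, 5, np.nan],
                        [np.nan, 4, 6, np.nan]])

# thresh=3 keeps only the rows that have at least three non-missing values.
kept = df_demo.dropna(thresh=3)

# A dict fills each listed column with its own value and leaves
# the other columns untouched.
filled = df_demo.fillna({1: 0.3, 3: 0})

print(kept)
print(filled)
```

Because column 0 is not listed in the dict, its missing value in row 2 is left as NaN.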
16. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df):

item                            realgdp  infl  unemp
date
1959-03-31 23:59:59.999999999  2710.349  0.00    5.8
1959-06-30 23:59:59.999999999  2778.801  2.34    5.1
1959-09-30 23:59:59.999999999  2775.488  2.74    5.3
1959-12-31 23:59:59.999999999  2785.204  0.27    5.6
1960-03-31 23:59:59.999999999  2847.699  2.31    5.2

Right (result):

date                           item
1959-03-31 23:59:59.999999999  realgdp    2710.349
                               infl          0.000
                               unemp         5.800
1959-06-30 23:59:59.999999999  realgdp    2778.801
                               infl          2.340

A. df.unstack()
B. df.stack()
C. df.reindex()
D. df.set_index()
17. For the DataFrame df shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df):

   date                           item        value     value2
0  1959-03-31 23:59:59.999999999  realgdp  2710.349   1.057270
1  1959-03-31 23:59:59.999999999  infl        0.000  -0.044329
2  1959-03-31 23:59:59.999999999  unemp       5.800  -0.868826
3  1959-06-30 23:59:59.999999999  realgdp  2778.801   1.090907
4  1959-06-30 23:59:59.999999999  infl        2.340  -0.009227

Right (result):

                                           value     value2
date                           item
1959-03-31 23:59:59.999999999  realgdp  2710.349   1.057270
                               infl        0.000  -0.044329
                               unemp       5.800  -0.868826
1959-06-30 23:59:59.999999999  realgdp  2778.801   1.090907
                               infl        2.340  -0.009227

A. df.set_index(['date', 'item'])
B. df.reindex(['date', 'item'])
C. df.reset_index(['date', 'item'])
D. None of the above
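
The reshaping steps behind Problems 16 and 17 can be followed on a two-row excerpt of the data. This is a sketch under simplifying assumptions: the dates are kept as plain strings, and the column name `value` for the stacked numbers is chosen here for illustration.

```python
import pandas as pd

# Two rows of the wide table from Problem 16, with dates kept as strings.
wide = pd.DataFrame(
    {"realgdp": [2710.349, 2778.801], "infl": [0.00, 2.34], "unemp": [5.8, 5.1]},
    index=pd.Index(["1959-03-31", "1959-06-30"], name="date"),
)
wide.columns.name = "item"

# stack() moves the column labels into an inner row level,
# giving one value per (date, item) pair.
stacked = wide.stack()

# reset_index() turns both levels back into ordinary columns, and
# set_index() with a list of names rebuilds the two-level row index.
flat = stacked.reset_index(name="value")
indexed = flat.set_index(["date", "item"])

print(stacked)
print(indexed)
```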

18. Unsupervised machine learning uses _ _ _ algorithms.

I\ r,•r" ,,11'"
19. For the DataFrames df5 and df6 shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df5):

    A   B   C
1  A1  B1  C1
2  A2  B2  C2

Left (df6):

    B   C   D
3  B3  C3  D3
4  B4  C4  D4

Right (result):

    B   C
1  B1  C1
2  B2  C2
3  B3  C3
4  B4  C4

A. pd.concat([df5, df6], join = 'inner')
B. pd.concat([df5, df6], join = 'outer')
C. pd.concat([df5, df6])
D. df5.append(df6)
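
How the join argument changes the columns kept by a concatenation, as Problem 19 tests, can be seen by rebuilding the two small frames. The sketch below assumes the frames contain exactly the labels shown in the problem.

```python
import pandas as pd

# The two frames from Problem 19, rebuilt by hand.
df5 = pd.DataFrame({"A": ["A1", "A2"], "B": ["B1", "B2"], "C": ["C1", "C2"]},
                   index=[1, 2])
df6 = pd.DataFrame({"B": ["B3", "B4"], "C": ["C3", "C4"], "D": ["D3", "D4"]},
                   index=[3, 4])

# join='inner' keeps only the columns that both frames share.
inner = pd.concat([df5, df6], join="inner")

# The default (join='outer') keeps every column and fills the gaps with NaN.
outer = pd.concat([df5, df6])

print(inner)
print(outer)
```

Note that DataFrame.append was removed in pandas 2.0, so on current versions only the pd.concat forms will run.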
20. For the DataFrames df1 and df3 shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df1):

   employee  group
0  Bob       Accounting
1  Jake      Engineering
2  Lisa      Engineering
3  Sue       HR

Left (df3):

   name  salary
0  Bob    70000
1  Jake   80000
2  Lisa  120000
3  Sue    90000

Right (result):

   employee  group        salary
0  Bob       Accounting    70000
1  Jake      Engineering   80000
2  Lisa      Engineering  120000
3  Sue       HR            90000

A. pd.merge(df1, df3).drop('name', axis = 0)
B. pd.merge(df1, df3).drop('name', axis = 1)
C. pd.merge(df1, df3, left_on = "employee", right_on = "name").drop('name', axis = 1)
D. pd.merge(df1, df3, left_on = "employee", right_on = "name").drop('name', axis = 0)
21. For the DataFrames df1a and df3 shown in the left below, which statement from the following should be used to obtain the result shown in the right?

Left (df1a):

          group
employee
Bob       Accounting
Jake      Engineering
Lisa      Engineering
Sue       HR

Left (df3):

   name  salary
0  Bob    70000
1  Jake   80000
2  Lisa  120000
3  Sue    90000

Right (result):

   group        name  salary
0  Accounting   Bob    70000
1  Engineering  Jake   80000
2  Engineering  Lisa  120000
3  HR           Sue    90000

A. pd.merge(df1a, df3, left_index = 'employee', right_on = 'name')
B. pd.merge(df1a, df3, left_index = True, right_on = 'name')
C. pd.merge(df1a, df3, left_on = 'employee', right_on = 'name')
D. pd.merge(df1a, df3, left_on = True, right_on = 'name')
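
The two merge patterns in Problems 20 and 21, joining on differently named key columns and joining an index against a column, can be reproduced with the employee tables. The sketch below rebuilds those tables by hand; the result variable names are chosen here for illustration.

```python
import pandas as pd

# The employee tables from Problems 20 and 21, rebuilt by hand.
df1 = pd.DataFrame({"employee": ["Bob", "Jake", "Lisa", "Sue"],
                    "group": ["Accounting", "Engineering", "Engineering", "HR"]})
df3 = pd.DataFrame({"name": ["Bob", "Jake", "Lisa", "Sue"],
                    "salary": [70000, 80000, 120000, 90000]})

# The key columns have different names, so each side is named explicitly.
# The redundant 'name' column is then dropped; axis=1 means "drop a column".
merged = pd.merge(df1, df3, left_on="employee", right_on="name").drop("name", axis=1)

# When the key lives in the index of the left frame, left_index=True
# matches that index against the 'name' column on the right.
df1a = df1.set_index("employee")
by_index = pd.merge(df1a, df3, left_index=True, right_on="name")

print(merged)
print(by_index)
```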
22. Which of the following statements is false?
A. The k-nearest neighbors algorithm attempts to predict a test sample's class by looking at the k training samples that are nearest (in distance) to the test sample.
B. Always pick an even value of k for the k-nearest neighbors algorithm.
C. Scikit-learn supports many classification algorithms, including the simplest, k-nearest neighbors (k-NN).
D. In the k-nearest neighbors algorithm, the class with the most "votes" wins.
23. Consider the confusion matrix for the Digits dataset's predictions:

array([[45,  0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0, 45,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0, 54,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0, 42,  0,  1,  0,  1,  0,  0],
       [ 0,  0,  0,  0, 49,  0,  0,  1,  0,  0],
       [ 0,  0,  0,  0,  0, 38,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 42,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0, 45,  0,  0],
       [ 0,  1,  1,  2,  0,  0,  0,  0, 39,  1],
       [ 0,  0,  0,  0,  1,  0,  0,  0,  1, 41]])

Which of the following statements is false?
A. The columns within a row specify how many of the test samples were classified incorrectly into each distinct class 0-9.
B. The nonzero values that are not on the principal diagonal indicate incorrect predictions (that is, misses).
C. Each row represents one distinct class, that is, one of the digits 0-9.
D. The correct predictions are shown on the diagonal from top-left to bottom-right; this is called the principal diagonal.

24. The sklearn.metrics module's classification_report function produces a table of classification metrics based on the expected and predicted values for the Digits dataset's predictions, as in the figure below.

    from sklearn.metrics import classification_report
    names = [str(digit) for digit in digits.target_names]
    print(classification_report(expected, predicted,
          target_names=names))

                  precision    recall  f1-score   support

               0       1.00      1.00      1.00        45
               1       0.98      1.00      0.99        45
               2       0.98      1.00      0.99        54
               3       0.95      0.95      0.95        44
               4       0.98      0.98      0.98        50
               5       0.97      1.00      0.99        38
               6       1.00      1.00      1.00        42
               7       0.96      1.00      0.98        45
               8       0.97      0.89      0.93        44
               9       0.98      0.95      0.96        43
Which of the following statements about the report is false?
A. The precision column shows the total number of correct predictions for a given digit divided by the total number of predictions for that digit. You can confirm the precision by looking at each column in the confusion matrix.
B. The f1-score column is the average of the precision and the recall, and the support column is the number of samples with a given expected value. For example, 50 samples were labeled as 4s, and 38 samples were labeled as 5s.
C. The recall column is the total number of correct predictions for a given digit divided by the total number of samples that should have been predicted as that digit. You can confirm the recall by looking at each row in the confusion matrix.
D. Only statements A and B are correct.
25. Consider the following code and output for the Digits dataset's predictions:

    In [57]: for k in range(1, 20, 2):
        ...:     kfold = KFold(n_splits=10, random_state=11, shuffle=True)
        ...:     knn = KNeighborsClassifier(n_neighbors=k)
        ...:     scores = cross_val_score(estimator=knn,
        ...:         X=digits.data, y=digits.target, cv=kfold)
        ...:     print(f'k={k:<2}; mean accuracy={scores.mean():.2%}; ' +
        ...:           f'standard deviation={scores.std():.2%}')

    k=1 ; mean accuracy=98.83%; standard deviation=0.58%
    k=3 ; mean accuracy=98.78%; standard deviation=0.78%
    k=5 ; mean accuracy=98.72%; standard deviation=0.75%
    k=7 ; mean accuracy=98.44%; standard deviation=0.96%
    k=9 ; mean accuracy=98.33%; standard deviation=0.80%
    k=11; mean accuracy=98.39%; standard deviation=0.80%
    k=13; mean accuracy=97.89%; standard deviation=0.89%
    k=15; mean accuracy=97.89%; standard deviation=1.02%
    k=17; mean accuracy=97.50%; standard deviation=1.00%
    k=19; mean accuracy=97.66%; standard deviation=0.96%

Which of the following statements is false?

A. The k value 7 in kNN produces the most accurate predictions for the Digits dataset.
B. The loop creates KNeighborsClassifiers with odd k values from 1 through 19 and performs k-fold cross-validation on each.
C. The accuracy tends to decrease for higher k values.
D. Compute time grows with k, because k-NN needs to perform many more calculations to find the nearest neighbors.

26. Which of the following statements is false?

A. The machine-learning parameters which the estimator calculates as it learns from the data are called hyperparameters; in the k-nearest neighbors algorithm, k is a hyperparameter.
B. There are two parameter types in machine learning: those the estimator calculates as it learns from the data you provide and those you specify in advance when you create the scikit-learn estimator object that represents the model.
C. In machine learning, a model implements a machine-learning algorithm. In scikit-learn, models are called estimators.
D. For simplicity, we use scikit-learn's default hyperparameter values. In real-world machine-learning studies, you'll want to experiment with different values of k to produce the best possible models for your studies; this process is called hyperparameter tuning.

27. Anusha is working on a project to analyze movie ratings. The first five rows of the dataset are displayed below. Assume the DataFrame is referred to by the variable data.

   user_id  movie_id  rating  timestamp            gender  age       occupation            zip    title                                   genres
0  1        1193      5       2000-12-31 22:12:40  F       Under 18  K-12 student          48067  One Flew Over the Cuckoo's Nest (1975)  Drama
1  2        1193      5       2000-12-31 21:33:33  M       56+       self-employed         70072  One Flew Over the Cuckoo's Nest (1975)  Drama
2  12       1193      4       2000-12-30 23:49:39  M       25-34     programmer            32793  One Flew Over the Cuckoo's Nest (1975)  Drama
3  15       1193      4       2000-12-30 18:01:19  M       25-34     executive/managerial  22903  One Flew Over the Cuckoo's Nest (1975)  Drama
4  17       1193      5       2000-12-30 06:41:11  M       50-55     academic/educator     95350  One Flew Over the Cuckoo's Nest (1975)  Drama

In order to display users' age distribution like the one shown below, which statement should be used?

25-34       395556
35-44       199003
18-24       183536
45-49        83633
50-55        72490
56+          38780
Under 18     27211
Name: age, dtype: int64

A. data['age'].describe()
B. data['age'].count()
C. data['age'].size()
D. data['age'].value_counts()
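
A frequency table like the age distribution in Problem 27 can be produced on a handful of made-up labels. The values below are hypothetical and only show the shape of the output.

```python
import pandas as pd

# Hypothetical age-group labels, one per rating.
ages = pd.Series(["25-34", "25-34", "18-24", "56+", "25-34", "18-24"], name="age")

# value_counts() tallies each distinct label and sorts the result
# from the most frequent label to the least frequent.
counts = ages.value_counts()

print(counts)
```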
28. In order to obtain the figure below for the rating's distribution, which statement should be used? X-axis represents the rating and y-axis represents the total number of ratings.

[Figure: bar chart of the total number of ratings for each rating value]

A. data['rating'].plot.box()
B. data['rating'].value_counts().plot.box()
C. data['rating'].value_counts().plot.bar()
D. data['rating'].plot.bar()
29. To get average movie ratings for each film grouped by gender like the one shown below, which statement should be used?

gender                                F         M
title
$1,000,000 Duck (1971)         3.375000  2.761905
'Night Mother (1986)           3.388889  3.352941
'Til There Was You (1997)      2.675676  2.733333
'burbs, The (1989)             2.793478  2.962085
...And Justice for All (1979)  3.828571  3.689024

A. data.pivot_table(index = 'title', columns = 'gender', aggfunc = 'mean')
B. data.pivot_table('rating', index = 'title', columns = 'gender', aggfunc = 'mean')
C. data.pivot_table('rating', index = 'gender', columns = 'title', aggfunc = 'mean')
D. data.pivot_table(index = 'gender', columns = 'title', aggfunc = 'mean')
30. To get the top 6 movies by age 18-24 users like the one shown below, which statement should you use?

age                                                                 18-24     25-34     35-44     45-49  50-55       56+  Under 18
title
I Am Cuba (Soy Cuba/Ya Kuba) (1964)                                   5.0  4.666667       NaN  5.000000    NaN       NaN       NaN
Saragossa Manuscript, The (Rekopis znaleziony w Saragossie) (1965)    5.0  2.800000  2.666667  4.500000    NaN       NaN       NaN
Arguing the World (1996)                                              5.0  4.200000  4.000000       NaN    2.5  4.000000       NaN
Under the Rainbow (1981)                                              5.0  2.314286  2.181818  2.750000    3.0  1.666667       2.0
City, The (1998)                                                      5.0  3.200000  3.000000  3.500000    NaN  4.000000       NaN
Twice Upon a Yesterday (1998)                                         5.0  3.500000  3.666667  3.333333    NaN  1.000000       NaN

A. data.pivot_table('rating', index = 'title', columns = 'age', aggfunc = 'mean').sort_values(by = '18-24', ascending = False)[:6]
B. data.pivot_table('rating', index = 'title', columns = 'age', aggfunc = 'mean').sort_values(by = '18-24')[:6]
C. data.pivot_table('rating', index = 'age', columns = 'title', aggfunc = 'mean').sort_values(by = '18-24')[:6]
D. data.pivot_table('rating', index = 'age', columns = 'title', aggfunc = 'mean').sort_values(by = '18-24', ascending = False)[:6]
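
The pivot-then-sort pattern used in Problems 29 and 30 can be tried on a tiny ratings table. The movie titles and ratings below are invented for illustration; only the column names follow the dataset described in Problem 27.

```python
import pandas as pd

# Made-up ratings; the titles are placeholders, not real dataset rows.
ratings = pd.DataFrame({
    "title": ["Movie A", "Movie A", "Movie B", "Movie B", "Movie C"],
    "gender": ["F", "M", "F", "M", "F"],
    "rating": [5, 3, 2, 4, 4],
})

# One row per title, one column per gender, each cell the mean rating.
# Combinations with no ratings (Movie C, M) come out as NaN.
by_gender = ratings.pivot_table("rating", index="title", columns="gender",
                                aggfunc="mean")

# Sort by one of the new columns, highest first, and keep the top rows.
top_f = by_gender.sort_values(by="F", ascending=False)[:2]

print(by_gender)
print(top_f)
```

The sort key has to be one of the pivoted column labels, which is why the choice of index and columns in the pivot matters for the sort that follows.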
