Python For Data Science

PYTHON FOR DATA SCIENCE NPTEL

Uploaded by

gireesh

Python for Data Science - Unit 3 - Week 0 (https://onlinecourses.nptel.ac.in/noc24_cs68/unit?unit=16&assessment=141)

Week 0: Assignment 0

Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Statistics and Probability is the title of a book. If each letter was carved into a block and dropped into a bag, what are the chances a person would draw either the letter A or I from the bag? (1 point)

7 / 24
3 / 24
1 / 6
1 / 4

Yes, the answer is correct.
Score: 1
Accepted Answers:
7 / 24
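
The accepted answer can be checked by counting letters. The following is a minimal sketch, assuming spaces are ignored and upper and lower case are treated alike; the variable names are illustrative only.

```python
from collections import Counter
from fractions import Fraction

title = "Statistics and Probability"

# Keep only the letters, ignoring spaces and case.
letters = [ch.lower() for ch in title if ch.isalpha()]
counts = Counter(letters)

# P(A or I) = (number of A's + number of I's) / total number of letters
favourable = counts["a"] + counts["i"]
probability = Fraction(favourable, len(letters))

print(len(letters), counts["a"], counts["i"], probability)
```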

2) A manufacturing company is set up in two different locations. The first location has 663 employees with an average monthly salary of $13454, and the second location has 504 employees with an average monthly salary of $17591. Find the combined arithmetic mean of the monthly salary. (1 point)

$15804.33
$15522.5
$15240.67
None of these

Yes, the answer is correct.
Score: 1
Accepted Answers:
$15240.67
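
The combined mean weights each location's average by its head count. A short sketch of that calculation, using only the figures from the question (the variable names are illustrative):

```python
# Group sizes and group means taken from the question.
n1, mean1 = 663, 13454
n2, mean2 = 504, 17591

# Combined mean = total salary paid / total number of employees.
combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)

print(round(combined_mean, 2))
```

Simply averaging the two means would ignore the different group sizes, which is why option $15522.5 is not accepted.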

3) Given 2 samples, Sample 1 = [13.3, 2.4, 10, 13.3, 11] and Sample 2 = [8.5, 7.1, 12.6, 11.5, 10.3], find the sample which has a relatively greater spread of values from the mean. (1 point)

Sample 1
Sample 2
Both the samples are equally spread
None of these

Yes, the answer is correct.
Score: 1
Accepted Answers:
Sample 1
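
One way to compare the two samples is sketched below. It assumes that "spread" is measured by the sample standard deviation; the question does not name a specific measure, so this choice is an assumption. Both samples have a mean of 10, so comparing standard deviations and comparing coefficients of variation give the same ranking here.

```python
import statistics

sample_1 = [13.3, 2.4, 10, 13.3, 11]
sample_2 = [8.5, 7.1, 12.6, 11.5, 10.3]

for name, sample in [("Sample 1", sample_1), ("Sample 2", sample_2)]:
    mean = statistics.mean(sample)
    spread = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)
    print(name, round(mean, 2), round(spread, 2))

# Pick the sample with the larger standard deviation.
wider = "Sample 1" if statistics.stdev(sample_1) > statistics.stdev(sample_2) else "Sample 2"
print(wider)
```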

4) Given below is tabular data on a test conducted recently to detect a new mutant of the coronavirus. (1 point)

Find the number of people who have not actually contracted the virus yet have tested positive.

138
227
284
173

Yes, the answer is correct.


Score: 1
Accepted Answers:
173

5) Given a pie chart that indicates the expenditure of a manufacturing organization towards various activities, what is the ratio of expenditure for the R & D department to the Marketing department? (1 point)

1 : 1.54
1 : 0.65
1 : 0.44
None of these

Yes, the answer is correct.


Score: 1
Accepted Answers:
1 : 0.44

6) Ben is the customer relations manager at a hotel. Recently, Ben has been receiving customer feedback saying that the customers had to wait too long to be served by a customer service representative. Ben decides to note down the customers' waiting time in minutes. What kind of graph would be appropriate to check the frequency distribution of the customers' waiting time? (1 point)

Line plot
Bar plot
Histogram
Scatter plot

Yes, the answer is correct.


Score: 1
Accepted Answers:
Histogram

7) 3 natural numbers are chosen at random. What is the probability that their product yields an odd number? (1 point)

1/8
1/6
2/3
1/2

Yes, the answer is correct.


Score: 1
Accepted Answers:
1/8
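
A product is odd only when every factor is odd. The sketch below enumerates the parity combinations, under the assumption that each chosen number is equally likely to be odd or even; the names are illustrative.

```python
from fractions import Fraction
from itertools import product

# Represent each number only by its parity: 1 = odd, 0 = even.
# Assumption: odd and even are equally likely for each chosen number.
outcomes = list(product([0, 1], repeat=3))

# The product is odd only when all three numbers are odd.
odd_products = [combo for combo in outcomes if all(combo)]

probability = Fraction(len(odd_products), len(outcomes))
print(probability)
```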

8) The mean of the first n natural numbers is (1 point)

n!
(n / 2) + 1
(n + 1) / 2
n^2

Yes, the answer is correct.


Score: 1
Accepted Answers:
(n + 1) / 2
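
The sum of the first n natural numbers is n(n + 1) / 2, so dividing by n gives (n + 1) / 2. A small sketch that compares the closed form with a direct computation (the function name is illustrative):

```python
def mean_of_first_n(n):
    """Mean of 1, 2, ..., n computed directly from the numbers."""
    return sum(range(1, n + 1)) / n

# Compare the direct computation with the closed form (n + 1) / 2.
for n in (1, 5, 10, 100):
    assert mean_of_first_n(n) == (n + 1) / 2
    print(n, mean_of_first_n(n))
```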

9) 128 players are participating in a knockout tournament. How many games are required to decide the winner? (1 point)

Note: In a knockout tournament, whenever two people play, the loser is eliminated and the winner
advances to the next round.

124
127
64
130

Yes, the answer is correct.


Score: 1
Accepted Answers:
127
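
Every game eliminates exactly one player, and all but one player must be eliminated, so n players need n - 1 games. The sketch below counts the games round by round as a cross-check; the function name is illustrative.

```python
def games_in_knockout(players):
    """Count games round by round; an unpaired player simply advances."""
    games = 0
    while players > 1:
        games_this_round = players // 2  # every game pairs two players
        games += games_this_round
        players -= games_this_round      # each game eliminates one player
    return games

print(games_in_knockout(128))
```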

10) Given that [x_1, x_2, x_3, ..., x_n] are the possible values of a random variable X, and p_1, p_2, p_3, ..., p_n are the corresponding probabilities of each value of the random variable, the mean is computed by the formula (1 point)

$\sum_{i=1}^{n} p_i$

$\sum_{i=1}^{n} p_i x_i$

$\frac{\sum_{i=1}^{n} x_i}{n}$

None of these

Yes, the answer is correct.
Score: 1
Accepted Answers:
$\sum_{i=1}^{n} p_i x_i$
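
The accepted formula weights each value by its probability. A minimal sketch follows; the function name and the fair-die example are hypothetical and not part of the question.

```python
def expected_value(values, probabilities):
    """Mean of a discrete random variable: the sum of p_i * x_i."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * x for x, p in zip(values, probabilities))

# Hypothetical example: a fair six-sided die.
faces = [1, 2, 3, 4, 5, 6]
probs = [1 / 6] * 6
print(expected_value(faces, probs))
```

When every value is equally likely, this reduces to the ordinary arithmetic mean, which is the third option in the question.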

Your score is: 10/10

Python for Data Science
Week 1

1. What is the output of the following code? [1 mark]

(a) 36
(b) 121212
(c) 123
(d) Error: Invalid operation, unsupported operator ‘*’ used between ‘int’ and ‘str’

Answer: (b)

2. What is the output of the following code? [1 mark]

(a) -1
(b) -2
(c) -1.28
(d) 1.28

Answer: (b)

3. Consider the following code snippet. What is the data type of y? [1 mark]

(a) int
(b) float
(c) str
(d) Code will throw an error.

Answer: (c)

4. Which of the following variable names are INVALID in Python? [1 mark]

(a) 1_variable
(b) variable_1
(c) variable1
(d) variable#

Answer: a, d

5. While naming a variable, the use of any special character other than the underscore (_) will throw which type of error? [1 mark]

(a) Syntax error


(b) Key error
(c) Value error
(d) Index error

Answer: a

6. Let x = “Mayur”. Which of the following commands converts the ‘x’ to float datatype?
[1 mark]

(a) str(float,x)
(b) x.float()
(c) float(x)
(d) Cannot convert a string to float data type

Answer: d
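
The behaviour behind this answer can be seen with a short sketch: `float()` raises `ValueError` for a string that does not represent a number. The second conversion uses a hypothetical numeric string for contrast.

```python
x = "Mayur"

try:
    float(x)
    converted = True
except ValueError as error:
    # float() only accepts strings that represent numbers.
    converted = False
    print("Conversion failed:", error)

# A numeric string, by contrast, converts without error.
y = float("3.14")
print(converted, y)
```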

7. Which Python library is commonly used for data wrangling and manipulation? [1 mark]

(a) Numpy
(b) Pandas
(c) scikit
(d) Math

Answer: b

8. Predict the output of the following code. [1 mark]

(a) 12.0
(b) 12
(c) 11.667
(d) 11

Answer: b

9. Given two variables, j = 6 and g = 3.3. If both normal division and floor division
operators were used to divide j by g, what would be the data type of the value obtained
from the operations? [1 point]

(a) int, int


(b) float, float
(c) float, int
(d) int, float

Answer: b
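
A quick check of the two result types, using the values from the question:

```python
j = 6
g = 3.3

normal = j / g    # true division always returns a float
floored = j // g  # floor division returns a float when either operand is a float

print(type(normal).__name__, type(floored).__name__)
```

Floor division only returns an `int` when both operands are integers.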

10. Let a = 5 (101 in binary) and b = 3 (011 in binary). What is the result of the following
operation? [1 mark]

(a) 3
(b) 7
(c) 5
(d) 1

Answer: d
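
The operation itself is not reproduced in this copy of the question. Purely as an illustration, the sketch below applies three common bitwise operators to the given values; which operator the original question used is not known from the text, though bitwise AND is one operation that yields 1.

```python
a = 5  # 0b101
b = 3  # 0b011

and_result = a & b  # bits set in both operands
or_result = a | b   # bits set in either operand
xor_result = a ^ b  # bits set in exactly one operand

print(and_result, or_result, xor_result)
```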

Python for Data Science - Unit 5 - Week 2 (https://onlinecourses.nptel.ac.in/noc24_cs68/unit?unit=30&assessment=143)

Week 2: Assignment 2 (Non Graded)

Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Variable ‘a’ is defined as
a = ‘gOOd moRning’
The command to convert ‘a’ from ‘gOOd moRning’ to ‘Good Morning’ is: (1 point)

a.upper( )
a.lower( )
a.string( )
a.title( )

Yes, the answer is correct.
Score: 1
Accepted Answers:
a.title( )
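
The three real string methods among the options behave as sketched below; `a.string()` is not a string method. The variable `title_cased` is introduced only for illustration.

```python
a = "gOOd moRning"

print(a.upper())  # every letter upper-cased
print(a.lower())  # every letter lower-cased
print(a.title())  # first letter of each word upper-cased, the rest lower-cased

title_cased = a.title()
```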

2) Create a list called “Stationery” with the below data (1 point)

Product = ['Pencil', 'Pen', 'Eraser', 'Pencil Box', 'Scale']
Price = [5, 10, 2, 20, 12]
Brand = ['Camlin', 'Rotomac', 'Nataraj', 'Camel', 'Apsara']
Stationery = [Product, Price, Brand]

The command to add “Notebook” as the first element inside the first level of the list “Stationery” is:

Stationery[0].append('Notebook')
Stationery[0].insert(0,'Notebook')
Stationery[0][1] = "Notebook"
Stationery[0].extend('Notebook')

Yes, the answer is correct.
Score: 1
Accepted Answers:
Stationery[0].insert(0,'Notebook')
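
A short sketch of why `insert` is the accepted option, with `append` and `extend` shown on small hypothetical lists for contrast:

```python
Product = ['Pencil', 'Pen', 'Eraser', 'Pencil Box', 'Scale']
Price = [5, 10, 2, 20, 12]
Brand = ['Camlin', 'Rotomac', 'Nataraj', 'Camel', 'Apsara']
Stationery = [Product, Price, Brand]

# insert(0, item) places the item at the front of the first inner list.
Stationery[0].insert(0, 'Notebook')
print(Stationery[0])

# For contrast: append adds at the end, extend adds each character separately.
appended = ['Pencil', 'Pen']
appended.append('Notebook')
extended = ['Pencil', 'Pen']
extended.extend('Notebook')
print(appended, extended)
```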

3) The method to clear all the elements from a Set is: (1 point)

remove( )
discard( )
clear( )
delete( )

Yes, the answer is correct.
Score: 1
Accepted Answers:
clear( )
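
The difference between the three real set methods among the options can be sketched as follows; the example set is hypothetical.

```python
colours = {"red", "green", "blue"}  # hypothetical example set

colours.remove("red")    # removes one element; raises KeyError if it is absent
colours.discard("pink")  # removes one element if present; no error otherwise
print(len(colours))

colours.clear()          # removes every element, leaving an empty set
print(colours)
```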

4) Consider the list, (1 point)

Mylist = [‘a’, ‘a’, ‘b’, ‘b’, ‘b’, ‘c’, ‘c’, ‘d’, ‘e’]

The output of the code: Mylist.index(‘d’) is

7
8
4
6

Yes, the answer is correct.
Score: 1
Accepted Answers:
7
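
Indexing starts at zero, and `index()` reports the first match only. A brief sketch (the variable names `position` and `first_b` are illustrative):

```python
Mylist = ['a', 'a', 'b', 'b', 'b', 'c', 'c', 'd', 'e']

# index() returns the zero-based position of the first matching element.
position = Mylist.index('d')
print(position)

# For a repeated element, only the first occurrence is reported.
first_b = Mylist.index('b')
print(first_b)
```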

5) Which of the following Python sequence data types is immutable? (1 point)

list
dictionary
tuple
array

Yes, the answer is correct.
Score: 1
Accepted Answers:
tuple

Your score is: 5/5

Python for Data Science
Week 2

1. Which of the following objects does not support indexing? [1 mark]

(a) tuple
(b) list
(c) dictionary
(d) set

Answer: d

2. Given a NumPy array, arr = np.array([[[1, 2, 3], [4, 5, 6], [7, 8, 9]]]), what is the
output of the command, print(arr[0][1])?

(a) [[1 2 3]
[4 5 6]
[7 8 9]]
(b) [1 2 3]
(c) [4 5 6]
(d) [7 8 9]

Answer: c

3. What is the output of the following code?

(a) [2, 3, 4, 5]
(b) [0 1 2 3]
(c) [1, 2, 3, 4]
(d) Will throw an error: Set objects are not iterable.

Answer: c

4. What is the output of the following code? [1 mark]

(a)

(b)

(c)

(d)

Answer: c

5. Which of the following code snippets gives the output My friend’s house is in Chennai? [1 mark]

(a)

(b)

(c)

(d)

Answer: a, d

6. Let t1 = (1, 2, “tuple”, 4) and t2 = (5, 6, 7). Which of the following will not give any
error after the execution? [1 mark]

(a) t1.append(5)
(b) x = t2[t1[1]]
(c) t3 = t1 + t2
(d) t3 = (t1, t2)
(e) t3 = (list(t1), list(t2))

Answer: (b, c, d, e)
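
Each option can be tried directly. The sketch below wraps option (a) in a `try` block so the remaining lines still run; the names `append_failed`, `t3_concat`, `t3_nested` and `t3_lists` are introduced only for illustration.

```python
t1 = (1, 2, "tuple", 4)
t2 = (5, 6, 7)

# (a) Tuples are immutable and have no append method.
try:
    t1.append(5)
    append_failed = False
except AttributeError:
    append_failed = True

x = t2[t1[1]]                    # (b) t1[1] is 2, so this reads t2[2]
t3_concat = t1 + t2              # (c) concatenation builds a new tuple
t3_nested = (t1, t2)             # (d) a tuple holding two tuples
t3_lists = (list(t1), list(t2))  # (e) a tuple holding two lists

print(append_failed, x, t3_concat)
```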

7. Let d = {1 : “Python”, 2 : [1, 2, 3]}. Which among the following will not give an error after execution? [1 mark]

(a) d[2].append(4)
(b) x = d[0]
(c) d[“one”] = 1
(d) d.update({‘one’ : 2})

Answer: (a, c, d)
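
The four options can be checked with a short sketch. Option (b) is wrapped in a `try` block because looking up a missing key raises `KeyError`; the flag `lookup_failed` is illustrative.

```python
d = {1: "Python", 2: [1, 2, 3]}

d[2].append(4)            # (a) the list stored under key 2 is mutable

try:
    x = d[0]              # (b) there is no key 0
    lookup_failed = False
except KeyError:
    lookup_failed = True

d["one"] = 1              # (c) assignment adds a new key
d.update({"one": 2})      # (d) update overwrites the value of an existing key

print(lookup_failed, d)
```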

8. Which of the following data types is immutable? [1 mark]

(a) list
(b) set
(c) tuple
(d) dictionary

Answer: (c)

9. student = {‘name’: ‘Jane’, ‘age’: 25, ‘courses’: [‘Math’, ‘Statistics’]}
Which among the following will change student to
{‘name’: ‘Jane’, ‘age’: 26, ‘courses’: [‘Math’, ‘Statistics’], ‘phone’: ‘123-456’}?

(a) student.update({‘age’ : 26})
(b) student.update({‘age’ : 26, ‘phone’: ‘123-456’})
(c) student[‘phone’] = ‘123-456’
    student.update({‘age’ : 26})
(d) None of the above

Answer: (b, c)

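
The options can be compared against the target dictionary on separate copies, as in this sketch. Note that `update()` itself returns `None`; it is the dictionary that changes. The names `original`, `target` and `student_a` to `student_c` are illustrative.

```python
import copy

original = {'name': 'Jane', 'age': 25, 'courses': ['Math', 'Statistics']}
target = {'name': 'Jane', 'age': 26, 'courses': ['Math', 'Statistics'], 'phone': '123-456'}

# Option (a): only the age changes, so the phone number is still missing.
student_a = copy.deepcopy(original)
student_a.update({'age': 26})

# Option (b): one update call changes the age and adds the phone number.
student_b = copy.deepcopy(original)
student_b.update({'age': 26, 'phone': '123-456'})

# Option (c): item assignment followed by update reaches the same contents.
student_c = copy.deepcopy(original)
student_c['phone'] = '123-456'
student_c.update({'age': 26})

print(student_a == target, student_b == target, student_c == target)
```
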
10. What is the output of the following code? [1 mark]

(a) [‘M’, ‘A’, ‘H’, ‘E’, ‘S’, ‘H’]


(b) [‘m’, ‘a’, ‘h’, ‘e’, ‘s’, ‘h’]
(c) [‘M’, ‘a’, ‘h’, ‘e’, ‘s’, ‘h’]
(d) [‘m’, ‘A’, ‘H’, ‘E’, ‘S’, ‘H’]
Answer: (a)

Python for Data Science - Unit 6 - Week 3 (https://onlinecourses.nptel.ac.in/noc24_cs68/unit?unit=41&assessment=144)

Week 3: Assignment 3 (Non Graded)

Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Which of the following can be inferred from the scatter plot of ‘mpg’ (miles per gallon) vs ‘wt’ (weight of car) from the dataset mtcars.csv (https://drive.google.com/file/d/1Ua21bZfbtN4DUw4fK9XCF3AJmcIqSn4w/view?usp=sharing)? (1 point)

As weight of the car increases, the mpg decreases
As weight of the car increases, the mpg increases
There is no relation between weight of the car and mpg
When weight increases, mpg increases exponentially

Yes, the answer is correct.
Score: 1
Accepted Answers:
As weight of the car increases, the mpg decreases

2) Plot a boxplot for “price” vs “cut” from the dataset “diamond.csv” (https://drive.google.com/file/d/1oSRxlHG8NcK9jNgIn4Q1Y5GGi6Jm5asX/view?usp=sharing). Which of the categories under “cut” has the highest median price? (1 point)

Good
Very Good
Premium
Fair

Yes, the answer is correct.
Score: 1
Accepted Answers:
Fair

3) In the churn.csv (https://drive.google.com/open?id=14eJFzce4nMREzCsd4tCTewnFdz6GZAD4) dataframe, what is the total number of missing values for the variable TotalCharges? (1 point)

10
23
15
5

Yes, the answer is correct.
Score: 1
Accepted Answers:
15

4) Which command is used to draw a line plot with the package Matplotlib? (1 point)

plot( )
line( )
join( )
plt( )

Yes, the answer is correct.
Score: 1
Accepted Answers:
plot( )

5) The probability of two different events occurring at the same time is known as (1 point)

Marginal probability
Conditional probability
Joint probability
Marginal and Joint probability

Yes, the answer is correct.
Score: 1
Accepted Answers:
Joint probability

Your score is: 5/5

Python for Data Science - Unit 6 - Week 3 (https://onlinecourses.nptel.ac.in/noc24_cs68/unit?unit=41&assessment=151)

Week 3: Assignment 3

The due date for submitting this assignment has passed.
Due on 2024-08-14, 23:59 IST.

Assignment submitted on 2024-08-07, 13:36 IST

1) Which of the following is the correct approach to fill missing values in the case of a categorical variable? (1 point)

Mean
Median
Mode
None of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
Mode
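
Mean and median are not defined for categories, whereas the most frequent category always is. The sketch below fills missing entries with the mode using only the standard library; the column name and its values are hypothetical, and in the course itself this step would typically be done on a pandas dataframe.

```python
import statistics

# Hypothetical categorical column with missing entries marked as None.
fuel_type = ["Petrol", "Diesel", None, "Petrol", "CNG", None, "Petrol"]

# The mode is the most frequent observed category.
observed = [value for value in fuel_type if value is not None]
most_common = statistics.mode(observed)

# Replace each missing entry with the mode.
filled = [most_common if value is None else value for value in fuel_type]
print(most_common, filled)
```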

Assume a pandas dataframe df_cars which when printed is as shown below. Based on this information, answer questions 2 and 3.

2) Of the following set of statements, which of them can be used to extract the column Type as a separate dataframe? (1 point)

df_cars[[‘Type’]]
df_cars.iloc[[:, 1]
df_cars.loc[:, [‘Type’]]
None of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
df_cars[[‘Type’]]
df_cars.loc[:, [‘Type’]]

3) The method df_cars.describe() will give a description of which of the following columns? (1 point)

Car name
Brand
Price (in lakhs)
All of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
Price (in lakhs)

4) Which pandas function is used to stack the dataframes vertically? (1 point)

pd.merge()
pd.concat()
join()
None of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
pd.concat()

5) Which of the following are libraries in Python? (1 point)

Pandas
Matplotlib
NumPy
All of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
All of the above

Read the ‘flavors of cocoa.csv’ (https://drive.google.com/file/d/1jIf8xWRFp5OW7tzI4bPe5nqWfZW7shoW/view?usp=drive_link) file as a dataframe ‘df_cocoa’ and answer questions 6-9. The description of features/variables is given below:

6) Which of the following variables have null values? (1 point)

ID
Company
Review Date
Rating

Yes, the answer is correct.
Score: 1
Accepted Answers:
Review Date

7) Which of the following countries has the maximum number of locations of cocoa manufacturing companies? (1 point)

U.K.
U.S.A.
Canada
France

Yes, the answer is correct.


Score: 1
Accepted Answers:
U.S.A.

8) After checking the data summary, which feature requires a data conversion considering the data values held? (1 point)

Rating
Review date
Company
Bean origin

Yes, the answer is correct.


Score: 1
Accepted Answers:
Review date

9) What is the maximum rating of chocolates? 1 point

1.00
5.00
3.18
4.00

Yes, the answer is correct.


Score: 1
Accepted Answers:
5.00

10) What will be the output of the following code? 1 point

[bool, int, float, float, str]


[str, int, float, float, str]
[bool, int, float, int, str]
[bool, int, int, float, str]

Yes, the answer is correct.


Score: 1
Accepted Answers:
[bool, int, float, float, str]

11) What does df.info() provide? 1 point

Summary of the DataFrame, including the number of non-null entries.


The first 5 rows of the DataFrame
The data types of the columns
The correlation matrix of the DataFrame

Yes, the answer is correct.


Score: 1
Accepted Answers:

Summary of the DataFrame, including the number of non-null entries.

12) What will be the output of the following code? 1 point

[1, 2]
[1, 3, 5]
[1, 2, 3, 4, 5]
[5, 4, 3, 2, 1]

Yes, the answer is correct.


Score: 1
Accepted Answers:
[1, 3, 5]

Python for Data Science - Unit 7 - Week 4 (https://onlinecourses.nptel.ac.in/noc24_cs68/unit?unit=56&assessment=145)

Week 4: Assignment 4 (Non Graded)

Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Which of the following functions can be used to split the data into train and test? (1 point)

pandas.train_test_split( )
numpy.train_test_split( )
sklearn.model_selection.train_test_split( )
sklearn.train_test_split( )

Yes, the answer is correct.
Score: 1
Accepted Answers:
sklearn.model_selection.train_test_split( )

2) The function used to perform k-Nearest Neighbors classification is: (1 point)

sklearn.KNN
sklearn.KNearestClassifier
sklearn.neighbors.KNeighborsClassifier( )
sklearn.neighbors.KNeighborsRegressor( )

Yes, the answer is correct.
Score: 1
Accepted Answers:
sklearn.neighbors.KNeighborsClassifier( )

3) A Linear Regression model is said to be good when the R-squared value tends to (1 point)

0
1
-1
0.5

Yes, the answer is correct.
Score: 1
Accepted Answers:
1

4) The Gini coefficient ranges from (1 point)

0 to 1
-1 to 0
-1 to 1
None of the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
0 to 1

5) What is heteroscedasticity as used to assess a Linear Regression model? (1 point)

Linear regression with varying error terms
Linear regression with constant error terms
Linear regression with no error terms
All the above

Yes, the answer is correct.
Score: 1
Accepted Answers:
Linear regression with varying error terms

Your score is: 5/5

Python for Data Science - Unit 7 - Week 4 (https://onlinecourses.nptel.ac.in/noc24_cs68/unit?unit=56&assessment=152)

Week 4: Assignment 4

The due date for submitting this assignment has passed.
Due on 2024-08-21, 23:59 IST.

Assignment submitted on 2024-08-21, 17:32 IST

1) Which of the following are regression problems? Assume that appropriate data is given. (1 point)

Predicting the house price.
Predicting whether it will rain or not on a given day.
Predicting the maximum temperature on a given day.
Predicting the sales of the ice-creams.

Yes, the answer is correct.
Score: 1
Accepted Answers:
Predicting the house price.
Predicting the maximum temperature on a given day.
Predicting the sales of the ice-creams.

2) Which of the following are multiclass classification problems? (1 point)

Classifying emails as spam or not spam.
Classifying a person’s blood type as A, B, AB, or O.
Predicting the price of a second-hand car.
Classifying a movie genre into Drama, Comedy, Action, or Thriller.

Yes, the answer is correct.
Score: 1
Accepted Answers:
Classifying a person’s blood type as A, B, AB, or O.
Classifying a movie genre into Drama, Comedy, Action, or Thriller.

3) If a linear regression model achieves zero training error, can we say that all the data points lie on a straight line in the feature space? (1 point)

Yes
No

Yes, the answer is correct.
Score: 1
Accepted Answers:
Yes

Read the information given below and answer the questions from 4 to 6:

Data Description:

An automotive service chain is launching its new grand service station this weekend. They offer to service a wide variety of cars. The current capacity of the station is to check 315 cars thoroughly per day. As an inaugural offer, they claim to freely check all cars that arrive on their launch day, and report whether they need servicing or not!

Unexpectedly, they get 450 cars. The servicemen will not work longer than the working hours, but the data analysts have to!

Can you save the day for the new service station?

How can a data scientist save the day for them?

He has been given a data set, ‘ServiceTrain.csv’ (https://drive.google.com/file/d/1n1Hv9TtHTBUhU84z-S4wBBgoOOoKKrm7/view?usp=drive_link), that contains some attributes of the car that can be easily measured and a conclusion on whether a service is needed or not.

Now for the cars they cannot check in detail, they measure those attributes and store them in ‘ServiceTest.csv’ (https://drive.google.com/file/d/1h_Va9tkMB6UDSuqD6MzeYqgdph6yhtmy/view?usp=drive_link).

Problem Statement:

Use machine learning techniques to identify whether the cars require service or not.

Read the given datasets ‘ServiceTrain.csv’ (https://drive.google.com/file/d/1n1Hv9TtHTBUhU84z-S4wBBgoOOoKKrm7/view?usp=drive_link) and ‘ServiceTest.csv’ (https://drive.google.com/file/d/1h_Va9tkMB6UDSuqD6MzeYqgdph6yhtmy/view?usp=drive_link) as train data and test data respectively and import all the required packages for analysis.

4) Which of the following machine learning techniques would NOT be appropriate to solve the problem given in the problem statement? (1 point)

kNN
Random Forest
Logistic Regression
Linear regression

Yes, the answer is correct.
Score: 1
Accepted Answers:
Linear regression

Prepare the data by following the steps given below, and answer questions 5 and 6.
• Encode the categorical variable Service (Yes as 1 and No as 0) for both the train and test datasets.
• Split the set of independent features and the dependent feature on both the train and test datasets.
• Set the random state for the instance of the logistic regression class to 0.

5) After applying logistic regression, what is/are the correct observations from the resultant confusion matrix? (1 point)

True Positive = 29, True Negative = 94


True Positive = 94, True Negative = 29
False Positive = 5, True Negative = 94
None of the above

Yes, the answer is correct.


Score: 1
Accepted Answers:
True Positive = 29, True Negative = 94
False Positive = 5, True Negative = 94

6) The logistic regression model built between the input and output variables is checked for its prediction accuracy on the test data. What is the accuracy range (in %) of the predictions made over the test data? (1 point)

60 - 79
90 - 95
30 – 59
80 – 89

Yes, the answer is correct.


Score: 1
Accepted Answers:
90 - 95

7) How are categorical variables preprocessed before model building? 1 point

Standardization
Dummy variables
Correlation
None of the above

Yes, the answer is correct.


Score: 1
Accepted Answers:
Dummy variables

8) A regression model with the function y = 80 + 4.5x was built to understand the impact of temperature x on ice cream sales y. The temperature this month is 10 degrees more than the previous month. What is the predicted difference in ice cream sales? (1 point)

56 units
45 units
80 units
None of the above

Yes, the answer is correct.


Score: 1
Accepted Answers:
45 units
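
Because the model is linear, the intercept cancels and the predicted difference is the slope times the change in temperature: 4.5 × 10 = 45. A small sketch confirms this; the previous month's temperature is a hypothetical value, since any starting point gives the same difference.

```python
def predicted_sales(temperature):
    """Fitted model from the question: y = 80 + 4.5x."""
    return 80 + 4.5 * temperature

# Hypothetical temperature for the previous month; any value gives the same difference.
previous_temperature = 25
current_temperature = previous_temperature + 10

difference = predicted_sales(current_temperature) - predicted_sales(previous_temperature)
print(difference)
```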

9) X and Y are two variables that have a strong linear relationship. Which of the following statements are incorrect? (1 point)

There cannot be a negative relationship between the two variables.


The relationship between the two variables is purely causal.
One variable may or may not cause a change in the other variable.
The variables can be positively or negatively correlated with each other.

Yes, the answer is correct.


Score: 1
Accepted Answers:
There cannot be a negative relationship between the two variables.
The relationship between the two variables is purely causal.

The Global Happiness Index report contains the Happiness Score data with multiple features
(namely the Economy, Family, Health, and Freedom) that could affect the target variable
value.

Prepare the data by following the steps given below, and answer question 10.

• Split the set of independent features and the dependent feature on the given dataset
• Create training and testing data from the set of independent features and dependent feature
by splitting the original data in the ratio 3:1 respectively, and set the value for random_state
of the training/test split method’s instance as 1
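
The two preparation steps above might be sketched as follows. This assumes scikit-learn's `train_test_split` is the split method meant by the assignment; the small DataFrame and the target column name `HappinessScore` are hypothetical stand-ins for the real CSV, whose exact column names are not shown here.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the GHI data (values are made up).
data = pd.DataFrame({
    "Economy": [1.2, 0.9, 1.5, 0.4, 1.1, 0.7, 1.3, 0.6],
    "Family": [1.0, 0.8, 1.3, 0.5, 1.1, 0.6, 1.2, 0.7],
    "Health": [0.8, 0.6, 0.9, 0.3, 0.7, 0.5, 0.8, 0.4],
    "Freedom": [0.5, 0.4, 0.6, 0.2, 0.5, 0.3, 0.6, 0.3],
    "HappinessScore": [6.5, 5.4, 7.2, 3.9, 6.1, 4.8, 6.9, 4.5],
})

# Step 1: separate the independent features from the dependent feature.
X = data.drop(columns="HappinessScore")
y = data["HappinessScore"]

# Step 2: a 3:1 train/test ratio means 25% of the rows go to the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=1
)
print(X_train.shape, X_test.shape)  # (6, 4) (2, 4)
```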

10) A multiple linear regression model is built on the Global Happiness Index dataset ‘GHI 1 point
Report.csv (https://drive.google.com/file/d/1c7UeZMZuYYfOXMMagI4UpvC-VrJ7MXc8/view?
usp=drive_link)’. What is the RMSE of the baseline model?

2.00
0.50
1.06
0.75

Yes, the answer is correct.


Score: 1
Accepted Answers:
1.06
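
For reference, a baseline RMSE is usually obtained by predicting one constant value (typically a mean of the target) for every test row. The sketch below uses made-up numbers and the training-set mean as the constant; which mean the course code uses is an assumption here, so it does not reproduce the 1.06 figure.

```python
import math
from statistics import mean

# Hypothetical target values (not the GHI data).
y_train = [5.0, 6.0, 7.0, 6.0]
y_test = [5.0, 7.0]

# Baseline model: always predict the mean of the training target.
baseline_prediction = mean(y_train)

# RMSE = square root of the mean squared error on the test set.
squared_errors = [(actual - baseline_prediction) ** 2 for actual in y_test]
baseline_rmse = math.sqrt(mean(squared_errors))
print(baseline_rmse)  # 1.0
```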

Python for Data Science
WEEK 0 ASSIGNMENT QUESTIONS

Read the table given below, and answer questions 1 & 2:

1. The table above indicates that there were 3,193,886 confirmed COVID-19 cases at one point in
time around the world. The table also shows that 972,719 people recovered. What % of people
have recovered from COVID-19?

a) 30.45

b) 7.1

c) 37

d) 20.5

Answer: a) 30.45

Solution: (972,719 / 3,193,886) * 100 ≈ 30.46, which matches option a) up to rounding

2. Out of the 3.1 million confirmed cases, what % of people have lost their lives?

a) 30.5

b) 7.1

c) 37

d) 20.5

Answer: b) 7.1

Solution: (227,638 / 3,193,886) * 100 ≈ 7.1
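
Both percentages in questions 1 and 2 can be reproduced with a few lines of Python. The counts are the ones quoted in the questions and solutions; the variable names are illustrative. Rounded to two decimals the recovered share prints as 30.46, so option a) 30.45 is the truncated value.

```python
confirmed = 3_193_886
recovered = 972_719
deaths = 227_638

# Share of confirmed cases, expressed as a percentage.
recovered_pct = recovered / confirmed * 100
deaths_pct = deaths / confirmed * 100

print(f"{recovered_pct:.2f}")  # 30.46
print(f"{deaths_pct:.1f}")     # 7.1
```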


3. There are 6 marbles in a sling bag. Of these 3 are red marbles, 2 are blue marbles and 1 is a
yellow marble. What is the probability of drawing a red marble?
a. 1/3
b. 2/3
c. ½
d. 5/6
e. None of the above

Answer: c) ½.

There are 3 red marbles, 2 blue marbles and 1 yellow marble in the bag.

Probability of drawing a red marble = 3 red marbles / 6 marbles in total = 1/2

4. “Statistics and Probability” is the title of a book. If each letter was carved into a block and
dropped into a bag, what are the chances a person would draw either the letter A or I from the
bag?
a. 1/4
b. 3 /24
c. 1/6
d. 7/24
e. None of the above

Answer: d) 7/24

Probability of drawing the letter A = number of A's / total letters = 3/24

Probability of drawing the letter I = number of I's / total letters = 4/24

Total probability = 3/24 + 4/24 = 7/24
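
The letter counts used in this solution can be verified programmatically. A minimal sketch using only the standard library; treating upper and lower case as the same letter and ignoring the spaces are the assumptions the solution itself makes.

```python
from collections import Counter
from fractions import Fraction

title = "Statistics and Probability"

# One block per letter: drop the spaces and ignore case.
letters = [ch.upper() for ch in title if ch.isalpha()]
counts = Counter(letters)
total = len(letters)

# Drawing an A and drawing an I are mutually exclusive, so the
# probabilities add.
p_a_or_i = Fraction(counts["A"], total) + Fraction(counts["I"], total)

print(total, counts["A"], counts["I"])  # 24 3 4
print(p_a_or_i)  # 7/24
```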

5. From a shuffled deck of 52 cards, a card is drawn randomly. What is the probability that the card
drawn is neither a Queen nor a Heart shaped card?
a. 14/52
b. 17 /62
c. 35/52
d. 9/13
e. None of the above

Answer: d) 9/13

Number of Hearts in a deck: 13 [including the Queen of Hearts]

Number of other Queens in the deck: 3 [Clubs, Spades, Diamonds]

Total cards that are a Queen or a Heart: 13 + 3 = 16

Remaining cards: 52 - 16 = 36

Probability of drawing neither a Queen nor a Heart = 36/52 = 9/13
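
The same count can be obtained by enumerating a deck and filtering it, which avoids double-counting the Queen of Hearts. A small sketch; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["Hearts", "Diamonds", "Clubs", "Spades"]

# Every (rank, suit) pair is one card: 13 * 4 = 52 cards.
deck = list(product(ranks, suits))

# Keep only cards that are neither a Queen nor a Heart.
favourable = [
    (rank, suit) for rank, suit in deck if rank != "Q" and suit != "Hearts"
]

probability = Fraction(len(favourable), len(deck))
print(len(deck), len(favourable), probability)  # 52 36 9/13
```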

6. What will be the output for the following code snippet?


c = 10
d = 20
e = 5
sum = c + d - e
print(sum)

A) 35
B) 25
C) 30
D) 15

Answer: B) 25

7. Suppose we have a variable n = 5. If we wanted to print numbers from 1 to the variable
[including n], which loop helps with the iteration?
a) for loop
b) while loop
c) a and b
d) None of the above

Answer: c) a and b

Because the variable n holds a numeric value, the number of iterations is known in advance. So,
in this case, both a for loop and a while loop will work.
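
Both loop forms are sketched below for n = 5. The numbers are collected in lists (names are illustrative) so that the two results can be compared.

```python
n = 5

# for loop: range() stops before its end value, hence n + 1.
for_output = []
for i in range(1, n + 1):
    for_output.append(i)

# while loop: the counter is advanced by hand until it passes n.
while_output = []
i = 1
while i <= n:
    while_output.append(i)
    i += 1

print(for_output)    # [1, 2, 3, 4, 5]
print(while_output)  # [1, 2, 3, 4, 5]
```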

8. Which of the following operators is used to check an “Equal to” relationship between variables?

a) <=
b) >==
c) =
d) ==
e) None of the above

Answer: d) ==

9. Find the median of the following numbers: 13, 42, 24, 9, 11, 18, 11, 7
a) 12
b) 11
c) 13
d) 9

Answer: a) 12
Sorting the numbers in order: 7, 9, 11, 11, 13, 18, 24, 42
Median = (11 + 13) / 2 = 12
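
The standard library's `statistics.median` performs the same steps: it sorts the values and, for an even count, averages the two middle ones.

```python
from statistics import median

numbers = [13, 42, 24, 9, 11, 18, 11, 7]

print(sorted(numbers))  # [7, 9, 11, 11, 13, 18, 24, 42]

# Eight values, so the median is the mean of the 4th and 5th sorted values.
middle = median(numbers)
print(middle)  # 12.0
```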

10. Find the mode for the following sets of numbers:


Python for Data Science - - Unit 3 - Week 1 https://onlinecourses.nptel.ac.in/noc22_cs32/unit?unit=19&assessment=94

Week 1: Practice Assignment 1 (Non Graded)
Assignment not submitted
Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Which of the following is/are valid variable naming convention(s) in Python? 1 point

ageEmp = 45
AgeEmp = 45
age_emp = 45
AGE_EMP = 45

Yes, the answer is correct.
Score: 1
Accepted Answers:
ageEmp = 45
AgeEmp = 45
age_emp = 45
AGE_EMP = 45

2) Which of the following is not accepted as a representation of complex numbers in Python? 1 point

k = 2 + 3j
k = complex(2, 3)
k = 2 + 3l
k = 2 + 3J

Yes, the answer is correct.
Score: 1
Accepted Answers:
k = 2 + 3l

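The three accepted spellings in question 2 produce the same complex value, while the `l` suffix is not valid Python 3 syntax. A short check; `compile` is used so that the invalid line can be tested without stopping the script.

```python
k1 = 2 + 3j
k2 = complex(2, 3)
k3 = 2 + 3J

# All three spellings build the same complex number.
print(k1 == k2 == k3)  # True

# "3l" is not a valid numeric literal, so compiling it fails.
try:
    compile("k = 2 + 3l", "<example>", "exec")
    suffix_l_accepted = True
except SyntaxError:
    suffix_l_accepted = False

print(suffix_l_accepted)  # False
```
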
3) What is the output of the following expression, 7//2? 1 point

1
3
7
2

Yes, the answer is correct.
Score: 1
Accepted Answers:
3

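For comparison, the snippet below puts floor division next to true division and the remainder operator. The negative case is added only to show that `//` rounds towards negative infinity.

```python
floor_result = 7 // 2   # floor division drops the fractional part
true_result = 7 / 2     # true division always returns a float
remainder = 7 % 2       # what is left after floor division
negative_floor = -7 // 2

print(floor_result, true_result, remainder)  # 3 3.5 1
print(negative_floor)  # -4
```
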
4) Which of the following functions returns a valid list of attributes of the object it is called upon? 1 point

print()
bin()
abs()
dir()

Yes, the answer is correct.
Score: 1
Accepted Answers:
dir()

5) An entity that can be assigned different values is called __ 1 point

Operator
Operand
Variable
Value

Yes, the answer is correct.
Score: 1
Accepted Answers:
Variable

Your score is: 5/5

NPTEL
WEEK 1 ASSIGNMENT QUESTIONS

1) Which of the arithmetic operators given below cannot be used with ‘strings’ in Python?
A) *
B) -
C) +
D) All of the mentioned

2) When the following statement is executed, what type of error is obtained?

a. Type Error
b. Syntax Error
c. Value Error
d. None of the above

3) Two variables X and Y were assigned the following values initially. X = 3 and Y = 6. Which of the
following statements will help swap the values between these two variables?

a. Y = X
   X = Y
b. X = Y
c. X = Y
   Y = X
d. X, Y = Y, X
e. Both a and d
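
To see how the candidate statements in question 3 behave, the sketch below runs tuple unpacking and one of the sequential forms on separate copies of the values. The `_seq` names are illustrative.

```python
# Tuple unpacking: the right-hand side is evaluated first, then assigned.
X, Y = 3, 6
X, Y = Y, X
print(X, Y)  # 6 3

# Sequential assignment: the first statement overwrites one value
# before it can be copied, so both variables end up equal.
X_seq, Y_seq = 3, 6
Y_seq = X_seq
X_seq = Y_seq
print(X_seq, Y_seq)  # 3 3
```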

4) From the following set of statements, what will be the value of variable y in the final print
statement?

a. 8
b. 9
c. 1
d. Error
e. 16
5) Consider j = 5 and k = 11. We then change j to 7 while k remains constant.

What does print(j|k) output before and after the value of j is modified?

a. 3,15
b. 15,15
c. 11,15
d. 15,7
e. None of the above
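
The `|` operator is a bitwise OR, so it helps to look at the operands in binary. A quick check of both cases in question 5:

```python
k = 11

j = 5
before = j | k  # 0b0101 | 0b1011

j = 7
after = j | k   # 0b0111 | 0b1011

print(bin(5), bin(7), bin(k))  # 0b101 0b111 0b1011
print(before, after)  # 15 15
```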

6) What would be the output of the following statements?

a. False
b. True
c. Not True
d. None of the above

7) What does k = 4%7 evaluate to and what is the type of variable k?

a. 4, int
b. 0.0, float
c. 0, int
d. 1, int
e. None of the above

8) j = 6 and g = 3.3. If normal division and floor division were done between j and g, what would be
the type of each resultant variable?

a. int,int
b. float,float
c. float,int
d. int,float
e. None of the above
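
The result types can be inspected with `type()`. When either operand is a float, both `/` and `//` return a float, as this short check shows:

```python
j = 6
g = 3.3

normal = j / g    # true division
floored = j // g  # floor division, still a float because g is a float

print(type(normal).__name__, type(floored).__name__)  # float float
print(floored)  # 1.0
```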

9) Consider two answers to a question; answer1 and answer2. What is the output of the following
set of statements?

a. True
b. False
c. 0
d. 1

10) Consider the list of instructions and resulting outputs given below. Pick the set that is incorrect.
1. print ("Good", end ="")
print ("Day")
Output -> GoodDay

2. word1 = "Trial"
print("Word is %s" %word1)
Output -> Trial

3. num1 = 23
print( " Number: %f " %num1 )
Output -> Number: 23.000000

4. print( "ready\nsteady\ngo")
Output -> ready
steady
go

a. 4
b. 2
c. 1,3,4
d. 3,4
e. All are correct.
Python for Data Science - - Unit 4 - Week 2 https://onlinecourses.nptel.ac.in/noc22_cs32/unit?unit=31&assessment=98

Week 2: Practice Assignment 2 (Non Graded)
Assignment not submitted
Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Create a dictionary “Movie_A” with the following details of a movie: 1 point

The correct command to extract the movie’s year of release is ___

Movie_A[2019]
Movie_A[2]
Movie_A['Year']
Movie_A[3]

Yes, the answer is correct.
Score: 1
Accepted Answers:
Movie_A['Year']
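
Dictionary values are looked up by key, not by position or by value. The movie details table is not reproduced in this text, so the dictionary below is a hypothetical stand-in; only the `Year` key and the value 2019 are suggested by the answer options.

```python
# Hypothetical contents; the original table is not available here.
Movie_A = {"Name": "Example Film", "Year": 2019, "Genre": "Drama"}

# Lookup by key returns the stored value.
year = Movie_A["Year"]
print(year)  # 2019

# Using the value (or a position) as a key fails with KeyError.
try:
    Movie_A[2019]
    lookup_by_value_works = True
except KeyError:
    lookup_by_value_works = False

print(lookup_by_value_works)  # False
```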


2) Which of the following is/are container(s) for sequential data? 1 point

Lists
Dictionary
Strings
Sets

Yes, the answer is correct.
Score: 1
Accepted Answers:
Lists
Strings

3) Which of the following lines of code is appropriate to create an array of float datatype (min 8 bytes)? 1 point

sample=array('d',[2.2,2,5,6])
sample=array('f',[1,2,3])
sample=array('b',[1,2.3,3])
sample=array('B',[1,2.3,3])

Yes, the answer is correct.
Score: 1
Accepted Answers:
sample=array('d',[2.2,2,5,6])

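The type codes in question 3 come from Python's built-in `array` module, and `itemsize` reports how many bytes each element occupies. A small sketch (the sizes shown are the usual ones; the documentation only guarantees minimums):

```python
from array import array

# 'd' is a double-precision float (at least 8 bytes per item).
sample = array('d', [2.2, 2, 5, 6])
print(sample.typecode, sample.itemsize)  # typically: d 8

# 'f' is a single-precision float (at least 4 bytes per item).
single = array('f', [1, 2, 3])
print(single.itemsize)  # typically: 4

# 'b' and 'B' are integer type codes, so a float element is rejected.
try:
    array('b', [1, 2.3, 3])
    float_in_int_array = True
except TypeError:
    float_in_int_array = False

print(float_in_int_array)  # False
```
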
4) The correct command(s) to install Jupyter Notebook through command prompt is/are__ 1 point

pip install notebook
pip install jupyter
pip install Jupiter
pip install jupyter notebook

Yes, the answer is correct.
Score: 1
Accepted Answers:
pip install notebook
pip install jupyter
pip install jupyter notebook

5) The Markdown cells in Jupyter Notebook allow you to add a maximum of _______ hash (#) signs followed by a space to add section headers 1 point

8
6
3
2

Yes, the answer is correct.
Score: 1
Accepted Answers:
6

Your score is: 5/5
NPTEL
WEEK 2 ASSIGNMENT QUESTIONS

1. Consider a variable job = "chemist". Which of the following expressions will retrieve the last
character from the variable value?
A) job[7]
B) job[len(job) - 1]
C) job[5:6]
D) job[- 1]
E) All of the above statements are true.
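
Each expression in question 1 can be tried on the string directly. Indexing is zero-based, so the last valid index of a 7-character string is 6.

```python
job = "chemist"

print(job[len(job) - 1])  # t
print(job[-1])            # t
print(job[5:6])           # s  (slice from index 5 up to, not including, 6)

# Index 7 is one past the end of the string.
try:
    job[7]
    index_7_valid = True
except IndexError:
    index_7_valid = False

print(index_7_valid)  # False
```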

2) Which of the following expressions should be used to assign the variable get_num to get the final
print statement output as value 75 from the below tuple?

a. nst_tup[1][2]
b. nst_tup[1:2][1]
c. nst_tup[1][1]
d. nst_tup[1:2](1)
e. None of the above

3) What would be the output for the following set of statements?

a. [13, 23, 18, 64, 51, “True”]


b. [13, 23, 18, 64, True]
c. [13, 23, 18, 64, 51, True]
d. Index Error

4) What result does the final statement print?

a. Output is: 12, (25, 32, 39), 44


b. Output is: 12 and (25, 32, 39) and 44

c. Output is: 12 and 25 and 39

d. ValueError: Too many values to unpack

e. Output is: 12 and [25, 32, 39] and 44

5) When the following set of instructions are executed, how many times does the vowel “e” appear
in the result?

a. 1
b. “e” is not printed
c. 2
d. 4
e. None of the above

6) Which of the following options, when executed, will result in a tuple?

a. t = (2,2)
b. y =['h','4','3']
c. r = ('v',)
d. s = ('w')
e. All except b
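
The types produced by the four statements in question 6 can be checked with `type()`. Note that parentheses alone do not make a tuple; the trailing comma does.

```python
t = (2, 2)
y = ['h', '4', '3']
r = ('v',)  # one-element tuple: the comma is what matters
s = ('w')   # just a parenthesised string

types = {
    name: type(value).__name__
    for name, value in [("t", t), ("y", y), ("r", r), ("s", s)]
}
print(types)  # {'t': 'tuple', 'y': 'list', 'r': 'tuple', 's': 'str'}
```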

7) Which statement/statements will result in an empty data structure?

a. dict1 = {}
b. tup1 = ()
c. st1 = set()
d. toy = "baseball"
gt_str = toy[2:2]
print(gt_str)
e. All of the above

8) Consider a dictionary city created with the following keys and values.

In which of the following ways can we access the value 5 from the dictionary city?
a. city['Bengaluru']
b. city.get['Bengaluru']
c. city.values()[1]
d. list(city.values())[1]
e. None of the above
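
The access patterns in question 8 can be tried on a stand-in dictionary. The original keys and values are not reproduced in this text, so the dictionary below is hypothetical; it only assumes that 'Bengaluru' is the second key and maps to 5, as the options imply.

```python
# Hypothetical contents; only 'Bengaluru' -> 5 in second position is assumed.
city = {"Chennai": 3, "Bengaluru": 5, "Mumbai": 8}

by_key = city["Bengaluru"]
by_position = list(city.values())[1]  # a list of the values can be indexed
print(by_key, by_position)  # 5 5


def raises_type_error(action):
    """Return True if calling action() raises TypeError."""
    try:
        action()
    except TypeError:
        return True
    return False


# get is a method, so it needs parentheses, not square brackets.
print(raises_type_error(lambda: city.get["Bengaluru"]))  # True
# A dict_values view cannot be indexed directly.
print(raises_type_error(lambda: city.values()[1]))       # True
```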

9) Count the number of elements in the below list.

a. 2
b. 1
c. 3
d. 0
e. None of the above

10) A datastructure is defined as celebrate = set('Nativity Day'). What are the possible outputs if
celebrate is printed?
1. {'v', 'N', 't', 'i', 'y', 'a', 'D'}
2. {'v', 'N', 't', 'I', 'y', 'a', 'D', ' '}
3. {'v', 'N', 't', 'i', 'y', 'a', 'D', ' '}
4. {'v', 't', 'i', 'y', 'a', 'D', ' ', 'N'}
a. 1
b. 1 and 3
c. 1,2,3
d. 3 and 4
e. All are correct.
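
A set keeps one copy of each distinct character (the space included, and case-sensitively) and has no fixed display order. The sketch below sorts the elements only to make the printout reproducible.

```python
celebrate = set('Nativity Day')

# Duplicates such as the second 'i', 't', 'a' and 'y' are dropped.
print(len(celebrate))     # 8
print(sorted(celebrate))  # [' ', 'D', 'N', 'a', 'i', 't', 'v', 'y']

# The space is an element; an upper-case 'I' is not.
print(' ' in celebrate, 'I' in celebrate)  # True False
```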
Python for Data Science - - Unit 5 - Week 3 https://onlinecourses.nptel.ac.in/noc22_cs32/unit?unit=42&assessment=99

Week 3: Practice Assignment 3 (Non Graded)
Assignment not submitted
Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) Which of the following parameters is an alias for ‘sep’ for the read_csv and read_table functions from pandas? 1 point

index_col
skiprows
na_values
delimiter

Yes, the answer is correct.
Score: 1
Accepted Answers:
delimiter

2) What will be the output of the code given below? 1 point

120
-120
60
-60

Yes, the answer is correct.
Score: 1
Accepted Answers:
60

3) Identify the correct statements. 1 point

I. Scatter plot is used to convey the relationship between two continuous variables
II. Histogram is used to depict the shape and spread of a continuous variable
III. Bar plot is used to depict the visual representation of statistical five-number summary of a variable

I and II only
II and III only
I and III only
I, II and III

Yes, the answer is correct.
Score: 1
Accepted Answers:
I and II only

4) Which one of the following syntaxes is used to import a csv file by considering special characters as NaN? 1 point

pandas.read_csv(file_name.csv, true_values = [ ])
pandas.read_csv(file_name.csv, na_values = [ ])
pandas.read_csv(file_name.csv, skiprows = [ ])
pandas.read_csv(file_name.csv, na_filter = [ ])

Yes, the answer is correct.
Score: 1
Accepted Answers:
pandas.read_csv(file_name.csv, na_values = [ ])

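The `delimiter` alias from question 1 and the `na_values` parameter from question 4 can be seen together in one call. In this sketch the data is an in-memory string rather than a file, and the column names and the placeholder markers `??` and `###` are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical semicolon-separated data with junk markers in 'score'.
raw = "name;score\nA;10\nB;??\nC;###\n"

df = pd.read_csv(
    io.StringIO(raw),
    delimiter=";",            # alias for sep
    na_values=["??", "###"],  # treat these strings as NaN
)

missing = int(df["score"].isna().sum())
print(missing)  # 2
```
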
5) What is the statistical measure related to the box plot? 1 point

Mean
F1_score
Median
Mode

Yes, the answer is correct.
Score: 1
Accepted Answers:
Median

Your score is: 5/5
NPTEL
WEEK 3 ASSIGNMENT QUESTIONS

1) Data from the file “brand_data.csv” has to be loaded into a pandas dataframe. A snippet of the
data is shown below:

What is the right instruction to read the file into a dataframe df_brand with 4 separate columns?

a)

b)

c)

d)

Answers: b) and d)

Option a) chooses the wrong column as index. When set with index_col = 0, the dataframe ends with
only 3 columns and brand becomes the index.

Option b) returns a dataframe of 4 rows and 4 columns. This is correct.


Option c) reads the dataframe with the wrong header. The data is read into a dataframe in an
illogical manner.

Option d) uses read_table, which can read CSV files when delimiter = ‘,’ is set. Note that the
header is also correctly marked. This is correct.

2) For the same file “brand_data.csv” above, which parameter in pd.read_csv will help to load
dataframe df_brand with the selected columns as shown below?

a. index_col =[‘brand’,’Price’]
b. skiprows =[‘brand’,’Price’]
c. usecols =[‘brand’,’Price’]
d. None of the above

Answer: c) usecols. Returns a subset of the columns from the original file.
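
A minimal sketch of `usecols` in action. The file contents are not reproduced in this text, so the rows and the two extra column names below are hypothetical; only `brand` and `Price` come from the question.

```python
import io

import pandas as pd

# Hypothetical stand-in for brand_data.csv with 4 columns.
raw = (
    "brand,model,Price,rating\n"
    "Alpha,X1,100,4.1\n"
    "Beta,Y2,150,3.9\n"
)

# usecols keeps only the listed columns while reading.
df_brand = pd.read_csv(io.StringIO(raw), usecols=["brand", "Price"])
print(list(df_brand.columns))  # ['brand', 'Price']
print(df_brand.shape)          # (2, 2)
```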
3) Data from the file “weather.xlsx” has to be loaded into a pandas dataframe df_weather which,
when printed, is as shown below:

Which of the following statements can be used to move the column “Direction” into a
separate dataframe?

a.

b.

c.

d.

Answer: a and c.

Option a. ->

Option b ->

Option c ->

Option d ->
4) Referring to the same dataframe df_weather in Question (3), which statement/statements will
help to print the last row from the dataframe?

a.

b.

c.

d.

Answer: b and d

Option a. Retrieves all rows except the last row.

Option b. Correct option.

Option c. Retrieves the row with index 2 [ second last row].

Option d. Correct option.

5) In reference to the same dataframe df_weather, we add an additional column ‘Hot_day’ to
determine whether the day is hot or not based on the values in the Temperature column. What will
the print statement output?
a. True
b. SyntaxError
c. False
d. None of the above

Answer: c). The third row has a temperature of 35, so it will return False.

6) What statement would give the number of columns in a dataframe df?

a. len(df.columns)
b. len(df)
c. df.size
d. All of the above.

Answer: a) len(df.columns). len(df) returns the number of rows, and df.size returns the total number of elements.
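
The three expressions from question 6 give different numbers on any dataframe whose row and column counts differ. A small illustration with made-up data:

```python
import pandas as pd

# Hypothetical dataframe: 3 rows, 2 columns.
df = pd.DataFrame({"city": ["A", "B", "C"], "temp": [30, 35, 28]})

n_columns = len(df.columns)  # number of columns
n_rows = len(df)             # number of rows
n_elements = df.size         # rows * columns

print(n_columns, n_rows, n_elements)  # 2 3 6
```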

7) A file “Students.csv” contains the attendance and total scores of three separate students. This
data is loaded into a dataframe df_study and a pandas crosstab is applied on the same dataframe
which results in the following output

Which student scored the maximum average score of all three subjects? Which subject has the best
average score for all three students?

a. Harini,Chemistry
b. Rekha,Physics
c. Harini,Physics
d. Rekha,Maths

Answer: d) Rekha, Maths.


8) The following histogram shows the number of books read in a year:

Find the mean and median in the above histogram.

a) 7,8
b) 8,9
c) 8.5,7
d) 8,8
e) None of the above

Answer: d) 8 is the central tendency for the above histogram. It is the mean, median and mode.

9) For the following box plot, which among the given options are the median and the outlier?

a. 15,52
b. 22, 52
c. 13.5, 29
d. 25, 50

Answer: b) Median is between 20 and 25, so 22 is the median. Outlier is between 50 and 55, hence
52 is the outlier.

Q1 = 13.5, Q3 = 27.5
10) A dataframe df_logs has the following data.

All the NaN / Null values in the column C1 can be replaced by zero value by executing which of the
following statements?

a. df_logs['C1'].fillna(0,inplace = True)
b. df_logs.fillna(0,inplace = True)
c. df_logs.fillna(0,inplace = False)
d. df_logs['C1'].fillna(df_logs['B1'],inplace = True)

Answer: a) df_logs['C1'].fillna(0,inplace = True)

Option a) Only Column C1 values get replaced by zero value.

Option b). All the null values in the dataframe get replaced by zero value. Incorrect.
Option c). No changes are reflected in the dataframe. Incorrect.

Option d). Column C1 null values get replaced by Column B1 values. Incorrect.
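
The column-only behaviour described for option a) can be sketched as below. The data is hypothetical (the original table is not reproduced here), and the result is assigned back to the column instead of using `inplace = True` on a column selection, because newer pandas releases may warn about that pattern.

```python
import pandas as pd

# Hypothetical stand-in for df_logs with missing values in both columns.
df_logs = pd.DataFrame({
    "B1": [1.0, None, 3.0],
    "C1": [None, 5.0, None],
})

# Fill only column C1; other columns keep their missing values.
df_logs["C1"] = df_logs["C1"].fillna(0)

c1_values = df_logs["C1"].tolist()
b1_missing = int(df_logs["B1"].isna().sum())
print(c1_values)   # [0.0, 5.0, 0.0]
print(b1_missing)  # 1
```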
Python for Data Science - - Unit 6 - Week 4 https://onlinecourses.nptel.ac.in/noc22_cs32/unit?unit=57&assessment=100

Week 4: Practice Assignment 4 (Non Graded)
Assignment not submitted
Note: This assignment is only for practice purposes and will not be counted towards the final score.

1) The most linearly correlated feature set in the given dataset is? 1 point

Cdur and InRate
Age and Camt
Cdur and Camt
Camt and Ndepend

Yes, the answer is correct.
Score: 1
Accepted Answers:
Cdur and Camt

2) What is the average loan amount claimed by borrowers for the purpose of education? 1 point

31684
44132.5
14860
18725

Yes, the answer is correct.
Score: 1
Accepted Answers:
31684

3) To identify invalid values in categorical columns of the dataframe, which of the following will be very helpful and less time consuming in Python? 1 point

View the data manually and find the invalid values
By observing unique values of every categorical column of the data using unique() method
Only by viewing info of the dataframe using info() method
By estimating the statistics of the data using describe() method

Yes, the answer is correct.
Score: 1
Accepted Answers:
By observing unique values of every categorical column of the data using unique() method

Your score is: 3/3


NPTEL
WEEK 4 ASSIGNMENT QUESTIONS

Given Data: Credit Worthiness data containing 1000 observations of income details of
individuals comprising 21 attributes along the columns (Cbal, Cdur, Chist, Cpur, Camt, Sbal,
Edur, InRate, MSG, Oparties, Rdur, Prop, age, inPlans, Htype, NumCred, JobType, Ndepend,
telephone, foreign, creditScore)
Problem statement: By observing the features of the dataset, the problem statement can be
defined as a binary classification problem of classifying any individual into an appropriate
category of creditScore such as Good or Bad.

1) How many unique values are present in the Sbal feature; also, what is the most frequent value
within Sbal?

a) 5, Rs. >= 10,000

b) 4, Rs. < 1000

c) 5, Rs. < 1000

d) 4, '1000 <= Rs. < 5,000'

Answer: c)

All features of object type can be analyzed with describe().
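
For an object-type column, `describe()` reports the number of distinct values (`unique`) and the most frequent one (`top`), which is what question 1 asks for. The sketch below uses a tiny made-up `Sbal` column, not the actual Credit Worthiness data, so its numbers differ from the assignment's answer.

```python
import pandas as pd

# Hypothetical Sbal values for illustration only.
df = pd.DataFrame({
    "Sbal": ["Rs. < 1000", "Rs. < 1000", "no savings account", "Rs. >= 10,000"],
})

summary = df["Sbal"].describe()

n_unique = int(summary["unique"])
most_frequent = summary["top"]
print(n_unique, most_frequent)  # 3 Rs. < 1000
```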

2) Find the average age of those customers who have a credit history [Chist] wherein the dues are
not paid earlier.

a. 35.54
b. 38.44
c. 33.00
d. None of the above
Answer: b) 38.44

3) A Logistic Regression model is built in which none of the features used are standardized. The train
to test proportion is 75:25 and the random state is set to 1. The accuracy of the model is ________.

a. Less than 50%

b. Between 50% and 60%

c. Greater than 70%

d. None of the above

Answer: c)
4) Import StandardScaler() from the sklearn.preprocessing package to standardize the features. Use
the same train-test proportion and the random state should be set to 1. After standardizing the
features and refitting the logistic regression model, by what percentage has the number of
misclassified samples changed?

a. 11.11%
b. 3.7%
c. 20%
d. 39.2%

Answer: a)

After standardizing:
Percentage change in misclassified samples: ((63 - 56) / 63) * 100 = 11.11% (a decrease from 63 to 56)

5) When KNN classification is applied on the same standardized data at the optimal value for k
nearest neighbours, the accuracy achieved is ______.

a. 64%
b. 78%
c. 76.4%
d. None of the above

Answer: b)

6) A multiple linear regression model is built on the Global Happiness Index dataset
“GHI_Report.csv”. What is the rmse of the baseline model?

a. 1.99
b. 0.85
c. 1.06
d. 0.33

Answer: c) 1.06
7) From the multiple linear regression model built on the GHI index, we get an R-squared value of
___ on the test data subset.

a. 55.63
b. 45.81
c. 75.59
d. 81.46

Answer: d)

8) Which of the following statement/s about Linear Regression is / are true?

a) Linear Regression assumes that there exists a linear relationship between the
independent variable and dependent variable.
b) The error terms are assumed to be independent and normally distributed.
c) The percentage of variation in the dependent variable as explained by the independent
variable/variables is expressed by R-squared value.
d) Residuals are the product of the predicted value and the actual observed value.

Answer: a,b and c.


Residuals are the difference between the predicted value and the actual observed value.
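
Residuals and the R-squared value mentioned in option c) can be computed by hand on a toy example. The numbers are made up, and the sign convention (actual minus predicted) is an assumption; R-squared does not depend on it.

```python
from statistics import mean

# Hypothetical observed values and model predictions.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.5, 5.5, 6.5, 9.5]

# Residual = difference between observed and predicted value.
residuals = [a - p for a, p in zip(actual, predicted)]

# R-squared = 1 - (residual sum of squares / total sum of squares).
ss_res = sum(r ** 2 for r in residuals)
ss_tot = sum((a - mean(actual)) ** 2 for a in actual)
r_squared = 1 - ss_res / ss_tot

print(residuals)            # [0.5, -0.5, 0.5, -0.5]
print(round(r_squared, 2))  # 0.95
```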

9) Which of the following statements is inaccurate about Logistic Regression?


a) Logistic Regression doesn’t require a linear relationship between the dependent and
independent variables.
b) The value of the logistic function being a probability will range between 0 and 1.
c) Cost function of Logistic Regression is also called the Log Loss function.
d) The dependent variable can be of both numerical or categorical type just like the
independent variables.

Answer: d) Logistic Regression requires a categorical dependent variable.

10) In a KNN model, by which means do we handle categorical variables?

a) Standardization
b) Dummy variables
c) Correlation
d) None of the above

Answer: b) Dummy variables can be used to encode the different values contained in a particular
categorical independent feature.
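
Dummy-variable encoding can be sketched with `pandas.get_dummies`. `Htype` is one of the dataset's columns, but the category labels below are hypothetical.

```python
import pandas as pd

# Hypothetical category labels for the Htype feature.
df = pd.DataFrame({"Htype": ["own", "rent", "free", "own"]})

# Each distinct category becomes its own indicator column.
dummies = pd.get_dummies(df, columns=["Htype"])

dummy_columns = sorted(dummies.columns)
print(dummy_columns)  # ['Htype_free', 'Htype_own', 'Htype_rent']
```

Distance-based models such as KNN need numeric inputs, which is why the encoded indicator columns replace the original text column.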
