
Python Lab3

The document provides instructions for Lab 3 in a Programming for Data Science course. It lists required pre-lab reading and exercises students must complete for the lab. The exercises involve writing Python functions to analyze names, classify eggs by weight, find Supreme Court justices appointed by a president, calculate an interpolated median grade, determine fuel economy from mileage data, and translate phrases to different languages using a dictionary file. Students must submit their code on GitHub and have teaching assistants review selected exercises for points.

Uploaded by

narendra ravuri

Lab 3

Programming for Data Science, Summer 2023

Before You Begin


Before you begin, you should:

• Read Chapters 7 - 9 of your Gaddis text

• Read 3.1 of your McKinney text

• Complete the Lab 3 lecture

• (Optional) Complete the MyProgrammingLab exercises for Gaddis Chapters 7 - 9

Permitted Resources
You may use the following to complete this lab:

• Lecture materials, examples, reference materials, websites, and any other materials directly linked to
from our course website

• Your textbook

• Personal notes taken during class

• Utilities, such as Excel or a calculator for checking your results, or translation software (if English is
not your first language)

You may not use any other resources when completing this lab. Specifically, you may not use:

• Resources not directly linked to from the course website

• General purpose search engines, such as Google, Yahoo!, Bing, etc.

• Artificial intelligence tools, such as ChatGPT

• Q&A / homework help sites such as stackoverflow.com, coursehero.com, chegg.com, and many others

If you have any doubt as to whether you should be using a particular resource, please ask.

Instructions
Work with your team to complete the following exercises and submit via GitHub. Be sure to name your files
using the convention ex1.py, ex2.py, etc. File and folder names must not contain capital letters or spaces. If you
don't follow this convention, GitHub will not evaluate your exercise.

Review Your Solutions with Staff for Credit
For each exercise marked Review, after GitHub validates your solution, review the solution with the course
staff to earn credit for the exercise. To earn full points on each exercise, you must review the exercise with
the course staff prior to the due date posted in the course website. If you need more time to complete the
exercises, discuss your plan with the instructor before the published due date/time.

The course staff will look at the following for each exercise:

• Correctness and adherence to the exercise requirements.

• Knowledge of data types and use of conversion functions.

• Output formatting. The instructions for each exercise provide sample output.

• Comments that add value, where needed.

I provide additional exercises throughout the lab for practice. You may complete these exercises and submit
to GitHub for validation. We are happy to review your solution to these additional exercises, but there are
no points associated with completing and reviewing these exercises.

Reflect on Your Effort


When you finish all exercises, we will provide instructions to assess and reflect upon your effort in completing
this lab. This reflection will count towards your overall score for the lab.

Academic Integrity Guidance


Please do:

• Use only permitted resources

• Exchange thoughts and ideas with any other students in the class, not just the students on your team

• Show / distribute your code or solutions to students on your team

Do not:

• Use resources other than the permitted resources listed above

• Use concepts that you don't understand

• Show / distribute your code or solutions to students outside of your team (without instructor permission)

© 2023 Ken Reily. Do not distribute or reproduce without permission.


Exercises
ex1.py
(Review, 5 pts) Write a function named duplicate_names that accepts a list of names as its argument.
Return a list of those names that appear in the input list more than once:

In [1]: list_of_names = ['Bob', 'Alice', 'Jerry', 'Jerry', 'Nathan', 'Alice']

In [2]: duplicate_names(list_of_names)
Out[2]: ['Alice', 'Jerry']

To pass our review and earn credit for this exercise, your solution should use the histogramming technique
from the Lab 3 Lecture.
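As a rough illustration of what a histogram-based approach might look like, the sketch below counts each name in a dictionary and then keeps the names whose count exceeds one. It is only one possible reading of the requirement, not the reference solution. Returning the names in alphabetical order is an assumption inferred from the sample output.

```python
def duplicate_names(names):
    """Return the names that appear more than once in the input list."""
    # Build a histogram: each name maps to the number of times it appears.
    counts = {}
    for name in names:
        counts[name] = counts.get(name, 0) + 1

    # Keep names seen more than once. Sorting is an assumption based on
    # the alphabetical order shown in the sample output.
    return sorted(name for name, count in counts.items() if count > 1)


list_of_names = ['Bob', 'Alice', 'Jerry', 'Jerry', 'Nathan', 'Alice']
print(duplicate_names(list_of_names))
```

The dictionary makes a single pass over the list, so each name is counted without rescanning the list for every element.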

ex2.py
The US Department of Agriculture (USDA) categorizes eggs by weight:

Grade          Min. Weight (oz)
small          1.5
medium         1.75
large          2
extra large    2.25
jumbo          2.5

Eggs weighing less than 1.5oz do not fall into any category. Write a function named classify_eggs that
accepts a list containing the weight of each egg. Return a dictionary of egg classifications:

In [1]: egg_weights = [1.21, 1.82, 1.9, 1.31, 2.45, 2.2, 1.4, 2.74, 2.99, 2.38]

In [2]: classify_eggs(egg_weights)
Out[2]: {'Extra Large': 2, 'Jumbo': 2, 'Large': 1, 'Medium': 2}
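The classification can be sketched as a lookup against the minimum weights, checked from heaviest to lightest. This is a minimal sketch under two assumptions drawn from the sample output: grade names are written in title case, and grades with no eggs are left out of the result. The constant name GRADE_MINIMUMS is hypothetical.

```python
# Minimum weights in ounces, ordered from heaviest to lightest so that the
# first threshold an egg meets is its grade. (Hypothetical constant name.)
GRADE_MINIMUMS = [
    ('Jumbo', 2.5),
    ('Extra Large', 2.25),
    ('Large', 2.0),
    ('Medium', 1.75),
    ('Small', 1.5),
]


def classify_eggs(weights):
    """Count how many eggs fall into each USDA weight grade."""
    counts = {}
    for weight in weights:
        for grade, minimum in GRADE_MINIMUMS:
            if weight >= minimum:
                counts[grade] = counts.get(grade, 0) + 1
                break  # Eggs under 1.5 oz never reach this point.
    return counts


egg_weights = [1.21, 1.82, 1.9, 1.31, 2.45, 2.2, 1.4, 2.74, 2.99, 2.38]
print(classify_eggs(egg_weights))
```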

ex3.py
(Review, 5 pts) Build on your appointed_by function from the previous lab by writing a new function named
justices_appointed_by that accepts two arguments (a filename and a President's name). Return a list of
all US Supreme Court justices appointed by that President. Like the previous version, your function should
support partial searches:

In [1]: justices_appointed_by('justices.csv', 'bush')


Out[1]: ['Samuel Alito', 'John Roberts', 'David Souter', 'Clarence Thomas']



ex4.py
(Review, 5 pts) To ensure consistency in grading among courses, the Carlson School of Management has a
grading policy based on an interpolated median. This policy requires that the interpolated median of final
letter grades for this course fall in the range of 3.33 ± .2 (or about a B+). The formula for the interpolated
median is:

\[ IM = M - .167 + \frac{.5N - n_1}{n_2} \times .333 \]

Where:

• M is the standard median

• N is the total number of grades

• n1 is the number of grades less than the standard median

• n2 is the number of grades equal to the standard median

If n2 = 0, then IM = M . Write a function named interpolated_median that accepts a list of letter grades
and returns the interpolated median. Here is a list you can copy/paste for testing:

grades = ['A-', 'B', 'C', 'A', 'C', 'A', 'B', 'C-', 'C+', 'B-',
          'B-', 'A-', 'B', 'B+', 'C-', 'C+', 'C', 'B-', 'A', 'C',
          'C', 'B-', 'A', 'C-', 'C']

...and here is a sample call to the function using the above list:

In [1]: interpolated_median(grades)
Out[1]: 2.627875

NOTE: In the US, letter grades generally run on a 4.0 scale with A = 4.0, A- = 3.67, B+ = 3.33, B = 3.0,
B- = 2.67, etc. There is no A+ or D- grade at the U of Minnesota, and a grade of F earns 0 grade points.
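To make the formula concrete, here is a minimal sketch that converts letter grades to grade points and applies the interpolated median. Several details are assumptions rather than stated requirements: the grade points below B- extend the pattern in the note above (C+ = 2.33, C = 2.0, and so on), the constant name GRADE_POINTS is hypothetical, and the standard median comes from statistics.median, which averages the two middle values for an even number of grades.

```python
from statistics import median

# Grade points on a 4.0 scale. Values below B- are assumed by extending
# the pattern given in the note. (Hypothetical constant name.)
GRADE_POINTS = {
    'A': 4.0, 'A-': 3.67,
    'B+': 3.33, 'B': 3.0, 'B-': 2.67,
    'C+': 2.33, 'C': 2.0, 'C-': 1.67,
    'D+': 1.33, 'D': 1.0,
    'F': 0.0,
}


def interpolated_median(grades):
    """Return the interpolated median of a non-empty list of letter grades."""
    points = [GRADE_POINTS[grade.strip()] for grade in grades]
    m = median(points)                         # M: standard median
    n = len(points)                            # N: total number of grades
    n1 = sum(1 for p in points if p < m)       # grades below the median
    n2 = sum(1 for p in points if p == m)      # grades equal to the median
    if n2 == 0:
        return m
    return m - 0.167 + (0.5 * n - n1) / n2 * 0.333
```

For the sample list, M is 2.67 (B-), N is 25, n1 is 11, and n2 is 4, so the result is 2.67 - .167 + (12.5 - 11) / 4 × .333, or approximately 2.627875.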

ex5.py
The file mileage.txt contains the results of a fuel economy study. The researchers drove each model of car
100 miles and recorded the fuel consumption (in gallons). Some models were tested more than once. Write a
function named mpg that accepts one argument (a filename) and returns a dictionary of fuel economy where
the key is the model name and the value is the miles per gallon (mpg) to the nearest tenth:

In [1]: mpg('mileage.txt')
Out[1]:
{'Accord': 23.4,
'Camry': 25.0,
'Mustang': 19.0,
'Prius': 45.5,
'Sebring': 23.8}



ex6.py
(Review, 5 pts) Expand on the rudimentary translator example from this week's lecture material. Write a
function named translate that accepts 4 arguments: a dictionary file, a phrase or sentence, the language
to translate from, and the language to translate to. Your function should return a rough rendering of the
phrase or sentence in the new language:

In [1]: translate('dictionary.csv', 'My pencil is on the table.', 'EN', 'FR')


Out[1]: 'Mon crayon est sur la table.'

In [2]: translate('dictionary.csv', 'Yes the table is yellow.', 'EN', 'DE')


Out[2]: 'Ja dem tisch ist gelb.'

In [3]: translate('dictionary.csv', 'My sick friend often has NO hat on.', 'EN', 'DE')
Out[3]: 'Mein krank freund oft hat NEIN hut auf.'

In [4]: translate('dictionary.csv', 'Mon Chapeau', 'FR', 'EN')


Out[4]: 'My Hat'

Note:

• The first line of the file contains the ISO codes of each language, which are used as the from and to
arguments of the function. Your function will not be tested with any language codes that do not exist
in the file, so there is no need to spend time on this case.

• The sample dictionary file contains 3 different languages (English, French, German) and 17 different
words. Your solution should not make any assumptions about the number of languages or the number
of words in the file. Modify the file, or make your own, to test different scenarios.

• You may assume that the phrase or sentence contains no punctuation, or only contains a period (full
stop) at the end. Your function should preserve the punctuation when translating.

• Don't make any assumptions about the capitalization of words in the dictionary. Instead, your program
should preserve capitalization in its translation. See the examples above.

• Words not in the dictionary should be rendered in their original form. This allows for the preservation
of proper nouns (e.g., Python).
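The notes above can be sketched roughly as follows. The layout of the dictionary file is an assumption: the sketch treats it as a comma-separated file with one column per language and one row per word. The helper match_case is hypothetical, and its three-way rule (all capitals, initial capital, lowercase) is inferred from the sample calls.

```python
import csv


def match_case(source_word, translated):
    """Apply the capitalization pattern of source_word to translated."""
    if len(source_word) > 1 and source_word.isupper():
        return translated.upper()
    if source_word[:1].isupper():
        return translated.capitalize()
    return translated.lower()


def translate(dict_file, phrase, from_lang, to_lang):
    """Translate phrase word by word using a CSV dictionary file."""
    # Assumed layout: header row of ISO codes, then one row per word.
    with open(dict_file, newline='', encoding='utf-8') as f:
        rows = list(csv.reader(f))
    header = [code.strip() for code in rows[0]]
    src, dst = header.index(from_lang), header.index(to_lang)

    # Case-insensitive lookup from source word to target word.
    lookup = {}
    for row in rows[1:]:
        if len(row) > max(src, dst):
            lookup[row[src].strip().lower()] = row[dst].strip()

    # Set aside a trailing period so it can be restored afterwards.
    ending = ''
    if phrase.endswith('.'):
        ending = '.'
        phrase = phrase[:-1]

    result = []
    for word in phrase.split():
        target = lookup.get(word.lower())
        # Unknown words (e.g. proper nouns) pass through unchanged.
        result.append(word if target is None else match_case(word, target))
    return ' '.join(result) + ending
```

Looking up the column positions from the header row, rather than hard-coding them, keeps the sketch independent of how many languages or words the file contains.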



ex7.py
Write a function named life that accepts two arguments: an initial state, and a number of generations. Your
function should implement the rules of Conway's Game of Life and return a list object representing the board
after the specified number of generations. Read the linked Wikipedia article for an in-depth description of
the game, the specific rules, and interesting categories of test cases. In our implementation, we represent the
initial state with a 2-dimensional board (a list of lists) where 1 represents a living individual and 0 represents
an empty cell. First, a trivial example with only 1 individual, who would die after 1 generation. With 0
generations, the function returns the initial state:

In [1]: b = [[1]]

In [2]: life(b, 0)
Out[2]: [[1]]

A more interesting example on a 6x6 grid, after 1 and 2 generations:

In [1]: b = [[0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0],
             [0, 1, 1, 1, 1, 0],
             [0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0],
             [0, 0, 0, 0, 0, 0]]

In [2]: life(b, 1)
Out[2]:
[[0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]

In [3]: life(b, 2)
Out[3]:
[[0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 1, 0, 0, 1, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0]]

Note:

• Your function should support arbitrary board sizes from 1×1 to n × n.

• The Game of Life is a common exercise in introductory programming courses. Solutions are widely
available on the Internet. Please remember the rules for permitted resources (listed above), and be
sure the code you submit is uniquely your own. We will be carefully checking your submission for this
exercise.

  - Specifically, avoid use of the copy and deepcopy methods. Most solutions available on the Internet
use these methods. However, these methods unnecessarily complicate the problem and using them
highly suggests copying.
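For orientation only, the rules can be sketched by building a brand-new board for each generation from the neighbor counts of the previous one, which avoids copy and deepcopy entirely. The sketch assumes that cells beyond the edge of the board are always empty (the board does not wrap around). The helper names live_neighbors and next_cell are hypothetical.

```python
def live_neighbors(board, row, col):
    """Count living cells among the up-to-eight neighbors of (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # Skip the cell itself.
            r, c = row + dr, col + dc
            # Assumption: anything off the board counts as empty.
            if 0 <= r < len(board) and 0 <= c < len(board[r]):
                total += board[r][c]
    return total


def next_cell(board, row, col):
    """Apply Conway's rules to one cell and return its next state."""
    neighbors = live_neighbors(board, row, col)
    if board[row][col] == 1:
        return 1 if neighbors in (2, 3) else 0  # Survival
    return 1 if neighbors == 3 else 0           # Birth


def life(board, generations):
    """Return the board after the given number of generations."""
    # Start from a fresh list of lists so the caller's board is untouched.
    current = [[cell for cell in row] for row in board]
    for _ in range(generations):
        current = [[next_cell(current, r, c) for c in range(len(current[r]))]
                   for r in range(len(current))]
    return current
```

Because every generation is computed into a new list of lists, each cell's next state depends only on the previous generation, and the input board is never modified between calls.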



ex8.py
The file usstates.json contains JSON-formatted data about each of the US states. It is the same data
from the previous exercises, just reformatted for JSON. Write a function named state_density_rank that
accepts one argument (an integer in the range 1-50) and returns the state of that rank in population density.
For example, New Jersey has the highest population density in the US (ranked #1) and Alaska has the
lowest (ranked #50):

In [1]: state_density_rank(1)
Out[1]: u'New Jersey'

In [2]: state_density_rank(50)
Out[2]: u'Alaska'

HINT: Try loading the JSON data in the interactive prompt first in order to explore its structure, then write
your program.

NOTE: The u in front of each string here indicates that these are Unicode strings, which is the default when
reading JSON data. You shouldn't need to do anything out of the ordinary to support this.

ex9.py
(Review, 5 pts) The US National Weather Service measures cloud cover in percentage terms (0% = clear,
100% = overcast). The file minneapolis.json contains 48 weather observations of Minneapolis provided by
openweathermap.org. Write a function named avg_temp that will return a tuple containing the average
temperature (in degrees C and F) for all observations in the set that have the percentage cloud cover
determined by the argument. For example, the first call returns the average temperature (in degrees C and
F) for clear observations and the second call returns the average temperature for observations with 40%
cloud cover:

In [1]: avg_temp(0)
Out[1]: (21.786666666666672, 71.21600000000002)

In [2]: avg_temp(40)
Out[2]: (22.13363636363639, 71.84054545454549)

In [3]: avg_temp(97)

Note that there are no observations with 97% cloud cover, so the function in this case returns the Python
special value of None. openweathermap.org provides a key that you may find helpful in understanding the
file structure.
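The averaging and unit conversion can be sketched separately from the file handling. The sketch below deliberately does not read minneapolis.json, since its structure is not described here. Instead, the hypothetical helper average_temps takes a list of (cloud cover percentage, temperature in degrees C) pairs, which is an assumed intermediate form. The only fixed fact it relies on is the conversion F = C × 9/5 + 32.

```python
def average_temps(observations, cloud_cover):
    """Average temperature in (C, F) for observations with this cloud cover.

    observations is an assumed list of (cloud_percent, temp_celsius) pairs.
    Returns None when no observation matches.
    """
    temps = [temp_c for cover, temp_c in observations if cover == cloud_cover]
    if not temps:
        return None
    avg_c = sum(temps) / len(temps)
    avg_f = avg_c * 9 / 5 + 32  # Celsius to Fahrenheit
    return (avg_c, avg_f)


# Made-up sample values, for illustration only.
sample = [(0, 20.0), (0, 22.0), (40, 25.0)]
print(average_temps(sample, 0))
print(average_temps(sample, 97))
```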



Exercise 10 (Challenge Problem)
(Review, 5 pts) This is a designated challenge problem, intended to stretch your thinking and logic skills.
As you encounter challenge problems in the class, please keep things in perspective: these problems are worth
only a small number of points.

Did you know arm-wrestling is a competitive sport? Based in Chicago, the World Armwrestling League is
the largest and fastest growing professional arm-wrestling league in the world.

• A contest consists of a 60-second period in which wrestlers A and B attempt to pin each other's arm
to the table.

  - A wrestler earns one point during the contest for pinning the other wrestler's arm to the table.

  - Each wrestler can score a maximum of one point during each contest. If both wrestlers score a
point before the 60 seconds expires, then the contest ends.

• A match consists of 5 to 15 contests, depending on the format of the match.

As in many sports, gambling is a large source of economic activity in professional arm-wrestling. One gambling
scheme looks like this:

1. Each gambler places a bet on one or more single-digit numbers in the range 0 - 9.

2. After each contest, gamblers earn a payout if the last digit of the total score is a digit on which they
placed a bet.

For example, you place a $10 bet on the number 3. After two contests, the score is 2-1 (total of 3) and you
earn a $10 payout. After 10 contests, the score is 10 - 3 (total of 13) and you earn another $10 payout. Now
you've doubled your money! Good job.

Having taken this Python course, you want to take a more analytical approach to placing your arm-wrestling
bets. Write a function named payouts that accepts four arguments:

1. The probability pa that wrestler A scores a point during a contest.

2. The probability pb that wrestler B scores a point during a contest.

3. The number of contests (n) in a match.

4. The amount of the dollar bet placed on each digit.

Figure 1 shows example probabilities for placing a $5 bet with two evenly-matched competitors (pa = pb = .5)
playing a match consisting of 9 contests. We determine the probabilities using a straightforward analysis of
each contest. For example, there are four possible outcomes of the first contest:

• Neither wrestler scores a point

• Wrestler A scores a point, but wrestler B does not

• Wrestler B scores a point, but wrestler A does not

• Both wrestlers score a point

Since two of the four scenarios end in a total score of 1, we see that the number 1 has a higher probability
of winning (p = .5) over the numbers 0 and 2. We obtain subsequent probabilities by multiplying the prior
probabilities. For example, the probability of digit 0 after contest two is equal to the probability of digit 0
after contest 1 times the probability of neither wrestler scoring a point in contest two.
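One way to turn this analysis into code is to track the probability of each last digit after every contest and add those probabilities up across the match. The sketch below is an illustration under stated assumptions, not the reference solution. It assumes the two wrestlers score independently within a contest, and that the payout for a digit is the bet multiplied by the expected number of contests after which that digit is the last digit of the total score.

```python
def payouts(pa, pb, n, bet):
    """Expected payout for a bet on each last digit 0-9 over n contests."""
    # Probability that a single contest adds 0, 1, or 2 points to the total
    # (assumes A and B score independently of each other).
    p_zero = (1 - pa) * (1 - pb)
    p_one = pa * (1 - pb) + pb * (1 - pa)
    p_two = pa * pb

    # digit_probs[d]: probability the total score currently ends in d.
    digit_probs = [1.0] + [0.0] * 9
    expected_wins = [0.0] * 10

    for _ in range(n):
        updated = [0.0] * 10
        for digit, prob in enumerate(digit_probs):
            updated[digit] += prob * p_zero
            updated[(digit + 1) % 10] += prob * p_one
            updated[(digit + 2) % 10] += prob * p_two
        digit_probs = updated
        # Each contest is a separate chance for every digit to pay out.
        for digit in range(10):
            expected_wins[digit] += digit_probs[digit]

    return {digit: bet * expected_wins[digit] for digit in range(10)}
```

Only the last digit of the total matters for the bet, so the sketch keeps ten probabilities rather than the full score distribution, and wraps the digit with modulo 10 once totals reach double digits.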



Figure 1: Example probabilities when both pa and pb are .5 for 9 contests and a $5 hypothetical wager.

According to our spreadsheet, at the end of this game the friend who bought number 2 should have the
highest payout. Let's check this against the output of our function:

In [1]: payouts(.5, .5, 9, 5)


Out[1]:
{0: 3.5031700134277344,
1: 5.510292053222656,
2: 5.705814361572266,
3: 5.140533447265625,
4: 5.051727294921875,
5: 4.884185791015625,
6: 4.662799835205078,
7: 4.236488342285156,
8: 3.576488494873047,
9: 2.7285003662109375}



Another useful example is to check the boundary cases. In the case where both probabilities are 0, neither
wrestler will score during the match and whoever bought number 0 will earn a large payout of $50 on a $10
initial investment in a match of 5 contests:

In [1]: payouts(0, 0, 5, 10)


Out[1]:
{0: 50.0,
1: 0.0,
2: 0.0,
3: 0.0,
4: 0.0,
5: 0.0,
6: 0.0,
7: 0.0,
8: 0.0,
9: 0.0}

If the probabilities are both 1, each wrestler will always score in each contest. The score after the first contest
will total 2, after the second contest will total 4, and so on. Whoever bought even numbers will earn money,
while gamblers betting on odd numbers won't win anything:

In [1]: payouts(1, 1, 10, 20)


Out[1]:
{0: 40.0,
1: 0.0,
2: 40.0,
3: 0.0,
4: 40.0,
5: 0.0,
6: 40.0,
7: 0.0,
8: 40.0,
9: 0.0}

In our final scenario, wrestler A is much stronger (.8) than wrestler B (.2), and the number 1 earns the highest
payout for a $25 wager in a match consisting of 15 contests:

In [1]: payouts(.8, .2, 15, 25)


Out[1]:
{0: 29.996431560599188,
1: 48.59117987571765,
2: 48.55668137154,
3: 46.345217251069506,
4: 43.208807021500775,
5: 39.040897034253426,
6: 34.578206955126696,
7: 30.66292170112848,
8: 27.851048830326913,
9: 26.16860839873718}

