Building Your Own Supervised Learning Model: This Project (Both Part 1 and 2) Is Due Friday, Feb. 26, 11:59 PM PST

This document provides instructions for a two-part project to build a supervised machine learning model using Google's Teachable Machine and then introduce bias into the model. For part 1, students are asked to build a classification model with at least 15 samples per class for training and 5 for testing. They must submit their model, test data, and a write up describing the model, datasets, and test results. For part 2, students introduce bias into the model by changing the data or model, test it on the original test set, and submit the biased model and a write up analyzing the introduced bias and its effects on accuracy as well as mitigation strategies and real world examples of related biases.

Uploaded by

George Keru


Building your own supervised learning model

In this project, you will build your own supervised machine learning model and investigate
potential sources of bias. This project does not require any coding experience! The template is for
the write-up submission.

This project (both Part 1 and 2) is due Friday, Feb. 26, 11:59 pm PST.

● Part 1
○ Instructions
○ Deliverables
● Part 2
○ Instructions
○ Deliverables
● Rubric
● Submission template
● Gradescope submission page

Part 1

In your group,1 use Google’s Teachable Machine to create a machine learning model to
classify images, human poses, or sounds. You may want to follow along with the tutorial
here (which creates an image model that can tell if a banana is ripe or not).

Here are the suggested steps to building the machine:

1. Determine the goal of your classifier: Your classifier could contribute to society or be completely just-for-fun. In either case, consider who might use your model, and how it might be used. Will you classify images, sounds, or poses? How many classes do you want the model to be able to identify, and what are the classes?

2. Assemble a data set.2 It is easiest to upload and label all your files in a Google Drive folder, which you will then import into your Teachable Machine in Step 3. Make sure you keep track of the source for each of your files! (See footnote for more information)3

a. Training: Your training data set should have at least 15 samples per class.4 You can use files from the internet or your webcam to create your training set. Upload your samples for each class to the Teachable Machine. Check out the video here.

b. Testing: Your testing data set (different from your training set!) should have at least 5 samples per class.5 Be sure to make this data set as representative (i.e., not biased) as possible. This test set will remain the same between Part 1 and 2.

3. Train your model: Upload only the training data to your Teachable Machine and click on “Train Model”. Check out the video here.

4. Test your model using your testing data set: On the right side of the screen you should see a window with the title “Preview”. Click on the dropdown menu and select “File” (red box in the screenshot below) and upload your test samples one at a time.6 Record the observations (to which class was each sample classified, and at what probability?).

5. You may want to iterate on your model to improve accuracy: See what happens if you change your number of training data points, epochs, batch size, or learning rate. Aim to get at least 80% accuracy.

6. Save your model to Google Drive (red box in the screenshot below; “Save project to Drive”). This saves a .zip file that contains all the samples in each of your classes to Drive. You can then open that .zip again from Teachable Machine later to pick up where you left off. Check out the video here.7

1 If you would like our help to find a group, submit this form by Friday Feb. 19.

2 You may NOT use a pre-curated dataset (from Kaggle, Google, etc.); you need to construct your OWN dataset.

3 Make sure you keep track of the source for each of your files, as you will lose points if you’re missing the source link for any of them! You CANNOT say you got the file from Google or the internet; you must provide the exact link where you found the file. Tip: One easy way to do this (for images) is to create a Google Doc where you paste the image file and its URL, and then download the Google Doc as a Zipped HTML file. If you open that, you will see an “Images” folder with all the images inside.

4 If you are using audio, please use 15 different kinds of audio files (with each file at least 3-5 seconds long).

5 If you are using audio, please use 5 different kinds of audio files (with each file at least 3-5 seconds long).

6 Note: If you decided to train your model on audio instead of images, you don’t have the option to select files from your computer; you can only input from your microphone. Please create test audio clips and save them onto your computer, then play them out loud while the built-in mic is running.

7 Do NOT use “Export Model” in the preview screen; this only shares the model without showing any training data.
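The project itself needs no coding, but groups comfortable with Python could sanity-check their data sets against the requirements above before uploading. The following is only a sketch: the function name `check_dataset`, the (file name, class label) list layout, and the banana file names are illustrative assumptions, not part of the assignment or of Teachable Machine. It compares files by name only, so the same image saved under two different names would go unnoticed.

```python
from collections import Counter

MIN_TRAIN_PER_CLASS = 15  # minimum from the Training requirement
MIN_TEST_PER_CLASS = 5    # minimum from the Testing requirement


def check_dataset(train, test):
    """Return a list of problems; an empty list means the sets pass.

    `train` and `test` are lists of (file_name, class_label) pairs.
    This layout is an assumption made for the sketch.
    """
    problems = []
    train_counts = Counter(label for _, label in train)
    test_counts = Counter(label for _, label in test)

    # Every class needs enough samples in both sets.
    for label in sorted(set(train_counts) | set(test_counts)):
        if train_counts[label] < MIN_TRAIN_PER_CLASS:
            problems.append(f"{label}: only {train_counts[label]} training samples")
        if test_counts[label] < MIN_TEST_PER_CLASS:
            problems.append(f"{label}: only {test_counts[label]} testing samples")

    # The testing set must be different from the training set.
    shared = sorted({name for name, _ in train} & {name for name, _ in test})
    for name in shared:
        problems.append(f"{name}: appears in both training and testing sets")
    return problems


# Hypothetical two-class data set (ripe vs. unripe bananas, as in the tutorial).
train = [(f"ripe_{i}.jpg", "ripe") for i in range(15)]
train += [(f"unripe_{i}.jpg", "unripe") for i in range(12)]
test = [(f"ripe_test_{i}.jpg", "ripe") for i in range(5)]
test += [(f"unripe_test_{i}.jpg", "unripe") for i in range(4)]
test.append(("ripe_0.jpg", "ripe"))  # deliberately reused training file

for problem in check_dataset(train, test):
    print(problem)
```

In this made-up example the check reports three problems: too few "unripe" samples in each set, and one file that appears in both sets.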

Part 1 Deliverables

Use this template for your deliverables. If you are working in a group, please make sure that
each person in your group submits the deliverables on Canvas. The write-up should be each
student’s own work; i.e., you may discuss your answers with your group or other students but
you may not share any written materials.

● A Google Drive link to your project (the .zip file for the Teachable Machine model that you saved in the “Save your model” step above) from Part 1. Please make sure that you set the link to be accessible to anyone with a Stanford email.

● A Google Drive link with your test set (folder with test data, which will remain the same for Part 1 and Part 2). Please make sure that you set the link to be accessible to anyone with a Stanford email.

● A write-up answering the following:

○ [1 paragraph] Description of model: What is the computational goal of your Teachable Machine (i.e., the thing you want the model to be able to do)? How many classes do you want the model to be able to identify, and what are the classes? What is a potential use case (e.g., how might the algorithm be used)?

○ [1 paragraph] Description of dataset: How did you assemble your datasets? How did you make sure that the datasets are as representative (and not biased) as they can be? Describe the sources of your data for both training and testing.
■ Include screenshots and descriptions for at least three training data points per class.
■ Include screenshots and descriptions for at least two testing data points per class.

○ [1 paragraph] Analysis of results:
■ What was the accuracy on your test data set? (number of correct predictions / number of total test samples)
■ Select at least one instance of successful prediction and one instance of failed prediction from the test data set. Provide your own hypothesis about the reason for the success/failure.
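The accuracy formula above (number of correct predictions / number of total test samples) can be worked through by hand, or with a short script like this sketch. The observation records, file names, and probabilities below are invented for illustration; in practice each entry would come from what you recorded in the “Preview” window.

```python
def accuracy(observations):
    """Fraction of test samples whose predicted class matches the true class."""
    if not observations:
        raise ValueError("no test observations recorded")
    correct = sum(1 for obs in observations if obs["predicted"] == obs["actual"])
    return correct / len(observations)


# Hypothetical observations: one entry per test sample, with the predicted
# class and the probability reported for that class.
observations = [
    {"file": "ripe_test_0.jpg", "actual": "ripe", "predicted": "ripe", "probability": 0.97},
    {"file": "ripe_test_1.jpg", "actual": "ripe", "predicted": "ripe", "probability": 0.88},
    {"file": "ripe_test_2.jpg", "actual": "ripe", "predicted": "unripe", "probability": 0.61},
    {"file": "unripe_test_0.jpg", "actual": "unripe", "predicted": "unripe", "probability": 0.93},
    {"file": "unripe_test_1.jpg", "actual": "unripe", "predicted": "unripe", "probability": 0.79},
]

print(f"Accuracy: {accuracy(observations):.0%}")

# Listing the failures gives candidates for the success/failure hypothesis.
failures = [obs for obs in observations if obs["predicted"] != obs["actual"]]
for obs in failures:
    print(f"Misclassified {obs['file']} as {obs['predicted']} ({obs['probability']:.0%})")
```

With these made-up records, 4 of the 5 predictions are correct, so the accuracy is 4 / 5 = 80%.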

Part 2

After you have completed your Teachable Machine model in Part 1, you and your group will
attempt to intentionally introduce or exacerbate existing algorithmic bias.
Here are the suggested steps:

1. Make a copy of your Teachable Machine from Part 1.

2. Read Week 6 Reading 2: 2019 Brookings report on algorithmic bias detection and mitigation (link here).

3. Introduce or exacerbate algorithmic bias in your model: Change your data set and/or Teachable Machine model to intentionally highlight one of the forms of algorithmic bias named in the Brookings report (or a bias that has some similarity/connection to algorithmic bias described in the Brookings report). (Your model should still have at least 15 samples per class, but you don’t have to replace all of them; changing some is enough.)

4. Test your model: Test your model again, using the same testing data set from Part 1. Record the observations.

5. Save your new model to Google Drive. This saves a .zip file that contains all the samples in each of your classes to Drive. You can then open that .zip again from Teachable Machine later to pick up where you left off.
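Because the test set stays the same between Part 1 and Part 2, the two sets of observations can be compared class by class to see where the introduced bias hurts most. The sketch below is one possible way to do that, not a required part of the project; the function name `per_class_accuracy` and the result lists are hypothetical.

```python
from collections import defaultdict


def per_class_accuracy(results):
    """Map each true class to the fraction of its test samples predicted correctly.

    `results` is a list of (actual_class, predicted_class) pairs.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for actual, predicted in results:
        totals[actual] += 1
        if predicted == actual:
            correct[actual] += 1
    return {label: correct[label] / totals[label] for label in totals}


# Hypothetical results on the SAME test set for the Part 1 and Part 2 models.
part1 = [("ripe", "ripe")] * 5 + [("unripe", "unripe")] * 4 + [("unripe", "ripe")]
part2 = [("ripe", "ripe")] * 5 + [("unripe", "unripe")] + [("unripe", "ripe")] * 4

before = per_class_accuracy(part1)
after = per_class_accuracy(part2)
for label in sorted(before):
    change = after[label] - before[label]
    print(f"{label}: {before[label]:.0%} before, {after[label]:.0%} after ({change:+.0%})")
```

A drop concentrated in one class, as in this invented example, is the kind of pattern worth illustrating with misclassified samples in the write-up.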

Part 2 Deliverables

Use this template for your deliverables.

● A Google Drive link to your updated project (Teachable Machine .zip file) from Part 2.

● A write-up answering:

○ [1 paragraph] Description of bias: What type of algorithmic bias did you introduce/exacerbate in your machine learning model? Cite the Brookings paper when possible. What did you change in order to cause this bias?

○ [1 paragraph] Impact on accuracy: How did the bias you introduced/exacerbated in the data set affect your model's performance? Report the test results from the “Test your model” step. Include screenshots and descriptions of at least 2 egregiously misclassified data points to illustrate your point.

○ [1 paragraph] Reflection on algorithmic bias:
■ What are some forms of bias detection (suggested in the Brookings article) that would help reduce bias in this situation? Speculate on how successful this mitigation might be in reducing bias over the long term.
■ Describe a real-world case of algorithmic bias that is related to the bias you used in Part 2. How might this type of bias potentially lead to harm or negative consequences? Illustrate with the real-world example (and cite a source if you find something that actually happens/happened in real life!) and/or your own Teachable Machine as a case study.