Building Your Own Supervised Learning Model: This Project (Both Part 1 and 2) Is Due Friday, Feb. 26, 11:59 PM PST
In this project, you will build your own supervised machine learning model and investigate
potential sources of bias. This project does not require any coding experience! The template is for
your write-up submission.
This project (both Part 1 and 2) is due Friday, Feb. 26, 11:59 pm PST.
● Part 1
○ Instructions
○ Deliverables
● Part 2
○ Instructions
○ Deliverables
● Rubric
● Submission template
● Gradescope submission page
Part 1
In your group, use Google’s Teachable Machine to create a machine learning model to
classify images, human poses, or sounds. You may want to follow along with the tutorial
here (which creates an image model that can tell if a banana is ripe or not).
1. Determine the goal of your classifier: Your classifier could contribute to society or be
completely just-for-fun. In either case, consider who might use your model, and how it
might be used. Will you classify images, sounds, or poses? How many classes do you
want the model to be able to identify, and what are the classes?
2. Assemble a data set.2 It is easiest to upload and label all your files in a Google Drive
folder, which you will then import into your Teachable Machine in Step 3.
Make sure you keep track of the source for each of your files! (See footnote for more
information.)3
1 If you would like our help to find a group, submit this form by Friday, Feb. 19.
2 You may NOT use a pre-curated dataset (from Kaggle, Google, etc.); you need to construct your OWN
dataset.
3 Make sure you keep track of the source for each of your files, as you will lose points if you’re
missing the source link for any of them! You CANNOT say you got the file from Google or the internet,
3. Training: Your training data set should have at least 15 samples per class.4 You can use
files from the internet or your webcam to create your training set. Upload your samples for each
class to the Teachable Machine. Check out the video here.
a. Testing: Your testing data set (different from your training set!) should have at least 5
samples per class.5 Be sure to make this data set as representative (i.e., not biased) as
possible. This test set will remain the same between Part 1 and 2.
4. Train your model: Upload only the training data to your Teachable Machine and click
on “Train Model”. Check out the video here.
5. Test your model using your testing data set: On the right side of the screen you
should see a window with the title “Preview”. Click on the dropdown menu and select
“File” (red box in the screenshot below) and upload your test samples one at a time.6
Record the observations (to which class was each sample classified, and at what
probability?).
You may want to iterate on your model to improve accuracy: See what happens if you
change the number of training data points, epochs, batch size, or learning rate. Aim to
get at least 80% accuracy.
6. Save your model to Google Drive (red box in the screenshot below; “Save project to
Drive”). This saves a .zip file that contains all the samples in each of your classes to
Drive. You can then open that .zip again from Teachable Machine later to pick up where
you left off. Check out the video here.7
you must provide the exact link where you found the file. Tip: One easy way to do this (for images) is
to create a Google Doc where you paste each image file and its URL, and then download the Google Doc
as a Zipped HTML file. If you open that, you will see an “Images” folder with all the images inside.
4 If you are using audio, please use 15 different kinds of audio files (with each file at least 3-5 seconds
long).
5 If you are using audio, please use 5 different kinds of audio files (with each file at least 3-5 seconds long).
6 Note: If you decided to train your model on audio instead of images, you don’t have the option to
select files from your computer; you can only input from your microphone. Please create test audio clips
and save them onto your computer, then play them out loud while the built-in mic is running.
7 Do NOT use “Export Model” in the preview screen -- this only shares the model without showing any
training data.
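Optional: no coding is required for this project, but if you would like to double-check the accuracy from Step 5, the following is a minimal Python sketch under stated assumptions. The sample rows, the class names, and the helper names accuracy and misclassified are made up for illustration; replace the rows with the observations you recorded from the “Preview” window.

```python
# Hypothetical observations recorded by hand from the Preview window:
# (true class, predicted class, probability reported for the predicted class).
observations = [
    ("ripe", "ripe", 0.97),
    ("ripe", "unripe", 0.61),
    ("unripe", "unripe", 0.88),
    ("unripe", "unripe", 0.93),
    ("ripe", "ripe", 0.79),
]


def accuracy(rows):
    """Return the fraction of rows whose predicted class matches the true class."""
    if not rows:
        raise ValueError("no observations recorded")
    correct = sum(1 for true, pred, _ in rows if true == pred)
    return correct / len(rows)


def misclassified(rows):
    """Return the rows where the prediction did not match the true class."""
    return [row for row in rows if row[0] != row[1]]


overall = accuracy(observations)
print(f"accuracy: {overall:.0%}")
print("meets the 80% target:", overall >= 0.80)
for true, pred, prob in misclassified(observations):
    print(f"failed: true={true}, predicted={pred}, probability={prob:.2f}")
```

Listing the failed samples alongside their probabilities may help when you write up your hypothesis about why a prediction succeeded or failed.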
Part 1 Deliverables
Use this template for your deliverables. If you are working in a group, please make sure that
each person in your group submits the deliverables on Canvas. The write-up should be each
student’s own work; i.e., you may discuss your answers with your group or other students but
you may not share any written materials.
● A Google Drive link to your project (.zip file for the Teachable Machine algorithm
that you saved in step 6 above) from Part 1. Please make sure that you set the link
to be accessible to anyone with a Stanford email.
● A Google Drive link with your test set (folder with test data, which will remain
the same for Part 1 and Part 2). Please make sure that you set the link to be
accessible to anyone with a Stanford email.
● What is the goal of your Teachable Machine (i.e., the thing you want the model to be
able to do)? How many classes do you want the model to be able to identify, and what
are the classes? What is a potential use case (e.g., how might the algorithm be used)?
What was the accuracy on your test data set? (number of correct
failed prediction from the test data set. Provide your own hypothesis
about the reason for the success/failure.
Part 2
After you have completed your Teachable Machine model in Part 1, you and your group will
attempt to intentionally introduce or exacerbate existing algorithmic bias.
Here are the suggested steps:
1.
2. Read Week 6 Reading 2: 2019 Brookings report on algorithmic bias detection and
3. Introduce or exacerbate algorithmic bias in your model: Change your data set
and/or Teachable Machine model to intentionally highlight one of the forms of algorithmic
bias named in the Brookings report (or a bias that has some similarity/connection to an
algorithmic bias described in the Brookings report). (Your model should still have at least
15 samples per class, but you don’t have to replace all of them; changing some is enough.)
4. Test your model: Test your model again, using the same testing data set from Part 1.
Record the observations.
5. Save your new model to Google Drive. This saves a .zip file that contains all the
samples in each of your classes to Drive. You can then open that .zip again from
Teachable Machine later to pick up where you left off.
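Optional: because the test set stays the same between Part 1 and Part 2, one way to see the effect of the bias you introduced is to compare accuracy per class before and after. The following is a minimal Python sketch of that comparison, not a required step. The result rows, class names, and the helper name per_class_accuracy are hypothetical; substitute your own recorded observations.

```python
from collections import defaultdict


def per_class_accuracy(rows):
    """Map each true class to the fraction of its test samples classified correctly."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for true, pred in rows:
        totals[true] += 1
        if true == pred:
            correct[true] += 1
    return {cls: correct[cls] / totals[cls] for cls in totals}


# Hypothetical (true class, predicted class) pairs on the SAME test set.
part1 = [("ripe", "ripe"), ("ripe", "ripe"),
         ("unripe", "unripe"), ("unripe", "unripe")]
part2 = [("ripe", "ripe"), ("ripe", "ripe"),
         ("unripe", "ripe"), ("unripe", "unripe")]

before = per_class_accuracy(part1)
after = per_class_accuracy(part2)
for cls in sorted(before):
    change = after.get(cls, 0.0) - before[cls]
    print(f"{cls}: {before[cls]:.0%} -> {after.get(cls, 0.0):.0%} ({change:+.0%})")
```

A drop that is concentrated in one class, rather than spread evenly across classes, is the kind of pattern worth describing in your write-up.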
Part 2 Deliverables
when possible. What did you change in order to cause this bias?
source if you find something that actually happens/happened in real life!) and/or your own
Teachable Machine as a case study.