ML Projects For Final Year
The goal of this document is to provide a common framework for approaching machine learning
projects that can be referenced by practitioners. If you build ML models, this post is for you. If
you collaborate with people who build ML models, I hope that this guide provides you with a
good perspective on the common project workflow. Knowledge of machine learning is assumed.
Overview
This overview intends to serve as a project "checklist" for machine learning practitioners.
Subsequent sections will provide more detail.
Project lifecycle
Machine learning projects are highly iterative; as you progress through the ML lifecycle, you’ll
find yourself iterating on a section until reaching a satisfactory level of performance, then
proceeding forward to the next task (which may be circling back to an even earlier step).
Moreover, a project isn’t complete after you ship the first version; you get feedback from real-
world interactions and redefine the goals for the next iteration of deployment.
3. Model exploration
• Establish baselines for model performance (a minimal baseline sketch follows this list)
• Stay nimble and try many parallel (isolated) ideas during early stages
• Find SoTA model for your problem domain (if available) and reproduce results, then apply to
your dataset as a second baseline
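As a loose illustration of the first bullet, the sketch below compares a trivial majority-class baseline against a simple learned model using scikit-learn; the stand-in dataset and model choices are assumptions for illustration, not part of the original checklist.

    # Baseline comparison sketch (assumes scikit-learn is installed;
    # load_breast_cancer is just a stand-in for your own dataset).
    from sklearn.datasets import load_breast_cancer
    from sklearn.dummy import DummyClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Baseline 1: predict the most frequent class, ignoring the features entirely.
    dummy = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

    # Baseline 2: a simple, well-understood learned model.
    simple = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    print("majority-class accuracy:", dummy.score(X_test, y_test))
    print("logistic regression accuracy:", simple.score(X_test, y_test))

Any later model should clearly beat both of these numbers before it earns more of your time.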
4. Model refinement
• Perform model-specific optimizations (i.e. hyperparameter tuning); see the sketch after this list
• Revisit Step 2 for targeted data collection and labeling of observed failure modes
• Revisit model evaluation metric; ensure that this metric drives desirable downstream user
behavior
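For the hyperparameter-tuning bullet, one possible sketch using scikit-learn's RandomizedSearchCV is shown below; the model, search space, and scoring metric are illustrative assumptions rather than recommendations.

    # Hyperparameter tuning sketch (synthetic data stands in for your own training set).
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import RandomizedSearchCV

    X_train, y_train = make_classification(n_samples=500, n_features=20, random_state=0)

    # Illustrative search space; tailor it to whichever model you actually use.
    param_distributions = {
        "n_estimators": [100, 200, 400],
        "max_depth": [None, 5, 10, 20],
        "min_samples_leaf": [1, 2, 5],
    }

    search = RandomizedSearchCV(
        RandomForestClassifier(random_state=0),
        param_distributions=param_distributions,
        n_iter=20,
        cv=5,
        scoring="f1",  # pick the metric that matches your evaluation criteria
        random_state=0,
    )
    search.fit(X_train, y_train)
    print(search.best_params_, search.best_score_)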
6. Model deployment
• Expose model via a REST API
• Deploy new model to a small subset of users to ensure everything goes smoothly, then roll out to
all users (a simple traffic-splitting sketch follows this list)
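The gradual-rollout bullet can be implemented as simple deterministic traffic splitting. The sketch below is a hypothetical helper (not part of any standard library) that routes a small, stable fraction of users to the new model based on a hash of their ID.

    # Canary-rollout sketch: route a small, stable fraction of users to the new model.
    # The function names and 5% threshold are illustrative assumptions.
    import hashlib

    def use_new_model(user_id: str, rollout_fraction: float = 0.05) -> bool:
        """Deterministically assign roughly rollout_fraction of users to the new model."""
        digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 10_000
        return bucket < rollout_fraction * 10_000

    def predict(user_id: str, features, current_model, new_model):
        """Serve from the known-good model unless this user falls in the canary bucket."""
        model = new_model if use_new_model(user_id) else current_model
        return model.predict(features)

Because the assignment is hashed from the user ID, each user consistently sees the same model version, which makes it easier to attribute any change in behavior to the new model.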
It's worth noting that defining the model task is not always straightforward. There are often many
different approaches you can take towards solving a problem, and it's not always immediately
evident which is optimal. If your problem is vague and the modeling task is not clear, jump over
to my post on defining requirements for machine learning projects before proceeding.
Prioritizing projects
Ideal: project has high impact and high feasibility.
Mental models for evaluating project impact:
• Look for complicated rule-based software where we can learn rules instead of programming
them
When evaluating projects, it can be useful to have a common language and understanding of the
differences between traditional software and machine learning software. Andrej
Karpathy's Software 2.0 is recommended reading for this topic.
Software 1.0
• Explicit instructions for a computer written by a programmer using a programming
language such as Python or C++. A human writes the logic such that when the system is
provided with data it will output the desired behavior.
Software 2.0
• Implicit instructions, "written" by an optimization algorithm that tunes the parameters of a
specified model architecture using data. The system logic is learned from a provided
collection of data examples and their corresponding desired behavior.
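To make the contrast concrete, here is a small sketch; the spam-filter task, keyword list, and toy dataset are invented purely for illustration. The first function encodes the decision logic by hand (Software 1.0), while the second learns it from labeled examples (Software 2.0).

    # Software 1.0: a human writes the decision logic explicitly.
    def is_spam_v1(message: str) -> bool:
        suspicious = ["free money", "act now", "winner"]
        return any(phrase in message.lower() for phrase in suspicious)

    # Software 2.0: the decision logic is learned from (example, label) pairs.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    messages = ["free money, act now!", "lunch at noon?", "you are a winner", "see you tomorrow"]
    labels = [1, 0, 1, 0]  # 1 = spam, 0 = not spam (tiny toy dataset)

    learned_filter = make_pipeline(CountVectorizer(), LogisticRegression())
    learned_filter.fit(messages, labels)

    def is_spam_v2(message: str) -> bool:
        return bool(learned_filter.predict([message])[0])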
A quick note on Software 1.0 and Software 2.0 - these two paradigms are not mutually exclusive.
Software 2.0 is usually used to scale the logic component of traditional software systems by
leveraging large amounts of data to enable more complex or nuanced decision logic.
For example, the code for Google Translate used to be a very
complicated system consisting of ~500k lines of code. Google was able to simplify this product
by leveraging a machine learning model to perform the core logical task of translating text to a
different language, requiring only ~500 lines of code to describe the model. However, this
model still requires some "Software 1.0" code to process the user's query, invoke the machine
learning model, and return the desired information to the user.
In summary, machine learning can drive large value in applications where decision logic is
difficult or complicated for humans to write, but relatively easy for machines to learn. On that
note, we'll continue to the next section to discuss how to evaluate whether a task is "relatively
easy" for machines to learn.
Determining feasibility
Some useful questions to ask when determining the feasibility of a project:
• 90% coverage (the fraction of examples where model confidence exceeds the required threshold for a prediction to be considered valid)
The optimization metric may be a weighted sum of several quantities we care about. Revisit
this metric as performance improves.
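As a sketch of what such a weighted metric might look like in code, the example below combines accuracy on confident predictions with coverage; the weights, threshold, and component metrics are arbitrary assumptions.

    import numpy as np

    def combined_metric(y_true, y_pred, confidences, threshold=0.8,
                        w_accuracy=0.7, w_coverage=0.3):
        """Toy weighted-sum metric: accuracy on confident predictions plus coverage.

        Coverage here is the fraction of examples where model confidence exceeds
        the threshold required to treat a prediction as valid.
        """
        y_true = np.asarray(y_true)
        y_pred = np.asarray(y_pred)
        confidences = np.asarray(confidences)

        confident = confidences >= threshold
        coverage = confident.mean()
        accuracy = (y_true[confident] == y_pred[confident]).mean() if confident.any() else 0.0
        return w_accuracy * accuracy + w_coverage * coverage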
Some teams may choose to ignore a certain requirement at the start of the project, with the goal
of revising their solution (to meet the ignored requirements) after they have discovered a
promising general approach.
Some teams aim for a “neutral” first launch: a first launch that explicitly deprioritizes machine
learning gains, to avoid getting distracted.
The motivation behind this approach is that the first deployment should involve a simple model
with focus spent on building the proper machine learning pipeline required for prediction. This
allows you to deliver value quickly and avoid the trap of spending too much of your time trying
to “squeeze the juice”.
Setting up an ML codebase
A well-organized machine learning codebase should modularize data processing, model
definition, model training, and experiment management.
configs/
    baseline.yaml
    latest.yaml
data/
docker/
project_name/
    api/
        app.py
    models/
        base.py
        simple_baseline.py
        cnn.py
    datasets.py
    train.py
    experiment.py
scripts/
data/ provides a place to store raw and processed data for your project. You can also include
a data/README.md file which describes the data for your project.
docker/ is a place to specify one or many Dockerfiles for the project. Docker (and other
container solutions) helps ensure consistent behavior across multiple machines and
deployments.
api/app.py exposes the model through a REST API for predictions. You will likely choose to
load the (trained) model from a model registry rather than importing directly from your library.
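A minimal sketch of what api/app.py might look like, assuming Flask and a model serialized with joblib; the artifact path, route, and payload format are assumptions for illustration.

    # Sketch of api/app.py: load a trained model once at startup and serve predictions.
    import joblib
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    # In practice this artifact would come from a model registry / artifact store
    # rather than being imported directly from the library code.
    model = joblib.load("artifacts/model-latest.joblib")

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json()
        features = payload["features"]  # e.g. a list of feature values
        prediction = model.predict([features])[0]
        return jsonify({"prediction": prediction.item()})  # .item() converts numpy scalars

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=8000)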
models/ defines a collection of machine learning models for the task, unified by a common API
defined in base.py. These models include code for any necessary data preprocessing and output
normalization.
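One possible shape for the common API in base.py is sketched below; the method names are assumptions, and the point is only that every model in models/ exposes the same interface, including its own preprocessing.

    # Sketch of models/base.py: a shared interface that simple_baseline.py, cnn.py, etc. implement.
    from abc import ABC, abstractmethod

    class BaseModel(ABC):
        """Common API for all models in models/; concrete classes own their preprocessing."""

        @abstractmethod
        def preprocess(self, raw_inputs):
            """Convert raw inputs (e.g. text, images) into model-ready features."""

        @abstractmethod
        def fit(self, inputs, targets):
            """Train the model on preprocessed inputs and targets."""

        @abstractmethod
        def predict(self, raw_inputs):
            """Run preprocessing and return normalized predictions."""

        def save(self, path: str) -> None:
            """Optional shared persistence hook; subclasses may override."""
            raise NotImplementedError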
train.py defines the actual training loop for the model. This code interacts with the optimizer
and handles logging during training.
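A minimal sketch of the kind of loop train.py might contain, assuming a PyTorch model and dataloader; the framework choice, function signature, and print-based logging are assumptions rather than anything this guide prescribes.

    # Sketch of train.py: one possible training loop with optimizer interaction and logging.
    import torch

    def train(model, dataloader, loss_fn, epochs=10, lr=1e-3, device="cpu"):
        model.to(device)
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)

        for epoch in range(epochs):
            running_loss = 0.0
            for inputs, targets in dataloader:
                inputs, targets = inputs.to(device), targets.to(device)

                optimizer.zero_grad()
                loss = loss_fn(model(inputs), targets)
                loss.backward()
                optimizer.step()

                running_loss += loss.item()

            # Logging hook: swap print for your experiment tracker of choice.
            print(f"epoch {epoch + 1}: mean loss = {running_loss / len(dataloader):.4f}")

        return model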