
Machine Learning For Product Managers

The document discusses best practices for using machine learning to build products. It outlines common problems well-suited for ML, including pattern recognition, complex cognition, predictions, anomaly detection, and human interactions. It also discusses skills needed by PMs working with ML, such as understanding algorithm limitations and ensuring representative data. Common mistakes include lacking data, small or sparse data sets, and high-dimensional data without proper feature selection.


Focus on the user!

Part I — Problem Mapping: What types of problems are best suited for machine learning

ML is just a tool. Rather than the technology, your focus should be the problem the user is facing. At its core, ML is best suited for problems that require some sort of "pattern recognition". Usually these are:

1. User is inundated with too much data - If a user has to sift through a lot of data to complete a
task, ML/AI is a great tool for that: search technologies, classification, categorization, etc.

2. Problems that require complex cognition abilities - Autonomous cars need to be able to
understand the environment around them. A gallery app that automatically sorts photos needs
to be able to detect places, people and things. These require complex cognition skills, and the
only way to build these kinds of smart machines is to feed them a lot of data and let them learn
through pattern recognition.

3. Predictions and Forecasting - A major question for any product is whether a given user would
like it or not.

4. Anomaly detection

5. Decision making via recommendations

6. Experiences that interact with humans – Alexa, Siri, etc.

7. Augmenting existing experiences, or creating new ones

Part II — ML Skills: What additional skill-sets does a PM need when building products that leverage ML

Product managers typically use five core skills — customer empathy/design chops, communication,
collaboration, business strategy and technical understanding.

Algorithms have limitations:

Being able to understand the impact and limitations of the algorithms helps you see what gaps
will exist in the customer experience, and to close those gaps through product design or by
choosing other ML techniques. This is an absolutely necessary skill-set if you are a PM in this space.

ML algorithms learn from data, so the data you provide determines the quality of the product. The data
must be representative of the users who will use the product - all of them. (Google Photos' tagging
algorithm once labeled a Black man as a gorilla!)
If you model, say, search behavior based off of searches on the internet, the data will have a higher
share of users from developed economies (since they generate more data). Thus, when you model the patterns
of search, the model may not be very accurate for users who don't search often enough, say first-time
Internet users in India.
Trade-off between precision and recall
We wanted every good actor to be able to use the feature - as few good actors wrongly flagged as bad
as possible - so we cared about precision. The other team cared that no bad actor slipped through,
even if that meant a few good actors were also restricted from using the product, so they cared
about recall. Base the trade-off on the requirements of the problem.
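To make the trade-off concrete, here is a minimal sketch of computing both metrics for an abuse classifier, with the positive class being "bad actor". The data is invented for illustration:

```python
def precision_recall(actual_bad, flagged):
    """Precision and recall for an abuse classifier.

    Positive class = 'bad actor'. Both arguments are lists of booleans.
    """
    tp = sum(a and f for a, f in zip(actual_bad, flagged))   # bad, caught
    fp = sum((not a) and f for a, f in zip(actual_bad, flagged))  # good, wrongly flagged
    fn = sum(a and (not f) for a, f in zip(actual_bad, flagged))  # bad, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Five users, two of them bad actors. The model catches one bad actor,
# misses the other, and wrongly restricts one good actor.
actual_bad = [True, True, False, False, False]
flagged    = [True, False, True, False, False]
p, r = precision_recall(actual_bad, flagged)
# precision = 1/2 (one good actor wrongly restricted), recall = 1/2 (one bad actor missed)
```

Raising the flagging threshold would trade the false positive away for more false negatives, and vice versa - which direction is right depends on which team's requirement wins.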

Cold Start Problem


Cold start is a term used for cars: when the engine is cold, the car may not function optimally, but
once the engine is up and running, you are good to go. Similarly, there are scenarios where the
algorithms don't have any data on a user or an item and hit cold start issues, leading to sub-optimal
experiences. The cold start problem comes in two types -

(a) User based - the user is using your product for the first time, and the models have no signal on
them. Netflix recommends shows and movies based on your watch history, but the very first time you
use it, it's hard for the algorithms to predict your viewing preferences.
This problem can be solved in several ways. A few common ones are:
1. Prompt the user to give you data by choosing a few favorite movies from a random selection.
2. Provide a best-effort recommendation using some basic signals you may have about the
user - say, looking at where the user is logging in from, and assuming that the user might like movies
that other users in that location like. For example, if you are logging in from California, you might show
the top 10 shows that other users from California watch.
3. Curation. You may have a manually curated golden set, say all-time favorite movies, that you can
show as a starting experience.
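A fallback chain like the one above can be sketched in a few lines. The catalog data and function names here are invented; a real system would have far richer signals:

```python
# Hypothetical data - a real catalog and its signals are far richer.
CURATED_CLASSICS = ["The Godfather", "Spirited Away", "Seven Samurai"]
TOP_BY_LOCATION = {"California": ["Show A", "Show B"], "Texas": ["Show C"]}

def cold_start_recommendations(watch_history, location):
    """Best-effort picks for a user the models may have no signal on."""
    if watch_history:                    # not actually a cold start:
        return watch_history[-3:]        # placeholder for a real model
    if location in TOP_BY_LOCATION:      # option 2: coarse location signal
        return TOP_BY_LOCATION[location]
    return CURATED_CLASSICS              # option 3: manually curated golden set

recs = cold_start_recommendations([], "California")  # falls back to location
```

The point of the sketch is the ordering: use the strongest signal available, and degrade gracefully to curation when there is none.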

(b) Item based - an item is being offered to your users for the first time and it's hard to pick which
users to show it to. For example, when a new movie comes to the Netflix catalog, deciding who to show
it to as a recommendation is tough, especially if the meta-data on the movie is sparse.
1. Human annotation: You can have humans annotate the item with meta-data to give you signal on
what it is about. For example, a new movie added to the catalog could be screened by a panel
of experts who then categorize it. Based on the categorization, you can decide which group of users
the movie should be shown to.
2. Algorithmically: In layman's terms, algorithmic techniques work by showing the new item to a number
of users to understand their preferences, thereby getting more signal on who prefers this item and
narrowing down the user segment. If you are interested in going deeper into this topic, you can check
out a primer on multi-armed bandit algorithms.
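One of the simplest bandit strategies is epsilon-greedy: mostly show the item whose observed reward is best so far, but with small probability show a random one to keep gathering signal. A toy sketch with invented click rates:

```python
import random

def epsilon_greedy(counts, rewards, epsilon=0.1):
    """Pick an arm: explore at random with probability epsilon,
    otherwise exploit the arm with the best observed average reward."""
    if random.random() < epsilon or not any(counts):
        return random.randrange(len(counts))
    averages = [r / c if c else 0.0 for r, c in zip(rewards, counts)]
    return max(range(len(counts)), key=lambda i: averages[i])

# Simulate 1000 impressions over 3 new items with true click rates 0.1, 0.5, 0.3.
random.seed(0)
true_rates = [0.1, 0.5, 0.3]
counts, rewards = [0, 0, 0], [0.0, 0.0, 0.0]
for _ in range(1000):
    i = epsilon_greedy(counts, rewards)
    counts[i] += 1
    rewards[i] += 1.0 if random.random() < true_rates[i] else 0.0
# Typically most impressions end up on item 1, the one with the highest true rate.
```

Real systems use more careful strategies (UCB, Thompson sampling), but the structure is the same: spend a little traffic learning, and the rest exploiting what you have learned.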

In anomaly detection, cold start issues can throw off a lot of false alarms. For example, when a new
employee accesses the database systems, the ML system may consider them an intruder and raise a
false alarm. The way to tide over this is to create mechanisms to gather feedback (either before or
after the action) to correct the false positives, which brings me to the next section.

Feedback Loop
Provide feedback to the algorithms so that they can improve their accuracy over time. These
feedback loops may be as simple as logging negative signals - for example, how fast a user scrolled past
a news headline vs. spending time reading it. You can also provide more explicit feedback loops by
letting users intervene when the algorithm makes mistakes (thumbs down, crossing out, etc.) or
reward it when it does a good job (thumbs up, share, save, etc.).
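A minimal sketch of such a loop, turning both implicit (dwell time) and explicit (thumbs) feedback into training labels. The field names and the 5-second threshold are illustrative, not from any real system:

```python
feedback_log = []

def log_impression(item_id, dwell_seconds, explicit=None):
    """Record one impression as a labeled training example."""
    if explicit is not None:
        # Explicit feedback wins when present.
        label = 1 if explicit == "thumbs_up" else 0
    else:
        # Scrolling past quickly is treated as an implicit negative signal.
        label = 1 if dwell_seconds >= 5 else 0
    feedback_log.append({"item": item_id, "label": label})

log_impression("headline-1", dwell_seconds=0.8)                  # scrolled past
log_impression("headline-2", dwell_seconds=42)                   # read it
log_impression("headline-3", dwell_seconds=3, explicit="thumbs_up")
labels = [e["label"] for e in feedback_log]                      # → [0, 1, 1]
```

The labeled log then feeds the next training run, closing the loop.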

Explore vs Exploit
Netflix's algorithms have figured out that I enjoy watching football, so my recommendations are full of
movies, shows and documentaries related to football. I watch a few of those, Netflix shows me even
more football content, and the cycle continues. The algorithms here are Exploiting the signal they have
available - that I like football - and optimizing for just that. However, there's a lot of content outside of
football that interests me. I like technology too, but Netflix doesn't show me any content related to it.
This has been referred to, most recently in the media, as a filter bubble.
The system should periodically introduce items where there's less signal about the user's preferences,
allowing for Exploration. The introduction of new content can be random (show me a few Yoga,
Food, Wildlife or Technology shows) or learned (other users who prefer football also prefer food
documentaries, for example).

You might ask - how do I understand what gaps the algorithms have if I don't understand the
algorithms themselves? Here's where you put on your PM hat.

1. Provide clear use-cases to your engineering team. Run through those use-cases with them,
explaining what the expected experience should look like. These use-cases should cover
primary, secondary and negative personas for your product. Once your models are ready,
evaluate whether the model outputs match the intended experiences for these use-cases.

2. Pay attention to data collection. Ask questions about how the data is being cleaned and
organized. Is the data representative of your user-base?

3. Plug the gaps using product solutions. If the model outputs don't meet your expectations,
figure out whether you can improve the model's ability to account for these use-cases - or,
if these use-cases stretch the machine learning capabilities of the model, create product
solutions, filters, etc. to cover the gaps.

Part III — Caveats: What are the common mistakes made in building products that use machine
learning

Data Issues

1. No data: Many companies want to use ML and have built 'smart software' strategies, but don't
have any data. You either need data that your company collects, or you can acquire
public data or accumulate data through partnerships with other companies that have it.

2. Small Data: Most ML techniques published today focus on Big Data and tend to work well
on large datasets. You can apply ML to small data sets too, but you have to be very careful
that the model is not thrown off by outliers and that you are not relying on overcomplicated
models. If your problem genuinely involves small data, even simple techniques can do well. For
example, most clinical trials have small sample sizes and are analyzed through statistical
techniques, and the insights are perfectly valid.

3. Sparse Data: Sometimes even when you have a lot of data, your actual usable data may still be
sparse. Say, for example, on Amazon, you have hundreds of millions of shoppers and also tens of
millions of products you can buy. Each shopper buys only a few hundred of the millions of
products available and hence, you don’t have feedback on most of the products. This makes it
harder to recommend a product if no one or very few users have bought it. When using sparse
datasets, you have to be deliberate about the models and tools you are using, as off-the-shelf
techniques will provide sub-par results.
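The scale of the problem is easy to quantify: density is the fraction of all possible (user, item) pairs you have actually observed. A toy stand-in for Amazon-scale data:

```python
# Toy purchase data: three shoppers, five-item catalog (all names invented).
purchases = {
    "alice": {"book", "kettle"},
    "bob": {"book"},
    "carol": {"headphones"},
}
catalog = {"book", "kettle", "headphones", "tent", "lamp"}

# Density = observed (user, item) interactions / all possible pairs.
observed = sum(len(items) for items in purchases.values())
possible = len(purchases) * len(catalog)
density = observed / possible   # 4 / 15, about 0.27
```

Real retail matrices have densities orders of magnitude lower than this toy's 27%, which is why off-the-shelf dense-matrix techniques break down on them.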

4. High Dimensional Data: High dimensional data needs to be projected into a lower
dimensional space before it can be used by most ML models. Since this throws away some of
the dimensions, you need to be careful about feature selection.
PMs should involve themselves in feature selection discussions with engineers/data scientists.
This is where product intuition helps. For example, say we are trying to predict how good a
video is. You could look at how much of the video the user watched as a metric for
engagement, but UX studies may show that users leave a tab open and switch to another tab
while a video is playing, so watch time does not correlate cleanly with quality. We might
therefore include other features, such as whether there was any activity in the tab while the
video was playing, to truly understand the quality of the video.
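One quick sanity check in such discussions is to compare how well each candidate feature tracks the outcome you care about. A sketch using Pearson correlation on invented per-view data (the feature names and numbers are hypothetical):

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-view features vs. a human quality rating (1 = "good").
watch_fraction = [0.9, 0.95, 0.2, 0.85, 0.3]     # inflated by idle tabs
tab_active_fraction = [0.8, 0.1, 0.15, 0.9, 0.2]
quality = [1, 0, 0, 1, 0]

c_watch = pearson(watch_fraction, quality)
c_tab = pearson(tab_active_fraction, quality)
# In this invented data, tab activity tracks quality more tightly than raw
# watch time, so it earns a place in the feature set.
```

Correlation is only a first filter (it misses non-linear and interaction effects), but it keeps the feature conversation grounded in data rather than opinion.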

5. Data Cleaning: A large part of the success of ML depends on the quality of the data (both
the features and the cleaning). Have you removed all outliers? Have you normalized all fields?
Are there bad or corrupted fields in your data? All of these can make or break your model.
As they say: Garbage In, Garbage Out.
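The questions above map directly onto a cleaning pass. A minimal sketch on invented sensor readings, using a simple 2-standard-deviation rule for outliers (real pipelines use more robust tests):

```python
# Invented readings: None is a corrupt field, 250.0 is a glitch among ~12s.
raw = [12.0, 11.5, 13.1, None, 12.4, 11.9, 12.8, 13.0, 12.2, 250.0, 11.7]

values = [v for v in raw if v is not None]            # drop corrupt fields
mean = sum(values) / len(values)
std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
cleaned = [v for v in values if abs(v - mean) <= 2 * std]  # remove outliers

# Min-max normalize the surviving values to [0, 1].
lo, hi = min(cleaned), max(cleaned)
normalized = [(v - lo) / (hi - lo) for v in cleaned]
```

Note the ordering matters: normalizing before removing the 250.0 glitch would have squashed every legitimate reading into a tiny band near zero.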

Fit Issues

1. Overfitting - To explain overfitting, I have an interesting story to tell. During the financial crisis of
2007, there was a quant meltdown. Events that were not supposed to be correlated, ended up
being correlated, and many assumptions that were considered inviolable were violated. The
algorithms went haywire and within 3 days, quant funds amassed enormous losses. D.E. Shaw
suffered relatively fewer losses than other quant funds at that time. Why? The other quant
funds were newer and their algorithms had been trained on data from the recent years
preceding 2007, which had never seen a downturn. Thus, when prices started crashing the
models didn't know how to react. D.E. Shaw, on the other hand, had lived through the similar
crash of the Russian rouble in 1998. It too suffered losses then, but since that episode its
algorithms had been calibrated to expect scenarios like this.
Hence overfitting, in layman's terms: the model is optimized for hindsight and not for foresight.
Ensure that you can test your models on a wide variety of data. Also, take a hard look at your
assumptions. Will they still hold true if there are shifts in economy, user behavior changes?
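The practical defense is to hold out data the model never sees during training, ideally from a different period or regime, and measure error there. A minimal sketch on synthetic data (the "regimes" here are just early vs. late samples):

```python
import random

# Synthetic data: y roughly follows 2x plus noise.
random.seed(1)
data = [(x, 2 * x + random.gauss(0, 0.5)) for x in range(20)]

train = [d for d in data if d[0] < 15]    # "calm years" used for fitting
test = [d for d in data if d[0] >= 15]    # held-out "later period"

# Fit a slope by least squares through the origin, on training data only.
slope = sum(x * y for x, y in train) / sum(x * x for x, y in train)

# Judge the model on data it has never seen.
test_error = sum((y - slope * x) ** 2 for x, y in test) / len(test)
```

If the held-out error is far worse than the training error, the model has memorized hindsight; testing across genuinely different regimes (as the 1998 rouble data did for D.E. Shaw) is the stronger version of the same idea.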

2. Underfitting - when your model is too simple for the data it’s trying to learn from. Say for
example, you are trying to predict if shoppers at Safeway will purchase cake mix or not. Cake
mix is a discretionary purchase. Factors such as disposable income, price of cake mix,
competitors in the vicinity etc. will impact the prediction. But, if you do not consider these
economic factors such as employment rates, inflation rates as well as growth of other grocery
stores and only focus on shopping behavior inside of Safeway, your model will not be predicting
sales accurately.
How do you avoid this? This is where product/customer intuition and market understanding will
come in handy. If your model isn’t performing well, ask if you have gathered all the data needed
to accurately understand the problem. Can you add more data from other sources that may help
provide a better picture about the underlying behavior you are trying to model?

Computation Cost
Based on the cost of computation and storage, you may need to trade off the quality of the predictions.
For example, you may not be able to store all the data about your products.

Some Cool Algorithms you need to know:

1. Google Search Algorithm (PageRank): Google uses spiders to crawl webpages and store
them in its database as a list of pages called the "index". The algorithm estimates a
webpage's importance by looking at the other important pages that link to it: each
webpage gets a score based on every other page linking to it. PageRank also weighs the
quality of incoming links (say, your website being mentioned by The New York Times)
more than their quantity.
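The core of PageRank fits in a few lines as a power iteration over a link graph. This is a toy four-page graph with the commonly cited 0.85 damping factor; production PageRank handles dangling pages, personalization and enormous scale:

```python
# page -> pages it links to (invented toy graph)
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
pages = list(links)
rank = {p: 1.0 / len(pages) for p in pages}
damping = 0.85

for _ in range(50):  # power iteration until (approximate) convergence
    new = {p: (1 - damping) / len(pages) for p in pages}
    for p, outs in links.items():
        for q in outs:
            new[q] += damping * rank[p] / len(outs)  # share p's rank with its links
    rank = new

best = max(rank, key=rank.get)  # "C": every other page links to it
```

Notice that D, which nobody links to, ends up with only the teleportation baseline, while C accumulates rank from three pages, exactly the "important pages vouch for you" intuition.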

2. Spotify Discovery Algorithm: It looks at the songs you have listened to and at all the
playlists other people have made, then uses "collaborative filtering" to compare the two
datasets and figure out which songs you might like (for example, if person A has listened to 8
out of the 10 songs in person B's playlist, Spotify may recommend the other 2 songs to person
A). It also goes beyond that to build your "user profile" from the songs you listen to, their
frequency, and the songs you skip - genres, artists, moods. It also considers other parameters
such as age, gender and location to match your profile with others.

3. Amazon's "Recommended because you purchased" and "Recommended because you
added X to your Wishlist" algorithms: These use what is called "item-to-item collaborative
filtering". For each item you buy, add to your Wishlist or look at, Amazon ranks other items
on the basis of cohorts of other users' lists. The items that get the highest weight (the
"nearest neighbors") are showcased as the relevant recommendations. The crucial part is
defining the cohort: do you go through lists browsed by all users, or by a well-defined cohort
of users? Defining this cohort requires various parameters, including location, purchase
history, demographic segmentation, etc. Currently, the most powerful algorithms in the
"recommendation system" category are by Spotify, Netflix and (believe it or not) Target.
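Item-to-item collaborative filtering can be shown in miniature with co-occurrence counts: recommend the items that appear most often in the same baskets as the one the user just bought. The basket data is invented; real systems use weighted similarity scores over vastly larger cohorts:

```python
# Invented purchase baskets from a cohort of users.
baskets = [
    {"camera", "tripod"},
    {"camera", "tripod", "sd_card"},
    {"camera", "sd_card"},
    {"novel", "bookmark"},
]

def co_occurrence(item):
    """Count, across baskets containing `item`, the other items seen with it."""
    counts = {}
    for basket in baskets:
        if item in basket:
            for other in basket - {item}:
                counts[other] = counts.get(other, 0) + 1
    return counts

# "Recommended because you purchased camera":
recs = co_occurrence("camera")   # tripod and sd_card each co-occur twice
top = max(recs, key=recs.get)
```

Choosing which baskets to count over - all users, or a location/demographic cohort - is exactly the cohort-definition question above, and it changes the recommendations more than the counting itself.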
4. Facebook’s News Feed Algorithm:

5. Tinder's Elo Score Algorithm: Tinder's core metric is keeping the probability of a match
high (i.e., the number of swipes needed per match should be low), while also ensuring
matches don't get concentrated (e.g., as at a bar, where everyone competes for the same
most attractive person). For this, Tinder uses a variety of factors:
- New users get a boost - call it 'beginner's luck' in digital dating.
- A user attractiveness rating on a scale of 1-10. A user rated 6 will only see users rated
4-8. This rating works similarly to Google's PageRank: it depends on the ratings of the
users who swipe right on you, as well as of the users you get matched with.
- Activity status: the more active a user is on Tinder, the more matches they get and the
more messages they send to those matches, the better their rating.
- Paid account: if you pay for Tinder, your rating is boosted.
- Pickiness: the ratio of right-swipes to users shown also determines your ranking, or
Elo score.
