HBR - How To Decide Which Data Science Projects To Pursue
In 2018, every organization has a data strategy. But what makes a great one?
We all know what failure looks like. Resources are invested, teams are formed,
time goes by — but nothing comes of it. No one can necessarily say why; it’s
always Someone Else’s Fault.
https://hbr.org/2018/10/how-to-decide-which-data-science-project…wsletter_series&utm_campaign=datascience_t1&deliveryName=DM91211 Page 1 of 7
How to Decide Which Data Science Projects to Pursue 04/09/2020, 11:21 AM
It’s harder to tell the difference between a modest success and excellence.
Indeed, in data science they can look very similar for perhaps a year. After
several years, though, an excellent strategy will yield orders of magnitude more
valuable results.
Both mediocre and excellent strategies begin with a series of experiments and
investments leading to data projects. After a few years, some of these projects
work out and are on their way to production.
In the mediocre strategy, one or two of these projects may even have a clear ROI
for the business. Typically, these projects will be some kind of automation for
cost savings, or applying machine learning to an existing process to improve its
efficiency or performance. This looks a lot like success, and it may suffice, but
it’s missing out on the unique advantages of an excellent data strategy.
In an excellent strategy, more data projects have worked out, and they were
surprisingly cost-effective to develop. Further, the process of building the first
few projects inspires new project ideas. In an excellent strategy, the projects will
include automation and efficiency and performance improvements, but they will
also include projects and ideas for new revenue generation and entirely new
businesses driven by your unique data assets. The data teams work well
together, build on each other’s work, and collaborate smoothly with their
business partners. There’s a clear vision of what the machine-learning-driven
future of the business can look like, and everyone is working together to achieve
it.
Crafting a data strategy requires many parties at the table, including data
experts, technology leadership, and business and subject-matter experts. It also
requires leadership support that goes beyond just wanting to check off a
“machine learning” box.
Here’s how most companies decide which data projects to pursue, an approach that
on its own is a recipe for a mediocre data strategy. Management identifies a set of
projects it would like to see built and creates the ubiquitous prioritization
scatterplot: one axis represents a given project’s value to the business and the
other axis represents its estimated complexity or cost of development. Each
project is given a spot on the chart, and management allocates the company’s
limited resources to the projects that they believe will cost the least and have the
highest business value.
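To make the conventional approach concrete, here is a minimal sketch of scoring each project in isolation. The project names, the numbers, and the value-to-cost ratio used for ranking are all illustrative assumptions, not figures or a method taken from any real company.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    value: float  # estimated business value (hypothetical units)
    cost: float   # estimated cost or complexity (hypothetical units)


def rank_in_isolation(projects, budget):
    """Greedy pick: best value-to-cost ratio first, while the budget allows.

    Each project is judged on its own; nothing here models how one
    project might make another cheaper.
    """
    chosen = []
    remaining = budget
    for p in sorted(projects, key=lambda p: p.value / p.cost, reverse=True):
        if p.cost <= remaining:
            chosen.append(p.name)
            remaining -= p.cost
    return chosen


# Hypothetical portfolio, for illustration only.
PROJECTS = [
    Project("invoice_automation", value=3.0, cost=2.0),
    Project("churn_model", value=5.0, cost=4.0),
    Project("recommendation_engine", value=9.0, cost=10.0),
]

print(rank_in_isolation(PROJECTS, budget=6.0))
```

With these made-up numbers, the ambitious recommendation engine never makes the cut, because its cost is judged as if nothing else in the portfolio could reduce it.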
This is not wrong, but it is also not optimal. An excellent data strategy moves
beyond a straightforward evaluation of each project in isolation to consider a
few additional dimensions.
For example, one global media company I worked with had grown dramatically
through acquisitions. Each business line had a different technology stack and
independent IT group, leading to challenges integrating data that already
existed, and different architectures for all future investments. Centralizing this
practice was key to their ongoing success.
Second, an excellent data strategy is specific in the short term and flexible in the
long term. We know quite a lot about what the machine learning capabilities of
tomorrow look like, but less about what the capabilities of next year will look
like. We can only guess what will be possible in five years. Similarly, the
business landscape is transforming, leading to new competition and new
opportunities. Organizations that engage in five-year planning cycles will miss
the opportunities that emerge in the meantime. An excellent strategy is one that
is adaptable and treated as a living document.
The best strategies are strong in directional conviction, but flexible in the details.
You want to know where you want to end up, but you need not predefine
each step you take to get there.
Finally, an excellent data strategy takes into account one key insight: data
science projects are not independent from one another. With each completed
project, successful or not, you create a foundation to build later projects more
easily and at lower cost.
Here’s what project selection looks like in a firm with an excellent data strategy:
First, the company collects ideas. This effort should be spread as broadly as
possible across the organization, at all levels. If you only see good and obvious
ideas on your list, worry — that’s a sign that you are missing out on creative
thinking. Once you have a large list, filter by the technical plausibility of an idea.
Then, create the scatterplot described above, which evaluates each project on its
relative cost/complexity and value to the business.
This approach makes higher-value projects — those that would perhaps have
seemed too ambitious — look less like an aggressive, expensive push forward.
Instead, it reveals that such projects may indeed be more efficient and safer to
proceed with than other lower-value projects that looked attractive in a naive
analysis.
Put differently, an excellent data strategy acknowledges that projects play off
one another, and that the costs of projects change over time in light of other
projects undertaken (and new technology, as well). This allows more accurate
planning and may expand the organization’s capabilities more than expected.
You can revisit this planning process quarterly, which is in line with how quickly
machine learning technologies are changing.
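One way to picture how costs shift in light of other projects is to break each project into shared building blocks and charge only for the blocks that do not exist yet. The sketch below does that; the component names, costs, and project definitions are hypothetical and chosen purely for illustration.

```python
# Hypothetical cost of each reusable building block.
COMPONENT_COST = {
    "customer_data_pipeline": 4.0,
    "churn_labels": 1.0,
    "feature_store": 3.0,
    "recsys_serving": 2.0,
}

# Hypothetical projects: the set of building blocks each one needs.
PROJECT_NEEDS = {
    "churn_model": {"customer_data_pipeline", "churn_labels"},
    "recommendation_engine": {
        "customer_data_pipeline",
        "feature_store",
        "recsys_serving",
    },
}


def marginal_cost(project, built):
    """Cost of a project given the building blocks that already exist."""
    missing = PROJECT_NEEDS[project] - built
    return sum(COMPONENT_COST[c] for c in missing)


def plan(order):
    """Walk projects in order, charging only for not-yet-built blocks."""
    built = set()
    steps = []
    for name in order:
        steps.append((name, marginal_cost(name, built)))
        built |= PROJECT_NEEDS[name]
    return steps


# Judged alone, the recommendation engine needs every block it uses.
print(marginal_cost("recommendation_engine", set()))
# Sequenced after the churn model, it reuses the shared data pipeline.
print(plan(["churn_model", "recommendation_engine"]))
```

Under these assumptions, the larger project costs noticeably less once an earlier project has paid for the shared pipeline. Rerunning a calculation like this at each planning review keeps the estimates in step with what has actually been built.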
choose well.
Hilary Mason is the GM for Machine Learning at Cloudera. She was the Founder of Fast Forward
Labs, acquired by Cloudera in 2017, and is the Data Scientist in Residence at Accel.