Scaling Data - Data Informed To Data Driven To Data Led - Reforge
Thanks to contributions from Dan Wolchonok (Head of Data at Reforge), Elena Verna (EIR at
Reforge, Advisor at Miro, Netlify, MongoDB), Behzod Sirjani (EIR at Reforge, ex-
Slack/Facebook), Shani Hadiyanto, and Sarah Catanzaro.
One of the most common questions I get from founders is: ‘When should I hire my first data
person?’ Invariably, the same types of questions are asked over the lifecycle of the company:
• Should I be using Looker (an advanced data transformation and visualization tool)?
At the core of these questions is the common mistake of viewing data as a team to hire or
set of tools to implement rather than as a strategic lever for growth. The answers to these
questions are dependent on your product, business, and points of leverage. In this article, I
lay out:
1. Why data is not a team to hire or set of tools to implement
The answers to questions about who to hire, which tools to implement, and which analyses
need to be done are ultimately informed by what the product strategy is and how data
plays a role in helping achieve that product strategy. But often the product
strategy isn't well defined, and even when it is, where data fits in isn't.
Often there is a mismatch between the stage a person has historical experience
with and the stage the company is at. For example, a Data PM might join a new
company having worked only with a mature company's data. They never saw the steps it
took to get from 0 to great, and they end up misapplying technology, team needs, and a lot
more. Scaling data requires many evolutions, and it is rare that someone has seen the
entire lifecycle.
Often, the culture and incentives of the org create a non-functioning data environment.
For example, data teams should not be measured by the answers they give but rather
the impact of those answers on the business. In a lot of culture-poor organizations, PMs
or others take credit for "asking the right question" instead of attributing it to the data
team. This type of system rewards bad behavior and disincentivizes the data team from
doing more impactful analysis — it instead incentivizes them to design pretty result
tables. It may even incentivize the data team to seek out new questions that aren't
relevant to the business but can provide "interesting" answers, which leads to a negative
cycle.
1. Strategy - What are your points of leverage? How does data improve those points of
leverage?
2. Stage - What stage of maturity is our product in? What stage of maturity is our Data in?
3. Team - What people do we need to achieve the data strategy? Are they set up for
success internally?
• How much data do the product and business operations generate each day?
• How much more efficient could business operations be with data automation?
It's more about identifying the right points of leverage — and not just jumping to the end
because you think everything else will come as a result of it.
Going through some of the above questions tends to reveal some uncomfortable truths. The
most common one is that the company doesn't have enough data for advanced data
infrastructure to be impactful to a company’s business operations. Even the most
sophisticated data science team and infrastructure will fail to add value to a business that
just isn’t generating enough usable data — there aren’t enough signups, retained users, or
actions in the product for meaningful data science solutions to exist.
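One rough way to check whether there is "enough data" is to estimate how many users an experiment would need before it could detect a change. The sketch below uses the standard normal-approximation sample size for comparing two conversion rates. The function names, the 10% to 11% conversion rates, and the 500 signups per week are illustrative assumptions, not figures from this article.

```python
import math
from statistics import NormalDist


def users_needed_per_variant(baseline_rate, target_rate, alpha=0.05, power=0.80):
    """Approximate users per variant for a two-sided test of two proportions."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # critical value for the desired power
    variance = baseline_rate * (1 - baseline_rate) + target_rate * (1 - target_rate)
    effect = target_rate - baseline_rate
    return math.ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


def weeks_to_run(weekly_signups, baseline_rate, target_rate):
    """Weeks of signups needed to fill both variants of one experiment."""
    per_variant = users_needed_per_variant(baseline_rate, target_rate)
    return math.ceil(2 * per_variant / weekly_signups)


# Hypothetical product: 500 signups a week, trying to move conversion from 10% to 11%.
needed = users_needed_per_variant(0.10, 0.11)
weeks = weeks_to_run(500, 0.10, 0.11)
print(f"users per variant: {needed}, weeks to run: {weeks}")
```

Under these assumed numbers, a single experiment would take more than a year of signups to complete, which is the kind of result that suggests advanced data science work is premature.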
• How much of this data is tracked, stored, and owned by the company?
• How consistent and descriptive is the data for our market and trends? (For example, in
• Are you tracking data at the right level of granularity or asset class (event-based,
timebound, derived, aggregated)?
Understanding both what stage of maturity your product is at and where your data is at is
critical. It helps you understand where you should be, where you are, and informs the kind of
tools and team you need to fill the gap. The most common scenarios I see are:
Companies that waste resources on projects like these failed to identify an appropriate data
strategy for their stage of the business, and instead of building appropriate capabilities,
looked to solve an advanced, specific problem. The key is to identify the right sequence of
problems to solve with the right foundations built in tandem. This means understanding how
data should be leveraged at each stage to meet the needs of the business today in
preparation for what the company will need in the (near) future.
Underbuilding is when the maturity of the product is ahead of data maturity. You can
underbuild in different areas: infrastructure, analytics, team, and operations. This is most
problematic when some of the company’s business operations are at scale but are totally
unprepared to leverage data as a strategic, competitive advantage. Some signals that you've
underbuilt:
• You have multiple products using inconsistent data attributes. For example, timestamp
fields use different time zone logic and definitions, or the taxonomy is inconsistent across
products (https://www.reforge.com/blog/why-most-analytics-efforts-fail).
• Data that has been tracked is stuck in a 3rd party system that the company doesn’t
have ownership of. For example, I recently worked with a company that uses Firebase,
thinking they could eventually export logs. But Firebase does not store individual event
data making this impossible — they have literally wasted years of data collection.
• The business has been operating with sub-optimal decision-making without data for so
long, that it’s unlikely to change easily.
The realization of this opportunity cost is painful, and making up for it can take hours of
realigning metrics definitions, sourcing available data, backfilling data pipelines, and a
realignment of the company’s culture.
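The timestamp signal in the list above can be made concrete with a minimal sketch. It assumes two hypothetical products: one logs naive local time at a fixed UTC-7 offset, the other logs UTC with an explicit offset. The helper name `to_utc` and the sample values are made up for illustration, and a real pipeline would more likely use named time zones than fixed offsets.

```python
from datetime import datetime, timedelta, timezone


def to_utc(raw_timestamp: str, source_utc_offset_hours: int) -> datetime:
    """Parse an ISO-style timestamp and return it as an aware UTC datetime.

    Naive timestamps are assumed to be in the source's local offset;
    timestamps that already carry an offset are trusted as-is.
    """
    parsed = datetime.fromisoformat(raw_timestamp)
    if parsed.tzinfo is None:
        local_zone = timezone(timedelta(hours=source_utc_offset_hours))
        parsed = parsed.replace(tzinfo=local_zone)
    return parsed.astimezone(timezone.utc)


# Hypothetical product A logs naive local time at UTC-7.
event_a = to_utc("2021-10-31 20:30:00", source_utc_offset_hours=-7)
# Hypothetical product B logs UTC with an explicit offset.
event_b = to_utc("2021-11-01T03:30:00+00:00", source_utc_offset_hours=0)

same_moment = event_a == event_b        # both describe the same instant
raw_day_a = "2021-10-31 20:30:00"[:10]  # calendar day as product A stored it
utc_day_a = event_a.date().isoformat()  # calendar day after normalization
print(same_moment, raw_day_a, utc_day_a)
```

The two raw values look like different days, and even different months, until both are normalized to one convention. Backfilling years of events with a rule like this is part of the cost described above.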
Team + Tools
Once you understand what role data plays in the overall strategy, and what stage the product
and data are in, then you can begin to understand what team and tools you need and where
there are gaps. Team and tools is not just about having the right heads in place, but about
making sure that the org is working well together. Signals that teams are not aligned:
• Teams aren't collaborating on both problems and solutions with the data team. They are
instead coming to the data team with a hypothesis to validate.
• The data and product org don't have time to align on strategic initiatives because they
are bogged down by minutiae of tasks to be done.
• Analyses aren't treated as valuable findings that help people move closer to their
objectives, and instead simply evaluate whether something was a win or not.
• 3 Stages of Data Maturity: What the business needs to grow and how data plays a role
informs the data strategy at each stage.
• 4 Capabilities Within Each Stage: The necessary building blocks and capabilities of
each stage across 4 key work streams (infrastructure, analytics, operations, and team).
• Stage 1: Data Informed. These companies are focused on building the business and
getting to product-market-fit (stable user retention rates). The key business need is for
data to provide operational visibility.
• Stage 2: Data Driven. These companies have reached product-market-fit and are
actively optimizing for specific users, behaviors, and experiences in the product at the
feature-level. The key business need is for data to support the organization’s growth with
scalable tooling, data products, and deep-dive insights.
• Stage 3: Data Led. These companies are operationally run by data products,
infrastructure, and services. The key business need is the “productization” of data
services that unlock Product and Data Science teams, allowing them to automate
operational decision-making and user product experiences.
The successful advancement from one stage to the next requires two things:
• Needs: The company’s activities and desired business objectives have evolved due to
new levels of growth, scale, or product-market-fit
• Capabilities: The dependencies and foundations required for the next stage have been
built and unlock new leverage and capabilities
The implication here is that the stages form a linear progression, but it’s important to note that
not all companies become data led. Most companies today describe themselves as
Data Informed or Data Driven (or aspire to reach those stages), while some businesses
envision reaching the Data Led stage.
However, this stage does not apply to all businesses; it describes a globally scaled
organization in which data dictates what and how you operate. Businesses with meaningful
traction may find that building Stage 3: Data Led capabilities is possible but would not
dramatically impact their strategy due to the nature of their business, such as having a small
number of SKUs to optimize for or a low-frequency product in an evolving market that renders
prediction and forecasting models less effective.
The 4 Capabilities
Founders should use this playbook by considering the needs of their business (what they
need to achieve) in comparison to the next section in this framework: the recommended
capabilities (what’s needed to fulfill their business needs) for each stage.
How and what to develop across these 4 capabilities differs by stage but ultimately leads to
building sustainable infrastructure, developing compounding insights, unlocking business
operations, and evolving skill sets.
The most common pitfall at the data informed stage is being indecisive about the truth
(and allowing multiple versions to co-exist). If the company has already reached
product-market-fit but is missing one of the crucial capabilities above, teams might think they have a
shared understanding and single source of truth for data when in reality they don't.
If Finance believes we gained 100 new transacting users from Facebook Ads in October, but
Marketing thinks it was 120, we’re likely operating from different tools, metrics, definitions,
time zones, or even accrual vs. cash based accounting. This friction commonly leads to
wasting time on alignment, frustration, and avoiding using data at all. Organizations that do
not stamp this out quickly will fail to mature as a data-driven company.
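A small sketch shows how two teams can reach different numbers from the same rows when the metric definition isn't shared. The data, field names, and function names below are hypothetical, and the "shared definition" chosen here (a user's first-ever transaction falls in the month) is only one reasonable option, not a rule from this article.

```python
from datetime import datetime, timezone

UTC = timezone.utc

# Hypothetical transaction log shared by both teams.
TRANSACTIONS = [
    {"user_id": "u1", "channel": "facebook_ads", "paid_at": datetime(2021, 10, 3, tzinfo=UTC)},
    {"user_id": "u1", "channel": "facebook_ads", "paid_at": datetime(2021, 10, 20, tzinfo=UTC)},
    {"user_id": "u2", "channel": "facebook_ads", "paid_at": datetime(2021, 10, 15, tzinfo=UTC)},
    {"user_id": "u3", "channel": "facebook_ads", "paid_at": datetime(2021, 9, 28, tzinfo=UTC)},
    {"user_id": "u3", "channel": "facebook_ads", "paid_at": datetime(2021, 10, 5, tzinfo=UTC)},
    {"user_id": "u4", "channel": "organic", "paid_at": datetime(2021, 10, 9, tzinfo=UTC)},
]


def in_month(ts, year, month):
    return ts.year == year and ts.month == month


def count_transactions(rows, channel, year, month):
    """Ad hoc definition: every transaction in the month counts once."""
    return sum(
        1 for r in rows
        if r["channel"] == channel and in_month(r["paid_at"], year, month)
    )


def count_new_transacting_users(rows, channel, year, month):
    """Shared definition: users whose first-ever transaction falls in the month."""
    first_paid = {}
    for r in rows:
        if r["channel"] != channel:
            continue
        uid = r["user_id"]
        if uid not in first_paid or r["paid_at"] < first_paid[uid]:
            first_paid[uid] = r["paid_at"]
    return sum(1 for ts in first_paid.values() if in_month(ts, year, month))


ad_hoc = count_transactions(TRANSACTIONS, "facebook_ads", 2021, 10)
shared = count_new_transacting_users(TRANSACTIONS, "facebook_ads", 2021, 10)
print(ad_hoc, shared)
```

Neither function is wrong on its own. The friction comes from both numbers being reported under the same name, which is why agreeing on one definition, in one place, matters more than which definition wins.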
The increased capabilities of the data function unlock deeper accountability, as specific
teams can now be responsible for input metrics (# of hand-raisers on feature walls, #
contacts added, days to first call, or active team members per org) instead of a generalized,
org-wide shared responsibility for output metrics (revenue, retention, # of paid upgrades, etc.).
Organizations at this stage leverage the Data team for decision-making guidance, as
opposed to operational data retrieval and visibility. To improve data-driven decision
making, the organization must have some self-serve access to information, comprehensive
insights that answer ***why something is happening (not just what is happening)***, and
an early set of productized data products that unlock operational capabilities.
• Smarter & faster function-specific business operations (sales, customer service, ops)
• Scalable data warehouse infrastructure and tooling through a data lake, customer data
platform, and more
The most differentiated thing about Stage 3: Data Led businesses is that they cannot
operationally function without data products. The scale and complexity of both the
company’s operations and its active user base are such that relying solely on
business-generated recommendations, rule sets, and SOPs is not enough to maintain a defensible
product experience. A good example of this would be Amazon, which could not successfully
manage the scale of its business without the proprietary predictive models that power
fulfillment, logistics routing, and warehouse SKU storage.
At this stage, the Data team has built out a self-service data infrastructure platform that
solves for ingestion, governance, monitoring, and automation. It is no longer the data team’s
sole responsibility to take care of onboarding new data sources and integrating them into the
product’s feedback loops and ML models. This “productization” of data services unlocks
the Product and Data Science teams and allows them to quickly build new products and
features with the data they need.
The right balance is achieved with a thoughtful sequencing of architectural engineering work,
analytics, and application of analytics to business and product. The struggle for early-stage
Data Informed companies will be cultivating the necessary blend of technical and business
skills in the organization that can unlock meaningful insights efficiently. Teams with strong
communication lines between business, product, and engineering will sequence these efforts
more efficiently than teams with equivalent skill sets but siloed communications. Shared
business and technical fluency encourages the right sequencing by focusing on
understanding what grows the business and having the complementary technical know-how to
select tools, research solutions, and implement quickly without taking on large engineering
projects that would not add proportional value. It is a constant process of identifying
business needs, building the necessary capabilities, and seeing them unlock growth, which
leads to new business needs.