Data Strategies For AI Leaders Compressed
Data strategies for AI leaders
MIT Technology Review Insights
Key takeaways
1. Executives’ top ambition for generative AI adoption is driving increased efficiency or productivity (72%), far exceeding their interest in increasing revenue (30%) or reducing costs (24%).

2. Strong data capabilities will be essential underpinnings to these aspirations, but only 22% of businesses consider their data foundations “very ready” to support generative AI applications today.

3. The rise of AI exacerbates longstanding challenges in data management: data governance, security, and privacy (cited by 59%); data quality and timeliness (53%); and data integration (48%). It may also supply the urgency needed to finally address them.

Organizations are starting the heavy lifting to get real business value from generative AI. As Arnab Chakraborty, chief responsible AI officer at Accenture, puts it, “2023 was the year when clients were amazed with generative AI and the possibilities. In 2024, we are starting to see scaled implementations of responsible generative AI programs.”

Some generative AI efforts remain modest. As Neil Ward-Dutton, vice president for automation, analytics, and AI at IDC Europe, describes it, this is “a classic kind of automation: making teams or individuals more productive, getting rid of drudgery, and allowing people to deliver better results more quickly.”
For companies rolling out generative AI, these are not necessarily distinct choices. Chakraborty sees a “thin line between efficiency and innovation” in current activity. “We are starting to notice companies applying generative AI agents for employees, and the use case is internal,” he says, but the time saved on mundane tasks allows personnel to focus on customer service or more creative activities. Gultekin agrees. “We’re seeing innovation with customers building internal generative AI products that unlock a lot of value,” he says. “They’re being built for productivity gains and efficiencies.”

Chakraborty cites marketing campaigns as an example: “The whole supply chain of creative input is getting re-imagined using the power of generative AI. That is obviously going to create new levels of efficiency, but at the same time probably create innovation in the way you bring new product ideas into the market.” Similarly, Gultekin reports that a global technology conglomerate and Snowflake customer has used AI to make “700,000 pages of research available to their team so that they can ask questions and then increase the pace of their own innovation.”

The impact of generative AI on chatbots (in Gultekin’s words, “the bread and butter of the recent AI cycle”) may be the best example. The rapid expansion in chatbot capabilities using AI sits on the border between the improvement of an existing tool and the creation of a new one. It is unsurprising, then, that 44% of respondents see improved customer satisfaction among the primary types of value they hope to achieve.

A closer look at our survey results reflects this overlap between productivity enhancement and product or service innovation. Nearly one-third of respondents (30%) included both increased productivity and innovation in the top three types of value they hope to achieve with generative AI.

But efficiency gains are not the only path to product or service innovation. Some companies, Chakraborty [...] example. They, he says, are asking fundamental questions about the technology’s power: “How can I use generative AI to create new treatment pathways or to reimagine my clinical trials process? Can I accelerate the drug discovery time frame from 10 years to five years to one?”

Data strategy underlies AI innovation
Behind the diversity of ways in which respondents hope to secure value from generative AI looms one commonality: the need for enormous quantities of the business’s own data, accessibly stored and ready to use. Off-the-shelf AI tools will not differentiate businesses, given that their adoption will soon be universal. For any enterprise AI use case, says Ward-Dutton, “there is no value without good business data” of the company’s own.

Executive aspirations for generative AI adoption
What are the primary types of value your organization hopes to achieve from its generative AI efforts?
(First choice / Second choice / Third choice / Total)
Increased efficiency or productivity: 36% / 23% / 13% / 72%
Increased market competitiveness: 24% / 14% / 17% / 55%
Improved customer satisfaction: 7% / 18% / 19% / 44%
Increased revenue: 30% total (individual choice shares of 9%, 10%, and 11%)
Reduced costs: 6% / 10% / 8% / 24%
How ready, though, are most companies’ data estates to support them in the race to generative AI value? Gultekin puts it plainly: “The data foundation is at the core of generative AI capabilities.” Data foundations cover a broad collection of processes and assets involved in the gathering, aggregation, storage, and accessibility of organizational data.

Chakraborty identifies three specific data capabilities necessary to support the effective deployment of generative AI: the quality of the data; the ability to integrate multiple sources of data; and the timely democratization of data to relevant business users. These are persistent difficulties, he notes, saying “all have been issues that we have heard about for 10 years.” These challenges will not easily be addressed this time, either, he warns: “We still will be 10 years from now, as well, because data is very dynamic and the speed at which it is getting created is magnifying every month.”

At first glance, most poll respondents seem positive about the state of their company data foundations. The majority rate their business “very ready” (22%) or “somewhat ready” (53%).

Chart: respondents’ ratings of their data foundations’ readiness to support generative AI
Very ready: 22%
Somewhat ready: 53%
Somewhat unready: 17%
Very unready: 8%
Source: MIT Technology Review Insights poll, 2024
IDC’s research, Ward-Dutton says, shows that “somewhat ready” is rarely “almost ready” in practice. The company’s surveys have found that only 30 to 40% of businesses are confident in their ability to perform each of the following data controls in their AI work: strictly control sensitive data with certainty that none will be leaked during model training or use; manage use of third-party IP included in the models they are using and ensure that their own IP does not leak; and track and control how generative AI is interacting with their own internal data. His conclusion from such data is that “overall, readiness is pretty equivocal.”

Nor is it all clear sailing for companies self-ranked “very ready” by respondents. Instead, investment of resources and time in their data foundations may just have made clear their next set of challenges.

Companies are also addressing hallucinations through improved techniques for ensuring accurate AI responses. Retrieval augmented generation (RAG), for example, is a technique used to ensure that AI outputs are grounded in verifiable outside data. Guardrails can impose firm boundaries around the language or topics generative AI outputs can contain, and human and computer review of outputs can iteratively improve an AI application’s results.

And sometimes it’s best for AI to simply acknowledge that it doesn’t know. Gultekin says, “We’ve been focused on building products that know when to abstain from answering a question. LLMs are blissfully ignorant about when they should not be answering, which is a dangerous slope for businesses’ critical decision-making. We’ve been investing in making systems that know when not to answer.”
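The two ideas described in this section, grounding answers in retrieved data and abstaining when no good grounding exists, can be sketched in a few lines. The example below is a toy illustration under stated assumptions, not any vendor’s implementation: the names `Passage`, `overlap_score`, and `answer`, the word-overlap relevance measure, and the 0.5 threshold are all hypothetical. A production RAG system would typically use embedding-based search and pass the retrieved passage to a language model rather than returning it directly.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Passage:
    """One piece of company data, with the source it came from."""
    source: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and keep words longer than three characters."""
    words = (w.strip(".,?!\"'").lower() for w in text.split())
    return {w for w in words if len(w) > 3}


def overlap_score(question: str, passage: Passage) -> float:
    """Fraction of the question's words that also appear in the passage."""
    q_words = tokenize(question)
    if not q_words:
        return 0.0
    return len(q_words & tokenize(passage.text)) / len(q_words)


def answer(question: str, corpus: list[Passage], min_score: float = 0.5) -> str:
    """Ground the reply in the best-matching passage, or abstain."""
    best = max(corpus, key=lambda p: overlap_score(question, p), default=None)
    if best is None or overlap_score(question, best) < min_score:
        # Abstain rather than guess when nothing relevant was retrieved.
        return "I don't know: no sufficiently relevant internal document was found."
    # A real system would hand `best.text` to an LLM as context here.
    return f"According to {best.source}: {best.text}"


corpus = [
    Passage("hr_policy.txt", "Employees accrue 20 vacation days per year."),
    Passage("it_policy.txt", "Laptops are replaced every three years."),
]
print(answer("How many vacation days do employees accrue?", corpus))
print(answer("What is the capital of France?", corpus))
```

The first question overlaps heavily with the HR passage, so the reply cites that source. The second has no support in the corpus, so the sketch declines to answer instead of improvising.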
That said, generative AI’s benefits are becoming visible to those companies farther along the road. Gultekin reports that companies that have invested heavily in data foundations are “all now reaping the benefits, because they are able to bring AI onto that data.” Ward-Dutton adds that this deployment of generative AI represents a business leap as well as a technological one. Companies that have invested in data governance and quality, he says, “typically have also got to the point where they’re really engaging business people in how to manage data.” Now, the business people see, “‘Wow, this data actually has value for me.’”

Data challenges cited by survey respondents:
Data governance, security, or privacy: 59%
Data quality or timeliness: 53%
Costs or resource investment: 49%
Data silos or data integration challenges: 48%
Access to scalable computing power: 25%
[...] to bring AI closer to their data, ensuring that data security and privacy are upheld. On top of that, customers want assurances that they are not liable for the data that the large language model was trained with, and they want to ensure that their data does not get used to improve the model for others without their permission. These are all table stakes and should be the industry standard.”

Safe use of generative AI requires careful governance of its data sources. Chakraborty explains that the vast amount of public and company data used to train a large language model brings with it inherent risks. “You need a very surgical view in data governance,” he says. From sourcing data to creating outputs, companies require very strong data governance, including clarity on who owns the data and who stewards it. Adding to the challenge, Gultekin explains, “is that the new data generative AI makes accessible may not itself be fully governed.”

Data quality and timeliness are another predictable concern. As Chakraborty puts it, “if your data quality is not clean, you are going to get all kinds of garbage.” Yet because of the opaque way in which the technology works, this may not be obvious. This is where data lineage becomes critical, ensuring transparency and traceability of data as it moves through systems. For those scaling up generative AI pilots, thereby giving the technology a greater role, such quality issues magnify, adds Gultekin.

While IT departments are typically told to fix low-quality data, the problem originates where the information began “in the first place: that typically is the business,” says Ward-Dutton. Employees who understand the value of their data “are much more likely to care about the quality of what they’re providing,” he says. “The technical or the capability piece and the organizational and cultural one are two sides of the same coin.”

Data silos and data integration remain a challenge for nearly half of organizations. Gultekin notes that generative AI has heightened awareness of this problem as well. By making large amounts of data “much more useful and much more accessible,” he says, “it is shedding a lot of light on the data infrastructure that is needed.” He argues that investment in a single, standardized data foundation across the organization will enable much more powerful generative AI uses. It will also reduce governance and security concerns: “When you keep your data in one place for one thing, another place for another thing, governing and securing that data becomes really difficult,” he says.

Spending and resource decisions, including those needed to shore up data foundations, are of course a challenge when it comes to any technology investment. On the bright side, the cost of generative AI itself is decreasing substantially. In particular, Gultekin reports, in recent months, enterprises have begun creating smaller LLMs that remain extremely capable while being less expensive. “All these distilled models are still as good,” he says. “They are also very cost-effective and very good from a latency perspective.”

As organizations feel increasing urgency to deploy AI applications, they are realizing that their data is the key to how quickly and effectively they can unlock new value. Many will find that their generative AI ambitions are just castles in the air if they don’t have the right data foundations to support them. A strong data strategy, however, can guide them to surmount their data challenges on the way to AI success.
“Data strategies for AI leaders” is an executive briefing paper by MIT Technology Review Insights. We would like to thank
all participants as well as the sponsor, Snowflake. MIT Technology Review Insights has collected and reported on all
findings contained in this paper independently, regardless of participation or sponsorship. Teresa Elsey was the editor
of this report, and Nicola Crepaldi was the publisher.
Illustrations
Cover art and spot illustrations created with Adobe Stock.
While every effort has been taken to verify the accuracy of this information, MIT Technology Review Insights cannot accept any responsibility or liability for reliance by any person
on this report or any of the information, opinions, or conclusions set out in this report.