Generative AI in the Enterprise
Mike Loukides
1 Meta has dropped the odd capitalization for Llama 2. In this report, we use LLaMA to refer to the LLaMA models generically (LLaMA, Llama 2, and Llama n, when future versions exist), although the capitalization varies. Similarly, we use Claude to refer both to the original Claude and to Claude 2, and Bard to refer to Google's Bard model and its successors.
Executive Summary
We’ve never seen a technology adopted as fast as generative AI; as of November 2023, it’s hard to believe that ChatGPT is barely a year old.
Which Model?
While the GPT models dominate most of the online chatter, the
number of models available for building applications is increasing
rapidly. We read about a new model almost every day—certainly
every week—and a quick look at Hugging Face will show you more
models than you can count. (As of November, the number of models
in its repository is approaching 400,000.) Developers clearly have
choices. But what choices are they making? Which models are they
using?
It’s no surprise that 23% of respondents report that their companies
are using one of the GPT models (2, 3.5, 4, and 4V), more than any
other model. It’s a bigger surprise that 21% of respondents are
developing their own model; that task requires substantial resources
in staff and infrastructure. It will be worth watching how this evolves:
will companies continue to develop their own models, or will they
use AI services that allow a foundation model (like GPT-4) to be
customized?
16% of the respondents report that their companies are building on
top of open source models. Open source models are a large and
diverse group. One important subsection consists of models derived
from Meta’s LLaMA, such as Alpaca and Vicuna, along with tools
like llama.cpp for running them locally.
These models are typically smaller (7 to 14 billion parameters) and
easier to fine-tune, and they can run on very limited hardware;
many can run on laptops, cell phones, or nanocomputers such as
the Raspberry Pi. Training requires much more hardware, but the
ability to run in a limited environment means that a finished model
can be embedded within a hardware or software product. Another
subsection of models has no relationship to LLaMA: RedPajama,
Falcon, MPT, Bloom, and many others, most of which are available
on Hugging Face. The number of developers using any specific
model is relatively small, but the total is impressive and
demonstrates a vital and active world beyond GPT. These “other” models
have attracted a significant following. Be careful, though: while this
group of models is frequently called “open source,” many of them
carry licenses that restrict how they can be used or redistributed.
Only 1% are building with Google’s Bard, which perhaps has less
exposure than the others. A number of writers have claimed that
Bard gives worse results than the LLaMA and GPT models; that
may be true for chat, but I’ve found that Bard is often correct
when GPT-4 fails. For app developers, the biggest problem with
Bard probably isn’t accuracy or correctness; it’s availability. In March
2023, Google announced a public beta program for the Bard API.
However, as of November, questions about API availability remain
unanswered.
What Stage?
When asked what stage companies are at in their work, most
respondents shared that they’re still in the early stages. Given that
generative AI is relatively new, that isn’t news. If anything, we should
be surprised that generative AI has penetrated so deeply and so
quickly. 34% of respondents are working on an initial proof of
concept. 14% are in product development, presumably after developing
a PoC; 10% are building a model, also an early-stage activity; and
8% are testing, which presumes that they’ve already built a proof
of concept and are moving toward deployment—they have a model
that at least appears to work.
2 Many articles quote Gartner as saying that the failure rate for AI projects is 85%.
We haven’t found the source, though in 2018, Gartner wrote that 85% of AI projects
“deliver erroneous outcomes.” That’s not the same as failure, and 2018 significantly
predates generative AI. Generative AI is certainly prone to “erroneous outcomes,” and
we suspect the failure rate is high. 85% might be a reasonable estimate.
Missing Skills
One of the biggest challenges facing companies developing with AI
is expertise. Do they have staff with the necessary skills to build,
deploy, and manage these applications? To find out where the skills
deficits are, we asked our respondents what skills their organizations
need to acquire for AI projects. We weren’t surprised that AI
programming (66%) and data analysis (59%) are the two most needed.
AI is the next generation of what we called “data science” a few
years back, and data science represented a merger between statistical
modeling and software development. The field may have evolved
from traditional statistical analysis to artificial intelligence, but its
overall shape hasn’t changed much.
Over half of the respondents (52%) included general AI literacy as
a needed skill. While the number could be higher, we’re glad that
our users recognize that familiarity with AI and the way AI systems
behave (or misbehave) is essential. Generative AI has a great wow
factor: with a simple prompt, you can get ChatGPT to tell you
about Maxwell’s equations or the Peloponnesian War. But simple
prompts don’t get you very far in business. AI users soon learn that
good prompts are often very complex, describing in detail the result
they want and how to get it. Prompts can be very long, and they
can include all the resources needed to answer the user’s question.
Researchers debate whether this level of prompt engineering will be
necessary in the future, but it will clearly be with us for the next
few years. AI users also need to expect incorrect answers and to be
equipped to check virtually all the output that an AI produces. This
is often called critical thinking, but it’s much more like the process
of discovery in law: an exhaustive search of all possible evidence.
Users also need to know how to create a prompt for an AI system
that will generate a useful answer.
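As a rough illustration (not taken from the survey), a serious prompt often bundles an explicit task description, the supporting documents, and the user's question into one string. The function and document names below are purely illustrative; a minimal Python sketch:

```python
def build_prompt(task, context_docs, question):
    """Assemble a detailed prompt from an explicit task description,
    supporting context documents, and the user's question.
    All names here are illustrative, not tied to any specific API."""
    # Label each supporting document so the model can cite it.
    context = "\n\n".join(
        f"[Document {i + 1}]\n{doc}" for i, doc in enumerate(context_docs)
    )
    return (
        f"{task}\n\n"
        "Use only the reference material below; if it does not contain "
        "the answer, say so.\n\n"
        f"{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    "You are a careful assistant. Answer concisely and cite the document you used.",
    ["Llama 2 was released by Meta in July 2023.",
     "ChatGPT was released by OpenAI in November 2022."],
    "When was ChatGPT released?",
)
print(prompt)
```

The resulting string would be sent to whatever model the application uses; the point is that the detailed instructions and included resources, not the question alone, do most of the work.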
We’re optimistic about generative AI’s future. It’s hard to realize that
ChatGPT has only been around for a year; the technology world
has changed so much in that short period. We’ve never seen a new
technology command so much attention so quickly: not personal
computers, not the internet, not the web. It’s certainly possible that
we’ll slide into another AI winter if the investments being made in
generative AI don’t pan out. There are definitely problems that need
to be solved: correctness, fairness, bias, and security are among
them.
Appendix
Methodology and Demographics
This survey ran from September 14, 2023, to September 27, 2023.
It was publicized through O’Reilly’s learning platform to all our
users, both corporate and individuals. We received 4,782 responses,
of which 2,857 answered all the questions. As we usually do, we
eliminated incomplete responses (users who dropped out partway
through the questions). Respondents who indicated they weren’t
using generative AI were asked a final question about why, and
their responses were counted as complete.
Any survey only gives a partial picture, and it’s very important
to think about biases. The biggest bias by far is the nature of
O’Reilly’s audience, which is predominantly North American and
European. 42% of the respondents were from North America, 32%
were from Europe, and 21% were from the Asia-Pacific
region. Relatively few respondents were from South America or
Africa, although we are aware of very interesting applications of AI
on these continents.
About the Author
Mike Loukides is vice president of content strategy for O’Reilly
Media, Inc. He’s edited many highly regarded books on technical
subjects that don’t involve Windows programming. He’s particularly
interested in programming languages, Unix and what passes for
Unix these days, and system and network administration. Mike is
the author of System Performance Tuning and a coauthor of Unix
Power Tools. Most recently, he’s been fooling around with data and
data analysis, exploring languages like R, Mathematica, and Octave,
and thinking about how to make books social. Mike can be reached
on Twitter as @mikeloukides and on LinkedIn.