Understand The Technology Ecosystem
The new transformer architecture brings us to the third major factor in the rapid
advancement of generative AI: computational power. It takes a lot of processing
power to do the math behind AI model training. Historically, AI models were designed
in a way that required a sequence of calculations, run one after the other. The
transformer architecture is different: it relies on many separate calculations that
can run concurrently.
So, one computer processor can do the first calculation while a different processor
does the second at the same time. That’s called parallel computing, and it greatly
reduces the time it takes to train a transformer. On top of that, in recent years
processors that can perform parallel computing have become much more powerful and
abundant.
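To make the difference concrete, here's a minimal sketch in Python with NumPy. It's
a toy, not real model code: it contrasts processing a sequence one position at a
time with a single matrix multiply that covers every position at once, which is
exactly the kind of work parallel processors excel at.

    # Toy comparison (not real training code): sequential, one-position-
    # at-a-time processing versus one batched matrix multiply.
    import time
    import numpy as np

    seq_len, dim = 2048, 512
    tokens = np.random.rand(seq_len, dim)   # stand-in for a token sequence
    weights = np.random.rand(dim, dim)      # stand-in for model weights

    # Sequential style: positions are processed one after the other.
    start = time.perf_counter()
    out_seq = np.stack([tokens[i] @ weights for i in range(seq_len)])
    print(f"one at a time: {time.perf_counter() - start:.4f}s")

    # Transformer style: all positions in one operation, so parallel
    # hardware can work on many of them simultaneously.
    start = time.perf_counter()
    out_par = tokens @ weights
    print(f"all at once:   {time.perf_counter() - start:.4f}s")

    assert np.allclose(out_seq, out_par)  # same math, very different speed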
These three factors of data, architecture, and computing have converged to create
just the right conditions for training very capable large language models. One of
the best-known LLMs is GPT, which stands for generative pre-trained transformer. In
other words, a model that has already been trained and can be used to generate
text-related content.
Emerging Ecosystem
Right now, there are already hundreds of sites on the internet where you can go to
get hands-on with generative AI. When you visit one of those sites, you’re at the
tip of a technology iceberg. And that technology can come from a lot of different
sources. Let’s investigate the tech stack that makes it possible to bring awesome
generative AI experiences to the masses.
At the bottom of the iceberg, we start with the compute hardware providers.
Training an LLM can take a staggering amount of computational power, even with the
parallelism of a transformer. It also takes computing power to process requests to
actually use the model after it’s been trained. Technically you can train AI models
on any computing hardware, but processors that excel at parallel computing are
ideal. Today the biggest name in AI compute is Nvidia.
Next are the cloud platforms that allow developers to tap into the compute hardware
in a cloud deployment model. Devs can rent just the compute they need for a
specific project, and the platforms can efficiently distribute requests for
computing time across a connected system. Google, Amazon, Microsoft, and Oracle are
the main tech providers in this space.
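As a loose illustration of how requests for computing time get distributed, here's
a toy Python sketch: a small worker pool stands in for rented processors, picking
up queued jobs as capacity frees up. Real cloud platforms schedule this across
whole data centers rather than threads, but the shape of the idea is the same.

    # Toy illustration of distributing compute requests across workers.
    from concurrent.futures import ThreadPoolExecutor

    def run_job(job_id):
        """Stand-in for a chunk of model-training or inference work."""
        return f"job {job_id} done"

    # Four "processors" shared among ten queued requests for compute time.
    with ThreadPoolExecutor(max_workers=4) as pool:
        for result in pool.map(run_job, range(10)):
            print(result)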
AI models, including LLMs, are the next layer. These models are carefully crafted
using research techniques and trained using a combination of public and privately
curated data. Developers can connect to LLMs through an application programming
interface (API), so they can harness the full power of NLP in their own
applications. The trained and accessible AI model is commonly referred to as a
foundational model. Because these models are accessed through an API, developers
can easily switch from one foundational model to another as needed. A few examples
of foundational models are GPT-4, Claude, Stable Diffusion, and LLaMA.
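For a sense of what that API access looks like, here's a hedged sketch in Python.
The endpoint URL, model name, request fields, and response shape are all
placeholders, not any real provider's API; the point is that an application only
depends on one small function, so swapping foundational models is mostly a
configuration change.

    # Hypothetical example: every name below (URL, key, model, response
    # fields) is a placeholder, not a real provider's API.
    import requests

    API_URL = "https://api.example-provider.com/v1/generate"  # placeholder
    API_KEY = "your-api-key"                                  # placeholder

    def generate(prompt, model="example-model-v1"):
        """Send a prompt to a hosted foundational model, return the text."""
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            json={"model": model, "prompt": prompt, "max_tokens": 200},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()["text"]  # response field name is assumed

    # Switching foundational models can be as simple as changing one
    # argument, since the rest of the app talks only to generate().
    print(generate("Explain parallel computing in one sentence."))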
The next layer is infrastructure optimization, which is all about providing tools
and services that make for more efficient and higher-quality model training. For
example, a service might offer perfectly curated data sets to train on. Another
might provide analytics to test the accuracy of generated content. It’s also at
this point where foundational models can be fine-tuned with specialized,
proprietary data to better meet the needs of a particular company. This is a busy
space in the AI ecosystem, with many companies offering a variety of optimization
services.
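To make that fine-tuning step a bit more tangible, here's a minimal Python sketch
of preparing proprietary examples for a fine-tuning run. The JSONL
prompt/completion layout is a common convention, but each provider defines its own
format, so treat the field names here as assumptions.

    # Illustrative only: field names and file format are assumptions;
    # check what your model provider or optimization service expects.
    import json

    # A company's proprietary examples, e.g. real support questions
    # paired with the approved answers agents actually gave.
    examples = [
        {"prompt": "Customer: My order never arrived.",
         "completion": "I'm sorry to hear that. Let me look up the shipment."},
        {"prompt": "Customer: How do I reset my password?",
         "completion": "You can reset it under Settings, then Security."},
    ]

    # One JSON object per line (JSONL) is a common hand-off format.
    with open("fine_tune_data.jsonl", "w") as f:
        for example in examples:
            f.write(json.dumps(example) + "\n")

The resulting file would then be handed to whichever service or provider performs
the actual fine-tuning run.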
Finally, we find ourselves back at the tip of the iceberg: the applications.
Developers of all kinds can tap into optimization services and foundational models
for their apps. Already we see LLM-powered standalone tools, as well as plugins for
mainstream applications.
[Diagram: the AI tech stack, from compute hardware at the bottom up to applications at the top]
Many businesses are just starting to get a handle on what AI can do for them. Given
the unprecedented demand for AI technology, there's a huge amount of opportunity for
businesses to make their mark at several levels of the AI tech stack. At the same
time, working at any of those levels means grappling with some serious concerns.
Data security: Businesses can share proprietary data at two points in the
generative AI lifecycle: first, when fine-tuning a foundational model, and second,
when actually using the model to process a request containing sensitive data.
Companies that offer AI services must demonstrate that trust is paramount and that
data will always be protected. (A simple redaction sketch follows this list.)
Plagiarism: LLMs and AI models for image generation are typically trained on
publicly available data. There's the possibility that a model will learn a
creator's style and then replicate it. Businesses developing foundational models
must take steps to add variation into the generated content. They may also need to
curate the training data to remove samples at the request of content creators.
User spoofing: It's easier than ever to create a believable online profile,
complete with an AI-generated picture. Fake users like these can interact with real
users (and with other fake users) in a very realistic way. That makes it hard for
businesses to identify bot networks that exist to promote their own bot-generated
content.
Sustainability: The computing power required to train AI models is immense, and the
processors doing the math draw a lot of electrical power. As models get
bigger, so do their carbon footprints. Fortunately, once a model is trained it
takes relatively little power to process requests. And, renewable energy is
expanding almost as fast as AI adoption!
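Here's the redaction sketch mentioned under data security: a minimal,
assumption-laden Python example of masking obviously sensitive strings before a
request ever leaves a company's systems. The regular expressions are simplistic
placeholders, nowhere near production-grade PII detection.

    # Toy example of protecting sensitive data in a model request.
    # These patterns are deliberately simple placeholders.
    import re

    PATTERNS = {
        "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.\w+"),
        "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
        "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    }

    def redact(text):
        """Replace sensitive substrings with labeled placeholders."""
        for label, pattern in PATTERNS.items():
            text = pattern.sub(f"[{label}]", text)
        return text

    prompt = "Draft a reply to jane@example.com about SSN 123-45-6789."
    print(redact(prompt))
    # -> Draft a reply to [EMAIL] about SSN [SSN].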
In Summary
Generative AI is capable of assisting businesses and individuals alike with all
sorts of language-based tasks. The convergence of lots of data, clever AI
architecture, and huge amounts of computing power has supercharged generative AI
development and the growth of the AI ecosystem.