AI Infrastructure 101
In this E-Guide:
How to develop a successful, modern AI infrastructure
As companies race to execute AI projects, capital investment in tools and technology that
support AI capabilities has skyrocketed. But AI infrastructure isn't as easy as a single
purchase.
Continue reading to learn how to create a successful, modern AI infrastructure
without making hasty CapEx purchases.
The shift from experimental research and academic approaches to accepted, commonly used
systems is rapidly changing the tools landscape. In this area of continuous development,
it's difficult to keep track of the new tools on the market that organizations of all
types are using to develop their AI infrastructure.
popular Jupyter notebook, and Google's Colaboratory built on top of that, as well as a wide
range of open source tools and toolkits covered in a previous article on this topic.
However, open source is not the be-all and end-all for machine learning model development.
The tools alone lack specific capabilities for the management of models and data that are
needed by serious machine learning-focused data scientists and developers. As a result,
over the past decade, tools focused on the immediate need to build and train machine
learning models have emerged. These tools focus on algorithm selection, tuning and
evaluation, with the final result being Python, R, Java or other objects that can then be
used directly to answer specific ML-related queries or data science needs, or put into
operation by production teams for use at greater scale.
These tools include those from the major platform vendors, including Amazon, Microsoft,
Google and IBM, as well as from focused data science and ML vendors, including H2O,
RapidMiner, DataRobot, Databricks, Anaconda, Dataiku, Domino, KNIME, Alteryx, Ayasdi,
SAS and MathWorks. Because data science and ML model development are so data-dependent
and data-centric, big data vendors, including Cloudera and SAP, have also entered this
space. The tools all share a focus on data centricity, with many having origins in big
data or data analytics. As a result, the core features of these systems are algorithm- and
model-focused, and less focused on operationalization or consumption. However, model
operationalization and "ML ops" are rapidly moving to the forefront of the evolution of
these tools.
The biggest change in the machine learning development space has been the emergence of
autoML. Given the shortage of data science skills and expertise, many ML modeling and
development tools have released capabilities to automatically handle aspects of ML model
development that used to require the time and expertise of the user. In particular, data
scientists and ML developers had to clean and process their data, select among the wide
array of algorithms, configure and manage the training of the model, tune the model and
select the right hyperparameters, handle model evaluation and carry out a variety of
additional steps required to operationalize the resultant models. AutoML tools have emerged
to handle many, if not all, of those steps. As a result, organizations are increasingly
able to simply drag and drop their data set into a tool, click a few options, then watch
and wait as a suitable model is automatically selected, tuned, configured and set up for
operationalization. AutoML offerings include open source solutions such as Auto-sklearn,
Auto-WEKA, OptiML AutoML and TPOT, as well as commercial offerings from companies
such as Cloudera, DataRobot, Google, H2O.ai, RapidMiner and others.
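To make the steps that autoML takes over more concrete, here is a minimal sketch in plain
Python of automated algorithm selection and hyperparameter tuning. It is an illustration
only, not how any of the products named above work internally: the toy data set, the two
candidate algorithms and all names (auto_select, SEARCH_SPACE, and so on) are assumptions
made for this example.

```python
import random
from collections import Counter

def majority_baseline(train, _params):
    """Candidate 1: always predict the most common training label."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label

def knn(train, params):
    """Candidate 2: k-nearest-neighbours vote, with k as the hyperparameter."""
    k = params["k"]
    def predict(x):
        nearest = sorted(
            train,
            key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x)),
        )[:k]
        return Counter(y for _, y in nearest).most_common(1)[0][0]
    return predict

def accuracy(model, rows):
    """Fraction of rows whose label the model predicts correctly."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def auto_select(train, valid, search_space):
    """Train every (algorithm, hyperparameters) pair; keep the best on validation data."""
    best = None
    for name, builder, grid in search_space:
        for params in grid:
            score = accuracy(builder(train, params), valid)
            if best is None or score > best["score"]:
                best = {"name": name, "params": params, "score": score}
    return best

# Toy data (an assumption for this sketch): two well-separated 2-D clusters.
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), "a") for _ in range(40)]
data += [((rng.gauss(5, 1), rng.gauss(5, 1)), "b") for _ in range(40)]
rng.shuffle(data)
train, valid = data[:60], data[60:]

SEARCH_SPACE = [
    ("baseline", majority_baseline, [{}]),
    ("knn", knn, [{"k": 1}, {"k": 3}, {"k": 5}]),
]

best = auto_select(train, valid, SEARCH_SPACE)
print(best["name"], best["params"], best["score"])
```

Real autoML tools search far larger spaces and also automate data cleaning and deployment
steps, but the core loop of trying candidates and keeping the best-scoring one is similar
in spirit.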
Cloud ML Engine brings Google's hosted platform to the fore, enabling developers and data
scientists to develop and run machine learning models and datasets. Microsoft Azure ML
likewise provides a wide range of tools and solutions for data scientists, developers and
administrators looking to put ML into production.
Separate from the MLaaS market is the concept of model as a service. Rather than
providing the environment to build, run and manage your own models, model as a service
gives you access to prebuilt and trained models specific to individual tasks. Clarifai,
Gumgum, Modeldepot, Imagga and SightHound are major companies building and curating
ML models for use. As a developer, you can query these models, which return results as
specified. For example, some models might identify specific things in images, while others
might help you categorize text or process natural language. Many model-as-a-service
offerings focus on specific models targeted to image recognition or text analysis, but
there is an emerging class of companies trying to gather a widely curated set of models
applicable to many different domains.
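As a rough illustration of what querying a prebuilt model can look like from the
developer's side, the sketch below builds an HTTP request for an image-tagging model and
filters a sample response. Everything vendor-specific here is hypothetical: the endpoint
URL, the API key placeholder, the request payload and the response fields are assumptions
for this example and do not describe the API of any company named above.

```python
import json
import urllib.request

# Hypothetical endpoint and key: placeholders, not a real vendor API.
API_URL = "https://api.example.com/v1/models/image-tagger/predict"
API_KEY = "YOUR_API_KEY"

def build_request(image_url):
    """Package the input as a JSON POST request to the hosted model."""
    payload = json.dumps({"image_url": image_url}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def top_labels(response_body, min_confidence=0.8):
    """Keep only labels the (assumed) response reports with enough confidence."""
    predictions = json.loads(response_body)["predictions"]
    return [p["label"] for p in predictions if p["confidence"] >= min_confidence]

request = build_request("https://example.com/cat.jpg")

# Sending is left commented out because the endpoint is fictional:
# with urllib.request.urlopen(request) as resp:
#     body = resp.read()

# Sample response in the assumed format, used here instead of a live call.
sample_body = json.dumps({
    "predictions": [
        {"label": "cat", "confidence": 0.97},
        {"label": "sofa", "confidence": 0.41},
    ]
})
labels = top_labels(sample_body)
print(labels)
```

The point of the pattern is that the developer supplies only inputs and reads back
results; training, hosting and updating the model remain the provider's responsibility.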
Using models in production raises a number of concerns, including making sure that models
provide reliable, secure and manageable results in an environment of continuous
change. An emerging set of ML ops tools provides capabilities for machine learning model
governance, version control, security, model discovery, model transparency, and model
monitoring and management. These tools, like ParallelM, make sure that only qualified
users are allowed to make use of certain models, help ensure that new versions of models
don't cause unpredictable results, help safeguard models from data poisoning and
cybersecurity attacks, and make sure that the models continue to provide results at the
levels of accuracy and precision required by their usage constraints.
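Two of the checks described above, restricting who may use a model and making sure a new
version does not regress, can be sketched in a few lines of Python. This is a simplified
illustration under stated assumptions, not the behavior of ParallelM or any other product:
the ModelVersion and Policy classes, the threshold values and the user names are all
hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelVersion:
    name: str
    version: int
    accuracy: float   # measured on a held-out evaluation set
    precision: float

@dataclass(frozen=True)
class Policy:
    min_accuracy: float
    min_precision: float
    max_accuracy_drop: float  # allowed regression versus the live version
    allowed_users: frozenset

def can_invoke(user, policy):
    """Access control: only listed users may call the model."""
    return user in policy.allowed_users

def promotion_issues(candidate, live, policy):
    """Return the reasons a candidate version should not replace the live one."""
    issues = []
    if candidate.accuracy < policy.min_accuracy:
        issues.append("accuracy below required level")
    if candidate.precision < policy.min_precision:
        issues.append("precision below required level")
    if live.accuracy - candidate.accuracy > policy.max_accuracy_drop:
        issues.append("accuracy regressed versus live version")
    return issues

# Hypothetical values chosen for illustration only.
policy = Policy(
    min_accuracy=0.90,
    min_precision=0.88,
    max_accuracy_drop=0.02,
    allowed_users=frozenset({"alice", "fraud-service"}),
)
live = ModelVersion("fraud-detector", 2, accuracy=0.93, precision=0.90)
good = ModelVersion("fraud-detector", 3, accuracy=0.94, precision=0.91)
bad = ModelVersion("fraud-detector", 4, accuracy=0.90, precision=0.85)

print(promotion_issues(good, live, policy))
print(promotion_issues(bad, live, policy))
print(can_invoke("mallory", policy))
```

Running the same checks on a schedule against fresh evaluation data, rather than only at
release time, is one way to notice when a live model drifts below its required levels.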
Fundamental skills still needed
Before an enterprise can get started with AI, there are a few key considerations to weigh.
While it is true that an increasingly diverse range of users are able to develop and
make use of models, ML developers and users still need the skill sets to use these systems
effectively. At the most fundamental level, organizations still need data scientists with
mathematical knowledge and a solid understanding of algorithms in order to build their own
models. To not only yield effective results but also analyze and understand the results
being given, it is crucial for a citizen data scientist on the team to be well versed in
probability and statistics, an integral part of working with machine learning.
Since a fair amount of exploration and theorizing goes into determining how to use these
systems to yield the desired results, having an employee who can think outside the box and
is willing to truly explore the limits of these systems is important when it comes to
getting the most out of them. Since AI platforms and tools are constantly changing, the ML
team also needs to stay up to date on modern methods for ML model creation, autoML
capabilities, ML ops and other rapidly changing technology ecosystem considerations. The
artificial intelligence landscape is constantly changing, making it important for those
working within this area to understand that today's platform investment in their AI
infrastructure might have to change tomorrow.