DevOps For AI
1. Prepare
- Idea generation
- Business case
- Validate idea

4. Test
- Model testing
- Wizard of Oz
- Prepare feedback loop
[Diagram: the DevOps for AI cycle - Prepare for AI, AI Design, Build + Train, Test, Deploy for AI, Operate, Monitor for AI, Evaluate]
DEVOPS FOR AI - AN INTRODUCTION
find a good way that works within your context.

The DevOps for AI cycle consists of the following phases:

Each phase has a detailed model with the most important activities. The models are not designed to be complete and comprehensive. Any valuable additions are welcomed.
[Diagram: ‘Prepare for AI’ - Idea generation, Business case, AI pilot, Data availability; activities include AI/ML use cases, Value Proposition, Experiment Canvas, Feedback, GDPR, Data authorisation]
During ‘Prepare for AI’ we try to validate our idea as soon as possible. Is it possible to make a significant impact on the business goals? To start validating we probably need some data and an idea for an initial AI/ML model. The proposed solution must be feasible within the boundaries of our company’s data policy.
[Diagram: ‘AI Design’ - AI architecture, Specifications, Tool selection, Functional and non-functional requirements; activities include Usability requirements, AI/ML platform, Monitoring requirements, Locate data]
In the phase ‘AI Design’ we draw the first contours of our solution. We answer questions such as: which architectures are feasible, which data is needed, and where is the data located? The functional and non-functional requirements set a framework for the solution.
[Diagram: ‘Build + Train’ - Refine, Data pre-processing, Build model, Validate model; activities include Metrics, Data quality, Algorithm selection, Feature engineering, Solution refinement, Pre-processing pipeline]
During ‘Build + Train’ we create our models and the software to serve the models and
choose the metrics to optimize our models on. For continuous delivery purposes we
instantiate data pre-processing and machine learning pipelines next to our regular build.
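The idea of keeping pre-processing and training together as one reproducible unit can be sketched in a few lines. Everything here is an illustrative assumption, not part of the DevOps for AI model itself: the `Pipeline` class, the toy 0-1 scaling step, and the trivial mean-predicting model.

```python
# Minimal sketch of a pre-processing + training pipeline kept as one unit,
# so the exact same steps run in the build, in CI, and in production.
# All names (Pipeline, preprocess, train_mean_model) are illustrative.

def preprocess(rows):
    """Scale each value to the 0-1 range (toy pre-processing step)."""
    lo, hi = min(rows), max(rows)
    return [(x - lo) / (hi - lo) for x in rows]

def train_mean_model(rows):
    """'Train' a trivial model: always predict the mean of the training data."""
    mean = sum(rows) / len(rows)
    return lambda _x: mean

class Pipeline:
    """Chains pre-processing and training so both are built and versioned together."""
    def __init__(self, steps):
        self.steps = steps

    def fit(self, data):
        for step in self.steps[:-1]:   # run every pre-processing step
            data = step(data)
        self.model = self.steps[-1](data)  # last step trains the model
        return self

    def predict(self, x):
        return self.model(x)

pipe = Pipeline([preprocess, train_mean_model]).fit([2.0, 4.0, 6.0])
print(pipe.predict(5.0))  # scaled training data [0.0, 0.5, 1.0] -> mean 0.5
```

Because the pipeline object holds both the pre-processing and the trained model, the continuous-delivery build can ship it as a single artifact.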
[Diagram: ‘Test’ - Model testing, Software testing, Security testing]
During ‘Test’ we test our models for production and make sure the whole solution works as expected. This means we perform traditional DevOps software testing with extra focus on security testing: we don’t want to expose sensitive data, and we protect our AI/ML models against abusive use.
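As a sketch of what a model test in this phase can look like, the following quality gate asserts a minimum accuracy before a model may ship. The threshold, predictions and labels are made up for illustration; in a real pipeline they would come from the trained model and a held-out test set.

```python
# Sketch of a model quality gate, run in the test phase next to the
# regular software tests. Threshold and data are illustrative.

def accuracy(predictions, labels):
    """Fraction of predictions that match the expected labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def test_model_meets_accuracy_floor():
    # Hard-coded for illustration; normally produced by the model
    # on a held-out test set.
    predictions = ["spam", "ham", "spam", "ham", "spam"]
    labels      = ["spam", "ham", "ham",  "ham", "spam"]
    assert accuracy(predictions, labels) >= 0.75, "model below quality gate"

test_model_meets_accuracy_floor()
print("quality gate passed")
```

Wiring such a test into the build means a model that regresses below the agreed floor fails the pipeline like any other broken test.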
Most of the time when we think of testing our AI or Machine Learning model, it’s about accuracy or maybe even precision and recall. We presume that these measures tell us the correctness of our product. But the correctness of the model is not the same as a satisfied user. Let me tell you a short story of how we created a great model for tagging knowledge which produced poor results in the first user tests. And by the way, this story has a happy end.

Better, more tags and faster

One of the key elements of our model is the use of tags. A tag is a keyword or term assigned to a piece of information. This kind of metadata helps describe an item. The search, recommendation and profiling algorithms lean heavily on these tags. Long story short, they’re important for the system, so it’s important that users add the right tags to the information.
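For reference, precision and recall for a tag recommender can be computed as below. The tag sets for this single hypothetical document are invented for illustration.

```python
# Precision and recall for one document of a tag recommender.
# The tag sets are hypothetical examples.

suggested = {"devops", "ai", "testing", "cloud"}  # tags the model recommended
relevant  = {"devops", "ai", "ux"}                # tags the user actually wanted

true_positives = suggested & relevant             # recommended AND wanted
precision = len(true_positives) / len(suggested)  # 2 / 4 = 0.5
recall    = len(true_positives) / len(relevant)   # 2 / 3 ≈ 0.67

print(f"precision={precision:.2f} recall={recall:.2f}")
```

High values here say the recommendations are correct and complete, but, as the story below shows, nothing about whether users are actually helped by them.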
User testing

We were as confident as we could be. This was a neat model. All the measures were looking good. Precision high, pretty good recall. The user interface was slick and during typing the system recommended tag after tag after tag. Let the user testing begin. Nothing can go wrong, right?

Figure 1: First UI design
“Our usability testing expert came back with the first results and they were disappointing”
Users read all the tags and had to think for each tag whether it was the right one. So it slowed them down.

After some test rounds we found the perfect solution for this problem. We split the selected tags from the suggested tags and showed fewer tags. It took some testing to find the sweet spot for the number of tags to show. It wasn’t a game of just picking a number; the sweet spot is different for each piece of information. After round 4 we came up with the solution in Figure 2, and now we have better tags, more tags, and the tagging is faster and easier than ever.

Conclusion

So although the recommendation model performed well, it didn’t satisfy the user. We had to adjust the model to create the best solution for the user. We often think we know what our model does, but we never know how users react to it, unless we test it.
[Diagram: ‘Deploy for AI’ - Versioning, Continuous integration, Continuous deployment, Cluster management; activities include Model management, Data versioning, Code versioning, Train model for production, Data processing pipeline, Build code]
During ‘Deployment for AI’ we train our models for production and deploy them to the
production environment. This can be a public or private cloud with a GPU cluster. An
important part of the process is to version the AI/ML models along with the data and the
code.
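One way to sketch this versioning idea is to store the model version together with fingerprints of the data and code that produced it, so any production model can be traced back and rebuilt. The field names and the hashing scheme below are assumptions for illustration, not any specific tool’s format.

```python
# Sketch of a model record linking a model version to the data and code
# that produced it. Field names and fingerprinting are illustrative.
import hashlib
import json

def fingerprint(content: bytes) -> str:
    """Short content hash used as a version id (illustrative scheme)."""
    return hashlib.sha256(content).hexdigest()[:12]

model_record = {
    "model_version": "1.4.0",
    "data_version": fingerprint(b"training-set snapshot"),  # e.g. dataset hash
    "code_version": fingerprint(b"source tree"),            # normally a git SHA
}
print(json.dumps(model_record))
```

Shipping such a record alongside the model artifact makes the "which data and which code trained this?" question answerable long after deployment.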
[Diagram: ‘Operate’ - Software, Model, Data, Infra; activities include Bias monitoring, Model monitoring, Metric validation, Performance, Diagnostics, Logging, Human-in-the-loop, Data quality, Data flow, Network, Hardware, Capacity]
During ‘Operate’ we make sure the system keeps running the way it was defined. We operate the software and the infrastructure, but also the model and the data for the AI part of the solution. We make sure the model keeps performing as specified, even when it keeps learning and evolving.
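A minimal sketch of such model monitoring, assuming a single numeric input feature: compare the live mean of that feature with the mean seen at training time and raise an alert on drift. The window, threshold and values are made up.

```python
# Toy sketch of model monitoring during 'Operate': alert when the live
# input distribution drifts away from what the model was trained on.
# Threshold, window and feature values are illustrative.

TRAINING_MEAN = 10.0
DRIFT_THRESHOLD = 0.2  # alert when the live mean deviates more than 20%

def check_drift(live_values):
    """Return (alert, relative deviation) for a window of live values."""
    live_mean = sum(live_values) / len(live_values)
    deviation = abs(live_mean - TRAINING_MEAN) / TRAINING_MEAN
    return deviation > DRIFT_THRESHOLD, deviation

alert, deviation = check_drift([13.0, 12.5, 14.0, 13.5])  # live mean 13.25
print("drift alert" if alert else "ok")  # 32.5% deviation -> drift alert
```

A real setup would watch many features and model outputs, but the principle is the same: the specification the model was built against becomes a monitored invariant.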
[Diagram: ‘Monitor for AI’ - Value, Cost of ownership, Business, Usage, Feedback; activities include Business metrics, Cost monitoring, Mean Time to Repair, Return on Investment, Feedback loops]
During ‘Monitor for AI’ we monitor our process as well as our product. We monitor if we
meet the expected business impact and challenge this against the running cost. We also
monitor the usage of our product and collect user feedback to find improvements.
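Two of these monitoring numbers, Mean Time to Repair and a simple value-versus-cost ratio, can be computed as below; all figures are invented for illustration.

```python
# Sketch of two monitoring numbers: Mean Time to Repair for the process,
# and business impact challenged against running cost. Figures are made up.

repair_minutes = [30, 45, 120, 15]           # duration of four incidents
mttr = sum(repair_minutes) / len(repair_minutes)

monthly_value, monthly_cost = 12_000, 9_000  # business impact vs running cost
roi = (monthly_value - monthly_cost) / monthly_cost

print(f"MTTR={mttr:.1f} min, ROI={roi:.0%}")  # MTTR=52.5 min, ROI=33%
```

Tracking both over time is what lets the ‘Evaluate’ phase decide whether the expected business impact still justifies the running cost.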
[Diagram: ‘Evaluate’ - Pivot (Pivot process, Pivot product, Pivot strategy), Improve (Retrospective, Improve solution, Process improvement), Validate (Product, Process, Business case)]
During ‘Evaluate’ we validate our business case based on our learnings and the outcomes of the product and process. We determine what to improve and pivot accordingly. We can pivot our business strategy, product or process, but we stay grounded in our vision and learned facts.
DEVOPS CAN PREVENT AI EXPERIMENTS
Although more and more time and budget is being made available within companies and institutions to investigate the possibilities of AI, the actual application is often still in an experimental phase. After that phase, the moment should follow when a new AI tool is actually taken into use by the business; and that is exactly where it still lingers.

The reasons why AI experiments still often fail within sight of the finish line are not new - they are phenomena IT has been dealing with for years. Fortunately, there has also been a solution for these obstacles for years, namely DevOps teams. This way of working not only ensures that an AI

average IT person can slow down or even prevent the putting into practice of a new AI solution.

An example: when a team develops an AI chat bot to handle customer inquiries, the team usually wants a solution that always comes up with the right answer. If the accuracy then turns out to be ‘only’ 80 percent, one conclusion may be that the tool is not working properly. What is actually forgotten is that people are not flawless either; a customer service representative does not always immediately know the right answer to a question. In these cases you have quickly made a business case if the accuracy of an AI chatbot can be improved, for example from 72 to 80 percent. That a customer service representative is still needed in 20 per cent of cases does not have to be a problem, as long as the investment pays for itself through the improvement you have made.

This too is a typical example of a DevOps approach; an incremental improvement or ‘minimal viable product’ also has value, especially in the experiment phase. Learn to live with small steps; several small steps eventually lead to a big impact.

DevOps therefore has the potential to overcome the main pitfalls of AI experiments (and in particular that translation into practice). And that's in three success factors.

The use of qualitative data

The success of an AI solution depends on the data that you have at your disposal and its quality. Ideally, you as a development team want access to many different data sources within the organisation to test and develop. In practice, things soon go wrong there. The owners of databases are often spread across the organisation, so a lot of time and energy can be spent collecting data. Teams

Involving the end user

But perhaps the main advantage of working in DevOps teams is to involve