Lecture 2-3
Engineering
Prof. Roberto Pietrantuono
Lecture 2
Outline
• The Functional Architecture
• From What to How: The Implementation Framework
• Development Process
• DevOps, MLOps
• A first simple full production pipeline
How to build an AI system?
The system engineering high-level V-Model
[V-Model diagram: the descending branch covers decomposition and definition, moving from Stakeholder's requirements through System development (with integration, verification and validation planning) and Upper-level system element development (architecture decomposition and definition) down to Lower-level system element development; the ascending branch covers integration and verification, moving from Lower-level system element realization through Upper-level system element realization (architecture integration and verification) and System realization up to the delivered Solution.]
A Functional AI System Architecture
Sensors / Sources
[Figure: data-source characteristics and their effect on difficulty: Signal-to-noise (high → low: increasing difficulty), Dimensionality (low → high: increasing difficulty), Problem scale / Entities (Volume) (few → many: increasing difficulty), Latency requirement / Collection time (Velocity), Relationships (local → global: increasing difficulty), Response time (long → short: increasing difficulty).]
Data conditioning
Key Questions

Questions | Comments
What insight does the customer need? | Work from the expected result back to the needed data
What data input is required to achieve the desired insight? | Make sure that we have curated data
Are the incoming data and required processing batched or streaming? | Forwardly deployed applications require batched processing in the cloud, streaming, or both
Can we solve the AI problem using ML techniques other than Deep Learning? | Deep learning is very powerful when we have data that is (statistically) representative of the population, scenarios, or both
Do we need edge computing? | We might need to operate under low-capacity and intermittent data links
How do we protect our AI system from adversarial AI (e.g., data poisoning)? | AI systems are very fragile and easy to fool
Data conditioning
Data Storage Solutions
▪ Since the advent of big data, there has been a surge in DB types beyond the traditional relational DBs used for transactional processing
▪ Distributed storage systems for structured big data
▪ Hadoop distributed file system (HDFS) (+ MapReduce for distributed data processing model +
YARN for resource allocation/scheduling on clusters) became a standard
▪ Can work with stream-processing-oriented systems (e.g., Spark, Kafka, Ray)
▪ Rapid ingestion of streaming data
▪ Often coupled with distributed processing systems (e.g., Spark, Kafka, Ray)
▪ NoSQL solutions, more suitable for unstructured data
▪ Consistency vs Availability vs Partition tolerance needs to be considered too (CAP
Theorem)
▪ Computational resources (e.g., storage on the edge in IoT)
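As a minimal sketch of these two ingestion modes with PySpark (the HDFS path, Kafka broker, and topic name are hypothetical; the Kafka source also requires the spark-sql-kafka package):

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("storage-demo").getOrCreate()

# Batch read of structured big data stored on HDFS (hypothetical path)
events = spark.read.parquet("hdfs:///data/events/")
events.printSchema()

# Streaming ingestion from Kafka, to be coupled with distributed processing (hypothetical topic)
stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")
          .option("subscribe", "events")
          .load())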
SQL and NoSQL comparison
SQL | NoSQL
Relational databases with fixed columns and rows, typically found with structured data | Nonrelational databases, typically found with unstructured data
Follows a specific schema; for example, last name (row) followed by first name, home address, phone, ... | Relational (structured) data can be stored, but not in a row-column pair; instead, data is nested in a single data structure
Much easier to comply with the ACID properties | Harder to meet ACID properties (often use BASE properties: Basically Available, Soft State, Eventually Consistent)
Difficult to scale to large volume and velocity | Easier to scale, optimized for developer productivity, runs well on distributed clusters
Originated in the early 1970s with the need for transactional processing | Originated in the mid-2000s with the need for rapid ingestion (e.g., internet searches)
NoSQL Database types
• Key-value databases: key-value stores are the simplest NoSQL databases (e.g., Customer #ID and Attributes). Example: Amazon DynamoDB
• Wide-column databases: e.g., an Orders store grouping Order, ShippingAddress, OrderPayment, OrderItem, and Product data
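A purely illustrative Python sketch of the two layouts above (the identifiers and values are hypothetical and not tied to any specific database API):

# Key-value record: one key maps to a schema-free bag of attributes (as in Amazon DynamoDB)
customer = {
    "customer_id": "C-1001",                        # the key
    "attributes": {"name": "Ada", "tier": "gold"},  # the value
}

# Nested order record, mirroring the Orders example: related data lives in one structure
order = {
    "order_id": "O-42",
    "shipping_address": {"street": "Example St. 1", "city": "Springfield"},
    "order_payment": {"method": "card", "amount": 99.90},
    "order_items": [{"product": "P-7", "quantity": 2}],
}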
Data conditioning
Data Wrangling/Munging
Stage | Issues/Activities
Data discovery | Analyze and understand your data. Are the required data available? Are they representative? Is there any bias?
Publishing | Prepare the data set for use downstream, which could include use by users or software.
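A minimal sketch of the data discovery step with pandas (the file and column names are hypothetical):

import pandas as pd

df = pd.read_csv("raw_data.csv")   # hypothetical input file

print(df.shape)            # how much data do we have?
print(df.dtypes)           # which types / structure?
print(df.isna().sum())     # missing values per column
print(df.describe())       # summary statistics: obvious skew or outliers?
print(df["group"].value_counts(normalize=True))  # hypothetical column: is any group under-represented (bias)?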
Data conditioning
• Supervised: Classification (Decision Tree, SVC, K-NN, Neural Networks); Regression ((Non-)Linear Regression, SVR, Probabilistic Regression, Neural Networks, Regression Trees)
• Semi-supervised: Inductive, Transductive
• Unsupervised: Clustering (K-Means, Hierarchical); Association (Rule mining, Itemset mining)
• Self-supervised
• Reinforcement: Off-policy vs On-policy, Offline vs Online, Value-, Policy-, Model-based
Machine Learning – Type of models
• Discriminative models
• capture the conditional probability p(Y | X)
• discriminate between different kinds of data instances
• Has to learn a boundary
• Generative models
• capture the joint probability p(X, Y), or just p(X) if there are no labels.
• can generate new data instances
• Has to capture correlations in the data
• Ex.: GAN, (Variational) Autoencoders
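As a hedged illustration with scikit-learn on toy data (models chosen for clarity, not taken from the lecture): logistic regression learns p(Y | X) directly, i.e., a decision boundary, while Gaussian Naive Bayes models p(X | Y) and p(Y), a joint model of the data.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression  # discriminative: learns p(Y | X)
from sklearn.naive_bayes import GaussianNB           # generative: models p(X | Y) and p(Y), hence p(X, Y)

X, y = make_classification(n_samples=500, n_features=5, random_state=0)

discriminative = LogisticRegression().fit(X, y)
generative = GaussianNB().fit(X, y)

print(discriminative.predict_proba(X[:3]))  # class probabilities from the learned boundary
print(generative.predict_proba(X[:3]))      # class probabilities via Bayes' rule from the joint model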
Machine Learning – Limitations/Challenges
• Opacity, explainability (DNN)
• Bias (Fairness)
• Computational resources (e.g., training), human resources (e.g.,
labelling)
• Over-/under-fitting
• Vulnerabilities
• Bias-Variance Trade-Off, Hallucination
Still on “how”: An ML perspective
MORE ON HOW: Operational excellence
[Diagram (repeated across slides): the Develop / Deployment / Machine learning cycle is surrounded by cross-cutting concerns: Quality Assurance, Cost optimization, Stakeholder assessment / Business goals (stakeholder requirements), Human-Machine Teaming, Risk Management, Reliability and Security, Performance Metrics, and Operational excellence.]
The final AI Product/Service heterogeneity (1/2)
• Aerospace (drones, airplane design, satellites, ATC prediction)
• Agriculture (precision farming)
• Automation (e.g., self-driving cars)
• Chemicals (cleaning, hygiene)
• Education (virtual teacher)
• Finance (banking, fraud detection, trading)
• Fitness (wearable devices, clothing, nutrition)
• Healthcare (medical diagnosis, medical equipment, anxiety/depression, surgical robots)
The AI Systems Spectrum
[Figure: systems ordered by increasing level of autonomy: Chat Bots, Deep Fakes, Malware detectors, SW engineering...; Virtual Personal Assistants, AI-Powered Avatars; AIoT (e.g., infrastructure monitoring); Robotics Systems; Autonomous Driving Systems. Underlying ingredients: Data, LLMs, Domain Experts.]
System-centric perspective: The ADS/ADAS Example
▪ Architecture/SysML
▪ MBSE
▪ Digital twins
[Diagram: Sensing (Lidar, Camera, GPS, Accelerometers, Gyros) feeds Perception (Object Detection, Obstacle Detection, Road edge detection), which updates the World Model (static and moving obstacles, road map, position/speed). The Route Planner plans routes via path search on a global map, the “Monopoly Board” keeps track of what to do via an FSM, and the Motion Planner plans short-term moves by searching for a short path. Actuators control Steering, Braking, and Speed.]
AI-centric Systems
Following the process…
Technical requirements
▪ Anaconda
▪ Conda, possibly poetry
▪ IDE
▪ E.g., PyCharm, Spyder
▪ Atlassian Jira
▪ Git, GitHub
▪ AWS
Discover, Play, Develop, Deploy
A simple example of a full pipeline - Instructions
▪ Install Anaconda
▪ Clone the repo
▪ Create and activate the Environment
git clone https://github.com/rpietrantuono/AISE_Ch2.git
conda env create -f env.yml
conda activate env
conda env list
>>>env_test /Users/robertopietrantuono/anaconda3/envs/test
>>>env * /Users/robertopietrantuono/anaconda3/envs/env
>>>others ...
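For reference, env.yml has roughly this shape (a hypothetical minimal sketch; the actual file in the repository may list different packages):

name: env
channels:
  - defaults
dependencies:
  - python=3.10
  - pandas
  - matplotlib
  - pip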
There is ad hoc code for plotting (and it’s not very intuitive!). There is a variable called tmp, which is not very descriptive.
Absolutely fine in this exploratory phase, but we need to take code like this and make it into something suitable for your production ML pipelines... Let’s go to the «Develop» stage
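A hypothetical before/after sketch of that kind of clean-up (the names tmp, df, and plot_monthly_average are illustrative, not taken from the repository):

# Before: exploratory style, opaque name, plotting done ad hoc inline
tmp = df.groupby("month")["sales"].mean()
tmp.plot()

# After: descriptive name, plotting wrapped in a reusable, testable function
def plot_monthly_average(df, value_col="sales", ax=None):
    """Plot the monthly average of value_col and return the matplotlib Axes."""
    monthly_average = df.groupby("month")[value_col].mean()
    return monthly_average.plot(ax=ax)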
Discover, Play, Develop, Deploy
Select a development process
Methodology | Pros | Cons
Agile | Flexibility is expected. Faster dev-to-deploy cycles. | If not well managed, can easily have scope drift. Sprints or Kanban may not work well for some projects.
Waterfall | Clearer path to deployment. Clear staging and ownership of tasks. | Lack of flexibility. Higher admin overheads.
Discover, Play, Develop, Deploy
Manage your artefacts – Code Version Control Strategies
Engineer A:

lr = LogisticRegression(maxIter=model_config["maxIter"],
                        regParam=model_config["regParam"])

E.g., if engineer A pushes, a conflicting change leaves markers like:

<<<<<<< HEAD
lr = LogisticRegression(maxIter=10, regParam=0.001)
pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
=======
lr = LogisticRegression(maxIter=model_config["maxIter"],
                        regParam=model_config["regParam"])
pipeline = Pipeline(stages=[tokenizer, hashingTF, lr])
>>>>>>> pipeline
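A hedged sketch of how such a conflict is typically resolved, assuming we keep the config-driven version (branch and file names are illustrative):

git checkout dev
git merge pipeline                 # Git reports the conflict in the affected file
# edit the file: keep the model_config-based lines and delete the <<<<<<<, =======, >>>>>>> markers
git add train_pipeline.py          # hypothetical file name
git commit -m "Merge branch 'pipeline': keep config-driven hyperparameters"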
Discover, Play, Develop, Deploy
Manage your artefacts – Code Version Control Strategies
• A Git Workflow
• Define types of branches, such as:
• Main contains your official releases and should only contain the stable version of
code
• Dev acts as the main point for branching from and merging to for most work in the
repository; it contains the ongoing development of the code base and acts as a
staging area before main
• Feature branches should not be merged straight into the main branch; everything
should branch off from dev and then be merged back into dev.
• Release branches are created from dev to kick off a build or release process before
being merged into main and dev and then deleted.
• Hotfix branches are for removing bugs in deployed or production software. You can
branch this from main before merging into main and dev when done.
[Diagram: a branching timeline with release tags 0.1, 0.2, and 1.0; a release branch for 1.0 starts from develop, bug fixes on it are continuously merged back, bug fixes are incorporated in develop, and a major feature for the next release proceeds in parallel.]
• Pull Requests
• To make known your intention to
merge into another branch and
allow another team member to
review your code before this
executes
• Enables code review
• You do this whenever you want to
merge your changes and update
them into dev or main branches.
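As a hedged command-line sketch of this workflow (branch and file names are illustrative; the pull request itself is opened in the GitHub UI):

git checkout dev
git checkout -b feature/improve-forecast      # feature work branches off dev
git add forecast.py
git commit -m "Improve forecast model"
git push -u origin feature/improve-forecast   # then open a pull request targeting dev

git checkout -b release/1.0 dev               # start a release branch from dev
# ...final fixes, then merge into main and dev and tag the release
git checkout main
git merge release/1.0
git tag 1.0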
Discover, Play, Develop, Deploy
Manage your artefacts – Model Version Control
Main code excerpts: relevant imports

pip install mlflow

import pandas as pd
from prophet import Prophet
from prophet.diagnostics import cross_validation
from prophet.diagnostics import performance_metrics
import mlflow
import mlflow.pyfunc
Discover, Play, Develop, Deploy
Manage your artefacts – Model Version Control
Wrapper class inheriting from the mlflow model object:

class ProphetWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, model):
        self.model = model
        super().__init__()

    def load_context(self, context):
        from prophet import Prophet
        return

with mlflow.start_run():
    …
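A hedged sketch of what the body of the run might contain: fitting, cross-validating, and logging the wrapped model so that MLflow versions it (the cut-off windows and the chosen metric are illustrative, not the lecture's exact code):

with mlflow.start_run():
    model = Prophet().fit(df)   # df: training data with Prophet's ds/y columns
    cv_results = cross_validation(model, initial="365 days", period="90 days", horizon="30 days")
    metrics = performance_metrics(cv_results)

    mlflow.log_metric("rmse_mean", metrics["rmse"].mean())
    mlflow.pyfunc.log_model(
        artifact_path="prophet_model",        # stored and versioned by the tracking server
        python_model=ProphetWrapper(model),
    )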
Discover, Play, Develop, Deploy
▪ How to get your solution built out into the real world?
▪ On-premise deployment, on owned infrastructures
▪ Legacy software
▪ Strong constraints on data location and processing
▪ Pros: security, privacy
▪ Cons: Need (physical) resources and specialists, e.g., for
configuring networking, load balancing, infrastructure
maintenance…
Discover, Play, Develop, Deploy
• How to get your solution built out into the real world?
• Infrastructure-as-a-Service (IaaS)
• On-demand access to physical and virtual servers, cloud-hosted storage and networks, and the back-end IT infrastructure on a pay-as-you-go basis
How to get your solution built out into the real world?
• Platform-as-a-Service (PaaS)
• On-demand access to a ready-to-use, complete cloud hosting platform for developing, running, maintaining, and managing applications.
How to integrate and manage your deployment and continuous update cycles?
Discover, Play, Develop, Deploy
Life cycle stage | Activity Details | Example of Tools
Testing | Unit tests: tests aimed at testing the smallest pieces of code. | pytest or unittest
Testing | Integration tests: ensure that interfaces within the code and to other solutions work. | Selenium
Testing | Acceptance tests: business-focused tests. | Behave
Building | The final stage of bringing the solution together. | Docker, twine, or pip
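A minimal unit test sketch with pytest (the module and function under test, preprocessing.normalise, are hypothetical):

# test_preprocessing.py
from preprocessing import normalise   # hypothetical module under test

def test_normalise_scales_to_unit_range():
    result = normalise([0.0, 5.0, 10.0])
    assert min(result) == 0.0
    assert max(result) == 1.0

Run it with: pytest test_preprocessing.py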
Discover, Play, Develop, Deploy
Life cycle stage | Activity Details | Example of Tools
Training | Train the model | Any ML package

steps:
  - uses: actions/checkout@v3
  - name: Set up Python ${{ matrix.python-version }}   # «name» defines the step; «uses» exploits pre-defined standard actions
    uses: actions/setup-python@v4
    with:
      python-version: ${{ matrix.python-version }}
Discover, Play, Develop, Deploy
  - name: Install dependencies          # installs the relevant dependencies
    run: |
      python -m pip install --upgrade pip
      pip install flake8 pytest
      if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
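A plausible continuation of the workflow, using the flake8 and pytest just installed (these are the standard GitHub starter-workflow steps; the lecture's actual pipeline may differ):

  - name: Lint with flake8
    run: |
      flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
  - name: Test with pytest
    run: |
      pytest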