Data Science APIs With Flask

This document discusses developing a web API for machine learning models using the Flask framework in Python. It begins with an introduction to APIs and HTTP concepts. It then discusses advantages and disadvantages of using HTTP for APIs. Finally, it provides an overview of the Flask framework and demonstrates creating a simple Flask API that returns a prediction from a scikit-learn model.

Uploaded by

Leon


Data Science APIs

29 Oct 2017

This post is part of the series Data Science APIs.

You’ve developed a really great machine learning model to solve a difficult data science
problem, but now you want to wrap it in a convenient interface and either share it with the world
or integrate it smoothly into an application. This series of blog posts addresses how to develop a
web API to your model using Flask, a Python web framework. I’ll cover a bunch of suggestions
along the way to help make your API more robust and secure and how to get it up and running in
the wild.

In this first post, I’ll start off by covering some of the concepts of APIs and HTTP.

Why API?

An Application Programming Interface (API) is a common concept in software engineering.

When using a library (for example, to list the files in a directory), you don’t especially
care how the library is implemented; all you really need to know is that when you pass a
directory path to os.listdir(), it returns a list of files in that directory.

The same goes for code you develop yourself. When writing, for example, a web frontend for
your machine learning model, you ideally don’t want to have to know if the model code uses
numpy arrays or pandas DataFrames for internal storage. For this reason, you want to have a
simple interface that defines a clear contract between the model code and anything that calls it.
This not only makes development easier, but ensures different parts of the system are decoupled,
which generally makes them easier to maintain.

APIs can take multiple forms; the programmatic interface to a library of code perhaps being the
most common. However APIs can also involve communication between different parts of a
system by other means. In this blog post series we’re mainly focussed on web APIs over HTTP.

Intro to HTTP

If you’re unfamiliar with the terminology of HTTP and the web, don’t worry! This section will
introduce the basic concepts.

You may not be aware that you probably already make many web requests per day! That’s
because making requests to web servers and rendering the responses is the chief task of the web
browser you use on your laptop, phone or tablet. When you type, say,
http://example.com/page in your browser’s address bar, it first works out the address of the
server corresponding to example.com, then sends a request to GET the resource /page from that
server.

The web server then returns some HTML to the browser in response to this request (in this case
simply the text ‘Hello!’), which the browser renders and displays.

The response also comes with a ‘status code’, in this case 200, which means ‘OK’, in other
words that the request was handled successfully. Other status codes can indicate various error
conditions. You’re probably familiar with one of them, ‘404 NOT FOUND’, which indicates that
a requested resource does not exist.

As well as different response codes, HTTP methods other than GET are available. The one you
are most likely to encounter in practical applications (and this blog post series) is POST, which is
generally used to create a resource on the server.
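As a quick illustration of status codes, Python’s standard library lists the standard ones in http.HTTPStatus. The sketch below only looks up code numbers and names; the is_error() helper is a hypothetical name introduced for illustration and is not part of the API built in this series:

```python
from http import HTTPStatus

# Look up the standard reason phrase for a few common status codes
for code in (200, 404, 500):
    status = HTTPStatus(code)
    print(status.value, status.phrase)


def is_error(code):
    """Return True for client (4xx) and server (5xx) error codes."""
    return code >= 400
```

Status codes are grouped by their first digit: 2xx for success, 4xx for mistakes made by the client, and 5xx for failures on the server.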

Web servers hosting HTML content are great, but what does this mean for data science APIs?
Well, if we replace the browser in this scenario with some client application which wants to
interface with our machine learning model, and write a web server that returns parsable
structured data in response to web requests, we can use HTTP as the mechanism to exchange
data.

Provided the client application knows how the response body is formatted, it is able to make a
call to the server any time it needs to make a prediction, and then use the returned prediction as
desired.
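To make this exchange concrete before Flask is introduced, here is a minimal sketch using only the Python standard library. The PredictionHandler class, the /predict path and the fixed {'class': 1} body are all assumptions made for illustration; a real server would compute the prediction from a model:

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class PredictionHandler(BaseHTTPRequestHandler):
    """Toy server that answers every GET with a fixed JSON 'prediction'."""

    def do_GET(self):
        body = json.dumps({'class': 1}).encode('utf-8')
        self.send_response(200)
        self.send_header('Content-Type', 'application/json')
        self.send_header('Content-Length', str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port
server = HTTPServer(('127.0.0.1', 0), PredictionHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The client side: send a GET request and decode the structured response
connection = HTTPConnection('127.0.0.1', server.server_port, timeout=5)
connection.request('GET', '/predict')
response = connection.getresponse()
status = response.status
prediction = json.loads(response.read())
connection.close()

server.shutdown()
server.server_close()
print(status, prediction)
```

The client only needs to know the URL and that the body is JSON; it has no knowledge of how the server produced the value.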

Advantages of APIs using HTTP

It may not be immediately obvious why you’d want to provide an API over HTTP rather than
simply writing a library and interfacing with it locally. That may be easier in the short term, but
defining the interface over HTTP carries several advantages:

1. It’s language-agnostic. The client application does not have to be written in the same language
as the model, giving more freedom to choose the best language for different parts of a larger
system.
2. It allows easy sharing. Once you have your API server up and running, an HTTP interface means
that you can give access to any client application that has access to the internet.
3. It detaches different parts of the system. Provided that the API specification does not change,
the API server (and encapsulated model) and any client applications that call it can be updated
and redeployed independently.
4. It facilitates horizontal scaling. You might have a lightweight web app that uses a model API with
significant computational demands. If the interface is over HTTP, multiple replicas of the API
server can be run, increasing its capacity without having to increase the number of replicas of
the web app.

Disadvantages of APIs using HTTP

There are some disadvantages to using HTTP:

1. Not having full control of the execution environment of the model denies the caller the ability to
take advantage of any local computing resources they might have.
2. The possibility of network failure or latency is also very real, and interfacing over HTTP will likely
be slower (though in many cases acceptably so) than interfacing with a native library.
3. Communication of sensitive data over the internet (and even private networks) must be done
with care. Encryption is a must, and security attacks need to be guarded against.
4. There’s an increased maintenance burden as additional servers or services need to be managed.
5. It’s potentially much more difficult for client application developers to debug issues with the
model code.

Despite these caveats, offering services, including machine learning models, over web
APIs is increasingly popular and is a great way to provide a flexible interface to your models.

Data Science APIs: Flask

03 Nov 2017

This post is part of the series Data Science APIs.

Perhaps the greatest strength of Python is the wide spectrum of libraries available for it. Not only
does it have a rich set of data processing tools and machine learning libraries, but it is also
widely used in web programming, having a number of mature web frameworks available. This
general-purpose nature sets it apart from other languages used in data science, and puts us in a great
place for productionising models as web APIs.

In this series of posts, I provide a guide for wrapping data science models in web APIs using the
Flask web framework. Flask is a great choice for this as it is extremely lightweight, requiring
only a little code to convert a Python function into a web endpoint.

Other frameworks, like Django, are great, providing a lot of functionality out of the box.
However, much of this functionality is centred around implementing fully fledged web
applications rendering and serving HTML rather than providing simple web APIs.

In this post, I’ll cover the basics of Flask, and wrap an example model from scikit-learn into a
simple API.

Flask Basics

Flask is often referred to as a microframework since it’s very minimal. Compared to other
frameworks it provides only a core set of features needed to implement HTTP endpoints, and it
needs very little boilerplate. A minimal Flask application looks like the following:

from flask import Flask

app = Flask(__name__)

@app.route('/')
def hello():
    return 'Hello World!'

if __name__ == '__main__':
    app.run(debug=True)

In the above snippet, we instantiate the Flask app as app, then use the @app.route() decorator
to register the hello() function as an endpoint on /, the root endpoint. Finally, we run the app
with app.run(), in debug mode.

Running the app is as simple as running the script:

$ python app.py

And while it’s running, we can use curl to hit the server on the root endpoint (on the default
Flask port of 5000, which should be indicated in the log of the app run above):

$ curl localhost:5000
Hello World!

You can see that curl prints the body that was returned by the registered endpoint, Hello
World!.

You can also call the endpoint from any language’s HTTP client library. In Python, I’d
recommend the requests library, which is not in the standard library but is very widely used. This
code snippet performs the same request as above:

>>> import requests
>>> response = requests.get('http://localhost:5000')
>>> response.text
'Hello World!'

You may prefer to use a GUI tool to run test queries against your API. One that I’ve used and
can recommend is Postman, which has a lot of nice features.

Adding Additional Routes

For all but the simplest applications, you’ll want to expose multiple entry points into your
API. To do this, you can use the @app.route() decorator to register different Python functions
on different HTTP endpoints. Recall that from the example above we registered our hello()
function on the root endpoint, /:

@app.route('/')
def hello():
    return 'Hello World!'

The first argument to @app.route() determines the route the endpoint will be served on. We can
easily register additional functions on other endpoints:

@app.route('/foo')
def foo():
    return 'bar'

You can easily choose the right endpoint by including it in the URL:

$ curl localhost:5000
Hello World!
$ curl localhost:5000/foo
bar

You can also parameterise endpoints with variable rules. Use angle brackets in the route name to
match any string and pass it as an argument to the function:

@app.route('/hi/<name>')
def hi(name):
    return f'Hi {name}!'

This can again be queried by choosing the right URL:

$ curl localhost:5000/hi/Andrew
Hi Andrew!
$ curl localhost:5000/hi/acroz
Hi acroz!

Encoding Responses in JSON

In the examples given so far, we’ve generated some fairly simple text as output in order to
demonstrate the routing functionality in Flask. However, for a practical data science API you’ll
typically want to return richer structured and/or numerical data. There are a number of
different ways you can encode this information, but the most common is to use JavaScript
Object Notation, or JSON for short.

JSON is a great choice for APIs as it’s simple, yet provides enough to cover most use cases; it’s
easy to parse, yet is fairly human-readable; practically every language has a parser, so you don’t
need to write one yourself; and it’s so commonly used in APIs that it’s practically a standard.

Python has JSON support in its standard library, but when using Flask I recommend using the
jsonify() helper, which not only serialises your data to JSON, but prepares a Flask Response
object with useful things like the HTTP content type preset.

To use jsonify(), pass the Python object (usually a dict) you want to encode, and return the
generated response:

from flask import jsonify

@app.route('/api')
def api():
    data = {
        'name': 'Andrew',
        'user': 'acroz'
    }
    return jsonify(data)

You can then query the endpoint as normal:

$ curl localhost:5000/api
{
    "name": "Andrew",
    "user": "acroz"
}

As mentioned, a lot of languages and libraries have JSON support built in. The Python requests
library I demonstrated earlier is among them. Call .json() on a requests response object to
decode the body as JSON and return the equivalent Python representation:

>>> response = requests.get(
...     'http://localhost:5000/api'
... )
>>> response.json()
{'name': 'Andrew', 'user': 'acroz'}

Jsonifying NumPy Values

It’s worth noting that Python’s standard library json package, which is used internally by
flask.jsonify(), doesn’t play nicely with NumPy types. For example, while the serialisation
of a normal Python int works fine:

>>> import json

>>> json.dumps(3)
'3'

Doing the same with a numpy.int64 does not:

>>> import numpy

>>> json.dumps(numpy.int64(3))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/acroz/.pyenv/versions/3.6.2/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Users/acroz/.pyenv/versions/3.6.2/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Users/acroz/.pyenv/versions/3.6.2/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/Users/acroz/.pyenv/versions/3.6.2/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 180, in default
    o.__class__.__name__)
TypeError: Object of type 'int64' is not JSON serializable

For that reason, you’ll want to make sure any content you’re serialising with flask.jsonify()
is converted to native Python types. In the above example:

>>> numpy_integer = numpy.int64(3)

>>> json.dumps(int(numpy_integer))
'3'

And for arrays:

>>> array_1d = numpy.array([1., 1.5, 2.])

>>> json.dumps([float(v) for v in array_1d])
'[1.0, 1.5, 2.0]'
>>> array_2d = numpy.array([[1., 1.5], [1.5, 2.]])
>>> json.dumps([[float(v) for v in row] for row in array_2d])
'[[1.0, 1.5], [1.5, 2.0]]'
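If you call json.dumps() directly, another option is its default argument, a fallback function that is called for any object the encoder does not recognise. The sketch below is one possible approach, not something Flask does for you. The to_native() helper is a hypothetical name, and array.array from the standard library stands in for a NumPy array so that the example has no third-party imports; NumPy arrays and scalars provide a tolist() method in the same way:

```python
import array
import json


def to_native(obj):
    """Fallback for json.dumps(): convert array-like objects to native types."""
    # NumPy arrays and NumPy scalars both provide tolist(), which returns
    # plain Python lists and numbers
    if hasattr(obj, 'tolist'):
        return obj.tolist()
    raise TypeError(
        f'Object of type {type(obj).__name__} is not JSON serializable'
    )


# array.array is a stand-in for a NumPy array in this sketch (an assumption)
values = array.array('d', [1.0, 1.5, 2.0])
encoded = json.dumps({'values': values}, default=to_native)
print(encoded)
```

This keeps the conversion in one place rather than repeating list comprehensions at every call site.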

Wrapping a Data Science Model

I’ve covered some of the basics of wrapping Python functionality in HTTP endpoints using
Flask; now I’ll go through a brief example of a scikit-learn model that I want to wrap in an API.

Example Model

scikit-learn provides some convenient functions for generating training data that you can use to
test out models. I’m using make_classification from sklearn.datasets to generate some
data to fit a binary classifier to:

from sklearn.datasets import make_classification

X, y = make_classification(
    n_samples=100,
    n_features=2,
    n_classes=2,
    n_informative=2,
    n_redundant=0
)

This generates two clusters in a 2-dimensional feature space:

Using scikit-learn, it’s fairly easy to train a simple logistic regression classifier on this data:

from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X, y)

We can then use this trained classifier to predict the class of a point with a value of 2 for both
features:

>>> import numpy

>>> X_predict = numpy.array([[2, 2]])
>>> model.predict(X_predict)
array([1])

We can also determine the probability of that point being of either class:

>>> model.predict_proba(X_predict)
array([[ 0.02086766, 0.97913234]])

Plotting the line where both probabilities are 0.5 allows us to see the decision boundary predicted
by the model:
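The boundary follows from the linear form of logistic regression: the probability of class 1 is the logistic function of a weighted sum of the features, so both probabilities equal 0.5 exactly where that sum is zero. Here is a minimal sketch of the arithmetic. The coefficients w1, w2 and b are hypothetical values chosen for illustration; a fitted scikit-learn model exposes its own as model.coef_ and model.intercept_:

```python
import math


def sigmoid(z):
    """Logistic function mapping a linear score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def probability_class_1(x1, x2, w1, w2, b):
    """Probability of class 1 for a two-feature logistic regression."""
    return sigmoid(w1 * x1 + w2 * x2 + b)


def boundary_x2(x1, w1, w2, b):
    """Value of x2 on the decision boundary, where the linear score is zero."""
    return -(w1 * x1 + b) / w2


# Hypothetical coefficients, not taken from the model fitted above
w1, w2, b = 1.5, 2.0, -0.5
x1 = 1.0
x2 = boundary_x2(x1, w1, w2, b)
print(x2, probability_class_1(x1, x2, w1, w2, b))
```

Evaluating boundary_x2() over a range of x1 values gives the points of the straight line to plot.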

Model as an API

I now want to expose the prediction functionality of this model through an API.
Assuming that the trained model is available in the module namespace as model, we can register
a /predict endpoint with the Flask app that takes the two features as inputs:

@app.route('/predict/feature_1/<feature_1>/feature_2/<feature_2>')
def predict(feature_1, feature_2):

    # Convert inputs from strings to floats
    feature_1 = float(feature_1)
    feature_2 = float(feature_2)

    # Perform model prediction
    features = numpy.array([[feature_1, feature_2]])
    predicted_class = model.predict(features)[0]
    probabilities = model.predict_proba(features)[0]

    # Prepare response, converting NumPy values to native Python types
    content = {
        'class': int(predicted_class),
        'probabilities': [
            float(p) for p in probabilities
        ]
    }

    return jsonify(content)

This example puts together the Flask features described above: the features are extracted from
the endpoint and passed as arguments to the function, used to make a model prediction, and a
JSON response is generated and returned.

We can then query the endpoint to do a model prediction:

$ curl localhost:5000/predict/feature_1/2.0/feature_2/2.0
{
    "class": 1,
    "probabilities": [0.02086766, 0.97913234]
}

There remain a number of improvements that can be made to this endpoint. For example,
consider what happens when either feature_1 or feature_2 passed to predict() can’t be
converted to a valid float? In further blog posts, I’ll cover ways of guarding against such issues
and provide examples covering more complicated flows that you may wish to implement.

Data Science APIs: Building Robust APIs

08 Nov 2017

This post is part of the series Data Science APIs.

In the previous post in this series, I covered the basics of building web APIs with Flask.
However, without a little care and attention, it’s easy to introduce some unpleasant bugs that
could cause your API to fail in unexpected ways or introduce serious security holes. In this blog
post I’ll provide a few tips and tricks you can use to guard against these issues.

JSON Error Handler

By default, Flask returns HTML-formatted pages when an error occurs trying to fulfill a request.
For instance, when an endpoint is requested that does not correspond to anything known on the
server, it returns the well known 404 error response, which indicates ‘Not Found’ (for a full list
of HTTP status codes check out httpstatuses.com), with the following content:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<title>404 Not Found</title>
<h1>Not Found</h1>
<p>
The requested URL was not found on the server. If you entered the URL
manually please check your spelling and try again.
</p>

For the primary use case for Flask, serving HTML web pages, this is great, as it returns something
that a browser can display and show something meaningful to the user. However, for an API
expecting a JSON response it is less useful, as any attempt to interpret the response as JSON will
fail. For example, when using requests to query a missing endpoint on a Flask server:

>>> import requests
>>> response = requests.get('http://localhost:5000')
>>> response.status_code
404
>>> response.json()
Traceback (most recent call last):
  File "call.py", line 2, in <module>
    requests.get('http://localhost:5000').json()
  File "/home/acroz/.pyenv/versions/3.6.3/lib/python3.6/site-packages/requests/models.py", line 892, in json
    return complexjson.loads(self.text, **kwargs)
  File "/home/acroz/.pyenv/versions/3.6.3/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/home/acroz/.pyenv/versions/3.6.3/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/home/acroz/.pyenv/versions/3.6.3/lib/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

Instead, we’d like to configure Flask to always return JSON. To do this, first make a function
that creates a flask Response for a given Python exception:

from flask import jsonify
from werkzeug.exceptions import HTTPException, InternalServerError

def json_errorhandler(exception):
    """Create a JSON-encoded flask Response from an Exception."""

    # Anything that is not an HTTP error was raised by our own code,
    # so report it as a generic 500 Internal Server Error
    if not isinstance(exception, HTTPException):
        exception = InternalServerError()

    response = jsonify({
        'error': exception.name,
        'description': exception.description
    })
    response.status_code = exception.code

    return response

Note that the HTTPException class from werkzeug is the exception type used by Flask to
represent HTTP failure cases, like 404 above. Any other Python exception being passed here
indicates that an exception was raised while handling a request, and we therefore convert to an
InternalServerError, which returns the corresponding 500 status response.

It’s then a simple matter to register the function as the error handler:

from werkzeug.exceptions import default_exceptions

# Register the JSON handler for every standard HTTP error code
for code in default_exceptions.keys():
    app.register_error_handler(code, json_errorhandler)

Now error messages are formatted with JSON by default. Requesting a missing endpoint as in
the example above will return a JSON encoded body containing a descriptive error message:

{
    "error": "Not Found",
    "description": "The requested URL was not found on the server. If you entered the URL manually please check your spelling and try again."
}

This can now be read reliably by requests:

>>> import requests
>>> response = requests.get('http://localhost:5000')
>>> response.status_code
404
>>> response.json()
{'error': 'Not Found', 'description': 'The requested URL was not
found on the server. If you entered the URL manually please check your
spelling and try again.'}
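Clients may still occasionally receive a body that is not JSON, for example from a proxy sitting in front of the API. A defensive client-side sketch using only the standard library is shown here; decode_body() is a hypothetical helper introduced for illustration and is not part of requests:

```python
import json


def decode_body(text):
    """Return the decoded JSON body, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


html_error = '<title>404 Not Found</title>'
json_error = '{"error": "Not Found"}'
print(decode_body(html_error))
print(decode_body(json_error))
```

Returning None lets the caller fall back to the status code alone when the body cannot be interpreted.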

Malformed Inputs

Another issue you might come across is badly formatted inputs. Looking back at the example
from the previous post in this series, consider what happens to the arguments passed to the
function as feature_1 and feature_2. They first get passed to float():

feature_1 = float(feature_1)
feature_2 = float(feature_2)

If the passed strings can be converted to floats, this will work well. However, when
something like 'invalid' is passed, float() will raise a ValueError. This exception is not caught
inside our API function, and so Flask catches it and returns a 500 Internal Server Error to the
client.

This isn’t a great experience, since there’s no indication to the caller why failure occurred, and
worse still, it implies that the failure was due to some mistake in the code rather than because of
bad inputs being provided.

To correct this, you should catch the ValueError and use Flask’s abort helper function to exit
from the function early and return a 400 HTTP response to the caller, indicating a ‘Bad Request’.
The example above then becomes:

from flask import abort

@app.route('/predict/feature_1/<feature_1>/feature_2/<feature_2>')
def predict(feature_1, feature_2):
    # Convert inputs from strings to floats
    try:
        feature_1 = float(feature_1)
        feature_2 = float(feature_2)
    except ValueError:
        abort(400, 'Input features were not valid floats')

    # Model prediction code ...

    return jsonify(content)

You could also check that the inputs are valid explicitly before attempting the conversion,
however Pythonic style usually prefers trying to perform a type conversion first and catching the
exception in the case that it failed.
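One subtlety worth knowing is that float() also accepts strings such as 'nan' and 'inf', which convert without error but are rarely meaningful model inputs. Below is a sketch of a stricter conversion, assuming you want to reject them; parse_feature() is a hypothetical helper, not part of Flask:

```python
import math


def parse_feature(raw):
    """Convert a URL path segment to a finite float, or raise ValueError."""
    value = float(raw)  # raises ValueError for text such as 'invalid'
    # float() also accepts 'nan' and 'inf', so reject those explicitly
    if not math.isfinite(value):
        raise ValueError(f'Feature must be a finite number, got {raw!r}')
    return value


print(parse_feature('2.0'), parse_feature('-1.5'))
```

Because the helper raises ValueError in every failure case, it can replace the float() calls inside the same try/except block that calls abort(400).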

So now, when calling the API with requests, instead of the generic 500 response received
previously:

>>> endpoint = '/predict/feature_1/1.0/feature_2/invalid'
>>> response = requests.get(f'http://localhost:5000{endpoint}')
>>> response.status_code
500
>>> response.json()
{'error': 'Internal Server Error', 'description': 'The server encountered an internal error and was unable to complete your request. Either the server is overloaded or there is an error in the application.'}

We now get a response with a much more appropriate status code and a useful error message:

>>> endpoint = '/predict/feature_1/1.0/feature_2/invalid'
>>> response = requests.get(f'http://localhost:5000{endpoint}')
>>> response.status_code
400
>>> response.json()
{'error': 'Bad Request', 'description': 'Input features were not valid floats'}

Variable Rules

If you’re familiar with Flask, you may be aware that it provides some functionality for validating
that URL parts match certain formats, called variable rules. It’s a nice feature that avoids writing
boilerplate code; however, be careful of the float converter: at the time of writing, it doesn’t
support negative values.
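To see why a negative value can fail to match, here is a sketch using plain regular expressions. The two patterns are assumptions standing in for a converter’s internal pattern; they are not copied from Flask or Werkzeug:

```python
import re

# Assumed patterns for illustration: an unsigned float, and a variant
# that also allows an optional leading minus sign
UNSIGNED_FLOAT = re.compile(r'\d+\.\d+')
SIGNED_FLOAT = re.compile(r'-?\d+\.\d+')

for segment in ('2.0', '-2.0'):
    print(
        segment,
        bool(UNSIGNED_FLOAT.fullmatch(segment)),
        bool(SIGNED_FLOAT.fullmatch(segment)),
    )
```

When the pattern rejects a segment, the route does not match at all and the caller sees a 404 rather than a 400. Matching the segment as a plain string and converting it with float() inside the function, as in the previous section, sidesteps the problem. Newer Werkzeug releases document a signed option for the numeric converters, so check the documentation for the version you have installed.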

SQLAlchemy

In my model example in the previous post in this series, I trained my classifier on some sample
data generated with scikit-learn; however, in a useful model you’ll want to train it on some real
data of significance. It’s good practice to keep such data in a proper database for persistence,
rather than keeping state in memory, which will be lost when the application restarts, or
managing some custom text files, which can be error prone.

SQL databases are most common for this kind of task, especially when dealing with tabular data,
and the most common library for interacting with them is SQLAlchemy. I’m not going to give a
tutorial here on how to interact with databases with SQLAlchemy, but to give you an idea of a
typical use case, here is that same model trained on data read from a PostgreSQL database:

import numpy
import pandas
import sqlalchemy
from sklearn.linear_model import LogisticRegression

# Create connection with database
DATABASE_URL = 'postgresql://localhost/postgres'
engine = sqlalchemy.create_engine(DATABASE_URL)
connection = engine.connect()

# Read query result directly into a pandas dataframe
query = 'SELECT * FROM data'
data = pandas.read_sql(query, connection)

# Train the model
features = numpy.array(data[['feature_1', 'feature_2']])
classes = numpy.array(data['class'])
model = LogisticRegression()
model.fit(features, classes)

Integration with Flask

SQLAlchemy needs some configuration to manage its connection pool and session in an optimal
way for an HTTP request context; if managed improperly, stale database connections could
accumulate and result in an application failure.

Rather than setting up this configuration yourself, I recommend using Flask-SQLAlchemy,
which does it for you. To set it up:

from flask import Flask
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = \
    'postgresql://localhost/postgres'
db = SQLAlchemy(app)

You can then use db.engine as the connection object with pandas.read_sql(). In the above
example, run:

data = pandas.read_sql(query, db.engine)

Guarding against SQL Injection

Constructing SQL queries from API inputs is dangerous, as demonstrated by a well known
xkcd strip. If you were to execute a SQL query constructed from some input provided by a user,
it would be possible for a caller to delete all your data. Constructing queries by normal
Python string interpolation is therefore not a good idea:

connection.execute(
    f"SELECT * FROM some_table WHERE name = '{user_input}'"
)

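It’s easy to guard against this by using SQLAlchemy’s bound parameters, which pass the
provided data to the database separately from the SQL text:

from sqlalchemy import text

# text() lets SQLAlchemy bind :name as a value rather than splicing it in
connection.execute(
    text('SELECT * FROM some_table WHERE name = :name'),
    {'name': user_input}
)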
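The difference is easy to demonstrate without a database server. The sketch below uses the standard library sqlite3 module rather than SQLAlchemy or PostgreSQL, purely so that it is self-contained; the table contents and the malicious input are made up for illustration:

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE some_table (name TEXT)')
connection.executemany(
    'INSERT INTO some_table (name) VALUES (?)',
    [('alice',), ('bob',)]
)

# A malicious input that tries to widen the WHERE clause
user_input = "nobody' OR '1'='1"

# Unsafe: the input is spliced into the SQL text and changes its meaning
unsafe_rows = connection.execute(
    f"SELECT name FROM some_table WHERE name = '{user_input}'"
).fetchall()

# Safe: the input is bound as a parameter and only ever treated as a value
safe_rows = connection.execute(
    'SELECT name FROM some_table WHERE name = :name',
    {'name': user_input}
).fetchall()

print(len(unsafe_rows), len(safe_rows))
connection.close()
```

The interpolated query returns every row in the table, while the parameterised query correctly finds no user with that odd name.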

Using the suggestions and tips provided in this post, you’ll be able to write APIs in Flask which
are more robust and secure. In the next post in this series, I’ll show how to handle longer running
calculations effectively in your API.

Data Science APIs: Long Running Tasks

07 Jan 2018

This post is part of the series Data Science APIs.

Data science models often require longer running computations for training or prediction, but
HTTP requests that take longer than a few seconds to respond are at increased risk of failure. In
this post, I’ll show you how to handle longer running tasks in your API with RQ.

RQ (Redis Queue) is a simple job queue library for Python. The idea with job queue systems is
that you have dedicated worker processes which consume tasks from a shared queue, which the
main application process can insert tasks into. In the case of an API server, this allows the main
process to insert a long running task into the queue and return immediately, rather than blocking
until the task is complete and risking connection failure.
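The pattern itself can be sketched in a single process with the standard library. This is only an analogy under simplifying assumptions: RQ runs its workers as separate processes and keeps the queue in Redis, whereas here a thread and an in-memory queue stand in for both, and the results dict is a stand-in for wherever results are stored:

```python
import queue
import threading

tasks = queue.Queue()  # shared queue the main code inserts tasks into
results = {}           # stand-in for a result store


def worker():
    """Consume tasks from the queue until a None sentinel arrives."""
    while True:
        task = tasks.get()
        if task is None:
            break
        task_id, function, args = task
        results[task_id] = function(*args)


def multiply(x, y):
    return x * y


thread = threading.Thread(target=worker)
thread.start()

# Enqueueing returns immediately; the worker does the work in the background
tasks.put(('task-1', multiply, (2, 3)))
tasks.put(None)  # tell the worker to stop once the queue is drained
thread.join()
print(results)
```

The code that enqueues a task never waits for it to finish, which is the property that keeps an HTTP request handler responsive.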

To use RQ, put a function defining your task in a separate module from your main application
code (so it is easily importable by the workers):

# tasks.py
import time

def slow_multiply(x, y):
    time.sleep(10)  # simulate a long running computation
    return x * y

Then, in your main code, create an RQ queue using Redis as the queue store, and submit tasks to
it with enqueue():

# main.py
from redis import Redis
from rq import Queue
import tasks

queue = Queue(connection=Redis())

x = 2
y = 3
queue.enqueue(tasks.slow_multiply, x, y)

For your application to run, you’ll need to install and run the Redis server separately, in addition
to running one or more RQ workers with:

$ rq worker

POST/GET Example with Flask

You’ll usually want to get the result of a long running task back, however if some API code
executed when handling a request submits a job into the queue and returns immediately, the
caller will not know the result.

A common model to get results back from the API in this case is for it to return a task ID to the
client that it can then use to query for the result later. The sequence of events looks like:

1. The client sends a request to the API to perform a task.
2. The API server generates a unique task ID, submits the task and ID to the queue, and returns the
ID to the client.
3. A worker will pop the task off the queue, perform it, then insert the result into a database with
the ID provided.
4. In the meantime, the client sends a request with the task ID to the API, which initially responds
to say no result is available. The client retries this request at a fixed interval.
5. Once the result is available in the database, the next time the client queries for the result, it is
retrieved and returned to the client.
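The client side of steps 4 and 5 can be sketched as a small polling loop. The wait_for_result() helper and its fetch argument are hypothetical names; fetch is assumed to be any callable that returns None while the result is not ready, so the network call is simulated here rather than made for real:

```python
import time


def wait_for_result(fetch, task_id, interval=1.0, max_attempts=30):
    """Poll fetch(task_id) until it returns a result or attempts run out."""
    for _ in range(max_attempts):
        result = fetch(task_id)
        if result is not None:
            return result
        time.sleep(interval)  # wait a fixed interval before retrying
    raise TimeoutError(f'No result for task {task_id}')


# Simulated API for demonstration: the result appears on the third request
responses = iter([None, None, 6])
print(wait_for_result(lambda task_id: next(responses), 'task-1', interval=0.01))
```

Capping the number of attempts stops a client from polling forever if a task is lost or a worker has crashed.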

Below is a minimal Flask example implementing this model for my trivial slow_multiply()
task above:

core.py provides objects required by both the Flask app and the workers:

from flask import Flask

from flask_sqlalchemy import SQLAlchemy
from rq import Queue
from redis import Redis

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = \
'postgresql://username:password@host:port/database'
db = SQLAlchemy(app)
queue = Queue(connection=Redis())

tasks.py implements the long running task, writing the result into the database when done:

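import time

from sqlalchemy import text

from core import db

def slow_multiply(task_id, x, y):
    time.sleep(10)  # simulate a long running computation
    result = x * y
    # text() lets SQLAlchemy bind the :task_id and :result parameters
    statement = text(
        'INSERT INTO results (task_id, result) '
        'VALUES (:task_id, :result)'
    )
    # engine.begin() commits the insert when the block exits
    with db.engine.begin() as connection:
        connection.execute(
            statement, {'task_id': str(task_id), 'result': result}
        )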

views.py implements the endpoints on the Flask app:

import uuid

from flask import abort, jsonify, request
from sqlalchemy import text

from core import app, db, queue
from tasks import slow_multiply

@app.route('/multiply', methods=['POST'])
def submit_multiplication():
    """Queue a multiplication and return the task ID."""
    body = request.get_json(force=True)
    task_id = uuid.uuid4()  # new unique id
    queue.enqueue(
        slow_multiply,
        task_id, body['x'], body['y']
    )
    # 202 Accepted: the task is queued but not yet complete
    return jsonify({'task_id': str(task_id)}), 202

@app.route('/multiply/<task_id>', methods=['GET'])
def get_multiplication(task_id):
    """Return the result for a task ID, if completed."""
    statement = text(
        'SELECT result FROM results WHERE task_id = :task_id'
    )
    with db.engine.connect() as connection:
        query_result = connection.execute(
            statement, {'task_id': task_id}
        ).first()
    if query_result is None:
        abort(404)  # no result stored yet, or unknown task ID
    else:
        return jsonify({'result': query_result[0]})

if __name__ == '__main__':
    app.run()

To run this app, you’ll need to run a Redis server locally, run the Flask app with python
views.py, and run one or more RQ workers. However, since the workers now use the Flask-
SQLAlchemy db object, the workers need to be run in the Flask application context. The
following snippet is all you need for this:

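from rq import Connection, Worker

from core import app, queue

# Run the worker inside the Flask application context so that
# the Flask-SQLAlchemy db object is usable in tasks
with app.app_context():
    with Connection():
        w = Worker([queue])
        w.work()

Note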

If you're not using Flask-SQLAlchemy in your project, you won't need to use the above snippet
to implement your own worker. Just run rq worker to run a worker process.

Example Client Code

To submit a multiplication task to be executed by the workers, submit a POST to the /multiply
endpoint with the expected JSON body:

>>> import requests
>>> response = requests.post(
...     'http://your-api.com/multiply',
...     json={'x': 2.3, 'y': 2.0}
... )
>>> response.status_code
202

The body of the response includes the task ID:

>>> response.json()
{'task_id': '6ba6be8e-ed13-4216-9959-2edcf08dd8f0'}

This can then be used to check if the result is ready:

>>> task_id = response.json()['task_id']

>>> url = f'http://your-api.com/multiply/{task_id}'
>>> response = requests.get(url)
>>> response.status_code
404

Once the result is ready, the status code will be 200 and not 404:

>>> response = requests.get(url)

>>> response.status_code
200
>>> response.json()
{'result': 4.6}
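
In practice a client will usually repeat this check in a loop rather than by hand. The following is a minimal sketch of such a polling loop, not a definitive client: wait_for_result and fetch_with_requests are hypothetical names, the interval and timeout defaults are arbitrary, and the base URL is the same placeholder used above.

```python
import time


def wait_for_result(fetch, task_id, interval=1.0, timeout=60.0):
    """Poll fetch(task_id) until it reports 200, then return the result.

    fetch is any callable returning a (status_code, json_body) pair.
    """
    deadline = time.monotonic() + timeout
    while True:
        status_code, body = fetch(task_id)
        if status_code == 200:
            return body['result']
        if status_code != 404:
            # Anything other than "not ready yet" is treated as a failure
            raise RuntimeError(f'unexpected status code {status_code}')
        if time.monotonic() >= deadline:
            raise TimeoutError(f'task {task_id} not ready after {timeout}s')
        time.sleep(interval)


def fetch_with_requests(task_id, base_url='http://your-api.com'):
    """Query the GET /multiply/<task_id> endpoint using requests."""
    import requests  # imported here so wait_for_result has no dependencies
    response = requests.get(f'{base_url}/multiply/{task_id}')
    body = response.json() if response.status_code == 200 else None
    return response.status_code, body
```

A call such as wait_for_result(fetch_with_requests, task_id) would then block until the result is available. Because the API returns 404 both for tasks that are still running and for task IDs it has never seen, the timeout is what stops the loop from waiting forever on a mistyped ID.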
