
Hands-On APIs for AI and Data Science
Using Python to Build and Use APIs for Machine Learning and Data Analytics

With Early Release ebooks, you get books in their earliest form—the
author’s raw and unedited content as they write—so you can take advantage
of these technologies long before the official release of these titles.

Ryan Day
Hands-On APIs for AI and Data Science
by Ryan Day

Copyright © 2025 Ryan Day. All rights reserved.

Printed in the United States of America.

Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O’Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Acquisitions Editor: Michelle Smith

Development Editor: Corbin Collins

Production Editor: Christopher Faucher

Interior Designer: David Futato

Cover Designer: Karen Montgomery


May 2025: First Edition

Revision History for the Early Release


2024-01-17: First release

See http://oreilly.com/catalog/errata.csp?isbn=9781098164416 for release details.

The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Hands-On APIs for AI and Data Science, the cover image, and related trade dress are trademarks of O’Reilly Media, Inc.

The views expressed in this work are those of the author and do
not represent the publisher’s views. While the publisher and
the author have used good faith efforts to ensure that the
information and instructions contained in this work are
accurate, the publisher and the author disclaim all
responsibility for errors or omissions, including without
limitation responsibility for damages resulting from the use of
or reliance on this work. Use of the information and
instructions contained in this work is at your own risk. If any
code samples or other technology this work contains or
describes is subject to open source licenses or the intellectual
property rights of others, it is your responsibility to ensure that
your use thereof complies with such licenses and/or rights.

978-1-098-16435-5
Chapter 1. Becoming An API Provider

A NOTE FOR EARLY RELEASE READERS

With Early Release ebooks, you get books in their earliest form
—the author’s raw and unedited content as they write—so you
can take advantage of these technologies long before the official
release of these titles.

This will be the 1st chapter of the final book

If you have comments about how we might improve the content and/or examples in this book, or if you notice missing material within this chapter, please reach out to the editor at [email protected].

Before you start building your first API, let me ask a question:
why would you want to build one in the first place? Being an
API provider is going to take money, time, and effort that you
could spend on other parts of your business or project. So you
should be able to state the reason you would take on this effort.

Here are some typical reasons you might want to produce an API:
You have an existing application or system that you would
like to provide partner access to. For example, a company
with a medical billing platform may create APIs to allow
doctors’ offices and hospitals to submit invoices.
People are accessing your website via screen-scraping or
reverse engineering website APIs, which indicates a
demand for an API.
You would like to extend the reach of your core product or
service to a broader audience, such as value-add products
or onto additional platforms.

Some API providers have a data-related mission, with goals of deploying models or advanced analytics themselves. For organizations with a data science or data analytics mission, there are some unique reasons that you may be publishing APIs:

You have valuable data, analytics, or metrics that you would like to provide to the public or partners. If your organization has a public mission (like a government agency) this may be free to the public. If not, the data may be a company product that you charge for.
You have created statistical or machine learning models
that you want to make available for inference by external
consumers.
You have developed generative AI models that you would like to make available to application builders.

Another relevant reason to publish APIs is to provide hosting or platform services, as the large cloud hosting services do.

In summary, although the choice to host an API should not be taken lightly, there are many solid business reasons to do so. The following list introduces the terminology related to API hosting:

Partner API

An external API that is available to a limited set of business partners or members. It requires registration and is secured.

Private API

An API shared only inside your organization.

Public API

An external API that is available to the general public. It may still require registration and security.

Meet Your Company: Sports World Central
Hosting an API is a business activity first, and a technical
activity second. So it is worth taking a moment to understand
the business context of your example project. The example you
will follow throughout Part 1 is a website named Sports World
Central, SWC for short. SWC hosts fantasy games such as
fantasy football and fantasy soccer. This places your company
in the middle of a thriving marketplace of fans who play
fantasy sports and subscribe to a variety of fantasy tools.
Figure 1-1 illustrates the business landscape that SWC competes
in.

Figure 1-1. Fantasy sports landscape

The landscape revolves around fantasy managers. These people own a fantasy team on a league host website. They play against
friends or strangers and often have multiple teams and use
multiple league hosts, such as ESPN Fantasy or
MyFantasyLeague. Managers often visit free advice websites to
get help managing their teams and making decisions such as
which players to draft at the beginning of the season or who to
start in each week’s games. If they are serious about fantasy,
they are willing to pay for subscriptions to some of the more
feature-rich advice websites.

There are around 40 million fantasy football managers according to industry estimates, making fantasy football the largest U.S. fantasy sport. Fantasy soccer is popular worldwide.
For example, more than 10 million active managers compete on
the Fantasy Premier League website, which follows just one of
the major international soccer leagues.

The fantasy manager expects all of the league hosts and advice
websites to connect to each other so that managers don’t have
to manually enter their team roster in multiple places for
advice. They don’t care how the websites accomplish this; they
expect it to just work.

SWC is a fantasy league host website. League hosts like SWC perform a range of activities, including registering teams,
conducting online fantasy drafts, calculating game scores to
determine winners, and hosting various message boards or
chat rooms for managers. Most league hosts are free and
generate business through advertising views or by drawing an
audience to articles and other content on the site. A few league
hosts charge for an ad-free experience.

The league host’s key asset is a registered and active manager.


Most managers stick with a league host for an entire season,
which is five months for professional football and even longer
for soccer. During that season, if any of the advice websites
want the business of the manager, they’ll be highly motivated to
connect with you. If the league host delivers a good experience,
they can get entire fantasy leagues to return to their platform
year after year. (Some fantasy leagues have been together for
decades.)

Because SWC is an example company just for your project, there is no real-world website to show. Figure 1-2 displays the
Yahoo! Fantasy league host website, which gives you an idea of
what your website might look like.
Figure 1-2. Example league host website (Yahoo!)

While most league hosts are fairly similar, advice websites vary
widely. On one end of the spectrum are sites with basic advice
articles and projected rankings. On the other end are full-
featured management platforms supporting multiple teams
from multiple league hosts. Somewhere in the middle are sites
with some automated features such as “rate my team” or
weekly projections tailored to a specific league host. The advice
websites consume data from sports data sources and league
hosts to create models and analytics to help the user analyze
their team and make decisions throughout the season. Some
advice websites are ad-supported, and others charge a
subscription fee.
Figure 1-3 shows an example of the analytics products that an
advice website provides, in this case, Fantasypros.

Figure 1-3. Example advice website (Fantasypros)

Sports data sources are companies that collect and sell sports
data to a variety of different subscribers, including broadcast
and media sources, league web hosts, advice websites, and
individuals. The data may include traditional statistics, sensor-
driven telemetry data, labeled training data for models, or
machine learning models.

These data providers depend on business relationships with league hosts and advice and management platforms along with
subscriptions from end users. The key to their success is
providing reliable and authoritative data through a variety of
technical channels.
Figure 1-4 demonstrates AWS NextGen Stats, which is a sports
data source.

Figure 1-4. Example sports data source (AWS NextGenStats)


APPLYING THESE CONCEPTS TO OTHER BUSINESSES

Throughout the book, you will be exploring how APIs are used
in fantasy sports, which is a large and growing entertainment
business. However, the business and technical concepts you
learn apply to many other businesses.

You will see that a common pattern is a primary source that owns a customer relationship of some kind, and secondary service providers that access the primary data via API to provide value-added services.

Here are a few examples:

Fantasy sports: League host (primary), advice and team management websites (secondary)
Personal finance: Bank (primary), retirement calculator (secondary), monthly budgeting tool (secondary)
Travel: Travel reservations website (primary), weather website (secondary), travel points tracker (secondary)
Personal fitness: Fitness tracker website (primary), weight loss calculator (secondary), personal training management site (secondary)
SWC does not currently publish APIs for use by outside
consumers. The website grew steadily for the first few years
after launch, but it has hit a plateau in recent seasons.

You have received signals from several sources indicating that SWC needs better connectivity to outside users. A noticeable
number of customer support tickets ask why SWC is not
supported by many fantasy advice websites. Additional
customer tickets are from data scientists and other data-savvy
users requesting direct access to the football data from the
website so they can create their own dashboards and metrics.
Mobile app developers have also offered to develop mobile apps
for your users on the mobile app stores if APIs are made
available. You believe that APIs show promise in increasing the
company’s reach and generating additional business. Your
portfolio project will guide you through the process of selecting,
building, and deploying the most valuable APIs.

Starting Your Portfolio Project


Have you learned a craft? What are your techniques? Are
you skilled at using certain tools and materials? What kind
of knowledge comes along with your job?

The minute you learn something, turn around and teach it to others.

—Austin Kleon, Show Your Work

This book has three parts, with multiple related portfolio projects. In Part 1, you will add functionality to your projects step by step, while learning the responsibilities of being an API provider:

Chapter 1: Developing your source data
Chapter 2: Building your first API
Chapter 3: Documenting your API
Chapter 4: Deploying your API to the cloud
Chapter 5: Protecting your API
Chapter 6: Creating a software development kit (SDK) for
your API
Chapter 7: Deploying a machine learning model via API

Each chapter will introduce additional tools as you gradually expand the capabilities of your API. Table 1-1 lists the first set of tools you will be using.
Table 1-1. New tools or services used in this chapter

Software name  Version  Purpose
GitHub         n/a      Source control, development environment, website host
Python         3.10     Programming language
SQLite         3.41.2   Store the data used by the APIs
SQLAlchemy     1.4.49   Object Relational Mapping (ORM) library to connect Python to SQLite

GitHub

GitHub is a website that plays a major role in software development. At its core, GitHub is a cloud host of source control software, but it has added additional features over the years. These capabilities are generally free or low-cost. Many prominent open-source projects use GitHub to host their source code and allow developers to contribute to the project.

You will use GitHub in several ways in this book. You will store
all of your program code in repositories while you develop it.
You will use GitHub Codespaces as your Python development
environment. You will use GitHub Pages to publish your
developer portal.

The reason the book uses so many of GitHub’s tools is that I find
they simplify environment management and work together
well. The end result will be a very professional API and data
science portfolio that demonstrates what you have
accomplished.

Most of the work could be done on a local machine or other virtual environment instead of using GitHub’s capabilities. However, the instructions will assume you are using GitHub.

Python

Python adoption has accelerated in recent years for a variety of software development tasks. The 2022 Stack Overflow
Developer Survey found that Python was tied for first as the
language most developers want to use, and was number four in
most currently used. Python is very flexible and is used in a
variety of situations. It has been called “the second-best
language for any job.”
For data scientists, Python has become the most important tool
in their toolbox. The Anaconda 2022 State of Data Science
Report found that Python was the tool most frequently used by
data scientists, followed by SQL and R.

In Part 1, you’ll be using Python to develop APIs to share data and serve machine learning models. For this book, you will be using Python 3.10.

SQLite

APIs often serve data to their users, and that data is typically
stored in a relational database. For this book, you’ll be using
SQLite for the database storage of your APIs. SQLite is well
suited for learning projects like the ones you will be developing
for several reasons:

SQLite comes installed by default with Python.


It is file-based, and can easily be stored in a git repository
like the one you’ll be using.
It does not require any additional configuration or extra
libraries.

Despite being a “lightweight” database, SQLite supports all the SQL commands that you will use and is fully supported by SQLAlchemy, which you’ll use for Python database work. It is a
great choice to begin the prototyping of a project. You might
replace it with a traditional database such as PostgreSQL or
MySQL as the application or API develops. But it is used in
many production applications, as well.

You will use SQLite 3 for your project.

SQLAlchemy

SQLAlchemy is a popular Python database toolkit and object-relational mapper (ORM), and it works nicely with FastAPI, which will be introduced in Chapter 2. Here are a few of the ways that SQLAlchemy benefits Python users:

It provides query access to databases using Python, without using SQL.
It populates Python objects with the data from the source
database without requiring any conversion of datatypes.
It supports a variety of databases.
It allows the same Python code to be used with different
underlying databases.
It creates queries as prepared statements, which combat
SQL Injection attacks.
WARNING

SQL injection is a serious vulnerability in any software that accepts input from users
and queries a database with it, including web applications and APIs. It occurs when
bad actors insert malicious code into inputs that are intended for data values.

Using prepared statements (also known as parameterized statements) instead of raw SQL queries is one technique to reduce the risk of SQL injection. For more information, reference OWASP’s article on SQL Injection.
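
To make the risk concrete, here is a minimal sketch using Python’s built-in sqlite3 module (an illustration only, not one of the book’s project files); the table and input values are assumed:

import sqlite3

conn = sqlite3.connect("fantasy_data.db")
user_input = "Tucker"  # imagine this value arrived from an API consumer

# Vulnerable pattern: pasting user input directly into the SQL string
# query = f"SELECT * FROM player WHERE last_name = '{user_input}'"

# Safer pattern: a parameterized statement treats the input strictly as a value
rows = conn.execute(
    "SELECT * FROM player WHERE last_name = ?", (user_input,)
).fetchall()
print(rows)
conn.close()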

You will be using SQLAlchemy 1.4.49 for your project.

Developing Your Source Data


In the example scenario, you will assume that SWC is an
established website with a large amount of data about fantasy
teams, NFL players, and managers. You do not need to create a
website for your project, but you will create a database to
simulate a small portion of the data that would be contained in
a full league host website. This will allow you to create the APIs
using the data that would be maintained on a normal league
host website.
TIP

For instructions on creating a GitHub repository and a GitHub Codespace, view Appendix A. You should commit your changes periodically to GitHub to avoid losing code and more easily trace the history of your development.

To begin, log into GitHub.com and create a new GitHub repository named hands-on-api with a basic readme.md file. Create a new GitHub Codespace on the main branch of your new repository. Your display should match Figure 1-5.

Figure 1-5. New GitHub Codespace

The default Codespace comes pre-loaded with Python and the Visual Studio Code editor. This simplifies your development
environment setup. On a local development machine, you
would need to create a Python virtual environment. However,
Codespaces provision a clean environment dedicated to this
GitHub repository. So a virtual environment is not necessary.

Next, create a file in the root folder of your repository named .gitignore. This file is important in excluding any files from your
Codespace that you don’t want to store in the repository. This is
useful for temporary files, cached files, and any sensitive
resources. Add the following contents to this file:

#gitignore for part 1


.mypy_cache
.vscode
__pycache__
.pytest_cache

# macOS
.DS_Store

# google cloud
**/gcloud/
**/google-cloud-sdk/
*.gz

# AWS
**/aws
*.zip

Verify that Python is loaded in your Codespace by entering python3 --version in the terminal command line. You
should see the following or similar:

$ python3 --version
Python 3.10.13

The next tool you will be using for the job is SQLite.
Conveniently, SQLite comes installed with Python.

Create a directory for your Chapter 1 code with the following commands:

mkdir chapter1_project
cd chapter1_project

Create a new database using SQLite with the following command:

sqlite3 fantasy_data.db

Which produces:
SQLite version 3.41.2 2023-03-22 11:56:21
Enter ".help" for usage hints.
sqlite>

Creating Database Tables

A league host website would have a large number of tables containing fantasy data. For this project, you’ll construct a few
tables to provide data to the APIs you will build. You’ll begin
creating the database tables for one vital section of the website:
the player list and weekly scores. Figure 1-6 shows the database
tables you will start with.

Figure 1-6. Database table structure.

Create the Structured Query Language (SQL) table creation scripts. These are often called DDL scripts because they use a subset of SQL named Data Definition Language.
TIP

Within SQL, two subsets of commands are used for setting up and maintaining
databases:

Data Definition Language (DDL): The SQL commands that create database
structures.
Data Manipulation Language (DML): The SQL statements that change the data,
through inserts, updates, deletes, and similar statements.

Execute the following statements inside the SQLite interface to create the initial database tables. (You will add more later to represent a more complete database.)

/* Table creation script - Chapter 1*/

CREATE TABLE player (
    player_id INTEGER NOT NULL,
    first_name VARCHAR NOT NULL,
    last_name VARCHAR NOT NULL,
    PRIMARY KEY (player_id)
);

CREATE TABLE performance (
    performance_id INTEGER NOT NULL,
    week_number VARCHAR NOT NULL,
    fantasy_points FLOAT NOT NULL,
    player_id INTEGER,
    PRIMARY KEY (performance_id),
    FOREIGN KEY(player_id) REFERENCES player (player_id)
);

Each table has a primary key, which is a column that uniquely identifies each record in a table. The performance table also
defines a foreign key, which is a column in a child table that
matches the primary key of a parent table. This defines a
parent-child relationship between the tables.
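
To see the relationship in action once the sample data is loaded (later in this chapter), here is a minimal sketch using Python’s built-in sqlite3 module (an illustration only, not one of the project files):

import sqlite3

conn = sqlite3.connect("fantasy_data.db")
# Join each child performance row to its parent player row via the foreign key
rows = conn.execute(
    """
    SELECT p.first_name, p.last_name, perf.week_number, perf.fantasy_points
    FROM player p
    JOIN performance perf ON perf.player_id = p.player_id
    """
).fetchall()
for row in rows:
    print(row)
conn.close()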

This book does not teach the syntax of SQL, but the scripts used
are fairly basic. To learn more about SQL, I recommend
Learning SQL: Generate, Manipulate, and Retrieve Data, 3rd
Edition by Alan Beaulieu (O’Reilly, 2020).

To verify that the tables were created, enter .tables resulting in the following:

sqlite> .tables
performance player

Loading Example Data

Now that the tables are created, you will insert some sample
data into them. There are several ways to do this, but for
simplicity, you will use a direct SQL statement.

First, load 10 rows into the player table by entering the following command at the SQLite prompt:

-- Insert 10 kickers into the player table
INSERT INTO player (first_name, last_name)
VALUES
('Justin', 'Tucker'),
('Harrison', 'Butker'),
('Wil', 'Lutz'),
('Matt', 'Prater'),
('Mason', 'Crosby'),
('Daniel', 'Carlson'),
('Graham', 'Gano'),
('Younghoe', 'Koo'),
('Greg', 'Joseph'),
('Eddie', 'Pineiro');

You’ll notice that the script included values for two of the three columns in the table, but not the primary key player_id. Because this is the primary key, SQLite auto-increments the value for this field when new records are inserted.

Before querying the table, type .headers on so that column names will be included in the results. Then enter select * from player; resulting in:

sqlite> .headers on
sqlite> select * from player;
player_id|first_name|last_name
1|Justin|Tucker
2|Harrison|Butker
3|Wil|Lutz
4|Matt|Prater
5|Mason|Crosby
6|Daniel|Carlson
7|Graham|Gano
8|Younghoe|Koo
9|Greg|Joseph
10|Eddie|Pineiro

As a shortcut to loading the performance table, the script will pull the values from the player table, and assign everyone a fantasy_points score of 7.5 for week 1. Type the following
command at the SQLite prompt:

-- Insert sample data into the performance table
INSERT INTO performance (player_id, week_number, fantasy_points)
SELECT
    player_id,
    '2023_1',
    7.5
FROM
    player;

Verify the data loaded into the performance table. Type select
* from performance; resulting in:

sqlite> select * from performance;
performance_id|week_number|fantasy_points|player_id
1|2023_1|7.5|1
2|2023_1|7.5|2
3|2023_1|7.5|3
4|2023_1|7.5|4

5|2023_1|7.5|5
6|2023_1|7.5|6
7|2023_1|7.5|7
8|2023_1|7.5|8
9|2023_1|7.5|9
10|2023_1|7.5|10

You have loaded sample data in your database.

Accessing Your Data Using Python


There are several ways to access this data in Python. For
example, you could create a connection to the database and
execute SQL queries directly. Although this method is fairly
simple, you would quickly run into several issues, such as
mapping the SQLite datatypes into Python objects. If you
wanted to dynamically query the database based on user input
(as you will when you create your APIs), you would need to take
steps to avoid SQL injection attacks.

This is where an ORM saves a lot of effort. You will be using a very common Python ORM: SQLAlchemy.

Installing SQLAlchemy in Your Environment

SQLAlchemy is the first Python library that you will need to install directly in your virtual environment. You want to be
certain of the version of SQLAlchemy installed, so first create a
requirements.txt file in the directory with your Python code.

In your editor, create a file named requirements.txt with the following contents and save the file:

#Chapter 1 pip requirements
SQLAlchemy==1.4.49

In future chapters, you will add additional Python libraries.
Using the requirements file is a convenient way to install
multiple libraries, and make sure the versions of the libraries
are all compatible with one another.

To install the libraries, execute the following command:

pip3 install -r requirements.txt

You should see a message stating that SQLAlchemy 1.4.49 has been successfully installed, or was “already satisfied.”
To verify this, type the following command:

pip3 show SQLAlchemy

That should return:

Name: SQLAlchemy
Version: 1.4.49
Summary: Database Abstraction Library
Home-page: https://www.sqlalchemy.org
Author: Mike Bayer

Creating the Python Files For Database Access
You will now create the files that are required to query the
database using Python. Create all files in the chapter1_project
directory. These follow the template used in the SQL
(Relational) Databases section of the FastAPI Tutorial, with a
few updates.

The directory listing when you complete will look like the
following:

.
└── chapter1_project
├── database.py
├── fantasy_data.db
├── main_cli.py
├── models.py
└── requirements.txt

Table 1-2 summarizes the purpose of each file.

Table 1-2. Purpose of the database-related Python files

File name         Purpose
database.py       Configures SQLAlchemy to use the SQLite database
fantasy_data.db   SQLite database containing source data
main_cli.py       Connects to the database and executes a query
models.py         Defines the Python classes that match the database tables
requirements.txt  Identifies specific versions of libraries used for the project

database.py will set up the SQLAlchemy configuration to connect to the SQLite database, along with some other Python objects that you’ll use for database work.

The tasks that you need to accomplish in this file are the
following:

Create a database connection that points to the SQLite database and has the correct settings.
Create a parent class that you’ll use to define the Python
table classes

Here is the complete database.py file:

"""Database configuration - Chapter 1"""


from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarativ
from sqlalchemy.orm import sessionmaker

SQLALCHEMY_DATABASE_URL = "sqlite:///./fantasy_da

engine = create_engine(
SQLALCHEMY_DATABASE_URL, connect_args={"check
)
SessionLocal = sessionmaker(autocommit=False, aut

Base = declarative_base()

Take a look at this file piece by piece. At the top of most Python
files, you will import the external libraries that you will use. In
this case, three specific SQLAlchemy libraries are imported.

from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
The next three steps work together to get the session, which is a
SQLAlchemy object that manages the conversation with the
database and allows the Python code to query it.

Create a database URL that tells SQLAlchemy what type of database you’ll be using (SQLite) and where to find the file (in the same folder as this file, with the name fantasy_data.db).

SQLALCHEMY_DATABASE_URL = "sqlite:///./fantasy_data.db"

Using this database URL, create an engine object, with one configuration setting. The check_same_thread setting allows the SQLite connection to be used outside the thread that created it.

engine = create_engine(
    SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False}
)

Then use the engine object to create a session factory named SessionLocal that points to that engine and adds a couple more configuration settings.

SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)


The last command in this file creates a Base class. You will
import this class in the models.py file that you will create next.
You are using declarative mapping, which allows for a simpler
definition. The alternative would be an imperative mapping,
which requires a few extra steps.

Base = declarative_base()
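
For contrast, here is a rough sketch of what an imperative mapping of the player table could look like in SQLAlchemy 1.4 (an illustration only; the project uses the declarative style):

from sqlalchemy import Column, Integer, String, Table
from sqlalchemy.orm import registry

mapper_registry = registry()

# With imperative mapping, the table and the class are defined separately...
player_table = Table(
    "player",
    mapper_registry.metadata,
    Column("player_id", Integer, primary_key=True),
    Column("first_name", String),
    Column("last_name", String),
)

class Player:
    pass

# ...and then wired together explicitly
mapper_registry.map_imperatively(Player, player_table)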

models.py will create the Python representation of the data. The classes in this file will be used when you query databases in Python.

Here are the two tasks that you need to perform in this file:

Define the SQLAlchemy classes to store information from the player and performance tables.
Describe the relationship between these tables, so that the
Python code can access the related tables.

Here are the full contents of models.py:

"""SQLAlchemy models - Chapter 1"""


from sqlalchemy import Boolean, Column, ForeignKe
from sqlalchemy.orm import relationship

from database import Base


class Player(Base):
__tablename__ = "player"

player_id = Column(Integer, primary_key=True,


first_name = Column(String)
last_name = Column(String)

performances = relationship("Performance", ba

class Performance(Base):
__tablename__ = "performance"

performance_id = Column(Integer, primary_key=


week_number = Column(String)
fantasy_points = Column(Float)

player_id = Column(Integer, ForeignKey("playe

player = relationship("Player", back_populate

Take a look at models.py piece by piece. I won’t usually explain library imports, but in this case, one is worth noticing. The
database import refers to the database.py file with the
SQLAlchemy configuration. You are using the Base class as the
parent for the classes in the models.py file:

from database import Base

NOTE

When you import a class from another Python file in the same directory, you can
reference the filename without the .py extension.

Now it’s time to begin the definition of the Player class, which
is the Python class you’ll use to store data from the SQLite
player table. You do this using the class statement, stating the
name of the class, and specifying that it will be a subclass of the
Base class imported from the database.py file. Use the magic
attribute __tablename__ to tell SQLAlchemy to reference the
player table. Because of this statement, when you ask
SQLAlchemy to query Player , it will know behind the scenes
to access the player table in the database. This is one of the key
benefits of an ORM: mapping the Python code automatically to
the underlying database:

class Player(Base):
__tablename__ = "player"
The rest of the Player class definition maps additional details
about that table. Each statement defines one attribute in the
class using the Column method provided by SQLAlchemy.

Here are a few things to notice about the definitions:

The attribute names are automatically matched to the column names in the database.
The datatypes used (e.g. String , Integer ) are
SQLAlchemy datatypes, rather than standard Python
datatypes.
The primary_key definition provides several benefits
from SQLAlchemy, such as query optimization and enabling
relationships between classes.

player_id = Column(Integer, primary_key=True, index=True)
first_name = Column(String)
last_name = Column(String)

Along with the definition of the tables, you define the foreign-
key relationship between the tables using the
relationship() function. This results in a
Player.performances attribute that will return all the
related rows from the performance table for each row in the
player table:
performances = relationship("Performance", back_populates="player")

The file also contains the definition for the Performance class. The definition is similar to the Player definition. One thing to notice is that the relationship() function results in a Performance.player attribute, which you can use to retrieve the player related to each performance.
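
As a quick illustration (assuming a session like the one you will create in main_cli.py), the relationship can be traversed in both directions:

# Hypothetical snippet: traversing the relationship both ways
first_player = session.query(models.Player).first()
print(first_player.performances)    # parent -> children (list of Performance objects)

first_performance = session.query(models.Performance).first()
print(first_performance.player.last_name)    # child -> parent (a single Player)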

main_cli.py is the file you will execute with Python to query the database and print the results. It references the models.py and database.py files.

Here is a summary of the tasks that you need to perform in this file:

Launch the program from the command line.


Open a connection to the database, using the configuration
from database.py.
Query the database for all the player records, and their
associated performance records.
Display a formatted printout of the player and performance
records.

Here are the full contents of main_cli.py:


"""Command line executable - Chapter 1"""
import models
from database import SessionLocal, engine

def main():
with SessionLocal() as session:
players = session.query(models.Player).al
for player in players:
print(f'Player ID: {player.player_id}
print(f'Player ID: {player.first_name
print(f'Player ID: {player.last_name}
for performance in player.performance
print(f'Performance ID: {performa
print(f'Week Number: {performance
print(f'Fantasy Points: {performa

# If running from the command line, main() is cal


if __name__ == "__main__":
main()

Take a look at this code section by section. I’m going to start at the bottom, since in this situation, that’s the order that the execution will occur. The statement at the bottom of the file is commonly seen in any Python script that will be run from the command line. When a Python script is executed directly, a special variable named __name__ is populated with the value "__main__". So this statement ensures that the main() function will be called in that situation:

# If running from the command line, main() is called
if __name__ == "__main__":
    main()

Breaking down the main() function, you see a few key items. You are using the SessionLocal object that was created in the database.py file with information about the SQLite database and the settings selected. The with...as code creates a Python context manager, which is Python code that performs enter logic before a statement runs and exit logic after it runs. By using the context manager, the session will be opened for the database operations and then closed when the operations finish.
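
Conceptually, the context manager is roughly equivalent to this hand-written sketch:

session = SessionLocal()
try:
    # enter logic has run: the session is open for database work
    players = session.query(models.Player).all()
finally:
    # exit logic always runs: the session is closed even if an error occurs
    session.close()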

The database task that you will perform in this situation is query(models.Player).all(). It involves two steps combined into a single statement:

Create a SQLAlchemy Query() using the Player class.
Return all the player records as a list of Player objects.

Combined, the result is similar to a select * from player SQL query, with the advantages that come from an ORM:

with SessionLocal() as session:
    players = session.query(models.Player).all()
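
As a side note, converting a SQLAlchemy query to a string shows the SQL that the ORM will generate, which is handy for learning and debugging:

query = session.query(models.Player)
print(str(query))  # prints the compiled SELECT statement SQLAlchemy will run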

Table 1-3 describes a few of the common commands that you can use to query the data using SQLAlchemy.
Table 1-3. SQLAlchemy query cheat sheet

Example command: session.query(Player).all()
Purpose: Retrieve all Player values

Example command: session.query(Player).filter(Player.last_name == "Tucker")
Purpose: Filter and return the Player values that match a data element value

Example command: session.query(Player).order_by(Player.last_name).all()
Purpose: Return all Player values sorted by Player.last_name
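
For example, the filter command could replace the .all() query in main_cli.py (an illustrative variation, not part of the project files):

# Retrieve only the players whose last name is Tucker
tuckers = (
    session.query(models.Player)
    .filter(models.Player.last_name == "Tucker")
    .all()
)
for player in tuckers:
    print(player.first_name, player.last_name)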

The remaining statements create a nested loop that prints the contents of each Player object retrieved from the database. Notice a second loop that retrieves each player’s performance records, using the relationship defined in the models.py file:

for player in players:
    print(f'Player ID: {player.player_id}')
    print(f'Player ID: {player.first_name}')
    print(f'Player ID: {player.last_name}')
    for performance in player.performances:
        print(f'Performance ID: {performance.performance_id}')
        print(f'Week Number: {performance.week_number}')
        print(f'Fantasy Points: {performance.fantasy_points}')

Now that all the code is written, you are ready to execute it and
see how Python handles the SQLite database. From the
command line enter the command python3 main_cli.py
resulting in:

Player ID: 1
Player ID: Justin
Player ID: Tucker
Performance ID: 1
Week Number: 2023_1
Fantasy Points: 7.5
Player ID: 2
Player ID: Harrison
Player ID: Butker
Performance ID: 2
Week Number: 2023_1
Fantasy Points: 7.5
Player ID: 3
Player ID: Wil
Player ID: Lutz
Performance ID: 3
Week Number: 2023_1
Fantasy Points: 7.5
...[listing continues]
By creating these three Python files, you have connected to a
SQLite database, defined the database structure, and queried
the database.
EXTENDING YOUR PORTFOLIO PROJECT

In addition to the SWC portfolio project, you may want to use similar techniques to build a portfolio project in another
business domain as you proceed through the book. This is a
great way to apply your learning to an area that you are
familiar with or interested in learning. In each chapter, I will
suggest some ways that you can apply the techniques you have
learned for another portfolio project that is uniquely yours.

Here is how you can extend your project based on this chapter:

Identify a business or market with a similar primary/secondary/user relationship. An earlier sidebar gave some examples.
Model a parent-child relationship of two or more tables that
would store data related to your idea.
Create a new GitHub repository for this project and launch
a new Codespace.
Create a SQLite database using DDL scripts and data
loading scripts to populate it with sample data.
Using the Chapter 1 files as a template, create Python code
to represent your database using SQLAlchemy, and retrieve
data from your database tables.
ADDITIONAL RESOURCES

To learn more about API Product Management, here are a few recommended resources:

APIs: A Strategy Guide, by Daniel Jacobson, Greg Brail, and Dan Woods (O’Reilly, 2011)
Continuous API Management, 2nd Edition, by Mehdi
Medjaoui, Erik Wilde, Ronnie Mitra, Mike Amundsen
(O’Reilly, 2021)

SQL is one of the essential skills for data professionals. The number of resources available is limitless, but here are a couple to start:

Learning SQL, 3rd Edition, by Alan Beaulieu (O’Reilly, 2020)


SQL Pocket Guide, 4th Edition by Alice Zhao (O’Reilly, 2021)

To learn more about SQLAlchemy, here are a few recommended resources:

SQLAlchemy 1.4 documentation


Essential SQLAlchemy, 2nd edition by Jason Myers and Rick
Copeland (O’Reilly, 2015)

To become inspired to develop in the open and share the projects you’re working on:
Show Your Work! 10 Ways to Share Your Creativity and Get
Discovered by Austin Kleon (Workman Publishing Company,
2014)

Summary
You’ve made a good start in your journey as an API provider.
Let’s review what you have accomplished so far:

You’ve reviewed reasons for becoming an API provider and identified why Sports World Central wants to become one.
You’ve reviewed some of the major components of the
fantasy sports landscape and seen how your company fits
in.
You’ve set up your development environment using GitHub
and GitHub Codespaces.
You’ve simulated the website database using SQLite to store
the contents of the player and performance data.
You’ve created a Python program using SQLAlchemy to
query the database and store the results as Python objects.

In Chapter 2, you will begin to examine the needs of your API consumers and design your first API. Then you will implement
that API using FastAPI.
Chapter 2. Building Your First API

A NOTE FOR EARLY RELEASE READERS

With Early Release ebooks, you get books in their earliest form
—the author’s raw and unedited content as they write—so you
can take advantage of these technologies long before the official
release of these titles.

This will be the 2nd chapter of the final book

If you have comments about how we might improve the content and/or examples in this book, or if you notice missing material within this chapter, please reach out to the editor at [email protected].

The biggest source of waste in a startup is building something that nobody wants.

—Eric Ries, author of The Lean Startup

In Chapter 1, you took your first steps to becoming an API provider. You reviewed some typical reasons for providing APIs
and applied them to your example company: Sports World
Central (SWC). You also created a starter database (which for
your scenario represents an existing web application), and
created the Python classes to interact with it, using
SQLAlchemy.

It would be tempting to start creating APIs based on the tables that you have at hand or what is most convenient for you as a
developer. Unfortunately, it’s not unusual for a technical team
to publish APIs based on the data that is readily at hand and
that their team uses the most. We might call this an engineer-
centric design process. Or more broadly a producer-centric
approach.

Technical proofs of concept (POCs) can be useful for experimentation, but using a producer-centric approach to
develop your real API products leads to APIs that don’t get used.
The lean manufacturing term for this is waste, which is any
action that does not add value to the customer.

The approach you will be using for your portfolio project is user-centric. You will research your market, find real users, and
identify the APIs that can help them accomplish things that
matter to them.

The following list introduces terminology that is related to the user-centric design approach:
Design thinking

A human-centered approach to innovation and product development. This approach begins with user desirability,
which describes solutions that fulfill a need or desire of
users. Then it works to find solutions that are both
technically feasible and economically viable.

Primary user research

Directly interacting with potential consumers through methods such as surveys, interviews, focus groups, and
usability testing.

Secondary user research

Viewing additional sources beyond direct user input to validate or further investigate user needs. Includes
techniques such as books, articles, published research,
and reviews of existing products and solutions.

Usability testing

Observing users while they interact with a realistic scenario related to your product. You may observe them
using prototypes or competitor products.

User story
Structured template used in agile development to capture
user needs. Helps designers and developers to focus on
user outcomes instead of technical objectives.

Walking in the API Consumer’s Shoes


Before you criticize a man, walk a mile in his shoes. That
way, when you do criticize him, you’ll be a mile away and
have his shoes.

—Steve Martin

As mentioned in Chapter 1, you have received several indications that APIs and data sharing would be beneficial to
current fantasy football customers, and could attract new ones.
You got started by simulating the data structures for the data
that seemed very important to you: individual player scoring.
Before launching into the creation of this API (or any others),
additional user research is needed to decide the right APIs to
prioritize.

You will use two methods of user research: primary and secondary. For primary research, you review the customer
support tickets that were received previously to find the specific
details on the type of functionality on the advice sites they
would like to use, or specific data they have requested. You also
include several targeted questions on the annual year-end
customer feedback survey regarding additional external
functionality they desire including external sites and mobile
applications.

For secondary research, you decide to investigate the key fantasy advice websites to see which features are supported for a large number of competitor league hosts (and not supported for yours). You also review the API documentation of
these websites to see the major features that are supported. To
focus on the data science and analytics users, you spend time
researching discussion groups related to fantasy football,
football analytics, Python, R, and data science to find questions
that have been asked about using Sports World Central data
and solutions or sample code that people have published. You
also review open-source code repositories and code libraries
that reference related topics.

Based on the initial research, you have begun to understand the needs of a variety of potential API consumers. To refine your
understanding and validate (or invalidate) some hypotheses
you have developed, you decide to perform additional primary
research using usability testing. Usability testing is a method for
learning about how users interact with a product through a
combination of interview questions and direct observation of a
customer using a product.

You conduct a dozen or so interviews with potential users that represent a cross-section of user types, goals, and demographic
categories. You are learning about tasks they are trying to
perform and are not able to complete. For example, if they have
tried to use an advice tool that does not support SWC, you
observe them using the tool with data from another league host
that they support. Or if they are unable to create analytics
products with the SWC data, you observe other workarounds
they have used such as screen-scraping the SWC website for
partial data.

Key Results of the Research

After performing this research, you have identified several distinct user types, the primary tasks they are trying to
accomplish, and the pain points they are encountering. You
summarize what you have learned in narrative form as follows:

Advice website users: These are current or potential SWC fantasy football managers who use outside websites to help
manage their teams. Important activities include importing
the league teams, rosters, and scoring rules to use “rate my
draft” functions, get advice on weekly lineups, or view
playoff projections. Their current pain points are that SWC
is not supported by the most popular websites.
Advice websites: The advice websites want to have as broad
coverage of all the league websites as possible. They
provide a variety of advanced analytics products based on
league host data. They want read-only access to teams,
rosters, and league scoring rules. Some of the more
advanced websites also want read-write access to set
starters and make roster moves on behalf of the managers.
They currently can’t offer any of these services to SWC managers
and want to expand their services to include them. Their
preferred method for accessing league host data is via APIs.
Data science users: These users have specialized skills and
use data science-related tools and languages such as
Python, R, Jupyter Notebooks, and Microsoft Excel. Some
have professional titles such as data scientist, data engineer,
or data analyst. Others are hobbyists with no formal title.
They would like read-only access to current season data to
create a variety of analytics products. They also want data
for prior seasons to create analytic products that track
trends over time and make predictions from historical data.
Their primary pain point is that the data is not available
from the SWC website to use in their products. Some have
made attempts to screen-scrape the SWC website, but the
information is limited to current seasons, and website
design updates periodically break their ETL scripts. They
would prefer the data be available as a Python or R library,
but they’re willing to work with whatever source they can
get reliably. They would also like a bulk download option
for the data.
Mobile app developers: These are independent software
developers who create mobile apps in the mobile app
stores. They would like to create mobile apps that use live
data from the SWC website to show real-time scores and
allow players to drop players and set their starting lineup.
They can charge for these apps directly or include
advertising. Their pain point is that SWC does not make its data available.
Mobile app users: These are current or potential SWC
fantasy football managers who would like to use a mobile
app to track the progress of their live game scores in real-
time, chat with other managers, and manage their team
rosters. Their pain points are that the current SWC website
is not mobile-friendly and no third-party apps are available
in mobile app stores.
TIP

To further empathize with users, you can create a user persona which defines key
users with additional details such as age, education, and even a fictional name. Read
Personas: learn how to discover your audience, understand them, and pivot to
address their needs for more information about this technique.

Food for Thought

During the research process, you also identified several recurring requests that are not relevant to the project at hand,
but might be worth exploring in the future for additional
business opportunities:

Soccer fantasy managers: You focused your research on fantasy football because the feedback received previously focused on this, and football is the largest of the SWC fantasy sports. However, during your research into message boards, open source code repositories, and general web searches related to “football analytics” and “fantasy football”, you came across an active community of fantasy soccer managers in both the U.S. and abroad who were interested in fantasy data products and access to advice websites.
Sports analytics users: These are users who are interested
in accessing general (non-fantasy) NFL data. They are
interested in raw data feeds of football statistics, as well as
customized models and metrics. Similarly, there is also a
large number of users interested in soccer analytics beyond
fantasy soccer.

Table 2-1 summarizes the results of your research.

Table 2-1. Potential consumers for your APIs

User type: Advice website users
Primary tasks: Viewing teams and leagues in advice sites
Pain points: SWC league host is not supported by advice sites

User type: Advice websites
Primary tasks: Support as many league hosts as possible, with APIs if possible
Pain points: SWC data is not accessible

User type: Data science users
Primary tasks: Create dashboards, charts and models
Pain points: They can’t access SWC data reliably

User type: Mobile app developers
Primary tasks: Create third-party apps using SWC data
Pain points: SWC is not available for use by mobile apps

User type: Mobile app users
Primary tasks: Live in-game updates on mobile devices
Pain points: SWC website is not mobile-friendly and no third-party apps are available

Summarizing User Research


After sorting through the user research, you have a much better
understanding of what potential consumers are trying to
accomplish, and their current pain points. Some of the findings
confirmed your starting assumptions, especially that there is a
strong demand for connected services.

On the other hand, several recurring themes surprised you. One surprise was that very few people directly asked for APIs, even
though the functionality they wanted or the data they sought
would logically be provided by APIs. They more often asked for
connectivity, support, or data. You also expected that most of
the data needs would revolve around individual player scoring,
but many of the requests focused on league and roster
information. You also captured the interest in soccer fantasy
services and sports data and analytics beyond fantasy. These
are interesting ideas for future exploration.

After reviewing the data that could fulfill each of the needs, you
find that user needs can be divided into roughly four
quadrants, based on how frequently updates are needed, and
whether read-write access is needed as shown in Figure 2-1.
Figure 2-1. Four quadrants of requirements.

The majority of users fell into Quadrant 1 or Quadrant 3. The Quadrant 1 users who wanted real-time data also wanted read-
write access for live game updates and access to league
management functions through mobile apps or the advanced
advice website APIs. The Quadrant 3 users needed daily
updates to create descriptive, predictive, or prescriptive
analytics products. These were data science users and advice
websites.
CATEGORIES OF ANALYTICS

Many of your potential consumers would like access to your data to create analytics. Analytics projects can be grouped into
three categories:

Descriptive analytics products describe or list information about a topic. Examples include dashboards, reports, and charts. For
fantasy football data, examples would include player ranking
charts, team analysis, and historical listings.

Predictive analytics creates predictions or inferences about future state using some type of model. The models range from
simple formulas to complex statistical or machine-learning
models. For fantasy football, typical examples of predictive
analytics include player scoring projections and team playoff
predictions.

Prescriptive analytics goes beyond predicting future outcomes to prescribing or recommending actions that users should
perform to achieve a desired outcome. It recommends “knobs
that you can turn” to make adjustments to future outcomes.
Typical examples include recommendation engines and various
wizard interfaces to guide users to make decisions. In fantasy
football, examples of prescriptive analytics include trade
wizards, free-agent finders, and drafting tools.
Selecting the First API Products
The design thinking methodology is a human-centered
approach to innovation and product development. This
approach begins with user desirability, which identifies
solutions that fulfill a need or desire of users. Then it works to
find solutions that are both technically feasible and
economically viable.

Using this general approach will help select the first API
products that SWC will develop, and decide between the
Quadrant 1 and Quadrant 3 users. Both quadrants appear to
have roughly the same number of users that desire them, so
user desirability is fairly even. Both have a potential economic
benefit because they provide access to value-added services
that will benefit existing SWC customers and generate new
customers. From the technical perspective, there is a clear
difference: providing read-only, daily data (Quadrant 3) will be
significantly simpler than providing real-time, read-write
access (Quadrant 1). Simpler generally means less expensive to
develop and host, which is the other half of the economic
viability.
Using these three principles leads to a clear candidate for the
first API products: Quadrant 3. To further refine these products,
you will create user stories, which are structured descriptions
of the user type, goals, and motivations that can be fulfilled by a
product.

Creating User Stories

A common template for user stories is the following:

As a (user type)
I want to (goal or intent)
So that (motivation or benefits)

You create several user stories for the Quadrant 3 needs so that
you can decide which API products can fulfill them.

1. As an SWC team manager, I want to view my fantasy league and team on advice websites So that I can win my league and beat my friends.
2. As an advice website provider, I want to create analytics
products such as roster advice, league analysis, and playoff
predictions using current-season SWC data So that I can
increase the number of customers that use my website and
increase ad or subscription revenue.
3. As an advice website provider, I want to create analytics
products such as fantasy draft tendencies using historical
SWC data So that I can gain draft-season subscribers and
web views.
4. As a data science user, I want to create analytics products
such as dashboards, charts, models, and metrics using SWC
data So that I can demonstrate and grow my data science
skills, explore hypotheses and hunches about fantasy data,
and build my reputation in both fields.
5. As a data science user, I want to create analytics products
such as historical league dashboards and multi-year machine
learning models using historical SWC data So that I can
demonstrate and grow my data science skills, create more
advanced models, and create league history websites for the
fantasy leagues I participate in.

Selecting Feasible User Stories

Using the principles of technical feasibility and economic viability above, you see that user stories one, two, and four will require current-season data about the following entities:

Leagues
Teams
Players
This relies on information that SWC stores today, and can be
made available for read-only APIs. In contrast, user stories
three and five require historical (prior-season) data about
teams and leagues. SWC does not retain that information, so
supporting these user stories is not viable. If you want to
consider supporting it in the future, you could make plans for it.
You would want to consider the economic feasibility of it in that
case. You have a target then: you will work to implement APIs
that fulfill the user stories one, two, and four.

Summary of Progress

The user research and human-centered design you conducted
have proven very valuable. You identified several users who
may consume your APIs. You understand their motivations and
pain points.

You have applied the three principles of user desirability,
technical feasibility, and economic viability. This has helped you
exclude read-write APIs, real-time data, and historical data. You
have three user stories that should be an excellent starting
point for your API development.

Selecting Your API Architecture


For your portfolio project, you have already made a few
technical decisions. For instance, you are using Python as your
programming language, SQLite as your database, and
SQLAlchemy as your object-relational mapper. What’s next? For
production-grade APIs that are hosted publicly, the number of
technologies available is practically limitless. For example, take
a look at the API Platform Landscape from the Postman 2023
State of the API Report or the list of technologies in the APIDays
API Landscape.

There are so many technological options available to design,
develop, deploy, and manage APIs that it can be a bit
overwhelming. However, it is not necessary to implement tools
in every category of the API landscape to build APIs. The
chapters in Part 1 will explain important concepts in API
building and implement them with a set of tools that make
sense together. This won't address everything necessary to run
APIs in production, but it will give you a solid foundation for
implementing some key concepts.

The following list introduces terminology related to the
technical aspects of API building:

API
A set of endpoints related to a single data source or
business domain.

API endpoint
Individual addressable resource in an API. A combination
of an HTTP verb and a URL. Also referred to as an operation.

API version
A group of API endpoints that are maintained for some
time without breaking changes so that consumers can
count on them.

Breaking changes
Changes to API endpoints that cause functionality to stop
working, and may require consumers to make changes in
their program code.

De-serialization
Converting input parameters and message bodies from
JSON, a text format used to transmit data, into Python
objects. Pydantic performs this task in your project.

HTTP verb
Specific type of action for HTTP traffic. Examples include
GET, which reads data, and DELETE, which deletes data.

HTTP status codes
A standard list of codes returned by web applications,
which include APIs. View the full list here: HTTP status
codes.

Path parameter
API parameter that is included in the URL path. In the
example /customers/{customer_id}, the customer ID is a
path parameter.

Query parameter
API parameter that is included in the URL with a
question mark in front and an ampersand between
parameters. In the example
/customers?customer_id={customer_id}, the customer ID
is a query parameter.

Serialization
Converting Python objects into JSON, the text format used
to transmit them to the consumer. Pydantic performs this
task in your project.

Web framework
A set of libraries that simplify common tasks for web
applications.

API Architectural Styles

One of the most significant decisions to make is selecting the
API architectural style you will be using. Since you are using a
consumer-centric design process, it follows that one of your
first goals would be to use a style that is widely supported and
understood by potential consumers. The Postman 2023 State of
the API Report found the following top architectural styles in
use by survey participants:

REST: 86%
WebHooks: 36%
GraphQL: 29%
SOAP: 26%
WebSockets: 25%
gRPC: 11%

The overwhelming popularity of REST found in the survey is
consistent with what you will experience if you explore most
public APIs. REST is currently the typical style used for APIs. For
an example relevant to your project, all of the APIs that I have
found for real-world fantasy football platforms use REST.
There are a couple of other API architectural styles that are
worth reviewing because they also make sense in data science
and AI-related situations. Let’s take a look at some of the
definitions and attributes of these terms.

Representational State Transfer (REST)

REST was formally defined by Roy Fielding's doctoral
dissertation Architectural Styles and the Design of Network-
based Software Architectures. However, out in the wild, you
will find that not all of the REST-style APIs conform to the
formal definitions proposed.

A useful view of this architectural style is sometimes referred to
as Pragmatic REST or RESTful. The following is a mix of formal
definitions and some good practices:

Commonly occurs over the HTTP/1.1 protocol like the rest of
web traffic (HTTPS in practice).
API providers make resources available at individual
addresses (e.g., /customers, /products). Consumers make
requests to these resources using standard HTTP verbs.
Producers provide a response. This is the client-server
model.
The response is defined by the producer. The standard
structure of the response is the same for each consumer.
The REST response is typically in JSON or sometimes XML
formats, which are standard text-based data transfer
formats.
The interaction is stateless, so in a conversation of multiple
requests and responses, each request has to provide
information or context from previous responses. For
example, retrieving a list of players, and then requesting
additional detail about an individual player.
Increasingly, REST APIs are defined by an OpenAPI
Specification file, although a variety of other options have
been used over the years.
It is a best practice to use API versions to protect existing
consumers from changes.
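
To make these attributes concrete, here is a minimal sketch of a
RESTful interaction using the third-party requests library. The
player resource shown matches the API you will build later in this
chapter, though the full URL is an assumption for illustration.

import requests  # third-party HTTP client library

# Each request is stateless: the verb (GET) and URL carry all of the
# context the producer needs to build the response
response = requests.get("https://api.sportsworldcentral.com/v0/players/")

print(response.status_code)  # a standard HTTP status code, such as 200
print(response.json())       # the JSON response body, parsed into Python objects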

Graph Query Language (GraphQL)

GraphQL is both a query language for APIs and a query runtime
engine. GraphQL was developed by Facebook and open-sourced
in 2015, and as the Postman report shows it is used by a large
number of developers. Here are some attributes of GraphQL
APIs, with some comparisons to REST:

Commonly occurs over HTTP/1.1 (like RESTful APIs).
Communication uses the client-server model (like RESTful
APIs).
Communication is stateless (like RESTful APIs).
The response is usually in JSON (like RESTful APIs).
Instead of only using HTTP verbs, the consumer uses the
GraphQL query language.
The consumer can specify the contents of the response,
along with the query options.
The producer makes the API available at a single address
(e.g. /graphql) and the consumer passes queries to it via the
HTTP POST verb.
Versioning is not recommended, because the consumer
defines the contents they are requesting.

A big advantage of GraphQL over RESTful APIs is that fewer API
calls are needed for the consumer to get the information they
need, and the amount of network traffic required can be much
less.
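
To see why, here is a hedged sketch of a GraphQL request for
similar fantasy football data. The field names and the /graphql
address are hypothetical; the point to notice is that a single POST
returns exactly the fields the consumer asked for, and nothing more.

import requests

# A hypothetical query: the consumer names the exact fields it wants
query = """
{
  leagues {
    league_name
    teams {
      team_name
    }
  }
}
"""

# All GraphQL operations go to a single address via HTTP POST
response = requests.post(
    "https://api.sportsworldcentral.com/graphql",
    json={"query": query},
)
print(response.json())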

gRPC

Like GraphQL, gRPC was developed by a commercial company
(Google) and open-sourced in 2015. gRPC was developed for
very fast, efficient communication between microservices. gRPC
is usually used for a different set of problems than REST, and
has many differences:

Instead of sharing resources, gRPC provides Remote
Procedure Calls, which are more like traditional code
functions.
Instead of being limited to stateless request-response
patterns, gRPC can be used for continuous streaming.
Instead of using HTTP/1.1, gRPC uses HTTP/2.
Instead of returning data in a text-based format like JSON, it
uses protocol buffers, a format for serializing data that is
smaller and faster than JSON or XML.
Instead of using an OpenAPI specification file, it uses
Protocol Buffers as the specification in a .proto file.

gRPC is not a likely candidate for the APIs that you will be
creating in your portfolio project. However, it’s worth
mentioning in this discussion of API architectural styles related
to data science for one big reason: large language models
(LLMs). These machine learning models are the engines behind
generative AI services such as Bard and ChatGPT. These are
very big models that need all the performance they can get, and
are using gRPC in some cases to achieve this.

Your Choice: REST


For your company’s needs, REST is the appropriate choice. It is
by far the industry standard, and it is appropriate for providing
resource-based APIs for the user stories your users care about.
It is also widely supported by a broad range of technologies, so
your customers should have no problem using a RESTful API.

GraphQL is also a good choice for a user querying your football
data, and you should keep an eye on it in the future. One signal
that it is worth considering will be if you get feedback from API
users that GraphQL support is desired.

Before diving into the Python coding for your API, let’s discuss
how this book will use a couple of key terms. For this book, we
will consider a RESTful API to be a set of endpoints that are all
related to the same data source. From this perspective, your
SWC website will start with a single API: the SWC Fantasy
Football API.

An API version is a group of endpoints that are consistent for
some time so that consumers can count on them. A later
chapter will discuss versioning in detail.

An API endpoint (also referred to as an operation) is a
combination of two fundamental building blocks: an HTTP verb
and a URL path.
The overall structure of these terms is:

.
└── api
└── version
└── endpoint

Look at a few examples using a general Acme widget company
(Table 2-2). Assume that the company's APIs reside under the
https://api.acme.com subdomain.
Table 2-2. Example endpoints in the Acme API version 1

Endpoint description     HTTP verb   URL
Read product list        GET         api.acme.com/v1/products/
Read individual product  GET         api.acme.com/v1/products/{product_id}
Create new product       POST        api.acme.com/v1/products/
Update existing product  PUT         api.acme.com/v1/products/{product_id}
Delete existing product  DELETE      api.acme.com/v1/products/{product_id}
You can see that the URL is re-used for several of the endpoints.
But by combining the HTTP verb with the URL, a specific action
is taken when this resource is called. This HTTP verb plus URL
combination must be unique. For your portfolio project, you
will develop a set of endpoints to fulfill the user stories you
selected.

Continuing Your Portfolio Project


In Chapter 1, you established that Sports World Central (SWC)
has an existing fantasy football website. You simulated the player
and performance data from the existing application database,
which is maintained by the website.

In this chapter, you will begin building the APIs that provide
your consumers access to all the valuable fantasy football data
that they’ve asked for. This will provide them direct access to
data for their own data-centric work, as well as allow third-
party websites and apps to provide them with services.

Table 2-3 displays the new tools you will use in this chapter:

Table 2-3. New tools used in this chapter

Software name   Version   Purpose
FastAPI         0.103.2   Web framework to build the API
Pydantic        2.3.0     Validation library
Uvicorn         0.23.2    Web server to run the API

FastAPI

FastAPI is a Python web framework that is designed for
building APIs. A web framework is a set of libraries that simplify
common tasks for web applications. Other common web
frameworks include Express, Flask, Django, and Ruby on Rails.

FastAPI is built to be fast in both application performance and
developer productivity. Because FastAPI focuses on API
development, it simplifies several tasks related to API building
and publishing:

It handles HTTP traffic, requests/responses, and other
"plumbing" jobs with a few lines of code.
It automatically generates an OpenAPI specification file for
your API, which is useful for integrating with other
products.
It includes interactive documentation for your API.
It supports API versioning, security, and many other
capabilities.

As you will see as you work through the portfolio project, all of
these capabilities provide benefits to the users of your APIs.

Compared to the other frameworks I mentioned, FastAPI is a
relative newcomer, having been created by Sebastián Ramírez
Montaño in 2018. It is an open-source project, and the current
version as of writing is 0.103.xx. That version number is
important because according to semantic versioning, the 0.x
indicates that breaking changes may occur with the software.

For your project, you will use version 0.103.2 of FastAPI.

Pydantic

Pydantic is a data validation library, which will play a key part
in the APIs that you build. Because APIs are used to
communicate between systems, a critical piece of their
functionality is the validation of inputs and outputs. API
developers typically spend a significant amount of time writing
the code to check the data types and validate values that go into
and out of the API endpoints.

Pydantic is purpose-built to address this important task. Similar
to FastAPI, Pydantic is fast in two ways: it saves the developer
time that would be spent writing custom Python validation
code, and Pydantic validation code runs much faster because it
is implemented in the Rust programming language.

In addition to these benefits, objects defined in Pydantic
automatically support tooltips and hints in IDEs such as Visual
Studio Code. Pydantic generates JSON Schema representations
from Python code. JSON Schema is a standard that ensures
consistency in JSON data structures. This Pydantic feature
enables FastAPI to automatically generate the OpenAPI
specification, which is an industry-standard file describing APIs.

For your project, you will use Pydantic version 2.3.0.

Uvicorn

All web applications, including APIs, rely on a web server to
handle the various administrative tasks related to handling
requests and responses. You will be using the open-source
Uvicorn web server. Uvicorn is based on the ASGI specification,
which provides support for synchronous processes (which block
the process while waiting for a task to be performed) and
asynchronous processes (which can allow another process to
continue while they are waiting).

For your project, you will be using Uvicorn 0.23.2.

Updates to the Database


Based on the user stories you plan to fulfill with your first API
endpoints, you will be using some additional tables from the
SWC web application. As you did in Chapter 1, you will create
these tables in SQLite and populate them with sample data. You
can see in Figure 2-2 that you will be adding tables that store
league and team data to the existing player and performance
tables you have already created.

You will also add the team_player table, which handles the
many-to-many relationship between players and teams. This is
necessary because each fantasy team is made up of 12-16 NFL
players. Across the thousands of leagues that SWC hosts, each
NFL player will occur on many different teams.
In your development environment, create a directory for your
Chapter 2 code with the following commands:

mkdir chapter2_project
cd chapter2_project

You will now create several files for the FastAPI API code and
Pydantic validation schemas, continuing to follow the FastAPI
Tutorial databases template.

The directory listing when you complete will look like the
following:

.
└── chapter2_project
├── crud.py
├── database.py
├── fantasy_data.db
├── main.py
├── models.py
├── schemas.py
└── requirements.txt

Creating Additional Tables


To support the new functionality, you will add additional tables
to the database. In the case of the SWC website, these reflect
tables that already exist. You are creating them for your project.
Figure 2-2 displays the structure of the database when you
complete the additions.

Figure 2-2. Database table structure with additional tables.

First, copy the fantasy_data.db from chapter1_project to
chapter2_project. You want to leave the Chapter 1 files intact in
the original folder so you can review them if needed:

cp ../chapter1_project/fantasy_data.db .

Open your database with the SQLite command:

sqlite3 fantasy_data.db

Which produces:
SQLite version 3.41.2 2023-03-22 11:56:21
Enter ".help" for usage hints.
sqlite>

To create the league, team, and team_player tables, execute the
following commands at the SQLite command line:

/* Table creation script - Chapter 2 */

CREATE TABLE league (
    league_id INTEGER NOT NULL,
    league_name VARCHAR NOT NULL,
    scoring_type VARCHAR NOT NULL,
    PRIMARY KEY (league_id)
);

CREATE TABLE team (
    team_id INTEGER NOT NULL,
    team_name VARCHAR NOT NULL,
    league_id INTEGER,
    PRIMARY KEY (team_id),
    FOREIGN KEY(league_id) REFERENCES league (league_id)
);

CREATE TABLE team_player (
    team_id INTEGER NOT NULL,
    player_id INTEGER NOT NULL,
    PRIMARY KEY (team_id, player_id),
    FOREIGN KEY(team_id) REFERENCES team (team_id),
    FOREIGN KEY(player_id) REFERENCES player (player_id)
);

To verify that the tables were created, enter .tables resulting
in the following:

sqlite> .tables
league       performance  player       team         team_player

Loading More Example Data

To populate the new tables, you will execute DML scripts. For
the sake of space in this chapter, the number of league, team,
and team_player records is limited.

Add several fantasy football leagues by executing the following
command at the SQLite prompt:

/* Insert league records */
/* Example values; substitute any league names and scoring types */
INSERT INTO league (league_name, scoring_type) VALUES ('Pigskin Prodigies', 'PPR');
INSERT INTO league (league_name, scoring_type) VALUES ('Gridiron Gurus', 'Standard');
INSERT INTO league (league_name, scoring_type) VALUES ('Sunday Strategists', 'PPR');
INSERT INTO league (league_name, scoring_type) VALUES ('Armchair Admirals', 'Half-PPR');
INSERT INTO league (league_name, scoring_type) VALUES ('Monday Night Masters', 'Standard');

Add a few teams to two of the leagues by executing the
following command at the SQLite prompt:

/* Team records */
/* Example values; any team names spread across two leagues will work */
INSERT INTO team (league_id, team_name) VALUES (1, 'Rookie Mistakes');
INSERT INTO team (league_id, team_name) VALUES (1, 'Draft Day Divas');
INSERT INTO team (league_id, team_name) VALUES (1, 'Endzone Enforcers');
INSERT INTO team (league_id, team_name) VALUES (1, 'Blitz Brothers');
INSERT INTO team (league_id, team_name) VALUES (1, 'Hail Mary Heroes');
INSERT INTO team (league_id, team_name) VALUES (1, 'Fourth and Long');
INSERT INTO team (league_id, team_name) VALUES (2, 'Gridiron Giants');
INSERT INTO team (league_id, team_name) VALUES (2, 'Pocket Passers');
INSERT INTO team (league_id, team_name) VALUES (2, 'Red Zone Raiders');
INSERT INTO team (league_id, team_name) VALUES (2, 'Touchdown Town');
INSERT INTO team (league_id, team_name) VALUES (2, 'Waiver Wire Wizards');
INSERT INTO team (league_id, team_name) VALUES (2, 'Bye Week Blues');

The last DML is worth reviewing in a bit more detail. The
team_player is what is known as an association table, which
only exists to support the many-to-many relationship between
players and teams. You will associate several existing player
records that you added in Chapter 1 to teams that you added in
the previous step. The ID values are the primary keys of the
team and player tables:
/* team_player inserts */
/* Example values; use player_id values that exist in your player table */
/* Inserting records with team_id 1 */
INSERT INTO team_player (team_id, player_id) VALUES (1, 1);
INSERT INTO team_player (team_id, player_id) VALUES (1, 2);
INSERT INTO team_player (team_id, player_id) VALUES (1, 3);
INSERT INTO team_player (team_id, player_id) VALUES (1, 4);
INSERT INTO team_player (team_id, player_id) VALUES (1, 5);

/* Inserting records with team_id 2 */
INSERT INTO team_player (team_id, player_id) VALUES (2, 6);
INSERT INTO team_player (team_id, player_id) VALUES (2, 7);
INSERT INTO team_player (team_id, player_id) VALUES (2, 8);
INSERT INTO team_player (team_id, player_id) VALUES (2, 9);
INSERT INTO team_player (team_id, player_id) VALUES (2, 10);

You have now created three new tables in your database and
populated them with a small amount of example data. You are
now ready to create the API code using several new tools.

Continuing Your Python Development


In Chapter 1, you set up the SQLAlchemy configuration in the
database.py file and created models to represent your tables in
models.py. You also created main_cli.py that ran from the
command line to query your database and print results to the
console output.

In this chapter, you will build on that foundation to create the
SWC Fantasy Football API with endpoints to fulfill your selected
user stories. Table 2-4 lists the initial endpoints that you will
create. Assume that your APIs reside under the
https://api.sportsworldcentral.com subdomain:

Table 2-4. Endpoints for the SWC Fantasy Football API

Endpoint description    HTTP verb   URL
API health check        GET         /
Read player list        GET         /v0/players/
Read individual player  GET         /v0/players/{player_id}/
Read performance list   GET         /v0/performances/
Read league list        GET         /v0/leagues/
Read team list          GET         /v0/teams/

You are using version zero for these initial endpoints. This will
notify consumers of the API that the product is changing
rapidly and that they should be aware of potential breaking
changes, which are changes that cause functionality to stop
working and may require consumers to make changes in their
program code.

Installing New Tools In Your Environment

In Chapter 1, you created the requirements.txt file and specified
the version of SQLAlchemy to install using the pip3 package
manager in Python. You will now use this process to install
Pydantic, FastAPI, and Uvicorn.

Create a requirements.txt file with the following content:

#Chapter 2 pip requirements
SQLAlchemy>=1.4.0,<1.5.0
pydantic>=2.3.0,<2.4.0
fastapi>=0.103.0,<0.104.0
uvicorn>=0.23.0,<0.24.0

You may notice that instead of specifying a specific version as in
Chapter 1, this file defines a small range of values. This allows
the program to use minor updates to the libraries without
breaking any functionality. To install the new libraries, execute
the following command:

pip3 install -r requirements.txt

pip will download and install these libraries, along with others
that are required by the libraries themselves. You should see a
message that states that these libraries were successfully
installed, such as the following:

Successfully installed annotated-types-0.6.0 anyi

To verify this, type the following command:

pip3 show SQLAlchemy pydantic fastapi uvicorn

That should return:

Name: SQLAlchemy
Version: 1.4.49
Summary: Database Abstraction Library
Home-page: https://www.sqlalchemy.org
Author: Mike Bayer
Author-email: [email protected]
License: MIT
Location: /usr/local/python/3.10.13/lib/python3.1
Requires: greenlet
Required-by:
---
Name: pydantic
Version: 2.3.0
Summary: Data validation using Python type hints
Home-page:
Author:
Author-email: Samuel Colvin <[email protected]>, E
License:
Location: /usr/local/python/3.10.13/lib/python3.1
Requires: annotated-types, pydantic-core, typing-
Required-by: fastapi
---
Name: fastapi
Version: 0.103.2
Summary: FastAPI framework, high performance, eas
Home-page:
Author:
Author-email: Sebastián Ramírez <[email protected]
License:
Location: /usr/local/python/3.10.13/lib/python3.1
Requires: anyio, pydantic, starlette, typing-exte
Required-by:
---
Name: uvicorn
Version: 0.23.2
Summary: The lightning-fast ASGI server.
Home-page:
Author:
Author-email: Tom Christie <[email protected]>
License:
Location: /usr/local/python/3.10.13/lib/python3.1
Requires: click, h11, typing-extensions
Required-by:

Python Files For Your API

You will be creating or updating several Python files for Chapter
2. Table 2-5 summarizes the purpose of each file.

Table 2-5. Purpose of the Chapter 2 files

File name          Purpose
crud.py            Database query functions that access the SQLAlchemy models
database.py        Configures SQLAlchemy to use the SQLite3 database
fantasy_data.db    SQLite database containing source data
main.py            FastAPI file that defines routes and controls the API
models.py          Defines the Python classes that match the database tables
schemas.py         Defines the Pydantic classes that validate data sent to the API
requirements.txt   Identifies specific versions of libraries used for the project

You will not be changing the database.py file. Go ahead and
copy it from chapter1_project to chapter2_project at this time,
using the following command from the chapter2_project
directory:
cp ../chapter1_project/database.py .

Updates to Model code

To reference the new tables using SQLAlchemy, you will modify
the models.py file. As a reminder, models.py is used to define
the SQLAlchemy classes that represent the database tables in
your code.

Here is the updated models.py file:

"""SQLAlchemy models - Chapter 2"""


from sqlalchemy import Boolean, Column, ForeignKe
from sqlalchemy.orm import relationship

from database import Base

class Player(Base):
__tablename__ = "player"

player_id = Column(Integer, primary_key=True,


first_name = Column(String, nullable=False)
last_name = Column(String, nullable=False)

performances = relationship("Performance", ba
class Performance(Base):
__tablename__ = "performance"

performance_id = Column(Integer, primary_key=


week_number = Column(String, nullable=False)
fantasy_points = Column(Float, nullable=False

player_id = Column(Integer, ForeignKey("playe

player = relationship("Player", back_populate

class League(Base):
__tablename__ = "league"

league_id = Column(Integer, primary_key=True,


league_name = Column(String, nullable=False)

scoring_type = Column(String, nullable=False)

teams = relationship("Team", back_populates="

class Team(Base):
__tablename__ = "team"

team_id = Column(Integer, primary_key=True, i


team name = Column(String, nullable=False)
team_name Column(String, nullable False)

league_id = Column(Integer, ForeignKey("leagu

league = relationship("League", back_populate

players = relationship("Player", secondary="t

class TeamPlayer(Base):
__tablename__ = "team_player"

team_id = Column(Integer, ForeignKey("team.te


player_id = Column(Integer, ForeignKey("playe

# Many-to-many relationship between Player and Te


Player.teams = relationship("Team", secondary="te

Let’s take a look at the updated models.py file. The Player and
Performance classes have not changed. The first new code is
the definition of the League class.

class League(Base):
    __tablename__ = "league"

    league_id = Column(Integer, primary_key=True, index=True)
    league_name = Column(String, nullable=False)
    scoring_type = Column(String, nullable=False)

    teams = relationship("Team", back_populates="league")

League is going to be the top-most parent class in our code, as
was reflected in Figure 2-2. The teams relationship will be used
to enable League.teams in this class.

Look at the next block of code, which defines the Team class.

class Team(Base):
    __tablename__ = "team"

    team_id = Column(Integer, primary_key=True, index=True)
    team_name = Column(String, nullable=False)

    league_id = Column(Integer, ForeignKey("league.league_id"))

    league = relationship("League", back_populates="teams")

    players = relationship("Player", secondary="team_player")

The beginning of the class is similar to those we have defined
previously: it defines the elements in the class and the child-to-
parent relationship with the League class above it.

The next statement is new.

    players = relationship("Player", secondary="team_player")

I referred to the team_player table in your database as an
association table, meaning it is used only to connect multiple
football teams to multiple players. A key point is that your
ultimate goal is not to access team_player data; you want to
access player records from the team record.

This relationship makes that happen. The relationship to
the Player class is defined with secondary="team_player",
reflecting the presence of the association table in between.
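
Here is a minimal sketch of what this relationship buys you,
assuming a session created with the SessionLocal class from
database.py. Player names are read directly from a team, and the
team_player table never appears in your code:

from database import SessionLocal
import models

db = SessionLocal()

# SQLAlchemy joins through the team_player association table for you
team = db.query(models.Team).first()
for player in team.players:
    print(player.first_name, player.last_name)

db.close()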

The remaining code completes the rest of the setup of this
relationship.

class TeamPlayer(Base):
    __tablename__ = "team_player"

    team_id = Column(Integer, ForeignKey("team.team_id"), primary_key=True)
    player_id = Column(Integer, ForeignKey("player.player_id"), primary_key=True)

The TeamPlayer class is created without any relationships
because those are defined on the Team and Player classes.
The final statement adds a new relationship to the Player
class, which was defined already. This is a matching
relationship using the secondary statement, as you did on the
Team class. You could have added this to the Player
definition above, but including it here makes it clear that this is
a new addition for Chapter 2.

# Many-to-many relationship between Player and Team
Player.teams = relationship("Team", secondary="team_player")

You have now defined all of the SQLAlchemy models needed for
the new database tables. Next, you will define the SQLAlchemy
query functions for them.

SQLAlchemy query functions

The database functions will be in a file named crud.py. This
strange-sounding name stands for Create, Read, Update, Delete.
Here are the full contents of that file:

"""SQLAlchemy Query Functions - Chapter 2"""


from sqlalchemy.orm import Session
from sqlalchemy.orm import joinedload

import models
def get_player(db: Session, player_id: int):
return db.query(models.Player).filter(models

def get_players(db: Session, skip: int = 0, limit


return db.query(models.Player).offset(skip).l

def get_performances(db: Session, skip: int = 0,


return db.query(models.Performance
).offset(skip).limit(limit).a

def get_leagues(db: Session, skip: int = 0, limit


return db.query(models.League
).options(joinedload(models.L
).offset(skip).limi

def get_teams(db: Session, skip: int = 0, limit:


return db.query(models.Team
).offset(skip).limit(limit).a

Let’s look at the SQLAlchemy query functions.

import models
It's worth noticing the import models statement to remember
that these functions are performing actions on the SQLAlchemy
models you defined in the models.py file. Instead of issuing SQL
commands, you will be executing methods of your model
classes and SQLAlchemy will create prepared SQL statements to
retrieve the data.

The functionality inside these functions has been moved out of
the main_cli.py file into a standalone query file to keep
maintenance cleaner. This file does not contain a reference to
the database session, so the first function parameter of
get_players is db: Session. This function will be called by
the FastAPI program in main.py, which will pass a reference to
the database session.

The next statement defines the first query.

def get_player(db: Session, player_id: int):
    return db.query(models.Player).filter(models.Player.player_id == player_id).first()

By using filter(models.Player.player_id ==
player_id).first(), this function receives a specific
Player.player_id value and returns the first matching
instance. Because you have defined player_id as a primary
key in the models.py file and the SQLite database, this query will
return a single result.

The next function adds several new options to the .query()
statement:

def get_players(db: Session, skip: int = 0, limit: int = 100):
    return db.query(models.Player).offset(skip).limit(limit).all()

The offset(skip).limit(limit) combines two statements.
Taken together, offset and limit enable pagination, which
allows the user to specify a set of records in chunks rather than
a full list. In this case, skip instructs the query to skip a
number of records from the beginning of the query results, and
limit instructs the query to return only a certain number of
records.
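
Under the hood, SQLAlchemy translates these method calls into
standard SQL. A call such as get_players(db, skip=10, limit=50)
produces a query roughly equivalent to the following (the generated
SQL text may differ slightly between SQLAlchemy versions):

SELECT player.player_id, player.first_name, player.last_name
FROM player
LIMIT 50 OFFSET 10;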

The remaining lookup functions use similar options to those
you already used, plus one new item.

def get_performances(db: Session, skip: int = 0, limit: int = 100):
    return db.query(models.Performance).offset(skip).limit(limit).all()

def get_leagues(db: Session, skip: int = 0, limit: int = 100):
    return db.query(models.League).options(joinedload(models.League.teams)).offset(skip).limit(limit).all()

def get_teams(db: Session, skip: int = 0, limit: int = 100):
    return db.query(models.Team).offset(skip).limit(limit).all()

The new item is the
.options(joinedload(models.League.teams)) statement
in the get_leagues function. This is a type of eager loading,
which causes SQLAlchemy to retrieve the joined Team data
when it retrieves the League data.
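
The difference matters for performance. Without eager loading,
SQLAlchemy would lazily run one additional SELECT per league the
first time each league.teams is accessed, the classic N+1 query
problem. A sketch, assuming an open session named db:

# With joinedload, one round trip retrieves the leagues and their
# teams together, so this loop triggers no additional queries
leagues = crud.get_leagues(db)
for league in leagues:
    print(league.league_name, len(league.teams))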

Since all of the functions in crud.py are reading (querying)
data, you have only implemented the "R" in CRUD. That is
appropriate because all of your user stories require read-only
functionality. If you were developing an API that allowed
creating, updating, or deleting records, this file could be
extended with additional functions.

You have finished the SQLAlchemy portion of your code. Next,
you will move on to the Pydantic classes.

Pydantic class definition


The Pydantic classes define the structure of the data that the
consumer will receive in their API responses. This uses a
software design pattern called data transfer objects (DTO), in
which you define a format for transferring data between a
producer and consumer, without the consumer needing to
know the backend format. In your portfolio project, the
backend and frontend classes won’t look significantly different,
but using DTOs allows complete flexibility on this point.

Although you define the classes using Python code and your
code interacts with them as fully formed Python objects, the
consumer will receive them in an HTTP response as a JSON
object. Pydantic automatically performs the serialization
process, which is converting the Python objects into JSON, the
text format that is used to transmit them to the consumer. This
means you do not need to manage serialization in your
Python code, which simplifies your program. It is worth
mentioning again that Pydantic 2 is written in Rust, which
makes this process much faster than similar code you could
write in Python.

In addition to performing this serialization task, Pydantic also
defines the response format in the openapi.json file. This is a
standard contract that uses OpenAPI and JSON Schema. This
will provide multiple benefits for the consumer, as we will
explore in Chapter 3.

You will define your Pydantic classes in a file named
schemas.py. This naming choice will help separate the purpose
of the frontend Pydantic classes and backend SQLAlchemy
classes. Both SQLAlchemy and Pydantic documentation refer to
their classes as models, which may be confusing at times.

Here are the full contents of the schemas.py file.

"""Database configuration - Chapter 2"""


from pydantic import BaseModel
from typing import List

class Performance(BaseModel):
performance_id : int
player_id : int
week_number : str
fantasy_points : float

class Config:
from_attributes = True

class PlayerBase(BaseModel):
player_id : int
first_name : str
last_name : str

class Config:
from_attributes = True

class Player(PlayerBase):
performances: List[Performance] = []

class Config:
from_attributes = True

class TeamBase(BaseModel):
league_id : int
team_id : int
team_name : str

class Config:
from_attributes = True

class Team(TeamBase):
players: List[PlayerBase] = []

class Config:
from_attributes = True

class League(BaseModel):
league_id : int
league_name : str
scoring_type : str
teams: List[TeamBase] = []

class Config:
from_attributes = True

Let’s dive into the Pydantic schemas to see how they work. The
first class is the simplest, the Performance class.

class Performance(BaseModel):
performance_id : int
player_id : int
week_number : str
fantasy_points : float

class Config:
from_attributes = True

This class represents the performance data that the consumer
will receive. From their perspective, a performance is what
happens when a player plays in a single week. The
fantasy_points is specific to the SWC platform.
Performance is a subclass of the Pydantic BaseModel class,
which adds all the bells and whistles that Pydantic provides.
Notice that the data types of individual class elements are
assigned with a colon, instead of an equals sign, as was used in
the SQLAlchemy classes. (This will trip you up if you're not
careful.)
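
As a quick illustration of those bells and whistles, here is a
hedged sketch of Pydantic validation in action, using Pydantic 2
method names. A numeric string is coerced to the declared type,
while truly invalid input raises an error:

from pydantic import ValidationError
from schemas import Performance

# Valid input: the string "100" is coerced to the int 100
p = Performance(performance_id=1, player_id="100",
                week_number="3", fantasy_points=18.5)
print(p.player_id)  # 100, as an int

# Invalid input raises a detailed error instead of passing bad data along
try:
    Performance(performance_id="not-a-number", player_id=1,
                week_number="3", fantasy_points=0.0)
except ValidationError as err:
    print(err)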

Each Pydantic class has a nested class named Config, which
you can use to set special behaviors for your classes. The option
you set in your program is from_attributes = True. This
allows Pydantic to populate the class by reading attributes from
objects such as your SQLAlchemy models, including their
relationship data, instead of requiring plain dictionaries. (The
from_attributes option was referred to as orm_mode in
earlier versions of Pydantic, and you may see it used in some
examples you come across.)
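
Here is a minimal sketch of what from_attributes enables, using
Pydantic 2's model_validate method. The SimpleNamespace object
stands in for a SQLAlchemy Performance row; Pydantic reads its
attributes directly:

from types import SimpleNamespace
from schemas import Performance

# An arbitrary object with matching attributes, standing in for a
# SQLAlchemy model instance returned from a query
row = SimpleNamespace(performance_id=1, player_id=42,
                      week_number="3", fantasy_points=12.7)

# from_attributes=True lets Pydantic read attributes instead of dict keys
p = Performance.model_validate(row)
print(p.model_dump_json())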

The definition of the player-related information is a bit more
complicated:

class PlayerBase(BaseModel):
player_id : int
first_name : str
last_name : str

class Config:
from_attributes = True

class Player(PlayerBase):
performances: List[Performance] = []

class Config:
from_attributes = True

You are defining the representation of player data in two steps.
In the first step, you define class PlayerBase(BaseModel).
The PlayerBase is a subclass of the Pydantic BaseModel, like
the Performance class defined previously. It contains the
player-related data, but it does not include the link to the
Performance object. In some cases, the consumer will receive
a list of football players without a list of all their game data. You
will use the PlayerBase in those cases.

The class Player(PlayerBase): statement defines the
Player class as a sub-class of PlayerBase. The
performances: List[Performance] = [] statement
defines the link to the performances that we excluded from the
PlayerBase.

Taken together, these two class definitions make a full Player
class that includes all the player data, including the link to
performances. But they also allow the more limited
PlayerBase to be used in some situations.
If you are comparing these Pydantic classes to the SQLAlchemy
models, you will notice that the player classes include no
reference to the teams they reside on. This is intentional: in the
API design, the users will not be requesting a list of fantasy
teams that an NFL player resides on.

The next two classes are used to define the team data.

class TeamBase(BaseModel):
league_id : int
team_id : int
team_name : str

class Config:
from_attributes = True

class Team(TeamBase):
players: List[PlayerBase] = []

class Config:
from_attributes = True

Like the PlayerBase class, the TeamBase class excludes child
records. In this case, a TeamBase object excludes the list of
players.
The Team shows the first use of the limited PlayerBase class.
The full Team class contains an item named players . Notice
that it uses the limited PlayerBase instead of the full Player
class. When the Team object is returned to a consumer, the
players will not include all of the performance data for that
player. (They will have to make an additional call for that
information if they need to use it.)

The last class definition is for the league.

class League(BaseModel):
league_id : int
league_name : str
scoring_type : str
teams: List[TeamBase] = []

class Config:
from_attributes = True

The League class looks simple, but don’t miss one detail:
League.teams is a List of TeamBase objects, which do not
contain a list of players. This means that the consumer
receiving a League does not recursively receive a list of all
players on each team.
At this point, you have designed the DTOs that will be used to
send data to the API consumer, which are defined in Pydantic.
You are ready to bring FastAPI into the mix.

FastAPI Python File

Now that all of the pieces are in place in the other Python files,
you can tie them together with the FastAPI functionality in
main.py. As you will see in the chapters in Part 1, an impressive
amount of functionality is provided for your API in just a few
lines of code.

Here are the full contents of the main.py file.

"""FastAPI program - Chapter 2"""


from fastapi import Depends, FastAPI, HTTPExcepti
from sqlalchemy.orm import Session

import crud, models, schemas


from database import SessionLocal, engine

app = FastAPI()

# Dependency
def get_db():
db = SessionLocal()
try:
y
yield db
finally:
db.close()

@app.get("/")
async def root():
return {"message": "API health check successf

@app.get("/v0/players/", response_model=list[sche
def read_players(skip: int = 0, limit: int = 100,
players = crud.get_players(db, skip=skip, lim
return players

@app.get("/v0/players/{player_id}", response_mode
def read_player(player_id: int, db: Session = Dep

player = crud.get_player(db, player_id=player


if player is None:
raise HTTPException(status_code=404, deta
return player

@app.get("/v0/performances/", response_model=list
def read_performances(skip: int = 0, limit: int =
performances = crud.get_performances(db, skip
return performances

@app.get("/v0/leagues/", response_model=list[sche
pp g ( g p [
def read_leagues(skip: int = 0, limit: int = 100,
leagues = crud.get_leagues(db, skip=skip, lim
return leagues

@app.get("/v0/teams/", response_model=list[schema
def read_teams(skip: int = 0, limit: int = 100, d
teams = crud.get_teams(db, skip=skip, limit=l
return teams

Let’s walk through the code in your FastAPI file.

app = FastAPI()

# Dependency
def get_db():
db = SessionLocal()
try:
yield db
finally:
db.close()

In FastAPI, the primary class you will work with is the FastAPI
class. This class by default includes the functionality to handle
much of the work that an API needs to perform, without
requiring you to specify every detail. You create an instance
and name it app. This will be used in the rest of main.py and
when you execute your API from the command line using
Uvicorn. You will reference main:app, which refers to the
app object in the main.py file.

You define the get_db() function to create a database session
and close the session when you are done with it. This function
is used as a dependency in the API routes within main.py.

@app.get("/")
async def root():
return {"message": "API health check successf

The command is @app.get("/") , which is a decorator. A


decorator is a statement that is added above a function
definition, to give special attributes to it. In this case, the
decorator defines that the async def root() function
definition will be a FastAPI request handler.

This function will be called when a consumer accesses the root
URL of the API, which is equivalent to /. It will serve as a
health check for the entire API by returning a simple message
to the consumer.

The next statement defines the first endpoint that we have
created for our user stories.

@app.get("/v0/players/", response_model=list[schemas.Player])
def read_players(skip: int = 0, limit: int = 100, db: Session = Depends(get_db)):
    players = crud.get_players(db, skip=skip, limit=limit)
    return players

Remember that Table 2-4 defined the endpoints that we
planned to create as a combination of HTTP verb and URL. With
FastAPI, these endpoints (also called routes) are defined with the
decorators above each function.

The following explains how the HTTP verb and URL are
specified in the decorator:

HTTP verb: All of these endpoints use the GET verb, which
is defined by the @app.get() decorator function.
URL: The first parameter of the get() function is the
relative URL. For this first endpoint, the URL is /v0/players/.

The second parameter of the decorator is
response_model=list[schemas.Player]. This informs
FastAPI that the data returned from this endpoint will be a list
of Pydantic Player objects, as defined in the schemas.py file.
This information will be included in the OpenAPI specification
that FastAPI automatically creates for this API. Consumers can
count on the returned data being valid according to this
definition.

Let's look at the function signature that you decorated:
read_players(skip: int = 0, limit: int = 100, db:
Session = Depends(get_db)). Several things are going on in
this function. Starting at the end, the db object is a session that
is created by the get_db() function defined at the top of this
file. By wrapping the function in Depends(), FastAPI handles
the call for you and gives the Session to your function.

The function also includes two additional parameters: skip:
int = 0, limit: int = 100. These are named parameters
that have a defined datatype and a default value. FastAPI will
automatically include these parameters as query parameters in
the API definition. Query parameters are included in the URL
path with a question mark in front and an ampersand
between. For instance, to call this query method, the API
consumer could use this request:

HTTP verb: GET
URL: api.sportsworldcentral.com/v0/players/?skip=10&limit=50
Within the body of the read_players() function, FastAPI is
calling the get_players() function that you defined in
crud.py. It is performing a database query. The players object
receives the result of that function call. FastAPI validates that
this object matches the definition list[schemas.Player]. If
it does, Pydantic serializes the Python objects into a JSON
text string and sends the response to the consumer.

The next endpoint adds two additional FastAPI features.

@app.get("/v0/players/{player_id}", response_mode
def read_player(player_id: int, db: Session = Dep
player = crud.get_player(db, player_id=player
if player is None:
raise HTTPException(status_code=404, deta
return player

First, the URL path includes {player_id}. This is a path
parameter, which is an API request parameter that is included
in the URL path instead of separated by question marks and
ampersands like the query parameters. Here is an example of
how the API consumer might call this endpoint:

HTTP verb: GET
URL: api.sportsworldcentral.com/v0/players/12345

The next feature to notice is that the function checks to see if
any records were returned from the helper function, and if not
it raises an HTTPException, which is a standard method that
web applications use to communicate status. It is good RESTful
API design to use the standard HTTP status codes to
communicate with consumers. This makes the operation more
predictable and reliable. This endpoint returns an HTTP status
code of 404, which is the not found code. It adds the additional
message that the item not found was the player being searched
for.
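
For example, if a consumer requests a player_id that does not
exist, FastAPI converts the raised HTTPException into a JSON
response with a 404 status code and a body like the following
(the detail text matches whatever message your code passes):

{"detail": "Player not found"}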

The final three endpoints do not use any new features. But
together they complete all of the user stories that we have
included for our first API.

@app.get("/v0/performances/", response_model=list
def read_performances(skip: int = 0, limit: int =
performances = crud.get_performances(db, skip
return performances

@app.get("/v0/leagues/", response_model=list[sche
def read_leagues(skip: int = 0, limit: int = 100,
leagues = crud.get_leagues(db, skip=skip, lim
return leagues
return leagues

@app.get("/v0/teams/", response_model=list[schema
def read_teams(skip: int = 0, limit: int = 100, d
teams = crud.get_teams(db, skip=skip, limit=l
return teams

It is worth noting that in addition to the basic options of FastAPI
and Pydantic that you are using, there are many additional
validations and other features available. As you can see, these
libraries accomplish a lot with only a few lines of code from
you.

Launching Your API


This is the moment you have been waiting for: running your
API. Instead of executing main.py directly using Python, as you
did in Chapter 1 for the main_cli.py, you are going to use the
Uvicorn web server to execute your program, passing the name
of the Python file without the file extension, and the FastAPI
app object name.

Enter the following command from the command line:
uvicorn main:app, which should return something similar to
this:

INFO: Started server process [12345]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8000

If you are running your code on GitHub Codespaces, an
additional popup message may offer to launch a special local
browser session for you, using a different URL path. If you are
running locally, copy the http://127.0.0.1:8000 URL into your
web browser.

In either case, if your API is working, you should see the health
check message in your web browser:

{"message":"API health check successful"}

This confirms your API is running, which is a great start.

The real test is when you call the first endpoint that looks up
data. Give that a try by copying the following URL (or
equivalent if on GitHub Codespaces) in your browser bar:
http://127.0.0.1:8000/v0/players/?skip=0&limit=3. If everything is
working correctly, you should see the following data in your
browser:
[{"player_id":1,"first_name":"Justin","last_name"

TIP

This chapter covered a lot, so it’s possible that an error occurred or you are not
getting a successful result. Don’t worry, this happens to all of us. Here are a few
suggestions for how to troubleshoot any problems you are running into:

Verify that you installed all the required software.
Verify that you're running the command in the Chapter 2 Project folder.
Take a minute to verify the path in the URL bar of your browser. Minor things
matter such as slashes and question marks.
Look at the command line to see any errors that are being thrown by Uvicorn.
Think about the path that the request is taking, and walk through a single
function from the front end to the back end: main.py → crud.py → models.py
→ crud.py → main.py. And of course, the database.py and schemas.py are being
referenced as well.
To verify your environment with FastAPI and Uvicorn, try creating a very
simple API, such as one from the Official FastAPI Tutorial.

If this first API endpoint is working for you, try out some more
of the URLs from Table 2-4 to verify that you have completed all
of your user stories. Congratulations, you are an API developer!
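
If you prefer to test from code rather than the browser, here is a
hedged smoke-test sketch using the third-party requests library
against the default local Uvicorn address:

import requests

BASE_URL = "http://127.0.0.1:8000"

# Health check endpoint
print(requests.get(f"{BASE_URL}/").json())

# First three players, using the skip and limit query parameters
players = requests.get(f"{BASE_URL}/v0/players/",
                       params={"skip": 0, "limit": 3}).json()
for player in players:
    print(player["player_id"], player["first_name"], player["last_name"])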
EXTENDING YOUR PORTFOLIO PROJECT

To use these techniques on a portfolio project that is uniquely
yours, here are some suggestions:

Identify an additional business or market (or continue
using the one you identified at the end of Chapter 1).
Identify secondary research methods you can use to learn
about potential API consumers. Document the user types,
primary tasks, and pain points of the users.
Alternative: research and use the tools from another formal
technique such as Design Thinking, Lean Startup, or APIOps
Cycles to document your users.
Select a primary user, and extend your database to add
necessary tables to support the tasks they want to perform
or resolve their pain points.
Create a new FastAPI project using the database and
SQLAlchemy models for your new API.
ADDITIONAL RESOURCES

To see examples of real-world fantasy football APIs, you can
view the Sleeper API and MyFantasyLeague.com 2023 API.

To view a discussion board related to fantasy football or soccer
coding and APIs, take a look at r/fantasyfootballcoding
(reddit.com), r/fantasyfootball (reddit.com), or r/FantasyPL
(reddit.com).

For more details about design thinking and human-centered
design, read the IDEO Field Guide to Human-Centered Design.

A quick discussion of one of the pitfalls of innovation is
Building a Product Nobody Wants, by Eric Ries, author of The
Lean Startup.

To learn about the technical architecture of APIs, I recommend
Mastering API Architecture: Design, Operate, and Evolve API-
Based Systems, by James Gough, Daniel Bryant, and Matthew
Auburn (O'Reilly, 2022).

For some tips about RESTful API design, read Ten REST
Commandments by Steve McDougall.

To explore FastAPI beyond this book, the official FastAPI
tutorial and FastAPI reference documentation are both very
useful.

To learn the ins and outs of building a project with FastAPI, I
recommend FastAPI: Modern Python Web Development by Bill
Lubanovic (O'Reilly, 2023).

The official Pydantic 2.3 documentation provides information
for the specific version of Pydantic used in this chapter.

The official Uvicorn documentation has much more
information about the capabilities of this software.

Summary
In this chapter, you built on the foundation of the database
created in Chapter 1. Here is what you have accomplished so
far:

You used a user-centric approach to research potential API
consumers and how you could serve them.
You identified several potential consumers and found two
that you could help now: data science users and advice
websites.
You focused on user desirability, technical feasibility, and
economic viability to select three user stories to begin your
API development.
You defined the API endpoints needed to complete the user
stories.
You included more tables in your database, which now
includes players, performances, leagues, and teams.
You created more SQLAlchemy models and CRUD helper
functions to handle this new data.
You installed FastAPI, Pydantic, and Uvicorn, along with
several supporting libraries.
You defined Pydantic schemas to represent the data that
your API consumers wanted to receive.
You created a FastAPI program to process consumer
requests and return data responses, tying everything
together.

In Chapter 3, you will look at how to document your API using
the built-in capabilities of FastAPI and build a developer portal.
About the Author
Ryan Day is a data scientist in the financial services industry
who wears a few additional hats including cloud architect and
API strategist. He previously led the digital services division for
a Federal agency, where he helped developers learn cloud
development and adopt API standards. Ryan enjoys learning
new technologies by building real-world products and then
teaching others.

Ryan is an experienced open-source developer and participates
in the FastAPI project. He has been playing fantasy football for
longer than he has been working professionally and enjoys
building data science products for his fantasy teams. He lives
with his wife in the Kansas City metropolitan area.