OpenID Connect in Action v13

OpenID Connect is a standard that allows client applications to securely identify users by leveraging an identity provider. It builds upon OAuth 2.0 by adding an authentication layer. Some key points:

- OpenID Connect defines communication between a client application (like a website) and an identity provider (like Google or Facebook) to authenticate users.
- It enables single sign-on, allowing users to log into multiple applications using one identity provider account.
- Popular use cases include social logins like "Log in with Google" as well as securing access to APIs and microservices.
- The book will cover integrating OpenID Connect into different types of applications and best practices for security and implementation.


MEAP Edition

Manning Early Access Program


OpenID Connect in Action
Version 13

Copyright 2023 Manning Publications

For more information on this and other Manning titles go to


manning.com

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>


welcome
Thank you for purchasing the MEAP for OpenID Connect in Action. Although most people don't
realize it (developers excepted), every day people all over the world use OpenID Connect to log in.
Developers use OpenID Connect to secure access to all types of applications. In this book you
will learn everything you need to know about OpenID Connect so you can implement it.

Over time, I’ve seen more and more applications adding support for OpenID Connect, which
is easily overtaking its most successful predecessor, SAML 2.0. Since 2016, most applications
developed globally have used OpenID Connect for login.

You may have heard of OpenID, which was OpenID Connect’s predecessor. When I joined
WSO2 in 2007, my first task was to implement OpenID support for the open-source Identity
Server, which was called Identity Solution in those days. A few years later, in 2009, we
completed a large-scale deployment of Identity Server as an OpenID provider in Saudi Arabia
with a user base of 4 million. That was my first hands-on experience with OpenID. Later, I also
implemented OAuth 1.0 support, and some part of the OAuth 2.0 support in the Identity Server
product. When OpenID Connect became mainstream, we added OpenID Connect support to the
Identity Server.

In this book, I focus on three types of applications: single-page applications, native mobile
applications, and server-side web applications. I picked these types of applications because they
address almost all the common OpenID Connect use cases we see in practice today.

In addition to explaining OpenID Connect internals in detail while taking you through how
different applications integrate with OpenID Connect, the book also covers security pitfalls,
along with the best practices and guidelines for avoiding them when integrating client
applications with OpenID Connect for login.

The sample applications in the book use Java and React. Although having some working
knowledge of those technologies will be helpful, it’s not a must. If you are comfortable with any
programming language, know the basics of JavaScript, and have a solid understanding of how
the web works in general, you are all set.

I hope you’ll find the book useful. Please post any questions, comments, or suggestions
you have about the book in the liveBook discussion forum. Your feedback is essential in
developing the best book possible.

—Prabath Siriwardena



brief contents
1 The OpenID Connect landscape
2 The cornerstone of OpenID Connect
3 Securing access to a single-page application
4 The building blocks of an ID token
5 Requesting and returning claims
6 Securing access to a server-side web application
7 Logging out
8 Claim-based access control with Open Policy Agent (OPA)
9 Securing access to a native mobile application with OpenID Connect
10 Mitigating common threats and vulnerabilities
APPENDIX
A React fundamentals



1 The OpenID Connect landscape

This chapter covers

• What OpenID Connect is, and why it matters
• How OpenID Connect differs from OpenID, SAML 2.0 Web SSO, and OAuth 2.0
• What identity federation and single sign-on (SSO) are
• The use cases of OpenID Connect
Even if you don’t have hands-on developer experience in integrating OpenID Connect with
your web or mobile applications for login, the chances are very high that at some point in
your life you have used OpenID Connect to log into some web or mobile application. If you
have ever used Log in with Apple ID or Log in with Google, you have used OpenID
Connect underneath. 1
In simple terms, OpenID Connect is a standard developed by the OpenID Foundation that
defines how a client application communicates with an identity provider to identify a user. 2 A
client application can be a single-page application, a native mobile application, a server-side
web application, and so on. In the rest of the book we discuss how to integrate OpenID
Connect with all these types of applications.
An identity provider is a generic term for an entity that manages user attributes and
credentials. Google, Facebook, and Microsoft are all identity providers. Users can log
into a client application via an identity provider by following some kind of protocol, which
defines the message flow between those two entities. This protocol can be a standard
protocol that is widely accepted, or a proprietary protocol that is specific to a vendor. An

1 Coursera (https://www.coursera.org), for example, the famous online learning platform, supports login with both Apple ID and Google ID. Hotels.com,
eBay, Bloomberg, Reddit, Meetup, UPS, Rakuten, and many more also support login with both Apple ID and Google ID.
2 OpenID Foundation has defined the standard for OpenID Connect and the core specification is available at https://openid.net/specs/openid-connect-
core-1_0.html.


OpenID provider is an identity provider that supports the OpenID Connect standard as the
protocol to communicate with client applications. 3
OpenID Connect is not the only standard out there, or the only option you have, for
integrating your web and mobile applications with an identity provider for login. But by far,
now and for the foreseeable future, OpenID Connect is the most widely adopted login
technology among greenfield applications. Perhaps that’s why you’ve invested in a book on
OpenID Connect. You’ve made the right decision! Understanding how OpenID Connect works,
integrating OpenID Connect with your web and mobile applications, and understanding the
role of OpenID Connect in securing your APIs and microservices are key skills every developer
should possess.
In this book you will learn everything you need to know about OpenID Connect, and we
don’t expect you to know anything beyond the fact that OpenID Connect is used for login.
The sample applications in the book use Java, React, and React Native. So, having some
working knowledge of those technologies will be helpful, but it is not a must. If you are
comfortable with any programming language, know the basics of JavaScript, and have a
solid understanding of how the web works in general, you are all set to get started. Appendixes
A and B of the book help you get a jumpstart with React and React Native, respectively.

1.1 What is OpenID Connect?


OpenID Connect is a standard developed by the OpenID Foundation on top of the OAuth 2.0
specification (https://tools.ietf.org/html/rfc6749). OAuth 2.0 is an authorization framework
for access delegation. You can also call OpenID Connect an identity layer built on top of
OAuth 2.0. 4 To understand OpenID Connect in detail, having a good understanding of OAuth
2.0 is a must. Chapter 2 of the book covers the OAuth 2.0 essentials that you need to know
to follow this book. Also, if you’d like to delve deeper into OAuth 2.0, we recommend the
books OAuth 2.0 in Action by Justin Richer and Antonio Sanso (Manning Publications, 2017)
and Advanced API Security: OAuth 2.0 and Beyond by Prabath Siriwardena (Apress, 2019).
In a typical OpenID Connect login flow, there are two main parties involved in addition to
the end user: the OpenID provider and the client application. The OpenID provider
manages user attributes and credentials, and lets users log in to client applications
following the OpenID Connect protocol. These client applications and the OpenID provider can
belong to the same organization or to different organizations. When a client application
belongs to a different organization than the OpenID provider, we call that client
application a third-party client application. When you log in to eBay with your Apple ID,
for example, eBay is the client application and Apple is the OpenID provider. They belong to
two different organizations, or you could even say two different domains. You can call eBay
a third-party client application from the point of view of Apple. In the same way, you can
call Apple a third-party OpenID provider from the point of view of eBay.
The client applications rely on an OpenID provider to authenticate users. In this book we
mostly focus on integrating client applications securely with an OpenID provider. We won’t
teach you how to write an OpenID provider, but will help you understand the role of an

3 In general, we call an identity provider that supports the OpenID Connect standard an OpenID provider – not an OpenID Connect provider.
4 OAuth 2.0 is about authorization, while OpenID Connect is about authentication. OpenID Connect uses the OAuth 2.0 protocol to transport attributes
related to a user’s identity from an identity provider to a client application.


OpenID provider and what you need to worry about when picking an OpenID provider for
your project. We’ll use popular open-source OpenID providers in our samples, and since the
way you set up those tools can vary from time to time with new releases, we’ll keep the setup
steps for those OpenID providers outside the book, in our GitHub repository
(https://github.com/openidconnect-in-action/samples).
In simple terms, OpenID Connect is a standard that defines how a client application
communicates with an OpenID provider to identify a user. How exactly the communication
between a client application and the OpenID provider takes place is defined in the OpenID
Connect Core specification (https://openid.net/specs/openid-connect-core-1_0.html)
developed by the OpenID Foundation. There are a few more specifications developed by the
OpenID Foundation to address other use cases around OpenID Connect. As we delve
deeper into OpenID Connect, we’ll introduce you to those specifications. Also, in section 1.8 we
discuss OpenID Connect use cases.

Figure 1.1 The OpenID Connect specification defines how a client application can authenticate a user by
talking to an OpenID provider. It further defines exactly how messages are passed between the client
application and the OpenID provider.

Figure 1.1 shows a typical OpenID Connect flow at a very high level. From the user’s point of
view, it’s quite similar to the OAuth 2.0 authorization code grant type, which you’ll learn about in
chapter 2. You’ll learn in detail what exactly happens under each arrow in chapter 3, when
we explain how to use OpenID Connect to log into a single-page application. However, the
following list describes, at a high level, the communication that happens between the client
application, the OpenID provider, and the end user.


• In step 1, the client application redirects the user to the OpenID provider for
authentication. On eBay, for example, once you click on Login with Google you are
redirected to Google for authentication.
• The OpenID provider first checks whether the user has a valid session under the
domain of the OpenID provider, and if not, in step 2, challenges the user to
authenticate. It also shows the end user which user attributes the client application
is requesting. This step is outside the scope of the OpenID Connect protocol, and different
OpenID providers have implemented it in different ways.
• In step 3, the end user authenticates to the OpenID provider and gives or rejects
their consent to share the requested attributes with the client application. This step is
also outside the scope of the OpenID Connect protocol.
• In step 4, the OpenID provider redirects the user back to the client application. Based
on the OpenID Connect authentication flow you use (which we discuss in chapter 3),
this step may return the requested attributes directly to the client application, or a
temporary token that can be exchanged for the requested attributes via another direct
call between the client application and the OpenID provider.
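Step 1 above boils down to a browser redirect to the OpenID provider's authorization endpoint, carrying a handful of query parameters. The following is only a rough sketch of how a client application might build that redirect URL; the endpoint, client ID, and redirect URI values are hypothetical placeholders, not values from any real provider.

```javascript
// Sketch: building an OpenID Connect authentication request URL (step 1).
// All concrete values below are hypothetical placeholders.
function buildAuthRequestUrl({ authorizationEndpoint, clientId, redirectUri, state, nonce }) {
  const params = new URLSearchParams({
    response_type: 'code',   // authorization code flow
    scope: 'openid profile', // 'openid' marks this as an OpenID Connect request
    client_id: clientId,
    redirect_uri: redirectUri,
    state,                   // CSRF protection, echoed back in step 4
    nonce,                   // binds the issued ID token to this request
  });
  return `${authorizationEndpoint}?${params}`;
}

const url = buildAuthRequestUrl({
  authorizationEndpoint: 'https://op.example.com/authorize',
  clientId: 's6BhdRkqt3',
  redirectUri: 'https://app.example.com/callback',
  state: 'af0ifjsldkj',
  nonce: 'n-0S6_WzA2Mj',
});
```

In practice a library provided by your OpenID provider or framework assembles this URL for you, but the parameters it sends are exactly these.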

1.2 An alternative view of OpenID Connect


As we discussed in section 1.1, the most common way of looking at OpenID Connect is as an
identity layer built on top of OAuth 2.0. There is another way of looking at OpenID Connect.
It’s a specification that defines the following:

• A schema for a token (which is a JSON Web Token (JWT)) and a set of processing
rules around it. A JWT is a container to transport a set of claims (a set of attributes)
from one point to another in a cryptographically safe manner. You’ll learn more
about JWT in chapter 4. The OpenID Connect specification identifies this token as the
ID token, and you will learn more about the ID token in chapter 5. The following is a decoded
ID token, showing only the header and the body parts. 5 The third part, the
signature computed over the header and the body, is not shown.

{ "alg": "RS256", "kid": "1e9gdk7" }


{
"iss": "http://server.example.com",
"sub": "248289761001",
"aud": "s6BhdRkqt3",
"nonce": "n-0S6_WzA2Mj",
"exp": 1311281970,
"iat": 1311280970
}

• A transport binding, which defines how to transport an ID token from one point to
another. The OpenID Connect specification uses the term authentication flows to
define the multiple ways you can transport an ID token from one point to another.

5 A JWT can be a JWS (JSON Web Signature) or a JWE (JSON Web Encryption). In practice most of the time an ID token is a JWS. However, it can be a JWE as
well. If an ID token is a JWS, then it has three parts, and if the ID token is a JWE, then it has five parts. For further details please check chapter 4.


Most applications of OpenID Connect use both the token type and the transport
binding, for example when you use Login with Google on eBay. But there are still some
applications that rely only on the token (the ID token). Those applications use the ID token as
the contract to transfer attributes in a cryptographically safe manner. For example, Kubernetes
can use an ID token to authenticate a client to the Kubernetes API server, and the SPIFFE JWT
SVID profile defines the structure of the JWT as an ID token. 6
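To make the two-part structure shown above concrete, here is a minimal sketch of decoding (not validating!) a compact JWS ID token in JavaScript. The sample token is assembled locally from the example claims above, with a dummy signature; a real client must also verify the signature and check claims such as exp, aud, and nonce, which chapters 4 and 5 cover.

```javascript
// Decode (NOT validate) the header and claims of a compact JWS ID token.
// A real client must also verify the signature and check exp, aud, and nonce.
function decodeIdToken(idToken) {
  const [header, payload] = idToken.split('.'); // third part is the signature
  const parse = (part) =>
    JSON.parse(Buffer.from(part, 'base64url').toString('utf8'));
  return { header: parse(header), claims: parse(payload) };
}

// Build a sample token carrying the example claims from above;
// the signature part is a dummy, since we only decode here.
const encode = (obj) => Buffer.from(JSON.stringify(obj)).toString('base64url');
const sampleToken = [
  encode({ alg: 'RS256', kid: '1e9gdk7' }),
  encode({ iss: 'http://server.example.com', sub: '248289761001', aud: 's6BhdRkqt3' }),
  'dummy-signature',
].join('.');

const { header, claims } = decodeIdToken(sampleToken);
// header.kid → '1e9gdk7', claims.sub → '248289761001'
```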

1.3 OpenID Connect vs. OpenID


OpenID Connect has its roots in OpenID. In fact, OpenID is the predecessor of OpenID
Connect. There is a chance that you might not have heard of OpenID, or that you think OpenID
is synonymous with OpenID Connect; they are not. OpenID is a specification developed by the
OpenID Foundation (the same entity behind OpenID Connect) in 2005, as a way of facilitating
login to a client application (OpenID uses the term relying party to represent a client
application) via an identity provider the user picks. The end-user experience is very much
similar to what you experience today when you log in to eBay using your Apple ID or Google
ID. But how the protocol works underneath is completely different. Today almost no one uses
OpenID; Amazon and Goodreads are rare exceptions, and surprisingly, Amazon still uses
OpenID (at least at the time of this writing). Try to sign in to amazon.com, or log in to
goodreads.com via the Sign in with Amazon option; you will see how OpenID works.
OpenID Connect is not OpenID, and the two are not compatible with each other. Even
though conceptually they try to address the same problem at a high level, OpenID Connect’s
design is totally different from that of OpenID. The best part is that, to learn OpenID
Connect, you don’t need to know anything about OpenID. 7 Also, unlike OpenID, OpenID
Connect is enterprise ready! And most of the time today, if someone just says OpenID,
they probably mean OpenID Connect.

1.4 OpenID Connect vs. OAuth 2.0


As we discussed in section 1.3, OpenID Connect is not compatible with its predecessor,
OpenID. Unlike OpenID, OpenID Connect is built on top of OAuth 2.0. From here onwards
in this book, when not mentioned explicitly, whenever we say OpenID, we are referring to
OpenID Connect.
OAuth 2.0, which is defined in RFC 6749, is an authorization framework for access
delegation. OpenID Connect builds an identity layer on top of OAuth 2.0. We’ll discuss what
this really means later in this section. If you are new to OAuth 2.0, chapter 2 of the book
will help you understand it better. Having a good understanding of OAuth 2.0 is a must for
understanding how OpenID Connect works.
An OpenID provider is always an OAuth 2.0 authorization server, yet the reverse is not
true all the time. 8 For example, Facebook supports OAuth 2.0, so we can call it an

6 If you are new to Kubernetes and/or SPIFFE please check the Appendix H and Appendix J of the book, Microservices Security in Action (Manning
Publications, 2020) by Prabath Siriwardena and Nuwan Dias.
7 OpenID is an obsolete protocol today. However if you are interested in reading more about OpenID, please check this presentation done by Prabath
Siriwardena (the author of this book) in 2008: https://www.slideshare.net/prabathsiriwardena/understanding-openid.
8 In general, we call an identity provider that supports the OpenID Connect standard an OpenID provider – not an OpenID Connect provider.


authorization server, but we cannot call Facebook an OpenID provider, because it does not
support OpenID Connect.

Figure 1.2 A typical OAuth 2.0 authorization code grant type flow. Following the authorization grant type,
the client application obtains an access token from the authorization server to access a resource on behalf
of the user (the resource owner).

One of the key features of OAuth 2.0 is grant types. A grant type defines how a client
application can get an access token from an OAuth 2.0 authorization server (see figure 1.2).
This access token lets the client application access a resource (an API, a microservice) on
behalf of the owner of the resource. If a client application, for example, wants to access your
Facebook photos via the Facebook API (GET graph.facebook.com/me/photos), the client
application first has to obtain an access token from the Facebook authorization server, with
your consent, and use that access token along with the API call to access the photos. The
photos are the resources and you are the resource owner. The me in the above API represents
you: the owner of the resource, and the owner of the access token. The following table
summarizes the mapping between OAuth 2.0 terminology and Facebook terminology.


Table 1.1 The summary of the mapping between OAuth 2.0 terminology and Facebook
terminology

OAuth 2.0 Terminology    Facebook Terminology
---------------------    --------------------
Client                   The client application, which relies on Login with Facebook
Resource owner           The Facebook user
Resource server          Facebook, which hosts Facebook APIs
Resources                Facebook APIs corresponding to photos, wall, and so on
Authorization server     Facebook, which authenticates Facebook users

With OAuth 2.0, the client application gets an access token from the authorization server.
This token does not necessarily identify the end user (or the resource owner). It’s only good
enough to access a resource on behalf of the resource owner. The client application can
access a resource even without knowing who the resource owner is. The bottom line is that
OAuth 2.0 does not let the client application know who the user is; it only lets the client
application access a resource on behalf of the user. OAuth 2.0 is not about authentication,
but authorization.
OpenID Connect, which is built on top of OAuth 2.0, defines how you can extend OAuth
2.0 to identify a user. In simple terms, OpenID Connect is the identity layer built on top of
OAuth 2.0. When you use OpenID Connect, in addition to the access token, you also get an
identity (ID) token. The OpenID provider uses a structure (or container) called the ID token to
transport claims to the client applications. As discussed in section 1.2, the ID token is in fact
a JSON Web Token (JWT).
An ID token can carry multiple assertions. An assertion (or a claim) is a strong
statement about someone or something that can be cryptographically verified by the
recipient of the assertion. Typically, you can call an OpenID provider (or any identity
provider) an issuer of claims or assertions. An assertion can be an attribute assertion, an
authentication assertion, or an authorization assertion.
• An attribute assertion: A strong statement about someone or something that
carries a set of attributes. Your driving license, for example, issued to you by the
Department of Motor Vehicles (DMV), carries a set of attribute assertions: name,
address, eye color, hair color, gender, date of birth, license number, and so on.
• An authentication assertion: A strong statement issued with respect to how the
OpenID provider authenticates the user during the login flow. The authentication
assertion might include the username of the user and how the issuer authenticated the
user before issuing the assertion.
• An authorization assertion: A strong statement issued with respect to the
corresponding user’s entitlements. Based on the authorization assertion, the recipient
of the assertion (or the client application) can decide how to act. When an issuer, for
example, shares a user’s age or date of birth with a client application, it’s an attribute
assertion; but if the issuer says the user is entitled to buy a beer (if the user is old
enough) without sharing the age, then it’s an authorization assertion.
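The beer-buying example can be expressed in a few lines of code. In the first function the client receives the raw attribute (birthdate is a standard OpenID Connect claim) and derives the decision itself; in the second, a hypothetical claim name (can_purchase_alcohol, not a standard claim) carries only the issuer's decision:

```javascript
// Attribute assertion: the issuer shares the raw attribute (birthdate),
// and the client application derives the authorization decision itself.
function canBuyBeerFromAttribute(claims, now = new Date('2023-06-01')) {
  const msPerYear = 365.25 * 24 * 60 * 60 * 1000;
  const age = (now - new Date(claims.birthdate)) / msPerYear;
  return age >= 21;
}

// Authorization assertion: the issuer shares only the decision, via a
// hypothetical (non-standard) claim, and keeps the user's age private.
function canBuyBeerFromAuthorization(claims) {
  return claims.can_purchase_alcohol === true;
}
```

Note how the second form reveals less about the user: the client learns the entitlement without ever seeing the date of birth.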


In chapter 3, we’ll discuss in detail how OpenID Connect extends OAuth 2.0. For the time
being, let’s conclude that OAuth 2.0 is about authorization, while OpenID Connect is about
authentication.

1.5 How Login with Facebook works around OAuth 2.0 for authentication
As we concluded in section 1.4, OAuth 2.0 is about authorization. Then why are there zillions
of web sites using Login with Facebook to authenticate users? Facebook uses OAuth
2.0, not OpenID Connect. In this section we discuss how client applications work around
OAuth 2.0 to authenticate users. In chapter 3 we will discuss how OpenID Connect builds an
identity layer on top of OAuth 2.0, and why you should be using OpenID Connect for
authentication, rather than building your own ways around OAuth 2.0.
At the end of an OAuth 2.0 flow you get an access token. This access token is for the
client application to access a resource (an API, a microservice) on behalf of the resource
owner. In other words, it’s not for the consumption of the client application itself. To put it
a better way, the access token’s audience is not the client application, but the resource
(while the audience of the ID token, which comes with OpenID Connect, is the client
application). So, client applications should not try to interpret the meaning of an access
token.
The OAuth 2.0 access token a client application gets at the end of the Login with
Facebook flow, for example, is for the consumption of the Facebook API, not the client
application. It’s an opaque token to the client application, which should not try to decode the
access token or interpret its meaning. There are two types of access tokens
commonly used: reference tokens and self-contained tokens. At the time of this writing,
Facebook only uses reference tokens.
• A reference token is just a random string value generated by the authorization
server. It carries no meaningful data and only makes sense to the issuer of the token.
When an API (or the recipient of the token) wants to validate a reference token, it has
to talk to the authorization server (or the issuer of the token) every time.
• A self-contained access token contains some useful information with respect to
the user, the client application, scopes, and so on, and it’s a JSON Web Token (JWT). We
discuss JWT in detail in chapter 4.

In either case, whether the access token you get is a reference token or a self-contained
token, the client application should not try to interpret the data embedded in the token. It
should only use an access token to access a resource on behalf of the resource owner.
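Since a self-contained token is a JWT while a reference token is just an opaque string, the structural difference is easy to see with a quick heuristic. This sketch is purely illustrative; as the text says, a client application should treat every access token as opaque and never do this in practice:

```javascript
// Illustrative heuristic only: a self-contained token (a JWS) has three
// base64url parts with JSON in the first two; a reference token is opaque.
// A client application should NOT inspect access tokens like this in practice.
function looksSelfContained(token) {
  const parts = token.split('.');
  if (parts.length !== 3) return false;
  try {
    JSON.parse(Buffer.from(parts[0], 'base64url').toString('utf8'));
    JSON.parse(Buffer.from(parts[1], 'base64url').toString('utf8'));
    return true;
  } catch {
    return false; // parts are not JSON, so not a JWS
  }
}

const opaque = 'f7da3c1b9e0a'; // typical reference-token shape (random string)
// looksSelfContained(opaque) → false
```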


Figure 1.3 Client applications use Login with Facebook to authenticate users, which uses the OAuth 2.0
authorization code grant type underneath. To identify the user, the client applications access an OAuth 2.0-
protected Facebook business API with the access token received in step 4.

Let’s get back to the Facebook use case. Once a user completes the Login with Facebook
flow, the client application gets an access token (figure 1.3). This access token does not
carry any user information, but it is good enough to access the Facebook API,
https://graph.facebook.com/me.
The request to the Facebook API returns the information about the owner of the access
token you pass along with the request. The level of data exposed via this
API is governed by OAuth 2.0 scopes. We discuss OAuth 2.0 scopes in detail in chapter 2.
For the time being, think of scopes as permissions. The scopes attached to an access
token define what you can do with that token.
This is how Login with Facebook works around OAuth 2.0 to authenticate users. At the
end of the OAuth 2.0 flow, the client application has to use a non-standard API provided by
Facebook to identify the user, and that API is outside the scope of the OAuth 2.0 specification.
So, OAuth 2.0 still acts as an authorization framework, and to identify a user, client applications
talk to a Facebook business API. In chapter 3, we discuss the limitations of this approach,
and why you should use OpenID Connect instead.

1.6 OpenID Connect vs. SAML 2.0 Web SSO


In section 1.3 we discussed that OpenID is the predecessor of OpenID Connect. SAML 2.0
Web SSO (Single Sign-On) is another standard, which is quite similar to OpenID and OpenID
Connect in terms of the end-user experience. OpenID (not OpenID Connect) and SAML 2.0
Web SSO came out in the same year to address similar concerns. Yet most enterprises
adopted SAML 2.0 Web SSO, while most of the web sites that required community
interactions adopted OpenID. The pedigree of SAML 2.0, with support from all the members
of the Liberty Alliance (http://www.projectliberty.org/), the Shibboleth initiative
(https://www.shibboleth.net/), and the many who were already using SAML version 1.1, helped
SAML 2.0 become a widely adopted standard. At the same time, SAML 2.0 addressed most of
the enterprise SSO use cases, while OpenID fell behind. However, OpenID established
itself as a successful standard in the web community.
Even today, a few years after OpenID Connect replaced OpenID, SAML 2.0 Web SSO is still
the most widely used standard for SSO (even though most new greenfield applications
today use OpenID Connect). The popularity of SAML 2.0 Web SSO started to fade along with
that of XML, around the early 2010s. The identity industry started looking for a JSON-based
standard that was as strong as SAML 2.0 Web SSO in terms of enterprise-level use
cases and security. OpenID Connect filled that gap. Unlike OpenID, OpenID Connect today is
as strong as SAML 2.0 and addresses key enterprise use cases. If you start building an
application today, we recommend using OpenID Connect instead of SAML 2.0 Web SSO.

1.7 Transporting identity-related attributes across multiple trust domains
In section 1.1 you learnt that in an OpenID Connect login flow, when a client application
belongs to a different organization than the OpenID provider, we call that
client application a third-party client application. The whole process of sharing identity-related
data (for example, user attributes) between two or more organizations is called
identity federation in general. When you log in to eBay using your Google ID, for example,
Google shares the identity attributes you store at the Google OpenID provider with
eBay. In other words, Google federates your identity attributes to eBay.
In an enterprise setup, identity federation can take place even within a single
organization, among multiple departments. Typically, you have a Corporate IT department
that centrally manages all the employee identity data. This is also the team that owns the
corporate OpenID provider. The other departments of the company, such as Marketing,
Engineering, Sales, and so on, may have their own applications, and there can be small IT
teams managing those applications. All the applications connect to the corporate OpenID
provider to authenticate employees. During the login flow, the corporate OpenID provider
transfers the identity data corresponding to an employee, which the Corporate IT department
owns, to an application owned by a different department. This is another example of identity
federation.
To precisely define identity federation, we use the term trust domain instead of
organization or department. In fact, an organization or a department can be a trust
domain. However, a trust domain has a broader meaning. We can define the trust domain of
an application by the team having control and governance of that application. All the
applications controlled and governed by the same team fall under the same trust domain.

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>



When you transfer identity attributes among two or more trust domains, we call that process
identity federation.
When sharing identity attributes, the application in the recipient domain only accepts
identity attributes from a trusted OpenID provider. The most common way to establish trust
among domains is via X.509 certificates. The OpenID provider signs the attributes it shares
with the client applications, and the client applications validate the signature using the
corresponding public certificate of the OpenID provider, which is already known to the client
applications.
OpenID Connect, SAML 2.0 Web SSO, WS-Federation, and Central Authentication Service
(CAS) are standard ways of implementing identity federation. All of them define a
protocol for transporting identity-related attributes from an identity provider to a client
application. For the same reason, these standards are also called identity federation
standards. When you hear someone say OpenID Connect is a standard for identity
federation, now you know what it means!

1.8 Building a seamless login experience among multiple applications connected to a single identity provider
Once you have logged into an identity provider that supports SSO by providing credentials,
you can access all the applications connected to that identity provider without providing
your credentials again. When there are multiple applications that trust Google for login, for
example, you have SSO among all those applications, and you provide credentials to Google
only once.
If you have multiple applications and each one of them independently accepts login
credentials, you cannot build an SSO experience for the users of those applications.
Each time a user logs into an application, they have to share their credentials with the
corresponding application. This is frustrating and a productivity killer. Your
employees may tolerate such an experience, but if you build customer-facing applications, your
customers never will. Building the right level of user experience is key to onboarding and
retaining customers.
You can build an SSO experience for a user only among the applications that trust the
same identity provider. If you have a set of applications that trusts Google for login, and
another set of applications that trusts Apple for login, you cannot build SSO among all those
applications. You can have SSO among the applications that trust Google for login, and SSO
among the applications that trust Apple for login. But you cannot have SSO between both the
sets of applications. To have SSO, it’s a primary requirement that all the applications that
participate in building the SSO experience trust a single identity provider.
There are two main types of SSO: web SSO and enterprise SSO. In this book, whenever
we say SSO, we mean web SSO. The discussion we had in this section so far is also related
to web SSO. Web SSO works only with web applications, which are accessed via a web
browser. So, the client applications as well as the OpenID provider must be web applications.
Enterprise SSO talks about building SSO among all the types of applications a user has
access to, which includes web applications, desktop applications, command line tools and so
on.


1.9 The benefits of having one trusted identity provider for multiple
client applications
The most traditional way of integrating login with an application is form-based
authentication. The application itself asks the user to provide their credentials and connects
to a credential store, which could be an LDAP server (OpenLDAP, OpenDS, OpenDJ, Microsoft
Active Directory and so on) or a database to verify the credentials (figure 1.4). In this
section we discuss the drawbacks of this design and the advantages you gain by moving to a
model where you have one trusted identity provider (an OpenID provider).

Figure 1.4 Each application itself requests the user to provide their credentials and posts those to a backend
system to validate. To share credentials with an application, the user has to trust each of them, which
broadens the attack surface. Each application has to worry about how to securely accept and transmit user
credentials.

If you have experience using SAML 2.0 Web SSO or any other protocol equivalent to OpenID
Connect, you are probably familiar with this topic and can safely skip this
section and move to section 1.10.

1.9.1 Having one trusted identity provider means you have a single source of truth
Having each application manage user credentials would probably work for a small
company with a few applications and one team to own them all. In this design you expect each
application to handle user credentials in a responsible manner. The application has direct
access to the credentials a user shares during the login flow, as well as direct access to the
credential store. This puts a lot of trust on the implementation and the deployment of the
application. You can probably do that when you are a small company and you have just one
team to handle everything. But as you scale, and when you have multiple teams developing
applications, or you start onboarding applications from different vendors, you cannot trust
each application to handle user credentials in a secure way.


Figure 1.5 Each application trusts the OpenID provider for authenticating users and the OpenID provider acts
as the single source of truth. Only the OpenID provider has direct access to user credentials. To authenticate a
user to a given application, the corresponding application redirects the user to the OpenID provider.

OpenID Connect helps decouple applications from user credentials. Only the OpenID provider
needs direct access to the credential store, and only the OpenID provider accepts
credentials from the users. This makes the OpenID provider the single source of truth. All
the other applications only need to establish a trust relationship with the OpenID provider,
and do not need to verify each individual user's credentials themselves (see figure 1.5). 9 If the OpenID
provider says it's a legitimate user in the system, the client applications simply take its word.
In chapter 3 we'll discuss how to establish a trust relationship between the OpenID provider
and the client applications.

1.9.2 Having one trusted identity provider helps implement single sign-on (SSO) across multiple client applications
When we have multiple applications that trust a single OpenID provider for login, a user
only needs to log in to that OpenID provider once; that is, the first time they log into an

9 The most common way for an application to establish trust with an OpenID provider is to trust the public certificate corresponding to that OpenID
provider. The OpenID provider signs all identity attributes it shares with the client applications, and the client applications can verify the signature
using the corresponding public certificate.


application via the OpenID provider. When logging in to other applications via the same OpenID
provider, users do not need to type their credentials again and again. From the end user's point
of view, this provides a single sign-on experience. As we discussed in section 1.8, in this
book we only talk about web SSO.
OpenID Connect by design maintains the logged-in user's session under a single
domain, which is the OpenID provider's domain. Here, domain means the domain
name of the OpenID provider, which you access via a web browser. For example,
when you use your Google ID to log in to eBay, the Google OpenID provider maintains your
logged-in session under the accounts.google.com domain name.
Multiple applications that trust an OpenID provider for login will redirect users to the
OpenID provider's domain for authentication. Since the OpenID provider maintains the users'
logged-in sessions under its own domain name, it can detect whether a given
user has already logged in or not. If the user is already logged in, the OpenID provider
can skip asking that user for credentials. Typically, the OpenID provider uses browser
cookies to detect whether a given user has a valid login session. From the end user's
point of view, this builds an SSO experience.
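The cookie check above can be sketched in a few lines of Node.js. The session store, cookie name, and function are hypothetical, purely to illustrate how a provider decides between silently continuing the login flow and showing the login form.

```javascript
// Hypothetical in-memory session store at the OpenID provider;
// real providers persist sessions server-side or in signed cookies.
const sessions = new Map([['s1b2c3', { sub: 'user-123' }]]);

// Parse the browser's Cookie header and check for a valid provider session.
// If one exists, the provider can skip the credentials prompt (SSO).
function hasValidSession(cookieHeader) {
  const cookies = Object.fromEntries(
    (cookieHeader || '').split(';').map((c) => c.trim().split('=')),
  );
  return sessions.has(cookies['OP_SESSION']);
}

console.log(hasValidSession('OP_SESSION=s1b2c3'));  // true: skip the login form
console.log(hasValidSession('OP_SESSION=expired')); // false: ask for credentials
```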

1.9.3 A single place to implement and configure multiple login options for user
authentication
If each application has to deal with user authentication independently, then each application
has to know how to implement and maintain those authentication protocols. This could be
doable when you simply rely on username/password-based authentication. But to address
current threats in cyber security and ever-increasing data breaches globally, most
modern applications rely on multi-factor authentication (MFA) and adaptive authentication.
With MFA, most of the time you would use one more login option along with
username/password-based authentication. This second factor could be a one-time passcode
(OTP) sent over SMS or email, Fast Identity Online (FIDO) 2.0, a time-based OTP (TOTP) using
something like the Google Authenticator app, and so on 10.
With adaptive authentication, you decide how you want to authenticate users based
on some contextual parameters. These contextual parameters can be the role of a user, the
time of day, the location of the user, the number of failed login attempts, or a risk score
derived dynamically during the login flow. If the risk score is high, for example, you can
use FIDO 2.0 as the second factor, while if the risk score is medium, you can use OTP over
SMS. Typically, to find the risk score your application has to connect to a risk engine. 11
These are complex scenarios that we cannot expect each application to handle by itself.
It's a hard ask for application developers to build support for these authentication options
in their applications, as implementing them requires specialized skills and needs to be tested
thoroughly.

10 FIDO 2.0 is a specification developed by FIDO Alliance for strong authentication.


11 A risk engine is a specialized service or software that considers multiple factors and derives a risk score. For example, if you use your credit card in the USA
and China within the same hour, a risk engine that evaluates all the credit card transactions will detect that the second transaction is high risk, and
will probably lock the credit card.


Since the design of OpenID Connect helps you decouple applications from the identity
provider, which is also the single source of truth, we don't want individual applications to
implement the above complex authentication options. All the client applications rely on an
OpenID provider to authenticate users, and the OpenID provider can implement all the
complex authentication options. Then again, you should not worry about building an OpenID
provider yourself; rather, use one already available. Section 1.11 lists some of the available,
production-ready, open source OpenID providers.

OpenID Connect decouples the authentication options from the protocol

None of the identity federation protocols, including OpenID Connect, mandate how to authenticate a user at the
identity provider. They define the request/response protocol between the client application and the identity provider.
So, how you implement MFA options or support for adaptive authentication is outside the scope of OpenID
Connect. Most modern identity providers do support these requirements. When you pick an identity provider,
you need to be conscious of these requirements.

1.9.4 Having one trusted identity provider helps to bootstrap trust with external
identity providers
Most enterprises, as they grow, start building relationships with external partners, suppliers,
and others who need access to their internal applications. One straightforward way of
onboarding a partner is to create an account in your local credential store for each employee
of the partner who needs access to your internal applications. This approach looks
simple and hassle free, yet it has a major security concern. Here, we do not trust the
partner company, but rather individuals of that company. If an individual leaves the
corresponding company, unless they notify you and you consciously remove their accounts
from your system, they will still have access. Beyond the security concern, there is a
usability concern too from the user's point of view: the employees of the partner
company now have to maintain one more set of credentials.
The ideal approach to fixing these kinds of problems is to build trust between the two
parties – or the two companies. How do you do that? One approach is for each and every
application in your domain to trust an identity provider running in your partner's domain. 12
This may work to some extent if you have one application and one partner. As you scale and
start adding more applications and partners, managing trust relationships between
applications and partners could lead to a maintenance nightmare. 13 We also call this the
spaghetti identity pattern, as you build multiple point-to-point connections between
applications and partners. Here are some of the drawbacks of this approach.

12 Each application needs to know the public certificate associated with each external identity provider. When an application gets a set of identity
attributes from an external identity provider, it validates the signature of identity attributes using the corresponding public certificate.
13 The most common way to build a trust relationship is via X.509 certificates. If an application trusts a partner, that application should know the
X.509 certificate (or the public certificate) associated with the corresponding partner.


• Hard to enforce a centralized governance model for the company that decides which
partner should have access to which application.
• Each application has to trust one or more partner domains. Every time you add or
remove a partner, you need to update multiple systems.
• Each application has to deal with protocol/claim transformations by itself. For
example, when your application supporting OpenID Connect has to connect to a
partner identity provider that supports SAML 2.0 Web SSO, your application should
know how to transform a SAML token to an OpenID Connect token. We discuss this in
section 1.9.5 in detail.

The centralized identity provider approach helps to overcome all the above drawbacks (see
figure 1.6). You have one identity provider all your applications trust, and you build trust
relationships between that identity provider and partner identity providers. In that way, all
your applications only need to trust your own internal identity provider, and should only
know how to talk to it. This internal identity provider bootstraps trust with partners (or with
partner identity providers). 14

Figure 1.6 The internal identity provider bootstraps trust with partner identity providers, and does
claim/protocol transformations as expected by the client applications connected to it. All the client
applications only need to trust the internal identity provider, and only need to support the federation
protocol the internal identity provider supports.

14 Bootstrapping trust refers to the process you go through to introduce a new partner into the system as a trusted identity provider. During this
process you upload the public certificate of the partner identity provider to your internal identity provider, so the internal identity provider can
validate the signature of the identity attributes coming from the corresponding partner identity provider.


1.9.5 Handling protocol/claim transformation between client applications and partner identity providers at a single place
If each application directly connects to partner identity providers, then each application
needs to know how to talk to those identity providers. Not all of them will support OpenID
Connect; some will support other federation protocols instead, such as SAML 2.0 Web SSO,
WS-Federation, Central Authentication Service (CAS), and so on. In that case, based on the
partner identity provider, the application developers have to implement support for the
corresponding federation protocols. 15
Also, in practice, when we talk to multiple partner identity providers, each identity
provider may have its own way of defining the claims it shares. A claim is an assertion issued
by a trusted entity, for example by an OpenID provider. 16 Here, the challenge for the client
application is that different partner identity providers can use different claim URIs to identify the
same claim. For example, one identity provider may use email_address as the claim URI,
while another could use email to identify the email address of the corresponding user.
This can get even worse when some partner identity providers don't have the exact claims
you expect. Your client application, for example, may expect the full name of the user as a claim,
but the partner identity provider only provides the first name and last name. In that case you
need to write some code in your application to concatenate these two together to
create the full name claim.
Expecting each client application to deal with multiple federation protocols and claim
transformations makes your applications brittle. You deeply couple your application to external
identity providers. Also, building support for multiple federation protocols at the client
application end requires specialized skills, which normally go beyond average
developer skills.
The best way to handle this kind of situation is to push the logic that handles protocol/claim
transformation to a centralized identity provider. Your application will always expect claims in
one given format, and will only talk OpenID Connect. The internal identity provider your
application connects to will deal with connecting to partner identity providers over
heterogeneous federation protocols and will do the claim transformations where required.
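To illustrate the kind of claim transformation a centralized identity provider performs, here is a small Node.js sketch that maps heterogeneous claim URIs to one canonical format and derives a missing full_name claim. The claim names and mapping table are illustrative, not taken from any specific provider.

```javascript
// Map provider-specific claim URIs to the canonical names our applications expect.
const CLAIM_MAP = {
  email_address: 'email',
  given_name: 'first_name',
  family_name: 'last_name',
};

function transformClaims(claims) {
  const out = {};
  for (const [name, value] of Object.entries(claims)) {
    out[CLAIM_MAP[name] || name] = value; // rename known claims, pass others through
  }
  // Derive claims the partner identity provider does not send directly.
  if (!out.full_name && out.first_name && out.last_name) {
    out.full_name = `${out.first_name} ${out.last_name}`;
  }
  return out;
}

console.log(transformClaims({
  email_address: '[email protected]', given_name: 'Pat', family_name: 'Silva',
}));
// → { email: '[email protected]', first_name: 'Pat', last_name: 'Silva', full_name: 'Pat Silva' }
```

Because this mapping lives at the internal identity provider, every client application sees one claim format regardless of which partner provider authenticated the user.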

1.10 OpenID Connect use cases


You already know OpenID Connect is being used for login, and probably that’s how you got
to know about OpenID Connect for the first time. In this section, we discuss login and other
extended use cases of OpenID Connect. All these use cases revolve around the primary login
use case.

1.10.1 Login to client applications


Login with OpenID Connect is the most popular use case of OpenID Connect, and also the
base for all the other use cases. You can use OpenID Connect to log into your web
application, single-page application, native mobile application or to an application running on

15 WS-Federation and CAS are federation protocols similar to OpenID Connect and SAML 2.0 Web SSO. However those are not that popular.
16 Typically, an identity provider shares claims as name / value pairs. The name of the claim is also called the claim URI and the value is called the claim
value


a browserless device such as a smart TV. In chapter 3 we discuss in detail how you can
integrate OpenID Connect with a single-page application developed in React, and in chapter
6 we discuss how you can use OpenID Connect to log into a server-side web application
developed in Java. Then in chapter 9 we discuss how to integrate OpenID Connect for login
with a native mobile application developed in React Native.

1.10.2 Sharing attributes


OpenID Connect lets you transport user attributes from an OpenID provider to the client
applications. The client applications can request the attributes they need by using OAuth 2.0
scopes, OpenID Connect request parameters, or some OpenID provider-specific out-of-band
configuration. 17 A client application can use these attributes to identify the user,
communicate with the user (for example, using the user's email address to send a newsletter),
build a personalized user experience, and do application-specific authorization checks.
In chapter 5 we discuss in detail how a client application can request attributes from an
OpenID provider, and how the attributes are returned to the client application.
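As a concrete example, a client typically asks for attributes by listing OAuth 2.0 scopes (such as the standard profile and email scopes) in the authorization request. This sketch builds such a request URL; the endpoint, client ID, redirect URI, and state value are placeholders, not real provider settings.

```javascript
// Build an OpenID Connect authorization request that asks for the user
// attributes behind the standard 'profile' and 'email' scopes.
function buildAuthRequest(authEndpoint, clientId, redirectUri) {
  const params = new URLSearchParams({
    response_type: 'code',
    client_id: clientId,
    redirect_uri: redirectUri,
    scope: 'openid profile email', // scopes map to groups of user attributes
    state: 'af0ifjsldkj',          // illustrative; must be an unguessable random value
  });
  return `${authEndpoint}?${params}`;
}

console.log(buildAuthRequest(
  'https://op.example.com/authorize', 'my-client', 'https://app.example.com/callback',
));
```

The provider returns the requested attributes either inside the ID token or from its userinfo endpoint, depending on the flow, which is the topic of chapter 5.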

1.10.3 Signup with OpenID Connect


Signup is another key use case of OpenID Connect. This is not a use case OpenID Connect
addresses in a specific way, rather the client applications use the login flow of the OpenID
Connect protocol to sign up users. If you visit the web site Coursera (coursera.org) and click
on Join for Free, you’ll see options to sign up with Google ID and Apple ID. Both cases
initiate a login flow with OpenID Connect (figure 1.7). At the end of the login flow, Coursera
gets user attributes from the OpenID provider, and will use those to create a local account
for the user in its own user store. This reduces the signup friction to some extent, as the
users do not need to retype their personal information every time they sign up for some
service.

17 In general, most OpenID providers have their own ways of configuring which attributes should be shared with which client applications. These
configurations are done out of band, by application developers directly interacting with the corresponding OpenID provider.


Figure 1.7 Coursera, the famous online learning platform, supports signup with both Apple ID and Google ID,
which use OpenID Connect underneath. Once the login flow is completed, the client application checks
whether the user has a valid local account with it; if not, requests the user to sign up.

The risk of completely relying on social login


Most of the time, when a website offers you the option to sign up with an external social identity provider, say, for example, with
Facebook, Google, LinkedIn, Apple, and so on, you won't have local credentials with the corresponding website (or
the client application). Instead, you will use the social login all the time. This isn't specific to OpenID Connect, but if
your website fully relies on social login, you put your business at risk to some extent 18. There were cases in some
countries where governments banned Facebook and other social media sites for some period 19. During that time,
if you relied only on Facebook, your users couldn't log into your website and consume services, which could possibly
lead to a considerable loss of income. To minimize the risk, you can support login with multiple social identity
providers, or ask your users to create a local account with credentials (with your website) once they sign up via a
social identity provider. That minimizes your dependency on social login, yet your customers can use it for login when
available.

18 There are more than 70 million small businesses on Facebook, using it on a day-to-day basis for login.
19 Sri Lanka banned Facebook and some other social media once in 2018 to avoid the spread of false news during a communal riot and again in 2019
during the Easter day bombing attack.


1.10.4 Single logout


Once a user logs out from the OpenID provider, or from any client application that supports
single logout, the user should be logged out from the corresponding OpenID provider as
well as from all the other client applications the user has logged into via the same OpenID
provider, sharing the same browser session. Just as with SSO in section 1.8, here
too we discuss single logout only with respect to web applications, which are accessed using
a web browser.
The OpenID Connect protocol does not specifically address single sign-on (SSO). But, as we
discussed in section 1.9.2, an OpenID provider can implement SSO during the login flow.
However, outside the core OpenID Connect specification, the OpenID Foundation developed
three other specifications that define how to implement logout.
We'll learn about all three specifications in detail in chapter 7. Just as an
OpenID provider implements SSO alongside OpenID Connect, it can implement single logout as
well.

1.10.5 Federating access to APIs


OAuth 2.0 is the de facto standard for securing access to APIs. If you are new to API access
management, we recommend you check out the book Advanced API Security: OAuth
2.0 and Beyond (Apress, 2019) by Prabath Siriwardena. To access an API that is protected
with OAuth 2.0, you need an access token from an issuer (or an OAuth 2.0 authorization
server) that the corresponding API trusts. 20 Typically, a client application accesses an API either just
by being itself, or on behalf of another user or system.
Think about a scenario where a user comes from the same trust domain (see section
1.7) as the OAuth 2.0 authorization server. As explained in section 1.4, an OAuth 2.0
authorization server is not necessarily an OpenID provider. Then again, as explained in
section 1.5, irrespective of whether the authorization server supports OpenID Connect or not, you can still
work around that to log a user in to a client application even via OAuth 2.0.
If the end users and the authorization server that your API trusts are in the same trust
domain, then the authorization server knows how to authenticate the users. So, when the
client application invokes the API to access a resource on behalf of the end user, and when
the API (or an API gateway that intercepts all the traffic coming to the API) talks to the
authorization server it trusts to validate the token, the authorization server can identify the
user by looking at the token. This is a non-federated API access use case, where the user
and the authorization server (which the API or the API gateway trusts) are in the same domain.

20 When an API receives a request from a client application along with an OAuth 2.0 access token, it always validates the token by talking to an
authorization server it trusts.


Figure 1.8 The client application exchanges the ID token it got from the OpenID provider it trusts for an OAuth
2.0 access token from the authorization server the API trusts. The authorization server only accepts the token
the client application presents in step 2 if it trusts the corresponding OpenID provider. The trust between the
OpenID provider and the authorization server, and the trust between the API gateway and the authorization
server, can be established out of band.

In a scenario where the authorization server does not know how to authenticate the users, or
the authorization server has no control over how the client applications authenticate their
users, we need to worry about federation.
When a client application wants to access an API on behalf of a logged-in user, it has to
bring an OAuth 2.0 access token from the authorization server the API trusts. Assuming
the client application has its own trusted OpenID provider to authenticate its users, it won't
be able to use the OAuth 2.0 access token it gets during the OpenID Connect login flow to
access the API. The API only trusts its own authorization server, not the OpenID provider
attached to the client application.
To fix this problem, the client application has to talk to the authorization server the API
trusts and get an access token for the logged-in user. In doing that, the client application can
pass the ID token of the corresponding user, which it got from the OpenID provider it trusts
during the login flow (see figure 1.8).
During this token exchange, the authorization server will issue an access token for the user
attached to the ID token, if it trusts the OpenID provider that issued the corresponding ID
token. So, whenever the client application accesses the API, it passes an access
token issued by the authorization server the API trusts.
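One standard way to implement this exchange is OAuth 2.0 Token Exchange (RFC 8693). The Node.js sketch below only builds the request body a client would POST to the authorization server's token endpoint; the endpoint and token values are placeholders, and real requests also carry client authentication.

```javascript
// Build an RFC 8693 token-exchange request body: trade the ID token from the
// client's OpenID provider for an access token from the API's authorization server.
function buildTokenExchangeBody(idToken) {
  return new URLSearchParams({
    grant_type: 'urn:ietf:params:oauth:grant-type:token-exchange',
    subject_token: idToken,
    subject_token_type: 'urn:ietf:params:oauth:token-type:id_token',
    requested_token_type: 'urn:ietf:params:oauth:token-type:access_token',
  }).toString();
}

// The client would POST this body (form-encoded) to a token endpoint such as
// https://as.example.com/token (placeholder URL).
const body = buildTokenExchangeBody('eyJhbGciOi...'); // placeholder ID token
console.log(body);
```

The authorization server validates the subject_token's signature against the OpenID provider it trusts, then issues its own access token, which is what the client presents to the API from then on.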


1.11 OpenID providers and client libraries


In an OpenID Connect flow there are two main entities: an OpenID provider and a client
application. In most cases, we do not recommend that you implement an OpenID provider
yourself; rather, look for the OpenID provider out there that best fits your requirements.
Then again, if you are a hardcore developer with the required level of skill and a deep
understanding of OpenID Connect and security, of course you can go ahead and implement one.
Even in that case, you do not have to start from scratch.
You can start with an open source OpenID provider, which has a compatible license and
build on top of it. For example, Apache 2.0 is the most business-friendly open source license.
You can fork any software released under Apache 2.0 license, make your own changes in
your own repository, use it privately, or release it to the public either as an open source or
as a proprietary product.
There are other open source licenses as well, such as the GNU General Public License (GPL),
MIT, and so on. All the products released under these licenses are also free to use, but if you
intend to make any changes on top of their code base, please check the corresponding
licenses in detail. Here's an alphabetical list of open source OpenID providers released under
the Apache 2.0 license.
• Dex (Apache 2.0 license): https://github.com/dexidp/dex
• Identity Server by WSO2 (Apache 2.0 license):
https://github.com/wso2/product-is
• IdentityServer (.NET) (Apache 2.0 license): https://github.com/IdentityServer
• Keycloak by Red Hat (Apache 2.0 license):
https://github.com/keycloak/keycloak
• Gluu (Apache 2.0 license): https://github.com/GluuFederation

Here’s an alphabetical list of popular cloud-based identity management solutions that support
OpenID Connect.
• Auth0 : https://auth0.com
• Azure AD: https://azure.microsoft.com/en-us/services/active-directory
• Okta: https://www.okta.com
• OneLogin: https://www.onelogin.com
• Ping One: http://pingone.com

The focus of this book is mostly to help you integrate your client applications with an OpenID
provider. The client application can be a web application, single-page application, native
mobile application, or an application running on a browserless device such as a smart TV.
To build these applications, based on your preferred programming language, you need to
pick a library that knows how to deal with the OpenID Connect-specific nitty-gritty, and build
your application on top of that. Here's a list of open source libraries that you can use to
develop OpenID Connect client applications.


• A module for the Apache2 web server: https://github.com/zmartzone/mod_auth_openidc
• A PHP library: https://bitbucket.org/PEOFIAMP/phpoidc/src/master/
• A C# library: https://identitymodel.readthedocs.io/en/latest/native/overview.html
• A Node.js library: https://www.npmjs.com/package/openid-client
• A Java/Spring Boot library: https://spring.io/projects/spring-security-oauth
• A Ruby library: https://github.com/nov/openid_connect

1.12 What you will learn in this book


This book gives you a hands-on introduction to OpenID Connect. We use Java, React,
and React Native to build the samples, and basic knowledge of Java and
JavaScript, along with good developer experience (in any programming
language), will help you get the maximum out of the book. You will learn:
• How to integrate OpenID Connect with a web application, a single-page application
(SPA), and a native mobile application.
• The security best practices in developing a client application to work with OpenID
Connect, based on the use cases and context.
• To pick an OpenID provider by understanding the key requirements in terms of
business use cases, security, and spec compliance.

1.13 Summary
• OpenID Connect is an identity layer built on top of OAuth 2.0.
• OpenID Connect has roots in OpenID, but OpenID Connect and OpenID are not
compatible standards, and OpenID is no longer used.
• Most of the greenfield applications developed today use OpenID Connect.
• OpenID Connect facilitates single sign on, identity federation, attribute sharing, single
logout, and many more use cases.
• There are multiple benefits to having one trusted identity provider for a given
enterprise, rather than having each application implement support for multiple
heterogeneous identity federation protocols.
• There are many open source implementations of OpenID providers and OpenID
Connect client libraries. You pick an OpenID provider based on your
requirements, and a client library based on the type of the client and the technology
you prefer to use.


The cornerstone of OpenID Connect

This chapter covers

• What is OAuth 2.0 and how does it fix the access delegation problem?
• The actors of an OAuth 2.0 flow
• OAuth 2.0 grant types and client types
• What’s new in OAuth 2.1?
In chapter 1 you learned that OpenID Connect is an open standard developed by the OpenID
Foundation on top of the OAuth 2.0 specification, and that OpenID Connect is an identity
layer built on top of OAuth 2.0. OAuth 2.0 is the cornerstone of OpenID Connect, and in this
chapter we delve deeply into OAuth 2.0. If you are already familiar with OAuth 2.0, you can
safely skip this chapter.
You’ll also learn in this chapter about the changes proposed by the successor to OAuth 2.0,
OAuth 2.1, which is still at the draft stage at the time of this writing. Even if you are already
familiar with OAuth 2.0 but want to learn what’s new in OAuth 2.1, this chapter is still
worth going through.
This chapter does not cover all the bits and pieces of OAuth 2.0. OAuth 2.0 has evolved a
lot since its inception in 2012. If you’re interested in understanding OAuth 2.0 in detail, we
recommend Advanced API Security: OAuth 2.0 and Beyond (Apress, 2019) by Prabath
Siriwardena and OAuth 2 in Action (Manning, 2017) by Justin Richer and Antonio Sanso.

2.1 What is OAuth 2.0?


In this section you’ll learn what OAuth 2.0 is, what the access delegation problem is, and
how OAuth 2.0 addresses the access delegation problem. OAuth 2.0 is an authorization


framework developed by the Internet Engineering Task Force (IETF) OAuth working group.
It’s defined in RFC 6749 (https://tools.ietf.org/html/rfc6749). The fundamental focus of
OAuth 2.0 is to fix the access delegation problem. In section 2.1.1 we discuss what the access
delegation problem is, in section 2.1.2 you’ll learn how OAuth 2.0 fixes the access
delegation problem, and in section 2.1.3 you’ll learn why we call OAuth 2.0 a framework.

2.1.1 What is the access delegation problem?


In this section you’ll learn about access delegation and two common models for delegating
access. If you want a third-party application, for example, to read your Facebook status
messages, you need to give that third-party application the corresponding rights to access
the Facebook API. This is called delegating access to a third-party application to access
the Facebook API. There are two models of access delegation: one model delegates access
with credential sharing, while the other delegates access with no credential sharing.
If we follow the model that delegates access with credential sharing, you need to share
your Facebook credentials with the third-party application. It can use your credentials to
authenticate to the Facebook API and read all your Facebook status messages. This is quite
a dangerous model (we are using Facebook just as an example; in reality, it does not support
this model).
Once you share your credentials with a third-party application, it can do anything, not
just read your Facebook status messages. It can read your friends list, view your photos, and
chat with your friends via Messenger. Delegating access with credential sharing is the model
many applications used before OAuth.
FlickrAuth, Google AuthSub, and Yahoo BBAuth all tried to fix the access delegation
problem in their own proprietary ways: to undertake access delegation with no credential
sharing. OAuth 1.0, released in 2007, was the first effort to crack this problem in a standard
way. OAuth 2.0 followed the direction set by OAuth 1.0 and, in October 2012, became
RFC 6749. 1

2.1.2 Fixing the access delegation problem with OAuth 2.0


In this section we discuss how OAuth 2.0 helps to implement access delegation with no
credential sharing. Here we extend the same example we discussed in section 2.1.1. With
OAuth 2.0, the third-party web application, which requires access to the Facebook API to
read your status messages, first redirects the user to Facebook (where the user belongs).
Facebook authenticates and gets the user’s consent to share a temporary token with the
third-party web application, which is only good enough to read the user’s Facebook status
messages for a limited time. Once the web application gets the token from Facebook, it
passes the token along with the API calls to Facebook (figure 2.1).
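The initial redirect in this flow is just an HTTP redirect to the authorization server, with the client’s details passed as query parameters. As a rough, hypothetical sketch (the endpoint URL, client ID, redirect URI, and scope name here are made up for illustration, not Facebook’s real API), a client application might build that redirect URL like this:

```python
from urllib.parse import urlencode

# Hypothetical values: a real client would use the authorization server's
# published authorization endpoint and its own registered client_id.
params = {
    "response_type": "code",            # ask for a temporary authorization code
    "client_id": "third_party_app_id",  # identifies the client application
    "redirect_uri": "https://app.example.com/callback",
    "scope": "read_status",             # the purpose: read status messages only
}
authorization_url = "https://facebook.example.com/oauth/authorize?" + urlencode(params)
print(authorization_url)
```

The user’s browser is sent to this URL; the authorization server then authenticates the user and asks for consent before handing the temporary token back to the client.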

1 The main difference is that OAuth 2.0 is more extensible than OAuth 1.0. OAuth 1.0 is a concrete protocol, whereas OAuth 2.0 is an authorization
framework.


Figure 2.1 A third-party application follows the model of access delegation with no credential sharing in order
to get a temporary token from Facebook, which is only good enough to read a user’s status messages.

The temporary token Facebook issues has a limited lifetime and is bound to the Facebook
user, the third-party web application, and the purpose. The purpose of the token here is to
read the user’s Facebook status messages, and the token should be only good enough to do
just that and no more. The OAuth 2.0 terminology is as follows:
• The Facebook user is called the resource owner. The resource owner decides who
should have which level of access to the resources they own.
• Facebook, which issues the token, is called the authorization server. The authorization
server knows how to authenticate (or identify) the resource owner, and grants access
to third-party applications to access resources owned by the resource owner, with
their consent. The authorization server also knows how to identify these third-party
applications.
• The Facebook API is called the resource and the server that hosts all the resources
are called the resource server. The resource server guards the resources owned by
the resource owner, and lets someone access a resource only if the access request
comes along with a valid token issued by an authorization server the resource server
trusts.
• The third-party web application is called the client. The client consumes a resource on
behalf of the resource owner.


• The token Facebook issues to the third-party web application is called the access
token. The authorization server issues access tokens, and the resource server
validates those. To validate an access token, the resource server may talk to the
authorization server. We used the word ‘may’ here because, when we use self-contained
access tokens, which we discuss in section 2.6, the resource server does not need to
talk to the authorization server to validate the token.
• The purpose of the token is called the scope. The resource server makes sure a given
token can be used only for the scopes attached to it. If the third-party application
tries to write to the user’s Facebook wall with the access token it got to read the
status messages, that request will fail. We discuss scopes in detail in section 2.5.
• A grant type defines the flow of events that happens during the process of the third-
party web application getting an access token from the authorization server. OAuth
2.0 defines a set of grant types, which we discuss in section 2.4.

2.1.3 Why OAuth 2.0 is called an authorization framework


You already know that OAuth 2.0 is an authorization framework. In this section we discuss
why OAuth 2.0 is called an authorization framework. As per Wikipedia
(https://en.wikipedia.org/wiki/Software_framework), in computer programming, a software
framework is an abstraction in which software providing generic functionality can be
selectively changed by additional user-written code, thus providing application-specific
software.
This is exactly what OAuth 2.0 does! It defines a generic model or a framework for a
resource owner to delegate access to a resource they own, to a client application. So, the
client application can access the resource on behalf of the resource owner. However, when
you implement OAuth 2.0 for your client applications, you need to pick what is best for you
(OAuth 2.0 presents multiple options). If you are in the financial domain, for example, you
can follow the best practices and guidelines defined by the Financial-grade API (FAPI)
working group (https://openid.net/wg/fapi/).
FAPI guidelines are most relevant to companies in the financial domain, and their use is
recommended by the UK Open Banking standard. 2 However, as a best practice, you can
follow the FAPI guidelines even if you develop applications outside the financial domain. Apart
from FAPI, the IETF OAuth working group also has developed a set of RFCs to support OAuth
2.0 implementations, which we discuss in section 2.7.
Going back to the definition of a framework, which we discussed before, a framework
presents a set of extensions for applications to extend the framework capabilities based on
the application specific needs. You can find multiple extension points in OAuth 2.0. For
example, the grant types (section 2.4), scopes (section 2.5) and token types (section 2.6)
are a set of OAuth 2.0 extensions that we discuss later in this chapter.

2 Many European countries follow the UK Open Banking (https://standards.openbanking.org.uk/) standard to implement the support for Payment
Services Directive 2 (PSD2). PSD2 is a data and technology-driven directive that aims to drive increased competition, innovation and transparency
across the European payments market.


2.2 Actors of an OAuth 2.0 flow


In section 2.1.2 we briefly discussed the different actors in a typical OAuth 2.0 flow. In
this section we delve deeper into each of those actors, and you’ll learn the role each of them
plays under OAuth 2.0. There are four actors in OAuth 2.0, based on the role each plays in an
access delegation flow (see figure 2.2).
• The resource server
• The client
• The resource owner (also known as the end user)
• The authorization server

In a typical access delegation flow, a client application accesses a resource that’s hosted on a
resource server on behalf of a resource owner with a token provided by an authorization
server. This token grants access rights to the client to access a resource on behalf of the
resource owner.

Figure 2.2 In a typical OAuth 2.0 access delegation flow, a client accesses a resource that’s hosted on a
resource server, on behalf of the end user, with a token provided by the authorization server.

2.2.1 The role of the resource server


The resource server hosts the resources and decides who can access which resources based
on certain conditions. If we take Flickr, the famous image- and video-hosting service, all the
images and videos that you upload to Flickr are resources. Because Flickr hosts them all,
Flickr is the resource server. In the Facebook example we discussed in section 2.1.2, the
server that hosts the Facebook API is the resource server. The Facebook wall, the friends list,
videos, and photos are the resources exposed by the Facebook API.


2.2.2 The role of the client application


The client is the consumer of the resources. It is the entity in an OAuth flow that seeks the
end user’s approval to access a resource on their behalf. If we extend the same Flickr
example that we discussed in section 2.2.1, a web application that wants to access your
Flickr photos is a client. It can be any kind of application: a mobile, web, or even a
desktop application. In the Facebook example we discussed in section 2.1.2, the third-party
application that wanted to read Facebook status messages is a client application.
The application developers first have to register the client applications at the
authorization server (section 2.2.4). Different authorization servers provide their own
interfaces (both graphical and API) to register applications. Also, the RFC 7591: OAuth 2.0
Dynamic Client Registration Protocol (https://tools.ietf.org/html/rfc7591) provides a
standard interface for client registration. Once you register an application at the
authorization server, you get a client identifier and optionally a secret, based on the type of
your client application. In section 2.4 we discuss client types in detail.

2.2.3 The role of the resource owner


The resource owner is the one who owns the resources. In our Flickr example, you’re the
resource owner (or the end user) who owns your Flickr photos. In the Facebook example we
discussed in section 2.1.2, the Facebook user is the resource owner. In some cases, the
client application itself can be the resource owner, which simply accesses a resource, just as
itself with no other party involved (see section 2.3.1 for a use case).

2.2.4 The role of the authorization server


In an OAuth 2.0 environment, the authorization server issues tokens (known as access
tokens). An OAuth 2.0 token is a secret issued by an authorization server to a client
application to access a resource (for example, a microservice or an API) on behalf of a
resource owner. The resource server talks to the authorization server to validate the tokens
that come along with the access requests (figure 2.2). The authorization server should know
how to authenticate the end user or the resource owner, as well as to validate the identity of
the client application, before issuing an access token.

2.3 A grant type defines a protocol to request an access token


In this section, you’ll learn about OAuth 2.0 grant types and how to pick the correct grant
type for your client application. The way an application gets an access token to access a
resource on behalf of a user depends on the characteristics of the corresponding application.
An OAuth 2.0 grant type defines a request/response flow for a client application to get an
access token from the authorization server.
The OAuth 2.0 RFC identifies four main grant types and the refresh token grant type.
Each grant type outlines the process for obtaining an access token. The result of executing a
particular grant type is an access token that can be used to access a resource on behalf of
the resource owner (end user). The following are the four main grant types, along with the
refresh token grant type, highlighted in the OAuth 2.0 specification:


• Client credentials—Suitable for client applications that directly access resources with
no end-users. The client application itself is the resource owner (we discuss this in
section 2.3.1). A Weather App (the client application), for example, can use the client
credentials grant type to obtain an access token, and use it to access the Weather API
(the resource).
• Resource owner password—Suitable for applications the authorization server trusts
(we discuss this in section 2.3.2). This should be avoided at all costs, and the
OAuth 2.1 specification has removed the resource owner password grant
type. 3
• Authorization code—Suitable for almost all the applications with an end user (we
discuss this in section 2.3.4)
• Implicit—Don’t use it! (we discuss this in section 2.3.5). The OAuth 2.1
specification has removed the implicit grant type.
• Refresh token—Used for renewing expired access tokens. The refresh token grant
type is a little different from the other four grant types listed above, and we discuss the
differences in section 2.3.3.

The OAuth 2.0 framework isn’t restricted to these five grant types. It’s an extensible
framework that allows you to add grant types as needed. The following are two other popular
grant types that aren’t defined in the OAuth 2.0 RFC but in related profiles:
• SAML Profile for OAuth 2.0 Client Authentication and Authorization Grants —Suitable
for applications having single sign-on using SAML 2.0 (defined in RFC 7522).
• JWT Profile for OAuth 2.0 Client Authentication and Authorization Grants —Suitable for
applications having single sign-on using OpenID Connect (defined in RFC 7523).

Typically, a grant type defines four key components:


• authorization request,
• authorization response,
• access token request, and
• access token response.
The authorization endpoint of the authorization server handles the authorization
request. The token endpoint of the authorization server handles the access token request.
Both the authorization and token endpoints are special endpoints the OAuth 2.0 RFC defines.
Not all the grant types implement all four key components we mentioned above. In the
following sections you’ll learn how each grant type implements these components. The
authorization code grant type, for example, implements all four, while the implicit grant type
only implements the authorization request and authorization response. Both the client
credentials grant type and the resource owner password grant type implement only the access
token request and access token response components.
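The mapping above can be summarized as a small sketch in code form. This is just a restatement of the text, not part of any OAuth library:

```python
# Which of the four key components each OAuth 2.0 grant type implements,
# as described in the text above.
components = {
    "authorization_code": ["authorization request", "authorization response",
                           "access token request", "access token response"],
    "implicit":           ["authorization request", "authorization response"],
    "client_credentials": ["access token request", "access token response"],
    "password":           ["access token request", "access token response"],
}

for grant, parts in components.items():
    print(f"{grant}: {', '.join(parts)}")
```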

3 The OAuth 2.1 specification is in draft at the time of this writing; whenever we mention OAuth 2.1 in the rest of this chapter, we refer to the draft
specification, which is available at https://www.ietf.org/archive/id/draft-ietf-oauth-v2-1-00.html.


2.3.1 Client credentials grant type


In this section you’ll learn how the client credentials grant type works and its use cases. With
this grant type we have only two participants in the process of obtaining an access token:
the client application and the authorization server. There’s no separate resource owner; the
client application itself is the resource owner.
Each client application carries its own credentials, called the client ID and the client
secret, issued to it by the authorization server. The client ID is the identifier of the client
application; the client secret is the client application’s secret key. The client application
should securely store and use the client secret. For example, you should never store a client
secret in cleartext; instead, encrypt it and store it in persistent storage (such as a database).
Also, when the client application uses the client secret to authenticate to the authorization
server, it must do it over HTTPS (or TLS). That’ll prevent any man-in-the-middle attacks.
Instead of using text as the client secret, you can also use any other mechanism that is
supported by the corresponding authorization server for client authentication. One popular
option we see in the industry is mutual TLS. Mutual TLS (mTLS) is a stronger
authentication option, as you do not send the credentials over the wire while using mTLS.
When you use text as the client secret, you need to send it along with each request to the
token endpoint of the authorization server to get an access token.
However, since mutual TLS is based on asymmetric encryption, the client application
never sends the keys with the request; rather, it proves that it owns the corresponding private
key by signing some part of the message in the TLS handshake. In fact, the OAuth 2.1
specification recommends using an asymmetric method for client authentication. 4
As shown in figure 2.3, in the client credentials grant type, the client application has to
send its client ID and client secret to the authorization server over HTTPS to get an access
token. The authorization server validates the combination of the ID and secret and responds
with an access token.

Figure 2.3 The client credentials grant type lets an application obtain an access token with no end user; the
application itself is the end user.

4 RFC 8705, OAuth 2.0 Mutual-TLS Client Authentication and Certificate-Bound Access Tokens, is a specification by the IETF OAuth working group that
standardizes the usage of mutual TLS for client authentication: https://www.rfc-editor.org/rfc/rfc8705.html.


Here’s a sample curl command for the client credentials grant request (this is just a sample,
so don’t try it out as is):

Listing 2.1 A sample token request with client credentials grant type
\> curl \
-u application_id:application_secret \ #A
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials&scope=create_customer update_payment" \
https://localhost:8085/oauth/token
#A The client application uses HTTP Basic Authentication with its client_id and client_secret.
In listing 2.1, the value application_id is the client ID, and the value application_secret
is the client secret of the client application. The -u parameter instructs curl to base64-encode
the string application_id:application_secret. The resulting base64-encoded string, sent
as the HTTP Authorization header to the authorization server, is
YXBwbGljYXRpb25faWQ6YXBwbGljYXRpb25fc2VjcmV0.
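You can verify that encoding yourself. The following snippet reproduces what curl’s -u option does: it base64-encodes the client_id:client_secret pair and places the result in an HTTP Basic Authorization header:

```python
import base64

client_id = "application_id"
client_secret = "application_secret"

# curl -u does exactly this: base64("client_id:client_secret")
credentials = f"{client_id}:{client_secret}".encode("ascii")
encoded = base64.b64encode(credentials).decode("ascii")

# The value that goes into the HTTP Authorization request header
auth_header = f"Basic {encoded}"
print(encoded)  # YXBwbGljYXRpb25faWQ6YXBwbGljYXRpb25fc2VjcmV0
```

Note that base64 is an encoding, not encryption; this is why the request must always travel over TLS.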
Even though we use a client secret (application_secret) in the curl command in
listing 2.1 to authenticate the client application to the token endpoint of the authorization
server, the client application can use mTLS instead (see listing 2.2), if stronger
authentication is required. In that case, we need to have a public/private key pair for the
client application, and the authorization server must trust the issuer of the public key or the
client certificate.
Listing 2.2 A sample token request with client credentials grant type with mutual TLS
\> curl \
--cert client.crt \ #A
--key client.key \ #B
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials&scope=create_customer update_payment" \
https://localhost:8085/oauth/token
#A The public certificate corresponding to the client application’s private key.
#B The private key of the client application.
In addition to client authentication, the token requests in listings 2.1 and 2.2
also carry the following parameters:
• The grant_type parameter is a required parameter for all the grant types and for the
client credentials grant type, its value must be client_credentials.
• Optionally, you can specify the expected purpose of the access token with the
scope parameter. This is not specific to the client credentials grant type; you can pass the
scope parameter with the token request or the authorization request under all the
grant types. If a given grant type has an authorization request (for example, the
authorization code grant type and the implicit grant type), the scope parameter goes
with the authorization request; if not, it goes with the token request. As in listing 2.2,
you can specify multiple values for the scope parameter, each separated
by a space. In section 2.5 you’ll learn more about the scope parameter.
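To make the wire format concrete, here is a small sketch of how the form-encoded body of the token request in listing 2.1 could be built. The values are the sample ones from the listing; a real client would POST this body to the token endpoint over HTTPS:

```python
from urllib.parse import urlencode

# The application/x-www-form-urlencoded body from listing 2.1.
# Multiple scope values go into a single, space-separated scope parameter;
# urlencode encodes the space as '+'.
body = urlencode({
    "grant_type": "client_credentials",
    "scope": "create_customer update_payment",
})
print(body)  # grant_type=client_credentials&scope=create_customer+update_payment
```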


The authorization server validates the request in listing 2.2 and issues an access token as in
the following HTTP response:

Listing 2.3 A sample token response from the authorization server for client credentials grant
type
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"token_type":"bearer",
"scope":"create_customer",
"expires_in":3599
}

The following lists the parameters included in the token response from the authorization
server, as shown in listing 2.3:
• The access_token parameter carries the access token the authorization server issues.
This is a required parameter.
• The token_type parameter specifies the type of the token. In practice, at the time of
this writing, most OAuth 2.0 implementations use bearer tokens; in section
2.6 we discuss token types in detail. This is a required parameter.
• The expires_in parameter hints to the client application the lifetime of the access
token, in seconds from the time the token is issued. This parameter is not required,
but it is recommended. In the absence of the expires_in parameter in the response, the
client application should have some other means of finding the token expiration. Otherwise,
the client can simply keep using the access token to access the corresponding resource;
if the token is expired, the resource server responds with an error (HTTP
401 status code), and at that point the client application can request a new token
from the authorization server. We discuss an approach to request a new token in
section 2.3.3.
• The scope parameter in the response carries the scopes associated with the issued
access token. This is required in the token response only if the authorization server
issued the access token for a subset of the requested scopes. If the
access token is issued for the same set of scopes requested by the client application,
it’s optional to have the scope parameter in the token response.
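As a sketch of how a client might consume that response, the snippet below parses the JSON from listing 2.3 and turns the relative expires_in lifetime (given in seconds) into an absolute expiry timestamp the client can check before each API call:

```python
import json
import time

# The token response from listing 2.3
response_body = """{
  "access_token": "de09bec4-a821-40c8-863a-104dddb30204",
  "token_type": "bearer",
  "scope": "create_customer",
  "expires_in": 3599
}"""

token = json.loads(response_body)

# expires_in is a lifetime in seconds; convert it to an absolute timestamp
# so the client can tell whether the token is still usable.
expires_at = time.time() + token["expires_in"]

def is_expired(now: float) -> bool:
    return now >= expires_at

print(token["token_type"], token["scope"])  # bearer create_customer
```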

The client credentials grant type is suitable for applications that access APIs and that don’t
need to worry about an end user. Simply put, it’s good when you need not be concerned
about access delegation, or in other words, the client application accesses an API just by
being itself, not on behalf of anyone else. Because of this, the client credentials grant type is
mostly used for system-to-system authentication when an application, a periodic task, or any
kind of system directly wants to access an endpoint that is protected with OAuth 2.0.
Let’s take a weather API, for example. It provides weather predictions for the next five
days. If you build a web application to access the weather API, you can simply use the client
credentials grant type because the weather API isn’t interested in knowing who uses your
application. It is concerned with only the application that accesses it, not the end user.


2.3.2 Resource owner password grant type


In this section you’ll learn how the resource owner password grant type works and its use
cases. This grant type is an extension of the client credentials grant type, and it adds support for
resource owner authentication with the end user’s username and password. The resource
owner password grant type involves all four parties in the OAuth 2.0 flow—resource owner
(end-user), client application, resource server, and authorization server.
The resource owner provides the client application their username and password. The
client application uses this information to make a token request to the authorization server,
along with the client ID and client secret (the credentials of the application) embedded within
itself (or using some other means of client authentication, as we discussed in section 2.3.1).
Figure 2.4 illustrates the resource owner password grant type.

Figure 2.4 The password grant type allows an application to obtain an access token.

The following is a sample curl command for a token request following the resource owner
password grant type made to the authorization server (this is just a sample, so don’t try it
out as is):

Listing 2.4 A sample token request with cURL under resource owner password grant type
\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=password&username=user&password=pass" \
https://localhost:8085/oauth/token

As with the client credentials grant type, the application_id and application_secret are
sent in base64-encoded form in the HTTP Authorization header, using HTTP Basic
authentication. In this case, the authorization server validates not only the client ID and


secret (application_id and application_secret) to authenticate the client application, but
also the user’s credentials. The issuance of the token happens only if all four fields are valid.
In addition to client authentication, the token request in listing 2.4 also carries the
following parameters:
• The request body contains the grant_type parameter, and its value must be set to
password. This is a required parameter.
• The username parameter carries the resource owner’s username. This is a required
parameter.
• The password parameter carries the resource owner’s password. Note that because
you’re passing sensitive information in plaintext format in the request header and the
body, the communication must happen over TLS (HTTPS). Otherwise, any intruder
into the network would be able to see the values being passed.
• As discussed in the section 2.3.1 the token request can optionally include the scope
parameter, which carries one or more identifiers that are known to the authorization
server. The scope parameter defines the expected purpose of the access token.
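Mirroring listing 2.4, here is a sketch of the form-encoded body of a password grant token request. The values are the sample ones from the listing; remember that the whole request must travel over TLS, because the resource owner’s credentials are in the body:

```python
from urllib.parse import urlencode

# The application/x-www-form-urlencoded body from listing 2.4.
# username and password are the resource owner's credentials; the client's
# own credentials travel separately, in the HTTP Basic Authorization header.
body = urlencode({
    "grant_type": "password",
    "username": "user",
    "password": "pass",
})
print(body)  # grant_type=password&username=user&password=pass
```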

As with the client credentials grant type, upon successful authentication, the authorization
server responds with a valid access token, as shown in the following listing. Except for the
refresh_token parameter in the response, we discussed the meaning of all the other
parameters in section 2.3.1:

Listing 2.5 A sample token response from the authorization server under resource owner
password grant type
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"refresh_token":" heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3",
"token_type":"bearer",
"expires_in":3599
}

The value of the refresh_token parameter you find in the response (listing 2.5) can be used
to renew the current access token before it expires. We discuss refresh tokens in detail in
section 2.3.3. You might have noticed that we didn’t get a refresh_token with the client
credentials grant type. You only need a refresh_token to refresh a given access token
offline, when the resource owner is not present or when the client application has no
interactions with the resource owner. But with the client credentials grant type, the client
application itself is the resource owner, and the client application has access to the resource
owner’s (the application’s own) credentials all the time. So, to refresh an access token, the
client application does not need another token; it can simply do that with its original
credentials. That’s why the authorization server does not return a refresh_token for
the client credentials grant type.
With the password grant type, the resource owner (user of the application) needs to
provide their username and password to the client application. Therefore, this grant type
should be used only with client applications that are trusted by the authorization server. This
model of access delegation is called access delegation with credential sharing. It is, in fact,


what OAuth 2.0 wanted to avoid. Then why is it in the OAuth 2.0 specification? The only reason the password grant type was introduced in the OAuth 2.0 specification was to help legacy applications using HTTP Basic authentication migrate to OAuth 2.0. Otherwise, you should avoid the password grant type where possible, and OAuth 2.1 has officially removed the password grant type from the specification.
As with the client credentials grant type, the password grant type requires the application
to store the client secret securely. It’s also critically important to deal with the user
credentials responsibly. Ideally, the client application must not store the end user's password locally; the password should be used only to get an access token from the authorization server and then forgotten. The access token the client application gets as the response to the token request
has a limited lifetime. Before this token expires, the client application can get a new token by
using the refresh_token received in the token response from the authorization server. This
way, the client application doesn’t have to prompt for the user’s username and password
every time the token on the application expires.
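The proactive-renewal behavior described above takes only a little client-side bookkeeping. The following Python sketch is illustrative (the class name and the 60-second safety margin are my choices, not from the specification); it tracks the token's absolute expiry so the client knows when to use its refresh token instead of prompting the user again:

```python
import time

class TokenStore:
    """Tracks an access token's absolute expiry so the client can
    refresh proactively instead of re-prompting the user to log in."""

    def __init__(self, access_token, expires_in, refresh_token, now=time.time):
        self._now = now  # injectable clock, useful for testing
        self.access_token = access_token
        self.refresh_token = refresh_token
        # expires_in from the token response is relative (seconds);
        # convert it to an absolute timestamp once, at receipt time.
        self.expires_at = now() + expires_in

    def needs_refresh(self, skew=60):
        # Refresh a little early (skew seconds) so we never send a
        # token that expires while the request is in flight.
        return self._now() >= self.expires_at - skew
```

The injectable `now` clock is a small design choice that makes the expiry logic testable without waiting an hour.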

2.3.3 Refresh token grant type


In this section you'll learn how the refresh token grant type works and its use cases. The refresh token grant type is used to renew an existing access token, which makes it a bit different from the other OAuth 2.0 grant types. The other four grant types are used to get a new access token from the authorization server, while a client application uses the refresh token grant type to renew an access token it already has.
Typically, the refresh token grant type is used when the current access token expires or
is near expiry, and the client application needs a new access token to work with without
having to prompt the user of the application to log in again. To use the refresh token grant type, the application should have received an access token and a refresh token in the token response from the authorization server.
Not every grant type issues a refresh token along with its access token. For example, both
the client credentials grant type and the implicit grant type (discussed later in section 2.3.5)
do not include a refresh token. Therefore, the refresh token grant type is a special grant type
that can be used only with applications that use other grant types to obtain an access token.
Figure 2.5 illustrates the refresh token grant flow.


Figure 2.5 The refresh token grant type allows a token to be renewed when it expires.

The following curl command can be used to renew an access token with the refresh token
grant type (this is just a sample, so don’t try it out as is):

Listing 2.6 A sample cURL request with refresh token grant type
> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=refresh_token&
refresh_token=heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3" \
https://localhost:8085/oauth/token

As we discussed in sections 2.3.1 and 2.3.2, the application's client ID and client secret
(application_id and application_secret) must be sent in base64-encoded format as the
HTTP Authorization header, using HTTP Basic authentication.
In addition to client authentication, the token refresh request in listing 2.6 also carries the following parameters:
• The request body contains the grant_type parameter, and its value must be set to refresh_token. This is a required parameter.
• The refresh_token parameter carries the value of a valid refresh token the client application already has. This is a required parameter. The refresh token grant type should be used only with applications that can store the client secret and refresh token values securely, without any risk of compromise. You will learn in section 2.3.5 that the implicit grant type cannot store tokens securely, so client applications that use the implicit grant type do not get a refresh token.


• The refresh token request can optionally include the scope parameter, which carries one or more identifiers that are known to the authorization server. The value of the scope parameter must be equal to the scope of the original access token, or to a subset of the scopes attached to the access token you want to refresh. Then again, why would we refresh an access token for only a subset of the scopes attached to the original token? In practice, we do this to avoid over-scoped access tokens. It's too early to discuss what this really means; you'll find more details about over-scoped access tokens in chapter 10.
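The subset rule above is easy to express, because OAuth 2.0 scope values are space-delimited strings (for example, "read write"). The following Python sketch (the function name is illustrative) shows the check an authorization server might apply to the scope parameter of a refresh request:

```python
def is_valid_refresh_scope(original_scope, requested_scope):
    """Returns True if the requested scope is equal to, or a subset of,
    the scope attached to the original access token. Both arguments are
    space-delimited OAuth 2.0 scope strings, e.g. "read write"."""
    original = set(original_scope.split())
    requested = set(requested_scope.split())
    return requested <= original  # subset-or-equal check on scope sets
```

Parsing into sets makes the comparison order-insensitive, which matches how scope strings are meant to be interpreted.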

The refresh token usually has a limited lifetime, but it's generally much longer than the access token's lifetime, so an application can renew its access token even after a significant duration of idleness. When you refresh an access token, the authorization server sends the renewed access token in the response, along with a refresh token. This refresh token may or may not be the same refresh token you got in the initial token response from the authorization server. It's up to the authorization server; this behavior is not governed by the OAuth 2.0 specification.
The following listing shows the response from the authorization server for an access token refresh request. This is the same response we discussed under listing 2.5 in section 2.3.2.

Listing 2.7 A sample response for the access token refresh request
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"refresh_token":"heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3",
"token_type":"bearer",
"expires_in":3599
}
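Because the specification leaves refresh-token rotation up to the authorization server, a client should always keep whatever refresh token the response carries and fall back to the old one otherwise. A minimal Python sketch (the helper name is illustrative):

```python
def apply_token_response(store, response):
    """Updates stored tokens from a parsed token-endpoint JSON response.
    The server may rotate the refresh token on each refresh, so keep
    the new one when present and fall back to the old one otherwise."""
    store["access_token"] = response["access_token"]
    # refresh_token is optional in a refresh response: some servers
    # return the same value, some rotate it, and some omit it entirely.
    if "refresh_token" in response:
        store["refresh_token"] = response["refresh_token"]
    return store
```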

2.3.4 Authorization code grant type


In this section you'll learn how the authorization code grant type works and its use cases. The authorization code grant type is used with web applications (both single-page applications and server-side applications accessed via a web browser), native mobile applications, and desktop applications that are capable of handling HTTP redirects. In
the authorization code grant type, the client application first initiates an authorization
request to the authorization server. This request provides the client ID of the application and
a redirect URL to redirect the user when authentication is successful. Figure 2.6 illustrates
the flow of the authorization code grant.


Figure 2.6 The authorization code grant type allows a client application to obtain an access token on behalf of
an end user (or a resource owner).

As shown in figure 2.6, the first step of the client application is to initiate the authorization
code request. The HTTP request to get the authorization code looks like the following (this is
just a sample, so don’t try it out as is):

Listing 2.8 A sample authorization request to the authorization server under authorization code
grant type
GET https://localhost:8085/oauth/authorize?
response_type=code&
client_id=application_id&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fweb.application.domain%2Flogin

The authorization request in listing 2.8 carries the following parameters:
• The response_type parameter indicates to the authorization server that an
authorization code is expected as the response to this request. This parameter is only
included in the authorization requests from the client application to the authorization
endpoint of the authorization server. In the client credentials grant type and the resource owner password grant type there is no authorization request, only a token request from the client application to the token endpoint of the authorization server. That's why you didn't see the response_type parameter in those two grant types. OpenID Connect heavily uses the response_type parameter, and we discuss it in detail in chapter 3. In the authorization code grant type, the value of the response_type parameter must be code, and it's a required parameter.


• The client_id parameter carries the identifier given to the client application by the
authorization server. This request only includes the client_id and no client secret. In
this step, the authorization server does not authenticate the client application but rather uses the client_id to identify the application and load the configuration related to it. This is a required parameter.
• The redirect_uri in the request should be equal to the redirect_uri provided when
registering the corresponding client application at the authorization server. This is an
optional parameter. If the client application does not provide a redirect_uri in the
authorization request, the authorization server picks the already registered
redirect_uri. However, a client application can register multiple redirect_uris at
the authorization server, and pick one of them by sending the preferred
redirect_uri in the authorization request. Also, some authorization servers support registering a regular expression as the redirect_uri; in that case, the redirect_uri in the request must match the registered regular expression.
• One optional parameter that’s not included in listing 2.8 is the scope parameter.
When making the authorization request, the application can request the scopes it
requires on the token to be issued. We discuss scopes in detail in section 2.5.
• Another optional parameter that’s not included in listing 2.8 is the state parameter.
The client application can send any value in the state parameter, and can expect the
same value in the response from the authorization server. Although this is an optional parameter, its use is recommended, and in chapter 10 we discuss how a client application can use the state parameter to mitigate possible cross-site request forgery attacks.
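Putting the parameters above together, the front-channel authorization request is just a URL. The following Python sketch (the helper name is illustrative) builds a request like the one in listing 2.8, and also generates a random state value, which the client must keep — in the user's session, for example — to compare against the response:

```python
import secrets
from urllib.parse import urlencode

def build_authorization_url(authorize_endpoint, client_id, redirect_uri, scope=None):
    """Builds the authorization-code-grant front-channel URL.
    Returns the URL and the random state value; the client must keep
    the state to verify the echoed value in the authorization response."""
    state = secrets.token_urlsafe(16)  # unguessable CSRF-mitigation value
    params = {
        "response_type": "code",     # ask for an authorization code
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "state": state,
    }
    if scope:
        params["scope"] = scope      # optional: space-delimited scopes
    return authorize_endpoint + "?" + urlencode(params), state

url, state = build_authorization_url(
    "https://localhost:8085/oauth/authorize",
    "application_id",
    "https://web.application.domain/login")
```

Using urlencode takes care of percent-encoding the redirect_uri, producing the %3A%2F%2F escaping you see in listing 2.8.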

Upon receiving the authorization request in listing 2.8, the authorization server first validates
the client ID and the redirect_uri; if these parameters are valid, it presents the user with
the login page of the authorization server (assuming that no valid user session is already
running on the authorization server). The user needs to enter their username and password
on this login page. When the username and password are validated, the authorization server
issues the authorization code and provides it to the user agent via an HTTP redirect (figure 2.6). The authorization code is appended to the redirect_uri as a query parameter. The following listing shows the response from the authorization server for the authorization request in listing 2.8.

Listing 2.9 A sample authorization response under authorization code grant type
https://web.application.domain/login?code=hus83nn-8ujq6-7snuelq

The authorization response in listing 2.9 carries the following parameters:


• The code parameter in the response carries the authorization code the authorization
server generates. The code is sent to the user agent via the redirect_uri, and it
must be passed over HTTPS. Also, because this is a browser redirect, the value of the
authorization code is visible to the end user, and also may be logged in server logs.
To reduce the risk that this data will be compromised, the authorization code usually
has a short lifetime (no more than 30 seconds) and is a one-time-use code. The client
application later uses the authorization code (before it expires) to talk to the token
endpoint of the authorization server to get an access token. If the code is used more
than once, the authorization server revokes all the tokens previously issued against it.
This is a required parameter.
• If the authorization request (listing 2.8) had the optional state parameter in it, then
the authorization server must include the same value in the response, as a query
parameter.

As shown in listing 2.9, the authorization code is provided as a query parameter in an HTTP redirect
(https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections) on the provided
redirect_uri. The redirect_uri is the location to which the authorization server should
redirect the browser (user agent) upon successful authentication.
In HTTP, a redirect happens when the server sends a response code in the 3xx range; in this case, the response code is 302. The response contains an HTTP header named
Location, and the value of the Location header is the URL to which the browser should
redirect. The URL (host) in the Location response header should be equal to the
redirect_uri query parameter (listing 2.8) in the HTTP request used to initiate the
authorization grant flow:

Location: https://web.application.domain/login?code=hus83nn-8ujq6-7snuelq
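On the client application's side, handling this redirect amounts to parsing the query string of the Location URL and, if a state value was sent in the authorization request, verifying that the echoed value matches. A Python sketch (the function name is illustrative):

```python
from urllib.parse import urlparse, parse_qs

def extract_authorization_code(location, expected_state=None):
    """Parses the Location header of the authorization response and
    returns the authorization code. If the client sent a state value
    in the authorization request, the echoed state must match it."""
    query = parse_qs(urlparse(location).query)
    if expected_state is not None:
        returned = query.get("state", [None])[0]
        if returned != expected_state:
            # Mismatched (or missing) state: reject the response.
            raise ValueError("state mismatch: possible CSRF attempt")
    return query["code"][0]
```

Remember that the code is short-lived and single-use, so the client should exchange it for an access token right away.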

Upon receiving the authorization code, the client application issues a token request to the token endpoint of the authorization server, requesting an access token in exchange for the authorization code. The following is a curl command of such a request (step 6 in figure 2.6):

Listing 2.10 A token request under authorization code grant type


> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=authorization_code&
code=hus83nn-8ujq6-7snuelq&
client_id=application_id&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fweb.application.domain%2Flogin" \
https://localhost:8085/oauth/token

Like the other grant types discussed so far, the authorization code grant type requires the
client ID and client secret (optional) to be sent as an HTTP Authorization header in base64-
encoded format using HTTP Basic authentication. Also, as discussed in section 2.3.1, the
client application can use a much stronger form of client authentication. In addition to client authentication, the token request in listing 2.10 also carries the following parameters:


• The request body contains the grant_type parameter, and its value must be set to
authorization_code. This is a required parameter.
• The code parameter is the authorization code the client application got from the authorization server, as in listing 2.9. This is a required parameter.
• The value of the client_id parameter is the same client identifier that we had in the authorization request in listing 2.8. This is a required parameter. Then again, why do we need to include the client_id parameter in the HTTP POST body when we are already sending it in the HTTP Authorization header for client authentication? Client authentication need not happen with the client ID and client secret all the time. As mentioned before, the client application can pick a stronger authentication option that is supported by the authorization server. So, the client_id parameter carried in the HTTP POST body of the token request helps the authorization server identify the client application irrespective of the authentication method the client uses.
• The value of the redirect_uri must be identical to the value of the redirect_uri
parameter we had in the authorization request in listing 2.8. If the client application
added the redirect_uri to the authorization request, it must add the same to the
token request as well.
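The body of this token request can be sketched as follows (a Python sketch; the helper name is illustrative). Note how the client_id and redirect_uri from the authorization request reappear alongside the code:

```python
from urllib.parse import urlencode

def build_code_exchange_body(code, client_id, redirect_uri):
    """Form-encoded body for exchanging an authorization code for an
    access token at the token endpoint. client_id is repeated in the
    body so the server can identify the client regardless of the
    authentication method; redirect_uri must be identical to the one
    sent in the authorization request."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "redirect_uri": redirect_uri,
    })

body = build_code_exchange_body(
    "hus83nn-8ujq6-7snuelq",
    "application_id",
    "https://web.application.domain/login")
```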

Upon validation of the token request in listing 2.10, the authorization server issues an access
token to the client application in an HTTP response. The following listing shows the same response we discussed under listing 2.5 in section 2.3.2.

Listing 2.11 A token response under authorization code grant type


{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"refresh_token":"heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3",
"token_type":"bearer",
"expires_in":3599
}

As you’ve seen, the authorization code grant type involves the user, client application, and
authorization server. Unlike the password grant type, the authorization code grant type
doesn’t require the user to provide their credentials to the client application. The user
provides their credentials only on the login page of the authorization server. This way, you
prevent the client application from learning the user's login credentials. Therefore, this grant type is suitable for web, mobile, and desktop applications that you don't fully trust with the user's credentials.
A client application that uses the authorization code grant type needs to have some
prerequisites to use it securely. Because the application needs to know and deal with
sensitive information, such as the client secret, refresh token, and authorization code, it
needs to be able to store and use these values with caution. It needs to have mechanisms
for encrypting the client secret and refresh token when storing them, and to use HTTPS for secure communication with the authorization server. The communication between the client application and the authorization server needs to happen over TLS so that network intruders can't see the information being exchanged.


2.3.5 Implicit grant type


In this section you'll learn how the implicit grant type works and its use cases. The implicit grant type is similar to the authorization code grant type, but it doesn't involve the
intermediary step of getting an authorization code before getting the access token. Instead,
the authorization server issues the access token directly in response to the authorization
request. Figure 2.7 illustrates the implicit grant flow.

Figure 2.7 The implicit grant type allows a client application to obtain an access token.

With the implicit grant type, when the user attempts to log in to an application, the client
application initiates the login flow by creating an implicit grant request. This request should
contain the client ID and the redirect_uri. The redirect_uri, as with the authorization
code grant type, is used by the authorization server to redirect the user agent back to the
client application when authentication is successful. The following is a sample authorization
request the client application sends to the authorization endpoint of the authorization server
under the implicit grant type (this is just a sample, so don’t try it out as is):

Listing 2.12 An authorization request under implicit grant type


GET https://localhost:8085/oauth/authorize?
response_type=token&
client_id=application_id&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fweb.application.domain%2Flogin

The authorization request in listing 2.12 carries the following parameters:


• The response_type parameter indicates to the authorization server that an access token is expected as the response to this request. In the implicit grant type, the value of the response_type parameter must be token, and it's a required parameter.
• The client_id parameter carries the identifier given to the client application by the
authorization server. This request only includes the client_id and no client secret. In
this step, the authorization server does not authenticate the client application but rather uses the client_id to identify the application and load the configuration related to it. This is a required parameter.
• The redirect_uri should be equal to the redirect_uri provided when registering
the particular client application at the authorization server. As in the authorization
code grant type, this is an optional parameter.
• One optional parameter that's not included in listing 2.12 is the scope parameter. When making the authorization request, the application can request the scopes it requires on the token to be issued. We discuss scopes in detail in section 2.5.
• Another optional parameter that's not included in listing 2.12 is the state parameter, which we discussed in section 2.3.4.

As you can see in the HTTP requests, the difference between the authorization code grant’s
initial request and the implicit grant’s initial request is the fact that the response_type
parameter in this case is token. This indicates to the authorization server that the client
application is interested in getting an access token as the response to the implicit
authorization request. As with the authorization code grant type, here too scope is an
optional parameter that the user agent (or the client application) can provide to ask the
authorization server to issue a token with the required scopes.
Once the authorization server receives this request (listing 2.12), it validates the client ID and the redirect_uri, and if those are valid, it presents the user with the login page of the authorization server (assuming that no active user session is running on the browser against the authorization server). When the user enters their credentials, the authorization server validates them and, if a scope is provided in the request, presents the user with a consent page to acknowledge that the application is granted the permissions to perform the actions denoted by the scope parameter.
Note that the user provides credentials on the login page of the authorization server, so
only the authorization server gets to know the user’s credentials. When the user has
consented to the required scopes, the authorization server issues an access token and
provides it to the user agent on the redirect_uri itself as a URI fragment. The following is
an example of such a redirect:

Listing 2.13 An authorization response under implicit grant type


https://web.application.domain/login#access_token=jauej28slah2&
expires_in=3599&
token_type=bearer

The response from the authorization server in listing 2.13 carries the same set of parameters we discussed in section 2.3.1 (listing 2.2). The access_token and token_type are required parameters, and optionally the response from the authorization server can

include expires_in, scope and state parameters. If the state parameter was included in
the authorization request, the authorization response must have the state parameter with
the same value as in the request.
When the user agent (web browser) receives this redirect, it makes an HTTPS request to
the web.application.domain/login URL. Because the access_token field is provided as a
URI fragment (denoted by the # character in the URL), that particular value doesn’t get
submitted to the server on web.application.domain. Only the authorization server that
issued the token and the user agent (web browser) get to know the value of the access
token. The implicit grant type doesn’t provide a refresh token to the user agent. As we
discussed earlier in this section, because the value of the access token is passed in the URL,
it will be in the browser history and also possibly logged into server logs.
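Extracting the token on the client side therefore means parsing the URI fragment rather than the query string. A Python sketch (the function name is illustrative):

```python
from urllib.parse import urlparse, parse_qs

def parse_implicit_response(redirect_url):
    """Extracts the token parameters from the URI fragment of an
    implicit grant response. The fragment (everything after '#') stays
    in the browser and is never sent to the server hosting the
    redirect_uri."""
    fragment = urlparse(redirect_url).fragment
    # parse_qs returns lists; each parameter appears once, so unwrap.
    return {key: values[0] for key, values in parse_qs(fragment).items()}
```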
The implicit grant type doesn’t require your client application to maintain any sensitive
information, such as a client secret or a refresh token. This fact made it a good candidate for
use in SPAs, where rendering the content happens on web browsers through JavaScript.
These types of applications execute mostly on the client side (browser); therefore, these
applications are incapable of handling sensitive information such as client secrets. But still, the security concerns of using the implicit grant type outweigh its benefits, and it's no longer recommended, even for SPAs. OAuth 2.1 has removed the implicit grant type from the specification. As discussed in the previous section, the recommendation is to use the authorization code grant type with no client secret, even for SPAs. In chapter 10 we discuss in detail the security concerns associated with the implicit grant type.

2.4 Public clients vs. confidential clients


In this section we discuss the types of client applications defined in the OAuth 2.0 RFC and
their characteristics. Also, you’ll learn in this section the new changes OAuth 2.1 has
proposed with respect to the client types.
OAuth 2.0 defines two types of clients, based on their ability to manage secrets that a
client application uses to authenticate to an authorization server. Confidential clients are applications that can manage their secrets securely, while all other applications, which are incapable of managing their own secrets securely, fall under the public client category.
A server-side web application, for example, is capable of managing its own secrets
securely. It can use either the client ID/client secret pair or mutual TLS to authenticate to the authorization server. So, a server-side web application is an example of a confidential client. However, a SPA running in the browser is not capable of securely handling its secrets. The SPA initiates all requests to the authorization server directly from the browser, and any credentials you keep in the browser are visible to the end user. So, a SPA is an example of a public client. A native mobile application is another example of a public client. You can't hide credentials in a native mobile application; a user having root-level access to the device will be able to discover any secrets you hide. However, there are techniques to make it harder for a user or an attacker to discover hidden secrets in a native mobile application.


In contrast to OAuth 2.0, the OAuth 2.1 specification proposes three types of clients: confidential, credentialed, and public. Both the confidential and public client types carry the same meaning we discussed before with respect to OAuth 2.0. At the time OAuth 2.0
defined the confidential client type, it was assumed that the authorization server issues these
credentials to a client application during the application registration process, with proper
verification. And most of the time, given a client id, the authorization server knows the
application developer who registered the client application.
But with the introduction of RFC 7591: OAuth 2.0 Dynamic Client Registration Protocol (https://tools.ietf.org/html/rfc7591), one can register a client application dynamically and obtain a client ID and a client secret. An authorization server that supports RFC 7591 provides an endpoint to dynamically register client applications, and this endpoint can be protected or open. If it's open, anyone having access to the environment can register an application.
The OAuth 2.1 specification proposes to differentiate client applications that carry verified secrets (the confidential clients) from client applications that register themselves with an open dynamic client registration endpoint. To identify the latter type of clients, OAuth 2.1 introduced a new client type called credentialed. Credentialed clients have credentials (a client ID and a client secret); however, the authorization server has not verified the identity of the client application before issuing them.

2.5 Scopes bind capabilities to an OAuth 2.0 access token


Each access token that an authorization server issues is associated with one or more scopes.
In this section we’ll delve deep into OAuth 2.0 scopes and you’ll learn what is a scope and
how an authorization server associates one or more scopes to an access token it issues.
A scope defines the purpose of a token. A token can have more than one purpose; hence,
it can be associated with multiple scopes. A scope defines what the client application can do
at the resource server with the corresponding token.
When a client application requests a token from the authorization server, along with the
token request, it also specifies the scopes it expects from the token (see figure 2.8). That
doesn’t necessarily mean the authorization server has to respect that request and issue the
token with all requested scopes. An authorization server can decide on its own, also with the
resource owner’s consent, which scopes to associate with the access token. In the token
response, it sends back to the client application the scopes associated with the token, along
with the token. In case the client application didn’t send the scope parameter either to the
authorization endpoint or to the token endpoint (based on the grant type), the authorization
server can associate the token it issues with a default scope. How to associate a default scope with an access token is outside the scope of the OAuth 2.0 specification, and different authorization servers follow their own ways of doing it.


Figure 2.8 The client application requests an access token along with the expected set of scopes. When the
access token is a self-contained JWT, the resource server validates the token by itself, without talking to the
authorization server.

2.6 Token types


In this section you'll learn what token types are available in OAuth 2.0 and how they are used. As you already learned in this chapter, the most common token type used in OAuth 2.0 deployments is bearer. When the authorization server returns an access token to a client application, it uses the mandatory token_type parameter to specify the type of the token, under all the grant types we discussed in section 2.3.
A bearer token is like cash. If I steal a $10 note from you, I can use it to buy a coffee at any Starbucks, and the cashier won't ask me to prove possession of that $10 note, or how I got it. The usage of bearer tokens is analogous to the usage of cash. If someone steals a bearer token, they can use it to access a resource just as the legitimate user would. So, it's a must that wherever you use a bearer token, you use it over HTTPS (that is, over TLS).
Using bearer tokens over HTTPS is great, but that only protects the tokens in transit, and there are many other ways an attacker can steal a token, which we discuss in chapter 10 in detail. So, how do we make an access token a proof-of-possession (PoP) token? An attacker won't be able to use a stolen PoP token to access a resource, as they could with a bearer token. When using a PoP token, the client application also has to prove possession, or ownership, of the token.
There have been multiple efforts in the IETF OAuth working group to bring PoP tokens to
the OAuth world. The OAuth 2.0 Token Binding (https://tools.ietf.org/html/draft-ietf-oauth-
token-binding-08) draft specification is one of the earlier efforts. This specification proposes
a mechanism to bind the authorization code, the access token, and the refresh token to the
underlying TLS connection that is initially used by the corresponding client application to


retrieve the token. So, even if someone steals the token, they won’t be able to use it outside
the TLS channel that was used to obtain it.
The OAuth 2.0 Token Binding specification was built on top of a few other specifications:
Token Binding over HTTP (https://tools.ietf.org/html/rfc8473), Transport Layer
Security (TLS) Extension for Token Binding Protocol Negotiation
(https://tools.ietf.org/html/rfc8472), and The Token Binding Protocol Version 1.0
(https://tools.ietf.org/html/rfc8471). Also, for the solution proposed in this specification to
work, authorization servers and resource servers were required to support the new
TLS extension that RFC 8472 proposed, and browsers too had to have built-in support.
So, in practice, due to the lack of implementation support, OAuth 2.0 token binding never
became popular. You can learn more about OAuth 2.0 token binding from this blog post
(https://medium.facilelogin.com/oauth-2-0-token-binding-e84cbb2e60), which I wrote in
2017.
After OAuth 2.0 token binding failed to become mainstream, mostly due to the lack of
support from other systems, the IETF OAuth working group came up with another approach
to support proof-of-possession tokens: the OAuth 2.0 Demonstrating Proof-of-Possession
at the Application Layer specification (https://tools.ietf.org/html/draft-ietf-oauth-dpop-02),
which is at the draft stage at the time of this writing. This approach, commonly known
as DPoP, does not require any changes to browser or TLS implementations; the proposed
changes are doable entirely at the application layer. In chapter 10 we discuss DPoP in detail. When an
authorization server returns an access token as per the DPoP specification, it has to set the
value of the token_type parameter to DPoP.
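As a rough illustration of the idea (not the full DPoP protocol; the claim names htm, htu, jti, and iat come from the draft, but the endpoint URL and values below are made up), a DPoP proof is a JWT whose claims tie it to one specific HTTP request. A real proof is signed with the client's private key, with the matching public key carried in the JWT header; this sketch only builds the claims:

```python
import time
import uuid

# Claims of a DPoP proof JWT, per the draft specification. In a real
# implementation this payload is signed with the client's private key.
dpop_proof_claims = {
    "jti": str(uuid.uuid4()),                    # unique ID so the server can reject replays
    "htm": "POST",                               # HTTP method the proof covers
    "htu": "https://server.example.com/token",  # HTTP URI the proof covers (made-up endpoint)
    "iat": int(time.time()),                     # issued-at timestamp
}

# Because the claims name a single method and URI, a stolen proof can't be
# replayed against a different endpoint, and jti prevents replay at the same one.
```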

Self-contained access tokens


An access token can be either a reference token or a self-contained token. A reference token is just a string, and only
the issuer of the token (the authorization server) knows how to validate it. When the resource server gets a reference
token, it has to talk to the authorization server every time it validates the token.

In contrast, if the token is a self-contained token, the resource server can validate the token itself; there’s no need
to talk to the authorization server. A self-contained token is a signed JWT or a JWS (see chapter 4). The JWT Profile for
OAuth 2.0 Access Tokens (which is in its tenth draft at the time of writing), developed under the IETF OAuth working
group, defines the structure for a self-contained access token.
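To see why a self-contained token needs no call back to the issuer, the sketch below (with a made-up token, and signature verification deliberately omitted) decodes a JWT payload locally. A real resource server must also verify the signature before trusting any claim:

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    # JWT segments use base64url without padding; restore padding first.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def b64url_encode(data: dict) -> str:
    return base64.urlsafe_b64encode(json.dumps(data).encode()).rstrip(b"=").decode()

# Build a made-up JWT (empty signature, for illustration only).
jwt = ".".join([
    b64url_encode({"alg": "RS256", "typ": "at+jwt"}),
    b64url_encode({"iss": "https://server.example.com", "sub": "user123", "scope": "read"}),
    "",  # a real token carries a signature here
])

# The resource server can read the claims without contacting the issuer.
claims = json.loads(b64url_decode(jwt.split(".")[1]))
print(claims["sub"])  # user123
```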

2.7 OAuth 2.0 ecosystem


As you already learned in this chapter, OAuth 2.0 is an authorization framework for access
delegation, defined in RFC 6749. RFC 6749 by design leaves room in the
specification to extend it based on practical, enterprise use cases. Over the years since
RFC 6749 was published in 2012, multiple specifications have been developed by
the IETF OAuth working group to address such use cases. In this section we list
some of those key specifications. However, since the main focus of this book is on
OpenID Connect, we do not intend to delve deep into these. I’ve discussed most of these
specifications in the book Advanced API Security: OAuth 2.0 and Beyond (Apress, 2019).


• OAuth 2.0 Token Introspection (https://tools.ietf.org/html/rfc7662): This specification
talks about how a resource server can talk to an authorization server to discover more
information about an access token or a refresh token.
• OAuth 2.0 Token Revocation (https://tools.ietf.org/html/rfc7009): This specification
talks about how a client application can talk to the authorization server to revoke an
access token or a refresh token.
• Proof Key for Code Exchange (https://tools.ietf.org/html/rfc7636): This specification
proposes a mechanism for the authorization code grant type to mitigate the code
interception attack. OAuth 2.1 has made this mandatory to use in the authorization
code grant type. We discuss this specification in detail in chapter 5.
• OAuth 2.0 Token Exchange (https://tools.ietf.org/html/rfc8693): This specification
proposes a mechanism to exchange one access token for a new one. The new access
token may be attached to all or a subset of the scopes the original access token had. The
token exchange approach proposed in this specification is useful when a client
application talks to a resource and that resource talks to another resource, on behalf
of the original resource owner. The first resource can talk to the authorization server
and exchange the access token it got from the client application for a new one. We
discuss this specification in detail in chapter 10.
• OAuth 2.0 Device Authorization Grant (https://tools.ietf.org/html/rfc8628): This
specification proposes an approach to use OAuth 2.0 in a browser-less environment;
for example in an application running on your smart TV.
• OAuth 2.0 Dynamic Client Registration Protocol (https://tools.ietf.org/html/rfc7591):
This specification proposes an approach to register a client application via a standard
endpoint provided by the authorization server.
• OAuth 2.0 for Native Apps (https://tools.ietf.org/html/rfc8252): This specification
defines a set of best practices in using OAuth 2.0 for native apps. In chapter 8 we
discuss this specification in detail.
• OAuth 2.0 Authorization Server Metadata (https://tools.ietf.org/html/rfc8414): This
specification proposes a standard interface for the authorization server to expose
/publish its metadata, so that client applications can discover the token types, grant
types, scopes, and so on, that the authorization server supports.
• OAuth 2.0 Mutual-TLS Client Authentication and Certificate-Bound Access Tokens
(https://tools.ietf.org/html/rfc8705): This specification proposes an approach for
client applications to authenticate to the authorization server’s token endpoint with
mutual TLS to obtain an access token. The authorization server binds the access
token it issues to the corresponding client application’s certificate. When the client
application uses this access token to access a resource server, and authenticates with
mutual TLS, the resource server can talk to the authorization server and verify that
the corresponding token was issued to the client who is accessing the resource.
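To illustrate the first of these, token introspection (RFC 7662): the resource server POSTs the token to the authorization server's introspection endpoint (authenticating itself in the process) and receives a JSON description of the token. The sketch below only parses a sample response; the values are made up:

```python
import json

# A sample introspection response as defined in RFC 7662. "active" is the
# only required member; the rest are optional.
introspection_response = json.loads("""
{
  "active": true,
  "scope": "read write",
  "client_id": "s6BhdRkqt3",
  "username": "jdoe",
  "exp": 1735689600
}
""")

# A resource server would typically check "active" first, then the scopes.
token_scopes = (introspection_response.get("scope", "").split()
                if introspection_response["active"] else [])
print(token_scopes)  # ['read', 'write']
```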

In addition to the above RFCs, there are a few interesting discussions happening in the IETF
OAuth working group that are well worth being aware of as an application developer. The
following list highlights some of the draft specifications under discussion at the time of this
writing.


• The OAuth 2.0 Authorization Framework: JWT Secured Authorization Request
(https://tools.ietf.org/html/draft-ietf-oauth-jwsreq-30): This specification proposes an
approach to send the parameters in an authorization request as a JWT, which adds
integrity protection to the request.
• OAuth 2.0 Pushed Authorization Requests (https://tools.ietf.org/html/draft-ietf-oauth-
par-04): This specification proposes an endpoint at the authorization server to directly
accept authorization requests, rather than going through a browser redirect. The client
application can first push the authorization request directly to this endpoint at the
authorization server, and then redirect the user via the browser to the authorization
endpoint. This redirect request carries a reference to the authorization request data
already pushed by the client application to the authorization server.
• OAuth 2.0 Rich Authorization Requests (https://tools.ietf.org/html/draft-ietf-oauth-
rar-03): This specification proposes a new parameter called authorization_details
to the authorization request that is used to carry fine grained authorization data in the
OAuth authorization request.
• OAuth 2.0 for Browser-Based Apps (https://tools.ietf.org/html/draft-ietf-oauth-
browser-based-apps-07): This specification proposes security best practices and
guidelines for browser-based applications that use OAuth 2.0.
• OAuth Security Best Current Practice (https://tools.ietf.org/html/draft-ietf-oauth-
security-topics-16): This specification proposes a set of security best practices for
using OAuth 2.0. We discuss this specification in detail in chapter 10.

2.8 What’s new in OAuth 2.1?


OAuth 2.1 is the successor of OAuth 2.0. At the time of this writing it is a draft specification,
developed under the IETF OAuth working group. We already discussed some of the changes
OAuth 2.1 proposes earlier in this chapter, and in this section we’ll summarize the key
changes proposed in the OAuth 2.1 draft specification.
OAuth 2.1 does not introduce drastic new changes on top of OAuth 2.0. It tries to simplify
OAuth 2.0 for developers. As we discussed in section 2.7, the OAuth 2.0 ecosystem is getting
bigger and bigger. It’s not easy for a developer to keep track of all of those specifications.
OAuth 2.1 consolidates the functionality in OAuth 2.0 (RFC 6749) along with some other
specifications we discussed in this chapter (mostly in section 2.7), as listed in the following:
• OAuth 2.0 for Native Apps (RFC8252)
• Proof Key for Code Exchange (RFC7636)
• OAuth 2.0 for Browser-Based Apps (a draft specification at the time of this writing)
• OAuth Security Best Current Practice (a draft specification at the time of this writing)
• Bearer Token Usage (RFC6750)

Since the introduction of OAuth 2.0 in 2012, its usage has gone well beyond the initial
expectations. OAuth 2.0 is the de facto standard for securing APIs, microservices, and much
more. Also, OAuth 2.0 is not only used over HTTP. In the microservices world, for
example, OAuth 2.0 is used to secure gRPC communications, Kafka topics, and so on.
The Microservices Security in Action (Manning, 2020) book, which I co-authored with Nuwan
Dias, explains how to use OAuth 2.0 to secure microservices. Open Banking is another area


where OAuth 2.0 is gaining rapid adoption. For all of these use cases, security and
interoperability are a top priority, which is what OAuth 2.1 tries to address.
The following lists some of the key changes OAuth 2.1 introduces:
• The implicit and resource owner password grant types are gone. As we already discussed
in this chapter, even with OAuth 2.0, these two grant types are not recommended.
• In addition to the confidential and public client types in OAuth 2.0, OAuth 2.1
introduces a new client type: credentialed. In section 2.4, we discussed the
credentialed client type.
• The authorization code grant is extended with the functionality from Proof Key for
Code Exchange (RFC 7636). We discuss RFC 7636 in detail in chapter 5.
• Under OAuth 2.0, when a client application accesses a resource, it can send the
access token either as an HTTP header or as a query parameter. OAuth 2.1 removes the
support for sending an access token as a query parameter.
• While validating an authorization request, the authorization server validates the
redirect_uri from the request against one already registered at the authorization server
during application registration. With OAuth 2.1, the authorization server must make
sure that the redirect_uri in the authorization request exactly matches the
redirect_uri registered at the authorization server.
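The last point is easy to get wrong, so here is a small sketch of the difference between the exact matching OAuth 2.1 mandates and a looser prefix match (the URIs are made up):

```python
def exact_match(registered: str, requested: str) -> bool:
    # OAuth 2.1: a simple string comparison, nothing cleverer.
    return registered == requested

def prefix_match(registered: str, requested: str) -> bool:
    # A looser policy some OAuth 2.0 deployments used; OAuth 2.1 disallows it.
    return requested.startswith(registered)

registered_uri = "https://app.example.com/redirect_uri"

# Exact matching rejects anything the client didn't register verbatim.
assert exact_match(registered_uri, "https://app.example.com/redirect_uri")
assert not exact_match(registered_uri, "https://app.example.com/redirect_uri/../attacker")

# Prefix matching would have let this suspicious variant through.
assert prefix_match(registered_uri, "https://app.example.com/redirect_uri/../attacker")
```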

2.9 Summary
• OAuth 2.0 is an authorization framework developed by the Internet Engineering Task
Force (IETF) OAuth working group. It’s defined in the RFC 6749
(https://tools.ietf.org/html/rfc6749).
• The fundamental focus of OAuth 2.0 is to fix the access delegation problem, which is
to let someone else access a resource you own, on your behalf.
• OpenID Connect is a standard developed by the OpenID foundation, on top of OAuth
2.0 specification.
• There are four key roles defined in the OAuth 2.0 specification: client application,
authorization server, resource server, and resource owner.
• An OAuth 2.0 grant type defines a request/response flow to get an access token from
the authorization server.
• A grant type defines four key components: authorization request, authorization
response, access token request, and access token response. However, not all
the grant types implement all four key components we mentioned above.
• The OAuth 2.0 RFC identifies four main grant types: authorization code, implicit,
resource owner password, and client credentials. Each grant type outlines the steps for
obtaining an access token.
• An access token can be either a reference token or a self-contained token.
• OAuth 2.1 does not introduce new drastic changes on top of OAuth 2.0. It tries to
simplify OAuth 2.0 for developers.
• OAuth 2.1 dropped resource owner password and implicit grant types from the
specification.


Securing access to a single-page application

This chapter covers

• OpenID Connect authentication flows and how they differ from OAuth 2.0 grant types
• How implicit authentication flow works with a single-page application
• How authorization code flow works with a single-page application
• Why you need to pick authorization code flow over implicit flow
• Securing a React-based single-page application with OpenID Connect
With the heavy adoption of APIs, over time, single-page applications (SPAs) have become one
of the most popular options for building client applications on the web. If you are new to
single-page application architecture, we recommend you first go through the book SPA
Design and Architecture: Understanding Single Page Web Applications by Emmit Scott
(Manning Publications, 2015). Also, in this chapter we assume you have a good knowledge of
OAuth 2.0, which is the fundamental building block of OpenID Connect. In case you don’t,
please go through chapter 2 before following the rest of this chapter.
Under OAuth 2.0 terminology, a SPA is identified as a public client application. 1 In
principle, a public client application is unable to hide any secrets from its users. Most of
the time a SPA is an application written in JavaScript that runs in the browser; so, anything
in the browser is visible to the users of that application. This is a key deciding factor in how
you want to use OpenID Connect to secure a SPA.
In this chapter we’ll teach you different OpenID Connect authentication flows and how
those flows work with a SPA. You will also learn how to build a SPA using React and then log

1 In chapter 2, under the section 2.4 we discuss OAuth 2.0 client types and their characteristics.


in to it via OpenID Connect. React is the most popular JavaScript library for developing user
interfaces. If you are new to React, please go through appendix A first.

3.1 Authentication flows define the communications between a client application and an OpenID provider
In this section you’ll learn what an authentication flow is in OpenID Connect, and about the
different types of authentication flows.
In chapter 1, you learned that OpenID Connect defines a schema for a token (which is a
JSON Web Token (JWT)) to exchange information between an OpenID provider and a client
application, along with a set of processing rules around it. The OpenID Connect specification
identifies this token as the ID token, which we will briefly discuss in this chapter and
in detail in chapter 4.

Figure 3.1 OpenID authentication flows define how the client application communicates with the OpenID
provider to authenticate an end user. Some communications happen via the web browser and some happen
directly between the client application and the OpenID provider.

In addition to the ID token, the OpenID Connect specification also introduces a
transport binding, which defines how to transport an ID token from an OpenID
provider to a client application (figure 3.1). In OpenID Connect, we use the term
authentication flows to refer to the multiple ways by which you can transport an ID
token from an OpenID provider to a client application.
OpenID Connect defines three authentication flows:
• authorization code flow,
• implicit flow, and
• hybrid flow.


In section 3.3 you’ll learn how the implicit flow works, and in section 3.9, how the authorization
code flow works. We’ll discuss the hybrid flow in detail in chapter 6.

3.2 Authentication flows vs. grant types


The OAuth 2.0 core specification (RFC 6749) introduced four grant types, which we discussed
in chapter 2 in detail 2. In this section you’ll learn how an OpenID Connect authentication flow
relates to a grant type, as well as the differences between the two.
A grant type in OAuth 2.0 defines a protocol for how a client application can obtain an access
token from an authorization server. Typically, a grant type defines four key components
(please see section 2.3 for the details): authorization request, authorization response,
access token request, and access token response.
An authentication flow in OpenID Connect uses grant types, but an authentication flow is
more than a grant type (table 3.1). Typically, an authentication flow in OpenID Connect
defines four key components, quite similar to an OAuth 2.0 grant type, but not exactly the
same: authentication request, authentication response, token request, and token
response. You might have already noticed the differences: in a grant type we have an
authorization request/response, while in an authentication flow we have an authentication
request/response; also, in a grant type we have an access token request/response, while in
an authentication flow we have a token request/response.

Table 3.1 The differences in terminology, OAuth 2.0 vs. OpenID Connect

OAuth 2.0: Authorization request, initiated from the client application to the authorization server. The scope and redirect_uri are optional parameters in the authorization request.
OpenID Connect: Authentication request, initiated from the client application to the OpenID provider. The scope and redirect_uri are required parameters in the authentication request.

OAuth 2.0: Authorization response, sent from the authorization server to the client application.
OpenID Connect: Authentication response, sent from the OpenID provider to the client application.

OAuth 2.0: Access token request, initiated from the client application to the authorization server.
OpenID Connect: Token request, initiated from the client application to the OpenID provider.

OAuth 2.0: Access token response, sent from the authorization server to the client application.
OpenID Connect: Token response, sent from the OpenID provider to the client application.

The OpenID Connect specification defines the authentication flows in a self-contained manner
within itself. So we should not confuse the OAuth 2.0 grant types with the OpenID Connect
authentication flows. The authorization code flow in OpenID Connect is not the same as the
authorization code grant type in OAuth 2.0, and the implicit flow in OpenID Connect is not
the same as the implicit grant type in OAuth 2.0.

2 RFC 6749 in fact introduced five grant types. However, the behavior of the refresh token grant type is quite different from the other four: authorization code,
implicit, password, and client credentials. When we say OAuth 2.0 defines four grant types, we refer to those four core grant types. As discussed in
chapter 2, OAuth 2.1 has removed the implicit and password grant types from the specification.


3.3 How does the implicit flow work?


In this section you’ll learn how an OpenID provider transports an ID token to a client
application using the implicit flow. The sequence of steps that happen during this flow,
as well as the messages passed in each step, is clearly defined in the OpenID Connect
specification.

3.3.1 The flow of events in the implicit authentication flow


Figure 3.2 shows the sequence of events between the OpenID provider, the client
application, and the user. The client application in figure 3.2 can be any type of
application, but here our discussion mostly focuses on a SPA. Also, the implicit flow is more
popular among SPAs than any other application type. In the following sections we discuss in
detail what happens in each step of figure 3.2.

Figure 3.2 The client application uses the implicit authentication flow to communicate with the OpenID
provider to authenticate the user. With the implicit authentication flow, all the communications between the
client application and the OpenID provider happen via the user agent.

THE CLIENT APPLICATION INITIATES A LOGIN REQUEST VIA THE BROWSER


In step 1 of figure 3.2, the user clicks on the login link and the client application initiates
a login request via the browser. In the case of a SPA, we can expect that the user clicks on a
login link on the web page of the client application, and the browser does an HTTP GET to the
authorize endpoint of the OpenID provider.
The authorize endpoint of the OpenID provider is a well-known endpoint, and client
applications can find it by going through the OpenID provider's documentation or by using the
OpenID Connect discovery protocol. If you use Google as your OpenID provider, then this is
the authorize endpoint of Google, which you can find in their documentation:
https://accounts.google.com/o/oauth2/v2/auth.
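With the discovery protocol, a client fetches the provider's metadata document from a well-known URL (for Google, https://accounts.google.com/.well-known/openid-configuration) and reads the endpoints from it instead of hard-coding them. The sketch below parses a trimmed-down sample document rather than fetching a live one:

```python
import json

# A trimmed sample of an OpenID Connect discovery document. A real client
# would fetch it from <issuer>/.well-known/openid-configuration over HTTPS.
discovery_doc = json.loads("""
{
  "issuer": "https://accounts.google.com",
  "authorization_endpoint": "https://accounts.google.com/o/oauth2/v2/auth",
  "token_endpoint": "https://oauth2.googleapis.com/token",
  "jwks_uri": "https://www.googleapis.com/oauth2/v3/certs"
}
""")

# The client reads the authorize endpoint from the metadata.
authorize_endpoint = discovery_doc["authorization_endpoint"]
print(authorize_endpoint)
```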


The request the client application generates in step 1 of figure 3.2 is called an
authentication request. You may recall from chapter 2 that in OAuth 2.0 the request
initiated from the client application to the OAuth 2.0 authorization server, following the
implicit grant type, is called an authorization request.
The following listing shows an example of an authentication request. This is in fact a URL
constructed by the client application, which takes the user to the authorization endpoint of
the OpenID provider when the user clicks on the login link.

Listing 3.1 Authentication request generated by the client application


https://accounts.google.com/o/oauth2/v2/auth?
client_id=424911365001.apps.googleusercontent.com&
redirect_uri=https%3A//app.example.com/redirect_uri&
scope=openid email&  #A
[email protected]&
response_type=id_token token&  #B
state=Xd2u73hgj59435&
nonce=0394852-3190485-2490358

#A The scope values are separated by a space. However, when you type this on the browser, the browser will URL
encode the space, so the space will be replaced by %20.
#B The response_type values are separated by a space. However, when you type this on the browser, the browser will
URL encode the space, so the space will be replaced by %20.
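A URL like the one in listing 3.1 can be assembled with standard URL-encoding utilities. The sketch below mirrors the listing's values (the login_hint email is a placeholder, since the listing's value is redacted); note how the spaces in scope and response_type become %20:

```python
from urllib.parse import urlencode, quote

params = {
    "client_id": "424911365001.apps.googleusercontent.com",
    "redirect_uri": "https://app.example.com/redirect_uri",
    "scope": "openid email",            # space-separated scope values
    "login_hint": "user@example.com",   # placeholder value
    "response_type": "id_token token",  # space-separated response types
    "state": "Xd2u73hgj59435",
    "nonce": "0394852-3190485-2490358",
}

# quote_via=quote encodes spaces as %20 (the default quote_plus would use +).
auth_request = ("https://accounts.google.com/o/oauth2/v2/auth?"
                + urlencode(params, quote_via=quote))
print(auth_request)
```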

Let’s go through the query parameters added to the authentication request by the client
application, as shown in listing 3.1. The definition of these parameters is consistent across
all three authentication flows that OpenID Connect defines; however, the values may change.
• client_id: This is an identifier the OpenID provider uses to uniquely identify a client
application. The client application gets a client_id after registering itself with the
OpenID provider. For registration at the OpenID provider, you can either follow an
out-of-band mechanism provided by the OpenID provider or use the OpenID Connect
dynamic client registration API. 3 The client_id is a required parameter in the
authentication request, and is originally defined in the OAuth 2.0 specification, which
we discussed in detail in chapter 2.
• redirect_uri: This is an endpoint that belongs to the client application. After successfully
authenticating the user and getting the user's consent to share the requested
data with the client application, the OpenID provider redirects the user to the
redirect_uri endpoint along with the requested tokens (step 5 of figure 3.2). During
the client registration process at the OpenID provider, you need to share with the
OpenID provider the exact URI you use for the redirect_uri parameter in the
authentication request.

The OpenID provider will do a one-to-one matching of the value of the redirect_uri in
the authentication request against the one already registered by the client application.
Most OpenID providers do an exact match between these two URIs. However, some
OpenID providers let client applications register multiple URIs, and some let
client applications define a regular expression for the validation of the redirect_uri.

3 An out-of-band mechanism could be a developer registering a client application using the UI provided by the OpenID provider.


Validating against a regular expression gives the client application more flexibility to
dynamically change the redirection path, but it should be used cautiously, and the
regular expression used for validation must be thoroughly tested. In section 3.3
we explain the use cases where you may want to have multiple redirect_uris for the
same client application.

The redirect_uri is a required parameter in the authentication request, and is
originally defined in the OAuth 2.0 specification. However, in the OAuth 2.0
specification the redirect_uri is not a required parameter for either the implicit or the
authorization code grant type.
• scope: The value of the scope parameter is just a string, and both the client
application and the OpenID provider should be able to interpret its meaning. Any
OpenID Connect authentication request must carry the value openid in the scope
parameter. You can have multiple values for the scope parameter, each separated by
a space, but one of them must be openid.

The OpenID Connect specification defines four scope values (profile, email, address
and phone) in addition to the openid scope. A client application can use any of these
scope values to request claims from the OpenID provider. In chapter 5, we discuss in
detail how to use scopes to request claims.

The scope is a required parameter in the authentication request, and is originally
defined in the OAuth 2.0 specification. However, in the OAuth 2.0 specification the
scope is not a required parameter for either the implicit or the authorization code grant
type.
• login_hint: The value of the login_hint parameter is a string that carries a hint
about the user (or the application), which the OpenID provider can use to build a
better user experience. For example, if the application already knows the user’s email
address (probably from a cookie stored under the domain of the client application),
then that can go as the value of the login_hint parameter, and the OpenID provider
can directly request the user to share their credentials, rather than asking for a user
identifier.

The login_hint is an optional parameter introduced by the OpenID Connect
specification, which you do not find in the OAuth 2.0 specification.
• response_type: The value of the response_type parameter in the authentication
request defines which tokens the authorization endpoint of the OpenID provider
should return to the client application.

In the implicit flow there are two possible values: id_token or id_token token. If
the value of the response_type is id_token, then the authorization endpoint will only
return an ID token, and if the response_type is id_token token, then the
authorization endpoint will return both an ID token and an access token.

The response_type is a required parameter in the authentication request, and is
originally defined in the OAuth 2.0 specification.


• response_mode: The value of the response_mode parameter in the authentication
request defines how the client application expects the response from the OpenID
provider. This is an optional parameter, and it does not appear in listing 3.1. If you set
the value of the response_mode parameter to query, for example, then all the
parameters in the response (from the OpenID provider) are encoded as a query string
appended to the redirect_uri, as shown below.

https://app.example.com/redirect_uri?token=XXXXX&id_token=YYYYYY

If you set the value of the response_mode parameter to fragment, then all the
response parameters are added to the redirect_uri as a URI fragment, as shown
below.

https://app.example.com/redirect_uri#token=XXXXX&id_token=YYYYYY

In addition to query and fragment, the OAuth 2.0 Form Post Response
Mode (https://openid.net/specs/oauth-v2-form-post-response-mode-1_0.html)
specification defines another response_mode called form_post, which we’ll
discuss in chapter 6.

The response_type and response_mode parameters are related to each other. If you do
not specify a response_mode parameter in the authentication request, then the
default response_mode associated with the corresponding response_type gets
applied automatically. If the response_type is id_token or id_token token
(implicit flow), for example, then the corresponding default response_mode is
fragment (table 3.2). That means that when you use the implicit flow, the
OpenID provider sends back the response parameters as a URI fragment.
In section 3.6 we discuss the differences between a URI fragment and a
query string.

Table 3.2 The default response_mode values for the corresponding response_type

The value of the response_type parameter      The default value of response_mode

id_token token (implicit flow)                fragment

id_token (implicit flow)                      fragment

code (authorization code flow, section 3.9)   query

token (OAuth 2.0 implicit grant type)         fragment

The response_mode is an optional parameter in the authentication request, and is
originally defined in the OAuth 2.0 Multiple Response Type Encoding Practices
specification (https://openid.net/specs/oauth-v2-multiple-response-types-1_0.html),
which was developed by the OpenID Foundation (not by the IETF OAuth working group).
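From the client's point of view, the practical difference between the two response modes is where the parameters travel: a query string reaches the server hosting the redirect_uri, while a fragment stays in the browser (a SPA reads it via window.location.hash). Parsing the two shapes can be sketched like this, with placeholder token values:

```python
from urllib.parse import urlsplit, parse_qs

query_style = "https://app.example.com/redirect_uri?token=XXXXX&id_token=YYYYYY"
fragment_style = "https://app.example.com/redirect_uri#token=XXXXX&id_token=YYYYYY"

# response_mode=query: the parameters arrive in the query string, which the
# browser sends to the server hosting the redirect_uri.
from_query = parse_qs(urlsplit(query_style).query)

# response_mode=fragment: the parameters arrive in the URI fragment, which
# the browser never sends over the wire; only client-side code can read it.
from_fragment = parse_qs(urlsplit(fragment_style).fragment)

print(from_query["id_token"], from_fragment["id_token"])
```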


Figure 3.3 The client application uses the implicit authentication flow to communicate with the OpenID provider
to authenticate the user. This duplicates figure 3.2 for readability purposes.

• state: The value of the state parameter is just a string, which is added to the
authentication request by the client application, and the OpenID provider must
return the same value (unchanged) in the response (step 5 in figure 3.3). In section
3.5 we discuss the use cases of the state parameter, and in chapter 10 we
discuss how to use the state parameter to mitigate some possible security threats.

The state is an optional, but recommended, parameter in the authentication
request, and is originally defined in the OAuth 2.0 specification.
• nonce: The value of the nonce parameter carries a unique value added to the
OpenID Connect authentication request by the client application. The OpenID
provider must include the value of nonce from the authentication request in
the ID token it builds. Section 3.7 explains how to generate a nonce that is
unique and nonguessable.

The nonce is an optional parameter introduced by the OpenID Connect specification to
mitigate replay attacks, and in chapter 10 we discuss nonce in detail.

In addition to the authentication request parameters we discussed in the above list, there are
a few more optional ones: display, prompt, max_age, ui_locales, id_token_hint, and
acr_values. We’ll discuss them in detail in chapter 6.

THE OPENID PROVIDER VALIDATES THE AUTHENTICATION REQUEST AND REDIRECTS THE USER BACK TO THE BROWSER FOR
AUTHENTICATION

Once the OpenID provider validates the authentication request from the client application, it
checks whether the user has a valid login session under the OpenID provider’s domain. Here
the domain is the HTTP domain name that you use to access the OpenID provider using a


web browser. If the user has logged into the OpenID provider already from the same web
browser, then a valid login session exists, unless it has expired.
If the user does not have a valid login session, then the OpenID provider will challenge
the user to authenticate (step 2 in figure 3.3), and will also get the user’s consent to share the
requested claims with the client application. In step 3 of figure 3.3 the user types the login
credentials, and in step 4 the browser posts the credentials to the OpenID
provider. Steps 2, 3 and 4 are outside the scope of the OpenID Connect specification,
and it’s up to the OpenID providers to implement them in the way they prefer. Figure 3.4 shows a
sample login page the Google OpenID provider pops up during the login flow.

Figure 3.4 A sample login screen for user authentication from the Google OpenID provider.

THE OPENID PROVIDER RETURNS THE REQUESTED TOKENS TO THE CLIENT APPLICATION
In step 5 of figure 3.5, the OpenID provider returns the requested tokens to the client
application. If the client application, for example, requested only an ID token in step 1, by
having id_token as the value of the response_type parameter in the authentication
request, then the OpenID provider returns only an ID token as shown below.
https://app.example.com/redirect_uri#id_token=YYYYYYYY&state=Xd2u73hgj59435


If the value of the response_type parameter was id_token token, then the OpenID
provider returns both the ID token and the access token as shown below.

https://app.example.com/redirect_uri#access_token=XXXXXX&token_type=Bearer&expires_in=3600&
id_token=YYYYYYYY&state=Xd2u73hgj59435

In both the above cases the OpenID provider returns the tokens as a URI fragment. That’s
because, if you do not explicitly mention the response_mode parameter in the authentication
request, the default value of the response_mode for the implicit flow is fragment.

Figure 3.5 In step 5 the OpenID provider returns the requested tokens to the client application.

Let’s go through the parameters in the URI fragment added to the authentication response
by the OpenID provider (step 5 in figure 3.5). The definitions of these parameters are
consistent across all three authentication flows OpenID Connect defines; however, the
values may change.
• access_token: The value of the access_token parameter carries the OAuth 2.0
access token. The OpenID provider adds an access_token to the response only if the
response_type is id_token token, under the implicit flow. A client application can
use this access token to securely access the OpenID provider’s userinfo endpoint to
retrieve claims with respect to the logged-in user, or to access a business API. In chapter
5, we discuss the userinfo endpoint in detail. In case the client application does not need
to access any OAuth 2.0 secured APIs, it should use just id_token for the
response_type, so there won’t be any access_token in the authentication response.


As you already learned in chapter 2, what you can do with an access_token is defined by
the scope. If you had email in the scope parameter of the authentication request,
for example, then you can use the corresponding access_token to access the OpenID
provider’s userinfo endpoint to retrieve the logged-in user’s email and email_verified
claims. If you had address in the scope, then you can retrieve the address claim of
the logged-in user from the userinfo endpoint.

The OpenID provider (or the authorization server, under OAuth 2.0 terminology)
does not necessarily need to respect the scope value in the authentication request all
the time. Based on the consent provided by the user and other policies, the OpenID
provider can decide which scopes, out of all the scopes in the authentication request, it
wants to respect. So, a client application should not always expect to get an
access token that is bound to the requested scope values.
• id_token: The value of the id_token parameter carries the OpenID Connect ID
token, which is a JWT. This is a required parameter, and we discuss the ID token in detail
in chapter 5.
• token_type: The value of the token_type parameter carries the type of the OAuth
2.0 access token. This is a required parameter, and we discussed it in detail in
chapter 2.
• expires_in: The value of the expires_in parameter carries the validity period of the OAuth
2.0 access token in seconds, calculated from the time it is issued. This is an optional,
but recommended, parameter, and we discussed it in detail in chapter 2.
• state: The value of the state parameter copies the value of the state parameter
from the authentication request. The value of the state parameter in the response must
be exactly the same as found in the request. In section 3.5 we discuss how to use the
state parameter. This is a required parameter only if the authentication request
carries a state parameter.
• scope: If the scope of the access token in the response is different from the
requested scope, the OpenID provider must include the corresponding scope value in
the response; otherwise the client application can safely assume the token is issued
for the requested scope values.
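The scope rule in the last bullet translates into a simple client-side check. The following sketch (class and method names are ours, for illustration) computes the scope the client should assume the token carries: the scope parameter from the response when present, or the requested scope otherwise.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class ScopeCheck {

    // The OpenID provider includes a scope parameter in the response only
    // when the granted scope differs from the requested one; if it's absent,
    // the client can assume the requested scope was granted as-is.
    public static Set<String> effectiveScope(String requestedScope, String responseScope) {
        String scope = (responseScope == null) ? requestedScope : responseScope;
        return new LinkedHashSet<>(Arrays.asList(scope.trim().split("\\s+")));
    }
}
```

If the request asked for openid profile email but the response carries scope=openid profile, the client knows the email claim won’t be retrievable from the userinfo endpoint with this token.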

One important thing you might have already noticed in the authentication response from the
OpenID provider is that there is no refresh_token. A refresh_token is defined in the OAuth 2.0
specification and is used by client applications to obtain a new access_token (and possibly
a new id_token) when the current one expires. However, the implicit flow in OpenID Connect does not
return a refresh_token.
If you don’t have a refresh_token, the client application won’t be able to renew the
access_token (or the ID token) it got from the OpenID provider, and in that case the client
application has to initiate a new authentication request to get a new access_token and an ID


token. This is one of the reasons (not the only reason; we discuss more in section 3.11)
people prefer to use the authorization code flow over the implicit flow. We discuss the
authorization code flow in detail in section 3.9 and refresh tokens in detail in chapter 6.

Figure 3.6 The client application uses implicit authentication flow to communicate with the OpenID provider to
authenticate the user.

Once the client application gets the tokens in the authentication response, it can use
JavaScript to extract the access_token and the id_token from the URI fragment. In
practice, the OpenID provider does an HTTP redirect (with a 302 status code)
to the redirect_uri corresponding to the client application, and the client application
delivers an HTML page with JavaScript to the browser, which extracts the ID token and
the access token from the URI fragment and does the validation (see figure 3.6).
The following code listing shows a JavaScript code segment extracted from the
OpenID Connect Implicit Client Implementer's Guide 1.0
(https://openid.net/specs/openid-connect-implicit-1_0.html), which extracts the URI
fragment from the browser location bar and posts it to a backend endpoint for validation.


Listing 3.2 A JavaScript code that validates the tokens in the URI fragment
<script type="text/javascript">

// First, parse the parameters in the URI fragment
var params = {},
    postBody = location.hash.substring(1),
    regex = /([^&=]+)=([^&]*)/g,
    m;
while (m = regex.exec(postBody)) {
  params[decodeURIComponent(m[1])] = decodeURIComponent(m[2]);
}

// And send the tokens over to the server
var req = new XMLHttpRequest();
// using POST so the tokens aren't logged in server access logs
req.open('POST', 'https://' + window.location.host + '/catch_response', true);
req.setRequestHeader('Content-Type', 'application/x-www-form-urlencoded');

req.onreadystatechange = function(e) {
  if (req.readyState == 4) {
    if (req.status == 200) {
      // If the response from the POST is 200 OK, perform a redirect
      window.location = 'https://' +
        window.location.host + '/redirect_after_login'
    }
    // if the OAuth response is invalid, generate an error message
    else if (req.status == 400) {
      alert('There was an error processing the token')
    } else {
      alert('Something other than 200 was returned')
    }
  }
};
req.send(postBody)
</script>

3.4 Why does one client application need to have multiple redirect_uris?
In this section you’ll learn why in some cases one client application needs to have multiple
redirect_uris. The redirect_uri is a required parameter in the authentication request, both
under the implicit flow we discussed in section 3.3 and under the authorization code
flow we are going to discuss in section 3.9. As you already learnt in section 3.3, the
OpenID provider validates the redirect_uri in the authentication request against the one
already registered at the OpenID provider (during the client application registration
flow).
In theory, the OpenID provider is expected to do a one-to-one match between these two
redirect_uris; however, in practice, sometimes we see one client application register
multiple redirect_uris. This is a common pattern in enterprise use cases where
organizations have multiple applications managed by the same team. In such cases, they
don’t want to duplicate the application configuration at the OpenID provider. In other
words, they do not want to register each and every client application at the OpenID provider;


rather, they want to reuse one client_id across multiple applications, each with its own
redirect_uri (see figure 3.7).

Figure 3.7 Multiple client applications reuse the same client_id but with different redirect_uris.

If you are to follow this pattern, you need to do it cautiously, knowing the drawbacks of
this approach, as listed below.
• Since you are using a single client_id to identify multiple applications, the OpenID
provider won’t be able to recognize authentication requests generated by different
applications independently. Even though it’s still possible to differentiate
authentication requests from each other by looking at the redirect_uri, in practice,
most OpenID providers are not built that way.


• Most OpenID providers make the access_token unique by the client_id,
user, associated scopes and the status of the token. The status of a token can be
ACTIVE, EXPIRED, REVOKED, LOCKED and so on. None of these are defined in the
OpenID Connect or OAuth 2.0 specifications; rather, they are implemented by different
OpenID providers in the way they want. For a given client_id, user, set of scopes and
status, the OpenID provider can find one access_token.

Then again, if the same user tries to log in to the same client application several times
even before the first issued token has expired, the number of tokens the OpenID
provider has to maintain could explode, if the OpenID provider generates a new token
for each login of the user. To fix this token explosion problem, some OpenID providers
return the same access_token (not the ID token) back to the client application for
subsequent login requests from the same client application for the same user for the
same set of scopes, if the original token has not expired.

In the context of this topic, the above solution to the token explosion problem will
result in sharing the same access_token with multiple applications, because they
share the same client_id. One workaround for this is to use one additional scope value
to represent the client application, and it has to be unique for a given application.
This is a special scope value that acts as a signal to the OpenID provider, rather
than a scope that requires the user’s consent. Since the requested scopes differ by
application, even though they share the same client_id, the OpenID provider will still
generate different tokens per application.
• Even if you register multiple redirect_uris at the OpenID provider, the OpenID
provider still does a one-to-one validation of the redirect_uri in the authentication
request against each of the registered redirect_uris to see whether there is a
match.

If the OpenID provider provides an option to validate the redirect_uri in the
authentication request with a regular expression pattern, instead of doing a one-to-one
match between the redirect_uri registered at the OpenID provider and the
redirect_uri in the authentication request, that could help an attacker plant an
attack by changing the redirect_uri in the authentication request, adding some
parameters that will still pass the regular expression pattern. We’ll discuss such
possible attacks in chapter 10.

So, if you are using a regular expression pattern for redirect_uri validation, you
need to make sure that it’s tested properly to match only the URIs you want to allow.
However, as discussed in chapter 2, with OAuth 2.1, the authorization server (or the
OpenID provider) must make sure that the redirect_uri in the authorization
request exactly matches the redirect_uri registered at the authorization server.
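The exact-match rule OAuth 2.1 mandates is also the simplest to implement. The sketch below (the class name and registered URIs are hypothetical) accepts a redirect_uri only when it is character-for-character equal to one of the registered values: no regular expressions, no prefix matching.

```java
import java.util.Set;

public class RedirectUriValidator {

    // redirect_uris registered for the client at the OpenID provider
    // (hypothetical values, for illustration)
    private final Set<String> registeredUris = Set.of(
            "https://app.example.com/redirect_uri",
            "https://admin.example.com/redirect_uri");

    // OAuth 2.1 style validation: the redirect_uri in the authorization
    // request must exactly match a registered value.
    public boolean isValid(String redirectUri) {
        return redirectUri != null && registeredUris.contains(redirectUri);
    }
}
```

Note that even appending a single query parameter makes the value fail the check, which is exactly the behavior that defeats the redirect_uri manipulation attacks discussed above.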

3.5 Using the state parameter

As discussed in section 3.3, the OpenID provider is obliged to return the value of the
state parameter from the authentication request, in the authentication response, back to the


client application. From the client application’s point of view, this provides a way to correlate
an authentication response from the OpenID provider with the authentication request it
generated (figure 3.8).

Figure 3.8 The client application adds a random, nonguessable string as the value of the state parameter to
the authentication request, and the OpenID provider returns the same value in the authentication response,
back to the client application.

The following lists some use cases of the state parameter.

• In an ecommerce application, for example, a user can add certain items to the
shopping cart, and at the point they decide to check out, the login with OpenID
Connect kicks in and redirects the user to the OpenID provider. However, when
the user returns from the OpenID provider to the client application, the user
expects the same shopping cart with all the items they picked before being redirected
to the OpenID provider.

In a typical web application, this is handled by maintaining a session with the backend
web server, correlated with the browser via a session cookie. In a SPA, you can
use the HTML5 session storage of the browser to store the shopping cart items against
a key, which is a randomly generated identifier. The value of the key is nonguessable
and non-static, and generated once for each browser session.


This key can go as the value of the state parameter in the authentication request;
and when the user returns from the OpenID provider, the client application can
find the state value in the authentication response and use it as a key to load
the saved shopping cart from the browser session storage.
• In section 1.9.4 of chapter 1 we discussed bootstrapping trust with external identity
providers as a benefit of having a single trusted identity provider for a client
application. The client application only connects to its own trusted OpenID
provider, and then that OpenID provider helps connect the client application to
other external identity providers (figure 3.9). These external identity providers can
support OpenID Connect, SAML and so on.

Figure 3.9 The internal identity provider bootstraps trust with partner identity providers, and does
claim/protocol transformations, as expected by the client applications connected to it. All the client
applications only need to trust the internal identity provider, and should only need to support the federation
protocol the internal identity provider supports.

The OpenID provider the client application directly connects to is responsible for getting
a response from the external identity providers and transforming that response so that
the response it (the OpenID provider) generates is understood by the
corresponding client application. To do that, the OpenID provider has to cache (or
store) the initial request it got from the client application and should be able to
correlate it with the response it gets from the external identity provider.


Most OpenID providers implement this using the state parameter in the
authentication request. Before the OpenID provider redirects the user to an external
identity provider, it generates a random string as a correlation handle, and stores
the authentication request from the client application against it. The OpenID provider
adds the correlation handle as the value of the state parameter in the authentication
request it generates, and when it receives a response from an external identity
provider, the OpenID provider expects the same correlation handle to be in the state
parameter of the response; that helps to retrieve the initial authentication request
from the cache.
• Finally, and most importantly, one of the primary use cases of the state parameter is
security. The state parameter helps protect a client application from a cross-site
request forgery (CSRF) attack, which we will discuss in detail in chapter 10.
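On the client side, all three use cases above follow the same pattern: generate a random value, remember what it correlates to, and check (and consume) it when the response comes back. A minimal sketch, with class and method names of our own choosing:

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class StateStore {

    private static final SecureRandom random = new SecureRandom();
    private static final Base64.Encoder encoder = Base64.getUrlEncoder().withoutPadding();

    // Pending authentication requests, keyed by the state value sent out
    private final Map<String, String> pending = new ConcurrentHashMap<>();

    // Generates a random, nonguessable state value and records the
    // application context (e.g., a shopping-cart key) against it.
    public String issueState(String context) {
        byte[] buffer = new byte[20];
        random.nextBytes(buffer);
        String state = encoder.encodeToString(buffer);
        pending.put(state, context);
        return state;
    }

    // Called when the authentication response arrives: the state must match
    // a pending request, and is consumed so it can't be used twice.
    public String consumeState(String state) {
        return (state == null) ? null : pending.remove(state);
    }
}
```

A response carrying an unknown (or already consumed) state value should be rejected, which is the behavior the CSRF mitigation in chapter 10 builds on.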

3.6 URI fragment vs. query string

In section 3.3.1 you learnt that the response_mode parameter in the authentication
request dictates the structure, or the format, of the response a client application gets from an
OpenID provider. If you set the value of the response_mode parameter to query, for
example, then all the parameters in the response are encoded as a query string added to the
redirect_uri as shown below.

https://app.example.com/redirect_uri?token=XXXXX&id_token=YYYYYY

If you set the value of the response_mode parameter to fragment, then all the response
parameters are added to the redirect_uri as a URI fragment as shown below.

https://app.example.com/redirect_uri#access_token=XXXXXX&token_type=Bearer&expires_in=3600&id_token=YYYYYYYY&state=Xd2u73hgj59435

When the OpenID provider redirects the user back to the client application, it sets the HTTP
response status code to 302 and sets the Location header to the redirect_uri, either with
a query string or with a URI fragment. The browser, after receiving the 302 from the
OpenID provider, extracts the URL from the Location header and does an HTTP GET to
the extracted URL. The following lists the key differences between a URI fragment and a
query string.
• Any parameter in a URI fragment never leaves the browser. When the browser does
an HTTP GET to the URL that comes from the Location header of the 302 response from
the OpenID provider, the HTTP GET request goes to the backend web server, but the
URI fragment remains in the browser address bar. However, a query string
attached to a URL goes to the backend web server.
• As defined in RFC 2616 (https://tools.ietf.org/html/rfc2616), the HTTP Referer
header does not carry anything from the URI fragment, but everything in the query
string is included.


Usually, when you click on a link on a web page, the URL of the current web page
goes as the Referer header of the HTTP request to the new web page, unless you
have set up a policy not to include the Referer header. Having certain information
in the Referer header could lead to possible security issues; in chapter 10 we
discuss them in detail, along with how to mitigate them.

3.7 Generating a random, unguessable nonce

As discussed in section 3.3.1, the value of the nonce parameter carries a unique value
added to the OpenID Connect authentication request by the client application. In
this section you’ll learn how to generate a random, nonguessable nonce value in
Java; you’ll be able to find similar constructs to generate random numbers in
other programming languages as well.
When picking an algorithm to generate a random number, there are a few properties you
need to worry about. You do not need to implement them on your own; most of the random
number generators available in different programming languages support these properties.
• A random number should be statistically independent. That means the random
number generator must not rely on previously generated ones to generate new
random numbers.
• A random number should be unpredictable. No one should be able to guess a
random number by looking at a random number already generated by the
corresponding random number generator.
• The generated random numbers should be uniformly distributed. The probability of
generating a given random number should be equal among all the possible random
numbers the random number generator can produce.

The following code listing shows how to use the java.security.SecureRandom Java class to
generate a random number. Neil Madden, the author of the book API Security in Action
(Manning, 2020), suggests this approach in his blog
(https://neilmadden.blog/2018/08/30/moving-away-from-uuids/).

Listing 3.3 Generating a random, nonguessable nonce in Java

import java.security.SecureRandom;
import java.util.Base64;

public class SecureRandomString {

    private static final SecureRandom random = new SecureRandom();
    private static final Base64.Encoder encoder = Base64.getUrlEncoder().withoutPadding();

    public static String generateNonce() {
        byte[] buffer = new byte[20];
        random.nextBytes(buffer);
        return encoder.encodeToString(buffer);
    }
}


3.8 Implementing implicit flow using Google as the OpenID provider

In this section we discuss how to implement the implicit authentication flow using Google as
an OpenID provider. Here we are not going to build a web application from the ground up;
rather, we use a raw URL constructed from all the necessary parameters and place it in the
browser address bar to demonstrate the authentication request and authentication response
we discussed in section 3.3 under the implicit flow.

3.8.1 Setting up Google as an OpenID provider

To set up Google as an OpenID provider, please check
https://github.com/openidconnect-in-action/samples/blob/master/IDPs.md. Once you are
done with that, you get a client_id and a client_secret for your client application.
However, since we are using the implicit authentication flow here, we only need the
client_id. While registering your client application with Google, you also need to provide
one or more redirect_uris; let’s assume you have https://localhost:3000/redirect_uri as the
redirect_uri. The following code snippet lists all the parameters we need to know to build a
client application against the Google OpenID provider following the implicit flow.

Listing 3.4 Parameters required to communicate with the Google OpenID provider

client_id: 450443992251-7gft9u.apps.googleusercontent.com
redirect_uri: https://localhost:3000/redirect_uri
Google authorization endpoint: https://accounts.google.com/o/oauth2/v2/auth

3.8.2 Constructing the authentication request

In this section we’ll construct an OpenID Connect authentication request with the parameters
from listing 3.4 and some additional parameters such as scope, state, response_type,
and nonce. The following listing shows the complete authentication request; you may replace
the value of client_id with what you got from Google after registering the client
application. Also, we use two random string values for the state and nonce parameters, and we
expect the response from the OpenID provider to carry the same state value. In section 3.7
we discussed how to generate a random, nonguessable value for the nonce parameter.

Listing 3.5 Authentication request that redirects the user to the Google OpenID provider
https://accounts.google.com/o/oauth2/v2/auth?
client_id=450443992251-7gft9u.apps.googleusercontent.com&
redirect_uri=https://localhost:3000/redirect_uri&
scope=openid profile&
response_type=id_token token&
state=caf7871khs872&
nonce=89hj37b3gd3
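When you paste the request in listing 3.5 into the address bar, the browser encodes the spaces in the scope and response_type values for you; in application code you have to encode the parameters yourself. A sketch (class and method names are ours; the endpoint and parameter values come from listings 3.4 and 3.5):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class AuthRequestBuilder {

    public static String buildRequest(String clientId, String redirectUri,
                                      String state, String nonce) {
        return "https://accounts.google.com/o/oauth2/v2/auth"
                + "?client_id=" + enc(clientId)
                + "&redirect_uri=" + enc(redirectUri)
                + "&scope=" + enc("openid profile")          // space becomes +
                + "&response_type=" + enc("id_token token")
                + "&state=" + enc(state)
                + "&nonce=" + enc(nonce);
    }

    // URLEncoder produces application/x-www-form-urlencoded output,
    // which is the encoding the query-string component expects
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```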

Once you copy and paste the above request into your browser location bar, it will take you to
Google (the OpenID provider) for authentication. If you are not logged in already, you may
see a screen similar to figure 3.4. The same screen will display the name of the client
application you used during the application registration process, and the set of attributes


Google is about to share with the client application. We used profile as the scope value (in
addition to openid) in the authentication request, and the list of attributes (name, email
address, and so on) shown in figure 3.10 is related to that. You’ll learn more about
requesting user attributes using scopes in chapter 5.

Figure 3.10 Google login screen for user authentication. The name of the client application you used during the
application registration process is displayed on the screen (Book Club), along with the user attributes Google is
going to share with the client application.

Once you complete the authentication flow at the Google OpenID provider by entering your
own credentials, it will redirect you back to the client application (to the redirect_uri we
provided). However, since we do not have any application running at that address, the
response from the Google OpenID provider will remain in the browser location bar, as shown in
the following code (listing 3.6). In practice, you will get a lengthy string for the values of the
access_token and id_token parameters, which we have replaced in listing 3.6 with the
text <ACCESS_TOKEN> and <ID_TOKEN> respectively. In the response from Google, we got
both the access_token and the id_token; as you might have rightly guessed already, that is
because we used id_token token as the response_type in the authentication request


(listing 3.5). Had we used just id_token as the response_type, you wouldn’t see the
access_token in the response.

Listing 3.6 Authentication response from the Google OpenID provider

https://localhost:3000/redirect_uri#state=caf7871khs872
&access_token=<ACCESS_TOKEN>
&token_type=Bearer
&expires_in=3599
&scope=profile%20openid%20https://www.googleapis.com/auth/userinfo.profile
&id_token=<ID_TOKEN>&authuser=0&prompt=consent

3.8.3 An overview of the ID token returned from the Google OpenID provider
In section 3.8.2 we got an authentication response from the Google OpenID provider, which
includes an ID token. In this section we’ll delve deep into the attributes that you find in the
ID token (listing 3.6). As you learnt in chapter 1, the ID token is a JSON Web Token (JWT).
In chapter 4, we discuss JWT in detail.
The ID token you got from Google has three parts, separated from each other by a
period (.). If a JWT has three parts, it is a JSON Web Signature (JWS), and if it has five
parts, it is a JSON Web Encryption (JWE). In chapter 4, you’ll learn about both JWS and JWE
in detail.
To decode the JWT you got in the response from Google, in listing 3.6, let’s use
https://jwt.io. Just copy the value of the id_token parameter into the Encoded text area of the
web site (jwt.io), and you will get the values decoded. Then again, you need to treat your ID
tokens as secrets and never use public sites like jwt.io to decode any production ID tokens.4
Once you decode the ID token, you will see the decoded payload of the JWT as shown in the
following listing.
following listing.

Listing 3.7 The decoded payload of the ID token returned from Google
{
  "iss": "https://accounts.google.com",
  "azp": "450443992251-9l17d82cli8npa9cdrvcp1g9m17gft9u.apps.googleusercontent.com",
  "aud": "450443992251-9l17d82cli8npa9cdrvcp1g9m17gft9u.apps.googleusercontent.com",
  "sub": "104063262378861625904",
  "at_hash": "Krnm3SB1v_UR00j50VzLoQ",
  "nonce": "89hj37b3gd3",
  "name": "Prabath Siriwardena",
  "picture": "https://lh3.googleusercontent.com/a-/AOh14GiriDTmbf8tcSKzMkFYvYwYuBMUmGFdtEBqpvRGOA=s96-c",
  "given_name": "Prabath",
  "family_name": "Siriwardena",
  "locale": "en",
  "iat": 1601313517,
  "exp": 1601317117,
  "jti": "4fea08bd4386e45d6f8b869520ace3f7a4f80bde"
}

4 You can follow the approach this blog (https://prefetch.net/blog/2020/07/14/decoding-json-web-tokens-jwts-from-the-linux-command-line/)
suggests to decode a JWT on the command line. That’s a much safer way to decode a JWT than relying on online tools.
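Decoding locally is straightforward, because a JWS is just three base64url-encoded parts separated by periods, and the payload is the middle one. The following sketch (class and method names are ours) decodes the payload without verifying the signature, so it’s only for inspection, never for trusting the claims:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class JwtPayloadDecoder {

    // Decodes the payload (the second of the three dot-separated parts)
    // of a JWS. This only decodes; it does NOT verify the signature.
    public static String decodePayload(String jwt) {
        String[] parts = jwt.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected a three-part JWS");
        }
        return new String(Base64.getUrlDecoder().decode(parts[1]), StandardCharsets.UTF_8);
    }
}
```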


The JWT payload in listing 3.7 has two types of claims: claims related to the end user
and claims related to token validation. The sub, name, picture, given_name,
family_name, and locale claims are related to the end user, and all of them are
standard claims defined in the OpenID Connect specification. In chapter 5 we’ll go through
these claims in detail.
The rest of the claims in listing 3.7 are defined in two specifications. Some are defined in
the JWT specification (jti, iat, exp, aud, iss) and some are defined in the OpenID Connect
specification (nonce, at_hash, azp). Chapter 4 covers in detail all the claims defined by the
JWT specification, and in section 3.8.4 we’ll explain nonce, at_hash and azp. Even
though the iat, exp, aud and iss claims are defined in the JWT specification, the OpenID
Connect specification makes all of them mandatory in an ID token.

3.8.4 ID token validation rules


In this section you'll learn how to validate an ID token issued by an OpenID provider. You already know that an ID token is a JWT, so the first step in validating an ID token is to make sure it is a valid JWT. We discuss how to validate a JWT in detail in chapter 4. This section explains how a client application uses some of the attributes the OpenID Connect specification introduced into the ID token during the validation process. Once the client application has identified the ID token as a valid JWT, it needs to perform the further validations listed below.
• If the authentication request included a nonce parameter, the same value must be returned to the client application as a claim in the ID token, and the client application must also make sure it has not seen the same nonce value in an ID token before. This check prevents a replay attack of an ID token, which we discuss in detail in chapter 10.
• The value of the at_hash parameter in the ID token is a base64url-encoded hash of part of the access token (not the hash of the complete token). The hashing algorithm the OpenID provider uses is derived from the alg parameter in the JWT header. If an access_token is returned along with the ID token in the authentication response from the OpenID provider, this parameter must be present in the ID token and must be valid. This helps prevent some of the security issues found in the OAuth 2.0 implicit grant type, and chapter 10 discusses that in detail.
• The azp (authorized party) parameter in the ID token is an optional parameter, but if the client_id is not present in the aud parameter of the ID token, the azp parameter must be included in the ID token and its value must be the client_id.

There are a few more claims (auth_time, acr, and amr) the OpenID Connect specification introduced with respect to token validation, and we'll discuss them in detail in chapter 6. Then again, most of the time as an application developer you will not write code to do these validations yourself; rather, you will use an OpenID Connect library.


Security issues with the implicit flow and how to mitigate those
We discussed the OAuth 2.0 implicit grant type in chapter 2. The IETF OAuth 2.0 working group recommends not using the implicit grant type due to a couple of security issues explained in the OAuth 2.0 Security Best Current Practice document (https://tools.ietf.org/html/draft-ietf-oauth-security-topics-15), mostly the access token replay and access token leakage attacks. We discuss these two attacks in chapter 10. However, the implicit flow in OpenID Connect is not exactly the implicit grant type in OAuth 2.0; OpenID Connect adds protection to guard the implicit flow against both the access token replay and access token leakage attacks.

The at_hash parameter we discussed in section 3.8.4 binds the access token issued along with the ID token to the ID token. Since the ID token itself has replay protection with the nonce parameter, binding the access token to the ID token also protects the access token from being replayed. We have dedicated chapter 10 to security issues related to OpenID Connect and OAuth 2.0, so we'll defer a detailed discussion to chapter 10.

3.9 How does authorization code flow work?


In this section you'll learn in detail how the authorization code flow works with a SPA. There are many similarities between the parameters passed in the authentication request and the authentication response under the implicit flow we discussed in section 3.3 and under the authorization code flow, so we assume you have gone through section 3.3 already. The sequence of events, or steps, that happens during this flow, as well as the messages passed in each step, is clearly defined in the OpenID Connect specification.

3.9.1 The flow of events in the authorization code authentication flow


Figure 3.11 shows the sequence of events that happens between the OpenID provider, the client application, and the user. The client application in figure 3.11 can be any type of application, but here our discussion mostly focuses on a SPA. Over time, the authorization code flow has become the more popular way of implementing login with OpenID Connect for SPAs. In the following sections we discuss in detail what happens in each step in figure 3.11.


Figure 3.11 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.

THE CLIENT APPLICATION INITIATES A LOGIN REQUEST VIA THE BROWSER (STEP 1)
In step 1 of figure 3.11, the client application initiates a login request via the browser. In the case of a SPA, we can expect the user to click on a login link on the web page of the client application, and the browser then does an HTTP GET to the authorization endpoint of the OpenID provider. Listing 3.8 shows an example of an authentication request under the authorization code authentication flow.

Listing 3.8 Authentication request generated by the client application (authorization code flow)
https://accounts.google.com/o/oauth2/v2/auth?
response_type=code&
client_id=424911365001.apps.googleusercontent.com&
scope=openid email&
redirect_uri=https%3A//app.example.com/redirect_uri&
state=Xd2u73hgj59435&
[email protected]&
nonce=0394852-3190485-2490358

In section 3.3 we discussed the usage of all the parameters in listing 3.8, so we do not intend to duplicate that here. However, a few parameters need attention:
• response_type: The value of the response_type parameter in the authentication request defines which tokens the authorization endpoint of the OpenID provider should return to the client application.


In the authorization code flow there is only one possible value: code. The client application expects the authorization endpoint of the OpenID provider to return a code in the authentication response. This is the key parameter in the authentication request that differentiates the authorization code flow from the implicit flow.
• response_mode: The value of the response_mode parameter in the authentication request defines how the client application expects the response from the OpenID provider. For the authorization code flow, the default value of the response_mode parameter is query, so the client application expects the OpenID provider to return the code and the corresponding parameters in the query string to the redirect_uri.
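Putting the pieces together, a SPA could construct a listing 3.8 style authentication request URL as sketched below. The helper name is ours, and in practice an OpenID Connect library builds this URL for you; the endpoint and parameter values are the examples from listing 3.8.

```javascript
// A sketch of building the authorization code flow authentication request.
// response_type=code is what makes this the authorization code flow.
function buildAuthnRequest({ authzEp, clientId, redirectUri, state, nonce }) {
  const params = new URLSearchParams({
    response_type: 'code',
    client_id: clientId,
    scope: 'openid email',
    redirect_uri: redirectUri,
    state,
    nonce,
  });
  // The browser is then sent to this URL (e.g. window.location.assign(url)).
  return `${authzEp}?${params.toString()}`;
}
```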

THE OPENID PROVIDER VALIDATES THE AUTHENTICATION REQUEST AND REDIRECTS THE USER BACK TO THE BROWSER FOR
AUTHENTICATION (STEP 2)

Once the OpenID provider validates the authentication request from the client application, it checks whether the user has a valid login session under the OpenID provider's domain. If the user has already logged into the OpenID provider from the same web browser, then a valid login session exists, unless it has expired.
If the user does not have a valid login session, the OpenID provider will challenge the user to authenticate (step 2 in figure 3.12) and will also get the user's consent to share the requested claims with the client application. In step 3 of figure 3.12 the user types the login credentials, and in step 4 of figure 3.12 the browser posts the credentials to the OpenID provider. Steps 2, 3, and 4 are outside the scope of the OpenID Connect specification and are up to the OpenID providers to implement in the way they prefer.

Figure 3.12 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.


THE OPENID PROVIDER RETURNS BACK THE AUTHORIZATION CODE TO THE CLIENT APPLICATION (STEP 5)
In step 5 of figure 3.12, the OpenID provider returns the authorization code along with the state parameter to the client application in a query string to the redirect_uri (as shown in the following line of code). Once the client application receives the request, it needs to make sure that the value of the state parameter in the response exactly matches the value of the state parameter in the authentication request.

https://app.example.com/redirect_uri?code=YDed2u73hXcr783d&state=Xd2u73hgj59435
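The state check described above can be sketched as follows (the helper name is ours; a library normally does this for you). The state value acts as a CSRF protection, so a mismatch must abort the flow.

```javascript
// A sketch of step 5 handling in the client: parse the redirected URL,
// extract the authorization code, and verify the returned state matches
// the value the client sent in the authentication request.
function handleRedirect(redirectedUrl, expectedState) {
  const url = new URL(redirectedUrl);
  const code = url.searchParams.get('code');
  const state = url.searchParams.get('state');
  if (!code) throw new Error('no authorization code in the response');
  if (state !== expectedState) throw new Error('state mismatch; possible CSRF');
  return code;
}
```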

THE CLIENT APPLICATION EXCHANGES THE AUTHORIZATION CODE FOR AN ID TOKEN AND AN ACCESS TOKEN (STEP 6)
Unlike in the implicit flow, in the authorization code flow the client application does not get the ID token or the access token in the authentication response. To get an ID token and an access token, the client application has to talk to the token endpoint of the OpenID provider, as shown in step 6 of figure 3.12. The following listing shows an example request to the token endpoint of the OpenID provider, which carries the authorization code from the authentication response. The token request defined in the authorization code authentication flow in OpenID Connect is identical to the token request defined under the authorization code grant type in the OAuth 2.0 specification, which we discussed in chapter 2.

Listing 3.9 Request to the token endpoint of the OpenID provider (authorization code flow)
POST /token
HTTP/1.1
Host: oauth2.googleapis.com
Content-Type: application/x-www-form-urlencoded

code=YDed2u73hXcr783d&
client_id=your_client_id&
redirect_uri=https%3A//oauth2.example.com/code&
grant_type=authorization_code

One key thing to notice here is that in the token request in listing 3.9, the client application does not authenticate to the OpenID provider. So, we only use the client_id in the request, and the client application does not need to have a client_secret or any other authentication mechanism. As you already learned in chapter 2, a SPA is called a public client in OAuth 2.0 terminology, and a public client does not have the capability to protect any secrets.
Since a SPA runs in the browser, it can't hide any secrets from the end user. Anything that you hide in the browser is visible to the end user, so there is no point in having any credentials for a SPA. In chapter 6, we'll discuss how to use the authorization code flow with a traditional (server-side) web application; there the web application will use the client_id and client_secret to authenticate to the token endpoint of the OpenID provider.
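From a SPA, the token request in listing 3.9 could be sent with fetch(), as sketched below. The function name is ours, and an OpenID Connect library normally performs this exchange; note that, as a public client, the SPA sends only its client_id and no client_secret.

```javascript
// A sketch of the step 6 code-for-tokens exchange from a SPA.
async function exchangeCode(tokenEp, { code, clientId, redirectUri }) {
  const res = await fetch(tokenEp, {
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'authorization_code',
      code,
      client_id: clientId,       // public client: no client_secret
      redirect_uri: redirectUri,
    }),
  });
  if (!res.ok) throw new Error(`token request failed: ${res.status}`);
  // Expected shape: { access_token, id_token, refresh_token, ... }
  return res.json();
}
```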

THE OPENID PROVIDER RETURNS AN ID TOKEN AND AN ACCESS TOKEN TO THE CLIENT APPLICATION (STEP 7)
In step 7 of figure 3.12 the OpenID provider returns an ID token and an access token to the client application, as shown in the following listing.


Listing 3.10 The response from the token endpoint of the OpenID provider
{
"access_token": "1/fFAGRNJru1FTz70BzhT3Zg",
"expires_in": 3920,
"token_type": "Bearer",
"id_token": "<ID_TOKEN>",
"refresh_token": "<REFRESH_TOKEN>"
}

The only difference between the response in listing 3.10 and the response you get from the token endpoint under the OAuth 2.0 authorization code grant type (which we discussed in chapter 2) is that here we have an additional parameter called id_token, which carries the ID token. Also, unlike in the implicit flow, in the authorization code flow the client application gets a refresh_token, which can be used to renew the access_token and also the id_token. We talk about refreshing tokens in detail in chapter 6.

3.10 Authorization code flow or the implicit flow?


In section 3.3 we discussed the implicit flow in detail, and in section 3.9 the authorization code flow. In practice, there are SPAs that use the implicit flow as well as the authorization code flow. However, most new SPAs use the authorization code flow, and based on the following points we recommend the authorization code flow over the implicit flow.
• Under the implicit flow, the access_token and the id_token are returned as a URI fragment to the redirect_uri. The URI fragment will remain in the browser history, and anyone having access to the browser could observe those tokens. However, with JavaScript code like the following running in the client application, you can still remove the URI fragment (after reading the tokens) from the browser location bar as well as from the browser history.

// remove the fragment as much as possible without adding
// an entry in the browser history:
window.location.replace("#");

// slice off the remaining '#' in HTML5:
if (typeof window.history.replaceState == 'function') {
  history.replaceState({}, '', window.location.href.slice(0, -1));
}

• The implicit flow does not return a refresh_token, while the authorization code flow does. Under the implicit flow, if the access_token expires, the client application has to re-initiate the login flow to obtain a new access_token.
• This point is not directly related to SPAs; however, if you implement OpenID Connect login with the implicit flow in a native mobile application or a desktop application, it could possibly be vulnerable to token interception, which we discuss in detail in chapter 10. Under the authorization code flow, this attack can be mitigated by implementing the Proof Key for Code Exchange (PKCE) OAuth 2.0 profile (https://tools.ietf.org/html/rfc7636), which we discuss in chapter 6 in detail.


• Finally, it's better to understand why the implicit flow was added to the OpenID Connect specification. When you use the authorization code flow with a SPA, to exchange the code for an ID token and/or an access token, you need to make a direct HTTP request from JavaScript running in the browser to the token endpoint of the OpenID provider. Most of the time, the domain of the token endpoint is different from the domain of the client application, and by default browsers won't permit such cross-domain calls. That means a client application running on the app.example.com domain won't be able to make a direct HTTP request from JavaScript to the op.example.com domain.

That was the reason OpenID Connect introduced the implicit flow. Unlike in the authorization code flow, with the implicit flow there are no direct HTTP requests between a client application running in the browser and the OpenID provider. However, over time cross-origin resource sharing (CORS) policies have become popular. CORS enables you to make cross-domain calls selectively, so you can use the authorization code flow by enabling CORS for the client application's domain to access the token endpoint of the authorization server. That means there is no reason to use the implicit flow now. In chapter 5, you'll learn more details about CORS.
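For illustration, the CORS decision on the authorization server's side can be sketched as a small function that echoes the SPA's origin back only when it is on a configured allow list. The names here are ours; real authorization servers make the allowed origins configurable.

```javascript
// A sketch of how a token endpoint might compute CORS response headers:
// the requesting origin is echoed back only if it is explicitly allowed.
function corsHeaders(requestOrigin, allowedOrigins) {
  if (!allowedOrigins.includes(requestOrigin)) {
    return {}; // no CORS headers, so the browser blocks the cross-origin call
  }
  return {
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Methods': 'POST, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
  };
}
```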

3.11 Securing a single-page application using OpenID Connect


In this section you'll learn how to create a fully functioning SPA using React and then integrate the SPA with OpenID Connect for login. We assume you have a good knowledge of React; if not, please go through appendix A first, which covers all the React fundamentals you need to know to follow this section.

3.11.1 Building a single-page application with React


In this section we’ll build a React application with no security, and make sure its up and
running. All the samples we use in this book are available in the GitHub repository:
https://github.com/openidconnect-in-action/samples; either you can do a git clone or
download all the samples as a zip file. The sample we discuss in the section is available
under the chapter03/sample01 directory. To build the same React application, run the
following command from the sample01 directory. This npm command will look at the
package.json file inside the sample01 directory and download all the dependent node
modules and store them under the directory sample01/node_modules. You won’t see the
node_modules directory in the samples you downloaded from the GitHub repository; it’s
created only after you run the following command.

\> npm install

To build the React application, run the following command from sample01 directory on the
command console. This command wll create directory called build and copy all the files that
you want to deploy into your production web server into the build directory.

\> npm run build


In this example we use a node server as our web server. You can start the node server using the following command, run from the sample01 directory.

\> npm start

The above command starts the node server on localhost port 3000 by default; if you visit http://localhost:3000 in the web browser, you will see a welcome message. This is the simplest React application you can have; in the next two sections, you'll learn how to secure this application with OpenID Connect.

3.11.2 Setting up an OpenID Provider


To secure the React application we developed in section 3.11.1, we need an OpenID provider. In section 3.8, in our discussion of the implicit flow, we used Google as the OpenID provider. However, we cannot use Google as the OpenID provider here, because Google always expects a client application that implements OpenID Connect login following the authorization code flow to authenticate to the token endpoint using the client_id and client_secret.
As we discussed in section 3.9, a SPA is a public client, and public clients cannot securely store secrets, so there is no point in authenticating a SPA. Here we need to use an OpenID provider that supports the authorization code flow with no client authentication. At https://github.com/openidconnect-in-action/samples/blob/master/IDPs.md, we explain how to set up two open source OpenID providers; to run the examples in this section you can pick one of them. Once you have successfully set up your OpenID provider and registered your application with it, you need the following parameters to secure access to the React application with OpenID Connect.

Listing 3.11 Parameters with sample values required to communicate with the OpenID provider
client_id: D4ZoMSpsxqgvUuiC6j5ROnEYea0a
redirect_uri: https://localhost:3000
Authorization endpoint: https://localhost:9443/oauth2/authz
Token endpoint: https://localhost:9443/oauth2/token
Issuer: https://localhost:9443

3.11.3 Updating the client application to use OpenID Connect login


In this section we are going to update the React application we developed in section 3.11.1 to support login with OpenID Connect. If you are already running the node server that hosts the React application from section 3.11.1, please take it down by pressing Ctrl + C in the terminal that runs the node server.
To enable OpenID Connect login in the React application in the chapter03/sample01 directory, we use the npm package @facilelogin/oidc-react. It's an open source npm package released under the MIT license. We developed this package by forking two open-source node modules: https://github.com/auth0/auth0-react and https://github.com/auth0/auth0-spa-js. If you are interested in more details, you can find the two forked repositories with our changes here: https://github.com/openidconnect-in-action/oidc-react and https://github.com/openidconnect-in-action/oidc-spa-js.


To install the @facilelogin/oidc-react package, run the following command from the sample01 directory. Once the command runs successfully, you'll find a new entry for the @facilelogin/oidc-react package added to the sample01/package.json file under the dependencies section.

\> npm install @facilelogin/oidc-react

The @facilelogin/oidc-react package introduces a new React component called <OIDCProvider /> that carries the configuration corresponding to your OpenID provider. To add this component to your React application, in sample01/src/index.js, replace the existing call to the ReactDOM.render() function with the following. You also need an import statement to import the OIDCProvider component from the @facilelogin/oidc-react package.

Listing 3.12 Rendering the <OIDCProvider /> React component

import { OIDCProvider } from '@facilelogin/oidc-react';

ReactDOM.render(
  <OIDCProvider
    domain="localhost:9443"
    tokenEp="https://localhost:9443/oauth2/token"
    authzEp="https://localhost:9443/oauth2/authorize"
    clientId="D4ZoMSpsxqgvUuiC6j5ROnEYea0a"
    issuer="https://localhost:9443"
    redirectUri={window.location.origin}>
    <App />
  </OIDCProvider>,
  document.getElementById('book-club-app')
);

Now you can replace the code in sample01/src/components/App.js with the following code, which adds a login button to the welcome page.


Listing 3.13 Updated App.js code that renders the login button to initiate the login flow
import React from 'react';
import { useAuth0 } from '@facilelogin/oidc-react';

function App() {
  const { isLoading, isAuthenticated, error, user, loginWithRedirect, logout } = useAuth0();

  if (isLoading) {
    return <div>Loading...</div>;
  }
  if (error) {
    return <div>Oops... {error.message}</div>;
  }

  if (isAuthenticated) {
    console.log(user.sub);
    return (
      <div>
        Hello {user.sub}{' '}
        <button onClick={() => logout({ returnTo: window.location.origin })}>
          Log out
        </button>
      </div>
    );
  } else {
    return <button onClick={loginWithRedirect}>Log in</button>;
  }
}

export default App;

Now you can build the updated React application and start the node server using the following two npm commands.

\> npm run build

\> npm start

Once the node server has successfully started, you can visit http://localhost:3000 and click on the login button to initiate the login flow; you will get redirected to the OpenID provider's login page.


3.12 Summary
• OpenID Connect defines three authentication flows: the authorization code flow, the implicit flow, and the hybrid flow.
• An authentication flow in OpenID Connect uses OAuth 2.0 grant types, but an authentication flow is more than a grant type. It defines additional request/response parameters on top of what OAuth 2.0 already defines for grant types.
• The OpenID Connect specification defines the authentication flows in a self-contained manner, so we should not confuse OAuth 2.0 grant types with OpenID Connect authentication flows.
• The implicit authentication flow uses id_token or id_token token as the value of the response_type parameter in the authentication request and gets both the access token and the ID token from the authorization endpoint of the OpenID provider.
• The authorization code authentication flow uses code as the value of the response_type parameter in the authentication request and gets the access token and the ID token from the token endpoint of the OpenID provider.
• In practice, some SPAs use the implicit flow as well as the authorization code flow. However, most new SPAs use only the authorization code flow.


The building blocks of an ID token

This chapter covers

• The fundamentals of a JSON Web Token (JWT)


• Using compact serialization with a JSON Web Signature (JWS) token
• Using compact serialization with a JSON Web Encryption (JWE) token
• The role of a nested JWT
In chapter 1 you learned that OpenID Connect defines a schema for a token, which is called an ID token, along with a set of processing rules around it, and a transport binding, which defines how to transport an ID token from one point to another. An ID token in OpenID Connect is a JSON Web Token (JWT). A JWT is one of the key building blocks of the OpenID Connect standard. In this chapter, we discuss JWT in detail.
A good understanding of JWT is helpful to follow the rest of the chapters in this book. Even if you are already familiar with JWT, we still recommend you go through this chapter, as we also discuss the changes OpenID Connect brings in to build an ID token on top of a JWT. If you are interested in understanding the internals of JWT in more detail, we recommend Advanced API Security: OAuth 2.0 and Beyond (Apress, 2019) by Prabath Siriwardena.

4.1 What is a JSON Web Token?


In this section you'll learn what a JWT is and the different types of assertions it carries. A JWT (pronounced jot) is a container that carries different types of assertions, or claims, from one place to another in a cryptographically safe manner. An assertion is a strong statement about someone or something, issued by some entity. The entity that issues an assertion is called the issuer of the assertion, or the asserting party. By issuing an assertion, the issuer makes a claim about the party (someone or something) the assertion belongs to. RFC 7519 (https://tools.ietf.org/html/rfc7519), developed under the IETF OAuth working group, defines the structure and the processing rules of a JWT.


Imagine that your state's Department of Motor Vehicles (DMV) creates a JWT to represent your driver's license, with your personal information, which includes your name, address, eye color, hair color, gender, date of birth, license expiration date, and license number. All these items are attributes, or claims, about you, and we can also call them attribute assertions. The DMV is the issuer of these attribute assertions, and also the issuer of the JWT that embeds those assertions.
Anyone who gets a JWT can decide whether to accept what's in it as true, based on the level of trust they have in the issuer of the token (in this case, the DMV). But before accepting a JWT, how do you know who issued it? The issuer of a JWT signs it by using their private key, and the JWT itself carries an identifier corresponding to the issuer under a special attribute called iss. In the scenario illustrated in figure 4.1, a bartender, who is the recipient of the JWT, can verify the signature of the JWT and see who signed the token.

Figure 4.1 A JWT is used as a container to transport assertions from one place to another in a
cryptographically safe manner. The bartender, who is the recipient of the JWT, accepts the JWT only if they
trust the DMV, the issuer of the JWT.

In addition to the attribute assertions, a JWT can carry authentication and authorization
assertions. In fact, a JWT is a container; you can fill it with anything you need. An
authentication assertion might carry an identifier corresponding to the user the issuer of the
JWT asserts and how the issuer authenticated the user before issuing the assertion. In the
DMV use case, an authentication assertion might be your first name and last name (or even
your driver’s license number), or how you are known to the DMV. Later in this chapter you’ll
learn how the issuer embeds authentication assertions into a JWT.
An authorization assertion is about the user’s entitlements, or what the user is entitled to
do. Based on the assertions the JWT brings in from the issuer, the recipient of the JWT can
decide how to act. In the DMV example, if the DMV decides to embed the user’s age as an


attribute in the JWT, that data is an attribute assertion, and a bartender can do the math to calculate whether the user is old enough to buy a beer. Also, without sharing the user's age with the bartender, the DMV may decide to include an authorization assertion stating that the user is old enough to buy a beer. In that case, a bartender will accept the JWT and let the user buy a beer. The bartender wouldn't know the user's age; however, they will let the user buy a beer because the DMV authorized the user to do so.
In addition to carrying a set of assertions about the user, a JWT plays another role behind the scenes. Apart from the end user's identity, a JWT also carries the issuer's identity, which is the DMV in this case. The issuer's identity is implicitly embedded in the signature of the JWT. By looking at the corresponding public key while validating the signature of the JWT, the recipient can figure out who the token issuer is. Also, the JWT carries an identifier corresponding to the issuer under a special attribute called iss.

4.2 The structure of a JWT


In this section we delve deep into the structure of a JWT, and you'll learn about its different parts.
Let's take a closer look at a JWT. There are two forms of a JWT, and figure 4.2 shows the most common form, which is the JSON Web Signature (JWS). In section 4.3 you'll learn about JWS in detail. The other form of the JWT is the JSON Web Encryption (JWE), which we'll discuss in detail in section 4.6.
A JWT is always a JWS or a JWE token. But the reverse is not always true; that is, a JWS or JWE token is not always a JWT. A JWS or a JWE token becomes a JWT when it is compact serialized.1 The other form of serialization available for JWS and JWE is JSON serialization, and we do not call a JSON-serialized JWS or JWE a JWT. If this is too much information to grasp at this point, just take it as a fact; in sections 4.4 and 4.5 you'll understand it in detail. Figure 4.2 shows the structure of a JWT, which is also a JWS token.

Figure 4.2 A JWT formatted as a JWS. This has three parts in it: the JWT header, which is also known as the
JOSE header, JWT body, which is also known as the claims set, and the third part is the signature.

As you can see in figure 4.2, a JWT, which is also a JWS token, has three parts, with a dot (.) separating each part (in section 4.4 you'll learn that when a JWT is a JWE, it has five parts):
• The first part is known as the JSON Object Signing and Encryption (JOSE) header.
We’ll discuss JOSE header in detail in section 4.3 and 4.4.

1 In computing, serialization is the process of translating a data structure or object state into a format that can be stored or transmitted and reconstructed
later (https://en.wikipedia.org/wiki/Serialization)


• The second part is the claims set, or body (or payload). In general, under JWS we call this the JWS body; however, when a JWS becomes a JWT, we call it the claims set. We'll discuss these differences in section 4.3.
• The third part is the signature of the JOSE header and the claims set. We’ll discuss in
section 4.3 how an issuer generates a signature and a client application verifies it.
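The three dot-separated parts are easy to inspect programmatically. The following is a minimal sketch using only the JDK; the sample header and claims values are illustrative, not tied to any particular provider:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class JwtParts {

    // split a compact-serialized token into its dot-separated parts
    public static String[] split(String token) {
        return token.split("\\.", -1);
    }

    // base64url-decode one part into a UTF-8 string
    public static String decodePart(String part) {
        return new String(Base64.getUrlDecoder().decode(part), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Base64.Encoder b64 = Base64.getUrlEncoder().withoutPadding();
        // an illustrative JWS-formatted JWT: header.claims-set.signature
        String jwt = b64.encodeToString("{\"alg\":\"RS256\"}".getBytes(StandardCharsets.UTF_8))
            + "." + b64.encodeToString("{\"sub\":\"peter\"}".getBytes(StandardCharsets.UTF_8))
            + "." + b64.encodeToString("placeholder-signature".getBytes(StandardCharsets.UTF_8));

        String[] parts = split(jwt);
        System.out.println(parts.length);          // 3
        System.out.println(decodePart(parts[0]));  // {"alg":"RS256"}
        System.out.println(decodePart(parts[1]));  // {"sub":"peter"}
    }
}
```

Note that only the first two parts decode to readable JSON; the third part is raw signature bytes.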

4.2.1 The JWT JOSE header


In this section you’ll learn the key attributes in the JOSE header, as defined by the JWT
specification. The JOSE header is a base64url-encoded JSON object that expresses the
metadata related to the JWT, such as the algorithm used to sign the message. Here’s the
base64url-decoded JOSE header corresponding to figure 4.2:

Listing 4.1 The JOSE header of a JWT


{
"alg": "RS256" #A
}

#A The cryptographic algorithm used to sign the JOSE header and the body of a JWT. The RFC 7518: The JSON Web
Algorithms specification defines identifiers of these algorithms.

You already learned that a JWT has to be either a JWS or a JWE token. So, what exact
attributes go into the JOSE header depends on whether the JWT is a JWS or a JWE token. The
JWT specification does not define any new attributes for the JWT JOSE header; all the
header attributes a JWT inherits come from the JWS and JWE specifications. However, the
JWT specification talks about two header attributes that it inherits from the JWS
specification and explains their usage with respect to a JWT. The following lists
those two header attributes.
• The typ attribute in the JOSE header is used to define the media type of the complete
JWT. A media type is an identifier that defines the format of content transmitted
on the Internet. There are two types of components that process a JWT:
JWT implementations and JWT applications. Nimbus is a JWT implementation in Java;
the Nimbus library knows how to build and parse a JWT. A JWT application can be
anything that uses JWTs internally. A JWT application uses a JWT implementation
(such as Nimbus) to build or parse a JWT. In this case, the typ element is just
another element for the JWT implementation: it will not try to interpret the value,
but the JWT application would. The typ element helps JWT applications
differentiate the content of a JWT when values that are not JWTs could also be
present in an application data structure along with a JWT object. This is an optional
attribute, and if it’s present for a JWT, it is recommended to use JWT as the media
type.

{
"alg": "RS256",
"typ": "JWT"
}


• The cty attribute in the JOSE header is used to define structural information about the
JWT. In the case of a nested JWT, this attribute must be present in the JOSE header and
its value must be JWT. However, in non-nested cases, it is not recommended to use
this attribute. A nested JWT is a JWT that encloses another JWT. In section 4.11, we
discuss nested JWTs in detail.

So, what’s the difference between the typ and cty attributes? The typ attribute says
whether the overall structure is a JWT or not. If the cty attribute is present and its
value is set to JWT, then it’s an indication that it’s a nested JWT.
{
"alg": "RS256",
"cty": "JWT"
}

4.2.2 The JWT claims set


In this section you’ll learn what goes inside the JWT claims set and the definition of each
attribute the claims set includes. The JWT claims set (or body) is a base64url-encoded JSON
object between the first and second separators, as in figure 4.2, which carries the assertions.
The following is the base64url-decoded claims set, corresponding to figure 4.2:

Listing 4.2 The claims set of a JWT


{
"sub": "peter",
"aud": "app.example.com",
"nbf": 1533270794,
"iss": "iss.example.com",
"exp": 1533271394,
"iat": 1533270794,
"jti": "5c4d1fa1-7412-4fb1-b888-9bc77e67ff2a"
}

The JWT specification (RFC 7519) defines seven attributes: sub, aud, nbf, iss, exp, iat, and
jti. None of these are mandatory—and it’s up to the other specifications that rely on JWT to
define what is mandatory and what is optional. For example, the OpenID Connect
specification makes the iss, iat, aud and exp attributes mandatory.
These seven attributes that the JWT specification defines are registered in the Internet
Assigned Numbers Authority (IANA) JSON Web Token Claims registry
(https://www.iana.org/assignments/jwt/jwt.xhtml). However, you can introduce your own
custom attributes to the JWT claims set (in chapter 5 we discuss how to introduce custom
claims into an ID token). In the following sections, we discuss these seven attributes in
detail.

THE ISSUER OF A JWT


The iss attribute in the JWT claims set carries an identifier corresponding to the issuer, or
asserting party, of the JWT. The JWT is signed by the issuer’s private key. With respect to
OpenID Connect, the OpenID provider is the issuer. Since the ID token itself is a JWT, the
issuer of the JWT is also the issuer of the ID token. Even though the iss attribute in the


JWT is optional, the OpenID Connect specification makes it a required attribute in the ID
token. So when a JWT is used within the context of the OpenID Connect protocol (as an ID
token), it must have the iss attribute.

THE SUBJECT OF A JWT


The sub attribute in the JWT claims set defines the subject (the party being asserted by the
issuer) of a JWT. The subject is the owner of the JWT—or in other words, the JWT carries the
claims about the subject. In the bartender example we discussed in section 4.1, Peter is the
subject. The applications of the JWT can further refine the definition of the sub attribute. The
OpenID Connect specification, for example, makes the sub attribute mandatory, and the
issuer of the token must make sure that the sub attribute carries an immutable unique
identifier. That helps the client application identify the subject uniquely, and since this
value is immutable (never going to change), the client application can rely on this attribute
to identify the user over time, even if other attributes of the user, such as email and
phone number, change over time. However, if the client application works with multiple
OpenID providers, then the value of the sub attribute is unique only for a given value of the
iss attribute.

THE AUDIENCE OF A JWT


The aud attribute in the JWT claims set specifies the audience, or intended recipient of the
token. In listing 4.2, it’s set to the string value app.example.com. The value of the aud
attribute can be any string or a URI that’s known to the recipient of the JWT.
The recipient of the JWT must check the value of the aud parameter to see whether it’s
known before accepting any JWT as valid. If you have an application called foo with the
audience value foo.example.com, that application should reject any JWT that does not carry
the aud value foo.example.com, for example. The logic in accepting or rejecting a JWT
based on audience is up to the corresponding client application (or the recipient of the JWT)
and to the overall security design.
By design, you can define a policy to agree that any client application will accept a JWT
with the audience value <app identifier>.example.com or *.example.com, for example.
However, the narrower you can make the audience, the better. A broader audience means the
same JWT can be used to access more applications, which widens the attack surface.
The OpenID Connect specification defines a set of strict rules on top of what is already defined
in the JWT specification with respect to the aud attribute. In terms of OpenID Connect, the
value of the aud attribute has to represent the client application, and it must carry the value
of the client identifier corresponding to the client application, provided by the OpenID
provider. In fact, the value of the aud attribute is a JSON array, so it can carry multiple values.
However, in an ID token, at least one value of the aud attribute array must be the client
identifier of the corresponding application. Also, though aud is an optional attribute as per
the JWT specification, OpenID Connect makes it mandatory in an ID token.
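As a sketch of this check, a recipient might validate the aud values of a parsed token against its own client identifier as follows; the identifiers here are illustrative:

```java
import java.util.Arrays;
import java.util.List;

public class AudienceCheck {

    // Reject the token unless one of the aud values matches the client
    // identifier this application expects. The identifiers used here are
    // illustrative, not from any real provider.
    public static boolean audienceMatches(List<String> aud, String clientId) {
        return aud != null && aud.contains(clientId);
    }

    public static void main(String[] args) {
        List<String> aud = Arrays.asList("app.example.com", "analytics.example.com");
        System.out.println(audienceMatches(aud, "app.example.com"));  // true
        System.out.println(audienceMatches(aud, "evil.example.com")); // false
    }
}
```

The key design point is that the check is an exact match against a known identifier, not a substring or pattern match, which keeps the accepted audience as narrow as possible.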

JWT EXPIRATION, NOT BEFORE AND ISSUED TIME


The value of the exp attribute in the JWT claims set expresses the time of expiration in
seconds, which is calculated from 1970-01-01T0:0:0Z as measured in Coordinated Universal


Time (UTC). Any recipient of a JWT must make sure that the time represented by the exp
attribute is not in the past when accepting a JWT—or in other words, the token is not
expired. The iat attribute in the JWT claims set expresses the time when the JWT was
issued. That too is expressed in seconds and calculated from 1970-01-01T0:0:0Z as
measured in UTC.
The time difference between iat and exp in seconds isn’t the lifetime of the JWT when
there’s an nbf (not before) attribute present in the claims set. You shouldn’t start processing
a JWT (or accept it as a valid token) before the time specified in the nbf attribute. The value
of nbf is also expressed in seconds and calculated from 1970-01-01T0:0:0Z as measured in
UTC. When the nbf attribute is present in the claims set, the lifetime of a JWT is calculated
as the difference between the exp and nbf attributes. However, in most cases, the value of
nbf is equal to the value of iat.
According to the JWT specification, exp, iat and nbf are optional attributes. However,
the OpenID Connect specification makes exp and iat attributes mandatory in an ID token,
and does not talk about the nbf attribute. So, the OpenID Connect client applications should
always look for the exp and iat attributes while verifying the ID token and must not expect
the nbf attribute to be present all the time. However, if the nbf attribute is present in the
token, the corresponding application must validate it.
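As an illustration of these rules, the following sketch validates the exp and nbf values (epoch seconds) against the current time; the clock-skew allowance is an illustrative design choice, not something the specifications mandate:

```java
public class TokenTimes {

    // Validate the exp and nbf claims (epoch seconds) against the current
    // time, allowing a small clock-skew tolerance. nbf may be null because
    // it is not always present in a token.
    public static boolean isWithinValidity(long expSeconds, Long nbfSeconds,
                                           long nowSeconds, long skewSeconds) {
        // the token must not be expired ...
        if (nowSeconds > expSeconds + skewSeconds) {
            return false;
        }
        // ... and, if nbf is present, must not be used before its time
        if (nbfSeconds != null && nowSeconds < nbfSeconds - skewSeconds) {
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // the exp and nbf values from listing 4.2; "now" falls between them
        long now = 1533270800L;
        System.out.println(isWithinValidity(1533271394L, 1533270794L, now, 60));         // true
        System.out.println(isWithinValidity(1533271394L, 1533270794L, 1533271500L, 60)); // false
    }
}
```

Passing null for the nbf argument skips the not-before check, matching the rule that a recipient must not expect nbf to be present.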

THE JWT IDENTIFIER


The jti attribute in the JWT claims set defines a unique identifier for the token. Ideally, a
given token issuer should not issue two JWTs with the same jti. However, if the recipient of
the JWT accepts tokens from multiple issuers, a given jti will be unique only in combination
with the corresponding issuer’s identifier (the iss attribute). As per the JWT specification,
jti is an optional attribute, and the OpenID Connect specification does not talk about the
jti attribute. So, OpenID Connect client applications must not rely on the jti attribute.

4.3 What does a JSON Web Signature (JWS) token look like?
In section 4.2 you learned that a JWT is either a JWS or a JWE token, but the reverse is not
always true; not every JWS or JWE token is a JWT. The JWT we went through in section 4.2
is also a JWS token. In this section we delve into JWS, and you’ll learn the structure of a JWS
token and the differences between a JWT and a JWS token.
The RFC 7515: JSON Web Signature (JWS) defines the structure and the processing rules
of a JWS token (https://tools.ietf.org/html/rfc7515). Unlike the JWT specification, which is
developed under the IETF OAuth working group, the JWS specification is developed under the
IETF JOSE working group. JWS provides two standard ways to represent a signed message.
The message to be signed, also known as the JWS payload (figure 4.3) can be anything,
such as a JSON payload, an XML payload, or a binary. One way to represent a signed
message with JWS is to use compact serialization, and the other way is to use JSON
serialization.


Figure 4.3 A JWT formatted as a JWS. It has three parts: the JWT header, which is also known as the JOSE
header; the JWT body, which is also known as the claims set; and the signature.

As you learned in section 4.2, we don’t call every JWS a JWT. A JWS becomes a JWT
only when it follows compact serialization. Then again, that’s not 100% precise. As per the
JWS specification, whether a JWS token is compact serialized or JSON serialized, the
message (the JWS payload) it signs can be anything, not necessarily a JSON payload.
However, if we are to call a JWS token a JWT, then it has to be a compact-serialized JWS
token that signs a JSON payload. In other words, for a generic JWS, the content it protects
(or signs) can be represented in any format, but for a JWS to become a JWT, the JWS
payload must be a JSON object (see figure 4.4). In fact, JWS payload is the generic name for
the content that is to be protected, and we call the JWS payload a claims set only when the
JWS becomes a JWT.

Figure 4.4 A JWS token can be either JSON serialized or compact serialized and the JWS payload can be any
content: XML, JSON, binary and so on. A JWT is a compact serialized JWS (or JWE) token where the payload is
a JSON payload.

With JSON serialization, the JWS is represented as a JSON payload (listing 4.3). It’s not
called a JWT. The payload parameter in the JSON-serialized JWS can carry any value. The
message being signed is represented in listing 4.3 as a JSON message with all its related
metadata.


Listing 4.3 An example of a JWS token with JSON serialization and all related metadata
{
"payload":"eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODA...",
"signatures":[
{
"protected":"eyJhbGciOiJSUzI1NiJ9",
"header":{
"kid":"2010-12-29"
},
"signature":"cC4hiUPoj9Eetdgtv3hF80EGrhuB__dzERat0"
},
{
"protected":"eyJhbGciOiJFUzI1NiJ9",
"header":{
"kid":"e9bc097a-ce51-4036-9562-d2ade882db0d"
},
"signature":"DtEhU3ljbEg8L38VWAfUAqOyKAM6..."
}
]
}

Unlike a JWT, a JSON-serialized JWS can carry multiple signatures corresponding to the
same payload. In listing 4.3, the signatures JSON array carries two elements, and each
element carries a different signature of the same payload. The protected and header
attributes inside each element of the signatures JSON array define the metadata related to
the corresponding signature. Since the focus of this chapter is on JWT, we don’t intend to
discuss JSON-serialized JWS tokens in detail. However, if you are interested in learning more
about JWS JSON serialization, please check chapter 7 of the book Advanced API Security.

4.4 Building a compact serialized JWS token


In this section you’ll build a compact serialized JWS with the open source Nimbus
(https://connect2id.com/products/nimbus-jose-jwt) Java library. The source code related to
all the examples used in this chapter is available in the
https://github.com/openidconnect-in-action/samples GitHub repository, inside the
chapter04 directory.
Let’s build the sample, which builds the JWS token, and run it. Run the following Maven
command from the chapter04/sample01 directory. It may take a couple of minutes to finish
the build process when you run this command for the first time. If everything goes well, you
should see the BUILD SUCCESS message at the end:

\> mvn clean install


....................
[INFO] BUILD SUCCESS

Now run your Java program to create a JWS with the following command (from the
chapter04/sample01/lib directory). If it executes successfully, it prints the base64url-
encoded JWS:


\> java -cp "../target/com.manning.oidc.chapter04.sample01-1.0.0.jar:*" \
com.manning.oidc.chapter04.sample01.RSASHA256JWTBuilder

eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJwZXRlciIsImF1ZCI6IiouZWNvbW0uY29tIiwibmJmIj
oxNTMzMjcwNzk0LCJpc3MiOiJzdHMuZWNvbW0uY29tIiwiZXhwIjoxNTMzMjcxMzk0LCJpYXQiO
jE1MzMyNzA3OTQsImp0aSI6IjVjNGQxZmExLTc0MTItNGZiMS1iODg4LTliYzc3ZTY3ZmYyYSJ9
.aOkwoXAsJHz1oD-N0Zz4-dvZBtz7oaBXyoysfTKy2vV6C_Sfw05w10Yg0oyQX6VBK8tw68Tair
pA9322ZziTcteGxaNb-Hqn39krHT35sD68sNOkh7zIqLIIJ59hisO81kK11g05Nr-nZnEv9mfHF
vU_dpQEP-Dgswy_lJ8rZTc

You can decode this JWS token by using the JWT decoder available at https://jwt.io. The
following is the decoded JWS claims set, or payload:

{
"sub": "peter",
"aud": "app.example.com",
"nbf": 1533270794,
"iss": "iss.example.com",
"exp": 1533271394,
"iat": 1533270794,
"jti": "5c4d1fa1-7412-4fb1-b888-9bc77e67ff2a"
}
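Under the hood, the RSASSASigner used by the sample and the verification a recipient performs correspond to what the JDK’s java.security.Signature class does with the SHA256withRSA algorithm. The following is a sketch of that sign-and-verify round trip over a JWS signing input; the key pair is generated on the fly for illustration, whereas a real issuer signs with its own long-lived private key:

```java
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class Rs256RoundTrip {

    // Sign a JWS signing input ("header.payload") with RSA-SHA256 and verify
    // it, the way an issuer and a recipient would. Returns the verification
    // result. The key pair is generated locally for illustration only.
    public static boolean signAndVerify(String signingInput) {
        try {
            KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
            gen.initialize(2048);
            KeyPair pair = gen.generateKeyPair();

            // issuer side: sign with the private key
            Signature signer = Signature.getInstance("SHA256withRSA");
            signer.initSign(pair.getPrivate());
            signer.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            byte[] signature = signer.sign();

            // recipient side: verify with the matching public key
            Signature verifier = Signature.getInstance("SHA256withRSA");
            verifier.initVerify(pair.getPublic());
            verifier.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return verifier.verify(signature);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(signAndVerify("eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJwZXRlciJ9")); // true
    }
}
```

In a real deployment the recipient never sees the private key; it obtains only the public key, which is why verification succeeds for anyone while signing is possible only for the issuer.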

Take a look at the code that generated the JWT. It’s straightforward and self-explanatory
with comments. You can find the complete source code in the
sample01/src/main/java/com/manning/oidc/chapter04/sample01/RSASHA256JWTBuilder.java
file. The following code listing does the core work of JWT generation. It accepts the token
issuer’s private key as an input parameter and uses it to sign the JWT using the RSA-SHA256
signing algorithm.


Listing 4.4 The content of the RSASHA256JWTBuilder.java file


public static String buildRsaSha256SignedJWT(PrivateKey privateKey)
    throws JOSEException {

  // build the audience restriction list.
  List<String> aud = new ArrayList<String>();
  aud.add("app.example.com");

  Date currentTime = new Date();

  // create a claims set.
  JWTClaimsSet jwtClaims = new JWTClaimsSet.Builder().
      // set the value of the issuer.
      issuer("iss.example.com").
      // set the subject value - the JWT belongs to this subject.
      subject("peter").
      // set values for the audience restriction.
      audience(aud).
      // expiration time set to 10 minutes from now.
      expirationTime(new Date(new Date().getTime() + 1000 * 60 * 10)).
      // set the valid-from time to the current time.
      notBeforeTime(currentTime).
      // set the issued time to the current time.
      issueTime(currentTime).
      // set a generated UUID as the JWT identifier.
      jwtID(UUID.randomUUID().toString()).build();

  // create the JWS header with the RSA-SHA256 algorithm.
  JWSHeader jwsHeader = new JWSHeader(JWSAlgorithm.RS256);

  // create the signer with the RSA private key.
  JWSSigner signer = new RSASSASigner((RSAPrivateKey) privateKey);

  // create the signed JWT with the JWS header and the JWT body.
  SignedJWT signedJWT = new SignedJWT(jwsHeader, jwtClaims);

  // sign the JWT with RSA-SHA256.
  signedJWT.sign(signer);

  // serialize into base64url-encoded compact form.
  String jwtInText = signedJWT.serialize();

  // print the value of the JWT.
  System.out.println(jwtInText);

  return jwtInText;
}

4.5 The JOSE header of a JWS token


In section 4.2.1 you learned that a JWT inherits JOSE header attributes from the JWS
and JWE specifications. In this section you’ll learn all the JOSE header attributes the JWS
specification defines and how they are related to the ID token defined in the OpenID Connect
specification.


4.5.1 The alg attribute carries the name of the algorithm


The alg attribute carries the name of the algorithm used to sign the JWS payload and the
JOSE header. This is a required attribute in the JOSE header; failure to include it in the
header will result in a token parsing error. The value of the alg parameter is a string, which
is picked from the JSON Web Signature and Encryption Algorithms registry defined by
RFC 7518, the JSON Web Algorithms (https://tools.ietf.org/html/rfc7518)
specification. If the value of the alg parameter is not picked from that registry,
then it should be defined in a collision-resistant manner, but that won’t give any guarantee
that the particular algorithm is identified by all JWS implementations. It’s always better to
stick to the algorithms defined in the JWA specification.

{
"alg": "RS256"
}

4.5.2 The jku carries a URL pointing to a JSON Web Key set
The jku attribute in the JOSE header carries a URL that points to a JSON Web Key (JWK)
set. This JWK set represents a collection of JSON-encoded public keys, where one of the keys
is used to sign the JWS token. 2 Whatever protocol is used to retrieve the key set, it should
provide integrity protection. If keys are retrieved over HTTP, then instead of plain HTTP,
HTTPS (HTTP over TLS) should be used. The jku is an optional attribute as per the JWS
specification. However, under the context of OpenID Connect, the jku attribute must
not be used to load the keys to verify an ID token; instead, the OpenID provider (the
issuer of the ID token) and the client application (the recipient of the ID token) must
communicate the keys used for signing by some other means, which we discuss in the
following sections.

{
"jku": "https://example.com/jwks.json"
}

The following listing shows a sample JWK set.

Listing 4.5 A sample JSON Web Key (JWK) set


{
"keys":[ #A
{
"e":"AQAB", #B
"kid":"d7d87567-1840-4f45-9614-49071fca4d21", #C
"kty":"RSA", #D
"n":"-WcBjPsrFvGOwqVJd8vpV " #E
}
]
}

#A The parent element, which represents an array of JWKs

2 A JSON Web Key (JWK) is a JSON representation of a cryptographic key, and a JSON Web Key Set (JWKS) is a representation of multiple JWKs. The RFC
7517 (https://tools.ietf.org/html/rfc7517) provides the structure and the definition of a JWK.


#B A cryptographic parameter corresponding to the RSA algorithm


#C The key identifier. This should match the kid value in the JWT header.
#D Defines the key type. The RFC 7518 defines the possible values.
#E A cryptographic parameter corresponding to the RSA algorithm.

4.5.3 The jwk carries the public key corresponding to the signature
The jwk attribute in the JOSE header represents the public key corresponding to the key that
is used to sign the JSON payload. The key is encoded as per the JSON Web Key (JWK)
specification (https://tools.ietf.org/html/rfc7517). The jku parameter, which we discussed in
section 4.5.2, points to a link that holds a set of JWKs, while the jwk parameter embeds the
key into the JOSE header itself. The jwk is an optional parameter. However, as discussed in
section 4.5.2, under the context of OpenID Connect the jwk attribute must not be
used to load the keys to verify an ID token; instead, the OpenID provider (the issuer of
the ID token) and the client application (the recipient of the ID token) must communicate
the keys used for signing by some other means, which we discuss in the following sections.

{
"jwk": <Embeds the JWK>
}

4.5.4 The kid represents an identifier for the key used to sign the message
The kid attribute of the JOSE header represents an identifier for the key that is used to sign
the JOSE payload. Using this identifier, the recipient of the JWS should be able to locate the
key. This is the most commonly used approach in OpenID Connect implementations. The
actual key used to sign the message is exchanged between the OpenID provider and the
client application, and when the client application has to validate the ID token, it looks up
the corresponding key using the kid attribute in the JOSE header.
If the token issuer uses the kid parameter in the JOSE header to let the recipient know
about the signing key, then the corresponding key should be exchanged “somehow” between
the token issuer and the recipient beforehand. How this key exchange happens is out of the
scope of the JWS specification. If the value of the kid parameter refers to a JWK, then the
value of this parameter should match the value of the kid parameter in the JWK. The kid is
an optional parameter in the JOSE header.

{
"kid": "wkek18392-199kkjh39-2983j7h"
}

4.5.5 The x5u attribute carries a URL pointing to an X.509 certificate


The x5u attribute in the JOSE header is similar to the jku attribute, which we
discussed in section 4.5.2. Instead of pointing to a JWK set, the URL here points to an X.509
certificate or a chain of X.509 certificates. The resource pointed to by the URL must hold the
certificate or the chain of certificates in PEM-encoded form. Each certificate in the chain
must appear between the delimiters -----BEGIN CERTIFICATE----- and -----END
CERTIFICATE-----. The public key corresponding to the key used to sign the JOSE payload
should be the very first entry in the certificate chain, and the rest are the certificates of


the intermediate CAs (certificate authorities) and the root CA. The x5u is an optional
parameter in the JOSE header. However, as discussed in section 4.5.2, under the context
of OpenID Connect the x5u attribute must not be used when building the ID token.

{
"x5u": "https://example.com/x509.pem"
}

4.5.6 The x5c attribute represents the X.509 certificate


The x5c attribute in the JOSE header represents the X.509 certificate (or the certificate
chain) that corresponds to the private key used to sign the JOSE payload. This is
similar to the jwk parameter we discussed before, but in this case, instead of a JWK, it’s an
X.509 certificate (or a chain of certificates). The certificate or the certificate chain is
represented in a JSON array of certificate value strings. Each element in the array should be
a base64-encoded DER PKIX certificate value. The public key corresponding to the key used
to sign the JOSE payload should be the very first entry in the JSON array, and the rest are
the certificates of the intermediate CAs (certificate authorities) and the root CA. The x5c is
an optional parameter in the JOSE header. However, as discussed in section 4.5.2, under
the context of OpenID Connect, the x5c attribute must not be used when building the ID token.

{
"x5c": [<base64-encoded DER X.509 certificate or chain of certificates>]
}

4.5.7 The x5t / x5t#S256 attributes represent the thumbprint of a certificate


The x5t attribute in the JOSE header represents the base64url-encoded SHA-1 thumbprint of
the X.509 certificate corresponding to the key used to sign the JSON payload. This is similar
to the kid parameter we discussed before. Both these parameters are used to locate the key.
If the token issuer uses the x5t parameter in the JOSE header to let the recipient know
about the signing key, then the corresponding key should be exchanged “somehow” between
the token issuer and the recipient beforehand. How this key exchange happens is out of the
scope of the JWS specification. The x5t is an optional parameter in the JOSE header.

{
"x5t": "Khk48kek39kk3..."
}

In the same way as the x5t attribute, the x5t#S256 attribute in the JOSE header represents
the base64url-encoded SHA-256 thumbprint of the X.509 certificate corresponding to the key
used to sign the JSON payload. The only difference between x5t#S256 and x5t is the
hashing algorithm. The x5t#S256 is an optional parameter in the JOSE header.

{
"x5t#S256": "Xjeklr39kdj3d..."
}
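As a sketch, the thumbprint computation itself is a SHA-256 digest over the certificate’s DER bytes, base64url-encoded; the input bytes below are placeholders rather than a real certificate:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;

public class Thumbprint {

    // Compute a base64url-encoded SHA-256 thumbprint over certificate bytes,
    // the way an x5t#S256 header value is derived. In practice you hash the
    // DER-encoded X.509 certificate; the input here is a placeholder.
    public static String sha256Thumbprint(byte[] derBytes) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return Base64.getUrlEncoder().withoutPadding()
                    .encodeToString(digest.digest(derBytes));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] fakeDer = "not-a-real-certificate".getBytes(StandardCharsets.UTF_8);
        System.out.println(sha256Thumbprint(fakeDer));
    }
}
```

An x5t value is produced the same way but with the SHA-1 digest, which is why the two attributes differ only in the hashing algorithm.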


4.5.8 The crit attribute indicates the presence of custom parameters


The crit parameter in the JOSE header is used to indicate to the recipient of the JWS the
presence of custom parameters in the JOSE header that are defined by neither the JWS nor
the JWA specification. If these custom parameters are not understood by the recipient, then
the JWS token must be treated as invalid. The value of the crit parameter is a JSON array of
names, where each entry represents a custom parameter. The crit is an optional parameter
in the JOSE header.

{
"crit": ["exp"]
}

4.6 The process of compact serializing a JWS token


To compact-serialize a JWS token, we need to build three components: the JOSE header,
the JWS payload, and the signature. In this section you’ll learn the process of building each
of those three parts.

Figure 4.5 A JWT formatted as a JWS. It has three parts: the JWT header, which is also known as the JOSE
header; the JWT body, which is also known as the claims set; and the signature.

1. First to build the JOSE header, we construct a JSON object that includes all the header
attributes, which expresses the cryptographic properties of the JWS token. As
discussed before, the token issuer should advertise in the JOSE header (figure 4.5),
the public key corresponding to the key used to sign the message. This can be
expressed via any of these header
elements: jku, jwk, kid, x5u, x5c, x5t and x5t#s256. However, as you learned
in section 4.5, when building an ID token under the context of OpenID Connect, you
cannot use jku, jwk, x5u and x5c attributes.
2. Compute the base64url-encoded value of the UTF-8 encoded JOSE header from
step 1, to produce the 1st element of the JWS token.
3. Construct the payload or the content to be signed — or the JWS payload. The payload
is not necessarily JSON — it can be any content. However, as you learned already in
this chapter, if the JWS is to be a JWT, the JWS payload must be a JSON payload, and
we call it the claims set.
4. Compute the base64url-encoded value of the JWS payload from the step 3 to produce
the 2nd element (figure 4.5) of the JWS token.


5. Build the message to compute the digital signature or the MAC. The message is
constructed as ASCII(BASE64URL-ENCODE(UTF8(JOSE Header)) ‘.’ BASE64URL-
ENCODE(JWS Payload)).
6. Compute the signature over the message constructed in the previous step, following
the signature algorithm defined by the JOSE header element alg. The message is
signed using the private key corresponding to the public key advertised in the JOSE
header.
7. Compute the base64url-encoded value of the JWS signature produced in the step 6,
which is the 3rd element (figure 4.5) of the serialized JWS token.
8. Now we have all the elements: concatenate the base64url-encoded JOSE header, JWS
payload, and signature from steps 2, 4, and 7, in that order, with a dot (.) separating
each part, to build the compact-serialized JWS token.
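The steps above can be sketched end to end. To keep the sketch self-contained it uses HS256 (HMAC-SHA256) from the JWA registry instead of RS256, so a shared secret stands in for the RSA key pair; the secret and claims values are illustrative:

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public class CompactJws {

    // Steps 1-8: build a compact-serialized JWS over a JSON claims set.
    // HS256 is used instead of RS256 so the sketch needs only a shared secret.
    public static String build(String claimsSetJson, byte[] secret) {
        try {
            Base64.Encoder b64 = Base64.getUrlEncoder().withoutPadding();

            // steps 1-2: JOSE header, base64url-encoded
            String header = b64.encodeToString(
                "{\"alg\":\"HS256\"}".getBytes(StandardCharsets.UTF_8));

            // steps 3-4: JWS payload (the claims set), base64url-encoded
            String payload = b64.encodeToString(
                claimsSetJson.getBytes(StandardCharsets.UTF_8));

            // step 5: the signing input is header '.' payload
            String signingInput = header + "." + payload;

            // step 6: compute the MAC with HMAC-SHA256
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            byte[] tag = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));

            // steps 7-8: base64url-encode the tag and join the three parts
            return signingInput + "." + b64.encodeToString(tag);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String jws = build("{\"sub\":\"peter\"}",
            "a-shared-secret-of-sufficient-length".getBytes(StandardCharsets.UTF_8));
        System.out.println(jws);
    }
}
```

With RS256 the only changes are in step 6: the signing input is signed with the issuer’s RSA private key instead of being MACed with a shared secret, which is what the Nimbus sample in section 4.4 does.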

4.7 What does a JSON Web Encryption (JWE) token look like?
In section 4.3, we stated that a JWT is a compact-serialized JWS. It can also be a
compact-serialized JSON Web Encryption (JWE) token. Like a JWS, a JWE represents an
encrypted message using compact serialization or JSON serialization. In this section we delve
into JWE, and you’ll learn the structure of a JWE token and the differences between a JWT
and a JWE token.
A JWS addresses the integrity and nonrepudiation aspects of the data contained in it, while
a JWE protects the data for confidentiality. If you transfer some data in a JWS, for example,
the recipient of the JWS would know if the token got modified in the middle, by verifying the
signature of the token. If the signature verification fails, that means the message the
recipient got is not the same as the original message. In other words, the signature of the
JWS protects the integrity of the data it transfers. Also, the signature verification helps to
achieve nonrepudiation; nonrepudiation means the sender of the JWS cannot later dispute
that they did not send the message. However, a JWS does not prevent anyone from seeing
the content of the message while in transit, unless you send the message over TLS. But when
you use a JWE to transfer some data, irrespective of whether the transport channel uses TLS
or not, only the intended recipient of the token can see the content of the token.
A JWE is called a JWT only when compact serialization is used. Then again, as we
discussed in section 4.3, to be precise, for us to call a JWE a JWT, it has to be compact
serialized, and the JWE payload must be a JSON object. In a generic JWE token, the
JWE payload can be anything: XML, JSON, and so on. However, for a JWE to become a JWT,
the JWE payload must be a JSON object, and once the JWE payload becomes a JSON object,
we call it a claims set.

Figure 4.6 A JWT that’s a compact-serialized JWE


A compact-serialized JWE (see figure 4.6) has five parts; each part is base64url-encoded and
separated by a dot (.). The JOSE header is the part of the JWE that carries metadata related
to the encryption. The JWE encrypted key, initialization vector, and authentication tag are
related to the cryptographic operations performed during the encryption. We won’t talk about
those in detail here. If you’re interested, we recommend chapter 8 of the book, Advanced
API Security: OAuth 2.0 and Beyond. Finally, the ciphertext part of the JWE includes the
encrypted payload.
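To see these five parts concretely, you can split a compact-serialized JWE on the dot separator; only the JOSE header decodes to readable JSON, while the other parts are binary. The following is a minimal sketch using only the Java standard library (the class name is ours, and the non-header parts of the sample token are dummy values; the header is the one from the sample output later in this section):

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class JweParts {

    // A compact-serialized JWE always has exactly five dot-separated parts.
    public static String[] split(String jwe) {
        String[] parts = jwe.split("\\.", -1);
        if (parts.length != 5) {
            throw new IllegalArgumentException("expected 5 parts, got " + parts.length);
        }
        return parts;
    }

    // Only the JOSE header is guaranteed to decode to readable JSON;
    // the encrypted key, IV, ciphertext, and tag are binary values.
    public static String decodeHeader(String jwe) {
        byte[] json = Base64.getUrlDecoder().decode(split(jwe)[0]);
        return new String(json, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Header taken from the sample output in section 4.8; the other four
        // parts here are dummies, just to show the structure.
        String jwe = "eyJlbmMiOiJBMTI4R0NNIiwiYWxnIjoiUlNBLU9BRVAifQ"
                + ".a2V5.aXY.Y2lwaGVydGV4dA.dGFn";
        System.out.println(decodeHeader(jwe));
    }
}
```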

You might have noticed in figure 4.6 that there is no JWE payload element. The ciphertext in
figure 4.6 is produced by encrypting the JWE payload, which is why the payload itself does
not appear in the figure.
With JSON serialization, the JWE is represented as a JSON payload, and it isn't called a JWT.
The ciphertext attribute in the JSON-serialized JWE carries the encrypted value of any
payload, which can be JSON, XML, or even binary. Listing 4.5 shows a JSON-serialized JWE:
the actual payload is encrypted, and the token is represented as a JSON message with all the
related metadata. Since the focus of this chapter is on JWT, we don't intend to discuss
JSON-serialized JWE tokens in detail. However, if you are interested in learning more about
JWE JSON serialization, please check chapter 8 of the book Advanced API Security: OAuth 2.0
and Beyond.

Listing 4.5 An example of a JWE token with JSON serialization and all related metadata
{
"protected":"eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0",
"unprotected":{
"jku":"https://server.example.com/keys.jwks"
},
"recipients":[
{
"header":{
"alg":"RSA1_5",
"kid":"2011-04-29"
},
"encrypted_key":"UGhIOguC7IuEvf_NPVaXsGMoLOmwvc1G"
},
{
"header":{
"alg":"A128KW",
"kid":"7"
},
"encrypted_key":"6KB707dM9YTIgHtLvtgWQ8mKwboJW3of9locizkDTHzBC2IlrT1oOQ"
}
],
"iv":"AxY8DCtDaGlsbGljb3RoZQ",
"ciphertext":"KDlTtXchhZTGufMYmOYGS4HffxPSUrfmqCHXaI9wOGY",
"tag":"Mz-VPPyU4RlcuYv1IwIvzw"
}

4.8 Building a compact serialized JWE token


In this section you'll learn how to build a compact-serialized JWE with the open source
Nimbus Java library. The source code related to all the samples used in this
chapter is available in the https://github.com/openidconnect-in-action/samples Git


repository inside the chapter04 directory. Before you delve into the Java code that you’ll use
to build the JWE, try to build the sample and run it. Run the following Maven command from
the chapter04/sample02 directory. If everything goes well, you should see the BUILD
SUCCESS message at the end:

\> mvn clean install


....................
[INFO] BUILD SUCCESS

Now run your Java program to create a JWE with the following command (from the
chapter04/sample02/lib directory). If it executes successfully, it prints the base64url-
encoded JWE:

\> java -cp "../target/com.manning.oidc.chapter04.sample02-1.0.0.jar:*" \


com.manning.oidc.chapter04.sample02.RSAOAEPJWTBuilder

eyJlbmMiOiJBMTI4R0NNIiwiYWxnIjoiUlNBLU9BRVAifQ.Cd0KjNwSbq5OPxcJQ1ESValmRGPf
7BFUNpqZFfKTCd-9XAmVE-zOTsnv78SikTOK8fuwszHDnz2eONUahbg8eR9oxDi9kmXaHeKXyZ9
Kq4vhg7WJPJXSUonwGxcibgECJySEJxZaTmA1E_8pUaiU6k5UHvxPUDtE0pnN5XD82cs.0b4jWQ
HFbBaM_azM.XmwvMBzrLcNW-oBhAfMozJlmESfG6o96WT958BOyfjpGmmbdJdIjirjCBTUATdOP
kLg6-YmPsitaFm7pFAUdsHkm4_KlZrE5HuP43VM0gBXSe-41dDDNs7D2nZ5QFpeoYH7zQNocCjy
bseJPFPYEw311nBRfjzNoDEzvKMsxhgCZNLTv-tpKh6mKIXXYxdxVoBcIXN90UUYi.mVLD4t-85
qcTiY8q3J-kmg

Following is the decoded output, including the decrypted JWE payload:

JWE Header:{"enc":"A128GCM","alg":"RSA-OAEP"}
JWE Content Encryption Key: Cd0KjNwSbq5OPxcJQ1ESValmRGPf7BFUNpqZFfKTCd-9
XAmVE-zOTsnv78SikTOK8fuwszHDnz2eONUahbg8eR9oxDi9kmXaHeKXyZ9Kq4vhg7WJPJXS
UonwGxcibgECJySEJxZaTmA1E_8pUaiU6k5UHvxPUDtE0pnN5XD82cs
Initialization Vector: 0b4jWQHFbBaM_azM
Ciphertext: XmwvMBzrLcNW-oBhAfMozJlmESfG6o96WT958BOyfjpGmmbdJdIjirjCBTUA
TdOPkLg6-YmPsitaFm7pFAUdsHkm4_KlZrE5HuP43VM0gBXSe-41dDDNs7D2nZ5QFpeoYH7z
QNocCjybseJPFPYEw311nBRfjzNoDEzvKMsxhgCZNLTv-tpKh6mKIXXYxdxVoBcIXN90UUYi
Authentication Tag: mVLD4t-85qcTiY8q3J-kmg
Decrypted Payload:
{
"sub":"peter",
"aud":"app.example.com",
"nbf":1533273878,
"iss":"iss.example.com",
"exp":1533274478,
"iat":1533273878,
"jti":"17dc2461-d87a-42c9-9546-e42a23d1e4d5"
}

NOTE If you get any errors while executing the previous command, check whether you executed the
command from the correct location. It has to be from inside the chapter04/sample02/lib directory, not from
the chapter04/sample02 directory. Also make sure that the value of the -cp argument is within double
quotes.

Now take a look at the code that generated the JWE. It’s straightforward and self-
explanatory with code comments. You can find the complete source code in the
sample02/src/main/java/com/manning/oidc/chapter04/sample02/RSAOAEPJWTBuilder.java


file. The method in the following listing does the core work of JWE encryption. It accepts the
token recipient's public key as an input parameter and uses it to encrypt the JWE with
RSA-OAEP.

Listing 4.6 The content of the RSAOAEPJWTBuilder.java file


public static String buildEncryptedJWT(PublicKey publicKey)
    throws JOSEException {

  // build the audience restriction list.
  List<String> aud = new ArrayList<String>();
  aud.add("*.ecomm.com");

  Date currentTime = new Date();

  // create a claims set.
  JWTClaimsSet jwtClaims = new JWTClaimsSet.Builder().
      // set the value of the issuer.
      issuer("sts.ecomm.com").
      // set the subject value - JWT belongs to this subject.
      subject("peter").
      // set values for audience restriction.
      audience(aud).
      // expiration time set to 10 minutes.
      expirationTime(new Date(new Date().getTime() + 1000 * 60 * 10)).
      // set the valid-from time to the current time.
      notBeforeTime(currentTime).
      // set the issued time to the current time.
      issueTime(currentTime).
      // set a generated UUID as the JWT identifier.
      jwtID(UUID.randomUUID().toString()).build();

  // create the JWE header with RSA-OAEP and AES/GCM.
  JWEHeader jweHeader = new JWEHeader(JWEAlgorithm.RSA_OAEP,
                                      EncryptionMethod.A128GCM);

  // create the encrypter with the RSA public key.
  JWEEncrypter encrypter = new RSAEncrypter((RSAPublicKey) publicKey);

  // create the encrypted JWT with the JWE header and the JWT payload.
  EncryptedJWT encryptedJWT = new EncryptedJWT(jweHeader, jwtClaims);

  // encrypt the JWT.
  encryptedJWT.encrypt(encrypter);

  // serialize into base64url-encoded text.
  String jwtInText = encryptedJWT.serialize();

  // print the value of the JWT.
  System.out.println(jwtInText);

  return jwtInText;
}


4.9 The JOSE header of a JWE token


In section 4.2.1 you learned that a JWT inherits JOSE header attributes from the JWS
and JWE specifications. In this section you'll learn all the JOSE attributes the JWE
specification defines and how they relate to the ID token defined in the OpenID Connect
specification. Figure 4.7 repeats figure 4.6, which shows the different components of a
JWE, and is referred to in the sections that follow.

Figure 4.7 A JWT that’s a compact-serialized JWE

4.9.1 The alg defines the algorithm to encrypt the CEK


The alg attribute in the JOSE header defines the name of the algorithm that is used to
encrypt the Content Encryption Key (CEK). The CEK is a symmetric key, which encrypts the
plaintext payload. Once the plaintext is encrypted with the CEK, the CEK itself is
encrypted with another key, following the algorithm identified by the value of the alg
parameter. The encrypted CEK is then included in the JWE Encrypted Key section
(figure 4.7) of the JWE token. This is a required attribute in the JOSE header.

{
"alg": "RSA-OAEP"
}

Typically, asymmetric key encryption is resource intensive and does not perform well when
it has to encrypt a large amount of data. Because of that, in most cases a symmetric
key does the data encryption, and then an asymmetric key encrypts the symmetric key.
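This hybrid pattern can be sketched with the Java standard library alone: a freshly generated AES key plays the role of the CEK and encrypts the payload, while the recipient's RSA public key encrypts only the small CEK. This illustrates the mechanism, not the exact byte layout of a JWE, and the class and method names are ours:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.SecureRandom;

public class HybridEncryptionDemo {

    public static String roundTrip(String payload) throws Exception {
        // Recipient's RSA key pair; in a real exchange the sender knows only the public key.
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("RSA");
        kpg.initialize(2048);
        KeyPair rsa = kpg.generateKeyPair();

        // Generate a random symmetric CEK and encrypt the payload with it (AES-GCM).
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(128);
        SecretKey cek = kg.generateKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, cek, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal(payload.getBytes(StandardCharsets.UTF_8));

        // Encrypt only the small CEK with the recipient's RSA public key (RSA-OAEP).
        Cipher rsaOaep = Cipher.getInstance("RSA/ECB/OAEPWithSHA-1AndMGF1Padding");
        rsaOaep.init(Cipher.ENCRYPT_MODE, rsa.getPublic());
        byte[] encryptedCek = rsaOaep.doFinal(cek.getEncoded());

        // The recipient reverses the process: decrypt the CEK, then the payload.
        rsaOaep.init(Cipher.DECRYPT_MODE, rsa.getPrivate());
        SecretKey recovered = new SecretKeySpec(rsaOaep.doFinal(encryptedCek), "AES");
        aes.init(Cipher.DECRYPT_MODE, recovered, new GCMParameterSpec(128, iv));
        return new String(aes.doFinal(ciphertext), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("{\"sub\":\"peter\"}"));
    }
}
```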

4.9.2 The enc represents the algorithm used for content encryption
The enc attribute in the JOSE header represents the name of the algorithm that is used for
content encryption. This algorithm should be a symmetric Authenticated Encryption with
Associated Data (AEAD) algorithm. 3 This is a required attribute in the JOSE header; failure to
include it in the header results in a token parsing error. The value of the enc parameter
is a string, picked from the JSON Web Signature and Encryption Algorithms registry
defined by the JSON Web Algorithms (JWA) specification
(https://tools.ietf.org/html/rfc7518). If the value of the enc parameter is not picked from
that registry, it should be defined in a collision-resistant manner, but then there is
no guarantee that the particular algorithm is recognized by all JWE

3 Authenticated encryption (AE) and authenticated encryption with associated data (AEAD) are forms of encryption, which simultaneously assure the
confidentiality, and authenticity of data (https://en.wikipedia.org/wiki/Authenticated_encryption).


implementations. It’s always better to stick to the algorithms defined in the JWA
specification.

{
"enc": "A256GCM" #A
}

#A Advanced Encryption Standard (AES) in Galois/Counter Mode (GCM) algorithm as defined in the JWA specification.

4.9.3 The zip defines the name of the compression algorithm


The zip attribute in the JOSE header defines the name of the compression algorithm. The
plaintext payload gets compressed before encryption, if the token issuer decides to use
compression. Compression is not mandatory. The JWE specification defines DEF
(https://tools.ietf.org/html/rfc1951) as the compression algorithm, but using it is not
mandatory either; token issuers can define their own compression algorithms. The value of
the compression algorithm is registered in the JSON Web Encryption Compression Algorithms
registry under the JWA specification. This is an optional parameter.

{
"zip": "DEF"
}
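The DEF algorithm named here is the raw DEFLATE format from RFC 1951, which the Java standard library exposes directly. The following is a minimal sketch (class name ours) of the compress-before-encrypt step and its reverse:

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DefCompression {

    // Compress the plaintext payload before encryption; nowrap=true gives
    // the raw DEFLATE (RFC 1951) format that the DEF algorithm refers to.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // The recipient decompresses after decrypting. The extra dummy byte is
    // the workaround the Inflater Javadoc documents for the nowrap option.
    public static byte[] decompress(byte[] input) throws Exception {
        Inflater inflater = new Inflater(true);
        inflater.setInput(Arrays.copyOf(input, input.length + 1));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[256];
        while (!inflater.finished()) {
            out.write(buf, 0, inflater.inflate(buf));
        }
        inflater.end();
        return out.toByteArray();
    }
}
```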

4.9.4 The jku carries a URL, which points to a JSON Web Key set
The jku parameter in the JOSE header carries a URL, which points to a JSON Web Key (JWK)
set. This JWK set represents a collection of JSON-encoded public keys, where one of the keys
is used to encrypt the Content Encryption Key (CEK). Whatever protocol is used to retrieve
the key set, it should provide integrity protection. If keys are retrieved over HTTP, then
instead of plain HTTP, HTTPS (HTTP over TLS) should be used. The jku is an optional
parameter. However, as in JWS, under the context of OpenID Connect, the jku
attribute must not be used when building the ID token; instead the OpenID provider
(the issuer of the ID token) and the client application (the recipient of the ID token) must
communicate the keys used for encryption by some other means.

{
"jku": "https://example.com/jwks.json"
}

4.9.5 The jwk attribute carries the public key corresponding to the CEK
The jwk attribute in the JOSE header represents the public key corresponding to the key that
is used to encrypt the Content Encryption Key (CEK). The key is encoded as per the JSON Web
Key (JWK) specification. The jku parameter, which we discussed before, points to a link that
holds a set of JWKs, while the jwk parameter embeds the key into the JOSE header itself.
The jwk is an optional parameter. However, as in JWS, under the context of OpenID
Connect, the jwk attribute must not be used when building the ID token.

{
"jwk": {"kty": "RSA", "e": "AQAB", "n": "..."}
}


4.9.6 The kid carries an identifier for the key used to encrypt CEK
The kid parameter of the JOSE header represents an identifier for the key that is used to
encrypt the Content Encryption Key (CEK). Using this identifier, the recipient of the JWE
should be able to locate the key. If the token issuer uses the kid parameter in the JOSE
header to let the recipient know about the encryption key, then the corresponding key should
be exchanged “somehow” between the token issuer and the recipient beforehand. How this
key exchange happens is out of the scope of the JWE specification. If the value of the kid
parameter refers to a JWK, then the value of this parameter should match the value of the
kid parameter in the JWK. The kid is an optional parameter in the JOSE header.

{
"kid": "a7ejkje-eo38mehr-38klen"
}

4.9.7 The x5u carries a URL, which points to a X.509 certificate


The x5u attribute in the JOSE header is similar to the jku parameter, which we
discussed before. Instead of pointing to a JWK set, the URL here points to an X.509
certificate or a chain of X.509 certificates. The resource pointed to by the URL must hold the
certificate or the chain of certificates in PEM-encoded form. Each certificate in the chain
must appear between the delimiters -----BEGIN CERTIFICATE----- and -----END
CERTIFICATE-----. The public key corresponding to the key used to encrypt the Content
Encryption Key (CEK) must be the very first entry in the certificate chain, and the rest are
the certificates of intermediate CAs (certificate authorities) and the root CA. The x5u is an
optional parameter in the JOSE header. However, as in JWS, under the context of
OpenID Connect, the x5u attribute must not be used when building the ID token.

{
"x5u": "https://example.com/x509.pem"
}

4.9.8 The x5c carries the X.509 certificate embedded into the token
The x5c attribute in the JOSE header represents the X.509 certificate (or the certificate
chain) that corresponds to the public key used to encrypt the Content Encryption
Key (CEK). This is similar to the jwk parameter we discussed before, but in this case, instead
of a JWK, it's an X.509 certificate (or a chain of certificates). The certificate or the certificate
chain is represented as a JSON array of certificate value strings, where each element in the
array is a base64-encoded DER PKIX certificate value. The public key corresponding to the
key used to encrypt the Content Encryption Key (CEK) must be the very first entry in the
JSON array, and the rest are the certificates of intermediate CAs (certificate authorities) and
the root CA. The x5c is an optional parameter in the JOSE header. However, as in JWS, under
the context of OpenID Connect, the x5c attribute must not be used when building
the ID token.

{
"x5c": ["<base64-encoded DER certificate>", "<intermediate CA certificate>", "<root CA certificate>"]
}


4.9.9 The x5t / x5t#s256 represent the thumbprint of a certificate


The x5t attribute in the JOSE header represents the base64url-encoded SHA-1 thumbprint of
the X.509 certificate corresponding to the key used to encrypt the Content Encryption Key
(CEK). This is similar to the kid parameter we discussed before; both parameters are
used to locate the key. If the token issuer uses the x5t parameter in the JOSE header to let
the recipient know about the encryption key, then the corresponding key should be exchanged
“somehow” between the token issuer and the recipient beforehand. How this key exchange
happens is outside the scope of the JWE specification. The x5t is an optional parameter in the
JOSE header.

{
"x5t": "Xdelr79e..."
}

The x5t#S256 attribute in the JOSE header represents the base64url-encoded SHA-256
thumbprint of the X.509 certificate corresponding to the key used to encrypt the Content
Encryption Key (CEK). The only difference between x5t#S256 and x5t is the hashing
algorithm. The x5t#S256 is an optional parameter in the JOSE header.

{
"x5t#S256": "Xdweelw4kr79e..."
}
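Computing either thumbprint is mechanical: hash the DER-encoded certificate bytes and base64url-encode the digest without padding. A sketch with the standard library (the class name is ours, and the input byte array stands in for real DER certificate bytes):

```java
import java.security.MessageDigest;
import java.util.Base64;

public class CertThumbprint {

    // x5t uses "SHA-1", x5t#S256 uses "SHA-256"; both thumbprints are
    // base64url-encoded without padding.
    public static String thumbprint(byte[] derCertBytes, String algorithm) throws Exception {
        byte[] digest = MessageDigest.getInstance(algorithm).digest(derCertBytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(digest);
    }
}
```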

4.9.10 The crit attribute indicates the presence of custom parameters


The crit parameter in the JOSE header is used to indicate to the recipient of the JWE the
presence of custom parameters in the JOSE header that are defined by neither the JWE nor
the JWA specification. If these custom parameters are not understood by the recipient, the
JWE token must be treated as invalid. The value of the crit parameter is a JSON array of
names, where each entry represents a custom parameter. The crit is an optional parameter
in the JOSE header.

{
"crit": ["exp"]
}

4.10 The process of compact serializing and verifying a JWE token


To compact-serialize a JWE token, we need to build five components: the JOSE header, the
JWE encrypted key, the initialization vector, the ciphertext, and the authentication tag. In this
section you'll learn the process of building each of those five parts. Figure 4.8 repeats
figure 4.6, which shows the different components of a JWE, and is referred to in the following steps.


Figure 4.8 A JWT that’s a compact-serialized JWE

1. Figure out the key management mode from the algorithm used to determine the Content
Encryption Key (CEK) value. This algorithm is defined by the alg attribute in the JOSE
header (figure 4.8). There is only one alg element per JWE token.
2. Compute the CEK and calculate the JWE Encrypted Key (figure 4.8) based on the key
management mode picked in the previous step. The CEK is later used to encrypt the JSON
payload. There is only one JWE Encrypted Key element in the JWE token.
3. Compute the base64url-encoded value of the JWE Encrypted Key, which is produced in
the previous step. This is the 2nd element of the JWE token (figure 4.8).
4. Generate a random value for the JWE Initialization Vector. Irrespective of the
serialization technique, the JWE token carries the base64url-encoded value of the
JWE Initialization Vector. This is the 3rd element of the JWE token (figure 4.8).
5. If token compression is needed, the plaintext JSON payload must be compressed
following the compression algorithm defined under the zip header element.
6. Construct the JSON representation of the JOSE header and find the base64url-encoded
value of the UTF-8-encoded JOSE header. This is the 1st element of the JWE
token (figure 4.8).
7. To encrypt the JSON payload, we need the CEK (which we already have), the JWE
Initialization Vector (which we already have), and the Additional Authenticated Data
(AAD). Compute the ASCII value of the encoded JOSE header from the previous step and
use it as the AAD.
8. Encrypt the (possibly compressed) JSON payload using the CEK, the JWE Initialization
Vector, and the Additional Authenticated Data (AAD), following the content encryption
algorithm defined by the enc header element.
9. The algorithm defined by the enc header element is an AEAD algorithm, and the
encryption process produces both the ciphertext and the Authentication Tag.
10. Compute the base64url-encoded value of the ciphertext produced in the previous
step. This is the 4th element of the JWE token (figure 4.8).
11. Compute the base64url-encoded value of the Authentication Tag produced in step 9.
This is the 5th element of the JWE token (figure 4.8).
12. Now we have all the elements to build the compact-serialized JWE token: concatenate
the five base64url-encoded parts, separated by dots, in the order JOSE header, JWE
Encrypted Key, Initialization Vector, Ciphertext, Authentication Tag.
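Under the direct key management mode (alg value dir), where the CEK is a pre-shared key and the JWE Encrypted Key part is left empty, the steps above can be sketched with the Java standard library alone. This is a simplified illustration under those assumptions, not a spec-complete implementation; a real one should follow RFC 7516 exactly:

```java
import javax.crypto.Cipher;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.Base64;

public class CompactJweBuilder {

    public static String build(SecretKey cek, String jsonPayload) throws Exception {
        Base64.Encoder b64 = Base64.getUrlEncoder().withoutPadding();

        // Steps 1-3: direct mode, so the shared key is the CEK and the
        // JWE Encrypted Key part is empty.
        String encodedKey = "";

        // Step 4: random Initialization Vector (96 bits for GCM).
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);

        // Step 6: base64url-encode the UTF-8-encoded JOSE header.
        String header = "{\"alg\":\"dir\",\"enc\":\"A128GCM\"}";
        String encodedHeader = b64.encodeToString(header.getBytes(StandardCharsets.UTF_8));

        // Steps 7-9: AEAD-encrypt the payload; the AAD is the ASCII value
        // of the encoded JOSE header.
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, cek, new GCMParameterSpec(128, iv));
        aes.updateAAD(encodedHeader.getBytes(StandardCharsets.US_ASCII));
        byte[] out = aes.doFinal(jsonPayload.getBytes(StandardCharsets.UTF_8));

        // The JCE appends the 16-byte Authentication Tag to the ciphertext; split them.
        byte[] ciphertext = Arrays.copyOfRange(out, 0, out.length - 16);
        byte[] tag = Arrays.copyOfRange(out, out.length - 16, out.length);

        // Steps 10-12: base64url-encode each part and join the five parts with dots.
        return encodedHeader + "." + encodedKey + "." + b64.encodeToString(iv)
                + "." + b64.encodeToString(ciphertext) + "." + b64.encodeToString(tag);
    }
}
```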


4.11 The role of a nested JWT


A nested JWT is a JWT whose payload is another JWT. In this section you'll learn the role
of a nested JWT and how it's related to OpenID Connect. Figure 4.9 shows a nested
JWT.

Figure 4.9 A nested JWT, where the enclosing JWT is a JWE token and the enclosed JWT is a JWS token.
Decrypting the ciphertext component of the JWE token produces the enclosed JWT, or the JWS token.

There are two parts in a nested JWT: the enclosing JWT and the enclosed JWT. In figure 4.9,
the enclosing JWT is a JWE token, and the enclosed JWT is a JWS token. Decrypting
the ciphertext component of the JWE token (the enclosing JWT) produces the JWS token.
According to the OpenID Connect specification, if you are to encrypt an ID token, it
has to be signed first and then encrypted. Signing an ID token produces a JWS token,
and then encrypting that JWS token produces a JWE token, which is also a nested JWT.


Figure 4.10 A nested JWT, where both the enclosing JWT and the enclosed JWT are a JWS token.

Figure 4.10 shows another form of a nested JWT, where both the enclosing JWT and the
enclosed JWT are JWS tokens. In both figure 4.9 and figure 4.10, the payload of the enclosing
JWT is not a plain JSON payload; in both cases, the payload is another JWT. So, in a JWT, if the
payload is another JWT, we must use the cty JOSE header attribute in the enclosing JWT and
set its value to JWT. We discussed the cty attribute in section 4.2.1.

{
"cty": "JWT"
}

In the nested JWT we had in figure 4.9, the use case is obvious: having an enclosing JWE
token helps keep the content of the enclosed JWS token confidential. But
in the second case (figure 4.10), why do we need to sign a JWT twice? This is useful when one
application has to share a JWT it received from a token issuer with another application. In a
microservices deployment, for example, say one microservice receives a JWT, and this
microservice needs to authenticate to another microservice and also has to pass along the
original context of the client who invoked the first microservice. In that case, the first
microservice can create a nested JWT by signing the original JWT it received from the client
and pass the nested JWT to the second microservice. Figure 4.11 illustrates this use case.


Figure 4.11 A nested JWT, where both the enclosing JWT and the enclosed JWT are JWS tokens. A token
issuer issues the enclosed JWT and the client application passes it to the Inventory microservice. Then the
Inventory microservice builds a nested JWT, signs the enclosing JWT with its own private key, and sends it to
the Order Processing microservice.
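The re-signing step the Inventory microservice performs can be sketched with HMAC-signed JWS tokens using only the Java standard library. A real deployment would use asymmetric keys and a JOSE library such as Nimbus; the class name, keys, and claims here are illustrative:

```java
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class NestedJwtDemo {

    // Build a compact-serialized JWS:
    // base64url(header).base64url(payload).base64url(HMAC signature)
    public static String sign(String headerJson, String payload, byte[] key) throws Exception {
        Base64.Encoder b64 = Base64.getUrlEncoder().withoutPadding();
        String signingInput = b64.encodeToString(headerJson.getBytes(StandardCharsets.UTF_8))
                + "." + b64.encodeToString(payload.getBytes(StandardCharsets.UTF_8));
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(key, "HmacSHA256"));
        byte[] signature = mac.doFinal(signingInput.getBytes(StandardCharsets.US_ASCII));
        return signingInput + "." + b64.encodeToString(signature);
    }

    public static void main(String[] args) throws Exception {
        byte[] issuerKey = "issuer-secret-key-for-demo-only!".getBytes(StandardCharsets.UTF_8);
        byte[] inventoryKey = "inventory-secret-for-demo-only!!".getBytes(StandardCharsets.UTF_8);

        // The token issuer signs the original claims set (the enclosed JWT).
        String enclosed = sign("{\"alg\":\"HS256\",\"typ\":\"JWT\"}",
                "{\"sub\":\"peter\",\"aud\":\"app.example.com\"}", issuerKey);

        // The Inventory microservice signs the whole enclosed JWT as its payload,
        // marking the nesting with cty set to JWT.
        String nested = sign("{\"alg\":\"HS256\",\"cty\":\"JWT\"}", enclosed, inventoryKey);
        System.out.println(nested);
    }
}
```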

Apart from the two forms of nested JWTs we discussed so far in this section, there can be
another form, where both the enclosing and enclosed JWTs are JWEs. However, according to
the OpenID Connect specification, if a JWT is encrypted it must be signed first, so this form
of a nested JWT is discouraged in an OpenID Connect flow.
One key limitation in all the forms of nested JWTs we discussed so far is that there is
no way for the enclosing JWT to have its own claims set; rather, the payload of the enclosing
JWT is another JWT. If you take the use case we illustrated in figure 4.11, there can be a
requirement where the Inventory microservice has to share some claims with the Order
Processing microservice along with the nested JWT. To address this requirement in a
standard way, there is a proposed draft specification at the IETF OAuth working group called
the Nested JSON Web Token specification (https://tools.ietf.org/html/draft-yusef-oauth-
nested-jwt-03).
This draft specification introduces a new value of the cty header attribute, called NJWT,
and a new attribute for the JWT claims set, called njwt. For a nested JWT to support this
model, it has to have NJWT as the value of the cty header attribute, and the enclosed JWT is
set as the value of the njwt attribute in the claims set, as shown in the following listing. The
listing shows the JOSE header and the claims set of the nested JWT.


Listing 4.7 JOSE header and the claims set of a nested JWT
{
"cty": "NJWT" #A
}
{
"sub": "peter",
"aud": "app.example.com",
"nbf": 1533270794,
"iss": "issuer.example.com",
"exp": 1533271394,
"iat": 1533270794,
"jti": "5c4d1fa1-7412-4fb1-b888-9bc77e67ff2a",
"njwt": <Enclosed JWT> #B
}

#A The value of cty attribute must be set to NJWT in the JOSE header of the enclosing JWT. The JOSE header will have
other header attributes as well.
#B The njwt attribute in the enclosing JWT’s claims set carries the enclosed JWT

4.12 Summary
• The JWT is one of the key building blocks of the OpenID Connect standard.
• A JWT (pronounced jot) is a container that carries different types of assertions or
claims from one place to another in a cryptographically safe manner.
• A JWT is always a JWS or a JWE token. But the reverse is not always true.
• There are two forms of serialization for JWS and JWE tokens: the compact
serialization and JSON serialization.
• A JWS token becomes a JWT, when it is compact serialized and the JWS payload is a
JSON payload.
• A JWE token becomes a JWT, when it is compact serialized and the JWE payload is a
JSON payload.
• The RFC 7519 developed under the IETF OAuth working group defines the structure
and the processing rules of a JWT.
• The RFC 7515 developed under the IETF JOSE working group defines the structure
and the processing rules of a JWS token.
• The RFC 7516 developed under the IETF JOSE working group defines the structure
and the processing rules of a JWE.
• A nested JWT is a JWT that carries another JWT as the payload.
• According to the OpenID Connect specification, if you are to encrypt an ID token, it
has to be signed first, and then encrypted, which will produce a nested JWT.


Requesting and returning claims

This chapter covers

• Requesting claims from an OpenID provider


• Returning claims via an ID token and the userinfo endpoint of an OpenID provider
• Doing cross-domain API calls with Cross-origin Resource Sharing (CORS)
• Use cases for aggregated and distributed claims
In chapter 3 we discussed how to log in to a single-page application using OpenID Connect,
following the implicit and authorization code flows. In this chapter you’ll learn how to
transport user claims from an OpenID provider to a client application during the login flow. A
client application can use these claims to identify the users who access those applications, as
well as enforce access control checks based on the claim values.
We also discuss in this chapter multiple ways a client application can request claims and
how an OpenID provider can return those. To demonstrate how requesting and returning
claims work in practice, we use examples with the cURL command line tool. Further in this
chapter, you will learn how to extend the default OpenID Connect claim dialect 1 to support
custom claims.

5.1 The ways of requesting claims from an OpenID provider


In this section you’ll learn different ways a client application can request claims from an
OpenID provider. A client application can use these claims to identify the user, communicate
with the user (for example use user’s email address to send a newsletter), build a
personalized user experience, as well as do application specific authorization checks. The

1 A claim dialect is a way of grouping a set of related claims. We can call all the claims that the OpenID Connect specification supports, the OpenID
Connect claim dialect.


following list is an overview of the available options in OpenID Connect to transport claims
between a client application and an OpenID provider (figure 5.1).
• A client application can request claims using the scope parameter in the
authentication request to the OpenID provider. This is the most popular approach to
request claims and almost all the OpenID provider implementations support it. We
discuss this approach in detail in section 5.2.
• A client application can request claims by talking to the userinfo endpoint of the
OpenID provider. The userinfo endpoint is a standard endpoint OpenID Connect
introduced to share claims with client applications. This approach is useful when
you use OpenID Connect in the hybrid mode (with code id_token token as the
response_type) and the length of the ID token is too long. We discuss this approach
in detail in section 5.3.
• A client application can request claims using the claims parameter in the OpenID
Connect authentication request. This gives you more control to selectively pick, which
claims you need without binding those to a scope value. We discuss this approach in
detail in section 5.4.

Figure 5.1 A client application can request claims from the OpenID provider in multiple ways. The most
popular way is to request claims using the scope parameter; the OpenID provider can return the claims either
in the ID token or via the userinfo endpoint. Another way to request claims is to use the claims parameter in
the OpenID Connect authentication request.


5.2 Returning scope bound claims in an ID token


In this section you’ll learn how to use the scope parameter to request claims from an OpenID
provider. First, we’ll take you through two concrete examples on how eBay uses the scope
parameter to request claims from Google and Apple. Although we’re using Google and Apple
as examples, this information is the same for all OpenID Connect requests.
eBay uses login with Google and Apple, both of which internally use OpenID Connect.
Then we'll take you through an end-to-end example with cURL to show how to request claims
using the scope parameter with the Google OpenID provider, and how the Google OpenID
provider returns claims in an ID token. We picked the Google OpenID provider here because
it is easy to set up and available for anyone to access.

5.2.1 Requesting claims using scope parameter from Google OpenID provider
In this section we'll take you through a simple example to show how eBay requests claims
from the Google OpenID provider. If you visit ebay.com and click on log in, you can pick
either Google or Apple for logging in. The options provided by eBay may vary at the time
you read the book, but that won't affect our discussion in this chapter. In addition to Google
and Apple, you will also see an option to log in with Facebook. However, login with Facebook
does not support OpenID Connect, so we won't worry about it here.
Let's assume you picked “log in with Google.” You'll then be redirected to the Google
OpenID provider, and if you look at the browser location bar, you'll find the URL shown in
listing 5.1. This URL carries a set of query parameters, all of which are standard
parameters defined in the OpenID Connect specification, except for the flowName
parameter. The flowName is a custom parameter specific to the Google OpenID
provider. However, the focus of this section is only on the scope parameter in the request.

Listing 5.1 An authorization code flow request to the Google OpenID provider
https://accounts.google.com/o/oauth2/v2/auth/identifier?locale=en_US&
client_id=510718330363&
scope=openid email profile& #A
response_type=code&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fwww.ebay.com%2Fsignin%2Fggl%2Fcb&
state=dl4xLjEjaV4xI3BeMSNyXjEjSV4zI2ZeMCN0XlVsNDF&
flowName=GeneralOAuthFlow #B

#A The value of the scope parameter carries one or more values, which are known to the OpenID provider to request
claims.
#B A custom parameter that is specific to the Google OpenID provider

As we discussed in chapter 1, any OpenID Connect request must have openid as a
scope, and you can have other additional scopes as well. In listing 5.1, you can see two
other values under the scope parameter: email and profile. This is the most
common way a client application requests claims from an OpenID provider: using scope
values. The OpenID provider should know how to interpret the value of the scope parameter,
in this example email and profile, and return the corresponding claims to the client
application. The values email and profile are standard scopes defined by the OpenID

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>

Connect specification, and we discuss all the standard scopes OpenID Connect defines in
section 5.2.3.
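To see those scope values programmatically, you can pull them out of an authentication request URL like the one in listing 5.1. The following sketch uses only the Python standard library; the URL below is an abbreviated, URL-encoded version of the request in listing 5.1:

```python
from urllib.parse import urlparse, parse_qs

def scopes_from_auth_request(url):
    """Extract the individual scope values from an OpenID Connect
    authentication request URL."""
    query = parse_qs(urlparse(url).query)
    # The scope parameter is a single space-separated string.
    return query.get("scope", [""])[0].split()

# An abbreviated version of the request in listing 5.1
request = ("https://accounts.google.com/o/oauth2/v2/auth?"
           "client_id=510718330363&"
           "scope=openid+email+profile&"
           "response_type=code")
print(scopes_from_auth_request(request))  # ['openid', 'email', 'profile']
```

Note that `parse_qs` decodes the `+` signs back to spaces, so the function sees exactly the space-separated scope string the OpenID provider receives.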

5.2.2 Requesting claims using scopes from Apple OpenID provider


In this section we’ll take you through a simple example to show how eBay requests claims
from the Apple OpenID provider. On the eBay login page, choose to log in with Apple and
you’ll be redirected to the Apple OpenID provider. If you look at the browser location bar,
you’ll find the URL shown in listing 5.2. This URL carries a set of query parameters, all of
them standard parameters defined in the OpenID Connect specification. However, in this
discussion we only focus on the scope parameter.

Listing 5.2 An authorization code flow request to the Apple OpenID provider
https://appleid.apple.com/auth/authorize?locale=en_US&
response_type=code%20id_token&
scope=name& #A
response_mode=form_post&
redirect_uri=https%3A%2F%2Fwww.ebay.com%2Fsignin%2Fapple%2Fcb&
state=dl4xLjEjaV4xI2ZeMCNyXjEjcF4xI0leMyN0XlVsN&
client_id=com.ebay.www

#A eBay uses name as the value of the scope parameter to request the user’s name from Apple.

In listing 5.2, eBay uses name as the value of the scope parameter to request the user’s
name from Apple. However, as you might have already guessed, this request doesn’t pass
openid as a scope. But as per the OpenID Connect specification, you must pass openid as
a scope in the authentication request; if you don’t, the behavior is undefined, or in other
words, the behavior may differ from one OpenID provider to another. The Apple OpenID
provider happens to respond correctly regardless of whether you pass openid as a scope
value. Still, it’s always better to adhere to the OpenID Connect specification and always
pass openid as a scope value, in addition to the other values the scope parameter carries.
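Because the behavior without openid is undefined, a client library can defensively make sure the openid scope is always present before sending the request. A minimal sketch of such a guard:

```python
def normalize_scope(scope):
    """Return the scope string with 'openid' guaranteed to be present,
    as the OpenID Connect specification requires."""
    values = scope.split()
    if "openid" not in values:
        values.insert(0, "openid")
    return " ".join(values)

print(normalize_scope("name"))            # 'openid name'
print(normalize_scope("openid profile"))  # 'openid profile'
```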

5.2.3 OpenID Connect defines four standard scope values


In this section we go through the four standard scope values defined by the OpenID Connect
specification. In section 5.2.1, for example, eBay used two standard scope values, email and
profile, for the scope parameter to request claims from the Google OpenID provider. The
following table lists the four standard scope values defined by the OpenID Connect
specification for requesting claims.


Table 5.1 The standard scopes defined by the OpenID Connect specification

Scope value Description

profile The profile scope value represents multiple claims that include name,
family_name, given_name, middle_name, nickname,
preferred_username, profile, picture, website, gender,
birthdate, zoneinfo, locale, and updated_at. When you pass
profile as a scope value you request all of these claims from the OpenID
provider. These 14 claims are known as standard claims in OpenID Connect. They
are called standard claims because their meaning is well defined in the OpenID
Connect specification. The OpenID Connect specification defines 20 standard claims,
including these 14, and we discuss those in detail in section 5.2.4.

email The email scope value represents two claims: email and
email_verified. These two claims are also part of the 20 standard claims the
OpenID Connect specification defines. When you pass email as a scope value
you request the two claims email and email_verified from the OpenID
provider.

address The address scope value represents the single standard claim address.
When you pass address as a scope value you request the address claim from
the OpenID provider.

phone The phone scope value represents two claims: phone_number and
phone_number_verified. These two claims are also part of the 20 standard
claims the OpenID Connect specification defines. When you pass phone as a
scope value you request the two claims phone_number and
phone_number_verified from the OpenID provider.

Defining a set of standard scopes helps maintain interoperability among multiple OpenID
providers and client applications. However, an OpenID provider can define its own scope
values for requesting claims, and in such cases you can find the definitions of those scope
values in the corresponding OpenID provider’s documentation.
The Google OpenID provider, for example, has a custom scope called
https://www.googleapis.com/auth/drive.file. You can find more details about it here:
https://developers.google.com/identity/protocols/oauth2/openid-connect#scope-param.
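The scope-to-claims grouping in table 5.1 can be represented as a simple lookup table. The following sketch shows how a client might compute which claims to expect for a given scope string; custom scopes, such as Google’s Drive scope, are simply absent from the map:

```python
# The four standard scopes and the standard claims they group (table 5.1)
SCOPE_TO_CLAIMS = {
    "profile": ["name", "family_name", "given_name", "middle_name",
                "nickname", "preferred_username", "profile", "picture",
                "website", "gender", "birthdate", "zoneinfo", "locale",
                "updated_at"],
    "email": ["email", "email_verified"],
    "address": ["address"],
    "phone": ["phone_number", "phone_number_verified"],
}

def expected_claims(scope):
    """Claims a client can expect for the standard scope values
    in a space-separated scope string."""
    claims = []
    for value in scope.split():
        claims.extend(SCOPE_TO_CLAIMS.get(value, []))
    return claims

print(expected_claims("openid email phone"))
# ['email', 'email_verified', 'phone_number', 'phone_number_verified']
```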

DO NOT REDEFINE A STANDARD SCOPE! As a best practice, you should not redefine the
meaning of the standard scopes already defined in the OpenID Connect specification. Doing so causes
confusion and breaks interoperability between OpenID providers and client applications.

5.2.4 OpenID Connect defines twenty standard claims


A standard claim is an identifier whose meaning is well defined in the OpenID Connect
specification. The OpenID Connect specification defines 20 such standard claims. In this
section we explore the definition of each of them.


In section 5.2.3 we discussed how 19 of the 20 standard claims map to the four standard
scopes OpenID Connect defines. The only standard claim that is not mapped to a scope is
sub. The sub claim is a special identifier in the ID token, used to identify the owner of the
token. The following table defines all 20 standard claims OpenID Connect defines.

Table 5.2 The standard claims defined by the OpenID Connect specification

Claim identifier Description

sub An identifier owned by the OpenID provider, which represents the end-user.

name The full name of the end-user.

given_name The first name or the given name of the end-user.

family_name The last name or the family name of the end-user.

middle_name The middle name of the end-user.

nickname The nickname of the end-user.

preferred_username The username preferred by the end-user. The value of this claim may not be unique
across all the users managed by the OpenID provider.

profile The URL of the end-user’s profile page.

picture The URL of the end-user’s profile picture.

website The URL of the end-user’s website.

email The end-user’s email address. The value of this claim may not be unique across all
the users managed by the OpenID provider.

email_verified The value of this claim is true if the end-user’s email has been verified by the
OpenID provider, and false otherwise.

gender The gender of the end-user. OpenID Connect defines male and female as
possible values; if neither is applicable, the OpenID provider can pick
its own.

birthdate The end-user’s birthdate.

zoneinfo The end-user’s time zone, for example Europe/Paris or America/Los_Angeles.

locale The end-user’s locale, represented as a language tag, for example en-US.
phone_number The end-user’s phone number. The value of this claim may not be unique across all
the users managed by the OpenID provider.

phone_number_verified The value of this claim is true if the end-user’s phone_number has been
verified by the OpenID provider, and false otherwise.

address The end-user’s preferred postal address.

updated_at The time when the end-user’s information was last updated.

Defining a set of standard claims helps maintain interoperability among multiple OpenID
providers and client applications. However, an OpenID provider can define its own claims,
and in such cases you can find the definitions of those claims in the corresponding
OpenID provider’s documentation. If you look at the Google OpenID provider documentation
available at https://developers.google.com/identity/protocols/oauth2/openid-connect, for
example, you will find the custom claim hd, which is the hosted G Suite domain of the user.

THE OPENID PROVIDER SHOULD KNOW HOW TO INTERPRET THE VALUE OF THE SCOPE
PARAMETER In section 5.2.2, we discussed how eBay uses name as the scope parameter to request
claims from the Apple OpenID provider. The name value is not a standard scope. However, as we discuss in
section 5.2.4, name is a standard claim. The rule is, you can’t pass a standard claim as a scope value in the
OpenID Connect authentication request and expect the corresponding claim value in the response. If you use
anything other than the standard scope values in the authentication request, the behavior of such scope
values must be defined by the corresponding OpenID provider’s documentation. So, we can conclude that in
section 5.2.2, name is a custom scope defined by the Apple OpenID provider, not a standard claim.

5.2.5 OAuth 2.0 scope vs. OpenID Connect scope


OpenID Connect does not redefine the scope parameter that is already defined in the OAuth
2.0 specification, which we discussed in chapter 2. In this section you’ll learn different
applications of the scope parameter under OAuth 2.0 and OpenID Connect.
The scope parameter in the context of OAuth 2.0 defines the scope of the access token
issued by the authorization server. Typically, the intended audience of the access token is a
resource. A resource can be an API, a microservice, and so on. So, the scope of the access
token defines what actions the client application can perform on a resource using the
corresponding access token.
When you access the Facebook Graph API, for example, to read a given user’s Facebook
feed, you need an access token with the scope user_posts. The same access token
is not good enough to read the list of all Facebook Pages that the user has liked; for that
you need an access token with the user_likes scope. If you’d like to read a given user’s
Facebook posts as well as all the Facebook Pages that the user has liked, then you need an
access token with both the user_posts and user_likes scopes. You can read more about
scopes related to the Facebook Graph API here:
https://developers.facebook.com/docs/permissions/reference.
In OpenID Connect, however, the application of scopes is a bit different from how they are
used with OAuth 2.0 to access APIs or microservices. It is not a complete deviation, but
rather a different usage. In OpenID Connect, scopes are mostly used by a client application
to communicate to the OpenID provider which claims of the user it expects in the response,
or the ID token.
When a client application requests both the ID token and access token from an OpenID
provider in the authentication request, the client application cannot explicitly say which
scopes are related to the access token and which are related to the ID token. 2 It’s up to the
OpenID provider to decide which claims it should embed into the ID token. Then again, when
you use the standard scopes defined in the OpenID Connect specification, which we
2 As we discussed in chapter 3, not all the OpenID Connect authentication flows return both the ID token and the access token. In the implicit
authentication flow, for example, when the response_type parameter is set to id_token, the OpenID provider only returns the ID token.

discussed in section 5.2.3, you can expect those claims to be returned to the client
application via the ID token. In section 5.3 you’ll learn how to use an access token issued by
an OpenID provider to request claims from the userinfo endpoint. In fact, the userinfo
endpoint of the OpenID provider is secured with OAuth 2.0, and the claims returned by it are
bound to the scope of the corresponding access token the OpenID provider returns along
with the ID token.
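When a single authentication request mixes identity scopes with API scopes, it can help to reason about the two groups separately. The following hypothetical sketch splits a scope string into the OpenID Connect standard scopes and everything else (the resource scopes a provider like Facebook would define); the helper name and grouping are ours, not part of any specification:

```python
# The scope values standardized by OpenID Connect
OIDC_SCOPES = {"openid", "profile", "email", "address", "phone"}

def partition_scopes(scope):
    """Split a scope string into OpenID Connect identity scopes
    and provider-defined resource scopes."""
    identity, resource = [], []
    for value in scope.split():
        (identity if value in OIDC_SCOPES else resource).append(value)
    return identity, resource

print(partition_scopes("openid profile user_posts user_likes"))
# (['openid', 'profile'], ['user_posts', 'user_likes'])
```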

5.2.6 Do it yourself! Requesting claims using scopes with cURL


In this section you’ll learn how to request claims from an OpenID provider using scopes with
cURL. This is very similar to the exercise in section 2.7. However, instead of the implicit
authentication flow, here we use the authorization code authentication flow.

SETTING UP GOOGLE AS AN OPENID PROVIDER


In this section you’ll learn how to obtain a client_id and client_secret from the Google
OpenID provider. To set up Google as an OpenID provider, please check
https://github.com/openidconnect-in-action/samples/blob/master/IDPs.md. Once you are
done with that, you get a client_id and a client_secret for your client application.
Also, while registering your client application with Google, you need to provide one or
more redirect_uris. If you use the authorization code flow, for example, the OpenID
provider returns the authorization code to the redirect_uri in the authentication request,
and it must be one of the registered redirect_uris. Let’s assume you have
https://localhost:3000/redirect_uri as the redirect_uri. The following code snippet lists all
the parameters we need to know to build a client application against the Google OpenID
provider following the authorization code flow.

Listing 5.3 Parameters related to Google OpenID provider


client_id: 4504439922519l17d82cli8.apps.googleusercontent.com
client_secret: uELdLc60SXLTO23P2hu2VkNi
redirect_uri: https://localhost:3000/redirect_uri
Google authorization endpoint: https://accounts.google.com/o/oauth2/v2/auth
Google token endpoint: https://oauth2.googleapis.com/token

CONSTRUCTING THE AUTHENTICATION REQUEST


In this section we’ll construct an OpenID Connect authentication request for the authorization
code flow with the parameters from listing 5.3 and some additional parameters such as
scope, state, response_type, and nonce. We discussed in detail the definition of these
parameters in chapter 2. To use authorization code flow, we need to set the value of the
response_type parameter to code.
The following listing shows the complete authentication request; replace the value of
client_id with what you got from Google after registering the client application. Also, we
use two random string values for the state and nonce parameters, and we expect the
OpenID provider’s response to carry the same state value. When you try out the URL in the
following code listing, make sure that there are no line breaks.


Listing 5.4 An authorization code flow request to the Google OpenID provider
https://accounts.google.com/o/oauth2/v2/auth?
client_id=4504439922519l17d82cli8.apps.googleusercontent.com&
redirect_uri=https://localhost:3000/redirect_uri&
scope=openid profile&
response_type=code&
state=caf7871khs872&
nonce=89hj37b3gd3

The most important parameter in listing 5.4 is scope. That’s the parameter we focus on in
this section, and here we set its value to openid profile. So, we use the standard scope
profile to request attributes from the Google OpenID provider, and expect the standard
claims grouped under the profile scope (see table 5.1) in the response (more precisely, in
the ID token).
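Rather than typing the URL in listing 5.4 by hand, you can generate it with fresh random values for state and nonce. A sketch using only the Python standard library; the client_id here is the placeholder value from listing 5.3:

```python
import secrets
from urllib.parse import urlencode

def build_auth_request(client_id, redirect_uri):
    """Build an authorization code flow request like listing 5.4.
    Returns the URL plus the state and nonce to verify later."""
    state = secrets.token_urlsafe(16)   # random; checked against the response
    nonce = secrets.token_urlsafe(16)   # random; checked inside the ID token
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile",
        "response_type": "code",
        "state": state,
        "nonce": nonce,
    }
    url = "https://accounts.google.com/o/oauth2/v2/auth?" + urlencode(params)
    return url, state, nonce

url, state, nonce = build_auth_request(
    "4504439922519l17d82cli8.apps.googleusercontent.com",
    "https://localhost:3000/redirect_uri")
print(url)
```

The client application must remember the state and nonce values it generated, so it can compare them against the redirect response and the ID token later.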
Once you copy and paste the request in listing 5.4 into your browser location bar, it will
take you to the Google OpenID provider for authentication. If you are not logged in already,
you may see a screen similar to figure 5.2. The same screen displays the name of the client
application you used during the application registration process, and the set of attributes
Google is about to share with the client application. We used profile as the scope value (in
addition to openid) in the authentication request, and the list of attributes shown in figure
5.2 is related to that.

Figure 5.2 Google login screen for user authentication. The name of the client application you used during
application registration process is displayed on the screen along with the user attributes Google is going to
share with the client application.


Even though the profile scope groups 14 standard claims under it, the Google OpenID
provider is only sharing four claims with the client application: name, email_address,
locale, and picture. This is quite important to notice. There is no guarantee that the
OpenID provider will return all the claims a client application requests via scopes.
It’s up to the OpenID provider, and also the end-user, to decide which claims to share with
the client application.
Once you complete the authentication flow at the Google OpenID provider, it redirects you
back to the client application (to the redirect_uri we provided). However, since we do not
have any application running at that address, the response from the Google OpenID provider
will remain in the browser location bar, as shown in the following code snippet.

Listing 5.5 The response from the Google OpenID provider with the authorization code
https://localhost:3000/redirect_uri?
state=caf7871khs872&
code=4/5wFzvDar86R-AJECIT&
scope=profile openid https://www.googleapis.com/auth/userinfo.profile&
authuser=0&
prompt=consent

In listing 5.5, in the response from the Google OpenID provider, we got the authorization
code along with four other parameters. If you are familiar with the typical response you get
from an authorization endpoint, you may find this response from the Google OpenID provider
a bit odd. As per the OpenID Connect specification (and the OAuth 2.0 RFC), the response
from the authorization endpoint should carry only two parameters: the code and the state. 3

USE OF SCOPE WITH AUTHORIZATION CODE AND IMPLICIT AUTHENTICATION FLOWS The
scope is a parameter defined in the OAuth 2.0 RFC, and the OpenID provider is not required to return it from
the authorization endpoint. As per the RFC, the OpenID provider is only required to send back the scope
parameter in the response from the token endpoint for the OpenID Connect authorization code authentication
flow. However, when using the implicit authentication flow, as discussed in section 3.7.2, the OpenID provider
is supposed to return the scope parameter in the response from the authorization endpoint.

However, the response from the Google OpenID provider includes three additional
parameters: scope, authuser, and prompt. These parameters are not defined in either the
OpenID Connect specification or the OAuth 2.0 RFC. Also, neither of these specifications
restricts an OpenID provider from returning custom parameters in the response, so you
should be able to safely ignore them, or else look for the documentation provided by the
corresponding OpenID provider and handle them accordingly. The additional parameters the
Google OpenID provider returns are not documented anywhere, so we believe these
parameters could possibly be used by Google applications themselves. 4
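Handling the redirect in listing 5.5 boils down to checking the state and picking out the code; any extra provider-specific parameters, such as Google’s authuser and prompt, can simply be ignored. A minimal sketch:

```python
from urllib.parse import urlparse, parse_qs

def handle_redirect(redirect_url, expected_state):
    """Validate the state and extract the authorization code from the
    redirect_uri callback; unknown parameters are simply ignored."""
    params = parse_qs(urlparse(redirect_url).query)
    if params.get("state", [None])[0] != expected_state:
        raise ValueError("state mismatch - possible CSRF attack")
    return params["code"][0]

# The response from listing 5.5, reduced to the parameters we care about
response = ("https://localhost:3000/redirect_uri?state=caf7871khs872&"
            "code=4/5wFzvDar86R-AJECIT&authuser=0&prompt=consent")
print(handle_redirect(response, "caf7871khs872"))  # 4/5wFzvDar86R-AJECIT
```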

3 In section 3.9, we discussed in detail how the OpenID Connect authorization code flow works.
4 The Google OpenID provider documentation is available at https://developers.google.com/identity/protocols/oauth2/openid-connect.


CONSTRUCTING THE TOKEN REQUEST


In this section you’ll learn how to exchange the authorization code you got from the previous
section by talking to the token endpoint of the Google OpenID provider for an access token
and an ID token, with cURL.
You may recall that in the previous section we used code as the value of the
response_type parameter (listing 5.4) in the request we sent to the authorization endpoint
of the Google OpenID provider. We always add the response_type parameter to requests
to the authorization endpoint and, based on that, the OpenID provider generates the
response.
Similarly, any request to the token endpoint carries the parameter grant_type. For the
OpenID Connect authorization code flow, we must use authorization_code as the value of
the grant_type parameter. The grant_type parameter helps the OpenID provider to
generate the response from the token endpoint.
The following code listing shows the cURL request to exchange the authorization code
(listing 5.5) for an access token and an ID token.

Listing 5.6 Exchanging the authorization code to an access token


\> export CLIENTID=4504439922519l17d82cli8.apps.googleusercontent.com
\> export CLIENTSECRET=uELdLc60SXLTO23P2hu2VkNi
\> export CODE=4/5wFzvDar86R-AJECIT
\> export REDIRECTURI=https://localhost:3000/redirect_uri
\> curl -v -k --user "$CLIENTID:$CLIENTSECRET" \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "code=$CODE&grant_type=authorization_code" \
-d "client_id=$CLIENTID&redirect_uri=$REDIRECTURI" \
https://oauth2.googleapis.com/token

We got the value of the code parameter from the authorization endpoint of the Google
OpenID provider in listing 5.5. In listing 5.6, we first export the values of client_id,
client_secret, code, and redirect_uri to four environment variables. You may have your
own values for these four environment variables.
The values of client_id, client_secret, and redirect_uri in listing 5.6 are for the
client application that you registered at the Google OpenID provider, and will be the same for
all the requests your application sends to the token endpoint of the Google OpenID provider.
However, the value of the code will change for each request. The following code listing shows
the response from the token endpoint of the Google OpenID provider.

Listing 5.7 The response from the token endpoint of Google OpenID provider
{
"access_token": "ACCESS_TOKEN",
"expires_in": 3599,
"scope": "https://www.googleapis.com/auth/userinfo.profile openid",
"token_type": "Bearer",
"id_token": "ID_TOKEN"
}


For clarity, in listing 5.7 we’ve removed the value of the access_token parameter, and
replaced it with ACCESS_TOKEN, as well as replaced the value of id_token with ID_TOKEN.
The value of the access_token is a lengthy string generated by the Google OpenID provider,
and the value of the id_token is a JWT.
As you learned in chapter 4, a JWT can be a JSON Web Signature (JWS) or a JSON Web
Encryption (JWE). The value of the id_token parameter can be either of them. However, the
Google OpenID provider returns a JWS. A JWS has three parts: the header, the body (claims
set), and the signature.
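Instead of pasting a production token into an external website, you can decode the two readable parts of a JWS locally. The following sketch base64url-decodes the header and body; note that it deliberately skips signature verification, which a real client must still perform (see chapter 4). The toy token at the end is built in place, so the example is self-contained:

```python
import base64
import json

def decode_jws_parts(token):
    """Decode the header and claims set of a JWS without verifying
    the signature (for inspection only - never trust unverified claims)."""
    header_b64, body_b64, _signature = token.split(".")
    def b64url_json(segment):
        padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
        return json.loads(base64.urlsafe_b64decode(padded))
    return b64url_json(header_b64), b64url_json(body_b64)

def encode(obj):
    """Base64url-encode a JSON object, without padding, as JWS does."""
    raw = base64.urlsafe_b64encode(json.dumps(obj).encode())
    return raw.rstrip(b"=").decode()

# A toy token with the algorithm from listing 5.8 and a fake signature
token = (encode({"alg": "RS256", "typ": "JWT"}) + "." +
         encode({"iss": "https://accounts.google.com"}) + ".fakesig")
header, body = decode_jws_parts(token)
print(header["alg"], body["iss"])  # RS256 https://accounts.google.com
```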

INSPECTING THE ID TOKEN RETURNED BY THE OPENID PROVIDER


The following listing shows the decoded header of the id_token. To decode the JWT, you
can use the jwt.io website. However, use that website only for testing. As a best practice,
don’t use external tools hosted on the internet to decode the ID tokens you get in a
production environment.

Listing 5.8 The decoded header of the ID token


{
"alg": "RS256",
"kid": "f092b612e9b6447deb10685bb8ffa8ae62f6aa91",
"typ": "JWT"
}

The following listing shows the decoded body of the id_token, which carries the claims we
requested using the profile scope. In section 3.7.3, we had a detailed discussion of the
parameters included in the ID token, so if you need any clarification on the content of listing
5.9, please refer to section 3.7.3.

Listing 5.9 The decoded body (claims set) of the ID token


{
"iss": "https://accounts.google.com",
"azp": "450443992251-9l17d82cli8npa9cdrvcp1g9m17gft9u.apps.googleusercontent.com",
"aud": "450443992251-9l17d82cli8npa9cdrvcp1g9m17gft9u.apps.googleusercontent.com",
"sub": "108063262378861625804",
"at_hash": "b23ENtJghQgmS83SIgZVfA",
"nonce": "89hj37b3gd3",
"name": "Prabath Siriwardena",
"picture": "https://lh3.googleusercontent.com/a-
/AOh14GiriDTmbf8tcSKzMkFYvYwYuBMUmGFdtEBqpvRGOA=s96-c",
"given_name": "Prabath",
"family_name": "Siriwardena",
"locale": "en",
"iat": 1604877203,
"exp": 1604880803
}
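Before trusting the claims in listing 5.9, the client application must check the key parameters of the ID token. The following is a hedged sketch of the basic checks (issuer, audience, nonce, expiry); signature verification, covered in chapter 4, is still required on top of this, and the claim values here are illustrative:

```python
import time

def check_id_token_claims(claims, issuer, client_id, nonce):
    """Basic ID token claim checks: issuer, audience, nonce, and expiry.
    Signature verification must be done separately."""
    if claims["iss"] != issuer:
        raise ValueError("unexpected issuer")
    if claims["aud"] != client_id:
        raise ValueError("token issued for another client")
    if claims.get("nonce") != nonce:
        raise ValueError("nonce mismatch - possible replay")
    if claims["exp"] <= time.time():
        raise ValueError("token expired")
    return True

# Illustrative claims, shaped like the body in listing 5.9
claims = {"iss": "https://accounts.google.com", "aud": "my-client",
          "nonce": "89hj37b3gd3", "exp": time.time() + 3600}
print(check_id_token_claims(claims,
      "https://accounts.google.com", "my-client", "89hj37b3gd3"))  # True
```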

5.3 Returning scope bound claims from the userinfo endpoint


In this section we discuss a way of returning claims to a client application from the userinfo
endpoint of an OpenID provider. In section 5.2 you learnt how to use scopes to request

claims from the OpenID provider. In the example we had in section 5.2, based on the
scopes, the OpenID provider included all or some of the requested claims in the ID token it
returned from the token endpoint.
In the approach we discuss in this section, the client application talks to a special
endpoint at the OpenID provider called the userinfo endpoint to request claims. You’ll first
learn what the userinfo endpoint is and why OpenID Connect introduced this special
endpoint, and then we’ll take you through an example with the Google OpenID provider.

5.3.1 How the userinfo endpoint works


In this section you’ll learn how the userinfo endpoint works in an OpenID Connect
authentication flow. In chapter 3 we discussed the three types of authentication flows in
OpenID Connect: the authorization code flow, the implicit flow, and the hybrid flow. In
section 3.3, you learnt how implicit flow works, and in section 3.9, how the authorization
code flow works. We’ll discuss hybrid flow in detail in chapter 6.

Figure 5.3 The client application first gets an access token from the OpenID provider and then uses that
access token to request claims from the userinfo endpoint. The claims issued from the userinfo endpoint are
bound to the scope of the corresponding access token.

All three of these flows get you an access token, except the implicit flow with the
response_type parameter set to id_token. Having an access token is the primary
requirement for a client application to retrieve claims from the userinfo endpoint (see figure
5.3).
The userinfo endpoint is protected with OAuth 2.0, and that’s why a client application
must have an access token to access it. The client application must first complete an OpenID
Connect authentication flow that returns an access token, and then use that access token to
talk to the userinfo endpoint of the OpenID provider; the userinfo endpoint returns the
requested claims with respect to the corresponding user as a JSON payload.
To refresh your memory from chapter 3, the following table lists the tokens you get under
each authentication flow, based on the value of the response_type parameter in the
authentication request.

Table 5.3 The authentication flows defined by the OpenID Connect specification

Type of flow The value of response_type Tokens returned

Authorization code code The authorization endpoint returns the
authorization code, and then the code can be exchanged for an access token
and an ID token by talking to the token endpoint.

Implicit id_token The authorization endpoint returns the ID token. No
requests to the token endpoint and no access tokens.

Implicit id_token token The authorization endpoint returns an ID token and
an access token. No requests to the token endpoint.

Hybrid code id_token The authorization endpoint returns an authorization
code and an ID token. The authorization code can be exchanged for an access
token and an ID token by talking to the token endpoint. Please refer to
chapter 6 for more details.

Hybrid code id_token token The authorization endpoint returns an
authorization code, an ID token, and an access token. The authorization code
can be exchanged for an access token and an ID token by talking to the token
endpoint. Please refer to chapter 6 for more details.

Hybrid code token The authorization endpoint returns an authorization code
and an access token. The authorization code can be exchanged for an access
token and an ID token by talking to the token endpoint. Please refer to
chapter 6 for more details.
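Table 5.3 can also be summarized in code as a lookup from response_type to what the authorization endpoint returns directly. This is our own sketch of the mapping, useful for deciding whether a given flow can call the userinfo endpoint without a trip to the token endpoint:

```python
# What the authorization endpoint returns directly for each
# response_type (per table 5.3); with a code, the token endpoint
# can later issue an access token and an ID token.
AUTHZ_ENDPOINT_RETURNS = {
    "code": ["code"],                                # authorization code flow
    "id_token": ["id_token"],                        # implicit flow
    "id_token token": ["id_token", "access_token"],  # implicit flow
    "code id_token": ["code", "id_token"],           # hybrid flow
    "code id_token token": ["code", "id_token", "access_token"],  # hybrid
    "code token": ["code", "access_token"],          # hybrid flow
}

def returns_access_token_directly(response_type):
    """True if the authorization endpoint itself returns an access token."""
    return "access_token" in AUTHZ_ENDPOINT_RETURNS[response_type]

print(returns_access_token_directly("id_token token"))  # True
print(returns_access_token_directly("code"))            # False
```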


Unlike the ID token, which is a JWT, the response from the userinfo endpoint is a plain JSON
payload, unless the client application specifically requests a JWT. However, in most cases,
like the ID token, the claims included in the response from the userinfo endpoint are also
based on the value of the scope parameter in the OpenID Connect authentication request.
In section 5.5 we discuss an approach for requesting claims from an OpenID provider
without relying on the scope value in the authentication request.
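In practice, calling the userinfo endpoint is a plain HTTP GET carrying the access token as an OAuth 2.0 bearer token. The following sketch builds (but does not send) such a request; the endpoint URL shown is Google’s published userinfo endpoint, which you should confirm against the provider’s discovery document, and ACCESS_TOKEN stands in for a real token:

```python
import urllib.request

def build_userinfo_request(userinfo_endpoint, access_token):
    """Build a GET request to the userinfo endpoint, authorized with
    the access token as an OAuth 2.0 bearer token."""
    return urllib.request.Request(
        userinfo_endpoint,
        headers={"Authorization": f"Bearer {access_token}"})

req = build_userinfo_request(
    "https://openidconnect.googleapis.com/v1/userinfo", "ACCESS_TOKEN")
print(req.get_header("Authorization"))  # Bearer ACCESS_TOKEN
# urllib.request.urlopen(req) would return the claims as a JSON payload,
# bound to the scope of the access token.
```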

5.3.2 Why OpenID Connect introduced the userinfo endpoint


In this section you’ll learn why OpenID Connect introduced the userinfo endpoint. The
OpenID provider can transport user claims to the client application via the ID token. Why did
the OpenID Connect specification introduce another endpoint at the OpenID provider to get
the same set of claims? At first glance this sounds redundant. There are a couple of historic
reasons that possibly influenced the OpenID Connect specification to introduce the userinfo
endpoint.

1. To onboard OAuth 2.0 client applications that worked around OAuth 2.0 to
authenticate users.
2. To onboard legacy client applications that didn’t have the cryptographic capabilities to
validate an ID token.
REASON 1 FOR THE OPENID CONNECT SPECIFICATION TO INTRODUCE THE USERINFO ENDPOINT
Prior to OpenID Connect, people worked around OAuth 2.0 to share user claims between an
identity provider (OAuth 2.0 authorization server) and client applications. In section 1.5, we
discussed how the client applications use Login with Facebook to authenticate users.
Facebook only supports OAuth 2.0 (at the time of this writing); however, it introduced an
endpoint (https://graph.facebook.com/me) where client applications can authenticate
with an access token corresponding to a Facebook user and retrieve their claims.
This endpoint Facebook introduced is quite similar to the userinfo endpoint we have in
OpenID Connect. However, it’s not only Facebook that introduced such endpoints to work
around OAuth 2.0 to share user claims; many other OAuth 2.0 authorization servers did the
same. OpenID Connect standardized this approach with the introduction of the userinfo
endpoint. The userinfo endpoint also helped migrate to OpenID Connect those OAuth 2.0
client applications that had worked around OAuth 2.0 to authenticate users. These client
applications used to call a custom endpoint hosted at the OpenID provider and secured with
OAuth 2.0 to get user claims. To migrate to OpenID Connect, they only had to use the
userinfo endpoint instead of the custom endpoint.

REASON 2 FOR THE OPENID CONNECT SPECIFICATION TO INTRODUCE THE USERINFO ENDPOINT
A client application that receives user claims in a JWT must validate the integrity of the
token by verifying its signature. In chapter 4, we discussed the JWT verification process in
detail. In the OpenID Connect implicit authentication flow and in some of the hybrid flows
(where the response_type is code id_token or code id_token token), the OpenID
provider returns the ID token in the response from the authorization endpoint.
The response from the authorization endpoint always flows via the browser. That means
the end user sees what’s in it and has the ability to change the content of the ID token. So,

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>



the client application should not accept any ID token that flows via the browser without
validating its integrity.
However, some time back not all programming languages had libraries to validate a JWT,
and the userinfo endpoint helped onboard applications developed in those languages
to OpenID Connect. A client application can use the access token it gets from the
authorization endpoint of the OpenID provider to retrieve user claims from the userinfo
endpoint. The communication between the client application and the userinfo endpoint must
happen over TLS, and the client can rely on TLS to protect the integrity of the
communication channel.
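To make the integrity argument concrete, the following sketch verifies a JWT signature by hand. It's a simplified illustration using HS256 with a made-up shared secret; real ID tokens are typically signed with RS256, and in practice you'd use a JWT library rather than rolling your own.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))

def verify_hs256_jwt(token: str, secret: bytes) -> dict:
    """Verify an HS256-signed JWT and return its claims set.
    A real validator must also check the alg header and the iss,
    aud, exp, and nonce claims."""
    header_b64, payload_b64, signature_b64 = token.split(".")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        raise ValueError("ID token signature verification failed")
    return json.loads(b64url_decode(payload_b64))

# Build a toy ID token to verify (normally the OpenID provider does this).
secret = b"a-shared-secret"
header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
payload = b64url(json.dumps({"iss": "https://server.example.com",
                             "sub": "248289761001"}).encode())
signature = b64url(hmac.new(secret, f"{header}.{payload}".encode(),
                            hashlib.sha256).digest())
id_token = f"{header}.{payload}.{signature}"

claims = verify_hs256_jwt(id_token, secret)
print(claims["sub"])  # 248289761001
```

A token tampered with in transit, or signed with a different key, fails the compare_digest check and is rejected, which is exactly the guarantee a client forgoes if it accepts an ID token from the browser without verification.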

5.3.3 When to use the userinfo endpoint to retrieve user claims


In this section we discuss when to use the userinfo endpoint to retrieve claims, instead of
using the ID token. The two historical reasons we discussed in section 5.3.2 that influenced
OpenID Connect to introduce the userinfo endpoint are no longer valid for any new
applications you develop today; there are now libraries in many programming languages
to do JWT verification. So, is there any motivation to use the userinfo endpoint? In
practice we do not see any need to use it, and in the rest of this section
we'll explore the details.
Most new applications that follow the OpenID Connect authorization code flow rely
on the ID token to retrieve claims from the OpenID provider instead of using the userinfo
endpoint. One popular question we hear quite often with respect to retrieving claims using an
ID token is, is there a size limit on the ID token? The answer is no. You can populate an
ID token with any number of claims you wish. But, wait! The answer is not complete yet!
Although there is no size limit for an ID token, the way you transport an ID token from
the OpenID provider to the client application could enforce certain restrictions. When you use
the implicit flow, for example, the client application gets the ID token as a URI fragment. The
URI fragment is part of the URL, and browsers enforce limits on the length of the URL.
Microsoft Internet Explorer, for example, has a 2,083-character limit on the length of the URL.
In such cases, if you are constrained to use the implicit flow (or have some other
reason not to use the authorization code flow) and you expect the ID token with the
user claims to exceed the character limit of the URL, then you would need to go with the
userinfo endpoint to retrieve user claims. However, our recommendation is to go with the
authorization code flow unless there is a very specific reason to avoid it. Section
2.10 highlights the pros and cons of picking the authorization code flow over the implicit flow.
If you use the authorization code flow, then there isn't a need to use the userinfo endpoint.

5.3.4 Using the userinfo endpoint with the Google OpenID provider with cURL
In this section you'll learn how to retrieve claims from the userinfo endpoint of the Google
OpenID provider, using cURL. We assume you have completed the examples in section 5.2.6
and have a valid access token issued by the Google OpenID provider. If the access
token has expired, you can redo the steps in section 5.2.6 to get a new access token.
The following cURL command uses that access token to authenticate to the userinfo
endpoint and retrieve user claims. Here we first export the value of the access token to the
TOKEN environment variable and then run the cURL command against the userinfo endpoint


of the Google OpenID provider. As you learned already, the userinfo endpoint is an OAuth 2.0
protected endpoint, so we need to pass the access token in the Authorization HTTP header
with the Bearer prefix.

> export TOKEN=ya29.A0AfH6SMBl

> curl -H "Authorization: Bearer $TOKEN" \
https://openidconnect.googleapis.com/v1/userinfo

The following listing shows the response from the userinfo endpoint, which is a JSON payload
(not a JWT) that carries the user claims we requested via the scope parameter.

Listing 5.10 The response from the userinfo endpoint that returns user claims
{
  "sub": "108063262378861625804",
  "name": "Prabath Siriwardena",
  "given_name": "Prabath",
  "family_name": "Siriwardena",
  "picture": "https://lh3.googleusercontent.com/a-/AOh14GiriDTmbf8tcSKzMkFYvYwYuBMUmGFdtEBqpvRGOA\u003ds96-c",
  "locale": "en"
}

If you compare the JSON response in listing 5.10 with the JWT claims set in listing 5.9
that is part of the ID token, you'll find that in both cases the Google OpenID provider returns
the same set of user claims. However, the JWT claims set has additional claims (for example,
iss, aud, exp, and so on), but those are not directly related to the user; rather, they are used
by the client application to validate the JWT.

5.4 Cross-origin resource sharing (CORS)


When you protect a SPA with OpenID Connect, you need to talk to the userinfo endpoint of
the OpenID provider from the browser itself. In practice, your SPA and the OpenID provider
run on two different domains. In chapter 3, when the SPA running on one domain talked to
the token endpoint of the OpenID provider running on another domain, we had to enable
CORS for the token endpoint; otherwise, the browser won't allow cross-domain API calls. In
the same way, when your SPA talks to the userinfo endpoint of the OpenID provider, you also
need to enable CORS for the userinfo endpoint.
In section 3.11, we briefly discussed cross-origin resource sharing (CORS). In this section
we delve deeper into CORS. First, you'll learn how browsers enforce the same-origin policy, and
then you'll learn how CORS helps you make cross-domain API calls from a SPA. When you
protect a SPA with OpenID Connect, you need to enable CORS for both the token
endpoint and the userinfo endpoint at the OpenID provider.

5.4.1 The same-origin policy


In this section you'll learn what the same-origin policy is and how browsers support it. The
same-origin policy (https://en.wikipedia.org/wiki/Same-origin_policy) is a web security
concept introduced by Netscape Navigator 2.02 in 1995 to ensure that scripts running on a
particular web page in the browser can make requests only to the services running on the


same origin. The origin of a given URL consists of the URI scheme, hostname, and port. Given
the URL http://localhost:8080/login, the following elements compose the origin:
• http—The URI scheme
• localhost—The hostname/IP-address
• 8080—The port
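The origin decomposition above can be sketched with the standard library; this simplified helper ignores the default-port normalization (80/443) that real browsers also apply.

```python
from urllib.parse import urlsplit

def origin_of(url: str) -> tuple:
    """Return the (scheme, hostname, port) triple that forms the origin.
    Simplified: real browsers also normalize default ports (80/443)."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port)

def same_origin(url_a: str, url_b: str) -> bool:
    """Two URLs share an origin only if scheme, host, and port all match."""
    return origin_of(url_a) == origin_of(url_b)

print(origin_of("http://localhost:8080/login"))   # ('http', 'localhost', 8080)
print(same_origin("http://localhost:8080/login",
                  "http://localhost:8080/home"))  # True: only the path differs
print(same_origin("http://localhost:8080/login",
                  "https://localhost:8080/login"))  # False: scheme differs
```

Note that the path never appears in the triple, which is why /login plays no part in the origin comparison.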

The part after the port isn't considered to be part of the origin; therefore, /login isn't
part of the origin. The same-origin policy exists to prevent a malicious script
on one website from unintentionally accessing data on other websites. The same-origin policy
applies only to data access, not to loading CSS, images, and scripts, so you can write
web pages that link to CSS, images, and scripts of other origins. To be precise,
you can load scripts from other origins, but a script loaded that way still cannot invoke an
endpoint on a different origin. Figure 5.4 illustrates this scenario.

Figure 5.4 In a web browser, the same-origin policy ensures that scripts running on a particular web page can
make requests only to services running on the same origin.

Here are the steps shown in figure 5.4:

1. The browser loads an HTML file (index.html) from the domain example.com. This
request is successful.

2. The index.html file loaded into the browser makes a request to the same domain
(example.com) to load CSS and JavaScript files; it also loads data (makes an HTTP
request) from the domain example.com. All requests are successful because
everything is from the same domain as the web page itself.


3. The index.html file loaded into the browser makes a request to a domain named
example.org to load CSS and JavaScript files. This request, although made to a
different domain (example.org) from a web page loaded from another domain
(example.com), is successful because it's loading only CSS and JavaScript.

4. The index.html file loaded into the browser loads data (makes an HTTP request) from
an endpoint on domain example.org. This request fails because, by default, the
browser doesn’t allow web pages in one domain (example.com) to make HTTP
requests to endpoints in other domains (example.org) unless the request is for CSS,
JavaScript, or images.

5.4.2 What is the danger of not having a same-origin policy?


In this section you'll learn the importance of the same-origin policy and the risk of
browsers not implementing it.
Suppose you’re logged into Gmail in your web browser. As you may know, Gmail uses
cookies in your browser to maintain data related to your browser session. Someone sends
you a link (via email or chat) that appears to be a link to an interesting website. You click
this link, which results in loading that particular website in your browser.
If this website has a page with one or more scripts that access the Gmail APIs to retrieve
your data, the lack of a same-origin policy would allow those scripts to be
executed. Because you're already authenticated to Gmail, and your session data is stored
locally in cookies and/or browser local storage, a request to the Gmail APIs would submit these
cookies as well. So effectively, a malicious website has authenticated to Gmail pretending to
be you and is now capable of retrieving any data that the Gmail APIs/services provide.
Now you have a good understanding of the same-origin policy in web browsers, its
importance, and the risks of not having such a policy. But in practice you still need
cross-origin requests to work in certain scenarios, especially with a SPA that invokes a set of
APIs/services outside its own domain. Let's take a look at how web browsers facilitate resource
sharing among domains to support such legitimate use cases.

5.4.3 Using cross-origin resource sharing


Web browsers have an exception to the same-origin policy: cross-origin resource sharing
(CORS), a specification that allows web browsers to access selected resources on different
origins; see https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS. In this section you'll
learn how CORS allows cross-domain API calls from a web browser.
CORS allows a SPA running on origin example.com to access resources running on
origin example.org. Web browsers use the OPTIONS HTTP method along with special HTTP
headers to determine whether to allow or deny a cross-origin request. Let's see how the
protocol works.
Whenever the browser detects that it’s about to execute a script that makes a request to
a different origin, it sends an HTTP OPTIONS request to the resource on the particular origin.
You can observe this request, known as a preflight request, by inspecting it on the Network
tab of your browser’s developer tools. The request includes the following HTTP headers:


• Access-Control-Request-Headers—Indicates the HTTP headers that the request is
about to send to the server (such as origin, x-requested-with)
• Access-Control-Request-Method—Indicates the HTTP method about to be executed
by the request (such as GET)
• Origin—Indicates the origin of the web application (such as http://example.com)

The server responds to this preflight request with the following headers:
• Access-Control-Allow-Credentials—Indicates whether the server allows the
request originator to send credentials in the form of authorization headers, cookies, or
TLS client certificates. This header is a Boolean value that indicates true or false.
• Access-Control-Allow-Headers—Indicates the list of headers allowed by the
particular resource on the server. If the server allows more than is requested via the
Access-Control-Request-Headers header, it returns only what is requested.
• Access-Control-Allow-Methods—Indicates the list of HTTP methods allowed by the
particular resource on the server. If the server allows more than is requested via the
Access-Control-Request-Method, it returns only the one requested (such as GET).
• Access-Control-Allow-Origin—Indicates the cross-origin allowed by the server.
The server may support more than one origin, but what is returned in this particular
header is the value of the Origin header requested if the server supports cross-origin
requests from the domain of the request originator (such as http://localhost:8080).
• Access-Control-Max-Age—Indicates for how long, in seconds, browsers can cache
the response to the particular preflight request (such as 3600).

Upon receiving the response to the preflight request, the web browser validates the response
headers to determine whether the target server allows the cross-origin request. If the
response headers to the preflight request don’t correspond to the request to be sent
(perhaps the HTTP method isn’t allowed, or one of the required headers is missing in the
Access-Control-Allow-Headers list), the browser stops the cross-origin request from being
executed, and will show an error message stating that a cross-origin request was blocked.
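To make the preflight exchange concrete, here's a minimal sketch of the server-side decision described above. The allow-list configuration (origins, methods, headers) is hypothetical, not part of any specification; a real OpenID provider would configure these per endpoint.

```python
# Hypothetical server-side CORS configuration for one protected resource.
ALLOWED_ORIGINS = {"http://localhost:8080", "https://app.example.com"}
ALLOWED_METHODS = {"GET", "POST"}
ALLOWED_HEADERS = {"authorization", "content-type", "x-requested-with"}

def preflight_response(origin, req_method, req_headers):
    """Build CORS response headers for an OPTIONS preflight request,
    or return None when the cross-origin request must be denied."""
    if origin not in ALLOWED_ORIGINS or req_method not in ALLOWED_METHODS:
        return None
    # Header names are compared case-insensitively.
    if not set(h.lower() for h in req_headers) <= ALLOWED_HEADERS:
        return None
    return {
        "Access-Control-Allow-Origin": origin,       # echo the single origin
        "Access-Control-Allow-Methods": req_method,  # only what was requested
        "Access-Control-Allow-Headers": ", ".join(sorted(req_headers)),
        "Access-Control-Allow-Credentials": "true",
        "Access-Control-Max-Age": "3600",            # cache preflight for 1h
    }

resp = preflight_response("http://localhost:8080", "GET", ["Authorization"])
```

The browser then compares these response headers against the actual request it wants to make; a None here corresponds to the blocked-request error you'd see in the developer console.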

5.5 Requesting individual claims via the authentication request


In this section you will learn how a client application can request claims individually from an
OpenID provider. In sections 5.2 and 5.3 we discussed how an OpenID provider returns user
claims to a client application via the ID token and in the response from the userinfo
endpoint. In both cases, the OpenID provider decides which claims it wants to share with
the client application based on the value of the scope parameter in the authentication request.
In section 5.2.3 we discussed that OpenID Connect defines four scopes (profile, email,
address, phone), each of which maps to a set of standard claims (table 5.2). When
you use scopes to request claims, you cannot request an individual claim. Instead, you
always request a scope, and you get back from the OpenID provider the set of claims that are
mapped to the corresponding scope.


To request individual claims from an OpenID provider, you need to use the claims
parameter in the OpenID Connect authentication request. The claims parameter is part of
the request object we discussed in section 3.12.
With the claims parameter in the authentication request, a client application can tell the
OpenID provider which claims it expects in the ID token and which claims it expects in the
response from the userinfo endpoint. When you use scopes to request user claims, it is not
possible to instruct the OpenID provider to return two different sets of claims, one set in
the ID token and the other set in the response from the userinfo endpoint.

5.5.1 An example of requesting individual claims


In this section we'll take you through an example of requesting individual claims. Whether
you request claims with scopes or with the claims parameter in the authentication request,
the response from the OpenID provider looks the same either way, both in the ID token and
in the response from the userinfo endpoint. The following code listing shows an example of the
value of the claims parameter, which goes with the request object (a JWT) to the
OpenID provider as discussed in section 3.12.

Listing 5.11 Requesting claims using the claims parameter


"claims":{ "userinfo": #A
{
"given_name": {"essential": true}, #B
"nickname": null, #C
"email": {"essential": true},
"picture": null
},
"id_token": #D
{
"gender": null,
"birthdate": {"essential": true}
}
}

#A This JSON object carries the claim identifiers that are expected to be in the response from the userinfo endpoint.
#B The essential keyword indicates to the OpenID provider that this is a required claim for the client application to function properly.
#C The null keyword indicates to the OpenID provider that there is no special requirement in requesting this claim.
#D This JSON object carries the claim identifiers that are expected to be in the ID token.

In listing 5.11, the client application requests the given_name, nickname, email, and picture
claims from the userinfo endpoint and expects the gender and birthdate claims to be in the ID
token. All of these are standard claim identifiers defined in the OpenID Connect
specification, and you can find the complete list in table 5.2.
The following listing shows the complete OpenID Connect authentication request that
carries the claims parameter. This example uses the authorization code
authentication flow; however, it works in a similar manner for the implicit and hybrid flows as
well. The claims parameter is inside the JWT that goes under the request parameter.
That's why you cannot see it in the following code listing.


Listing 5.12 Authentication request that carries the claims parameter


https://server.example.com/authorize?
response_type=code&
client_id=4504439922519l17d82cli8.apps.googleusercontent.com&
redirect_uri=https://localhost:3000/redirect_uri&
scope=openid&
state=af9ifjsldky&
nonce=n-0S6_WzA2Mj&
request=[JWT]

The following listing shows the decoded claims set of the JWT. There you can find the value
of the claims parameter. In section 3.12 we already discussed the structure of the request
parameter.

Listing 5.13 The decoded request parameter that carries the claims parameter
{
  "iss": "s6BhdRkqt3",
  "aud": "https://server.example.com",
  "response_type": "code",
  "client_id": "4504439922519l17d82cli8.apps.googleusercontent.com",
  "redirect_uri": "https://localhost:3000/redirect_uri",
  "scope": "openid",
  "state": "af9ifjsldky",
  "nonce": "n-0S6_WzA2Mj",
  "max_age": 86400,
  "claims": {
    "userinfo": {
      "given_name": {"essential": true},
      "nickname": null,
      "email": {"essential": true},
      "picture": null
    },
    "id_token": {
      "gender": null,
      "birthdate": {"essential": true}
    }
  }
}
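The request-object payload and authentication request above can be assembled as plain data structures before signing. This is a sketch; in a real flow you'd sign the payload into a JWT (as chapter 3.12 discusses), and the request=[JWT] placeholder stands in for that signed JWT (urlencode percent-encodes the brackets).

```python
from urllib.parse import urlencode

# The value of the claims parameter, mirroring listing 5.11.
claims_request = {
    "userinfo": {
        "given_name": {"essential": True},  # required for the app to work
        "nickname": None,                   # no special requirement
        "email": {"essential": True},
        "picture": None,
    },
    "id_token": {
        "gender": None,
        "birthdate": {"essential": True},
    },
}

# The request-object payload, mirroring listing 5.13 (sign this into a JWT).
request_object_payload = {
    "iss": "s6BhdRkqt3",
    "aud": "https://server.example.com",
    "response_type": "code",
    "client_id": "4504439922519l17d82cli8.apps.googleusercontent.com",
    "redirect_uri": "https://localhost:3000/redirect_uri",
    "scope": "openid",
    "state": "af9ifjsldky",
    "nonce": "n-0S6_WzA2Mj",
    "max_age": 86400,
    "claims": claims_request,
}

# The authentication request URL; the signed JWT replaces "[JWT]".
query = urlencode({"response_type": "code",
                   "client_id": request_object_payload["client_id"],
                   "scope": "openid", "request": "[JWT]"})
auth_url = "https://server.example.com/authorize?" + query
```

Note that the claims parameter travels only inside the signed request object, which is why it never appears as a plain query parameter in listing 5.12.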

5.5.2 Why it is useful to request individual claims


In this section you'll learn why it is useful to request individual claims and when you would
pick that approach over using scopes to request claims.
When a client application requests claims using scopes, the corresponding access token
issued by the OpenID provider is bound to the requested scopes. An access token is mostly
used by a client application to access an API or a microservice on behalf of the user who
logged in to the application. So, the access token is bound to the scopes related to claims,
as well as to the permissions the client application needs to access an API or a microservice.


Figure 5.5 The client application can access the userinfo endpoint of the OpenID provider with the provided
access_token to retrieve user claims.

The API or the microservice does not need to know the user's claims. However, if you pass an
access token with a scope that is bound to a set of user claims, then the API or the
microservice can use that access token to talk to the userinfo endpoint of the OpenID
provider and retrieve user claims, which is not a desirable behavior. So, if instead of using
scopes to request claims you use the claims parameter in the OpenID Connect
authentication request to instruct the OpenID provider to include user claims in the ID token,
then whoever gets the corresponding access token won't be able to retrieve user claims.
However, if the authentication request from the client application to the OpenID provider
asks for user claims in the response from the userinfo endpoint, then the API
or the microservice can still use the corresponding access token to talk to the userinfo endpoint
of the OpenID provider and retrieve user claims. So, as a developer, when you decide which
claims you want from the userinfo endpoint and which embedded in the ID token, you need to
be mindful about whom you give access to those claims.

5.6 Using custom claims


All the claims used in the examples of this chapter so far are standard claims
defined in the OpenID Connect specification. However, in practice you may need more
claims than the standard ones, based on your business requirements. In this section you'll
learn how to use custom claims with OpenID Connect. A custom claim is any claim that is not


defined in the OpenID Connect specification. If you take the Google OpenID provider, for
example, you will find the custom claim hd, which is the hosted G Suite domain of the user.
The way a client application requests claims from an OpenID provider does not change
based on whether it's a custom claim or a standard claim. However, the OpenID provider you
use must support custom claims; in other words, it should let
you define custom claims, and most open source and commercial OpenID provider
products do.
If you have a client application that allows login with multiple OpenID providers, then you
cannot expect consistency among different OpenID providers on how they handle custom
claims. For example, to request the billing address of a user, you may use the custom claim
billing_address with one OpenID provider and the custom claim billingAddress with
another OpenID provider.
In section 1.9.5, we discussed a pattern for handling such cases by delegating
the claim transformation logic to a centralized identity provider. For example, your
application expects the ID token to carry the billing_address identifier all the time,
irrespective of which OpenID provider the user picks, and the centralized identity provider takes
care of transforming the different claim identifiers from different OpenID providers into the one
that your application expects. However, how you do the claim transformation is outside the
scope of OpenID Connect.
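Such a transformation layer can be sketched as a simple renaming step. The provider names and claim mappings below are hypothetical, purely for illustration; a centralized identity provider would drive them from its per-provider configuration.

```python
# Hypothetical per-provider mappings from provider-specific claim names
# to the canonical names the application expects.
CLAIM_MAPPINGS = {
    "provider-a": {"billing_address": "billing_address"},
    "provider-b": {"billingAddress": "billing_address"},
}

def normalize_claims(provider: str, claims: dict) -> dict:
    """Rename provider-specific claim identifiers to canonical ones;
    claims with no mapping pass through unchanged."""
    mapping = CLAIM_MAPPINGS.get(provider, {})
    return {mapping.get(name, name): value for name, value in claims.items()}

normalized = normalize_claims(
    "provider-b", {"sub": "1234", "billingAddress": "221B Baker St"})
print(normalized)  # {'sub': '1234', 'billing_address': '221B Baker St'}
```

With the mapping in place, the application only ever sees billing_address, regardless of which OpenID provider the user logged in with.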

USING CLAIMS TO CONTROL ACCESS TO A CLIENT APPLICATION How you control access at the
client application is outside the scope of OpenID Connect; OpenID Connect only helps you by bringing
user claims from the OpenID provider to the client application. The client application can define its own
access control policies in the way that best suits it and evaluate those policies against the claims the ID
token brings in or the claims that you get from the userinfo endpoint. In chapter 8 you'll learn how to define
access control policies using the Open Policy Agent (OPA).

5.7 Claim types


The OpenID Connect specification defines three types of claims based on how they are issued by
an OpenID provider: normal claims, aggregated claims, and distributed claims. All the
examples in this chapter so far used normal claims. A normal claim is directly issued by the
OpenID provider the client application trusts; in other words, those claims are asserted by
that OpenID provider. In this section, we'll delve deeper into
aggregated claims and distributed claims, and discuss how they work and the corresponding
use cases.

5.7.1 Using aggregated claims for identity proofing


In this section you’ll learn how aggregated claims work by taking identity proofing as an
example. Aggregated claims are issued by a claims provider the client application trusts, and
returned to the client application via the ID token issued by the OpenID provider the client
application trusts. This is the first time we talk about a claims provider, so it requires a
decent introduction.


As per our discussion so far in the book, the OpenID provider holds and manages user
claims. These claims fall under the normal type, and in that case you can think of an OpenID
provider as a claims provider too. However, in certain enterprise use cases there are
specialized claims providers that are not necessarily OpenID providers.
Identity proofing services are a good example of claims providers. Identity proofing is the
process of a user reliably identifying themselves to an identity proofing service. For example,
Evident ID (https://www.evidentid.com/) is a company that provides identity-proofing
services. You can upload a scanned copy of your US driving license to Evident ID, and Evident
ID will verify the authenticity of the driving license and extract your first name, last name,
date of birth, and driving license number from it.
Some identity proofing services make the proofing process more reliable by asking the
user to hold the driving license close to their face in front of the camera, and making sure
the photo on the driving license is of the same person who holds it in front of the
camera. Only if that verification is successful will the identity proofing service
extract the corresponding claims from the driving license and record them against
the corresponding user.
So, you can call an identity-proofing service a claims provider that holds verified claims;
again, it does not necessarily need to be an OpenID provider. However, most
identity-proofing services expose an API, which an OpenID provider can consume to retrieve
verified claims.

Figure 5.6 The OpenID provider builds the ID token by aggregating claims from different claims providers. An
identity proofing service is a good example of a claims provider.

A given OpenID provider can connect to multiple claims providers (figure 5.6), and during a
user’s login process, the OpenID provider builds the ID token by aggregating all the claims it
gets from the connected claims providers. In the identity-proofing example, how the OpenID
provider should build the ID token with aggregated claims is defined in the OpenID Connect


for Identity Assurance specification (https://openid.net/specs/openid-connect-4-identity-assurance-1_0.html).
Identity proofing is only one use case of aggregated claims. The
following listing shows an example of an ID token with aggregated claims; the listing shows
only the claims set (the JWT body) of the ID token.

Listing 5.14 An example of aggregated claims in an ID token.


{
  "iss": "https://server.example.com", #A
  "sub": "248289761001", #B
  "email": "[email protected]",
  "email_verified": true,
  "_claim_names": { #C
    "verified_claims": "src1" #D
  },
  "_claim_sources": { #E
    "src1": { "JWT": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ" } #F
  }
}

#A The issuer of the ID token
#B Represents the owner of the ID token
#C Each element in the _claim_names JSON object carries a claim identifier, and the value of that claim identifier refers to an element in the _claim_sources JSON object.
#D Points to a child element inside the _claim_sources element by name
#E Each element in the _claim_sources JSON object holds another JSON object, and each of those JSON objects comes from a claims provider
#F The JWT carries aggregated claims

Listing 5.14 uses two elements in the ID token, _claim_names and _claim_sources. Each one
of them carries a JSON object. Each element in the _claim_names JSON object carries a
claim identifier, and the value of that claim identifier refers to an element in the
_claim_sources JSON object. Each element in the _claim_sources JSON object holds
another JSON object, and each of those JSON objects comes from a claims provider.
In listing 5.14, the name of the first (and only) element in the _claim_sources
JSON object is src1, and this is the element referred to from the verified_claims
element in the _claim_names JSON object. The value of the src1 element is another JSON
object, and according to the OpenID Connect specification, this JSON object must have an
element called JWT. This JWT carries all the claims provided by the corresponding claims
provider.
The identifiers _claim_names, _claim_sources, and JWT are defined in the OpenID
Connect specification, while the verified_claims identifier is defined in the OpenID Connect
for Identity Assurance specification. However, if you use aggregated claims for use
cases outside identity proofing, you need not worry about the OpenID Connect for
Identity Assurance specification, and you can use whatever claim identifiers you need
under the _claim_names JSON object. The following listing shows a more generic example of
aggregated claims.


Listing 5.15 A generic example of aggregated claims in an ID token.


{
  "iss": "https://server.example.com",
  "sub": "248289761001",
  "email": "[email protected]",
  "email_verified": true,
  "_claim_names": { #A
    "address": "src1" #B
  },
  "_claim_sources": { #C
    "src1": { "JWT": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ" } #D
  }
}

#A Each element in the _claim_names JSON object carries a claim identifier, and the value of that claim identifier refers to an element in the _claim_sources JSON object.
#B The claim identifier and its corresponding value can be found under the src1 child element of the _claim_sources element.
#C Each element in the _claim_sources JSON object holds another JSON object, and each of those JSON objects comes from a claims provider
#D The JWT carries aggregated claims
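Given a decoded ID token claims set like the one in listing 5.15, resolving which embedded JWT carries each aggregated claim is a small lookup over the _claim_names/_claim_sources structure. This sketch handles the single-source case (a claim name may also reference a list of sources, as in listing 5.17), and the embedded JWT still needs its own verification, as section 5.7.3 discusses.

```python
def aggregated_claim_jwts(id_token_claims: dict) -> dict:
    """Map each aggregated claim name to the JWT of its claims provider,
    using the _claim_names/_claim_sources structure of the ID token."""
    names = id_token_claims.get("_claim_names", {})
    sources = id_token_claims.get("_claim_sources", {})
    result = {}
    for claim_name, source_ref in names.items():
        source = sources.get(source_ref, {})
        if "JWT" in source:  # aggregated claims carry an embedded JWT
            result[claim_name] = source["JWT"]
    return result

# A claims set shaped like listing 5.15 (JWT string truncated as in the text).
id_token_claims = {
    "iss": "https://server.example.com",
    "sub": "248289761001",
    "_claim_names": {"address": "src1"},
    "_claim_sources": {
        "src1": {"JWT": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ"}
    },
}
jwts = aggregated_claim_jwts(id_token_claims)
print(jwts)  # {'address': 'eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ'}
```

Each JWT returned here must then be verified against the signing key of the claims provider it came from, not the key of the OpenID provider that issued the ID token.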

In section 5.7.3 we discuss how to verify an ID token that carries aggregated claims.

5.7.2 Using distributed claims for identity proofing


In this section you'll learn how distributed claims work by extending the identity proofing
example we discussed in section 5.7.1. The same use case we discussed with respect to
aggregated claims is applicable to distributed claims as well. However, unlike with aggregated
claims, the ID token issued by the OpenID provider does not carry claim values from other
claims providers; rather, it embeds a link to an endpoint (figure 5.7). The client application can
follow the link and get the claims directly from the corresponding claims provider.


Figure 5.7 The OpenID provider builds the ID token including the endpoint details of the claims providers, so the
client application can talk to them directly and retrieve user claims.

The following code listing shows an example of an ID token that carries distributed claims, as
per the OpenID Connect for Identity Assurance specification.

Listing 5.16 An example of distributed claims in an ID token.


{
  "iss": "https://server.example.com", #A
  "sub": "248289761001", #B
  "email": "[email protected]",
  "email_verified": true,
  "_claim_names": { #C
    "verified_claims": "src1" #D
  },
  "_claim_sources": { #E
    "src1": {
      "endpoint": "https://claims.example.com/ep", #F
      "access_token": "eyJhbGciOiJSUzI1NiIsI" #G
    }
  }
}

#A The issuer of the ID token
#B Represents the owner of the ID token
#C Each element in the _claim_names JSON object carries a claim identifier, and the value of that claim identifier refers to an element in the _claim_sources JSON object.
#D Points to a child element inside the _claim_sources element by name


#E Each element in the _claim_sources JSON object holds another JSON object that describes how to retrieve claims
from a claims provider
#F Points to an endpoint under the distributed claims provider
#G The access token to authenticate to the endpoint under the distributed claims provider

In listing 5.16, each element in the _claim_sources JSON object defines an endpoint that
belongs to a claims provider, along with an optional access token for authentication. This
endpoint should return the corresponding claims as a JWT. The following listing is an example
of an ID token that carries both aggregated claims and distributed claims.

Listing 5.17 An ID token returning both aggregated and distributed claims


{
"iss": "https://server.example.com",
"sub": "248289761001",
"email": "[email protected]",
"email_verified": true,
"_claim_names": {
"verified_claims": ["src1", "src2"] #A
},
"_claim_sources": {
"src1": { "JWT": "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ "} #B
"src2": { #C
"endpoint": "https://claims.example.com/ep",
"access_token": "eyJhbGciOiJSUzI1NiIsI"
}

}
}

#A Points to a child element inside the _claim_sources element by name


#B The JWT carries aggregated claims
#C The endpoint under this element points to the distributed claims provider
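Once the client application has the endpoint and access token from _claim_sources, retrieving a distributed claim is a plain HTTP call. The following is a minimal sketch (the class and method names are ours, not from the book's samples) that only builds the request; it presents the optional access token as a bearer token in the Authorization header.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class DistributedClaimsRequest {

    // Builds the GET request a client application could send to a distributed
    // claims endpoint taken from the _claim_sources element of the ID token.
    static HttpRequest build(String endpoint, String accessToken) {
        HttpRequest.Builder builder = HttpRequest.newBuilder(URI.create(endpoint)).GET();
        if (accessToken != null) {
            // The (optional) access token from _claim_sources authenticates the client
            builder.header("Authorization", "Bearer " + accessToken);
        }
        return builder.build();
    }

    public static void main(String[] args) {
        HttpRequest request = build("https://claims.example.com/ep", "eyJhbGciOiJSUzI1NiIsI");
        System.out.println(request.method() + " " + request.uri());
        System.out.println(request.headers().firstValue("Authorization").orElse("(none)"));
    }
}
```

The endpoint should respond with a JWT carrying the claims, which the client then verifies just as it would an aggregated claims JWT.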

5.7.3 Verifying aggregated and distributed claims


In this section you’ll learn how a client application can verify the aggregated and distributed
claims it receives from an OpenID provider or a claims provider. The ID token is the primary
vehicle to transport both aggregated and distributed claims from an OpenID provider to the
client application. In the case of aggregated claims, the claims are embedded in the ID
token itself; in the case of distributed claims, the ID token embeds an endpoint of the claims
provider. Either way, in the end, the application gets both the aggregated and distributed
claims as JWTs, so you need to follow the JWT verification process we discussed in chapter 4.
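Before verification, the client must first work out, for each entry in _claim_sources, whether it holds aggregated claims (a JWT member) or distributed claims (an endpoint member). Here is a minimal sketch, assuming the ID token payload has already been parsed into Java maps; the class name is ours, not from the book's samples.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ClaimSourceClassifier {

    // Given one entry of the _claim_sources JSON object (already parsed into a
    // Map), decide how the client should obtain and verify the claims.
    static String classify(Map<String, String> source) {
        if (source.containsKey("JWT")) {
            return "aggregated"; // claims are embedded as a JWT: verify it locally
        }
        if (source.containsKey("endpoint")) {
            return "distributed"; // follow the endpoint, then verify the returned JWT
        }
        return "unknown";
    }

    public static void main(String[] args) {
        Map<String, String> src1 = new LinkedHashMap<>();
        src1.put("JWT", "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ");
        Map<String, String> src2 = new LinkedHashMap<>();
        src2.put("endpoint", "https://claims.example.com/ep");
        src2.put("access_token", "eyJhbGciOiJSUzI1NiIsI");
        System.out.println("src1 is " + classify(src1)); // aggregated
        System.out.println("src2 is " + classify(src2)); // distributed
    }
}
```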


5.8 Summary
• A client application can use claims to identify the user, communicate with the user
(for example, use the user’s email address to send a newsletter), build a personalized user
experience, and perform application-specific authorization checks.
• A client application can request claims from an OpenID provider using scopes or
the claims parameter in the OpenID Connect authentication request.
• An OpenID provider can return claims to the client application either in the ID token
or via the userinfo endpoint.
• The OpenID Connect specification defines 20 standard claims and 4 standard scopes;
19 out of these 20 standard claims are mapped to the 4 standard scopes.
• The OpenID Connect specification defines three types of claims based on how they are
issued by an OpenID provider: normal claims, aggregated claims, and distributed
claims.
• A normal claim is issued directly by the OpenID provider that the client application
trusts; in other words, that OpenID provider asserts those claims.
• Aggregated claims are issued by a claims provider the client application trusts, and
returned to the client application via the ID token issued by the OpenID provider the
client application trusts.
• The OpenID provider does not embed the distributed claims into the ID token. Rather,
the ID token carries the corresponding claims provider’s endpoint information, so the
client application can talk to it directly and retrieve the claims.


Securing access to a server-side web application

This chapter covers

• How to secure access to a server-side web application using an OpenID Connect agent
• How to secure access to a server-side web application using an OpenID Connect proxy
• How to store tokens securely
In chapter 3 we discussed how to log in to a single-page application (SPA) using OpenID
Connect, following the implicit and authorization code flows. Even though SPAs are the most
popular application type at the time of this writing, there are still many server-side web
applications out there that we need to worry about.
Technically speaking, a SPA is also a web application, so when we say a server-side web
application we mean a web application that has some logic running in the backend. It can be,
for example, a web application that you develop using JavaServer Pages (JSP)/servlets, C#,
or any other framework that needs a runtime at the backend to execute its logic.
As we discussed in chapter 3, ideally the opposite of a SPA is a multi-page application
(MPA). But not all MPAs are server-side web applications. There can be MPAs that do not
need to run any backend logic on a web server. That’s why we use the term server-side to
identify the type of web application we are going to discuss in this chapter. However, in this
chapter we may use server-side web application, web application, and application
interchangeably to refer to the same thing.
To demonstrate how to integrate OpenID Connect with a server-side web application, in
this chapter we use the Apache Tomcat web server to host the web application. However, based
on the language or framework you would like to use to build your web application, you can pick
the application server to host your application. If you use PHP to build your application, for
example, you can probably host it on the Apache HTTP Server.


6.1 Agent-based single sign on vs. proxy-based single sign on


In this section you’ll learn what agent-based single sign on and proxy-based single sign on
are and how they work with OpenID Connect. These are the two most common ways of
integrating OpenID Connect into a server-side web application.

6.1.1 Agent-based single sign on


In this section you will learn how agent-based single sign on works with a server-side web
application. A picture is worth a thousand words! Figure 6.1 explains how agent-based
single sign on works. With agent-based single sign on, each application has its own
agent running with it that intercepts all the requests coming into it. For example, a servlet
filter implementation that runs within a Java EE web application is an agent. An agent runs
within the same process as the web application; in other words, the communication
between the agent and the application happens within the same process (no network calls). In
chapter 3, when we implemented OpenID Connect with an SPA, we used the React SDK. The
React SDK is a part of the SPA itself, and to some extent, in the case of an SPA, the SDK
plays the role of an agent.

Figure 6.1 An agent that runs in the same process as the server-side web application intercepts all the
requests coming to the web application and, for unauthenticated requests, initiates an OpenID Connect login flow.

Typically, an OpenID Connect agent is responsible for the following tasks:


1. The agent intercepts all the requests coming to the application to check whether each
request is already authenticated or not (step 1 in figure 6.1). Ideally this is done by
looking for an already known cookie in the request, which maps to some session
data the application stores in the backend (server side). In a Java EE application
running on a Tomcat server, for example, this can be tracked by using the
JSESSIONID cookie. Typically, the application server (for example, Tomcat) takes care of
managing the user session, so the agent only needs to check whether an
authenticated session exists or not.
2. For any unauthenticated request, the agent initiates an OpenID Connect login flow
and redirects the user to an OpenID provider (step 2 in figure 6.1).
3. The agent validates the OpenID Connect response it receives from the OpenID
provider and injects the user information it finds in the ID token into memory,
so the web application can simply read that information from memory with
zero knowledge of OpenID Connect (step 9 in figure 6.1). Once the agent successfully
validates the OpenID Connect response from the OpenID provider, it creates a login
session for the corresponding user.
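The first two tasks above can be modeled with a few lines of code. This is a deliberately simplified sketch of the decision the agent makes per request, not the Asgardeo agent's actual implementation; the authenticated session attribute and the class name are hypothetical names of ours.

```java
import java.util.HashMap;
import java.util.Map;

public class AgentDecision {

    // A minimal model of the agent's per-request decision: look at the
    // server-side session resolved from the session cookie (e.g., JSESSIONID);
    // if no authenticated session exists, redirect to the authorize endpoint
    // of the OpenID provider to start the login flow.
    static String decide(Map<String, Object> session, String authorizeEndpoint) {
        boolean authenticated =
            session != null && Boolean.TRUE.equals(session.get("authenticated"));
        return authenticated ? "proceed" : "redirect:" + authorizeEndpoint;
    }

    public static void main(String[] args) {
        Map<String, Object> session = new HashMap<>();
        // A fresh session: the agent initiates the OpenID Connect login flow
        System.out.println(decide(session, "https://localhost:9443/oauth2/authorize"));
        // After the agent validates the OpenID Connect response, it marks the session
        session.put("authenticated", Boolean.TRUE);
        System.out.println(decide(session, "https://localhost:9443/oauth2/authorize"));
    }
}
```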

It looks like the tasks the OpenID Connect agent carries out need some heavy lifting! If
you are to implement an OpenID Connect agent, in addition to a deep understanding of
the OpenID Connect specification, you also need to know how the corresponding application
server handles user sessions. The good news is that, as an application developer, you do not
need to worry about writing these agents yourself. You can find an agent that fits your needs,
especially one that works with your application server. In this chapter, we are using the Asgardeo
Tomcat OIDC Agent (https://github.com/asgardeo/asgardeo-tomcat-oidc-agent), which is an
open-source OpenID Connect agent implementation for Java EE applications running on
Apache Tomcat.

6.1.2 Proxy-based single sign on


In this section you will learn how proxy-based single sign on works with a server-side web
application. Unlike with agent-based single sign on, with proxy-based single sign on the proxy
runs outside of your server-side web application, in its own process. However, all the
requests coming to your application go through the proxy. Figure 6.2 explains how
proxy-based single sign on works.


Figure 6.2 The proxy that runs outside the process of the server-side web application intercepts all the
requests coming to the web application and, for unauthenticated requests, initiates an OpenID Connect login flow.

Typically, a proxy is responsible for the following tasks; you’ll find this list very
similar to what we discussed in section 6.1.1 with respect to agent-based single sign on:

1. The proxy intercepts all the requests coming to the application to check whether each
request is already authenticated or not. Ideally this is done by looking for an already
known cookie in the request (step 1 in figure 6.2).
2. For any unauthenticated request, the proxy initiates an OpenID Connect login flow
and redirects the user to an OpenID provider (step 2 in figure 6.2). For example, an
Apache or Nginx server can act as a proxy. The mod_auth_openidc module
(https://github.com/zmartzone/mod_auth_openidc), for example, running on an
Apache server knows how to initiate an OpenID Connect login flow with an OpenID
provider it trusts. Similarly, the lua-resty-openidc module
(https://github.com/zmartzone/lua-resty-openidc) running on an Nginx server knows
how to initiate an OpenID Connect login flow. In section 6.6 you’ll learn how to secure
access to your server-side web application with the lua-resty-openidc module running on
an Nginx server.


3. The proxy validates the OpenID Connect response it receives from the OpenID
provider and injects the user information it finds in the ID token into the
HTTP request it forwards to the web application, so the web application can simply
read that information from the HTTP request with zero knowledge of OpenID
Connect. Once the proxy successfully validates the OpenID Connect response from the
OpenID provider, it also creates a login session for the corresponding user.

Proxy-based single sign on is the preferred option over agent-based single sign on in the
following cases:
• When you have to work with legacy web applications that are not designed to support
OpenID Connect.
• When you are only allowed to do minimal changes for your web applications.
• When you have a large number of web applications developed using different
technology stacks.

Some time back, I worked with a large financial institution in the USA to implement proxy-based
single sign on over 200+ server-side web applications, using the mod_auth_openidc module
running on an Apache server. These web applications were written in different programming
languages and had their own built-in authentication mechanisms. Since these applications
had evolved over a couple of decades and were still working fine, no one wanted to make any
serious changes to them; that was the key motivation to go for proxy-based single sign
on. Also, since they had 200+ applications written in multiple programming
languages, if we had used agent-based single sign on, they would have needed to find an OpenID
Connect agent written in each programming language and maintain all of them.

6.2 Implementing login using an agent


In this section you’ll learn how to secure a server-side web application using the Asgardeo
Tomcat OpenID Connect Agent (https://github.com/asgardeo/asgardeo-tomcat-oidc-agent),
which is an open-source OpenID Connect agent implementation for Java EE applications
running on Apache Tomcat. To get the sample application up and running, you need to
perform the following tasks after validating the prerequisites defined at
https://github.com/openidconnect-in-action/samples/tree/master/chapter06/sample01#prerequisites.

1. Download the sample web application as a WAR (web archive) file, which is written in
Java.
2. Update the web application configuration to include the Tomcat agent for OpenID
Connect integration.
3. Register the web application at an OpenID Provider and get a client ID and client
secret for the web application.
4. Configure the client ID and client secret you got from the OpenID Provider in the web
application.
5. Build the updated web application and deploy it in Tomcat; and start the Tomcat
server.


6. Test the login flow by visiting the web application.

All the instructions corresponding to the above steps are documented in the
https://github.com/openidconnect-in-action/samples/blob/master/chapter06/sample01/README.md
file. We decided not to duplicate the steps to set up the samples in this
chapter and to keep them in the GitHub repository; as the software versions used in
the samples change, we have the freedom to update the instructions in the
README.md file as well as the code of the samples.

The Asgardeo OpenID Connect agent is a servlet filter. If you are familiar with Java,
you most probably know what a servlet filter is. For everyone else, think of a servlet filter
as a component that you can configure to intercept all the requests coming to a specific
path of your web application. Any Java EE application server supports servlet filters, so
you can deploy the servlet filter that you deployed on a Tomcat server on a JBoss
application server as well.
Let’s have a look at the code that integrates the OpenID Connect agent with the web
application. In the web.xml file, inside the oidc-sample-app/WEB-INF directory, you’ll find the
following code, which defines the fully qualified name of the Java class that implements the
OpenID Connect agent logic as a servlet filter.

Listing 6.1 Defining the Asgardeo OpenID Connect servlet filter in WEB-INF/web.xml file
<filter>
<filter-name>OIDCAgentFilter</filter-name> #A
<filter-class>io.asgardeo.tomcat.oidc.agent.OIDCAgentFilter</filter-class> #B
</filter>

#A The name of the servlet filter. This can be any name and will be referred from other places in the web.xml file.
#B The fully-qualified class name of the servlet filter implementation. This is in fact the OpenID Connect agent
implementation.

The following code listing shows how to set up the servlet filter defined in listing 6.1 against
certain paths of the web application. The OpenID Connect agent (or the servlet filter) will
protect access to these paths.


Listing 6.2 Configuring the servlet filter against different paths in WEB-INF/web.xml file
<filter-mapping> #A
<filter-name>OIDCAgentFilter</filter-name> #B
<url-pattern>/logout</url-pattern> #C
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>/oauth2client</url-pattern> #D
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>*.jsp</url-pattern> #E
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>*.html</url-pattern> #F
</filter-mapping>

#A A filter mapping defines a mapping between a filter name and a path; the application server makes sure any
request coming to that path goes through the corresponding servlet filter first
#B The name of the servlet filter, as defined in listing 6.1
#C This servlet filter will intercept all the requests coming to the /logout path of this web application
#D This servlet filter will intercept all the requests coming to the /oauth2client path of this web application
#E This servlet filter will intercept all the requests coming to any file that has the .jsp extension
#F This servlet filter will intercept all the requests coming to any file that has the .html extension

In addition to the servlet filter implementation, the Asgardeo OpenID Connect agent also
comes with an event listener implementation (SSOAgentContextEventListener). This
listener is responsible for loading certain values from the WEB-INF/classes/oidc-sample-
app.properties file. The oidc-sample-app.properties file carries a set of properties related
to the OpenID provider the web application trusts for authentication, as well as some
properties related to the web application itself, as shown in the following listing.

Listing 6.3 The WEB-INF/classes/oidc-sample-app.properties file


consumerKey=XXXXXX #A
consumerSecret=XXXXXX #B
skipURIs=/oidc-sample-app/index.html #C
errorPage=/error.jsp #D
callBackURL=http://localhost:8080/oidc-sample-app/oauth2client #E
scope=openid #F
authorizeEndpoint=https://localhost:9443/oauth2/authorize #G
logoutEndpoint=https://localhost:9443/oidc/logout #H
tokenEndpoint=https://localhost:9443/oauth2/token #I
issuer=https://localhost:9443/oauth2/token #J
jwksEndpoint=https://localhost:9443/oauth2/jwks #K
postLogoutRedirectURI=http://localhost:8080/oidc-sample-app/index.html #L

#A The client ID corresponding to this web application, which you get from the OpenID provider
#B The client secret corresponding to this web application, which you get from the OpenID provider
#C Defines the pages or URIs the OpenID Connect agent should skip (not protect)
#D Where to take the user, in case of an error
#E This is the same callback URL (or the redirect URI) you configure at the OpenID provider at the time you registered
your web application.
#F The value of the scope parameter


#G The authorize endpoint of the OpenID provider. The OpenID Connect agent will redirect any unauthenticated
requests to this endpoint.
#H The logout endpoint of the OpenID provider. When the user initiates logout, the web application will redirect the
user to this endpoint. You’ll learn about logout in chapter 7.
#I The token endpoint of the OpenID provider. The OpenID Connect agent uses this endpoint to exchange the
authorization code it got from the authorize endpoint for an access token and an ID token.
#J The issuer of the ID token. The OpenID provider defines the value of the issuer, which is included in the ID token. The
OpenID Connect agent only accepts ID tokens from an issuer it knows (trusts).
#K An endpoint defined by the OpenID provider that provides the public key associated with the signature of
the ID token.
#L The endpoint of the web application where the OpenID provider will redirect the user after logout. You’ll
learn about logout in chapter 7.

The following code listing shows how the web application defines two listeners in the WEB-
INF/web.xml file. In this section we discussed the SSOAgentContextEventListener. In
addition to that, the OpenID Connect agent defines another listener called JKSLoader. The
JKSLoader listener is responsible for loading properties from WEB-INF/classes/jks.properties
file. The jks.properties file defines a set of properties corresponding to the certificates that
are required to make a secure connection with the OpenID provider.

Listing 6.4 Two listener implementations defined in WEB-INF/web.xml file


<listener>
<listener-class>
io.asgardio.tomcat.oidc.agent.SSOAgentContextEventListener
</listener-class>
</listener>

<listener>
<listener-class>io.asgardeo.tomcat.oidc.agent.JKSLoader</listener-class>
</listener>

6.3 How the authorization code flow works with a server-side web application
The OpenID Connect agent we discussed in section 6.2 uses the authorization code flow
underneath to communicate with the OpenID provider. In this section, you’ll learn how the
authorization code flow works with respect to a server-side web application. In chapter 3,
section 3.9, we discussed how the authorization code flow works with respect to a single-page
application. If you have not read section 3.9 yet, we recommend reading it now, before
following the rest of this section. We’ve covered most of the fundamentals of the
authorization code flow in section 3.9.
Figure 6.3 shows the flow of events that take place during the login flow, following
the authorization code flow.


Figure 6.3 The server-side web application uses the authorization code flow to communicate with
the OpenID provider to authenticate the user.

Figure 6.3 is exactly the same as figure 3.11 in chapter 3; however, in
step 6, the request from the web application to the token endpoint of
the authorization server is slightly different. If you recall from chapter 3, even in the case of
a single-page application, the request to the token endpoint would be very similar to
what you see in listing 6.5. However, the token request from the single-page application
(listing 3.4) didn’t carry an HTTP Authorization header, whereas in listing 6.5, the token
request from the server-side web application carries an HTTP Authorization header. The HTTP
Authorization header is used to authenticate the client application to the token endpoint of
the authorization server.

Listing 6.5 Request to the token endpoint of the OpenID provider (authorization code flow)
POST /token HTTP/1.1
Host: oauth2.googleapis.com
Content-Type: application/x-www-form-urlencoded
Authorization: Basic XXXXXXX

code=YDed2u73hXcr783d&
client_id=your_client_id&
redirect_uri=https%3A//oauth2.example.com/code&
grant_type=authorization_code


As you already learnt in chapter 2, a single-page application is called a public client under
the OAuth 2.0 specification, and a public client does not have the capability to protect any
secrets. Since a SPA runs in the browser, it can’t hide any secrets from the end user.
Anything that you hide in the browser is visible to the end user, so there is no point in having
any credentials for an SPA. That’s the reason its token request does not carry the HTTP
Authorization header.
The server-side web application is called a confidential client under the OAuth 2.0
specification, because it can securely store secrets at the server side. The server-side web
application (either the agent or the proxy) generates the HTTP Authorization header in listing
6.5 by base64-encoding the client ID and client secret it got from the OpenID provider. The
following code snippet shows how the base64-encoding takes place.

Authorization: Basic base64-encode(client_id:client_secret)
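For example, a hypothetical helper that produces this header value with the JDK's Base64 encoder could look like the following. (Strictly, the OAuth 2.0 specification expects the client ID and secret to be form-urlencoded before the base64 encoding; for typical alphanumeric credentials the result is the same, but check your provider's documentation.)

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class BasicAuthHeader {

    // Builds the HTTP Authorization header value for authenticating the client
    // application at the token endpoint: base64 of "client_id:client_secret".
    static String build(String clientId, String clientSecret) {
        String credentials = clientId + ":" + clientSecret;
        return "Basic " + Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println(build("your_client_id", "your_client_secret"));
    }
}
```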

6.4 Storing tokens in a server-side web application


In this section you’ll learn how to store the tokens that you get from an OpenID provider during
a login flow following the authorization code flow. As per figure 6.3, you get an authorization
code in step 5, and then the access token, refresh token, and ID token in step 7. As you
already learnt in chapter 2, the authorization code is a short-lived, one-time-use token.
Typically, the lifetime of an authorization code would be 30 to 60 seconds. As a
developer you should not worry about persisting authorization codes. You only need an
authorization code to obtain an access token, a refresh token, and an ID token; then you
can simply discard it.
The web application uses the ID token to identify the corresponding user. The ID token
carries the attributes related to the user, which can also be used for attribute-based
access control to figure out what actions the user can perform in the web application. In
most cases, you will not persist the ID token as it is. Rather, you will validate it at
the server side and build an in-memory data structure that your web
application can use to retrieve user attributes without parsing the ID token every time. Anything that
you keep in the web application’s memory is bound to the browser session of the corresponding
user via a session identifier. Each time the browser sends a request to the server (or the web
application), that request carries this session identifier in a cookie, so the web application
can load the corresponding data structure from its memory.
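The in-memory approach described above can be sketched as a simple map from session identifier to the claims extracted from a validated ID token. This is our own minimal illustration, not the agent's actual session handling; a real deployment would rely on the application server's session management and add session expiry.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

public class SessionStore {

    // Maps the session identifier carried by the browser cookie to the user
    // attributes extracted from a validated ID token.
    private final Map<String, Map<String, Object>> sessions = new ConcurrentHashMap<>();

    // Called after the ID token is validated: keep only the extracted claims in
    // memory and hand back a random session identifier to set in the cookie.
    String createSession(Map<String, Object> idTokenClaims) {
        String sessionId = UUID.randomUUID().toString();
        sessions.put(sessionId, idTokenClaims);
        return sessionId;
    }

    // Called on every subsequent request: resolve the cookie value back to the
    // user attributes; null means no authenticated session exists.
    Map<String, Object> lookup(String sessionId) {
        return sessions.get(sessionId);
    }

    public static void main(String[] args) {
        SessionStore store = new SessionStore();
        String sessionId = store.createSession(
            Map.<String, Object>of("sub", "248289761001", "email", "janedoe@example.com"));
        System.out.println(store.lookup(sessionId).get("email"));
    }
}
```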


Figure 6.4 The application stores the user data extracted from an ID token in a data structure that can be
looked up by the identifier that comes in the cookie from the browser.

Figure 6.4 illustrates how the session identifier from the cookie maps to the data
structure, stored in the web application’s memory, that carries the data from the token. As
you might have rightly guessed already, this approach of storing ID token data in memory
has a limitation. When you have multiple servers hosting the same web application behind a
load balancer (figure 6.5), you need to find a way to replicate the data structure you keep in
memory across all the servers; otherwise, when the load balancer routes a request with the
cookie that carries the session identifier to a server that does not have the corresponding
data structure in memory, the request will result in an error. The other approach to fix this is to
configure the load balancer to send all the requests within a given session to the same web
server. This is called session-aware load balancing.


Figure 6.5 When the load balancer routes a request with the cookie that carries the session identifier to a
server that does not have the corresponding data structure in memory, the request results in an error.

Just like an ID token, in most cases the access token is also treated as a session
token, unless your web application has a specific requirement to use an access token to
perform certain tasks on behalf of the user even after the user has logged out from the
browser session. If you log in to meetup.com, for example, using your Google account via
OpenID Connect, meetup.com can use the access token provided to it by Google to
publish events to your Google calendar. However, if meetup.com didn’t persist the access
token, it won’t be able to update an event it published to your Google calendar (while you
were logged in to meetup.com) once your session ends. It’s a common case that meetups
get rescheduled, and if meetup.com didn’t persist your access token, it won’t be able to update
your Google calendar with the latest changes. If you are to persist an access token, you need
to treat it as confidential data and store it in an encrypted format.
Even if you don’t persist the access token, you can still perform certain tasks
on behalf of the user, while the user is not logged in to the web application, by using
the refresh token. You need to persist the refresh token in an encrypted format, and
whenever you need an access token, you can talk to the token endpoint of the authorization
server, authenticate with the refresh token, and get a new access token. In section 6.5 we
discuss refresh tokens in detail.

6.5 Refreshing an access token and an ID token


In this section you’ll learn how to refresh an access token and an ID token issued to the web
application by the OpenID provider during the login flow. As you learnt in section 2.3.3 of
chapter 2, the OAuth 2.0 specification introduced the concept of a refresh token as an OAuth
2.0 grant type to refresh an already issued access token.
Typically, the refresh token grant type is used when the current access token expires or
is near expiry, and the client application needs a new access token to work with without


having to prompt the user of the application to log in again. To use the refresh token grant
type, the application should receive an access token and a refresh token in the token
response from the OpenID provider. Figure 6.6 illustrates the refresh token grant flow.

Figure 6.6 The refresh token grant type allows a token to be renewed when it expires.

The following listing shows a cURL request to refresh an access token. This looks very
similar to listing 2.5 in chapter 2, except that in listing 6.6 we pass openid as the value of the
scope parameter.

Listing 6.6 A sample cURL request with refresh token grant type
\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=refresh_token&refresh_token=heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3&scope=openid" \
https://localhost:8085/oauth/token

The response from the OpenID provider to the cURL request in listing 6.6 may or may not
include the ID token. The OpenID Connect specification does not mandate including the ID
token in the token refresh response, so you need to check the documentation of your
OpenID provider to clarify it. If the OpenID provider includes the ID token in the
response, the ID token must adhere to the following rules defined by the OpenID Connect
specification. These rules relate to certain claims (or attributes) the ID token carries, and
in chapter 3, section 3.8.4, we explained the use of all of these attributes.
• The iss claim value of the ID token must be the same as in the ID Token issued when
the original authentication occurred.
• The sub claim value of the ID token must be the same as in the ID Token issued
when the original authentication occurred.
• The iat claim of the ID token must represent the time that the new ID Token is
issued.


• The aud claim value of the ID token must be the same as in the ID Token issued when the
original authentication occurred.
• If the ID Token contains an auth_time claim, its value must represent the time of
the original authentication, not the time that the new ID token is issued.
• The azp claim value of the ID token must be the same as in the ID Token issued
when the original authentication occurred and if no azp claim was present in the
original ID Token, one must not be present in the new ID token.
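The rules above can be sketched as a small validation routine. The following is a minimal, hypothetical sketch (not from the book's samples), assuming the original and refreshed ID token payloads are already decoded into dictionaries:

```python
def validate_refreshed_id_token(original: dict, refreshed: dict) -> bool:
    """Check a refreshed ID token against the original, following the rules
    the OpenID Connect specification defines for the token refresh response."""
    # iss, sub, and aud must not change between the original and the new ID token.
    for claim in ("iss", "sub", "aud"):
        if refreshed.get(claim) != original.get(claim):
            return False
    # iat must represent the time the new ID token was issued,
    # so it should not be earlier than the original issuance time.
    if refreshed.get("iat", 0) < original.get("iat", 0):
        return False
    # auth_time, if present, must still represent the original authentication time.
    if "auth_time" in refreshed and refreshed["auth_time"] != original.get("auth_time"):
        return False
    # azp must be unchanged, and must not appear if it was absent originally.
    if refreshed.get("azp") != original.get("azp"):
        return False
    return True
```

A client could run this check before replacing its cached ID token claims with the refreshed ones.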

Then again, what’s the point of refreshing an ID token along with the access token? In section 6.4 we explained that the web application will validate the ID token and store its values in memory for future reference. The ID token has its own expiry time too, and it is not necessarily the same as that of the access token. If an ID token is not expired you do not need to refresh it, and in practice most web application implementations do not refresh the ID token even if it is expired; rather, the web application relies on the claims it received in the original ID token throughout the same web session of the user. This approach of not refreshing an expired ID token has one major drawback. If you use the claims the ID token carries to make access control decisions, then you are making such decisions based on stale data.
Whether to refresh an ID token or not is a decision you need to make consciously. For example, in the case of a single-page application, we would not really bother about refreshing an ID token. A single-page application is mostly driven by APIs, which are secured with OAuth 2.0, and the ID token effectively plays a very minor role. You’ll use the ID token only to identify who the user is for display purposes; all the APIs will rely on the access token to find out who the user is and what actions the user can perform on the APIs. In the case of a server-side web application, whether to refresh the ID token or not depends on how much you rely on the claims the ID token carries.

6.6 Implementing login using a proxy


In this section you’ll learn how to protect a web application with OpenID Connect using a proxy. In section 6.1.2 you learnt how proxy-based single sign on works, and this section extends that discussion to show you how to implement proxy-based single sign on using an Nginx server with the lua-resty-openidc module as the proxy. To get the sample application up and running, you need to perform the following tasks after validating the prerequisites defined here: https://github.com/openidconnect-in-action/samples/tree/master/chapter06/sample02#prerequisites.
• Download and install the OpenResty server (https://openresty.org/en/). OpenResty is a
full-fledged web platform that integrates an enhanced version of the Nginx core, an
enhanced version of LuaJIT, many carefully written Lua libraries, lots of high-quality
3rd-party Nginx modules, and most of their external dependencies. It is designed to
help developers easily build scalable web applications, web services, and dynamic web
gateways. OpenResty also includes the lua-resty-openidc module.
• Register the proxy (the OpenResty server) at an OpenID Provider and get a client id
and client secret.


• Download and deploy the sample web application behind the OpenResty server, so the
OpenResty server can forward the requests to the sample web application.
• Start the OpenResty server and also the Tomcat server that runs the sample web
application.
• Test the login flow by visiting the web application. All the requests to the web
application will be proxied through the OpenResty server.

All the instructions corresponding to the above steps are available at https://github.com/openidconnect-in-action/samples/blob/master/chapter06/sample02/README.md. We decided not to include the steps to set up the sample in this chapter, and to keep them in the GitHub repository; as the software versions used in the sample change, we have the freedom to update the instructions in the README.md file as well as the code of the samples.
Let’s have a look at the OpenResty configuration that integrates the OpenID Connect agent with the web application. In the nginx.conf file, inside the /usr/local/openresty/nginx/conf directory, you’ll find the following configuration; listing 6.7 shows only the key configuration elements.

Listing 6.7 OpenResty configuration from openresty/nginx/conf/nginx.conf file


http {
  server {
    listen 443 ssl;
    location / {
      access_by_lua_block {
        local opts = {
          redirect_uri_path = "/welcome",
          discovery = "https://op/.well-known/openid-configuration", #A
          client_id = "$CLIENT_ID", #B
          client_secret = "$CLIENT_SECRET", #C
          scope = "openid email profile", #D
        }
        local res, err = require("resty.openidc").authenticate(opts)
        if err then #E
          ngx.status = 500
          ngx.say(err)
          ngx.exit(ngx.HTTP_INTERNAL_SERVER_ERROR)
        end
        ngx.req.set_header("X-USER", res.id_token.sub) #F
      }
    }
  }
}

#A The OpenID Connect metadata endpoint. The lua-resty-openidc module talks to this endpoint to discover the
authorization and token endpoints of the OpenID provider.
#B The client id from the OpenID provider. You need to replace this value with your own value.
#C The client secret from the OpenID provider. You need to replace this value with your own value.
#D Carries scope values and must include at least openid.
#E Defines what to do in case of an error condition
#F Extracts the sub attribute from the ID token and sets its value in the X-USER HTTP header. The upstream web
application can find the name of the user by reading the X-USER header from the HTTP request.
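To illustrate annotation #F, here is a minimal sketch of how an upstream application could read the X-USER header the proxy sets. This is a hypothetical WSGI handler in Python, not part of the book's Java sample; it simply trusts the header because the proxy is the only way to reach the application.

```python
def app(environ, start_response):
    """A minimal WSGI application that trusts the X-USER header set by the proxy."""
    # WSGI exposes the X-USER HTTP header as HTTP_X_USER in the environ dict.
    user = environ.get("HTTP_X_USER")
    if user is None:
        # The proxy did not authenticate the request; reject it.
        start_response("401 Unauthorized", [("Content-Type", "text/plain")])
        return [b"unauthenticated"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [f"welcome, {user}".encode()]
```

The key design point is that the upstream application never talks to the OpenID provider itself; all OpenID Connect logic stays in the proxy, and the application only consumes the identity the proxy passes along.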


6.7 Summary
• Agent-based single sign on and proxy-based single sign on are the most common
ways of integrating OpenID Connect into a web application.
• With agent-based single sign on, an agent running along with the web application
intercepts all the requests coming to the web application to check whether each
request is already authenticated; if not, it initiates an OpenID Connect login flow
with the OpenID provider it trusts.
• Unlike agent-based single sign on, with proxy-based single sign on, the proxy runs
outside of your server-side web application in its own process, and intercepts all the
requests coming to the web application.
• If an ID token is not expired, you do not need to refresh it, and in practice most
web application implementations do not refresh the ID token even if it is expired;
rather, the web application relies on the claims it received in the original ID token
throughout the same web session of the user.


Logging out

This chapter covers

• Single-logout options for applications using OpenID Connect
• Implementing OpenID Connect session management
• Implementing front-channel logout
• Implementing back-channel logout
In chapter 3 we discussed how to log in to a single-page application (SPA) using OpenID Connect, and in chapter 6 we discussed how to log in to a server-side web application. Native mobile applications are the other most popular application type; in chapter 9, we will discuss how to integrate login with OpenID Connect into a native mobile application. When developing an application, we also need to worry about logging out, in addition to logging in. Logging in creates an authenticated session with the corresponding application, while logging out kills that authenticated session. In this chapter we teach you how OpenID Connect logout works, the available options, and how to implement logout with a single-page application as well as with a traditional web application.

7.1 What is single logout?


Single logout is an extended expectation of logout, and different people have different expectations of it. Some, for example, expect single logout to log the user out from all the applications running under the current browser session that are connected to the same identity provider, and also to log out from the identity provider itself. This is the most common expectation we see.
Then again, for some, single logout means that you log out from one application and you’ll be logged out from the same application used across multiple devices. You may be logged in to Gmail from your laptop, iPhone, and iPad, and you’d expect to log out from all the Gmail applications running on different devices when you initiate single logout from one of them.


However, this is not a common expectation. Some time back Gmail provided an option to log out from all active sessions (across all the devices) with one click, but at the time of this writing it is not available. Still, you can visit https://myaccount.google.com/device-activity and selectively log out from any of the devices you have currently logged in with your Google account.
In this chapter you will learn how to implement single logout with OpenID Connect to address the most common expectation of single logout. That is, you log out from one application and you’ll be logged out from all the applications running under the current browser session that are connected to the same identity provider, and also from the identity provider itself.

7.2 Single logout options in OpenID Connect


In this section you’ll learn about the logout options available in OpenID Connect at a high level. In sections 7.3, 7.4, and 7.5, we’ll delve more deeply into each of them with hands-on samples.
The OpenID Connect core specification does not talk about logging out. Logout is defined under three other specifications developed by the OpenID working group:
• The session management specification (https://openid.net/specs/openid-connect-session-1_0.html), which defines how to implement logout functionality between an OpenID application and a provider using two HTML iframes. This is the first logout specification the OpenID working group introduced.
• The front-channel logout specification (https://openid.net/specs/openid-connect-frontchannel-1_0.html), which defines how to implement logout functionality between an OpenID application and a provider completely using the browser (front-channel), without demanding that the OpenID application load an iframe from the OpenID provider.
• The back-channel logout specification (https://openid.net/specs/openid-connect-backchannel-1_0.html), which defines how to implement logout functionality between an OpenID application and a provider using front-channel (via browser) as well as direct (back-channel) communication.

In sections 7.3, 7.4 and 7.5, you’ll learn the merits of each of these three approaches.

7.3 Implementing OpenID Connect session management


In this section you’ll learn how logout with OpenID Connect works following the OpenID
Connect session management specification. Also, you’ll learn how to implement logout for a
single-page application developed using React. If you are new to React, we recommend you
first go through appendix A and then chapter 3.

7.3.1 What’s new in OpenID Connect login flow to support logout?


In this section you’ll learn the changes the OpenID Connect session management specification has introduced to the login flow to support logout.
In section 7.2 you learnt that the OpenID Connect session management approach for logout is based on iframes. In fact, there are two iframes loaded into the client application.


One is loaded from the OpenID provider’s domain while the other is from the application’s domain itself. Figure 7.1 elaborates the message flow that happens between an application and an OpenID provider which supports the session management specification during the login flow. This will also help you recall what’s happening during the OpenID Connect login flow, which we discussed in detail in chapter 3.

Figure 7.1 The client application uses authorization code authentication flow to communicate with an OpenID
provider that supports session management to authenticate the user.

Figure 7.1 shows the OpenID Connect authorization code login flow. In chapter 3 we discussed this flow in detail, so you should be already familiar with the messages passed between the OpenID provider and the application. However, when the application interacts with an OpenID provider that supports session management, and if the session management feature is enabled at the OpenID provider end, in the response from the OpenID provider to the application (step 5 in figure 7.1), you’ll find one more parameter called session_state along with the code parameter. The value of session_state is a random string the OpenID provider generates to track the login state of the user. The following code snippet shows a sample response from the OpenID provider, which carries the session_state parameter.

https://app.example.com/redirect_uri?code=YDed2u73hXcr783d&state=Xd2u73hgj59435&session_state=xfhet67jhj3gj
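To see what the application receives, here is a small, purely illustrative Python sketch that extracts the code, state, and session_state parameters from such a redirect URL:

```python
from urllib.parse import urlparse, parse_qs

def parse_redirect(url: str) -> dict:
    """Extract the single-valued query parameters from an OpenID Connect redirect URL."""
    query = parse_qs(urlparse(url).query)
    # parse_qs maps each name to a list; take the first value for each parameter.
    return {name: values[0] for name, values in query.items()}

params = parse_redirect(
    "https://app.example.com/redirect_uri"
    "?code=YDed2u73hXcr783d&state=Xd2u73hgj59435&session_state=xfhet67jhj3gj")
```

The application exchanges the code for tokens as usual, and keeps session_state around for the iframe-based session monitoring described next.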


In addition to sending the session_state parameter to the application in step 5 in figure 7.1, in the same step the OpenID provider will also generate a random value and write it to a cookie called opbs, under its own domain name. The name of the cookie, opbs, is in fact an acronym for the OpenID Provider Browser State. Since the opbs cookie is created under the OpenID provider’s domain, any other applications (running under different domain names) won’t be able to read its value. In other words, only the OpenID provider will be able to read the value of the opbs cookie.
So, at the end of the OpenID Connect login flow, we have two new values (that you didn’t see in the normal OpenID Connect login flow); one in the session_state parameter and the other in the opbs cookie. The value in the session_state parameter is accessible to the client application, while the value in the opbs cookie is accessible only to the OpenID provider. Undoubtedly you should be curious to know what the session_state parameter and the opbs cookie are, and how they are used to facilitate logout. We are getting there soon; for the time being, let’s say those two are used to track the state of the logged-in user’s session.

7.3.2 The role of iframes loaded from the client application’s domain and the
OpenID provider’s domain
In this section you’ll learn what happens at the client application once the OpenID Connect
login flow is completed. As shown in figure 7.2, once the OpenID Connect login flow is
completed, the client application loads two iframes into the browser. One iframe is loaded
from the OpenID provider’s domain and the other iframe is loaded from the client
application’s domain. The iframe loaded from the client application’s domain will have access
to the session_state parameter, while the iframe loaded from the OpenID provider’s
domain has access to the opbs cookie.


Figure 7.2 Once the login is completed, the client application loads two iframes into the browser. One is from
its own domain and the other iframe is loaded from the OpenID Provider’s domain.

Once the iframes are loaded into the browser, they start talking to each other. The client
application’s iframe is developed by the corresponding application developer, while the
iframe loaded from the OpenID provider’s domain is developed and managed by the OpenID
provider, and is available to all the client applications. The following code listing shows a
sample JavaScript code for the client application’s iframe.


Listing 7.1 The client application’s iframe that communicates with the OpenID provider’s
iframe
var stat = "unchanged";
var mes = client_id + " " + session_state;
var targetOrigin = "https://op.example.com"; // Validates origin
var opFrameId = "op";
var timerID;

function check_session() {
  var win = window.parent.frames[opFrameId].contentWindow;
  win.postMessage(mes, targetOrigin);
}

function setTimer() {
  check_session();
  timerID = setInterval(check_session, 5 * 1000);
}

window.addEventListener("message", receiveMessage, false);

function receiveMessage(e) {
  if (e.origin !== targetOrigin) {
    return;
  }
  stat = e.data;
  if (stat === "changed") {
    clearInterval(timerID);
    // then take the actions below...
  }
}

setTimer();

As shown in listing 7.1, the role of the client application’s iframe is to periodically communicate with the OpenID provider’s iframe to check whether the value of session_state corresponding to the logged-in user has changed (see figure 7.3). The client application’s iframe uses the window.postMessage() API (https://developer.mozilla.org/en-US/docs/Web/API/Window/postMessage) to pass a message, which it constructs by concatenating the client_id of the application with the session_state, to the OpenID provider’s iframe. You may recall from section 7.3.1 that at the end of the login flow the OpenID provider passes the session_state parameter to the client application. If the value of session_state has changed, that means the user has logged out from the OpenID provider, so the client application also can log the user out.


Figure 7.3 The Client application’s iframe periodically talks to the OpenID provider’s iframe to check the
status of the user’s login session.

The following code listing shows the sample code of the OpenID provider iframe.


Listing 7.2 The OpenID provider’s iframe that communicates with the client application’s
iframe
window.addEventListener("message", receiveMessage, false);

function receiveMessage(e) { // e.data has client_id and session_state
  var client_id = e.data.split(' ')[0];
  var session_state = e.data.split(' ')[1];
  var salt = session_state.split('.')[1];

  // if message is syntactically invalid
  //   postMessage('error', e.origin) and return

  // if message comes from an unexpected origin
  //   postMessage('error', e.origin) and return

  // get_op_user_agent_state() is an OP defined function
  // that returns the User Agent's login status at the OP.
  // How it is done is entirely up to the OP.
  var opuas = get_op_user_agent_state();

  // Here, the session_state is calculated in this particular way,
  // but it is entirely up to the OP how to do it under the
  // requirements defined in this specification.
  var ss = CryptoJS.SHA256(client_id + ' ' + e.origin + ' ' +
                           opuas + ' ' + salt) + "." + salt;

  var stat = '';
  if (session_state === ss) {
    stat = 'unchanged';
  } else {
    stat = 'changed';
  }

  e.source.postMessage(stat, e.origin);
};

The OpenID provider’s iframe, upon receiving a message from the client application’s iframe, first extracts the client_id and the session_state from it. In the next section you’ll learn how the session_state parameter is constructed.

var client_id = e.data.split(' ')[0];
var session_state = e.data.split(' ')[1];

7.3.3 How the OpenID provider constructs the session_state parameter


In this section we’ll delve deep into the session_state. The OpenID provider constructs the session_state with three values: the client_id of the client application, the origin (or the domain name) of the OpenID provider, and a randomly generated value, which it writes to the opbs cookie.
Independent of the session_state the client application’s iframe passes to it, the OpenID provider’s iframe too calculates the session_state, and then it checks whether the session_state it came up with differs from what the client application’s iframe passed to it. To calculate the session_state, the OpenID provider’s iframe should


know the client_id, the origin, the salt, and the value written to the opbs cookie. The client application’s iframe passes the client_id and the origin to the OpenID provider’s iframe, and it can find the value of the salt by decoding the session_state passed to it by the client application’s iframe, as shown below. The salt is a generated value that brings randomness to the session_state.

var salt = session_state.split('.')[1];

With that, the only missing value to calculate the session_state is the value written to the opbs cookie. Since the OpenID provider’s iframe is loaded from the OpenID provider’s domain itself, it can access the opbs cookie. Now, the OpenID provider’s iframe can calculate the session_state in the following way.

var ss = CryptoJS.SHA256(client_id + ' ' + e.origin + ' ' +
                         opbs + ' ' + salt) + "." + salt;

The only reason the value of the session_state calculated by the OpenID provider’s iframe can differ from the value of the session_state passed to it by the client application’s iframe is a change to the opbs cookie.
Only the OpenID provider can override the value of the opbs cookie, and it will do so when it finds that the corresponding user has logged out from the OpenID provider. So, if the user initiates logout from another application that runs on the same browser as your application and is connected to the same OpenID provider as yours, then the OpenID provider will indirectly notify your application about the logout event by updating the value of the cookie. When the value of the opbs cookie changes, the value of session_state passed to the OpenID provider’s iframe by the client application’s iframe won’t match the value of session_state the OpenID provider’s iframe derives. That’s an indication to the client application that the user has logged out from the OpenID provider, so the client application too can initiate its own logout routines, which you will learn about in the next section.
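The session_state calculation can be mirrored in a few lines of Python. This is an illustrative sketch only (the specification leaves the exact construction up to the OP); hashlib stands in for CryptoJS.SHA256:

```python
import hashlib

def compute_session_state(client_id: str, origin: str, opbs: str, salt: str) -> str:
    """Derive a session_state value the way the OP iframe sketch does:
    SHA-256 over "client_id origin opbs salt", with the salt appended after a dot."""
    digest = hashlib.sha256(f"{client_id} {origin} {opbs} {salt}".encode()).hexdigest()
    return f"{digest}.{salt}"

# The client's stored session_state matches only while the opbs cookie is unchanged.
original = compute_session_state("my_app", "https://op.example.com", "opbs-v1", "s4lt")
```

If the OP later rewrites the opbs cookie (say, after a logout), recomputing this value yields a different session_state, which is exactly the "changed" signal the two iframes exchange.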

7.3.4 A client application initiating logout


In the previous sections under 7.3 we discussed how your client application can respond to a logout event, which it detects from the communication that happens between the OpenID provider’s iframe and the client application’s iframe. In this section you’ll learn how a client application can initiate a logout request on its own.
Whether the client application is responding to a logout event from the OpenID provider or initiating logout by itself, first it has to clear its own state under the client application’s domain. The client application will perform some of the following tasks to clear its own state, based on the type of the application.
• If the client application is a server-side web application, first it has to send a
logout request to its own backend, which carries the session cookie from the
frontend; then the backend will invalidate the session and log out the user.
• If the client application is a single-page application, it will clear any state it keeps
in local storage, session storage, or cookie storage.


• Whether the client application is a server-side web application or a single-page
application, it may decide to revoke any of the access tokens issued to it by the
OpenID provider.

Once the application clears its local state and logs out the user from the application, it will construct the following request and redirect the user to the logout endpoint of the OpenID provider. Since this is initiated by the client application itself, in the OpenID Connect domain it’s widely known as OpenID Connect Relying Party (RP) Initiated Logout, and is defined in the specification: https://openid.net/specs/openid-connect-rpinitiated-1_0.html. The following code snippet shows a sample HTTP GET request from the client application to the OpenID provider’s logout endpoint.

https://op.example.com/logout?id_token_hint=<ID_TOKEN>&state=XD2uedhgj59458&post_logout_redirect_uri=https://app.example.com/post_logout
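Building this request in code is a matter of URL-encoding three query parameters. The following Python sketch is illustrative only; the endpoint and parameter values are the placeholders from the snippet above.

```python
from urllib.parse import urlencode

def build_logout_url(logout_endpoint, id_token, post_logout_redirect_uri, state):
    """Construct an RP-initiated logout URL carrying id_token_hint,
    post_logout_redirect_uri, and state as query parameters."""
    params = urlencode({
        "id_token_hint": id_token,
        "post_logout_redirect_uri": post_logout_redirect_uri,
        "state": state,
    })
    return f"{logout_endpoint}?{params}"

url = build_logout_url(
    "https://op.example.com/logout",
    "<ID_TOKEN>",
    "https://app.example.com/post_logout",
    "XD2uedhgj59458")
```

The application then redirects the browser to this URL rather than fetching it itself, since the request must carry the user's cookies for the OpenID provider's domain.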

The value of the id_token_hint parameter carries the ID token the OpenID provider issued to the client application during the login flow. This is a recommended parameter; however, if the request also carries the post_logout_redirect_uri parameter, then id_token_hint becomes mandatory.
The post_logout_redirect_uri parameter carries a URL to which the OpenID provider should redirect the user back after it completes the logout flow. The value of the post_logout_redirect_uri parameter must be already known to the OpenID provider, so the client application should share it with the OpenID provider at the time of the application registration. In most cases, in the absence of the post_logout_redirect_uri at application registration, the OpenID provider uses the already registered redirect_uri as the post_logout_redirect_uri. The post_logout_redirect_uri is an optional parameter, and if it is not present in the request, the OpenID provider uses the value of the post_logout_redirect_uri parameter that is already registered with it.
The above explanation of the post_logout_redirect_uri parameter probably raised a question in your mind: if the value of the post_logout_redirect_uri parameter the application sends in the logout request must match the value already registered with the OpenID provider, why does the application have to send it in the logout request in the first place? Why can’t the OpenID provider always use the value of the post_logout_redirect_uri parameter registered with it to redirect the user back after completing the logout flow? The reason is that you can register multiple post_logout_redirect_uris with the OpenID provider and use one of them in the logout request to the OpenID provider.
The state parameter in the logout request is an optional parameter. If the state parameter is present in the logout request, the OpenID provider must return the same value in the state parameter to the post_logout_redirect_uri. The state parameter helps the client application correlate the front-channel state before and after redirecting the user to the OpenID provider’s logout endpoint. The value of the state parameter can be anything the client application generates.


7.3.5 The id_token_hint parameter


In section 7.3.4 we introduced id_token_hint as a parameter you pass to the logout endpoint in the logout request. The value of this parameter is the ID token issued to the client application by the OpenID provider. When the OpenID provider receives an ID token along with the logout request, it will carry out the following checks before accepting it as valid.
• The aud parameter in the ID token must refer to a client application the OpenID
provider already knows. In most cases the aud parameter in the ID token carries
the client_id that represents the OpenID client application.
• The client application corresponding to the aud value in the ID token must have a
valid session on the current browser the logout request was initiated from. Since the
logout request also carries the cookies attached to the OpenID provider’s domain, the
OpenID provider can figure out the corresponding backend session.
• If the above two conditions are met, the OpenID provider will accept the provided ID
token as valid and initiate the logout process, even if the provided ID token has
expired.
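As a rough illustration of the first check, the sketch below decodes a JWT payload without verifying the signature and confirms that its aud value refers to a known client. This is a simplified, hypothetical routine; a real OpenID provider does considerably more (signature and session checks among them).

```python
import base64
import json

def aud_is_known_client(id_token: str, known_client_ids: set) -> bool:
    """Decode the payload of a JWT (header.payload.signature) and check
    whether its aud claim refers to a registered client application."""
    payload_b64 = id_token.split(".")[1]
    # Restore the base64url padding that JWT encoding strips.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    aud = claims.get("aud")
    # aud may be a single string or a list of audiences.
    if isinstance(aud, list):
        return any(a in known_client_ids for a in aud)
    return aud in known_client_ids
```

Note that token expiry is deliberately not checked here, matching the rule above that an expired ID token is still acceptable as an id_token_hint.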

7.3.6 Implementing OpenID Connect session management with a server-side web application
In this section you’ll learn how to implement logout with a server-side web application using the Asgardeo Tomcat OpenID Connect Agent (https://github.com/asgardeo/asgardeo-tomcat-oidc-agent), which is an open-source OpenID Connect agent implementation for Java EE applications running on Apache Tomcat. To get the sample application up and running, you need to perform the following tasks after validating the prerequisites (https://github.com/openidconnect-in-action/samples/tree/master/chapter07/sample01#prerequisites).

1. Download the sample web application as a WAR (web archive) file, which is written in
Java.
2. Update the web application configuration to include the Tomcat agent for OpenID
Connect integration.
3. Register the web application at an OpenID Provider and get a client id and client
secret for the web application.
4. Configure the client id and client secret you got from the OpenID Provider in the web
application.
5. Build the updated web application and deploy it in Tomcat; and start the Tomcat
server.
6. Test the login flow by visiting the web application.
7. Once the login flow is completed you can experience the logout flow by clicking the
logout button on the application.

All the instructions corresponding to the above steps are documented in the https://github.com/openidconnect-in-action/samples/blob/master/chapter07/sample01/README.md file. We decided not to duplicate the steps to set up the samples in this chapter, and to keep them in the GitHub repository; as the software versions used in the sample change, we have the freedom to update the instructions in the README.md file as well as the code of the samples.

The Asgardeo OpenID Connect agent is developed as a servlet filter. If you are familiar with Java, you most probably know what a servlet filter is. For all the others, think of a servlet filter as a component that can be configured to intercept all the requests coming to a specific path of your web application. Any Java EE application server supports servlet filters, so you can deploy the same servlet filter that you deployed in the Tomcat server in the JBoss application server as well.
Let’s have a look at the code that integrates the OpenID Connect agent with the web application. In the web.xml file, inside the oidc-sample-app/WEB-INF directory, you’ll find the following code, which defines the fully qualified name of the Java class that implements the OpenID Connect agent logic as a servlet filter.

Listing 7.3 Defining the Asgardeo OpenID Connect servlet filter in WEB-INF/web.xml file
<filter>
  <filter-name>OIDCAgentFilter</filter-name> #A
  <filter-class>io.asgardio.tomcat.oidc.agent.OIDCAgentFilter</filter-class> #B
</filter>

#A The name of the servlet filter. This can be any name and will be referred from other places in the web.xml file.
#B The fully-qualified class name of the servlet filter implementation. This is in fact the OpenID Connect agent.

The following code listing shows how to set up the servlet filter defined in listing 7.3 against certain paths of the web application. The OpenID Connect agent (or the servlet filter) will protect access to these paths.

Listing 7.4 Configuring the servlet filter against different paths in WEB-INF/web.xml file
<filter-mapping> #A
  <filter-name>OIDCAgentFilter</filter-name> #B
  <url-pattern>/logout</url-pattern> #C
</filter-mapping>
<filter-mapping>
  <filter-name>OIDCAgentFilter</filter-name>
  <url-pattern>/oauth2client</url-pattern> #D
</filter-mapping>
<filter-mapping>
  <filter-name>OIDCAgentFilter</filter-name>
  <url-pattern>*.jsp</url-pattern> #E
</filter-mapping>
<filter-mapping>
  <filter-name>OIDCAgentFilter</filter-name>
  <url-pattern>*.html</url-pattern> #F
</filter-mapping>

#A A filter mapping defines a mapping between a filter name and a path; the application server will make sure any
request coming to that path goes through the corresponding servlet filter
#B The name of the servlet filter, as defined in listing 7.3
#C This servlet filter will intercept all the requests coming to the /logout path of this web application
#D This servlet filter will intercept all the requests coming to the /oauth2client path of this web application

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>



#E This servlet filter will intercept all the requests coming to any file that has the .jsp extension
#F This servlet filter will intercept all the requests coming to any file that has the .html extension
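To make the agent’s behavior concrete, the following is a minimal sketch (not the actual Asgardeo implementation) of the decision such a filter makes for each intercepted request: pass the request through if the path is one the agent should skip or the session is already authenticated, otherwise redirect the browser to the OpenID provider’s authorize endpoint. The class and method names here are illustrative assumptions.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Set;

// Sketch of the per-request decision an OIDC agent filter makes.
// Class and method names are illustrative, not the Asgardeo code.
public class OidcFilterDecision {

    public enum Action { PASS_THROUGH, REDIRECT_TO_AUTHORIZE }

    private final Set<String> skipUris;      // paths the agent ignores (the skipURIs property)
    private final String authorizeEndpoint;  // the OpenID provider's authorize endpoint
    private final String clientId;
    private final String callbackUrl;

    public OidcFilterDecision(Set<String> skipUris, String authorizeEndpoint,
                              String clientId, String callbackUrl) {
        this.skipUris = skipUris;
        this.authorizeEndpoint = authorizeEndpoint;
        this.clientId = clientId;
        this.callbackUrl = callbackUrl;
    }

    // Pass the request through if the path is skipped or the session is
    // already authenticated; otherwise start the login flow.
    public Action decide(String path, boolean sessionAuthenticated) {
        if (skipUris.contains(path) || sessionAuthenticated) {
            return Action.PASS_THROUGH;
        }
        return Action.REDIRECT_TO_AUTHORIZE;
    }

    // The redirect target for an unauthenticated request: an authorization
    // code request against the authorize endpoint (compare listing 7.5).
    public String authorizeUrl() {
        return authorizeEndpoint
            + "?response_type=code"
            + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8)
            + "&redirect_uri=" + URLEncoder.encode(callbackUrl, StandardCharsets.UTF_8)
            + "&scope=openid";
    }
}
```

In the real agent this decision runs inside the filter’s doFilter() method, and the actual redirect carries additional parameters such as state.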

In addition to the servlet filter implementation, the Asgardeo OpenID Connect agent also
comes with an event listener implementation (SSOAgentContextEventListener), which is
responsible for loading certain values from WEB-INF/classes/oidc-sample-app.properties. The
oidc-sample-app.properties file carries a set of properties related to the OpenID provider the
web application trusts for authentication, as well as some properties related to the web
application itself, as shown in the following listing.

Listing 7.5 The WEB-INF/classes/oidc-sample-app.properties file


consumerKey=XXXXXX #A
consumerSecret=XXXXXX #B
skipURIs=/oidc-sample-app/index.html #C
errorPage=/error.jsp #D
callBackURL=http://localhost:8080/oidc-sample-app/oauth2client #E
scope=openid
authorizeEndpoint=https://localhost:9443/oauth2/authorize #F
logoutEndpoint=https://localhost:9443/oidc/logout #G
tokenEndpoint=https://localhost:9443/oauth2/token #H
issuer=https://localhost:9443/oauth2/token #I
jwksEndpoint=https://localhost:9443/oauth2/jwks #J
postLogoutRedirectURI=http://localhost:8080/oidc-sample-app/index.html #K

#A The client id corresponding to this web application, which you get from the OpenID provider
#B The client secret corresponding to this web application, which you get from the OpenID provider
#C Defines pages or the URIs the OpenID Connect agent should not worry about
#D Where to take the user, in case of an error
#E This is the same callback URL (or the redirect URI) you configure at the OpenID provider at the time you registered
your web application.
#F The authorize endpoint of the OpenID provider. This OpenID Connect agent will redirect any unauthenticated
requests to this endpoint.
#G The logout endpoint of the OpenID provider. When the user initiates logout, the web application will redirect the
user to this endpoint. You’ll learn about logout in chapter 7.
#H The token endpoint of the OpenID provider. The OpenID connect agent uses this endpoint to exchange the
authorization code it got from the authorize endpoint to an access token and an ID token.
#I The issuer of the ID token. The OpenID provider defines the value of the issuer and is included in the ID token. The
OpenID Connect agent only accepts ID tokens from an issuer it knows (trusts).
#J This is an endpoint defined by the OpenID provider, which provides the public key associated with the signature of
the ID token.
#K The endpoint belongs to the web application, where the OpenID provider will redirect the user after logout.
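To illustrate the issuer check mentioned in annotation #I, here is a simplified sketch that decodes an ID token’s payload and compares its iss claim with the configured issuer value. A real agent uses a JWT library, parses the JSON properly, and also verifies the token’s signature against the keys published at jwksEndpoint; the naive string scan below is for illustration only, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Sketch of the issuer check behind annotation #I: decode the ID token's
// payload (the middle part of header.payload.signature) and compare its
// iss claim with the configured issuer. Illustrative only; a real agent
// verifies the signature too and uses a proper JSON parser.
public class IssuerCheck {

    public static String issuerOf(String idToken) {
        String payload = idToken.split("\\.")[1];
        String json = new String(Base64.getUrlDecoder().decode(payload), StandardCharsets.UTF_8);
        int at = json.indexOf("\"iss\":\"");
        if (at < 0) {
            return null;
        }
        int start = at + "\"iss\":\"".length();  // skip past "iss":"
        return json.substring(start, json.indexOf('"', start));
    }

    public static boolean trusted(String idToken, String configuredIssuer) {
        return configuredIssuer.equals(issuerOf(idToken));
    }
}
```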

The following code listing shows how the web application defines two listeners in the WEB-
INF/web.xml file. We already discussed the SSOAgentContextEventListener earlier in this
section; in addition to that, the OpenID Connect agent defines another listener called
JKSLoader. The JKSLoader listener is responsible for loading properties from the WEB-
INF/classes/jks.properties file. The jks.properties file defines a set of properties
corresponding to the certificates that are required to make a secure connection with the
OpenID provider.


Listing 7.6 Two listener implementations defined in WEB-INF/web.xml file


<listener>
  <listener-class>
    io.asgardio.tomcat.oidc.agent.SSOAgentContextEventListener
  </listener-class>
</listener>

<listener>
  <listener-class>io.asgardio.tomcat.oidc.agent.JKSLoader</listener-class>
</listener>
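Conceptually, a startup listener such as SSOAgentContextEventListener boils down to reading the agent’s key=value configuration once when the application context starts and failing fast if a mandatory setting is missing. The sketch below shows that idea with java.util.Properties; the class name and the required-key list are illustrative assumptions, not the actual Asgardeo code.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Conceptual sketch of what a startup listener like
// SSOAgentContextEventListener does: read the agent's key=value
// configuration once and fail fast if a mandatory setting is missing.
public class AgentConfigLoader {

    private static final String[] REQUIRED = {
        "consumerKey", "callBackURL", "authorizeEndpoint", "tokenEndpoint", "issuer"
    };

    public static Properties load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);  // java.util.Properties parses key=value lines
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read agent configuration", e);
        }
        for (String key : REQUIRED) {
            if (props.getProperty(key) == null) {
                throw new IllegalStateException("Missing required property: " + key);
            }
        }
        return props;
    }
}
```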

7.4 Implementing front-channel logout


In this section you’ll learn how to implement logout in an OpenID Connect client
application following the front-channel logout specification (https://openid.net/specs/openid-
connect-frontchannel-1_0.html). This approach defines the logout functionality between an
OpenID Connect application and a provider entirely through the browser (the front channel),
without requiring the OpenID application to load an iframe from the OpenID provider.

7.4.1 A client application initiating logout


In this section you’ll learn how an OpenID Connect client application initiates logout following
the front-channel approach and the corresponding message flow. Since the OpenID Connect
client application always initiates the logout request from the front-channel via the browser,
the message flow discussed in section 7.3.4 is also applicable here. So, to initiate a logout
request the client application sends the following request to the logout endpoint of the
OpenID provider.

https://op.example.com/logout?id_token_hint=<ID_TOKEN>&state=XD2uedhgj59458&post_logout_redirect_uri=https://app.example.com/post_logout

Please check the section 7.3.4 for a detailed explanation of each of the request parameters in
the above code snippet.
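As a hedged illustration, a client application could assemble that logout request as follows. The parameter names (id_token_hint, state, post_logout_redirect_uri) come from the request shown above; the helper class itself is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch of how a client application could assemble the RP-initiated
// logout request shown above. The class is a hypothetical helper.
public class LogoutRequestBuilder {

    public static String build(String logoutEndpoint, String idToken,
                               String state, String postLogoutRedirectUri) {
        return logoutEndpoint
            + "?id_token_hint=" + enc(idToken)            // lets the OP identify the session
            + "&state=" + enc(state)                      // echoed back to the client after logout
            + "&post_logout_redirect_uri=" + enc(postLogoutRedirectUri);
    }

    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }
}
```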

7.4.2 The OpenID provider responding to the client application’s logout request
In this section you’ll learn how the OpenID provider responds to a client application’s logout
request, following the front-channel logout specification. Figure 7.4 explains the logout
flow end to end. The client application initiates logout by talking to the
OpenID provider’s logout endpoint, and the OpenID provider responds by loading a set
of iframes under the OpenID provider’s domain, which send logout requests to the other client
applications that have active login sessions on the same browser.


Figure 7.4 The client application 1 initiates logout by talking to the OpenID provider’s logout endpoint,
and the OpenID provider responds by loading a set of iframes under the OpenID provider’s domain, which
send logout requests to the other client applications that have active login sessions on the same browser.

The OpenID provider can figure out the active sessions the user has on the same browser
and the corresponding client applications via the cookies the logout request brings in. Then,
the OpenID provider loads into the browser an iframe for each application, under the OpenID
provider’s domain name. Each iframe then, in parallel, does an HTTP GET to the
corresponding application’s logout URL. If an application supports front-channel logout, it has
to register a logout URL with the corresponding OpenID provider. The following code snippet
shows a logout request the OpenID provider initiates via its iframe.

https://app.example.com/logout?iss=<ISSUER>&sid=<SESSION_ID>

The value of the iss parameter carries the issuer identifier of the OpenID provider. This is
the same identifier the OpenID provider embeds into the ID tokens it issues during the login
flow, under the iss claim. The value of the sid parameter carries a session identifier. An
OpenID provider that supports logout adds a unique ID it generates for the login session to
the ID token it issues to the client application during the login flow.
When each client application receives the logout request from the OpenID provider along
with the iss and sid parameters, it also gets all the cookies attached to the corresponding
client application’s domain via the browser. With that the client application can uniquely
identify the user session and then initiate the logout routine.
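The RP-side handling described above can be sketched as follows. The session store here is a plain map from the sid claim (remembered at login time) to the local session; the class is a hypothetical stand-in for whatever session registry the application actually keeps.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the RP-side handler for the OP's front-channel logout call
// (GET /logout?iss=...&sid=...). Illustrative only: a real application
// would invalidate the actual HTTP session it mapped to the sid at login.
public class FrontChannelLogoutHandler {

    private final String trustedIssuer;
    private final Map<String, String> sessionsBySid = new ConcurrentHashMap<>();

    public FrontChannelLogoutHandler(String trustedIssuer) {
        this.trustedIssuer = trustedIssuer;
    }

    // Called at login time: remember which local session the OP's sid maps to.
    public void register(String sid, String localSessionId) {
        sessionsBySid.put(sid, localSessionId);
    }

    // Returns true if a local session was found and terminated.
    public boolean handleLogout(String iss, String sid) {
        if (!trustedIssuer.equals(iss)) {
            return false;                          // ignore requests from unknown issuers
        }
        return sessionsBySid.remove(sid) != null;  // invalidate the matching session
    }
}
```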


7.4.3 Implementing front-channel logout with a single-page application


In this section you’ll learn how to create a fully functioning SPA using React and then
integrate the SPA with OpenID Connect for logout. We assume you have a good knowledge
of React; if not, please go through appendix A first, which covers all the React fundamentals
you need to know to follow this section. The logout example in this section is built on top of
the same SPA we discussed in chapter 3, section 3.11, with respect to the login flow.

BUILDING A SINGLE-PAGE APPLICATION WITH REACT


In this section we’ll build a React application with no security and make sure it’s up and
running. All the samples we use in this book are available in the GitHub repository:
https://github.com/openidconnect-in-action/samples; you can either do a git clone or
download all the samples as a zip file. The sample we discuss in this section is available
under the chapter07/sample02 directory. To build this React application, run the
following command from the sample02 directory. This npm command looks at the
package.json file inside the sample02 directory, downloads all the dependent node
modules, and stores them under the sample02/node_modules directory. You won’t see the
node_modules directory in the samples you downloaded from the GitHub repository; it’s
created only after you run the following command.

\> npm install

To build the React application, run the following command from the sample02 directory on
the command console. This command will create a directory called build and copy into it all
the files that you want to deploy into your production web server.

\> npm run build

In this example we use a node server as our web server. You can start it with the following
command, run from the sample02 directory.

\> npm start

The above command starts the node server on localhost port 3000 by default; if you visit
http://localhost:3000 in your web browser, you will see a welcome message. This is the
simplest React application you can have; in the next two sections, you’ll learn how to secure
this application with OpenID Connect.

SETTING UP AN OPENID PROVIDER


To secure the React application we developed in the previous section, we need to have an
OpenID provider. As we discussed in chapter 3, section 3.9, a SPA is a public client, and
public clients cannot securely store secrets. So, there is no point in authenticating a SPA.
Here we need to use an OpenID provider that supports the authorization code flow with no
client authentication. At https://github.com/openidconnect-in-
action/samples/blob/master/IDPs.md, we explain how to set up two open source OpenID
providers; to run the examples in this section you can pick one of them. Once you
successfully set up your OpenID provider and register your application with the OpenID


provider, you need the following parameters to secure access to the React application
with OpenID Connect.

Listing 7.7 Parameters with sample values required to communicate with the OpenID provider
client_id: D4ZoMSpsxqgvUuiC6j5ROnEYea0a
redirect_uri: https://localhost:3000
Authorization endpoint: https://localhost:9443/oauth2/authz
Token endpoint: https://localhost:9443/oauth2/token
Logout endpoint: https://localhost:9443/oauth2/logout
Issuer: https://localhost:9443
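Because the SPA is a public client with no client secret, OpenID Connect libraries for SPAs commonly pair the authorization code flow with PKCE (RFC 7636), sending a hashed one-time code_challenge in the authorization request. The sketch below, with illustrative class and method names (it is not part of the sample repo), shows how the challenge is derived and how an authorization request URL could be assembled from the parameters in listing 7.7.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Illustrative sketch: for a public client, the authorization code flow is
// commonly paired with PKCE. The challenge is BASE64URL(SHA-256(verifier)).
public class AuthorizeRequest {

    public static String codeChallenge(String codeVerifier) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256")
                    .digest(codeVerifier.getBytes(StandardCharsets.US_ASCII));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);  // never on a standard JRE
        }
    }

    public static String build(String authzEndpoint, String clientId,
                               String redirectUri, String codeVerifier) {
        return authzEndpoint
            + "?response_type=code"
            + "&client_id=" + URLEncoder.encode(clientId, StandardCharsets.UTF_8)
            + "&redirect_uri=" + URLEncoder.encode(redirectUri, StandardCharsets.UTF_8)
            + "&scope=openid"
            + "&code_challenge=" + codeChallenge(codeVerifier)
            + "&code_challenge_method=S256";
    }
}
```

The library we use in the next section handles this for you; the sketch only shows what happens under the hood.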

UPDATING THE CLIENT APPLICATION TO USE OPENID CONNECT LOGIN


In this section we are going to update the React application we developed in the previous
section to support login with OpenID Connect. If you are already running the node server
that hosts the React application from the previous section, please take it down by pressing
Ctrl + C on the terminal that runs the node server.
To enable OpenID Connect login in the React application in the chapter07/sample02
directory, we use the npm package @facilelogin/oidc-react. It’s an open source npm
package released under the MIT license. We developed this package by forking two open
source node modules: https://github.com/auth0/auth0-react and
https://github.com/auth0/auth0-spa-js. If you are interested in finding more details, you can
find the two forked repositories with our changes here: https://github.com/openidconnect-in-
action/oidc-react and https://github.com/openidconnect-in-action/oidc-spa-js.
To install the @facilelogin/oidc-react package, run the following command from the
sample02 directory. Once the command runs successfully, you’ll find a new entry added into
the sample02/package.json file under the dependencies section for the
@facilelogin/oidc-react package.

\> npm install @facilelogin/oidc-react

The @facilelogin/oidc-react package introduces a new React component called
<OIDCProvider /> that carries the configuration corresponding to your OpenID provider. To
add this component to your React application, in sample02/src/index.js, replace the existing
call to the ReactDOM.render() function with the following. You also need an import
statement to import the OIDCProvider component from the @facilelogin/oidc-react
package.


Listing 7.8 Rendering the <OIDCProvider /> React component


import { OIDCProvider } from '@facilelogin/oidc-react';

ReactDOM.render(
  <OIDCProvider
    domain="localhost:9443"
    tokenEp="https://localhost:9443/oauth2/token"
    authzEp="https://localhost:9443/oauth2/authorize"
    clientId="D4ZoMSpsxqgvUuiC6j5ROnEYea0a"
    issuer="https://localhost:9443/oauth2/token"
    redirectUri={window.location.origin}>
    <App />
  </OIDCProvider>,
  document.getElementById('book-club-app')
);

Now you can replace the code in sample02/src/components/App.js with the following, which
adds a login button to the welcome page.

Listing 7.9 Updated App.js code that renders the login button to initiate the login flow
import React from 'react';
import { useAuth0 } from '@facilelogin/oidc-react';

function App() {
  const { isLoading, isAuthenticated, error, user, loginWithRedirect, logout } = useAuth0();

  if (isLoading) {
    return <div>Loading...</div>;
  }
  if (error) {
    return <div>Oops... {error.message}</div>;
  }

  if (isAuthenticated) {
    console.log(user.id);
    return (
      <div>
        Hello {user.sub}{' '}
        <button onClick={() => logout({ returnTo: window.location.origin })}>
          Log out
        </button>
      </div>
    );
  } else {
    return <button onClick={loginWithRedirect}>Log in</button>;
  }
}

export default App;

Now you can build the updated React application and start the node server using the
following two npm commands.

\> npm run build


\> npm start


Once the node server has successfully started up, you can visit http://localhost:3000 and
click the login button to initiate the login flow; you will get redirected to the OpenID
provider’s login page. Once the login flow is complete, you can click the logout button on
the client application to initiate the logout flow.

7.5 Summary
• With single logout, once you log out of one application, you’ll be logged out of all
the applications running under the current browser session that are connected to the same
identity provider.
• The OpenID Connect core specification does not talk about logout. It is defined under
three other specifications developed by the OpenID working group:
• The session management specification (https://openid.net/specs/openid-connect-
session-1_0.html) defines how to implement logout functionality between an OpenID
application and a provider using two iframes. This is the first logout specification the
OpenID working group introduced.
• The front-channel logout specification (https://openid.net/specs/openid-connect-
frontchannel-1_0.html) defines how to implement logout functionality between an
OpenID application and a provider completely using the browser (front-channel),
without demanding that the OpenID application load an iframe from the OpenID
provider.
• The back-channel logout specification (https://openid.net/specs/openid-connect-
backchannel-1_0.html) defines how to implement logout functionality between an
OpenID application and a provider using front-channel (via browser) as well as direct
(back-channel) communication.


Claim-based access control with Open Policy Agent (OPA)

This chapter covers

• Key components in an access control system


• Open Policy Agent (OPA) fundamentals and how to use OPA as a policy engine
• How to define and evaluate a policy with OPA based on the claims an ID token carries
In chapter 2 we discussed how to log in to a single-page application using OpenID Connect,
and in chapter 5 we discussed how to request claims from an OpenID provider. In this
chapter you’ll learn how to implement claim-based access control on the application side,
with the claims corresponding to the logged-in user returned to the application by the
OpenID provider.
Claim-based access control, also known as attribute-based access control, is about
controlling access to an application based on different attributes of the logged-in user. What
is a claim? In the real world, for example, only a person who is older than 21 (and in some
parts of the world, 18) can buy a beer from a retail store. Here, the age is the claim, which
decides the minimal requirement to buy a beer. Also, the age is being used to decide who
can drive, who can vote, and so on. If you have visited a theme park with your kids, you
would have noticed that for some rides there is a height restriction and you need to be at
least 4 feet tall. There, the height is the claim, which decides the minimal requirement for
certain rides.
When you build an application, you use claims to decide what a logged-in user can do in
the application. A video streaming application, for example, will only list PG-rated movies
to anyone whose age is below 13.
Most of the applications use role-based access control, where you can think of ‘role’ as
the claim. Controlling access to an application (or to any kind of a resource) based on roles is


called role-based access control. When you access a Facebook page, for example, as a
member, you can view the posts by other members, but won’t be able to delete any of
those. However, if you are the admin of that Facebook page, you can delete any posts by a
member. Here, Facebook decides what you can do on a Facebook page by the roles assigned
to you.
As discussed in chapter 5, in most cases in OpenID Connect, an application gets
the user claims via the ID token returned to it by the OpenID provider. However, to make an
access control decision, claims alone are not enough. They are the inputs, but we also
need policies! In our previous example, who can delete a post on a Facebook page
is defined by a policy.
When we build an application, we need a way to represent policies, as well as a
policy engine to evaluate policies based on certain inputs. In this chapter we use Open Policy
Agent (OPA) as our policy engine. You’ll learn in this chapter how to set up Open Policy
Agent and use its REST API to make access control decisions based on the claims an ID
token carries.

8.1 Key components of an access control system


In a typical access control system, we find five key components (figure 8.1): the policy
administration point (PAP), policy enforcement point (PEP), policy decision point (PDP), policy
information point (PIP), and policy store. The PAP is the component that lets policy
administrators and developers define access-control policies.


Figure 8.1 Components of a typical access-control system. The PAP defines access-control policies and then
stores those in the policy store. At runtime, the PEP intercepts all the requests, builds an authorization
request, and talks to the PDP. The PDP loads the policies from the policy store and any other missing
information from the PIP, evaluates the policies, and passes the decision back to the PEP. Based on the
response from the PDP, the PEP decides whether the request should be dispatched to the corresponding
resource or not.

Most of the time, PAP implementations come with their own user interface or expose the
functionality via an API. Some access-control systems don’t have a specific PAP; rather, they
read policies directly from the filesystem, so you need to use third-party tools to author
these policies. Once you define the policies via a PAP, the PAP writes the policies to a policy
store. The policy store can be a database, a filesystem, or even a service that’s exposed via
HTTP.
The PEP is the component that enforces access control policies before a request hits a
protected resource. The protected resource can be a server-side web application, a single-
page application, a mobile application, an API or even a microservice. How you design and
implement the PEP differs based on the type of the resource it wants to protect. If the
resource is a server-side web application written in Java, for example, then you can
implement the PEP as a servlet-filter. The servlet-filter intercepts all the requests directed to
a given web application, so it’s a good place to centrally enforce access control policies.
However, this approach is language specific. In other words, if you want to protect a server-
side application written in C#, then you would need to implement a similar kind of
interceptor using C#.


The best way to implement a PEP for a server-side web application in a language-agnostic
way is to front the web application with a proxy, and run the PEP at the proxy. This proxy
can be an Apache server or an Nginx server, for example. If it’s an Apache server, you would
need to build the PEP as an Apache module, and if it’s an Nginx server, you would
need to build the PEP as an Nginx module.
When the PEP intercepts a request, it extracts certain parameters from the request—for
example, certain claims from the ID token—and creates an authorization request. Then it
talks to the PDP to check whether the request is authorized or to find out what actions the
corresponding user is eligible to perform on the protected resource.
When the PEP talks to the PDP to check authorization, the PDP loads all the
corresponding policies from the policy store. And while evaluating an authorization request
against the applicable policies, if there is any required but missing information, the PDP will
talk to a PIP. For example, let’s say we have an access-control policy that says a user can
buy a beer only if the logged in user’s age is greater than 21, but the authorization request
carries only the logged in user’s name in the authorization request. The age is the missing
information here, and the PDP will talk to a PIP to find the corresponding user’s age. We can
connect multiple PIPs to a PDP, and each PIP can connect to different data sources.
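To make the PEP-to-PDP interaction concrete, here is a minimal sketch of the authorization request a PEP might assemble before calling the PDP. The attribute names (subject, action, resource) and the class name are illustrative assumptions; OPA, which we use later in this chapter, expects such attributes wrapped in an "input" object posted to its REST API.

```java
// Sketch of the authorization request a PEP could assemble before calling
// the PDP. Attribute names are illustrative; a real PEP would add whatever
// claims the ID token carries (age, roles, and so on) so the PDP can
// evaluate its policies without extra PIP lookups.
public class AuthzRequestBuilder {

    public static String build(String subject, String action, String resource) {
        return "{\"input\":{"
            + "\"subject\":\"" + subject + "\","
            + "\"action\":\"" + action + "\","
            + "\"resource\":\"" + resource + "\"}}";
    }
}
```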

8.2 Introducing Open Policy Agent


OPA is an open source, lightweight, general-purpose policy engine. In this section you will
learn OPA fundamentals and how to use OPA in practice to define and evaluate access
control policies. To define access-control policies, OPA introduces a declarative language
called Rego (http://www.openpolicyagent.org/docs/latest/policy-language).
OPA started as an open source project in 2016, with a goal to unify policy enforcement
across multiple heterogeneous technology stacks. Netflix, one of the early adopters of OPA,
uses it to enforce access-control policies in its microservices deployment. Apart from Netflix,
Cloudflare, Pinterest, Intuit, Capital One, State Street, and many more use OPA. OPA is a
graduated project under the Cloud Native Computing Foundation (CNCF).

8.2.1 OPA high-level architecture


In this section, we discuss how OPA’s high-level architecture fits into our discussion. As you
can see in figure 8.2, the OPA engine can run on its own as a standalone deployment or as
an embedded library along with an application.


Figure 8.2 An application or a PEP can integrate with the OPA policy engine via its HTTP REST API or via the Go
API.

When you run the OPA server as a standalone deployment, it exposes a set of REST APIs
that PEPs can connect to and check authorization. In figure 8.2, the OPA engine acts as the
PDP.
The open source distribution of the OPA server doesn’t come with a policy authoring tool
or a user interface to create and publish policies to the OPA server. But you can use a tool
like Visual Studio (VS) Code to create OPA policies, and OPA has a plugin for VS Code
(https://marketplace.visualstudio.com/items?itemName=tsandall.opa). If you decide to
embed the OPA server (instead of using it as a hosted server) as a library in your application,
you can use the Go API (provided by OPA) to interact with it.
Once you have the policies, you can use the OPA API to publish them to the OPA server.
When you publish those policies via the API, the OPA engine keeps them in memory only.
You’ll need to build a mechanism to publish policies every time the server boots up. The
other option is to copy the policy files in the filesystem, and the OPA server will pick them up
when it boots up. If any policy changes occur, you’ll need to restart the OPA server.
However, there is an option to ask the OPA server to load policies dynamically from the
filesystem, but that’s not recommended in a production deployment. Also, you can push
policies to the OPA server by using a bundle server; we discuss that in detail in section 8.2.5.
OPA has a PIP design to bring in external data to the PDP or to the OPA engine. This
model is quite similar to the model we discussed in the previous paragraph with respect to
policies. In section 8.2.5, we detail how OPA brings in external data.


8.2.2 Deploying OPA as a Docker container


In this section, we discuss how to deploy an OPA server as a Docker container. In OPA, there
are multiple ways of loading policies. Importantly, OPA stores those policies in memory
(there is no persistence), so that on a restart or redeployment, OPA needs a way to reload
the policies. In loading policies into OPA, the most common approaches are to either
configure OPA to download policies via the bundle API (for example, using AWS’s S3 as the
bundle server) or use volume/bind mounts to mount policies into the container running OPA.
With bind mounts, we keep all the policies in a directory in the host filesystem and then
mount it to the OPA Docker container filesystem. If you look at the
chapter08/sample01/run_opa.sh file, you’ll find the following Docker command (do not try it
as it is). Here, we mount the policies directory from the current location of the host
filesystem to the policies directory of the container filesystem under the root, and the OPA
server will run on port 8181:

\> docker run --mount type=bind,source="$(pwd)"/policies,target=/policies \
   -p 8181:8181 openpolicyagent/opa:0.15.0 run /policies --server

To start the OPA server, run the following command from the chapter08/sample01 directory.
This loads the OPA policies from the chapter08/sample01/policies directory (in section 8.2.5,
we discuss OPA policies in detail):

\> sh run_opa.sh

{
"addrs":[
":8181"
],
"insecure_addr":"",
"level":"info",
"msg":"Initializing server.",
"time":"2019-11-05T07:19:34Z"
}

You can run the following command from the chapter08/sample01 directory to test the OPA
server. The chapter08/sample01/policy_1_input_1.json file carries the input data for the
authorization request in JSON format (in section 8.2.4, we discuss authorization requests in
detail):

\> curl -v -X POST --data-binary @policy_1_input_1.json \
   http://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}
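A note on how to read that request and response: the URL path /v1/data/authz/orders/policy1 maps to the Rego package authz.orders.policy1, and the {"allow":true} in the result is the value of that policy’s allow rule. The actual contents of policy_1_input_1.json ship with the sample repo; as a hedged illustration only (the field names below are assumptions, not the sample’s real contents), an OPA input document generally looks like this, with the request attributes wrapped in an input object:

```json
{
  "input": {
    "method": "GET",
    "path": ["orders"],
    "subject": { "user": "bob", "role": "manager" }
  }
}
```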

8.2.3 Protecting an OPA server with mTLS


OPA was designed to run on the same server as the PEP that needs authorization decisions.
As such, the first layer of defense for PEP-to-OPA communication is the fact that the
communication is limited to localhost. OPA is a host-local cache of the relevant policies
authored in the PAP and recorded in the policy store. To make a decision, OPA is often self-
contained and can make the decision all on its own without reaching out to other servers.
This means that decisions are highly available and highly performant. Nevertheless, OPA


recommends defense in depth and ensuring that communication between it and its clients is
secured via mTLS.
In this section, we discuss how to protect the OPA server with mTLS. This will ensure all
the communications that happen among the OPA server and other client applications are
encrypted. Also, only legitimate clients with proper keys can talk to the OPA server. To
protect the OPA server with mTLS, we need to accomplish the following tasks:
• Generate a public/private key pair for the OPA server
• Generate a public/private key pair for the OPA client
• Generate a public/private key pair for the Certificate Authority (CA)
• Sign the public key of the OPA server with the CA’s private key to generate the OPA
server’s public certificate
• Sign the public key of the OPA client with the CA’s private key to generate the OPA
client’s public certificate

To perform all these tasks, we can use the chapter08/sample01/keys/gen-key.sh script with
OpenSSL. Let’s run the following Docker command from the chapter08/sample01/keys
directory to spin up an OpenSSL Docker container. You’ll see that we mount the current
location (which is chapter08/sample01/keys) from the host filesystem to the /export
directory on the container filesystem:

\> docker run -it -v $(pwd):/export prabath/openssl
#

Once the container boots up successfully, you’ll find a command prompt where you can type
OpenSSL commands. Let’s run the following command to execute the gen-key.sh file that
runs a set of OpenSSL commands, to generate keys that are required to secure the OPA
server with mTLS:

# sh /export/gen-key.sh

Once this command executes successfully, you’ll find the keys corresponding to the CA in the
chapter08/sample01/keys/ca directory, the keys corresponding to the OPA server in the
chapter08/sample01/keys/opa directory, and the keys corresponding to the OPA client in the
chapter08/sample01/keys/client directory.
In case you’re already running the OPA server, stop it by pressing Ctrl-C on the
corresponding command console. To start the OPA server with TLS support, use the following
command from the chapter08/sample01 directory:

\> sh run_opa_tls.sh

{
"addrs":[
":8181"
],
"insecure_addr":"",
"level":"info",
"msg":"Initializing server.",
"time":"2019-11-05T19:03:11Z"
}

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>



You can run the following command from the chapter08/sample01 directory to test the OPA
server. The chapter08/sample01/policy_1_input_1.json file carries the input data for the
authorization request in JSON format. Here we use HTTPS to talk to the OPA server:

\> curl -v -k -X POST --data-binary @policy_1_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}

Let’s check what’s in the run_opa_tls.sh script, shown in the following listing. The code
annotations in the listing explain what each argument means.

Listing 8.1 Protecting an OPA server endpoint with TLS


\> docker run \
-v "$(pwd)"/policies:/policies \ #A
-v "$(pwd)"/keys:/keys \ #B
-p 8181:8181 \ #C
openpolicyagent/opa:0.15.0 \ #D
run /policies \ #E
--tls-cert-file /keys/opa/opa.cert \ #F
--tls-private-key-file /keys/opa/opa.key \ #G
--server #H

#A Instructs the OPA server to load policies from the policies directory, which is mounted to the OPA container
#B The OPA server finds the key and certificate for the TLS communication in the keys directory, which is mounted to
the OPA container
#C Port mapping, maps the container port to the host port
#D Name of the OPA Docker image
#E Runs the OPA server by loading policies and data from the policies directory, which is mounted to the OPA
container
#F Certificate used for the TLS communication
#G Private key used for the TLS communication
#H Starts the OPA engine under the server mode

Now the communication between the OPA server and the OPA client (curl) is protected with
TLS. Still, anyone who can reach the OPA server’s IP address can access it over TLS.
There are two ways to authenticate clients to the OPA endpoint: token authentication and
mTLS.
With token-based authentication, the client has to pass an OAuth 2.0 token in the HTTP
Authorization header as a bearer token, and you also need to write an authorization policy. 1
In this section, we focus on securing the OPA endpoint with mTLS.
If you’re already running the OPA server, stop it by pressing Ctrl-C on the corresponding
command console. To start the OPA server enabling mTLS, run the following command from
the chapter08/sample01 directory:

\> sh run_opa_mtls.sh

Let’s check what’s in the run_opa_mtls.sh script, shown in the following listing. The code
annotations explain what each argument means.

1 This policy is explained at www.openpolicyagent.org/docs/latest/security/.


Listing 8.2 Protecting an OPA server endpoint with mTLS


\> docker run \
-v "$(pwd)"/policies:/policies \
-v "$(pwd)"/keys:/keys \
-p 8181:8181 \
openpolicyagent/opa:0.15.0 \
run /policies \
--tls-cert-file /keys/opa/opa.cert \
--tls-private-key-file /keys/opa/opa.key \
--tls-ca-cert-file /keys/ca/ca.cert \ #A
--authentication=tls \ #B
--server

#A The public certificate of the CA. All the OPA clients must carry a certificate signed by this CA
#B Enables mTLS authentication

You can use the following command from the chapter08/sample01 directory to test the OPA
server, which is now secured with mTLS:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_1_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy1

Here, we use HTTPS to talk to the OPA server, along with the certificate and the key
generated for the OPA client at the start of this section. The key and the certificate of the
OPA client are available in the chapter08/sample01/keys/client directory. Since we have
secured the OPA server with mTLS, only the trusted client applications can connect to it now.

8.2.4 Defining OPA policies


To define access-control policies, OPA introduces a new declarative language called Rego. 2 In
this section, we go through a set of OPA policies (listing 8.3) to understand the strength of
the Rego language. All the policies we discuss here are available in the
chapter08/sample01/policies directory and are already loaded into the OPA server we booted
up in section 8.2.2, which is protected with mTLS.

2 You can find more details about Rego at www.openpolicyagent.org/docs/latest/policy-language/.


Listing 8.3 OPA policy written in Rego


package authz.orders.policy1 #A

default allow = false #B

allow { #C
input.method = "POST" #D
input.path = "orders"
input.role = "manager"
}

allow {
input.method = "POST"
input.path = ["orders",dept_id] #E
input.deptid = dept_id
input.role = "dept_manager"
}

#A The package name of the policy. Packages let you organize your policies into modules, just as with programming
languages.
#B By default, all requests are disallowed. If this isn’t set and no allow rules match, OPA returns an undefined
decision.
#C Declares the conditions to allow access to the resource
#D The Input document is an arbitrary JSON object handed to OPA and includes use-case-specific information. In this
example, the Input document includes a method, path, role, and deptid. This condition requires that the method
parameter in the input document must be POST.
#E The value of the path parameter in the input document must match this value, where the value of the dept_id is
the deptid parameter from the input document.

The policy defined in listing 8.3, which you’ll find in the policy_1.rego file, has two allow
rules. For an allow rule to return true, every statement within the allow block must return
true. The first allow rule returns true only if a user with the manager role is the one doing
an HTTP POST on the orders resource. The second allow rule returns true if a user with the
dept_manager role is the one doing an HTTP POST on the orders resource under their own
department.
Let’s evaluate this policy with two different input documents. The first is the input
document in listing 8.4, which you’ll find in the policy_1_input_1.json file. Run the following
curl command from the chapter08/sample01 directory and it returns true, because the
inputs in the request match with the first allow rule in the policy (listing 8.3):

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_1_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}


Listing 8.4 Rego input document with manager role


{
"input":{
"path":"orders",
"method":"POST",
"role":"manager"
}
}

Let’s try with another input document, as shown in listing 8.5, which you’ll find in the
policy_1_input_2.json file. Run the following curl command from the chapter08/sample01
directory and it returns true, because the inputs in the request match with the second allow
rule in the policy (listing 8.3). You can see how the response from OPA server changes by
changing the values of the inputs:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_1_input_2.json \
https://localhost:8181/v1/data/authz/orders/policy1

{"result":{"allow":true}}

Listing 8.5 Rego input document with dept_manager role


{
"input":{
"path":["orders",1000],
"method":"POST",
"deptid":1000,
"role":"dept_manager"
}
}
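To make the rule logic concrete, the following Python sketch mirrors the two allow rules of listing 8.3 and evaluates them against the inputs from listings 8.4 and 8.5. This is only an illustration of the semantics; in practice, OPA evaluates the Rego policy natively:

```python
# Python sketch of the two allow rules in policy_1.rego (listing 8.3).
# Illustration only; the real decision is made by OPA's Rego engine.

def allow(inp):
    # Rule 1: a user with the manager role can POST to the orders resource.
    if (inp.get("method") == "POST" and
            inp.get("path") == "orders" and
            inp.get("role") == "manager"):
        return True
    # Rule 2: a dept_manager can POST only to orders under their own department.
    if (inp.get("method") == "POST" and
            inp.get("path") == ["orders", inp.get("deptid")] and
            inp.get("role") == "dept_manager"):
        return True
    return False  # default allow = false

print(allow({"path": "orders", "method": "POST", "role": "manager"}))  # True
print(allow({"path": ["orders", 1000], "method": "POST",
             "deptid": 1000, "role": "dept_manager"}))                 # True
print(allow({"path": "orders", "method": "GET", "role": "manager"}))   # False
```

As in Rego, every condition inside a rule must hold for that rule to fire, and the request is allowed if any one rule fires.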

Now let’s have a look at a slightly improved version of the policy in listing 8.3. You can find
this new policy in listing 8.6, and it’s already deployed to the OPA server you’re running.
Here, our expectation is that if a user has the manager role, they will be able to do HTTP
PUTs, POSTs, or DELETEs on any orders resource, and if a user has the dept_manager role,
they will be able to do HTTP PUTs, POSTs, or DELETEs only on the orders resource in their
own department. Also, any user, regardless of role, should be able to do HTTP GETs to
any orders resource under their own account. The annotations in the following listing explain
how the policy is constructed.


Listing 8.6 Improved OPA policy written in Rego


package authz.orders.policy2

default allow = false

allow {
allowed_methods_for_manager[input.method] #A
input.path = "orders"
input.role = "manager"
}

allow {
allowed_methods_for_dept_manager[input.method] #B
input.deptid = dept_id
input.path = ["orders",dept_id]
input.role = "dept_manager"
}

allow { #C
input.method = "GET"
input.empid = emp_id
input.path = ["orders",emp_id]
}

allowed_methods_for_manager = {"POST","PUT","DELETE"} #D
allowed_methods_for_dept_manager = {"POST","PUT","DELETE"} #E

#A Checks whether the value of the method parameter from the input document is in the
allowed_methods_for_manager set
#B Checks whether the value of the method parameter from the input document is in the
allowed_methods_for_dept_manager set
#C Allows anyone to access the orders resource under their own employee ID
#D The definition of the allowed_methods_for_manager set
#E The definition of the allowed_methods_for_dept_manager set

Let’s evaluate this policy with the input document in listing 8.7, which you’ll find in the
policy_2_input_1.json file. Run the following curl command from the chapter08/sample01
directory and it returns true, because the inputs in the request match with the first allow
rule in the policy (listing 8.6):

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_2_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy2

{
"result":{
"allow":true,
"allowed_methods_for_dept_manager":["POST","PUT","DELETE"],
"allowed_methods_for_manager":["POST","PUT","DELETE"]
}
}


Listing 8.7 Rego input document with manager role


{
"input":{
"path":"orders",
"method":"PUT",
"role":"manager"
}
}

You can also try out the same curl command as shown here with two other input
documents: policy_2_input_2.json and policy_2_input_3.json. You can find these files inside
the chapter08/sample01 directory.
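The three allow rules of listing 8.6 can be sketched in Python as follows. The set names mirror the Rego policy; again, this only illustrates the semantics, not how OPA evaluates Rego:

```python
# Python sketch of the three allow rules in listing 8.6 (illustration only).

ALLOWED_METHODS_FOR_MANAGER = {"POST", "PUT", "DELETE"}
ALLOWED_METHODS_FOR_DEPT_MANAGER = {"POST", "PUT", "DELETE"}

def allow(inp):
    # Rule 1: a manager can PUT/POST/DELETE any orders resource.
    if (inp.get("method") in ALLOWED_METHODS_FOR_MANAGER and
            inp.get("path") == "orders" and
            inp.get("role") == "manager"):
        return True
    # Rule 2: a dept_manager can PUT/POST/DELETE orders in their own department.
    if (inp.get("method") in ALLOWED_METHODS_FOR_DEPT_MANAGER and
            inp.get("path") == ["orders", inp.get("deptid")] and
            inp.get("role") == "dept_manager"):
        return True
    # Rule 3: anyone can GET the orders resource under their own employee ID.
    if (inp.get("method") == "GET" and
            inp.get("path") == ["orders", inp.get("empid")]):
        return True
    return False

print(allow({"path": "orders", "method": "PUT", "role": "manager"}))    # True
print(allow({"path": ["orders", 101], "method": "GET", "empid": 101}))  # True
```

The Rego expression allowed_methods_for_manager[input.method] is a set-membership test, which is why a plain Python `in` check captures it.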

8.2.5 Working with external data


During policy evaluation, the OPA engine sometimes needs access to external data. As we
discussed in section 8.1, while evaluating an authorization request against the applicable
policies, if any required information is missing, the OPA server talks to a PIP (or an
external data source). For example, let’s say we have an access-control policy that says you
can buy a beer only if your age is greater than 21, but the authorization request carries only
your name as the subject, buy as the action, and beer as the resource. The age is the
missing information here, and the OPA server will talk to an external data source to find the
corresponding subject’s age. In this section, we discuss multiple approaches OPA provides to
bring in external data for policy evaluation. 3
• Push data
• Loading data from the file system
• Overload
• Bundle API
• Pull data during evaluation
• JWT

PUSH DATA USING THE DATA API


The push data approach to bring in external data to the OPA server uses the data API
provided by the OPA server. Let’s look at a simple example. The policy in listing 8.8 returns
true if method, path, and the set of scopes in the input message match some data read
from an external data source that’s loaded under the package named
data.order_policy_data.

3 A detailed discussion of these approaches is documented at www.openpolicyagent.org/docs/latest/external-data/.


Listing 8.8 OPA policy using pushed external data


package authz.orders.policy3 #A

import data.order_policy_data as policies #B

default allow = false #C

allow { #D
policy = policies[_] #E
policy.method = input.method #F
policy.path = input.path
policy.scopes[_] = input.scopes[_]
}

#A The package name of the policy


#B Declares the set of statically registered data identified as policies
#C By default, all requests are disallowed. If this isn’t set and no allow rules match, OPA returns an undefined
decision.
#D Declares the conditions to allow access to the resource
#E Iterates over values in the policies array
#F For an element in the policies array, checks whether the value of the method parameter in the input matches the
method element of the policy

This policy consumes all the external data from the JSON file
chapter08/sample01/order_policy_data.json (listing 8.9), which we need to push to the OPA
server using the OPA data API. Assuming your OPA server is running on port 8181, you can
run the following curl command from the chapter08/sample01 directory to publish the data
to the OPA server. Keep in mind that here we’re pushing only external data, not the policy.
The policy that consumes the data is already on the OPA server, which you can find in the
chapter08/sample01/policies/policy_3.rego file:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -H "Content-Type: application/json" \
-X PUT --data-binary @order_policy_data.json \
https://localhost:8181/v1/data/order_policy_data


Listing 8.9 Order Processing resources defined as OPA data


[
{
"id": "r1", #A
"path": "orders", #B
"method": "POST", #C
"scopes": ["create_order"] #D
},
{
"id": "r2",
"path": "orders",
"method": "GET",
"scopes": ["retrieve_orders"]
},
{
"id": "r3",
"path": "orders/{order_id}",
"method": "PUT",
"scopes": ["update_order"]
}
]

#A An identifier for the resource path


#B The resource path
#C The HTTP method
#D To do an HTTP POST to the order resource, you must have this scope.

Now you can run the following curl command from the chapter08/sample01 directory with
the input message, which you’ll find in the JSON file
chapter08/sample01/policy_3_input_1.json (in listing 8.10) to check if the request is
authorized:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_3_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy3

{"result":{"allow":true}}

Listing 8.10 OPA input document


{
"input":{
"path":"orders",
"method":"GET",
"scopes":["retrieve_orders"]
}
}
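The matching performed by policy_3.rego can be sketched in Python: iterate over the pushed data entries (listing 8.9), and allow the request if the method and path match and the input carries at least one of the required scopes. As before, this is only an illustration of the Rego semantics:

```python
# Python sketch of the policy_3.rego matching logic (illustration only).

ORDER_POLICY_DATA = [  # the data pushed via the data API (listing 8.9)
    {"id": "r1", "path": "orders", "method": "POST", "scopes": ["create_order"]},
    {"id": "r2", "path": "orders", "method": "GET", "scopes": ["retrieve_orders"]},
    {"id": "r3", "path": "orders/{order_id}", "method": "PUT", "scopes": ["update_order"]},
]

def allow(inp, policies=ORDER_POLICY_DATA):
    for policy in policies:  # policy = policies[_] in Rego
        if (policy["method"] == inp["method"] and
                policy["path"] == inp["path"] and
                # policy.scopes[_] = input.scopes[_]: at least one scope in common
                set(policy["scopes"]) & set(inp["scopes"])):
            return True
    return False

print(allow({"path": "orders", "method": "GET", "scopes": ["retrieve_orders"]}))    # True
print(allow({"path": "orders", "method": "DELETE", "scopes": ["retrieve_orders"]})) # False
```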

With the push data approach, you control when you want to push the data to the OPA server.
For example, when the external data gets updated, you can push the updated data to the
OPA server. This approach, however, has its own limitations. When you use the data API to


push external data into the OPA server, the OPA server keeps the data only in memory,
so when you restart the server, you need to push the data again.

LOADING DATA FROM THE FILESYSTEM


In this section, we discuss how to load external data from the filesystem. When we start the
OPA server, we need to specify from which directory on the filesystem the OPA server should
load data files and policies. Let’s have a look at the chapter08/sample01/run_opa_mtls.sh
shell script, shown in the following listing. The code annotations explain how OPA loads
policies from the filesystem at startup.

Listing 8.11 Loading policies at startup


docker run \
-v "$(pwd)"/policies:/policies \ #A
-v "$(pwd)"/keys:/keys \
-p 8181:8181 \
openpolicyagent/opa:0.15.0 \
run /policies \ #B
--tls-cert-file /keys/opa/opa.cert \
--tls-private-key-file /keys/opa/opa.key \
--tls-ca-cert-file /keys/ca/ca.cert \
--authentication=tls \
--server

#A A Docker bind mount, which mounts the policies directory under the current path of the host machine to the
policies directory of the container filesystem
#B Runs the OPA server by loading policies and data from the policies directory

The OPA server you already have running has the policy and the data we’re going to discuss
in this section. Let’s first check the external data file (order_policy_data_from_file.json),
which is available in the chapter08/sample01/policies directory. This is the same data you saw
in listing 8.9, except for a slight change to the file’s structure. You can find the updated data
file in the following listing.


Listing 8.12 Updated OPA data file, loaded from the filesystem


{"order_policy_data_from_file" :[
{
"id": "p1",
"path": "orders",
"method": "POST",
"scopes": ["create_order"]
},
{
"id": "p2",
"path": "orders",
"method": "GET",
"scopes": ["retrieve_orders"]
},
{
"id": "p3",
"path": "orders/{order_id}",
"method": "PUT",
"scopes": ["update_order"]
}
]
}

You can see in the JSON payload that we have a root element called
order_policy_data_from_file. The OPA server derives the package name corresponding to
this data set as data.order_policy_data_from_file, which is used in the policy in the
following listing. This policy is exactly the same as the one in listing 8.8, except that the package name
has changed.

Listing 8.13 OPA policy using external data loaded from the filesystem


package authz.orders.policy4

import data.order_policy_data_from_file as policies

default allow = false

allow {
policy = policies[_]
policy.method = input.method
policy.path = input.path
policy.scopes[_] = input.scopes[_]
}

Now you can run the following curl command from the chapter08/sample01 directory with
the input message (chapter08/sample01/policy_4_input_1.json) from listing 8.10 to check
whether the request is authorized:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_4_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy4

{"result":{"allow":true}}


One issue with loading data from the filesystem is that when there’s any update, you need to
restart the OPA server. There is, however, a configuration option (see
chapter08/sample01/run_opa_mtls_watch.sh) to ask the OPA server to load policies
dynamically (without a restart), but that option isn’t recommended for production
deployments. In practice, if you deploy an OPA server in a Kubernetes environment, you can
keep all your policies and data in a Git repository and use an init container along with the
OPA server in the same pod to pull all the policies and data from Git when you boot up the
corresponding pod. When there’s an update to the policies or data, you then restart the
pods.

OVERLOAD
The overload approach to bringing in external data to the OPA server uses the input
document itself. When the PEP builds the authorization request, it can embed external data
into the request. Say, for example, the orders API knows that anyone wanting to do an HTTP
POST to it needs the create_order scope. Rather than pre-provisioning all the scope data
into the OPA server, the PEP can send it along with the authorization request.
Let’s have a look at a slightly modified version of the policy in listing 8.8. You can find the
updated policy in the following listing.

Listing 8.14 OPA policy using external data that comes with the request
package authz.orders.policy5

import input.external as policy

default allow = false

allow {
policy.method = input.method
policy.path = input.path
policy.scopes[_] = input.scopes[_]
}

You can see that we used the input.external package name to load the external data from
the input document. Let’s look at the input document in the following listing, which carries
the external data with it.

Listing 8.15 OPA request carrying external data


{
"input":{
"path":"orders",
"method":"GET",
"scopes":["retrieve_orders"],
"external" : {
"id": "r2",
"path": "orders",
"method": "GET",
"scopes": ["retrieve_orders"]
}
}
}


Now you can run the following curl command from the chapter08/sample01 directory with
the input message from listing 8.15 (chapter08/sample01/policy_5_input_1.json) to check
whether the request is authorized:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_5_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy5

{"result":{"allow":true}}

Reading external data from the input document doesn’t always work. For one thing, it
requires a trust relationship between the OPA client (the policy enforcement point) and
the OPA server. Next, we discuss an alternative to sending data in the input document that
requires less trust and is especially applicable to end-user external data.

BUNDLE API
To bring in external data to an OPA server under the bundle API approach, first you need to
have a bundle server. A bundle server is an endpoint that hosts a bundle. For example, the
bundle server can be an AWS S3 bucket or a GitHub repository. A bundle is a gzipped tarball,
which carries OPA policies and data files under a well-defined directory structure. 4
Once the bundle endpoint is available, you need to update the OPA configuration file with
the bundle endpoint, the credentials to access the bundle endpoint (if it’s secured), the
polling interval, and so on, and then pass the configuration file as a parameter when you spin
up the OPA server. 5 Once the OPA server is up, it continuously polls the bundle API to get the
latest bundle after each predefined time interval.
If your data changes frequently, you’ll find some drawbacks in using the bundle API. The
OPA server polls the bundle API after a predefined time interval, so if you frequently update
the policies or data, you could make authorization decisions based on stale data. To fix that,
you can reduce the polling time interval, but then again, that will increase the load on the
bundle API.

PULL DATA DURING EVALUATION


With this approach, you don’t need to load all the external data into the OPA server’s
memory; rather, you pull data as and when needed during the policy evaluation. To
implement pull data during evaluation, you need to use the OPA built-in function http.send.
To do that, you need to host an API (or a microservice) over HTTP (which is accessible to the
OPA server) to accept data requests from the OPA server and respond with the
corresponding data. 6
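Conceptually, pulling data during evaluation looks like the following Python sketch of the beer example from the start of this section. The user names and ages here are made up, and the dictionary stands in for the external service that a Rego policy would call with http.send:

```python
# Conceptual sketch of pull-during-evaluation (illustration only).
# USER_SERVICE is a stand-in for a real PIP microservice; in Rego the
# lookup would be an http.send call to that service.

USER_SERVICE = {"peter": {"age": 25}, "jane": {"age": 19}}  # hypothetical data

def fetch_age(subject):
    # Pulled on demand during evaluation, not pre-loaded into OPA's memory.
    return USER_SERVICE[subject]["age"]

def allow_buy_beer(subject):
    # Policy: you can buy a beer only if your age is greater than 21.
    return fetch_age(subject) > 21

print(allow_buy_beer("peter"))  # True
print(allow_buy_beer("jane"))   # False
```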

JSON WEB TOKEN (JWT)


A JSON Web Token (JWT) provides a cryptographically verifiable way of transferring data
over the wire among multiple parties. (If you’re new to JWT, check out chapter
4.) OPA provides a way to pass a JWT in the input document. The OPA server can verify the

4Details on how to create a bundle are at www.openpolicyagent.org/docs/latest/management/#bundles.


5Details on these configuration options are documented at www.openpolicyagent.org/docs/latest/configuration/.
6Details on how to use http.send and some examples are documented at www.openpolicyagent.org/docs/latest/policy-reference/#http.


JWT and then read data from it. This is the approach we’ll be using to load data into OPA
with OpenID Connect, which we discuss in section 8.3.

8.3 Controlling access based on the claims in an ID token


In this section you’ll learn how to define a policy in OPA to validate an ID token (which is a
JWT) and evaluate policies against the claims it carries. To do that, we assume you already
have an ID token. In chapter 3, we discussed how to get an ID token with a single-page
application, and in chapter 6, how to get one with a server-side web
application.

Once you have the ID token, you can build the input document as in listing 8.16. There we
use the value of the ID token as the value of the token parameter. The listing shows only a
part of the ID token, but you can find the complete input document in the
chapter08/sample01/policy_6_input_1.json file.

Listing 8.16 Input document, which carries data in an ID token


{
"input":{
"path": ["orders",101],
"method":"GET",
"empid" : 101,
"token" : "eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9... "
}
}

The following listing shows the policy corresponding to the input document in listing 8.16.
The code annotations here explain all key instructions.

Listing 8.17 OPA policy using external data that comes with the request in an ID token
package authz.orders.policy6

default allow = false

certificate = `-----BEGIN CERTIFICATE----- #A


MIICxzCCAa+gAwIBAgIEHP9VkjAN…
-----END CERTIFICATE-----`

allow {
input.method = "GET"
input.empid = emp_id
input.path = ["orders",emp_id]
token.payload.authorities[_] = "ROLE_USER"
}

token = {"payload": payload} {


io.jwt.verify_rs256(input.token, certificate) #B
[header, payload, signature] := io.jwt.decode(input.token) #C
payload.exp >= now_in_seconds #D
}

now_in_seconds = time.now_ns() / 1000000000 #E


#A The PEM-encoded certificate of the OpenID provider to validate the ID token, which corresponds to the private key
that signs the ID token
#B Verifies the signature of the ID token following the RSA SHA256 algorithm
#C Decodes the ID token
#D Checks whether the ID token is expired
#E Finds the current time in seconds; now_ns() returns time in nanoseconds.

Now you can run the following curl command from the chapter08/sample01 directory with
the input message from listing 8.16 (chapter08/sample01/policy_6_input_1.json) to check
whether the request is authorized:

\> curl -k -v --key keys/client/client.key \
--cert keys/client/client.cert -X POST \
--data-binary @policy_6_input_1.json \
https://localhost:8181/v1/data/authz/orders/policy6

{"result":{"allow":true}}

In listing 8.17, to do the ID token validation, we first needed to validate the signature and
then check the expiration. OPA has a built-in function,
io.jwt.decode_verify(string, constraints), that does all these validations in one go.7 For example,
you can use this function to validate the signature, expiration (exp), not-before (nbf),
audience, issuer, and so on.
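To see what the token rule in listing 8.17 is doing, here is a Python sketch that decodes a JWT payload and checks the expiration and the ROLE_USER authority. Signature verification (io.jwt.verify_rs256 in the Rego policy) is omitted here because it needs the OpenID provider's public key and a crypto library; never skip it in a real deployment:

```python
import base64
import json
import time

# Sketch of the token rule in listing 8.17 (signature check omitted).

def b64url_decode(part):
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def decode_payload(jwt):
    header_b64, payload_b64, _signature = jwt.split(".")
    return json.loads(b64url_decode(payload_b64))

def token_valid(jwt, now=None):
    payload = decode_payload(jwt)
    now = time.time() if now is None else now
    # payload.exp >= now_in_seconds, plus the ROLE_USER check from the policy
    return payload.get("exp", 0) >= now and \
        "ROLE_USER" in payload.get("authorities", [])

# Build a toy, unsigned token just to exercise the decoding logic.
claims = {"exp": int(time.time()) + 3600, "authorities": ["ROLE_USER"]}
toy = ".".join(
    base64.urlsafe_b64encode(json.dumps(p).encode()).rstrip(b"=").decode()
    for p in ({"alg": "none"}, claims)
) + "."

print(token_valid(toy))  # True
```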

8.4 OPA alternatives


Since its introduction in 2016, OPA has become the de facto standard for fine-grained
access control, mostly in the Kubernetes and microservices domains. A couple of
alternatives to OPA exist, but at the time of this writing, none is as popular as OPA.
One alternative, eXtensible Access Control Markup Language (XACML), is an open
standard developed by the Organization for the Advancement of Structured Information
Standards (OASIS). The XACML standard introduces a policy language based on XML and a
schema based on XML for authorization requests and responses. OASIS released the XACML
1.0 specification in 2003, and at the time of this writing, the latest is XACML 3.0. XACML was
popular many years back, but over time, as the popularity of XML-based standards declined,
XACML adoption lessened rapidly as well. Also, XACML as a policy language is quite complex,
though very powerful. If you’re looking for an open source implementation of XACML 3.0,
check the Balana project, which is available at https://github.com/wso2/balana.
Speedle, another open source alternative to OPA, is also a general-purpose authorization
engine. Speedle was developed by Oracle and is relatively new. It’s too early to comment on
how Speedle competes with OPA, and at the time of this writing, only Oracle Cloud uses
Speedle internally. You can find more details on Speedle at https://speedle.io/.

7 You can find all the OPA functions for verifying JWTs at http://mng.bz/aRv9.


8.5 Summary
• Claim-based access control, also known as attribute-based access control, is about
controlling access to an application based on different attributes of the logged-in user.
• In a typical access control system, we find five key components (figure 8.1): the
policy administration point (PAP), policy enforcement point (PEP), policy decision point
(PDP), policy information point (PIP), and policy store.
• OPA is an open source, lightweight, general-purpose policy engine.
• OPA was designed to run on the same server as the PEP that needs authorization
decisions. As such, the first layer of defense for PEP-to-OPA communication is the fact
that the communication is limited to localhost.
• To define access-control policies, OPA introduces a new declarative language called
Rego.
• OPA provides a way to pass an ID token (a JWT) in the input document. The OPA
server can verify the ID token and then read data from it.


Securing access to a native mobile


application with OpenID Connect

This chapter covers


• Building a native mobile application using React Native
• Adding an OpenID Connect login to a native mobile application
• Pros and cons of using a web view or the system browser for login
• Managing secrets in a mobile environment
In chapter 3 we discussed how to log in to a single-page application using OpenID Connect.
In chapter 6 we discussed how to log in to a server-side web application. In this chapter you
learn how to integrate an OpenID Connect login with a native mobile application. Along the
way, I cover the associated security best practices.
Over time, developers have increasingly picked cross-platform frameworks over
platform-specific ones for building native mobile applications. If you build a native mobile
application for Android, for example, you can use Java or Kotlin, and for Apple iOS you can
use Swift or Objective-C. With that approach, if you want your application to run on multiple
mobile platforms, you need to develop and maintain multiple applications. With
cross-platform frameworks like React Native and Flutter, you develop a single application
that runs on multiple mobile platforms.
In this chapter, to demonstrate how to use OpenID Connect login for native mobile
applications, we use a mobile application developed using React Native. React Native is the
most popular framework for developing cross-platform native mobile applications. If you are
new to React Native, we recommend you read React Native in Action: Developing iOS and
Android apps with JavaScript (Manning, 2019) by Nader Dabit.


9.1 Building a native mobile application using React Native


In this section you will learn how to build a native mobile application using React Native, and
then in section 9.2 you’ll learn how to secure your native mobile applications with OpenID
Connect. If you are already familiar with React Native, you can skip this section and start
with section 9.2.

9.1.1 Setting up Expo tools to build a React Native application


We are using Expo tools (https://github.com/expo/expo) to build our application. In this
section you’ll learn how to set up Expo and its related components. Expo helps you test
React Native applications on Android and iOS without building anything locally. It
comes with a universal runtime and libraries that help you build native mobile applications
using React Native.
To use Expo Go, first you need to have Node.js installed in your computer. Node.js is a
JavaScript runtime that runs on V8 JavaScript engine. You can download and install Node.js,
following the instructions available at https://nodejs.org/. If the installation is successful, the
following command should return the version of the Node.js you installed.

\> node -v
v12.18.1

To build applications with Expo, you need the Expo command line interface (CLI) and
Expo Go, a mobile client app. To install the Expo CLI, please follow the instructions
listed at https://docs.expo.dev/get-started/installation/#expo-cli. The Expo Go client app
is available on both the Apple App Store and the Google Play Store, and you can install it on
your mobile device. The Expo Go app lets you access, from your mobile device, the mobile application you develop and
then expose via the Expo CLI. In section 9.1.3 you’ll learn more
about how to use the Expo CLI and the Expo Go app together.

9.1.2 Setting up a mobile emulator


A mobile emulator helps you emulate a mobile device, for example an iOS or an Android
device, on your computer. In this section you’ll learn how to set up a mobile emulator. We’ll
use a mobile emulator in section 9.1.3 to test the mobile application you develop.
If you are on a Mac, the iOS emulator is installed by default on your computer along with
the Xcode (https://developer.apple.com/xcode/) installation. You can list the available
emulators on your computer by using the following command on a Mac:

\> xcrun simctl list

The above command lists all the supported device types, runtimes, devices, and device pairs.
To start the emulator with the device you prefer, run the following command.

\> open -a Simulator --args -CurrentDeviceUDID C309778D-500F-4DB0-A09B-6A56745728D0

The C309778D-500F-4DB0-A09B-6A56745728D0 in the command is the UDID corresponding to
the iPhone 12 Pro Max, listed under the devices section in the output of the previous command.
If everything went well, you’ll see the iOS emulator as in figure 9.1.
If everything went well, you’ll see the iOS emulator as in figure 9.1.


Figure 9.1 The iOS phone emulator, which is installed as part of the Xcode installation on a Mac computer.

If you are not using a Mac, you can find an iOS emulator that matches your platform at
https://www.lifewire.com/best-iphone-emulators-4580594. If you are looking for an
Android emulator, you can pick one from https://www.androidauthority.com/best-android-emulators-for-pc-655308/. Before you proceed to section 9.1.3, please make sure
the mobile emulator you picked is working fine! In the examples in this chapter, we are
using an iOS emulator running on a Mac.

9.1.3 Creating your first React Native application


In this section you’ll learn how to create a React Native application with the create-expo-app node module (https://www.npmjs.com/package/create-expo-app). Once you have
installed Node.js on your computer (section 9.1.1), run the following command to create a
React Native application. The Node Package Execute (npx) tool, used in the following command, is
installed on your system as part of the Node.js installation.

\> npx create-expo-app chapter09


npx: installed 1 in 1.399s
✔ Downloaded and extracted project files.
✔ Installed JavaScript dependencies.


Once the command completes successfully, you’ll find the application it created under the
chapter09 directory. If you run the same command again, you’ll get an error; use a
new directory name instead.
From the chapter09 directory, you can run either npm run android or npm run ios to
start your application with the respective phone emulator. Make sure you have a phone
emulator installed on your computer before running the command (section 9.1.2). Here we
are going to start the sample application with the iOS emulator, using the following
commands:

\> cd chapter09
\> npm run ios

Now you should see the iOS emulator, which runs the sample application as shown in figure
9.2.

Figure 9.2 The iOS phone emulator running a React Native sample application developed with Expo Go. The
first screen requests permission to open the application. Once permission is given, the app loads in the
emulator.

Once you click the Reload button (figure 9.2), you should see your application (figure
9.3). The text that appears in the application comes from the chapter09/App.js file. You
can open the project with your favorite IDE (for example, Visual Studio Code) and play
around with it.


Figure 9.3 The iOS phone emulator is running a React Native sample application developed with Expo Go.

You can also access the mobile app you just developed via your iPhone or Android phone,
using the Expo Go app we installed in section 9.1.1. In the terminal where you started the
application with the npm run ios command, you’ll also see a QR code (figure 9.4). Scan
it using your phone; it’ll open the Expo Go app on your mobile device and then load
your application.


Figure 9.4 To load your mobile application to the Expo Go app, scan this QR code using your mobile device.

Finally, we can run the following command to prebuild the native projects for your application.
Prebuilding generates the native iOS and Android source code (under the ios and android
directories of your project) so that native modules and configuration plugins can be compiled
into the app. This command should be run in your project
directory (chapter09), where the package.json file of your project is located. The
prebuilding process may take a few minutes to complete, and you don't have to run it again
unless you add or update native modules or plugins in your project.

\> npx expo prebuild

During the above process, you’ll generate the bundle identifiers for your React Native
application. A bundle identifier is a string that uniquely identifies your application. It typically
takes the form of a reverse-DNS-style string, and you need to choose a bundle identifier that
is unique to your application and will not be used by any other.
Once you get the following prompt, you can type a name for your Android package or for
your iOS bundle identifier. For example, we are using com.manning.chapter09 as our bundle
identifier. We use these package names and bundle identifiers to construct the callback URL
for our application when registering it with Auth0 (section 9.2).

What would you like your Android package name to be? com.manning.chapter09

What would you like your iOS bundle identifier to be? com.manning.chapter09

Now, if you look at the chapter09/app.json file, you should see it updated with the bundle
identifier you created.


{
"expo": {
...
"ios": {
"supportsTablet": true,
"bundleIdentifier": "com.manning.chapter09"
},
}
}

After prebuilding the application, use the following command to start the
application with the iOS emulator. For the Android emulator, simply replace ios with
android. The npm run ios command, which we used before, won’t work on a prebuilt
application.

\> npx expo start --ios

In section 9.2, you’ll learn how to extend this sample application to support logging in with
OpenID Connect.

9.2 Securing a React Native application with OpenID Connect


In this section you’ll learn how to secure the sample React Native application you built in
section 9.1 with OpenID Connect. Here are a few open-source React Native SDKs to help you
integrate OpenID Connect with your application for login:
• React Native toolkit for Auth0 API: https://github.com/auth0/react-native-auth0
• OIDC React Native SDK for Asgardeo: https://github.com/asgardeo/asgardeo-react-native-oidc-sdk
• Okta React Native: https://github.com/okta/okta-react-native

In this section we are using the React Native SDK provided by Auth0. That is just our
preference; we do not recommend one over the others. All these SDKs provide good
documentation to get started. Since we used Expo Go in section 9.1, the Auth0 SDK is a
better option for us, as it added Expo support in version 2.16.0, released in
December 2022.
In the same way you built the single-page application in React (section 3.11), here too
we first need to create an application at the OpenID provider you picked. Since we are using
the Auth0 React SDK in this example, you can register with Auth0 and create an application
there. At https://github.com/openidconnect-in-action/samples/blob/master/IDPs.md, we explain how to set up a React Native application
in Auth0. Since these details could change over time, we wanted to keep them outside the
book, so we can update the instructions as and when they change.
When you register your application with an OpenID provider, you need a URL where the
OpenID provider can redirect the user after authentication. This is also called the callback
URL. You may recall that we used it in chapter 3 when we were building the React single-page application, and also in chapter 6, when we were building the server-side web
application. In both cases, the callback URL was an HTTP endpoint. However, when you


are registering the callback URL with Auth0 for your React Native application, it must be in
the following format:
• For iOS: {IOS_BUNDLE_IDENTIFIER}://YOUR_DOMAIN/ios/{IOS_BUNDLE_IDENTIFIER}/callback
• For Android: {ANDROID_PACKAGE}://YOUR_DOMAIN/android/{ANDROID_PACKAGE}/callback

You may recall that in section 9.1.3 we generated an iOS bundle identifier (or Android
package name) while prebuilding the application; there we used
com.manning.chapter09 as our iOS bundle identifier. You can find your domain identifier
after logging in to Auth0, under the Settings section. It is the value of the Tenant Name; for
example, dev-iyql2ytv. Now we have all the ingredients to construct the callback URL.

• For iOS: com.manning.chapter09://dev-iyql2ytv/ios/com.manning.chapter09/callback
• For Android: com.manning.chapter09://dev-iyql2ytv/android/com.manning.chapter09/callback
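To make the format concrete, the following sketch assembles such a callback URL from its parts. The bundle identifier and tenant name are the example values used in this chapter, and buildCallbackUrl is a hypothetical helper of ours, not part of any SDK:

```javascript
// Assemble the Auth0-style callback URL for a native mobile app.
// The bundle identifier and domain below are this chapter's example values;
// buildCallbackUrl is a hypothetical helper, not an SDK function.
function buildCallbackUrl(bundleId, domain, platform) {
  // platform is either 'ios' or 'android'
  return `${bundleId}://${domain}/${platform}/${bundleId}/callback`;
}

console.log(buildCallbackUrl('com.manning.chapter09', 'dev-iyql2ytv', 'ios'));
// com.manning.chapter09://dev-iyql2ytv/ios/com.manning.chapter09/callback
```

Note that the scheme of the resulting URL is the bundle identifier itself, which is how the mobile operating system later knows which application should receive the redirect.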

Now we are ready to start integrating OpenID Connect login into the sample application
we developed in section 9.1. First, we need to install the React Native dependencies on our
computer. The following command uses npm to install the Auth0 React Native SDK:

\> npm install react-native-auth0 --save

Once we have installed the SDK successfully, we need to configure an Expo plugin in the
chapter09/app.json file. The app.json file was generated in section 9.1; you’ll find
something similar to the following:


Listing 9.1 The content of the generated app.json file


{
"expo": {
"name": "chapter09",
"slug": "chapter09",
"version": "1.0.0",
"orientation": "portrait",
"icon": "./assets/icon.png",
"userInterfaceStyle": "light",
"splash": {
"image": "./assets/splash.png",
"resizeMode": "contain",
"backgroundColor": "#ffffff"
},
"updates": {
"fallbackToCacheTimeout": 0
},
"assetBundlePatterns": [
"**/*"
],
"ios": {
"supportsTablet": true
},
"android": {
"adaptiveIcon": {
"foregroundImage": "./assets/adaptive-icon.png",
"backgroundColor": "#FFFFFF"
}
},
"web": {
"favicon": "./assets/favicon.png"
}
}
}

Listing 9.1 does not have a section called plugins. You can add the following plugins section
directly under the expo element, as shown in the following listing:

Listing 9.2 The plugins section added to the app.json file


{
"expo": {
...
"plugins": [
[
"react-native-auth0",
{
"domain": "YOUR_DOMAIN" #A
}
]
],
}
}

#A The Auth0 domain name. You can find your domain identifier after logging in to Auth0 under the Settings section. It
is the value of the Tenant Name; for example, dev-iyql2ytv.


Now we have updated our app configuration with the Auth0 React Native plugin; next we
need to prebuild the application again to generate the native source code. Run the following
command from the chapter09 directory to generate the native source code.

\> npx expo prebuild

Now, we are ready to enable OpenID Connect in our React Native application. Open
chapter09/App.js and replace its content with the following. You can also find the
same content in the https://github.com/openidconnect-in-action/samples/blob/master/chapter09/App.js file.

Listing 9.3 The content of the App.js file


import React from 'react';
import {Button, Text, View, StyleSheet} from 'react-native';
import {useAuth0, Auth0Provider} from 'react-native-auth0';

const Home = () => {
  const {authorize, user} = useAuth0();

  const onLogin = async () => {
    try {
      await authorize({scope: 'openid profile email'}); #A
    } catch (e) {
      console.log(e);
    }
  };

  const onLogout = async () => {
    alert("Logout not implemented yet!"); #B
  };

  const loggedIn = user !== undefined && user !== null;

  return (
    <View style={styles.container}>
      {loggedIn && <Text>You are logged in as {user.name}</Text>}
      {!loggedIn && <Text>You are not logged in</Text>}

      <Button
        onPress={loggedIn ? onLogout : onLogin}
        title={loggedIn ? 'Log Out' : 'Log In'}
      />
    </View>
  );
};

const App = () => {
  return (
    <Auth0Provider domain={YOUR_DOMAIN} clientId={CLIENT_ID}> #C
      <Home />
    </Auth0Provider>
  );
};

const styles = StyleSheet.create({
  container: {


    flex: 1,
    justifyContent: 'center',
    alignItems: 'center',
    backgroundColor: '#F5FCFF',
  }
});

export default App;

#A This function redirects the user to the OpenID provider; in this case, Auth0.
#B Logout is not implemented in this example.
#C Replace YOUR_DOMAIN with your Auth0 domain name. You can find your domain identifier after logging in to Auth0
under the Settings section. It is the value of the Tenant Name; for example, dev-iyql2ytv. Also, replace CLIENT_ID
with the client ID corresponding to your application registered with Auth0.

Now, we are ready to start the application. Run the following command from the
chapter09 directory:
\> npx expo start --ios

The above command launches your application on the iOS emulator, as shown in figure
9.5. When you click the Log In button, you’ll be redirected to the login page of the
OpenID provider, in this case Auth0.

Figure 9.5 The iOS emulator shows the React Native application, now secured with OpenID Connect. When you
tap the Log In button, you’ll be redirected to the OpenID provider’s login page.


9.3 How the authorization code flow works with a native mobile application


In this section you’ll learn how the OpenID Connect authorization code flow works with
a native mobile application. In section 3.9 you learnt how the authorization
code flow works with a single-page application; in a native mobile environment,
the flow is very much the same.
Both the single-page application and the native mobile application are considered
public clients in OAuth 2.0 terminology, which you learnt about in section 2.4. OAuth 2.0
defines two types of clients, based on their ability to manage the secrets a client application
uses to authenticate to an authorization server. Confidential clients are applications
that can manage their secrets securely, while all other applications, which are incapable of
managing their own secrets securely, fall under the public client category.
Figure 9.6 shows in detail the interactions between the native mobile application you
developed in section 9.2 and the OpenID provider. This gives you some insight into what
happens underneath when a user taps login in your application. However, as an
application developer you do not need to worry about all of this; in most cases, the
React Native SDK you picked handles it.

Figure 9.6 The native mobile app spins up the system browser and redirects the user to the OpenID provider,
following the OpenID Connect protocol.

As per figure 9.6, when a user taps the login button on a mobile application, the mobile
application talks to the corresponding mobile operating system, via the native API, and


spins up the system browser with the required set of parameters. And then the system
browser redirects the user to the OpenID provider, following the OpenID Connect protocol.
To launch the system browser from React Native, you could use something similar
to the following code snippet; OpenID Connect React Native SDKs use a similar mechanism. You
can call this function with the desired URL to launch the system browser and redirect the
user to the specified URL; in our case, to the authorize endpoint of the OpenID
provider with the corresponding query parameters.

Listing 9.4 A sample React Native code snippet that opens up the system browser
import { Linking } from 'react-native'; #A

const openURL = async (url) => {
  const supported = await Linking.canOpenURL(url);
  if (supported) {
    await Linking.openURL(url);
  } else {
    console.log("Don't know how to open URI: " + url);
  }
};
#A The Linking module abstracts the differences between the underlying platform APIs for launching a browser,
and provides a unified interface for launching URLs from React Native applications.

Launching the system browser to initiate the OpenID Connect login flow is one approach
(figure 9.6); the other approach is to use an embedded web view within the mobile
application itself and redirect the user to the OpenID provider inside the web view.

What is a web view?


A web view is a component that allows you to display web content within a native mobile app. It essentially acts as an
embedded web browser within the application, allowing you to load and display web pages, HTML, CSS, and
JavaScript.
In a native mobile application, you can use a web view to display web content that is not easily replicable in a
native environment, such as an external website, a complex dynamic form, or a multimedia presentation. You can
also use web views to implement a hybrid application, where the majority of the application's functionality is provided
by a website running within the web view. Web views are available in both Android and iOS platforms and can be
easily integrated into a native mobile app using various mobile development frameworks, such as React Native.

When integrating OpenID Connect with a native mobile application, using a web view and using
the system browser both have pros and cons. Here are some of the main pros of using a web
view:
• Control: By using a web view, you have full control over the user experience and the
appearance of the login page.
• Isolation: By using a web view, you can isolate the login flow from the rest of the app,
which can reduce the risk of security vulnerabilities.
• Speed: Web views can be faster and more responsive than system browsers, which
can improve the overall user experience.

Here are some of the cons in using a web view:


• User experience: A web view is a separate, contained environment within the


application, which can make the experience feel less integrated and seamless.
Additionally, the interface and behavior of a web view can differ from the native
environment, leading to a less polished user experience.
• Performance: Depending on the complexity of the web page being displayed, a web
view can be slower and less responsive than native components, which can affect the
overall performance of the app.
• Security: A web view is essentially a browser within the app, which can expose the
app to security risks if the website's code is not secure. Additionally, anything the
user types into a web view, including login credentials, is visible to the host
application, which breaks the trust model of OpenID Connect, where the user should
share credentials only with the OpenID provider.
Using the system browser to initiate a login flow has its own pros and cons as well. The
following lists the pros of using the system browser:
• User experience: By using the system browser, you can provide a more integrated
and seamless user experience, since the login flow is handled by the same browser
that the user uses regularly on their device.
• Security: By using the system browser, you can take advantage of its built-in security
features, such as SSL certificate validation and phishing protection, to enhance the
security of the login flow.
Here are some of the cons of using the system browser:
• Lack of control: By using the system browser, you have less control over the user
experience and the appearance of the login page. Additionally, the behavior of the
system browser can differ from one device to another, leading to a less consistent
user experience.
• Redirections: When using the system browser, you may have to handle redirections
and callbacks to the app manually, which can add complexity to the integration.

In conclusion, whether to use a web view or the system browser for integrating OpenID
Connect with a native mobile app depends on the specific requirements and constraints of
your application. Both approaches have their advantages and disadvantages, and it's
important to weigh them carefully before making a decision. However, from a security
point of view, it is recommended to use the system browser over a web view.
Let’s revisit figure 9.6. For your convenience, we have duplicated it as figure 9.7.
Something important to note here is the code_challenge parameter, which is passed along
with the login request to the authorize endpoint of the OpenID provider. The
code_challenge, along with the code_verifier, which is part of the request to the token
endpoint of the OpenID provider, was introduced by the Proof Key for Code
Exchange (PKCE) by OAuth Public Clients RFC
(https://datatracker.ietf.org/doc/html/rfc7636) from the IETF OAuth working group, and is now
part of the OAuth 2.1 draft specification. PKCE is a best practice to prevent the code
interception attack, which we discuss in detail in section 10.5. PKCE was initially
introduced targeting native mobile applications; however, it’s now recommended for
all types of applications that use OpenID Connect for login.


Figure 9.7 The native mobile app spins up the system browser and redirects the user to the OpenID provider,
following the OpenID Connect protocol (a copy of figure 9.6).

As per figure 9.7, after the OpenID provider authenticates the user successfully, it
redirects the user back to the registered callback URL. You may recall from section 9.2 that the
callback URLs have the following formats (unlike the HTTP callback URL we had for single-page apps):
• For iOS: {IOS_BUNDLE_IDENTIFIER}://YOUR_DOMAIN/ios/{IOS_BUNDLE_IDENTIFIER}/callback
• For Android: {ANDROID_PACKAGE}://YOUR_DOMAIN/android/{ANDROID_PACKAGE}/callback

When the OpenID provider sends an HTTP 302 response to the system browser, it adds the
authorization code it generated as a query parameter to the callback URL, and then adds the
callback URL to the Location HTTP header of the response. Then it’s up to the browser to
interpret the value of the Location header and act upon it. If the Location header
carries an HTTP endpoint, the browser does an HTTP GET to it. If the Location header
carries a value in the above format, the browser passes control to the mobile
operating system to locate the native mobile application with the corresponding
IOS_BUNDLE_IDENTIFIER or ANDROID_PACKAGE. Once the mobile operating system finds
the corresponding mobile application, it passes control over to it, and the mobile
application has access to the authorization code.
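As an illustration of what the mobile application then does (a sketch with made-up parameter values; real SDKs handle this internally and also validate the state parameter), the authorization code can be read from the deep-link callback URL like this:

```javascript
// Sketch: read the authorization code from the deep-link callback URL that
// the mobile OS hands back to the application. The code and state values
// below are made up for illustration.
function extractAuthorizationCode(callbackUrl) {
  const url = new URL(callbackUrl);
  const code = url.searchParams.get('code');
  if (!code) {
    throw new Error('No authorization code found in callback URL');
  }
  return code;
}

const redirect =
  'com.manning.chapter09://dev-iyql2ytv/ios/com.manning.chapter09/callback' +
  '?code=SplxlOBeZQQYbYS6WxSbIA&state=af0ifjsldkj';
console.log(extractAuthorizationCode(redirect)); // SplxlOBeZQQYbYS6WxSbIA
```

The application then exchanges this code, together with the PKCE code_verifier, for tokens at the token endpoint of the OpenID provider.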


9.4 Storing secrets in a native mobile environment


In this section you’ll learn how to store secrets securely in a native mobile environment. In
section 9.3 you learnt that a mobile application is a public client, so it cannot securely store
any secrets. Anything you store on a mobile device can be accessed by a user with root
access to the device.

9.4.1 Use cases for storing secrets in a native mobile environment


In this section you’ll learn the use cases for storing secrets in a mobile environment. In the
example in section 9.2, we didn’t have a secret; we only used the client ID to initiate
the authorization code flow with OpenID Connect. It’s the same approach we followed in
chapter 3 with a single-page application, because both a native mobile application and a
single-page application are treated as public clients.
So far we haven’t had a requirement to store secrets in a native mobile environment.
Here are two common use cases where we have to store secrets in a native mobile
environment:
• At the end of the OpenID Connect login flow, you will also get an access token and a
refresh token (see chapter 2). The access token is used by the mobile application to
access APIs and the refresh token is used to refresh the access token when it expires.
The access tokens can be stored in memory, while the refresh tokens should be
stored securely for future access.
• Some mobile applications need to access backend APIs as a system (not on behalf of
a user), and in such cases the mobile application must store the secret it uses to
authenticate to the backend APIs. Take Intercom, for example. Intercom is a
software company that specializes in business messaging, providing businesses with a
way to chat with their customers. When you integrate the Intercom chat widget with
your mobile application using their SDK, you need to embed an API key and a secret
(https://developers.intercom.com/installing-intercom/docs/setup).
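The split described in the first use case can be sketched as follows. The secureStore object here is a hypothetical stand-in for a Keychain/KeyStore wrapper (such as the one shown in section 9.4.2), not a real library API:

```javascript
// Sketch: keep the access token only in memory, and delegate the refresh
// token to secure storage. `secureStore` is a hypothetical stand-in for a
// Keychain/KeyStore wrapper.
class TokenStore {
  constructor(secureStore) {
    this.secureStore = secureStore;
    this.accessToken = null; // held in memory only, never persisted
  }

  async saveTokens({ accessToken, refreshToken }) {
    this.accessToken = accessToken;
    await this.secureStore.set('refresh_token', refreshToken);
  }

  async clear() {
    this.accessToken = null; // drop the in-memory token
    await this.secureStore.remove('refresh_token');
  }
}

// Trivial in-memory stand-in for secure storage, for illustration only:
const fakeSecureStore = {
  data: {},
  async set(key, value) { this.data[key] = value; },
  async remove(key) { delete this.data[key]; },
};

const store = new TokenStore(fakeSecureStore);
store.saveTokens({ accessToken: 'at-123', refreshToken: 'rt-456' })
  .then(() => console.log(store.accessToken)); // at-123
```

Because the access token never touches persistent storage, a stolen device backup exposes at most the refresh token, which stays behind the Keychain/KeyStore access controls.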

9.4.2 Best practices for securely handling secrets in a mobile environment


In this section you’ll learn best practices for handling secrets and sensitive data securely in a
mobile environment.

STORING SECRETS IN IOS KEYCHAIN OR ANDROID KEYSTORE


The Keychain on iOS and the KeyStore on Android provide a secure environment for storing
secrets, as well as access controls to ensure that only the authorized components of the
mobile application can access the secrets.
To access the Keychain on iOS or the KeyStore on Android from React Native code, you
can use a third-party library, such as react-native-keychain. This library provides a
simple, cross-platform API for accessing the Keychain/KeyStore, making it easy to store and
retrieve secrets in a secure manner. The following code snippet shows an example of how
you can use react-native-keychain to store an OAuth refresh token in the Keychain:


Listing 9.5 Storing an OAuth refresh token in the iOS keychain


import * as Keychain from 'react-native-keychain';

async function storeRefreshToken(token) {
  try {
    await Keychain.setInternetCredentials('oauth', 'refresh_token', token);
    console.log('Refresh token stored successfully.');
  } catch (error) {
    console.log('Error storing refresh token: ', error);
  }
}

The following code snippet shows how to retrieve a secret stored in the keychain.

Listing 9.6 Retrieving a secret stored in the iOS keychain


import * as Keychain from 'react-native-keychain';

async function retrieveRefreshToken() {
  try {
    // Look up the credentials stored under the same 'oauth' server key
    // used when storing the token in listing 9.5.
    const credentials = await Keychain.getInternetCredentials('oauth');
    if (credentials) {
      console.log('Refresh token: ', credentials.password);
      return credentials.password;
    }
    console.log('No refresh token stored.');
  } catch (error) {
    console.log('Error retrieving refresh token: ', error);
  }
}

ENCRYPT THE SENSITIVE CONTENT BEFORE STORING IT TO PREVENT UNAUTHORIZED ACCESS


Sensitive content can be stored in encrypted files on the device, such as SQLite
databases or encrypted JSON files. The encryption key should be stored securely, for example in
the Keychain/KeyStore.

MINIMIZE SECRET EXPOSURE


Minimize the amount of time secrets are stored in memory and never hardcode secrets in the
app's code. Also, make sure to erase secrets from memory as soon as they are no longer
needed.

REGULARLY UPDATE AND PATCH


Regularly update the secure storage mechanism and the application to protect against newly
discovered security vulnerabilities. Keep the app and its dependencies up-to-date to ensure
that security patches and fixes are applied in a timely manner.

USE SECURE COMMUNICATION


Use secure communication protocols, such as HTTPS or TLS, when transmitting secrets over
the network to prevent eavesdropping or tampering.


9.5 Summary
• Expo helps you test React Native applications on Android and iOS platforms, without
building anything locally. It comes with a universal runtime and libraries to help us
build native mobile applications using React Native.
• A mobile emulator helps you to emulate a mobile device, for example an iOS or an
Android device, on your computer.
• A bundle identifier is a string that uniquely identifies your application. It typically
takes the form of a reverse-DNS style string; and you need to choose a bundle
identifier that is unique to your application and will not be used by any others.
• OAuth 2.0 defines two types of clients, based on their ability to manage secrets that a
client application uses to authenticate to an authorization server. The confidential
clients are the applications that can manage their secrets securely, while all the other
applications that are incapable of managing their own secrets securely, fall under the
public client type category.
• Both the single-page application and the native mobile application are considered
public clients under the OAuth 2.0 terminology.
• The Keychain on iOS and the KeyStore on Android provide a secure environment for
storing secrets, as well as access controls to ensure that only the authorized
components of the mobile application can access the secrets.


Mitigating common threats and vulnerabilities

This chapter covers

• Exploiting vulnerabilities in the TLS communication to intercept tokens
• Reading tokens from the browser via a plugin
• Intercepting the authorization code in a mobile environment
• How Proof Key for Code Exchange (PKCE) works
• Executing a cross-site request forgery (CSRF) attack on a single-page application
• Identity provider mix-up attacks on an application using more than one identity provider
This is the last chapter of the book, and by now you have a good understanding of how
OpenID Connect works and how to integrate OpenID Connect login with a single-page
application and a server-side web application. Over time, we have witnessed multiple
large-scale attacks against organizations in which the attackers exploited vulnerabilities
in their OpenID Connect / OAuth 2.0 implementations. In most cases those attacks were
possible due to bad implementations of OpenID Connect or OAuth 2.0, not due to any
deficiencies in the specifications. Here are a few examples involving Facebook's OAuth 2.0
implementation:
• In September 2018 an attacker exploited multiple issues in Facebook's OAuth 2.0
implementation that exposed the personal information of more than 50 million Facebook
users: https://medium.facilelogin.com/what-went-wrong-d09b0dc24de4.


• In December 2014 a security researcher was able to use an open redirector
vulnerability in the Facebook and Microsoft Live OAuth 2.0 implementations to get access
to an access token issued by the Microsoft Live server:
https://www.yassineaboukir.com/blog/how-I-discovered-a-1000$-open-redirect-in-facebook/.
• In February 2013 a security researcher was able to get hold of Facebook user access
tokens by exploiting a vulnerability in Facebook's OAuth 2.0 implementation and
Chrome: http://homakov.blogspot.com/2013/02/hacking-facebook-with-oauth2-and-chrome.html.
In this chapter you’ll learn possible attacks against different OpenID Connect authentication
flows with respect to different application types. Also, you’ll learn security best practices you
need to follow in your implementations to mitigate such attacks.

10.1 Token exchange flows in OpenID Connect login


In this section we’ll recap how the OpenID Connect login works and how the tokens are
exchanged between a client application and the OpenID provider. The figure 10.1 shows how
the OpenID Connect authorization code authentication flow works.
In step-5 in figure 10.1 the OpenID provider returns the authorization code to the client
application, as a query parameter. In step-6 the client application passes the authorization
code from step-5 to the OpenID provider, in step-6 it gets back the access token, refresh token
and the ID token from the OpenID provider.

Figure 10.1 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.


For an attacker to get hold of the authorization code, they have to intercept either step-5
or step-6; and to get hold of the access token, refresh token, or ID token, they have to
intercept step-7. In the following sections you'll learn different techniques an attacker can
use to intercept these communication paths.

10.2 Exploiting vulnerabilities in the TLS communication to intercept tokens
An attacker can exploit a vulnerability in the TLS communication and then read the value of a
token. At first glance this looks close to impossible, but we cannot completely rule it out. In
the past, multiple vulnerabilities have been found in TLS implementations such as OpenSSL
(https://www.openssl.org/) that enabled attackers to intercept TLS-protected channels.
Padding Oracle On Downgraded Legacy Encryption (POODLE), Browser Exploit Against
SSL/TLS (BEAST), Compression Ratio Info-leak Made Easy (CRIME), Browser Reconnaissance
and Exfiltration via Adaptive Compression of Hypertext (BREACH), and Heartbleed are some of
the well-known vulnerabilities found in TLS implementations in the past. In most of these
cases, for an attacker to exploit these vulnerabilities they first need to find a way to intercept
the communication between the client and the server. This is commonly known as a
man-in-the-middle (MITM) attack.
Attackers use various techniques to execute an MITM attack; one of the simplest is to run
an open WiFi hotspot in a public area, such as an airport. This attracts anyone looking for
free Internet access and paves the way for the attacker to intercept all the communications
from the devices connected to the WiFi hotspot.
If an attacker can first execute an MITM attack, they can then exploit a vulnerability in the
TLS implementation (if any) to decrypt the TLS-protected messages and read the tokens in
clear text. For example, an attacker can intercept step-7 in figure 10.1 and get access to
the access token, refresh token, and ID token.
How do you prevent such an attack? First you need to make sure your application uses
the most up-to-date software libraries with the latest security patches. Then, if you run the
OpenID provider yourself, you need to make sure the latest security patches are applied
there too. If you are using an open source OpenID provider, in most cases you would need
to purchase a subscription from the company behind the product to get security patches for
the version you use; most open source products release security patches publicly only for
their latest version. So, if you do not have a paid subscription, you need to upgrade the
OpenID provider to the latest released version. If you are using a cloud-based service (for
example, Auth0 or Okta) as your OpenID provider, then it's their responsibility to make sure
their products are patched regularly. In fact, before you subscribe to any of those
cloud-based services, make sure to study their security posture in detail.

10.3 Reading tokens from the browser via a plugin


Extensibility over modification is a well-known design philosophy in software architecture. It
is about building software that can evolve with new requirements, without changing or
modifying the current source code, by having the ability to plug new software components
into the current system. Google Chrome extensions and Firefox add-ons follow this concept.


The Firefox add-on Modify Headers lets you add, modify, and filter the HTTP request headers
sent from the browser to a web server. Another Firefox add-on, SSO Tracer, lets you track all
the message flows between an identity provider and a client application via the browser. None
of these is harmful; but if an attacker can fool a user into installing malware as a browser
plugin, it could easily bypass all your browser-level security protections, even TLS, to get
hold of the tokens passed from the OpenID provider to the client application. In that way, an
attacker can get hold of the authorization code, access token, refresh token, and ID token
sent to the client application from the OpenID provider.
It’s not just about an attacker installing a plugin into the user’s browser, but also when
there are many extensions installed in your browser, each one of them expands the attack
surface. Attackers need not to write new plugins, rather can exploit security vulnerabilities in
an already installed plugin to get hold of the tokens that flow through the browser.
How do you prevent such an attack? This attack happens at the client side, so the only way
to prevent it is to educate users not to install random browser plugins. In a corporate
environment, however, you can enforce policies to make sure only trusted browser plugins
can be installed and that they are updated regularly.

10.4 Intercepting the authorization code in a mobile environment


Similar to installing a browser plugin, which we discussed in section 10.3, if an attacker
wants to intercept an authorization code in a mobile environment (where a user logs into a
native mobile app via OpenID Connect), they can make the user install an app on the
mobile device and then intercept the authorization code. Before we delve deep into this
attack, let's see how OpenID Connect works in a mobile environment. Figure 10.2 explains
the flow of events, and it is very similar to what you saw in figure 10.1.


Figure 10.2 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.

As per figure 10.2, when the user clicks Login on the native mobile app, it initiates the
OpenID Connect authorization code authentication flow by spinning up the system browser.
On an Apple iPhone, for example, when you click the Login button on the Facebook app, it
spins up the Safari web browser to initiate the OpenID Connect login flow. At the end of the
OpenID Connect handshake, the OpenID provider redirects the user to the provided
redirect_uri with the authorization code. This is no different from what you have learned
so far with respect to logging in with OpenID Connect to an SPA or a server-side web
application. However, the value of the redirect_uri here is a special one provided by the
mobile application that initiated the login flow. At the time you install the mobile app on the
device, it registers this redirect_uri with the mobile operating system. This is called a
custom URL scheme.
Once an app has registered a custom URL scheme with the mobile operating system,
whenever the OpenID provider redirects the user on a system browser to a redirect_uri
that matches the registered custom URL scheme, the mobile operating system passes the
entire request to the corresponding mobile app. This request also includes the
authorization code; the mobile app can then exchange the authorization code for an access
token and an ID token.
The caveat here is that the mobile operating system allows you to register multiple apps
against the same custom URL scheme. So, if an attacker is able to make you install their
app on your device with the same custom URL scheme as another app that you use, then
when you try to log in to the legitimate app, the mobile operating system will hand over the authorization


code to the attacker's app, because both apps are registered under the same custom URL
scheme.
How do you prevent such an attack? The IETF OAuth working group introduced a new RFC,
Proof Key for Code Exchange (PKCE) by OAuth Public Clients
(https://datatracker.ietf.org/doc/html/rfc7636), to mitigate such attacks. Initially it was
introduced for public clients such as mobile apps and SPAs; today, however, it is recommended
for any kind of OpenID Connect client application. Emphasizing the value of this RFC,
the OAuth working group decided to include the PKCE recommendations in the upcoming
OAuth 2.1 core RFC as well. In the next section we discuss PKCE in detail.

10.5 Using proof key for code exchange (PKCE) to prevent code interception
Proof key for code exchange (PKCE) is a best practice defined by the IETF OAuth working group
in RFC 7636 to prevent the code interception attack, which we discussed in section 10.4. In
this section you will learn how PKCE works and how it prevents the code interception attack.

10.5.1 What can an attacker do with a stolen authorization code?


Before we learn how PKCE prevents a code interception attack, let's see what an attacker can
do with a stolen authorization code. As you learned in chapters 3 and 6, a client
application uses the authorization code it gets from the OpenID provider to obtain an access
token and an ID token. If the client application is a public client (for example, an SPA or a
native mobile app), then it does not authenticate to the token endpoint of the OpenID provider
to obtain an access token and an ID token. However, if the client application is a confidential
client (for example, a server-side web application), then it authenticates to the token
endpoint with the client_id and client_secret.
Let’s revisit what an attacker can do with a stolen authorization code. For a confidential
client application, an attacker can only do a little by having access to the authorization code,
because it requires client authentication to exchange the authorization code to an access token
and ID token. However, for a public client application, an attacker who is having access to the
authorization code, can exchange it to an access token and ID token; because it does not
require any client authentication. The attacker only needs access to the client_id and the
authorization code; and for a public client application, the client_id is publicly known.
So, the impact of the code interception attack is mostly applicable for public client
applications and in the next section you’ll learn how PKCE helps to mitigate it.

10.5.2 How proof key for code exchange (PKCE) works


PKCE introduces two new parameters to the authentication request of the OpenID Connect
authorization code authentication flow: code_challenge and code_challenge_method.
Following is an example of an OpenID Connect authentication request that carries those
two parameters:


Listing 10.1 OpenID Connect authentication request with PKCE parameters

https://op.example.com/authorize?
client_id=zZi6Owvc2S9iJbhdG5LQX3Y3Kutlkx20&
redirect_uri=https://app.example.com/redirect_uri&
scope=openid&
response_type=code&
state=XXXXX&
nonce=XXXXX&
code_challenge=XXXXXXX&
code_challenge_method=S256

Before initiating login, the client application generates a random string and calculates a
hash of it. The code_challenge parameter in the authentication request carries that hashed
value, and the code_challenge_method parameter carries an identifier for the hashing
algorithm. When the value of the code_challenge_method parameter is set to S256, the
hashing algorithm is SHA-256.
In addition to code_challenge and code_challenge_method, PKCE also introduces another
parameter, code_verifier, which goes with the access token request. This parameter
carries the random string the client application generated before; in fact, the value of the
code_challenge parameter is the hashed value of the code_verifier. Following is an
example of a token request that carries the code_verifier parameter:

Listing 10.2 OpenID Connect token request with PKCE parameters

POST /token HTTP/1.1
Host: op.example.com
Content-Type: application/x-www-form-urlencoded

code=YDed2u73hXcr783d&
client_id=your_client_id&
redirect_uri=https://app.example.com/redirect_uri&
grant_type=authorization_code&
code_verifier=WW.geSsl9AzL8QWMl6En5dF2DhlH27mZfQ7C6T82ELN

After receiving the authentication request (listing 10.1) from the client application, the
OpenID provider has to store the values of code_challenge and code_challenge_method
against the authorization code it issues. Then, when it receives the token request (listing
10.2) carrying the authorization code and the code_verifier parameter, the OpenID
provider has to load the corresponding code_challenge and code_challenge_method and
check whether the hashed code_verifier (computed with the hashing method stated in
code_challenge_method) matches the code_challenge it already knows. If it's a successful
match, the OpenID provider accepts the provided authorization code; if not, it rejects the
request.
How does this prevent the code interception attack? Figure 10.3 shows the flow of
events in a typical OpenID Connect login flow; it is in fact the same diagram you saw in
figure 10.1. Let's assume the attacker was able to get hold of the authorization code in step-
5. However, when we have enabled PKCE at the OpenID provider, to exchange the stolen
authorization code for an access token and an ID token (in step-6), the attacker must also know


the code_verifier. The client application generates the code_verifier before it initiates
step-1 and keeps it in memory; the attacker who intercepts step-5 won't have access
to it.

Figure 10.3 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.

PKCE provides protection against the code interception attack that happens via intercepting
step-5. The code interception technique we discussed in section 10.4 with respect to a
mobile app won't be successful when PKCE is in use. However, PKCE does not fully prevent
the other two interception techniques we discussed in sections 10.2 and 10.3.
For example, if the OpenID provider is using a vulnerable TLS implementation, then the
attacker does not need to intercept step-5; they can intercept step-7 directly and get the
access token and ID token. In the same manner, if the attacker can make the user install a
browser plugin, they need not worry about intercepting step-5 to get the authorization
code; they can intercept step-7 and get the access token and the ID token.
In addition to helping prevent the code interception attack in a mobile environment, PKCE
also helps you prevent other types of attacks, which we discuss later in this chapter.


10.6 Executing a cross-site request forgery (CSRF) attack on a single-page application
In this section you’ll learn what is a cross-site request forgery (CSRF) attack and how an
attacker can execute a CSRF attack on a single-page application; and then how we can mitigate
such an attack.

10.6.1 What is a cross-site request forgery (CSRF) attack?


The goal of a cross-site request forgery (CSRF) attack is to trick the user (or the victim)
into submitting a malicious request unintentionally. Let's take an example. Say you log in to
your banking website, which is a server-side web application (see figure 10.4). Being a
server-side web application, let's assume your banking website uses cookies to authenticate
to the backend APIs. We chose this example because it's the simplest way to explain the
CSRF attack; later we'll discuss how CSRF relates to a single-page application.

Figure 10.4 An attacker can successfully execute a CSRF vulnerability in a web application by sending a
carefully crafted link to the victim and making them click on it.

Every time the banking website invokes an API, the browser automatically attaches the
cookies to the API request. An attacker can exploit this behavior to make the user submit a
malicious request unintentionally. For example, the attacker can send an email with a link
that carries a carefully crafted message (as shown below) and make the user click on it.

https://api.mybank.com/transfer?from=101090909090&to=121391939890&amount=5000

Once the user clicks on the link, the browser generates an HTTP request to the
corresponding banking API; and if the user already has a valid login session with the website,
the browser automatically attaches the corresponding cookies to the HTTP request, and the


API will successfully execute the malicious request. This kind of CSRF attack has some
conditions to satisfy, which you might have already figured out:
• The user (or the victim) must have a valid login session with the corresponding backend
server.
• The corresponding backend server must use cookies to protect its APIs.
• The attacker must somehow make the user click on the link.
The API corresponding to the operation the attacker wants to perform accepts, in the
above example, inputs as query parameters. Can we prevent a CSRF attack by not accepting
query parameters as inputs for such operations, and instead expecting the HTTP body to
carry the inputs? If the API does not accept query parameters as inputs, then the attacker
won't be able to send a link like the one below to the victim and make them click on it.

https://api.mybank.com/transfer?from=101090909090&to=121391939890&amount=5000

But would that make the attacker's job any harder? No. The attacker can host their own web
page and make the user click on a link pointing to that web page, as shown below.

https://app.evil.com/index.html

Then, under the above index.html, the attacker can add HTML code as shown below, which
does an HTTP POST to the corresponding API, carrying the required input parameters in the
HTTP body, and executes the API request successfully.

<form name="bank" id="bank" action="https://api.mybank.com/transfer" method="POST">
  <p><input type="hidden" name="from" value="101090909090" /></p>
  <p><input type="hidden" name="to" value="121391939890" /></p>
  <p><input type="hidden" name="amount" value="5000" /></p>
</form>

<script type="text/javascript">
// Submit the hidden form automatically as soon as the page loads
window.onload = function() {
  document.forms["bank"].submit();
};
</script>

If you protect your business APIs with OAuth 2.0 tokens instead of cookies, then you can
prevent a CSRF attack against those APIs. With OAuth 2.0, you pass the access token in the
HTTP Authorization header, and that prevents an attacker from executing a CSRF attack
(figure 10.5).
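The difference from cookies is visible in how the request is built: the token must be attached explicitly by application code, so a request forged from another site never carries it. A minimal sketch (the function name and request shape are illustrative):

```javascript
// Sketch: an SPA attaching an OAuth 2.0 access token explicitly.
// Unlike cookies, the Authorization header is never added by the
// browser on its own, so a request forged from app.evil.com has no token.
function buildTransferRequest(accessToken, transfer) {
  return {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${accessToken}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(transfer),
  };
}

// Usage: fetch('https://api.mybank.com/transfer', buildTransferRequest(token, {...}))
```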
What if the attacker hosts a website that performs a prompt=none call to the authorize
endpoint of the OpenID provider, gets the authorization code, and then exchanges the
authorization code for an access token and an ID token? If the attacker could do that, they
could use the access token obtained from the OpenID provider to talk to the business APIs.
Figure 10.5 shows the flow of events related to this scenario.


Figure 10.5 The request will not carry the OAuth token in the HTTP Authorization header by default, so it will
be rejected by the web server.

As per figure 10.5, in step-5 and step-6, when the browser makes a direct call to the
authorize and token endpoints of the OpenID provider, the browser will permit those calls
only if the attacker's website domain and the OpenID provider's domain are the same, or if
you have enabled a cross-origin resource sharing (CORS) policy at the OpenID provider that
lets the attacker's website domain make direct HTTP requests to the OpenID provider. So,
having a restrictive CORS policy at the OpenID provider helps you prevent this attack.

10.6.2 Fixing a session by passing an authorization code belonging to the attacker with CSRF
In this section you’ll learn how an attacker can fix the session on the victim’s browser with a
CSRF attack and then how to prevent such an attack.
In the section 10.6.1 you learnt how a CSRF attack works, and we can prevent such an
attack by protecting the business APIs with OAuth 2.0 tokens. However, an attacker can still
execute a CSRF attack to fix the session on the victim’s browser. Session fixation is a common
type of an attack on the web. With session fixation, the attacker can inject their session
identifier to the victim’s session; so in that case the victim unintentionally acts as the
attacker. For example, if an attacker fixes your session on www.amazon.com; then when you
place an order; you in fact place that order under the attacker’s Amazon account. So, the
attacker can login later and change the billing address to a different one. The figure 10.6
shows how a session fixation attack works.


Figure 10.6 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.

With OpenID Connect, an attacker can try to execute a session fixation attack with the
help of a CSRF vulnerability in the OpenID Connect implementation (see figure 10.7).


Figure 10.7 The attacker first logs into the OpenID provider, but blocks the redirect to the web server and
extracts the authorization code.

In step-1 of figure 10.7, the attacker first logs into the OpenID provider with their own
account. This is a prerequisite for any kind of session fixation attack: the attacker and the
victim must have accounts with the same OpenID provider. In step-2, after the OpenID
provider verifies the attacker's credentials, it issues an authorization code. This
authorization code goes to the corresponding web application via the browser. In step-3, the
attacker prevents the authorization code from reaching the website and copies the redirect
URI with all its parameters (which include the authorization code); in step-4 the attacker
executes a CSRF attack by sending the copied URI to the victim.
When the victim clicks on the link, the website exchanges the attacker's authorization
code for an access token and an ID token, and the victim is now logged into the website as
the attacker. In other words, the victim's session is now fixed.

10.6.3 Preventing the session fixation attack with the state parameter
How do we prevent this session fixation attack? We need to fix the CSRF vulnerability in our
OpenID Connect implementation. There are two ways to do that: one is to use the state
parameter, and the other is to use PKCE (which we discussed in section 10.5.2). Figure 10.8
shows how to use the state parameter to avoid a possible CSRF vulnerability. You learned
about the state parameter in chapter 3; it is a recommended, but optional, parameter in
the OpenID Connect authorization code flow.


Figure 10.8 The client application includes a random, nonguessable string as the value of the state parameter
and also stores the same value on the browser.

In step-1 in figure 10.8, before starting the login flow, the client application generates a
random value as the state parameter and adds it to the browser session. For a server-side
web application, this can be added as a cookie; for a single-page application, you can store
the value of the state parameter in the session storage. In step-5, along with the
authorization code, the client application gets back in the response the same state
parameter it added to the request in step-1. Now the client application must make sure the
value it stored in the browser session exactly matches the value of the state parameter it
got from the OpenID provider. If they match, the client application can proceed to the code
exchange; otherwise it should return an error.
How does the above approach with the state parameter prevent the session fixation
attack? The URL the attacker copies from their own login flow would be something similar to
the following. Here the value of code corresponds to the attacker's account, and the value
of the state parameter corresponds to the state value stored in the attacker's browser.

https://app.example.com/redirect_uri?code=YDed2u73hXcr783d&state=Xd2u73hgj59435

When the attacker passes the above URL to the victim and the victim clicks on it, the
client application checks whether the value of the state parameter in the request matches
the value of the state parameter in the victim's browser. That check will fail, because either
the victim's browser session does not have a state value, or it won't match the attacker's
state value. So, the attacker's session fixation attempt fails.


10.6.4 Preventing the session fixation attack with PKCE


In section 10.6.3 we suggested two approaches to preventing a session fixation attack and
discussed how to use the state parameter. In this section you'll learn how PKCE (see
section 10.5.2) helps prevent the session fixation attack. Figure 10.9 shows how PKCE
prevents a session fixation attack.

Figure 10.9 Using PKCE, the client application can prevent a session fixation attack. The client generates a
code_challenge and a code_verifier before sending the login request to the OpenID provider, and later uses
the code_verifier when exchanging the authorization code for an access token.

In step-1 in figure 10.9, the client application on the attacker's browser generates the
code_verifier parameter. When the victim clicks on the link the attacker provided and the
client application tries to exchange the authorization code provided by the attacker, that
request will fail, because the client application does not know the code_verifier
corresponding to the attacker's authorization code. So, the attacker's session fixation
attempt fails.

10.7 Identity provider mix-up attack on an application using more than one identity provider
In 2016, Daniel Fett, Ralf Küsters, and Guido Schmitz researched OAuth 2.0 security and
published the paper "A Comprehensive Formal Security Analysis of OAuth 2.0"


(https://arxiv.org/pdf/1601.01229.pdf). The identity provider mix-up is one of the attacks
highlighted in their paper. Figure 10.10 shows how the identity provider mix-up works.

Figure 10.10 The attacker intercepts the communication between the browser and the web server, and updates
the OpenID provider the user picked to one that is under the attacker's control.

As per figure 10.10, the client application provides login options with multiple OpenID
providers; let's say op.foo.com and op.bar.com. Neither of these OpenID providers is
under the control of the client application. In step-1 of figure 10.10, the victim picks
op.foo.com in the browser, and the attacker intercepts the request and changes the
selection to op.bar.com. Here we assume the communication between the browser and the
client application is not protected with TLS. The OpenID Connect specification (and the
OAuth 2.0 RFC) does not say anything about how to protect the communication between the
browser and the client application; it is outside the scope of OpenID Connect, and it's purely
up to the web application developers to decide what to do. Since no confidential data is
passed in this flow, web application developers sometimes may not worry about using TLS.
At the same time, a few vulnerabilities have been discovered in the past in TLS
implementations, so an attacker could possibly use such a vulnerability to intercept the
communication between the browser and the client application even if TLS is used.
The client application only gets the modified request from the attacker who intercepted
the communication. So, in step-2 the client application thinks the victim picked op.bar.com
and redirects the user to op.bar.com. In step-3 the attacker again intercepts the redirection
from the client application and modifies it to go to op.foo.com. The way redirection works is
that the web server (in this case the client application) sends back a response to the
browser with a 302 status code and a Location HTTP header. If the communication between
the browser and the client application is not on TLS, then this response is not protected,
even if the Location HTTP header contains an https URL. Since we already assumed that

©Manning Publications Co. To comment go to liveBook

Licensed to Mayuran Satchi <[email protected]>



that the communication between the browser and the client application can be intercepted by the
attacker, the attacker can modify the Location HTTP header in the response to point to
op.foo.com, which is the original selection by the victim.
In step-4 the client application gets the authorization code and now talks to op.bar.com
to validate it. Since the client application supports multiple OpenID providers, how does it
know, given an authorization code, which OpenID provider the code corresponds to? Just by
looking at the authorization code, the client application cannot decide which OpenID provider
the code belongs to. So, here we assume the client application tracks the OpenID provider with
some session variable (or using the session storage).
In step-5, op.bar.com gets hold of the victim’s authorization code, which was issued by
op.foo.com. So, in the case of a single-page application, which is a public client,
op.bar.com can exchange the authorization code for an access token and use that to access
business APIs on behalf of the victim.
There are no records of this attack being carried out in practice, but at the same time
we cannot totally rule it out. One way to prevent such an attack is to have a separate redirect
URI for each OpenID provider. With this, the client application knows which OpenID provider a
given authorization code belongs to, which helps prevent the OpenID provider mix-up.

10.8 Summary
• An attacker can exploit a vulnerability in the TLS communication and then read the
value of a token. Padding Oracle On Downgraded Legacy Encryption (POODLE), Browser
Exploit Against SSL/TLS (BEAST), Compression Ratio Info-leak Made Easy (CRIME),
Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext
(BREACH), Heartbleed are some of the popular vulnerabilities found in TLS
implementations in the past.
• An attacker can fool a user into installing malware as a browser plugin, which could easily
bypass all your browser-level security protections, even TLS, to get hold of the
tokens passed from the OpenID provider to the client application.
• The IETF OAuth working group introduced a new RFC called Proof Key for Code
Exchange (PKCE) by OAuth Public Clients
(https://datatracker.ietf.org/doc/html/rfc7636) to mitigate authorization code
interception attacks.
• The goal of a cross-site request forgery (CSRF) attack is to trick the user (or the victim)
into submitting a malicious request unintentionally.


React fundamentals

React is an open source JavaScript library developed by Facebook for developing user
interfaces for web-based applications. Over time it has become one of the most popular
JavaScript libraries, used by many major companies such as Facebook, Netflix, Instagram,
Airbnb, Medium, Twitter, and many more. In this appendix we discuss React fundamentals
and everything you need to know to follow the samples in Chapter 2 of the book. React
heavily uses features introduced in modern JavaScript, which is also called Next-Gen JS or
ES6+. So in this appendix you will also learn some of the new features introduced in modern
JavaScript.
If you’re interested in understanding JavaScript in detail, we recommend JavaScript: The
Definitive Guide: Master the World's Most-Used Programming Language (O'Reilly Media, 2020) by
David Flanagan or Eloquent JavaScript: A Modern Introduction to Programming (No Starch
Press, 2018) by Marijn Haverbeke. If you’d like to learn React in detail, we recommend the
book Learning React: Modern Patterns for Developing React Apps (O'Reilly Media, 2020) by
Alex Banks and Eve Porcello. Also, the book React Hooks in Action (Manning, 2020) by John
Larsen is another good book to learn React in detail.

A.1 Running JavaScript


To run a program written in JavaScript you need a JavaScript engine. Any web
browser, such as Chrome, Firefox, Internet Explorer, or Safari, embeds a JavaScript engine.
Chrome browser, for example, embeds the V8 JavaScript engine (https://v8.dev/). There are
many other JavaScript engines and you can find a list of JavaScript engines from here:
https://en.wikipedia.org/wiki/List_of_ECMAScript_engines. 1
Even though almost all web browsers embed JavaScript engines, you don’t always need a
web browser to run JavaScript; you only need a JavaScript engine. Node.js

1 JavaScript engines are different from browser engines. For example, Google Chrome browser uses the Blink browser engine with V8 JavaScript engine,
and Safari uses WebKit browser engine with JavaScriptCore JavaScript engine.


(https://nodejs.org/), for example, is a JavaScript runtime that embeds V8 JavaScript


engine. You can run a JavaScript program on Node.js with no web browser. Also, JavaScript
is not just about beautifying frontend web applications running in a web browser; you can
also build APIs and microservices with JavaScript. In other words, you can use JavaScript to
develop both client-side and server-side applications. Get Programming with Node.js (Manning,
2019) by Jonathan Wexler is a very good reference for learning Node.js in detail and learning
how to develop server-side applications with JavaScript.
React, which we discuss in this appendix, only focuses on building client-side applications
with JavaScript.

A.2 What’s new in JavaScript (ES6)?


Netscape, together with Sun Microsystems, introduced JavaScript in 1995; it was initially
born at Netscape under the name Mocha. In 1997, the European Computer Manufacturers
Association (ECMA) International standards organization took control of standardizing the
development of JavaScript. The standard defined by ECMA is called ECMAScript, and JavaScript
is an implementation of ECMAScript. However, JavaScript is not the only implementation of
ECMAScript. ECMA released the first edition of ECMAScript in June 1997, and JavaScript
implementations based on the 6th edition of ECMAScript onward are called Next-Gen JS.
The 6th edition of ECMAScript is also known as ES6 or ECMAScript-2015 (based on the
year the standard was released). At the time of writing this book, the latest ECMAScript is
ECMAScript-2020. In this section we take you through a set of new features introduced by
ES6, which are important in understanding the React fundamentals. If you are already
familiar with ES6, then please proceed to the section A.3. If you’d like to get a deeper
understanding of all the changes introduced by ES6, the book Understanding ECMAScript 6:
The Definitive Guide for JavaScript Developers (No Starch Press, 2016) by Nicholas C. Zakas
is a great reference.

A.2.1 New keywords to declare variables


In addition to the var keyword that has been present for a while in JavaScript, ES6
introduced two new keywords to declare variables: let and const. Let’s go through a few code
samples to understand the differences among var, let, and const.
In the following code block we define a JavaScript function called testVar, which declares
a variable called num. This function takes a Boolean value as an input parameter and if the
value is true, the code within the if-block of the testVar function re-declares the num
variable and assigns it the value 15, which overrides the value of the num variable defined at
the function level, changing it from 10 to 15.
A variable defined at the function level is also called a local variable. When you invoke the
testVar function with the value true, then the following program will print 15 and when you
invoke the same with value false, the program will print 10.


function testVar(override) {
  var num = 10;
  if (override) {
    var num = 15;
  }
  console.log("value of num ", num);
}

testVar(true);  // prints 15
testVar(false); // prints 10

Here you can see we have defined the variable num in two places: at the level of the testVar
function and within the if-block. If you look carefully at the output of the program, you will
realize that the value of the num variable defined within the if-block overrides the value of
the num variable defined at the function level. In other words, the scope of any variable
defined with the var keyword in JavaScript is not limited to the code block that contains
it. Its value is visible within the function that contains the variable declaration. To be more
precise, a variable declared with the var keyword within a function, directly or inside another
code block, is visible to any other code block within the same function. So, any code block
within the function can override the value of any variable declared with var at the function
level or within a code block inside the function.
If a variable is defined outside of a function using the var keyword (also known as a
global variable), you can still refer to that variable within a function. However, if you re-
declare a variable within a function with the same name as a global variable, JavaScript
treats it as a new variable. So, within the function, if you want to refer to the global
variable that carries the same name as the local variable, you need to access that variable
via the global object, for example as globalThis.num. Also, if you re-declare a new variable
with the same name within a function, any changes you make to your local variable won’t
affect the global variable. The following code block demonstrates this.

var num = 25;

function testVar() {
  var num = 10;
  console.log("value of num ", globalThis.num); // prints 25
  console.log("value of num ", num); // prints 10
}
testVar();
console.log("value of num ", num); // prints 25

In the following code block, let’s rewrite the testVar function using the let keyword.
When you use let to define a variable, the scope of that variable is limited to the block
that declares it. In the following code block, for example, the value of the num variable
defined at the function level remains unchanged even after we execute the if-block.


function testVar(override) {
  var num = 10;
  if (override) {
    let num = 15;
  }
  console.log("value of num ", num);
}
testVar(true);  // prints 10
testVar(false); // prints 10

The behavior of the const keyword is quite similar to that of the let keyword, except that
you cannot reassign a variable defined with const. Also, unlike let, when you declare a
variable using the const keyword, you must assign a value to it at the time of the
declaration. The following code block shows the behavior of the const keyword. Here we
define the function-level num variable as a constant. However, even if we declare another
variable with the same name (num) using the let keyword within the if-block, it has no
impact on the function-level constant.

function testVar(override) {
  const num = 10;
  if (override) {
    let num = 15;
  }
  console.log("value of num ", num);
}
testVar(true);  // prints 10
testVar(false); // prints 10

Finally, there is one more important difference between variables declared with the var
keyword and variables declared with the let and const keywords. When you declare a
variable using the var keyword, you can refer to that variable from any location within the
function, before or after the declaration. This is possible due to a feature in
JavaScript called hoisting. All variable declarations with the var keyword are hoisted to
the top of the corresponding function. But when you use the let and const keywords, those
variable declarations are not hoisted in the same way, so you can refer to those variables
only after the corresponding variable declaration.
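The difference is easy to observe. The following short sketch (our own example, not from the book’s samples) reads a var variable before its declaration and gets undefined, while reading a let variable before its declaration throws a ReferenceError:

```javascript
function hoistDemo() {
  // a is hoisted to the top of the function, so this reads undefined
  const before = a;
  var a = 5;

  let caught = false;
  try {
    console.log(b); // b is declared with let below, so this throws
  } catch (e) {
    caught = e instanceof ReferenceError;
  }
  let b = 10;
  return { before, caught };
}

console.log(hoistDemo()); // prints { before: undefined, caught: true }
```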

A.2.2 JavaScript functions recap


In section A.2.1 you saw how to define a function in JavaScript. In this section we’ll do a
quick recap of JavaScript functions. Everything we discuss in this section existed in
JavaScript prior to ES6, and in section A.2.3 we’ll discuss the key features ES6 introduced
to JavaScript functions.
When you use the keyword function to define a function, the function declaration is
hoisted to the top of the enclosing scope behind the scenes. So, you can refer to such a
function from anywhere in your code, before or after you declare it.
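For example, this call works even though it appears before the declaration (a small sketch of our own):

```javascript
// The call appears before the declaration, but function declarations are
// hoisted, so double is already available here.
console.log(double(21)); // prints 42

function double(n) {
  return n * 2;
}
```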

NESTED FUNCTIONS
A nested function is a function that is defined within another function. JavaScript supports
nested functions. The following code block shows an example of a nested function, which is


also called an inner function. The variables defined within the outer function are visible to the
inner function.

function foo() {
  let num = 100;
  function bar() { // this is a nested function
    console.log(num); // prints 100
  }
  bar(); // calls the nested function
}
foo();

FUNCTION EXPRESSIONS
You can also define a function in JavaScript as an expression, with or without a name. The
following code block shows an example of a function expression. Here we assign the function
expression to a constant called greet, so the definition of the function cannot be overridden
later. Also, in this example, we don’t specify a name for the function.

const greet = function(name) { return "Hi " + name; }

greet("Peter"); // returns Hi Peter

Let’s have a look at another example of a function expression. The function expression
shown in the following code block carries a name. Even though this function has a name,
you can use that name only within the function expression itself; you cannot use the name of
the function to invoke it outside the function expression. In most cases, having a name for
a function expression is useful when you want to invoke the function recursively within
itself.

const g = function greet(name) { return "Hi " + name; }

g("Peter"); // returns Hi Peter
// greet("Peter"); // Uncaught ReferenceError: greet is not defined

There is a major difference between a function you declare using the function keyword and a
function expression. As discussed at the beginning of this section, when you use the keyword
function to define a function, the function declaration is hoisted to the top of the
enclosing scope behind the scenes; but when you define a function as an expression, you can
invoke that function only after the corresponding expression has executed. JavaScript does
not hoist function expressions.

HIGHER-ORDER FUNCTIONS
In JavaScript a function can accept another function as an input parameter, as well as return
another function. A function that accepts or returns another function is called a higher-order
function. Let’s have a look at an example in the following code block. Here message
is a higher-order function that accepts two string arguments and a function. Whoever
invokes the message function decides how the communication with the recipient should
happen and, based on that, passes the corresponding function as the messenger.


const message = function (recipient, msg, messenger) {
  console.log("sending a message to " + recipient);
  // some more work if required;
  messenger(recipient, msg);
}

const sms = function (recipient, msg) {
  // find the mobile number of the recipient and send the message
  console.log("sending an SMS");
}

const email = function (recipient, msg) {
  // find the email address of the recipient and send the message
  console.log("sending an email");
}

message("peter", "fun with JS", email);
message("john", "fun with JS", sms);
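The example above shows a function that accepts another function. A higher-order function can also return a function. The following small sketch of our own builds and returns a new function that remembers the greeting it was created with:

```javascript
function makeGreeter(greeting) {
  // The returned function closes over the greeting argument.
  return function (name) {
    return greeting + ", " + name + "!";
  };
}

const hi = makeGreeter("Hi");
console.log(hi("Peter")); // prints Hi, Peter!
```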

A.2.3 Arrow functions


ES6 introduced arrow functions as a more compact way to define a function in JavaScript.
Let’s rewrite the higher-order function example we discussed in section A.2.2 using arrow
functions in the following code block.

const message = (recipient, msg, messenger) => {
  console.log("sending a message to " + recipient);
  // some more work if required;
  messenger(recipient, msg);
}

const sms = (recipient, msg) => {
  // find the mobile number of the recipient and send the message
  console.log("sending an SMS");
}

const email = (recipient, msg) => {
  // find the email address of the recipient and send the message
  console.log("sending an email");
}

message("peter", "fun with JS", email);
message("john", "fun with JS", sms);

As you might have already guessed, we don’t use the function keyword to define an arrow
function. Also, there is no need for the function to have a name. Before the arrow in an arrow
function, we define the input parameters of the function within parentheses. If there is only
one input parameter, the parentheses are optional. After the arrow comes the body of the
function. If there are multiple statements in the body of the function, we need to wrap the
function body in a pair of curly braces; but if the function has only one return statement,
we don’t need the curly braces, and we can also skip the return keyword. The following code
block shows an example of an arrow function that accepts a number as an input parameter and
has one statement that returns the square of the provided number.


const sqrt = n => n*n; // note: despite its name, this returns the square of n

console.log(sqrt(4)); // prints 16

If an arrow function does not accept any input parameters, then it must have an empty pair
of parentheses, as shown in the following code block.

const hello = () => "Hello ES6";

console.log(hello()); // prints Hello ES6

A.2.4 Default values for arguments of a function


With ES6, when you define a function, you can mark certain arguments optional and define
default values to use when the caller decides not to provide any. Let’s have a look at a
code example. In the following code block, the sqrt arrow function accepts an optional
parameter with the default value 2. So, when you invoke it without passing any value, the
function returns the square of 2, which is 4.

const sqrt = (n=2) => n*n;

console.log(sqrt()); // prints 4
console.log(sqrt(4)); // prints 16

The default value of an optional parameter of a function can also be an expression. Let’s
revisit the code example we had in section A.2.3. In the following code block we make the
messenger an optional argument and set its default value to email, which points to another
function.

const message = (recipient, msg, messenger=email) => {
  console.log("sending a message to " + recipient);
  // some more work if required;
  messenger(recipient, msg);
}

const sms = (recipient, msg) => {
  // find the mobile number of the recipient and send the message
  console.log("sending an SMS");
}

const email = (recipient, msg) => {
  // find the email address of the recipient and send the message
  console.log("sending an email");
}

message("peter", "fun with JS");
message("john", "fun with JS", sms);

You can also define a default value for the messenger argument in the following way. Here
we define the function for the messenger argument as an inline arrow function. This is a
good example of using an arrow function because of its compact nature.


const message = (recipient, msg,
  messenger = (recipient, msg) => {
    console.log("sending an email");
  }) => {
  console.log("sending a message to " + recipient);
  // some more work if required;
  messenger(recipient, msg);
}

message("peter", "fun with JS");

A.2.5 Template literals


Template literals are a new feature introduced in ES6 that lets you embed JavaScript
expressions into a string. The string that embeds the JavaScript expressions should start and
end with the backtick character (`), and any JavaScript expression within the two backtick
characters must start with ${ and end with }. The following code block shows an example.

const hello = () => "Hello ES6";
const helloName = name => `Hello ${name}`;

// prints Say Hello ES6 Hello Peter
console.log(`Say ${hello()} ${helloName("Peter")}`);

You can also use a template literal along with a function name, as shown in the following code
block. A template literal appearing right next to a function name (for example, right after
the function name helloName) is passed to the corresponding function as arguments: the
function receives the literal’s string segments, followed by the values of any embedded
expressions. These are known as tagged template literals.

const helloName = name => `Hello ${name}`;

// prints Hello Peter
console.log(helloName`Peter`);
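To make the mechanics clearer, here is a small sketch of our own where the tag function uses the string segments around the embedded expression. The segments before and after ${name} arrive as strings[0] and strings[1], and the evaluated expression value arrives as the second argument:

```javascript
// A tag function receives the literal's string segments as an array,
// followed by the values of the embedded expressions.
const shout = (strings, value) => strings[0] + value.toUpperCase() + strings[1];

const name = "Peter";
console.log(shout`Hello ${name}!`); // prints Hello PETER!
```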

A.2.6 Rest operator


In JavaScript you can, for example, pass three arguments to a function that accepts two
arguments; the function will simply take the first two arguments from the passed list. The
following code block illustrates this with an example.

const sum = (num1, num2) => num1 + num2;

// prints 5
console.log(sum(2,3,4));

ES6 introduced the rest operator to handle these kinds of cases in a better way. You define a
rest parameter with three dots (...), and it must be the last parameter in the function
declaration. The following code block shows some valid and invalid function declarations
with the rest operator.


const sum = (num1, num2, ...nums) => { /* function body */ } // valid

const sum = (...nums) => { /* function body */ } // valid

// Uncaught SyntaxError: Unexpected token '...'
const sum = ...nums => { /* function body */ } // invalid

// Uncaught SyntaxError: Rest parameter must be last formal parameter
const sum = (...nums, num1) => { /* function body */ } // invalid

The following code block shows a complete example of how to use the rest operator. Here we
treat the values collected by the rest parameter (...nums) as an array and execute the
forEach function on it.

const sum = (...nums) => { let tot = 0; nums.forEach(num => tot += num); return tot; }
console.log(sum(2,3,4)); // prints 9

A.2.7 Spread operator


The rest operator we discussed in section A.2.6 absorbs a set of parameters into a single
array, while the spread operator introduced in ES6 does the exact opposite. The spread
operator breaks down an array or any iterable object into individual elements. The following
code block shows how to use the spread operator to copy the elements from one array to
another.

const arr1 = [1,2,3];
const arr2 = [...arr1, 4]; // [1,2,3,4]

You can also use the spread operator to break an array into individual elements and pass
those to a function. The following code block demonstrates that use case.

const sum = (num1, num2) => num1 + num2;

const arr = [1,2];

// prints 3
console.log(sum(...arr));
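The spread operator also works on object literals (strictly speaking, object spread was added later, in ES2018): it copies an object’s properties into a new object, optionally overriding some of them. A small sketch of our own:

```javascript
const base = { name: "React in Action", publisher: "Manning" };

// Copy base, then override one property in the copy; base is unchanged.
const updated = { ...base, publisher: "Manning Publications" };

console.log(updated.name);      // prints React in Action
console.log(updated.publisher); // prints Manning Publications
console.log(base.publisher);    // prints Manning
```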

A.2.8 Destructuring an object


Objects are a popular way of structuring data in JavaScript. ES6 introduced a way to
destructure an object into its attributes. The following code snippet shows an example,
where we first construct an object called book with a set of attributes and then destructure
it into a selected set of attributes.

const book = {name: "React in Action", author: "Mark Tielens Thomas", publisher: "Manning"};

let {author} = book;
console.log(author); // prints Mark Tielens Thomas

let {name, publisher} = book;
console.log(name); // prints React in Action
console.log(publisher); // prints Manning
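Destructuring also supports renaming an attribute and supplying a default value for a missing one. A small sketch of our own:

```javascript
const book = { name: "React in Action", publisher: "Manning" };

// Read the name attribute into a variable called title, and give
// edition a default since book has no such attribute.
const { name: title, edition = 1 } = book;

console.log(title);   // prints React in Action
console.log(edition); // prints 1
```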


Let’s go through another example. In the following code snippet we do object destructuring
for the input parameters of an arrow function.

const book = {name: "React in Action", author: "Mark Tielens Thomas", publisher: "Manning"};

const printAuthorName = ({author}) => console.log(author);

printAuthorName(book); // prints Mark Tielens Thomas

A.2.9 Modules
ES6 introduced first-class support for modularity in JavaScript with import and export
statements. In this section we discuss how ES6 modules work in a browser environment. We
do not intend to discuss modules in detail here, but we’ll teach you just enough to get
started with React. In section A.7 we discuss modules again with respect to React.
Prior to ES6, JavaScript followed a shared-everything principle. All the variables and
functions you define under different script tags of an HTML page are shared with each
other. You can’t restrict access to a given JavaScript function to only the other functions
defined within that particular script tag. The following code listing (listing A.1)
illustrates this with an example.
If you open the code in the following listing in a web browser, you’ll find the number
100 printed on the console (you can also access the same via
https://prabath.s3.amazonaws.com/a1.html). That confirms that JavaScript code defined
under one script tag has access to the JavaScript code defined under another script tag.
This would still be the case even if you loaded the JavaScript under each script tag from a
URL using the src attribute (rather than defining it inline as in listing A.1).

Listing A.1 Inline JavaScript code sharing everything


<!DOCTYPE html>
<body>
  <script>
    let x = 10;

    function getSqrt(num) {
      return num*num;
    }
  </script>

  <script>
    console.log(getSqrt(x)); // prints 100
  </script>
</body>

With the modularity concept introduced in ES6, the JavaScript code defined under each script
tag is treated as an independent module when we set the type attribute of the script tag
to module. If you open the following code snippet (listing A.2) as an HTML file in your
browser, you’ll find the error Uncaught ReferenceError: getSqrt is not defined on the

console (you can also access the same via https://prabath.s3.amazonaws.com/a2.html).
That’s because the JavaScript engine running in the browser now treats each script tag as
an independent module. All the functions, variables, and classes defined under these script
tags are now considered private to the enclosing script tags.

Listing A.2 JavaScript modules defined inline


<!DOCTYPE html>
<body>
  <script type="module">
    let x = 10;

    function getSqrt(num) {
      return num*num;
    }
  </script>

  <script type="module">
    console.log(getSqrt(x)); // Uncaught ReferenceError: getSqrt is not defined
  </script>
</body>

If you want to refer to a function defined within one script tag (or a module) from another
script tag (or a module), you first need to export it from the module that defines the
function and import it from the module that wants to access it. You can import a function
only if the module that defines that function exports it. Let’s go through an example in the
following code snippet. Here we have externalized the module that defines the getSqrt
function into an independent JavaScript file called math_mod.js. You cannot import a function
from an inline module, and that’s why we have to define the getSqrt function in the
math_mod.js file. The following code snippet shows the content of the math_mod.js file.

Listing A.3 A module defined in the math_mod.js file


// math_mod.js
export let x = 10;

export function getSqrt(num) {
  return num*num;
}

The following code snippet shows the HTML page content that loads a module from the
math_mod.js file.


Listing A.4 Loading module defined in math_mod.js file


<!DOCTYPE html>
<body>
  <script type="module" src="./math_mod.js"></script>

  <script type="module">
    import {x, getSqrt} from "./math_mod.js";
    console.log(getSqrt(x)); // prints 100
  </script>
</body>

If you save the above code snippet into an HTML file and try to open it in a browser, you
will notice an error message on the console. Even though normal script tags support
loading a JavaScript source from the local filesystem, when a script tag has module as its
type, the browser makes sure Cross-Origin Resource Sharing (CORS) rules are enforced. We
discuss CORS in chapter 3; cross-origin requests are not supported for files loaded from the
local filesystem. So, to test this use case, you need to host both the math_mod.js file and
the HTML file on a web server. We’ve uploaded both these files to Amazon S3, which you can
access from https://prabath.s3.amazonaws.com/a4.html. Once you open this file, you will see
100 printed on the console.

A.3 Getting started with React


In this section we’ll write our first program in React. We assume you either have a good
understanding of ES6 or have gone through section A.2. You can find all the samples we
use in this appendix under the appendix-a directory of the GitHub repository available at
https://github.com/openidconnect-in-action/samples.
The following code listing shows one of the simplest React programs you can write. You
can find the same inside the appendix-a/sample01/public/index.html file. To see the output of
the program, open the index.html file using your favorite web browser (Chrome,
Firefox, Safari, Internet Explorer, and so on).


Listing A.5 A simple React program embedded into an HTML page


<!DOCTYPE html>
<head>
  <title>Book Club</title>
  <script src="https://unpkg.com/react@16/umd/react.development.js"></script>
  <script src="https://unpkg.com/react-dom@16/umd/react-dom.development.js"></script>
  <script src="https://unpkg.com/babel-standalone@6/babel.min.js"></script>
</head>

<body>
  <div id="book-club-app" />
  <script type="text/babel">
    const App = () => (
      <h1>Welcome to the Book Club!</h1>
    );
    ReactDOM.render(<App />,
      document.getElementById("book-club-app"));
  </script>
</body>

</html>

The React program in listing A.5 produces the output shown in figure A.1 when you open
the index.html file using a web browser.

Figure A.1: The output of the React program in listing A.5

Congratulations! You have successfully run your first React program. Now let’s break down
the code in listing A.5 into multiple sections.
First let’s focus on the code inside the script tag within the HTML body tag. You can find
the same in the following code listing. As you might have guessed, here we have defined an
arrow function (see section A.2.3) that takes no arguments, and assigned it to a constant
with the name App. The objective of this arrow function is to construct and return a React
element, and we call this function a React component, or to be precise, a React function
component. A valid React function component accepts zero or one input parameters. The one in
the following listing accepts no parameters; in section A.5, however, we have an example
that accepts one input parameter.

Listing A.6 Defining a React element with JSX


<script type="text/babel">
  const App = () => (
    <h1>Welcome to the Book Club!</h1>
  );
</script>


The code inside the body of the arrow function (listing A.6) looks very similar to HTML, but it
is not. It’s written in JSX, a syntax extension to JavaScript. JSX lets us define React
elements in a form that looks very much like HTML, so if you already know HTML, there isn’t
much new JSX syntax to learn. However, browsers do not understand JSX, so we use the
Babel (https://babeljs.io) JavaScript library to compile the JSX code into JavaScript that
browsers understand. To use Babel in the browser we need to do two things.
• Load the Babel JavaScript library

<script src="https://unpkg.com/babel-standalone@6/babel.min.js"></script>

• Embed the JSX code in a script tag with the type text/babel

<script type="text/babel"></script>

Once the App function has constructed the React element, we use the ReactDOM API to add
it to the browser’s DOM, with the following line from listing A.5. Here we first find the
book-club-app element in our HTML code (an HTML div tag) with the JavaScript function
document.getElementById and then attach the App element to it.

ReactDOM.render(<App />,document.getElementById("book-club-app"));

To use the ReactDOM API from the browser, we need to load the ReactDOM JavaScript
library using the following script tag.

<script src="https://unpkg.com/react-dom@16/umd/react-dom.development.js"></script>

We’ve talked about all the key elements in listing A.5 except the following script tag.

<script src="https://unpkg.com/react@16/umd/react.development.js"></script>

This loads the React JavaScript library, which provides the API for creating React elements.
Even though listing A.5 defines no React elements directly, we still need this library,
because when Babel compiles the JSX code to JavaScript, the result is a set of calls that
create React elements.
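To make this concrete, here is roughly what Babel (with the classic JSX transform) produces for the JSX in listing A.6. The sketch below uses a minimal stand-in for React.createElement so it can run without the React library; real compiled code calls React.createElement itself.

```javascript
// A minimal stand-in for React.createElement, for illustration only.
// It builds a plain object describing the element, much like React does.
const createElement = (type, props, ...children) => ({
  type,
  props: { ...(props || {}), children },
});

// What Babel emits for:  const App = () => (<h1>Welcome to the Book Club!</h1>);
const App = () => createElement("h1", null, "Welcome to the Book Club!");

const element = App();
console.log(element.type);              // "h1"
console.log(element.props.children[0]); // "Welcome to the Book Club!"
```

This is why the React library must be loaded even when you never type `React.createElement` yourself: the compiled JSX calls it on every render.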
In the rest of this appendix we’ll improve this simple React program step by step toward a
production-grade React application, introducing some important React concepts along the way.

A.4 Working with multiple React components


In a typical React application there is one core component, whose responsibility is to load
and lay out the rest of the React components on the HTML page. In this section you’ll learn
how one React component can load another.
In the following listing, <App /> is our core React component, and it loads the
<Books /> element. Just like the <App /> element, the <Books /> element is defined using an
arrow function. The listing shows only the React code that will be compiled by Babel.


You can find the complete HTML page in appendix-a/sample02/public/index.html. To see the
output of the program, you can open the index.html file using your favorite web browser.

Listing A.7 The App React element loads the Books element
<script type="text/babel">
const App = () => (
  <div>
    <h1>Welcome to the Book Club!</h1>
    <Books />
  </div>
);

const Books = () => (
  <div>
    <table>
      <tr>
        <td>React in Action</td>
        <td>Mark Tielens Thomas</td>
      </tr>
      <tr>
        <td>React Hooks in Action</td>
        <td>John Larsen</td>
      </tr>
      <tr>
        <td>React Native in Action</td>
        <td>Nader Dabit</td>
      </tr>
    </table>
  </div>
);

ReactDOM.render(<App />, document.getElementById("book-club-app"));
</script>

A.5 Passing messages among components


In this section we extend the example from listing A.7 in section A.4 to see how
communication happens among multiple React components; in other words, how one React
component can pass one or more parameters to another.
In the following code listing, the arrow function that builds the <App /> element accepts
props as an input parameter. The name of this input parameter does not have to be props;
it can be any valid parameter name. However, by convention we call it props, as it
represents a set of properties. The following listing shows only the React code that will be
compiled by Babel. You can find the complete HTML page in appendix-
a/sample03/public/index.html.


Listing A.8 The App React element accepts a set of properties from the caller
<script type="text/babel">
const App = (props) => (
  <div>
    <h1>Welcome to the Book Club, {props.name}!</h1>
    <Books />
  </div>
);

const Books = () => (
  <div>
    <table>
      <tr>
        <td>React in Action</td>
        <td>Mark Tielens Thomas</td>
      </tr>
      <tr>
        <td>React Hooks in Action</td>
        <td>John Larsen</td>
      </tr>
      <tr>
        <td>React Native in Action</td>
        <td>Nader Dabit</td>
      </tr>
    </table>
  </div>
);

ReactDOM.render(<App name="Peter"/>, document.getElementById("book-club-app"));
</script>

The following code line, which adds the <App /> element to the browser DOM, defines on
the <App /> element the attributes it passes to the App function component. The props
input parameter that the App function component accepts is in fact a JavaScript object,
populated from the attributes defined on the corresponding element. In this case props has
a property called name that carries the value Peter.

ReactDOM.render(<App name="Peter"/>,document.getElementById("book-club-app"));
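The way attributes become the props object can be sketched without React. In this stand-in (hypothetical, not React’s actual machinery) the component is just a function, and the attributes are collected into one plain object passed as its argument:

```javascript
// Stand-in for the App component: a plain function receiving props.
// Real code goes through React.createElement and ReactDOM.render.
const App = (props) => `Welcome to the Book Club, ${props.name}!`;

// Babel turns <App name="Peter" /> into a call that gathers the
// attributes into a single object:
const props = { name: "Peter" };

console.log(App(props)); // "Welcome to the Book Club, Peter!"
```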

If you open the appendix-a/sample03/public/index.html file using your favorite browser,
you’ll see the output in figure A.2.

Figure A.2 The output of the React program in listing A.8


Let’s have a look at how the App function component reads the values from the props input
parameter. As shown in the following code snippet, to refer to any JavaScript expression
from JSX code, you place it within curly braces. Here we read the value of the name attribute
defined on the <App /> element as props.name.

<h1>Welcome to the Book Club, {props.name}!</h1>

There is another, better way of reading values from the props input parameter. To do that
we change the function definition of the App component in the following way. Here we
destructure (section A.2.8) the props object, specifying within curly braces the exact names
of the attributes we expect, and the JSX code then refers to an attribute directly by name.

const App = ({name}) => (
  <div>
    <h1>Welcome to the Book Club, {name}!</h1>
    <Books />
  </div>
);

A.6 Managing state for React components


In this section you’ll learn how to manage the state of a React component. In section A.5 we
discussed how to pass properties from one React component to another. However, properties
are read-only and cannot be used to update the value of an attribute of a React component
over the component’s lifetime. A props object is constructed, with all its attributes, at the
time we render the corresponding React component (see the following code snippet). Once
constructed, it cannot be modified.

ReactDOM.render(<App name="Peter"/>,document.getElementById("book-club-app"));

The lifetime of a React component begins when you load it into the browser with the
ReactDOM.render method and ends when you reload the application, for example by
refreshing the browser or calling ReactDOM.render again. You may recall from our previous
examples that we don’t call ReactDOM.render to explicitly render every React component.
We call it explicitly only to load our core React component (for example, <App />); the other
React components are loaded by the core component implicitly.
Let’s say we want to build a React component that keeps asking the user to type a
number and shows the sum of the numbers typed by the user, one after the other. A user
types 2, for example, and the React component adds 2 to 0 and displays 2. Then the user
types 4 and the React component adds 4 to 2 (the previous sum) and displays 6. This is a
stateful React component and it has to maintain the state of the sum between the user
interactions.
To manage state in a React component we use hooks, which React introduced in version
16.8. As the name implies, a hook in React helps you hook additional functionality into your
React function components. In other words, a React hook is a function that wraps some
functionality, which you can use from your function

component. You can use the useState React hook, for example, to manage the state of your
React component. There are many other React hooks, and we’ll discuss some of them later in
this appendix.
In the following code listing we demonstrate how to use the useState React hook within a
function component. The listing shows only the React code that will be compiled by Babel.
You can find the complete HTML page in appendix-a/sample04/public/index.html.

Listing A.9 Using useState React hook to manage state of a React component
<script type="text/babel">
const App = () => {
  const [sum, setSum] = React.useState(0);
  return (
    <div>
      <h1>Sum of Numbers: {sum}</h1>
      <input type="text"
        onKeyPress={event => event.key == "Enter" ?
          setSum(parseInt(sum) + parseInt(event.target.value)) : undefined} />
    </div>
  );
}

ReactDOM.render(<App />, document.getElementById('number-sum-app'));
</script>

The most important line in listing A.9 is the following snippet, which calls the useState
function. The return value of useState is destructured into two constants. The setSum
constant is a function we use to update the value of sum. Since sum is a constant, we cannot
update its value directly; we must always go through the setSum function. When invoking
useState, we pass the initial value of sum, which here is 0.

const [sum,setSum] = React.useState(0);

Now let’s look at the following code snippet, which calls the setSum function when we type
a number in the input box and press the Enter key. Here we define an inline arrow function
for the onKeyPress event of the input box. Since this event is triggered for every key, we
first check whether the key pressed is the Enter key; then we read the value of the sum
constant from the state, parse it as an integer, add the number typed in the input box, and
call the setSum function to update the state. That’s all we need to do; React automatically
updates the value of sum everywhere the component refers to the sum constant.

<input type="text"
onKeyPress={event=>event.key=="Enter" ?
setSum(parseInt(sum)+parseInt(event.target.value)):undefined} />
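Stripped of React, the handler’s arithmetic is easy to trace. The sketch below is a plain-JavaScript stand-in for the state update in listing A.9; the real setSum also triggers a re-render, which is omitted here:

```javascript
// Plain-JS trace of the Enter-key handler in listing A.9.
let sum = 0;                               // what React.useState(0) starts with
const setSum = (next) => { sum = next; };  // the real setSum also re-renders

// What the onKeyPress handler does when Enter is pressed with this value:
const pressEnter = (typedValue) => {
  setSum(parseInt(sum) + parseInt(typedValue));
};

pressEnter("2");
console.log(sum); // 2
pressEnter("4");
console.log(sum); // 6
```

Note that event.target.value is a string, which is why the handler parses both the typed value and the stored sum with parseInt before adding them.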

A.7 Organizing a React application


All the React applications we developed so far embedded the HTML code, the React
components, and other JavaScript into a single HTML file, and Babel compiled the JSX


code into JavaScript the browser understands while the HTML page was being loaded. This
is not an optimal way of developing a React application, and it’s not how you build a React
application for a production environment. However, we followed that approach in all the
examples so far because it helps you understand how React works. In this section we’ll show
you how to organize your React application in a modular way.
Let’s first be clear about our objectives. Ultimately we want to produce a single HTML page
carrying everything we need to render our application in the browser. This is exactly what we
did in all our examples so far in this appendix, so we need to achieve the same result, but in
a different way. Ideally we want to keep the HTML code and the React components
independent from each other in our code repository (GitHub) and use some kind of tool to
aggregate the HTML code, React components, and all the other JavaScript files to build a
single HTML file. The rest of this section takes you through structuring a React project in a
better way.

A.7.1 Decouple the code to distribute from the rest of the dependencies


You can find a restructured React project inside the appendix-a/sample05 directory. Inside
the sample05 directory there are two subdirectories, public and src. Inside the public
directory you’ll find a file called index.html. This HTML file loads a JavaScript file called
bundle.js and has an HTML div element with the id book-club-app. As you might have
guessed, and as we have done so far in all our examples, we load our root React element
into the book-club-app HTML element (the div). The bundle.js file is the JavaScript file we
need to produce by aggregating all the React components and related JavaScript files. Just
to reiterate, React components are also JavaScript (as we’ve already learned, React is a
JavaScript library), so we need some kind of tool to generate the bundle.js file. We don’t
have the bundle.js file inside the public directory yet; we’ll see how to generate it later in
this section. When we want to distribute our React application, we share only the two files
under the public directory (the bundle.js file and the HTML file). The following listing shows
the content of the index.html file.

Listing A.10 The content of the public/index.html


<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8" />
  <title>Book Club</title>
</head>
<body>
  <div id="book-club-app"></div>
  <script src="bundle.js"></script>
</body>
</html>

A.7.2 Decouple the rendering code from other React components


Under the src directory you’ll find a JavaScript file called index.js. This file uses ReactDOM to
add the root React component to the browser DOM; in this case it adds the <App /> React
component to the HTML div element with the id book-club-app. Since all our JavaScript is
ultimately added to the index.html file (via the bundle.js file), the JavaScript in index.js
(listing A.11) can find the div element we already have in index.html and add the <App />
React element to it.

Listing A.11 The content of the src/index.js file


import React from "react";
import ReactDOM from "react-dom";
import App from "./components/App.js";

ReactDOM.render(<App />, document.getElementById("book-club-app"));

There are three important statements at the top of listing A.11. The first import statement
imports the React object from the react module. In our previous examples (see listing A.5)
we used the following script tag for the same purpose.

<script src="https://unpkg.com/react@16/umd/react.development.js"></script>

As you learnt at the beginning of this appendix, React is a JavaScript library. A JavaScript
library can be distributed as a JavaScript file or as a node package (there are other ways
too). In our previous examples we used a file to load the React JavaScript library from
https://unpkg.com/react@16/umd/react.development.js. We also used multiple other files to
load ReactDOM (https://unpkg.com/react-dom@16/umd/react-dom.development.js) and the
Babel (https://unpkg.com/babel-standalone@6/babel.min.js) JavaScript libraries.
However, the objective of this section is to have a single script file, bundle.js, that
aggregates all the JavaScript we use in our application, so we don’t need to ask the browser
to download multiple scripts from different locations. The tool we introduce later in this
section to build our React application loads the react node package (a node package is a
distribution unit that contains one or more JavaScript modules) from the central npm (node
package manager) registry and adds the corresponding JavaScript from that package to the
bundle.js file.
In section A.2.9 we discussed JavaScript modules, and you may recall how we imported
one module from another. There we had to refer to the module we wanted to load by
pointing to the corresponding JavaScript file that defines that module, as shown in the
following code.

<script type="module">
import {x, getSqrt} from "./math_mod.js";
console.log(getSqrt(x)); // prints 100
</script>

In listing A.11 we used two ways of loading modules. Just as in section A.2.9, we load the
App.js module by pointing to the JavaScript file that defines it, as shown in the following
code. This import statement imports the App component from the file components/App.js.
You can find the App.js file under the sample05/src/components directory; it defines a
JavaScript module and exports the App component.

import App from "./components/App.js";


We also use another type of import in listing A.11, as shown in the following code. Here we
load two modules, react and react-dom, by their bare names rather than by file path. To be
precise, the tooling that processes this script must understand how to find these two
modules and load them; in a Node.js project they are resolved from the node_modules
directory, which npm populates from the central registry at https://registry.npmjs.org by
default. If you’d like, you can instruct npm to use a different registry.

import React from "react";


import ReactDOM from "react-dom";

You probably have a question now: if you use import statements like these in your JavaScript
modules, and they run in the browser, how does the browser know where to find the
JavaScript associated with those imported modules? At the time of this writing, browsers do
not support this type of bare import statement; in the browser you must always specify
where to load the module’s JavaScript from, in the following way.

import App from "./components/App.js";

That raises another question: if browsers don’t support importing a module just by its name,
why did we use such imports in listing A.11? Because the build tooling resolves them for us:
webpack (with Babel compiling the code along the way) looks the modules up at build time
and includes all the resulting JavaScript in the bundle.js file that we are going to generate
shortly.

Let’s have a look at the components/App.js file (listing A.12), which defines the App
function component and exports it. By now you should understand how everything in this
file works.

Listing A.12 The content of the components/App.js file


import React from "react";

const App = () => {
  const [sum, setSum] = React.useState(0);
  return (
    <div>
      <h1>Sum of Numbers: {sum}</h1>
      <input type="text" onKeyPress={event => event.key == "Enter" ?
        setSum(parseInt(sum) + parseInt(event.target.value)) : undefined} />
    </div>
  );
}

export default App;

A.7.3 Aggregate all JavaScript code into a single file


Now we have all the code for our React application, and we need a way to aggregate all the
JavaScript content into the bundle.js file that the index.html file refers to. To do that we
need to complete a few more steps, listed below and explained in the following sections.
• Install Node.js


• Initialize the project with npm


• Install the node modules: react, react-dom, webpack, webpack-cli, babel-loader,
@babel/core, @babel/preset-env, and @babel/preset-react
• Create a build with webpack
• Configure Babel
• Build the project with npm

INSTALLING NODE.JS
Node.js is a JavaScript runtime built on the V8 JavaScript engine. You can download and
install Node.js following the instructions at https://nodejs.org/. If the
installation is successful, the following command should print the version of Node.js you
installed.

\> node -v
v12.18.1

We intend to run our React application in the browser, so why do we need another
JavaScript runtime like Node.js? There are two reasons.

1. We use a tool called npm to download the JavaScript modules our React application
uses from the npm central registry. Once we download all the modules our React
application depends on, we can aggregate them to generate the bundle.js file. The
npm tool comes with Node.js.
2. We use Babel to compile next-generation JavaScript and JSX code into JavaScript that
most browsers understand. You may recall that in our earlier examples we embedded
Babel as a script in our HTML code, and Babel did the compilation on the fly while the
HTML page loaded. That’s not an optimal way of doing things, so here we want Babel
to do the compilation before we generate the bundle.js file; the browser should
understand everything in the bundle.js file. To run Babel we need a JavaScript engine,
and we use Node.js for that.
INITIALIZE THE PROJECT WITH NPM
The node package manager, or npm, is a tool that comes with Node.js; if you have installed
Node.js, you also have npm on your system. To initialize your React project, run the
following command from the appendix-a/sample05 directory. Here we pass -y as an
argument so that npm uses the default settings.

\> npm init -y

Now if you look inside the sample05 directory, you’ll find a file named package.json. The
following listing shows its auto-generated content. Since we passed -y as an argument to
the npm command above, the content of the package.json file was generated using the
default settings. The package.json file records all the node packages our React application
depends on. We don’t see any yet in the generated file, but in the next section, when we
install the packages we need using the npm tool, npm will automatically update this file.


Listing A.13 The content of the generated package.json file


{
  "name": "sample05",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}

INSTALLING NODE MODULES


Let’s run the following command from the sample05 directory to install the react and
react-dom node packages. These packages carry the JavaScript modules of the React
library, which we need to run our application in the browser.

\> npm install react react-dom

After you run the above command, npm updates the package.json file. The following listing
shows the updated file, with a newly added dependencies section that references the react
and react-dom packages.

Listing A.14 The content of the updated package.json file


{
  "name": "sample05",
  "version": "1.0.0",
  "description": "",
  "main": "webpack.config.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "react": "^16.13.1",
    "react-dom": "^16.13.1"
  }
}

In addition to the dependencies our React application needs at runtime in the browser, we
need a few more node packages to support the project’s build process. We need babel-loader,
@babel/core, @babel/preset-env, and @babel/preset-react to translate next-generation
JavaScript and JSX code into JavaScript the browsers understand, and we need webpack and
webpack-cli to aggregate all the JavaScript our React application needs and generate the
bundle.js file. Let’s run the following npm command from the sample05 directory to install
those packages. Here we pass the --save-dev argument to tell npm that we need these
dependencies only at development time.


\> npm install webpack webpack-cli babel-loader @babel/core @babel/preset-env
@babel/preset-react --save-dev

The following listing shows the updated package.json file, with the newly added
devDependencies section.

Listing A.15 The content of the updated package.json file


{
  "name": "sample05",
  "version": "1.0.0",
  "description": "",
  "main": "webpack.config.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "react": "^16.13.1",
    "react-dom": "^16.13.1"
  },
  "devDependencies": {
    "@babel/core": "^7.11.6",
    "@babel/preset-env": "^7.11.5",
    "@babel/preset-react": "^7.10.4",
    "babel-loader": "^8.1.0",
    "webpack": "^4.44.2",
    "webpack-cli": "^3.3.12"
  }
}

Apart from updating the package.json file, the npm install command also creates a new
directory called node_modules under the sample05 directory. This directory contains all the
JavaScript files of the npm packages we installed. We do not need to make any changes to
those files.

CREATING A BUILD WITH WEBPACK


As discussed in the previous section, webpack aggregates all the JavaScript our React
application needs and generates the bundle.js file. To create a build with webpack, we need
to create a file called webpack.config.js under the sample05 directory; the following listing
shows its content. This file tells webpack the entry point of the React application, which in
this case is src/index.js. From that entry point webpack can find all the dependencies and
build a single JavaScript file containing everything our project needs. The output element of
the configuration defines the name of the output JavaScript file (bundle.js) and its location.


var path = require("path");

module.exports = {
  entry: "./src/index.js",
  output: {
    path: path.join(__dirname, "public"),
    filename: "bundle.js"
  },
  module: {
    rules: [{ test: /\.js$/, exclude: /node_modules/, loader: "babel-loader" }]
  }
};

To run the project build with webpack, we also need to update the package.json file under
the sample05 directory with the following content. Here we add a new element called build
under the scripts element, so when we execute npm run build, the webpack command we
have in the following code gets executed.

"scripts": {
"test": "echo \"Error: no test specified\" && exit 1",
"build": "webpack --mode production"
},

CONFIGURE BABEL
To configure Babel, we need to create a file named .babelrc under the sample05
directory. This file instructs Babel on how to carry out the compilation process. In it we
define a set of Babel presets; a Babel preset brings in a set of plugins that get executed
during compilation. The @babel/preset-env preset brings in everything you need to compile
ES6 and later JavaScript, and @babel/preset-react brings in all the Babel plugins you need
for a React application. The following listing shows the content of the .babelrc file.

{
"presets": ["@babel/preset-env", "@babel/preset-react"]
}

BUILD THE PROJECT


Now we’ve got everything ready to build the project. Run the following command from the
sample05 directory to build the project.

\> npm run build

Once the build succeeds, you’ll find a new file called bundle.js under the sample05/public
directory. To see the application in action, open the sample05/public/index.html file using
your favorite web browser.

A.8 Organizing a React application in an easy way


Section A.7 explained step by step what you need to do to create a React application in a
modular way. As you might have guessed, that process involves a lot of manual work that
can be automated. Create React App is a tool that helps


you automate the process: with a single command you can create a complete React
application.
Let’s install the Create React App tool with the following command. This is a one-time
installation, and you can run the command from anywhere on your machine. Here we pass
-g to tell npm to install the create-react-app package globally, so it’s available to any React
project you create.

\> npm install -g create-react-app

To create a React application using the tool, run the following command from the appendix-a
directory. Here we use npx, another tool that comes with the Node.js installation, known as
a package runner. We pass hello-react as an argument to the command; this is the name of
our React application.

\> npx create-react-app hello-react

Once the command runs successfully, it creates a directory named after the React
application, hello-react. Under the hello-react directory you’ll find a directory structure and a
set of files similar to what we created in section A.7. One thing to notice is that the
package.json file doesn’t list all the packages we had in section A.7. The trick is the
react-scripts package (which we didn’t have in section A.7): react-scripts is a special package
that bundles packages related to Babel, webpack, and more.
The template React application created by the above command is designed to run in a
web browser. If you simply open the index.html file in your browser, you’ll encounter errors
because of how file locations are resolved. The best way to test the app is to serve it from a
web server with the following command, run from the hello-react directory. It spins up a web
server and hosts the React application on localhost port 3000 by default.

\> npm start

You can now view hello-react in the browser.

Local: http://localhost:3000
On Your Network: http://10.0.0.129:3000

Note that the development build is not optimized.


To create a production build, use npm run build.
