OpenID Connect in Action v13
Over time, I’ve seen more and more applications adding support for OpenID Connect, which
is easily overtaking its most successful predecessor, SAML 2.0. Since 2016, most applications
developed globally use OpenID Connect for login.
You may have heard of OpenID, which was OpenID Connect’s predecessor. When I joined
WSO2 in 2007, my first task was to implement OpenID support for the open-source Identity
Server, which was called Identity Solution in those days. A few years later, in 2009, we
completed a large-scale deployment of Identity Server as an OpenID provider in Saudi Arabia
with a user base of 4 million. That was my first hands-on experience with OpenID. Later, I also implemented OAuth 1.0 support, and part of the OAuth 2.0 support, in the Identity Server
product. When OpenID Connect became mainstream, we added OpenID Connect support to the
Identity Server.
In this book, I focus on three types of applications: single-page applications, native mobile applications, and server-side web applications. I picked these types of applications because they
address almost all the common OpenID Connect use cases we see in practice today.
In addition to explaining OpenID Connect internals in detail while taking you through how
different applications integrate with OpenID Connect, the book also covers security pitfalls and
the best practices and guidelines to avoid those while integrating client applications with OpenID
Connect for login.
The sample applications in the book use Java and React. Although having some working
knowledge of those technologies will be helpful, it’s not a must. If you are comfortable with any
programming language, know the basics of JavaScript, and have a solid understanding of how
the web works in general, you are all set.
I hope you’ll find the book useful. Please post any questions, comments, or suggestions
you have about the book in the liveBook discussion forum. Your feedback is essential in
developing the best book possible.
—Prabath Siriwardena
1 Coursera (https://www.coursera.org), the famous online learning platform, for example, supports login with both Apple ID and Google ID. Hotels.com, eBay, Bloomberg, Reddit, Meetup, UPS, Rakuten, and many more also support login with both Apple ID and Google ID.
2 The OpenID Foundation has defined the standard for OpenID Connect, and the core specification is available at https://openid.net/specs/openid-connect-core-1_0.html.
An OpenID provider is an identity provider that supports the OpenID Connect standard as the protocol to communicate with client applications. 3
OpenID Connect is not the only standard out there, or the only option you have, to integrate your web and mobile applications with an identity provider for login. But by far, and for the foreseeable future, OpenID Connect is the most widely adopted technology for login among greenfield applications. Perhaps that’s why you’ve invested in a book on
OpenID Connect. You’ve made the right decision! Understanding how OpenID Connect works,
integrating OpenID Connect with your web and mobile applications, and understanding the
role of OpenID Connect in securing your APIs/microservices are key skills every developer
should possess.
In this book you will learn everything you need to know about OpenID Connect, and we don’t expect you to know anything about it other than that it is used for login.
The sample applications in the book use Java, React, and React Native. So, having some working knowledge of those technologies will be helpful, but it is not a must. If you are comfortable with any programming language, know the basics of JavaScript, and have a solid understanding of how the web works in general, you are all set to get started. Appendixes A and B of the book help you get a jumpstart with React and React Native, respectively.
3 In general, we call the identity provider that supports the OpenID Connect standard an OpenID provider – not an OpenID Connect provider.
4 OAuth 2.0 is about authorization, while OpenID Connect is about authentication. OpenID Connect uses the OAuth 2.0 protocol to transport attributes
related to a user’s identity from an identity provider to a client application.
We’ll also discuss what you need to worry about when picking an OpenID provider for your project. We’ll use popular open-source OpenID providers in our samples, and since how you set up those tools could vary from time to time with new releases, we keep the setup steps for those OpenID providers outside the book, in our GitHub repository (https://github.com/openidconnect-in-action/samples).
In simple terms, OpenID Connect is a standard that defines how a client application
communicates with an OpenID provider to identify a user. How exactly the communication
between a client application and the OpenID provider takes place is defined in the OpenID
Connect Core specification (https://openid.net/specs/openid-connect-core-1_0.html)
developed by the OpenID Foundation. There are a few more specifications developed by the OpenID Foundation to address other use cases around OpenID Connect. As we delve deeper into OpenID Connect, we’ll introduce you to those specifications. Also, in section 1.8 we
discuss OpenID Connect use cases.
Figure 1.1 The OpenID Connect specification defines how a client application can authenticate a user by talking to an OpenID provider. It further defines exactly how the messages are passed between the client application and the OpenID provider.
Figure 1.1 shows a typical OpenID Connect flow at a very high level. From the user’s point of view, it’s quite similar to the OAuth 2.0 authorization code grant type, which you’ll learn about in chapter 2. You’ll learn in detail what exactly happens under each arrow in chapter 3, when we explain how to use OpenID Connect to log into a single-page application. However, the following list describes, at a high level, the communication that happens between the client application, the OpenID provider, and the end user.
• In step 1, the client application redirects the user to the OpenID provider for
authentication. On eBay for example, once you click on Login with Google you are
redirected to Google for authentication.
• The OpenID provider first checks whether the user has a valid session under the domain of the OpenID provider, and if not, in step 2, challenges the user to authenticate. It also shows the end user which user attributes the client application is requesting. This step is outside the scope of the OpenID Connect protocol, and different OpenID providers have implemented it in different ways.
• In step 3, the end user authenticates to the OpenID provider and gives or rejects
their consent to share the requested attributes with the client application. This step is
also outside the scope of the OpenID Connect protocol.
• In step 4, the OpenID provider redirects the user back to the client application. Based on the OpenID Connect authentication flow you use (which we discuss in chapter 3), this step may return the requested attributes directly to the client application, or a temporary token that can be exchanged for the requested attributes via a direct call between the client application and the OpenID provider.
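To give you a feel for what step 1 looks like on the wire, the following is roughly what the redirect to the OpenID provider could look like (an illustrative sketch; the host, client_id, and other values are made up, and chapter 3 covers the exact parameters):

GET https://openid-provider.com/authorize?
response_type=code&
scope=openid&
client_id=application_id&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fclient.application.domain%2Flogin&
state=Xd2u73hgj59435&
nonce=n-0S6_WzA2Mj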
At its core, OpenID Connect defines two things:
• A schema for a token (which is a JSON Web Token (JWT)) and a set of processing rules around it. A JWT is a container that transports a set of claims (a set of attributes) from one point to another in a cryptographically safe manner. You’ll learn more about JWTs in chapter 4. The OpenID Connect specification identifies this token as the ID token, and you will learn more about the ID token in chapter 5. Following this list is a decoded ID token, showing only the header and the body parts. 5 The third part is the signature, which is not shown.
• A transport binding, which defines how to transport an ID token from one point to another. The OpenID Connect specification uses the term authentication flows to define the multiple ways you can transport an ID token from one point to another.
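For example, here are the decoded header and body of an illustrative ID token (the claim values are made up; chapter 5 explains each claim in the ID token):

{
  "alg": "RS256",
  "typ": "JWT",
  "kid": "1e9gdk7"
}
{
  "iss": "https://openid-provider.com",
  "sub": "24400320",
  "aud": "application_id",
  "exp": 1311281970,
  "iat": 1311280970,
  "nonce": "n-0S6_WzA2Mj"
}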
5 A JWT can be a JWS (JSON Web Signature) or a JWE (JSON Web Encryption). In practice most of the time an ID token is a JWS. However, it can be a JWE as
well. If an ID token is a JWS, then it has three parts, and if the ID token is a JWE, then it has five parts. For further details please check chapter 4.
Most applications of OpenID Connect use both the token type and the transport binding; for example, when you use Login with Google on eBay. But there are still some applications that rely only on the token (the ID token). Those applications use the ID token as the contract to transfer attributes in a cryptographically safe manner. For example, Kubernetes uses an ID token to authenticate to the Kubernetes API server, and the SPIFFE JWT-SVID profile defines the structure of the JWT as an ID token. 6
6 If you are new to Kubernetes and/or SPIFFE please check the Appendix H and Appendix J of the book, Microservices Security in Action (Manning
Publications, 2020) by Prabath Siriwardena and Nuwan Dias.
7 OpenID is an obsolete protocol today. However if you are interested in reading more about OpenID, please check this presentation done by Prabath
Siriwardena (the author of this book) in 2008: https://www.slideshare.net/prabathsiriwardena/understanding-openid.
8 In general, we call the identity provider that supports the OpenID Connect standard an OpenID provider – not an OpenID Connect provider.
We can call Facebook an OAuth 2.0 authorization server, but we cannot call Facebook an OpenID provider, because it does not support OpenID Connect.
Figure 1.2 A typical OAuth 2.0 authorization code grant type flow. The client application, following the authorization code grant type, obtains an access token from the authorization server to access a resource on behalf of the user (the resource owner).
One of the key features in OAuth 2.0 is grant types. A grant type defines how a client
application can get an access token from an OAuth 2.0 authorization server (see figure 1.2).
This access token lets the client application access a resource (an API, a microservice) on behalf of the owner of the resource. If a client application, for example, wants to access your Facebook photos via the Facebook API (GET graph.facebook.com/me/photos), the client application first has to obtain an access token from the Facebook authorization server, with your consent, and use that access token along with the API call to access the photos. The photos are the resources, and you are the resource owner. The me in the above API represents you, the owner of the resource (and the owner of the access token). The following table summarizes the mapping between OAuth 2.0 terminology and Facebook terminology.
Table 1.1 The summary of the mapping between OAuth 2.0 terminology and Facebook terminology

OAuth 2.0 terminology: Facebook terminology
Resource owner: you, the Facebook user who owns the photos
Resource: your Facebook photos
Resource server: the server hosting the Facebook API (graph.facebook.com)
Authorization server: the Facebook server that issues access tokens
Client: the third-party application that wants to access your photos
Access token: the token the client application uses to call the Facebook API on your behalf
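For example, once the client application has obtained an access token, the API call to fetch the photos could look like the following (a sketch; the token value is made up):

\> curl \
-H "Authorization: Bearer de09bec4-a821-40c8-863a-104dddb30204" \
https://graph.facebook.com/me/photos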
With OAuth 2.0, the client application gets an access token from the authorization server.
This token does not necessarily identify the end user (or the resource owner). It’s only good enough to access a resource on behalf of the resource owner. The client application can
access a resource even without knowing who the resource owner is. The bottom line is,
OAuth 2.0 does not let the client application know who the user is, and it only lets the client
application access a resource on behalf of the user. OAuth 2.0 is not about authentication,
but authorization.
OpenID Connect, which is built on top of OAuth 2.0, defines how you can extend OAuth 2.0 to identify a user. In simple terms, OpenID Connect is the identity layer built on top of OAuth 2.0. When you use OpenID Connect, in addition to the access token, you also get an identity (ID) token. The OpenID provider uses a structure (or a container) called the ID token to transport claims to the client applications. As discussed in section 1.2, the ID token is in fact
a JSON Web Token (JWT).
An ID token can carry multiple assertions. An assertion (or a claim) is a strong statement about someone or something that can be cryptographically verified by the recipient of the assertion. Typically, you can call an OpenID provider (or any identity provider) an issuer of claims or assertions. An assertion can be an attribute assertion, an authentication assertion, or an authorization assertion.
• An attribute assertion: A strong statement about someone or something that carries a set of attributes. Your driving license, for example, issued to you by the Department of Motor Vehicles (DMV), carries a set of attribute assertions: name, address, eye color, hair color, gender, date of birth, license number, and so on.
• An authentication assertion: A strong statement issued with respect to how the OpenID provider authenticated the user during the login flow. The authentication assertion might carry the username of the user and how the issuer authenticated the user before issuing the assertion.
• An authorization assertion: A strong statement issued with respect to the corresponding user’s entitlements. Based on the authorization assertion, the recipient of the assertion (or the client application) can decide how to act. When an issuer, for example, shares a user’s age or date of birth with a client application, it’s an attribute assertion; but if the issuer says the user is entitled to buy a beer (being old enough) without sharing the age, then it’s an authorization assertion.
In chapter 3, we’ll discuss in detail how OpenID Connect extends OAuth 2.0. For the time
being, let’s conclude that OAuth 2.0 is about authorization, while OpenID Connect is about
authentication.
1.5 How login with Facebook works around OAuth 2.0 for
authentication
As we concluded in section 1.4, OAuth 2.0 is about authorization. Then why are there zillions of web sites using Login with Facebook to authenticate users? Facebook uses OAuth 2.0, not OpenID Connect. In this section we discuss how client applications work around OAuth 2.0 to authenticate users. In chapter 3 we will discuss how OpenID Connect builds an identity layer on top of OAuth 2.0, and why you should use OpenID Connect for authentication rather than building your own ways around OAuth 2.0.
At the end of an OAuth 2.0 flow you get an access token. This access token is for the client application to access a resource (an API, a microservice) on behalf of the resource owner. In other words, it’s not for the consumption of the client application itself. To put it another way, the access token’s audience is not the client application but the resource (while the audience of the ID token, which comes with OpenID Connect, is the client application). So, client applications should not try to interpret the meaning of an access token.
The OAuth 2.0 access token a client application gets at the end of the Login with Facebook flow, for example, is for the consumption of the Facebook API, not the client application. It’s an opaque token for the client application, which should not try to decode the access token and interpret its meaning. There are two types of access tokens commonly used: reference tokens and self-contained tokens. At the time of this writing, Facebook only uses reference tokens.
• A reference token is just a random string value generated by the authorization server. It carries no meaningful data and only makes sense to the issuer of the token. When an API (or the recipient of the token) wants to validate a reference token, it has to talk to the authorization server (or the issuer of the token) every time.
• A self-contained access token contains some useful information with respect to the user, the client application, scopes, and so on, and it’s a JSON Web Token (JWT). We discuss JWTs in detail in chapter 4.
In either case, whether the access token you get is a reference token or a self-contained
token, the client application should not try to interpret the data embedded into the token. It
should only use an access token to access a resource on behalf of the resource owner.
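To visualize the difference, here is roughly what the two types of access tokens could look like (illustrative values only):

A reference token, an opaque random string:
de09bec4-a821-40c8-863a-104dddb30204

A self-contained token, a JWT with three base64url-encoded parts separated by dots (header.body.signature, truncated here):
eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJwZXRlciJ9.SflKxwRJSMeKKF2QT4...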
Figure 1.3 Client applications use Login with Facebook to authenticate users, which uses OAuth 2.0
authorization code grant type underneath. To identify the user, the client applications access an OAuth 2.0-
protected Facebook business API with the access token received in step 4.
Let’s get back to the Facebook use case. Once a user completes the Login with Facebook flow, the client application gets an access token (figure 1.3). This access token does not carry any user information, but it is good enough to access the Facebook API, https://graph.facebook.com/me.
The request to the Facebook API returns the information about the owner of the access token you pass along with the request. The level of data exposed via this
API is governed by the OAuth 2.0 scopes. We discuss OAuth 2.0 scopes in detail in chapter 2.
For the time being, think about scopes as permissions. The scopes attached to an access
token define what you can do with that token.
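For example, the client application’s request to this API, and a minimal response, could look like the following (the token and user values are made up):

\> curl \
-H "Authorization: Bearer de09bec4-a821-40c8-863a-104dddb30204" \
https://graph.facebook.com/me

{
  "id": "210936692",
  "name": "Peter Doe"
}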
This is how Login with Facebook works around OAuth 2.0 to authenticate users. At the end of the OAuth 2.0 flow, the client application has to use a nonstandard API provided by Facebook to authenticate users, and that is outside the scope of the OAuth 2.0 specification. So, OAuth 2.0 still acts as an authorization framework, and to identify a user, the client applications talk to a Facebook business API. In chapter 3, we discuss the limitations of this approach and why you should use OpenID Connect instead.
OpenID was quite similar to OpenID Connect in terms of the end user experience. OpenID (not OpenID Connect) and SAML 2.0 Web SSO came out in the same year to address similar concerns. Yet most of the enterprises adopted SAML 2.0 Web SSO, while most of the web sites that required community interactions adopted OpenID. The pedigree of SAML 2.0, with the support from all the members of the Liberty Alliance (http://www.projectliberty.org/), the Shibboleth initiative (https://www.shibboleth.net/), and many who were already using SAML version 1.1, helped SAML 2.0 become a widely adopted standard. At the same time, SAML 2.0 addressed most of the enterprise SSO use cases, while OpenID fell behind. However, OpenID established itself as a successful standard in the web community.
Even today, a few years after OpenID Connect replaced OpenID, SAML 2.0 Web SSO is still the most widely used standard for SSO (even though most of the new greenfield applications today use OpenID Connect). The popularity of SAML 2.0 Web SSO started to fade along with the popularity of XML in the early 2010s. The identity industry started looking for a JSON-based standard that is as strong as SAML 2.0 Web SSO in terms of enterprise-level use cases and security. OpenID Connect filled that gap. Unlike OpenID, OpenID Connect today is as strong as SAML 2.0 and addresses key enterprise use cases. If you start building any application today, we recommend using OpenID Connect instead of SAML 2.0 Web SSO.
When you transfer identity attributes among two or more trust domains, we call that process identity federation.
When sharing identity attributes, the application in the recipient domain only accepts
identity attributes from a trusted OpenID provider. The most common way to establish trust
among domains is via X509 certificates. The OpenID provider signs the attributes it shares
with the client applications, and the client applications validate the signature using the
corresponding public certificate of the OpenID provider, which is known to the client
applications already.
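As a minimal sketch of what that signature validation could look like in code, the following uses the open source Nimbus JOSE+JWT library (this is our assumption about tooling, not necessarily what the book’s samples use; error handling is omitted):

import java.security.interfaces.RSAPublicKey;

import com.nimbusds.jose.crypto.RSASSAVerifier;
import com.nimbusds.jwt.SignedJWT;

public class IdTokenSignatureCheck {

    // Returns true if the token's signature verifies against the OpenID
    // provider's public key, which the client application already trusts.
    public static boolean isSignatureValid(String idToken, RSAPublicKey providerKey)
            throws Exception {
        SignedJWT jwt = SignedJWT.parse(idToken);           // parse the three-part JWS
        return jwt.verify(new RSASSAVerifier(providerKey)); // verify the RSA signature
    }
}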
OpenID Connect, SAML 2.0 Web SSO, WS-Federation, and Central Authentication Service (CAS) are the standard ways of implementing identity federation. All of them define a protocol for how to transport identity-related attributes from an identity provider to a client application. For the same reason, these standards are also called identity federation standards. When you hear someone say OpenID Connect is a standard for identity federation, now you know what it means!
1.9 The benefits of having one trusted identity provider for multiple
client applications
The most traditional way of integrating login with an application is form-based
authentication. The application itself asks the user to provide their credentials and connects
to a credential store, which could be an LDAP server (OpenLDAP, OpenDS, OpenDJ, Microsoft
Active Directory and so on) or a database to verify the credentials (figure 1.4). In this
section we discuss the drawbacks of this design and the advantages you gain by moving to a
model where you have one trusted identity provider (an OpenID provider).
Figure 1.4 Each application itself requests the user to provide their credentials and posts those to a backend system to validate. To share the credentials with an application, the user has to trust each of them, which broadens the attack surface. Each application has to worry about how to securely accept and transmit user credentials.
If you have experience in using SAML 2.0 Web SSO or any other protocol equivalent to OpenID Connect, you are probably familiar with this topic and can safely skip this section and move to section 1.10.
1.9.1 Having one trusted identity provider means you have a single source of truth
Having each application manage user credentials would probably work for a small company with a few applications and one team to own them all. In this design you expect each application to handle user credentials in a responsible manner. The application has direct access to the credentials a user shares during the login flow, as well as direct access to the credential store. This puts a lot of trust on the implementation and the deployment of the application. You can probably do that when you are a small company and you have just one team to handle everything. But as you scale, and when you have multiple teams developing applications, or you start onboarding applications from different vendors, you cannot trust each application to handle user credentials in a secure way.
Figure 1.5 Each application trusts the OpenID provider for authenticating users and the OpenID provider acts
as the single source of truth. Only the OpenID provider has direct access to user credentials. To authenticate a
user to a given application, the corresponding application redirects the user to the OpenID provider.
OpenID Connect helps decouple applications from user credentials. Only the OpenID provider needs direct access to the credential store, and only the OpenID provider accepts credentials from the users. This makes the OpenID provider the single source of truth. All the other applications only need to establish a trust relationship with the OpenID provider, and do not need to trust each and every individual user (see figure 1.5). 9 If the OpenID provider says it’s a legitimate user in the system, the client applications simply take its word.
In chapter 3 we’ll discuss how to establish a trust relationship between the OpenID provider
and the client applications.
1.9.2 Having one trusted identity provider helps implement single sign-on (SSO) across multiple client applications
When we have multiple applications that trust a single OpenID provider for login, a user only needs to log in to that OpenID provider once; that is, the first time they log into an application via the OpenID provider. When logging in to other applications via the same OpenID provider, users do not need to type their credentials again and again. From the end user’s point of view, this provides a single sign-on experience. As we discussed in section 1.8, in this book we only talk about web SSO.
9 The most common way for an application to establish trust with an OpenID provider is to trust the public certificate corresponding to that OpenID provider. The OpenID provider signs all identity attributes it shares with the client applications, and the client applications can verify the signature using the corresponding public certificate.
OpenID Connect by design maintains the logged-in user’s session under a single domain, which is the OpenID provider’s domain. Here, the domain means the HTTP domain name of the OpenID provider, which you access via a web browser. For example, when you use your Google ID to log in to eBay, the Google OpenID provider maintains your logged-in session under the accounts.google.com domain name.
Multiple applications that trust an OpenID provider for login will redirect users to the OpenID provider’s domain for authentication. Since the OpenID provider maintains the users’ logged-in sessions under its own domain name, it can detect whether a given user has already logged in. If the user is already logged in, the OpenID provider can skip asking that user for credentials. Typically, the OpenID provider uses browser cookies to detect whether a given user has a valid login session. From the end user’s point of view, this builds an SSO experience.
1.9.3 A single place to implement and configure multiple login options for user
authentication
If each application has to deal with user authentication independently, then each application has to know how to implement and maintain those authentication protocols. This could be doable when you simply rely on username/password-based authentication. But to address the current threats in cyber security and the ever-increasing number of data breaches globally, most modern applications rely on multi-factor authentication (MFA) and adaptive authentication. With MFA, most of the time you use one more login option along with username/password-based authentication. This second factor could be a one-time passcode (OTP) sent over SMS or email, Fast Identity Online (FIDO) 2.0, a time-based OTP (TOTP) using something like the Google Authenticator app, and so on 10.
With adaptive authentication, you decide how you want to authenticate users based on some contextual parameters. These contextual parameters, which you use to decide which login options to apply, can be the role of the user, the time of day, the location of the user, the number of failed login attempts, or a risk score derived dynamically during the login flow. If the risk score is high, for example, you can use FIDO 2.0 as the second factor, while if the risk score is medium, you can use OTP over SMS. Typically, to find the risk score, your application has to connect to a risk engine. 11
These are complex scenarios that we cannot expect each application to handle by itself. It’s a hard ask for an application developer to build support for these authentication options in their applications, as implementing them requires specialized skills and needs to be tested thoroughly.
Since the design of OpenID Connect helps you decouple applications from the identity provider, which is also the single source of truth, we don’t want individual applications to implement the above complex authentication options. All the client applications rely on an OpenID provider to authenticate users, and the OpenID provider can implement all the complex authentication options. Then again, you should not worry about building an OpenID provider yourself; rather, use one already available. Section 1.8 lists some of the available, production-ready, open source OpenID providers.
1.9.4 Having one trusted identity provider helps to bootstrap trust with external
identity providers
Most enterprises, as they grow, start building relationships with external partners, suppliers, and many others who need access to their internal applications. One straightforward way of onboarding a partner is to create an account in your local credential store for each employee of the partner who needs access to your internal applications. This approach looks simple and hassle free, but it has a major security concern. Here, we do not trust the partner company, but rather individuals of that company. If an individual leaves the corresponding company, unless they notify you and you consciously remove their accounts from your system, they will still have access. Beyond the security concern, there is a usability concern too from the user’s point of view. Now the employees of the partner company have to maintain one more set of credentials.
The ideal approach to fixing these kinds of problems is to build trust between the two parties – or the two companies. How do you do that? One approach is for each and every application in your domain to trust an identity provider running in your partner’s domain. 12 This may work to some extent if you have one application and one partner. As you scale and start adding more applications and partners, managing trust relationships between applications and partners could lead to a maintenance nightmare. 13 We also call this the spaghetti identity pattern, as you build multiple point-to-point connections between applications and partners. Here are some of the drawbacks of this approach.
12 Each application needs to know the public certificate associated with each external identity provider. When an application gets a set of identity
attributes from an external identity provider, it validates the signature of identity attributes using the corresponding public certificate.
13 The most common way to build a trust relationship is via X509 certificates. If an application trusts a partner, that application should know the X509 certificate (or the public certificate) associated with the corresponding partner.
• It’s hard to enforce a centralized governance model for the company that decides which partner should have access to which application.
• Each application has to trust one or more partner domains. Every time you add or remove a partner, you need to update multiple systems.
• Each application has to deal with protocol/claim transformations by itself. For example, when your application supporting OpenID Connect has to connect to a partner identity provider that supports SAML 2.0 Web SSO, your application should know how to transform a SAML token to an OpenID Connect token. We discuss this in section 1.7.5 in detail.
The centralized identity provider approach helps overcome all the above drawbacks (see figure 1.6). You have one identity provider that all your applications trust, and you build trust relationships between that identity provider and the partner identity providers. That way, all your applications only need to trust your own internal identity provider and only need to know how to talk to it. This internal identity provider bootstraps trust with partners (or with partner identity providers). 14
Figure 1.6 The internal identity provider bootstraps trust with partner identity providers and does claim/protocol transformations as expected by the client applications connected to it. All the client applications only need to trust the internal identity provider, and only need to support the federation protocol the internal identity provider supports.
14 Bootstrapping trust refers to the process you go through to introduce a new partner into the system as a trusted identity provider. During this process, you upload the public certificate of the partner identity provider to your internal identity provider, so the internal identity provider can validate the signature of the identity attributes coming from the corresponding partner identity provider.
15 WS-Federation and CAS are federation protocols similar to OpenID Connect and SAML 2.0 Web SSO. However, they are not that popular.
16 Typically, an identity provider shares claims as name/value pairs. The name of the claim is also called the claim URI, and the value is called the claim value.
The client application can be a single-page application, a server-side web application, a native mobile application, or an application running on a browserless device such as a smart TV. In chapter 3 we discuss in detail how you can integrate OpenID Connect with a single-page application developed in React, and in chapter 6 we discuss how you can use OpenID Connect to log into a server-side web application developed in Java. Then in chapter 9 we discuss how to integrate OpenID Connect for login with a native mobile application developed in React Native.
17 In general most of the OpenID providers have their own ways of configuring what attributes need to be shared with which client applications. These
configurations are done out-of-band by application developers directly interacting with the corresponding OpenID provider.
Figure 1.7 Coursera, the famous online learning platform, supports signup with both Apple ID and Google ID,
which use OpenID Connect underneath. Once the login flow is completed, the client application checks
whether the user has a valid local account with it; if not, requests the user to sign up.
18 There are more than 70 million small businesses on Facebook, using it on a day-to-day basis for login.
19 Sri Lanka banned Facebook and some other social media once in 2018 to avoid the spread of false news during a communal riot and again in 2019
during the Easter day bombing attack.
20 When an API receives a request from a client application along with an OAuth 2.0 access token, it always validates the token by talking to an
authorization server it trusts.
Figure 1.8 The client application exchanges the ID token it got from the OpenID provider it trusts for an OAuth 2.0 access token from the authorization server the API trusts. The authorization server only accepts the token the client application presents in step 2 if it trusts the corresponding OpenID provider. The trust between the OpenID provider and the authorization server, and the trust between the API gateway and the authorization server, can be established out of band.
In a scenario where the authorization server does not know how to authenticate the users, or
the authorization server has no control over how the client applications authenticate their
users, we need to worry about federation.
When a client application wants to access an API on behalf of a logged-in user, it has to bring an OAuth 2.0 access token from the authorization server the API trusts. Assuming the client application has its own trusted OpenID provider to authenticate its users, it won’t be able to use the OAuth 2.0 access token it gets during the OpenID Connect login flow to access the API. The API only trusts its own authorization server, not the OpenID provider attached to the client application.
To fix this problem, the client application has to talk to the authorization server the API trusts and get an access token for the logged-in user. In doing that, the client application can pass the ID token of the corresponding user, which it got from the OpenID provider it trusts during the login flow (see figure 1.8).
During this token exchange, the authorization server will issue an access token for the user attached to the ID token, if it trusts the OpenID provider that issued the corresponding ID token. So, every time the client application accesses the API, it passes an access token issued by the authorization server the API trusts.
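One standard way to implement this exchange is the JWT profile for OAuth 2.0 authorization grants (RFC 7523), where the client application presents the ID token as an assertion to the token endpoint. The following is a sketch under that assumption (the endpoint and token values are made up; an authorization server may support other exchange mechanisms instead):

\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "grant_type=urn:ietf:params:oauth:grant-type:jwt-bearer" \
--data-urlencode "assertion=eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJwZXRlciJ9.SflKx..." \
https://localhost:8085/oauth/token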
Here’s an alphabetical list of popular cloud-based identity management solutions that support
OpenID Connect.
• Auth0 : https://auth0.com
• Azure AD: https://azure.microsoft.com/en-us/services/active-directory
• Okta: https://www.okta.com
• OneLogin: https://www.onelogin.com
• Ping One: http://pingone.com
Mostly, the focus of this book is to help you integrate your client applications with an OpenID provider. The client application can be a web application, a single-page application, a native mobile application, or an application running on a browserless device such as a smart TV.
To build these applications, based on your preferred programming language, you need to pick a library that knows how to deal with the OpenID Connect-specific nitty-gritty, and build your application on top of it. Here’s a list of open source libraries that you can use to develop OpenID Connect client applications.
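For example, the following are a few widely used options (an illustrative selection, not an exhaustive list; check each project’s site for current status):
• AppAuth (iOS, Android, and JavaScript): https://appauth.io
• oidc-client-js (JavaScript/single-page applications): https://github.com/IdentityModel/oidc-client-js
• Nimbus OAuth 2.0 SDK with OpenID Connect extensions (Java): https://connect2id.com/products/nimbus-oauth-openid-connect-sdk
• pac4j (Java): https://www.pac4j.org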
1.13 Summary
• OpenID Connect is an identity layer built on top of OAuth 2.0.
• OpenID Connect has roots in OpenID, but OpenID Connect and OpenID are not
compatible standards and OpenID is no longer used.
• Most of the greenfield applications developed today use OpenID Connect.
• OpenID Connect facilitates single sign-on, identity federation, attribute sharing, single logout, and many more use cases.
• There are multiple benefits to having one trusted identity provider for a given enterprise, rather than having each application implement support for multiple heterogeneous identity federation protocols.
• There are many open source implementations of OpenID providers and OpenID Connect client libraries; you pick an OpenID provider based on your requirements, and a client library based on the type of the client and the technology you prefer to use.
• What is OAuth 2.0 and how does it fix the access delegation problem?
• The actors of an OAuth 2.0 flow
• OAuth 2.0 grant types and client types
• What’s new in OAuth 2.1?
In chapter 1 you learned that OpenID Connect is an open standard developed by the OpenID Foundation on top of the OAuth 2.0 specification. Also in chapter 1 you learned that OpenID
Connect is an identity layer built on top of OAuth 2.0. OAuth 2.0 is the cornerstone of
OpenID Connect, and in this chapter we delve deeply into OAuth 2.0. If you are already
familiar with OAuth 2.0, you can safely skip this chapter.
In this chapter you’ll also learn about the new changes proposed by the successor to OAuth 2.0: OAuth 2.1. OAuth 2.1 is still at the draft stage at the time of this writing. Even if you are familiar with OAuth 2.0 but still want to learn what’s new in OAuth 2.1, it’s worth going through this chapter.
This chapter does not cover all the bits and pieces of OAuth 2.0. OAuth 2.0 has evolved a
lot since its inception in 2012. If you’re interested in understanding OAuth 2.0 in detail, we
recommend Advanced API Security: OAuth 2.0 and Beyond (Apress, 2019) by Prabath
Siriwardena and OAuth 2 in Action (Manning, 2017) by Justin Richer and Antonio Sanso.
OAuth 2.0 is an authorization framework developed by the Internet Engineering Task Force (IETF) OAuth working group. It’s defined in RFC 6749 (https://tools.ietf.org/html/rfc6749). The fundamental focus of OAuth 2.0 is to fix the access delegation problem, and in section 2.1.1 we discuss what the access delegation problem is. In section 2.1.2 you’ll learn how OAuth 2.0 fixes the access delegation problem, and in section 2.1.3 you’ll learn why we call OAuth 2.0 a framework.
1 The main difference is that OAuth 2.0 is more extensible than OAuth 1.0. OAuth 1.0 is a concrete protocol, whereas OAuth 2.0 is an authorization
framework.
Figure 2.1 A third-party application follows the model of access delegation with no credential sharing in order
to get a temporary token from Facebook, which is only good enough to read a user’s status messages.
The temporary token Facebook issues has a limited lifetime and is bound to the Facebook
user, the third-party web application, and the purpose. The purpose of the token here is to
read the user’s Facebook status messages, and the token should be only good enough to do
just that and no more. The OAuth 2.0 terminology is as follows:
• The Facebook user is called the resource owner. The resource owner decides who
should have which level of access to the resources they own.
• Facebook, which issues the token, is called the authorization server. The authorization
server knows how to authenticate (or identify) the resource owner, and grants access
to third-party applications to access resources owned by the resource owner, with
their consent. The authorization server also knows how to identify these third-party
applications.
• The Facebook API is called the resource, and the server that hosts all the resources is called the resource server. The resource server guards the resources owned by
the resource owner, and lets someone access a resource only if the access request
comes along with a valid token issued by an authorization server the resource server
trusts.
• The third-party web application is called the client. The client consumes a resource on
behalf of the resource owner.
• The token Facebook issues to the third-party web application is called the access
token. The authorization server issues access tokens, and the resource server
validates those. To validate an access token, the resource server may talk to the
authorization server. We used the term ‘may’ here because, when we use self-contained
access tokens, which we discuss in section 2.6, the resource server does not need to
talk to the authorization server to validate the token.
• The purpose of the token is called the scope. The resource server makes sure a given
token can be used only for the scopes attached to it. If the third-party application
tries to write to the user’s Facebook wall with the access token it got to read the
status messages, that request will fail. We discuss scopes in detail in section 2.5.
• A grant type defines the flow of events that happens during the process of the third-
party web application getting an access token from the authorization server. OAuth
2.0 defines a set of grant types, which we discuss in section 2.4.
2 Many European countries follow the UK Open Banking (https://standards.openbanking.org.uk/) standard to implement the support for Payment
Services Directive 2 (PSD2). PSD2 is a data and technology-driven directive that aims to drive increased competition, innovation and transparency
across the European payments market.
In a typical access delegation flow, a client application accesses a resource that’s hosted on a
resource server on behalf of a resource owner with a token provided by an authorization
server. This token grants access rights to the client to access a resource on behalf of the
resource owner.
Figure 2.2 In a typical OAuth 2.0 access delegation flow, a client accesses a resource that’s hosted on a
resource server, on behalf of the end user, with a token provided by the authorization server.
• Client credentials—Suitable for client applications that directly access resources with
no end-users. The client application itself is the resource owner (we discuss this in
section 2.3.1). A Weather App (the client application), for example, can use the client
credentials grant type to obtain an access token, and use it to access the Weather API
(the resource).
• Resource owner password—Suitable for applications the authorization server trusts (we discuss this in section 2.3.2). This should be avoided at all costs, and the OAuth 2.1 specification has removed the resource owner password grant type. 3
• Authorization code—Suitable for almost all the applications with an end user (we discuss this in section 2.3.4).
• Implicit—Don’t use it! (we discuss this in section 2.3.5). The OAuth 2.1 specification has removed the implicit grant type.
• Refresh token—Used for renewing expired access tokens. The refresh token grant type is a little different from the other four grant types listed above, and we discuss the differences in section 2.3.3.
The OAuth 2.0 framework isn’t restricted to these five grant types. It’s an extensible
framework that allows you to add grant types as needed. The following are two other popular
grant types that aren’t defined in the OAuth 2.0 RFC but in related profiles:
• SAML Profile for OAuth 2.0 Client Authentication and Authorization Grants —Suitable
for applications having single sign-on using SAML 2.0 (defined in RFC 7522).
• JWT Profile for OAuth 2.0 Client Authentication and Authorization Grants —Suitable for
applications having single sign-on using OpenID Connect (defined in RFC 7523).
3 The OAuth 2.1 specification is in draft at the time of this writing, and whenever we mention OAuth 2.1 in the rest of this chapter, we refer to the draft specification, which is available at https://www.ietf.org/archive/id/draft-ietf-oauth-v2-1-00.html.
Figure 2.3 The client credentials grant type lets an application obtain an access token with no end user; the
application itself is the end user.
4 RFC 8705, OAuth 2.0 Mutual-TLS Client Authentication and Certificate-Bound Access Tokens, is a specification by the IETF working group that standardizes the usage of mutual TLS for client authentication: https://www.rfc-editor.org/rfc/rfc8705.html.
Here’s a sample curl command for the client credentials grant request (this is just a sample,
so don’t try it out as is):
Listing 2.1 A sample token request with client credentials grant type
\> curl \
-u application_id:application_secret \ #A
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials&scope=create_customer update_payment" \
https://localhost:8085/oauth/token
#A The client application uses HTTP Basic Authentication with its client_id and client_secret.
In listing 2.1, the value application_id is the client ID, and the value application_secret
is the client secret of the client application. The -u parameter instructs curl to perform a
base64-encoded operation on the string application_id:application_secret. The
resulting base64-encoded string that’s sent as the HTTP Authorization header to the
authorization server would be YXBwbGljYXRpb25faWQ6YXBwbGljYXRpb25fc2VjcmV0.
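If you want to reproduce that encoding yourself, you can do it with a one-liner (the -n flag suppresses the trailing newline, which would otherwise change the output):

\> echo -n "application_id:application_secret" | base64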
Even though we use a client secret (application_secret) in the curl command in
listing 2.1 to authenticate the client application to the token endpoint of the authorization
server, the client application can use mTLS instead (see listing 2.2), if stronger
authentication is required. In that case, we need to have a public/private key pair for the
client application, and the authorization server must trust the issuer of the public key or the
client certificate.
Listing 2.2 A sample token request with client credentials grant type with mutual TLS
\> curl \
--cert client.crt \ #A
--key client.key \ #B
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=client_credentials&scope=create_customer update_payment" \
https://localhost:8085/oauth/token
#A The public certificate corresponding to the client application’s private key.
#B The private key of the client application.
In addition to client authentication, the token requests in code listings 2.1 and 2.2 also carry the following parameters:
• The grant_type parameter is a required parameter for all the grant types, and for the client credentials grant type, its value must be client_credentials.
• Optionally, you can also specify the expected purpose of the access token with the scope parameter. Not just for the client credentials grant type: you can pass the scope parameter along with the token request or the authorization request under all the grant types. If a given grant type has an authorization request (for example, the authorization code grant type and the implicit grant type), then the scope parameter goes with the authorization request; if not, with the token request. As in code listing 2.2, you can specify multiple values for the scope parameter, each separated by a space. In section 2.5 you’ll learn more about the scope parameter.
The authorization server validates the request in listing 2.2 and issues an access token as in
the following HTTP response:
Listing 2.3 A sample token response from the authorization server for client credentials grant
type
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"token_type":"bearer",
"scope":"create_customer",
"expires_in":3599
}
The following list describes the parameters included in the token response from the authorization server, as in code listing 2.3:
• The access_token parameter carries the access token the authorization server issues.
This is a required parameter.
• The token_type parameter specifies the type of the token. In practice, at the time of
this writing most of the OAuth 2.0 implementations use bearer tokens and in section
2.6 we discuss token types in detail. This is a required parameter.
• The expires_in parameter hints to the client application about the lifetime of the access token, in seconds from the time the token is issued. This is not a required parameter, but it is recommended. In the absence of the expires_in parameter in the response, the client application should have some other means to find the token expiration. Otherwise, the client can simply keep using the access token to access the corresponding resource, and if the token is expired, the resource server will respond with an error (HTTP 401 status code); at that point the client application can request a new token from the authorization server. We discuss an approach to request a new token in section 2.3.3.
• The scope parameter in the response carries the scopes associated with the issued access token. This is only required in the token response if the authorization server issued the access token for a subset of the requested scopes. If the access token is issued for the same set of scopes requested by the client application, it’s optional to have the scope parameter in the token response.
The client credentials grant type is suitable for applications that access APIs and that don’t
need to worry about an end user. Simply put, it’s good when you need not be concerned
about access delegation, or in other words, the client application accesses an API just by
being itself, not on behalf of anyone else. Because of this, the client credentials grant type is
mostly used for system-to-system authentication when an application, a periodic task, or any
kind of a system directly wants to access an endpoint that is protected with OAuth 2.0.
Let’s take a weather API, for example. It provides weather predictions for the next five
days. If you build a web application to access the weather API, you can simply use the client
credentials grant type because the weather API isn’t interested in knowing who uses your
application. It is concerned with only the application that accesses it, not the end user.
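If you’d rather make the same token request from Java instead of curl, a minimal sketch using the JDK’s built-in HTTP client (Java 11 or later) could look like the following; the endpoint and credentials are the same made-up sample values as in listing 2.1:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Base64;

public class ClientCredentialsTokenRequest {

    public static void main(String[] args) throws Exception {
        // HTTP Basic authentication: base64(client_id:client_secret)
        String credentials = Base64.getEncoder()
                .encodeToString("application_id:application_secret".getBytes());
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://localhost:8085/oauth/token"))
                .header("Authorization", "Basic " + credentials)
                .header("Content-Type", "application/x-www-form-urlencoded")
                // the two scopes are separated by a URL-encoded space (%20)
                .POST(HttpRequest.BodyPublishers.ofString(
                        "grant_type=client_credentials&scope=create_customer%20update_payment"))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body()); // the JSON token response, as in listing 2.3
    }
}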
Figure 2.4 The password grant type allows an application to obtain an access token.
The following is a sample curl command for a token request following the resource owner
password grant type made to the authorization server (this is just a sample, so don’t try it
out as is):
Listing 2.4 A sample token request with cURL under resource owner password grant type
\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=password&username=user&password=pass" \
https://localhost:8085/oauth/token
As with the client credentials grant type, the application_id and application_secret are
sent in base64-encoded form in the HTTP Authorization header, using HTTP Basic
authentication. In this case, the authorization server validates not only the client ID and client secret, but also the username and password of the end user.
As with the client credentials grant type, upon successful authentication, the authorization
server responds with a valid access token, as shown in the following code listing. Except for the refresh_token parameter in the response, we discussed the meaning of all the other parameters in section 2.3.1:
Listing 2.5 A sample token response from the authorization server under resource owner
password grant type
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"refresh_token":" heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3",
"token_type":"bearer",
"expires_in":3599
}
The value of the refresh_token parameter you find in the response (listing 2.5) can be used to renew the current access token before it expires. We discuss refresh tokens in detail in section 2.3.3. You might have noticed that we didn’t get a refresh_token in the client credentials grant type. You only need a refresh_token to refresh a given access token offline, when the resource owner is not present or when the client application has no interactions with the resource owner. But with the client credentials grant type, the client application itself is the resource owner, and the client application has access to the resource owner’s (the application’s own) credentials all the time. So, to refresh an access token, the client application does not need another token; it can simply do that with its own credentials. That’s why the authorization server does not return a refresh_token for the client credentials grant type.
With the password grant type, the resource owner (the user of the application) needs to provide their username and password to the client application. Therefore, this grant type should be used only with client applications that are trusted by the authorization server. This model of access delegation is called access delegation with credential sharing. It is, in fact, what OAuth 2.0 wanted to avoid. Then why is it in the OAuth 2.0 specification? The only reason the password grant type was introduced in the OAuth 2.0 specification was to help legacy applications using HTTP Basic authentication migrate to OAuth 2.0; otherwise, you should avoid using the password grant type where possible, and the OAuth 2.1 specification has officially removed the password grant type.
As with the client credentials grant type, the password grant type requires the application to store the client secret securely. It’s also critically important to deal with the user credentials responsibly. Ideally, the client application must not store the end user’s password locally; it should use the password only to get an access token from the authorization server and then forget it. The access token the client application gets as the response to the token request has a limited lifetime. Before this token expires, the client application can get a new token by using the refresh_token received in the token response from the authorization server. This way, the client application doesn’t have to prompt for the user’s username and password every time the token on the application expires.
Figure 2.5 The refresh token grant type allows a token to be renewed when it expires.
The following curl command can be used to renew an access token with the refresh token
grant type (this is just a sample, so don’t try it out as is):
Listing 2.6 A sample cURL request with refresh token grant type
\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=refresh_token&
refresh_token=heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3" \
https://localhost:8085/oauth/token
As we discussed in section 2.3.1 and 2.3.2, the application’s client ID and client secret
(application_id and application_secret) must be sent in base64-encoded format as the
HTTP Authorization header, using HTTP Basic authentication.
In addition to client authentication, the token refresh request in code listing 2.6 also carries the following parameters:
• The request body contains the grant_type parameter, and its value must be set to refresh_token. This is a required parameter.
• The refresh_token parameter carries the value of a valid refresh token the client application already has. This is a required parameter. The refresh token grant type should be used only with applications that can store the client secret and refresh token values securely, without any risk of compromise. You will learn in section 2.3.5 that the implicit grant type cannot store tokens securely, so client applications that use the implicit grant type do not get a refresh token.
• The refresh token request can optionally include the scope parameter, which carries one or more identifiers that are known to the authorization server. The value of the scope parameter must be equal to the scope of the original access token, or to a subset of the scopes attached to the already issued access token that you want to refresh. Then again, why would we need to refresh an access token for a subset of the scopes attached to the original access token? Practically, we would do this to avoid over-scoped access tokens. It’s too early to discuss what this really means in this chapter; you’ll find more details about over-scoped access tokens in chapter 10.
The refresh token usually has a limited lifetime, but it’s generally much longer than the
access token’s lifetime, so an application can renew its access token even after a significant
duration of idleness. When you refresh an access token, in the response, the authorization
server sends the renewed access token, along with another refresh token. This refresh token
may or may not be the same refresh token you get in the first request from the authorization
server. It’s up to the authorization server; it’s not governed by the OAuth 2.0 specification.
The following listing shows the response from the authorization server for an access
token refresh request. This is the same response we discussed under listing 2.5 in section
2.3.2.
Listing 2.7 A sample response for the access token refresh request
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"refresh_token":" heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3",
"token_type":"bearer",
"expires_in":3599
}
Figure 2.6 The authorization code grant type allows a client application to obtain an access token on behalf of
an end user (or a resource owner).
As shown in figure 2.6, the first step of the client application is to initiate the authorization
code request. The HTTP request to get the authorization code looks like the following (this is
just a sample, so don’t try it out as is):
Listing 2.8 A sample authorization request to the authorization server under authorization code
grant type
GET https://localhost:8085/oauth/authorize?
response_type=code&
client_id=application_id&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fweb.application.domain%2Flogin
The authorization request in listing 2.8 carries the following parameters:
• The response_type parameter indicates to the authorization server that an
authorization code is expected as the response to this request. This parameter is only
included in the authorization requests from the client application to the authorization
endpoint of the authorization server. In the client credentials grant type and the resource
owner password grant type there is no authorization request, but only a token
request from the client application to the token endpoint of the authorization server.
That's why you didn't see the response_type parameter in those two grant types.
OpenID Connect heavily uses the response_type parameter and in chapter 3 we
discuss this in detail. In the authorization code grant type, the value of the
response_type parameter must be code and it’s a required parameter.
• The client_id parameter carries the identifier given to the client application by the
authorization server. This request includes only the client_id and no client secret. In
this step, the authorization server does not authenticate the client application; rather,
it uses the client_id to identify the application and load the configuration related to it.
This is a required parameter.
• The redirect_uri in the request should be equal to the redirect_uri provided when
registering the corresponding client application at the authorization server. This is an
optional parameter. If the client application does not provide a redirect_uri in the
authorization request, the authorization server picks the already registered
redirect_uri. However, a client application can register multiple redirect_uris at
the authorization server, and pick one of them by sending the preferred
redirect_uri in the authorization request. Also, some authorization servers support
registering a regular expression as the redirect_uri, so the redirect_uri in the
request must match with the registered regular expression.
• One optional parameter that’s not included in listing 2.8 is the scope parameter.
When making the authorization request, the application can request the scopes it
requires on the token to be issued. We discuss scopes in detail in section 2.5.
• Another optional parameter that's not included in listing 2.8 is the state parameter.
The client application can send any value in the state parameter and can expect the
same value in the response from the authorization server. Although this is an optional
parameter, using it is recommended, and in chapter 10 we discuss how a client
application can use the state parameter to mitigate possible cross-site request forgery
attacks.
Upon receiving the authorization request in listing 2.8, the authorization server first validates
the client ID and the redirect_uri; if these parameters are valid, it presents the user with
the login page of the authorization server (assuming that no valid user session is already
running on the authorization server). The user needs to enter their username and password
on this login page. When the username and password are validated, the authorization server
issues the authorization code and provides it to the user agent via an HTTP redirect (figure
2.6). The authorization code is part of the redirect_uri, as shown in the following listing,
which is the response from the authorization server for the authorization request in listing 2.8.
Listing 2.9 A sample authorization response under authorization code grant type
https://web.application.domain/login?code=hus83nn-8ujq6-7snuelq
The authorization response in listing 2.9 carries the following parameters:
• The code parameter in the response carries the authorization code the authorization
server generates. The code is sent to the user agent via the redirect_uri, and it
must be passed over HTTPS. Also, because this is a browser redirect, the value of the
authorization code is visible to the end user, and also may be logged in server logs.
To reduce the risk that this data will be compromised, the authorization code usually
has a short lifetime (no more than 30 seconds) and is a one-time-use code. The client
application later uses the authorization code (before it expires) to talk to the token
endpoint of the authorization server to get an access token. If the code is used more
than once, the authorization server revokes all the tokens previously issued against it.
This is a required parameter.
• If the authorization request (listing 2.8) had the optional state parameter in it, then
the authorization server must include the same value in the response, as a query
parameter.
As shown in listing 2.9, the authorization code is provided as a query parameter in an HTTP redirect
(https://developer.mozilla.org/en-US/docs/Web/HTTP/Redirections) on the provided
redirect_uri. The redirect_uri is the location to which the authorization server should
redirect the browser (user agent) upon successful authentication.
In HTTP, a redirect happens when the server sends a response code in the 3xx range; in
this case, the response code is 302. The response contains an HTTP header named
Location, and the value of the Location header is the URL to which the browser should
redirect. The URL (host) in the Location response header should be equal to the
redirect_uri query parameter (listing 2.8) in the HTTP request used to initiate the
authorization grant flow:
Location: https://web.application.domain/login?code=hus83nn-8ujq6-7snuelq
Upon receiving the authorization code, the client application issues a token request to the
token endpoint of the authorization server, requesting an access token in exchange for the
authorization code. The following is a curl command of such a request (step 6 in figure 2.6;
this is just a sample, so don't try it out as is):
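Listing 2.10 A sample cURL request to the token endpoint under authorization code grant type
\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=authorization_code&
code=hus83nn-8ujq6-7snuelq&
client_id=application_id&
redirect_uri=https%3A%2F%2Fweb.application.domain%2Flogin" \
https://localhost:8085/oauth/token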
Like the other grant types discussed so far, the authorization code grant type requires the
client ID and client secret (optional) to be sent as an HTTP Authorization header in base64-
encoded format using HTTP Basic authentication. Also, as discussed in section 2.3.1, the
client application can use a much stronger form of client authentication. In
addition to client authentication, the token request in listing 2.10 also carries the
following parameters:
• The request body contains the grant_type parameter, and its value must be set to
authorization_code. This is a required parameter.
• The code parameter is the authorization code the client application got from the
authorization server, as in listing 2.9. This is a required parameter.
• The value of the client_id parameter is the same client identifier that we had in
listing 2.8, in the authorization request. This is a required parameter. Then again, why
do we need to include the client_id parameter again in the HTTP POST body when we
are already sending it in the HTTP Authorization header for client authentication?
Client authentication doesn't need to happen with the client ID and client secret all
the time. As mentioned before, the client application can pick a stronger authentication
option that is supported by the authorization server. So, the client_id parameter
carried in the HTTP POST body of the token request helps the authorization server
identify the client application irrespective of the authentication method the client uses.
• The value of the redirect_uri must be identical to the value of the redirect_uri
parameter we had in the authorization request in listing 2.8. If the client application
added the redirect_uri to the authorization request, it must add the same to the
token request as well.
Upon validation of the token request in listing 2.10, the authorization server issues an access
token to the client application in an HTTP response. The following listing shows the same
response we discussed under listing 2.5 in section 2.3.2.
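Listing 2.11 A sample response for the token request under authorization code grant type
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"refresh_token":"heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3",
"token_type":"bearer",
"expires_in":3599
}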
As you’ve seen, the authorization code grant type involves the user, client application, and
authorization server. Unlike the password grant type, the authorization code grant type
doesn’t require the user to provide their credentials to the client application. The user
provides their credentials only on the login page of the authorization server. This way, you
prevent the client application from learning the user's login credentials. Therefore, this grant
type is suitable for web, mobile, and desktop applications that you don't fully trust with the
user's credentials.
A client application that uses the authorization code grant type must meet some
prerequisites to use it securely. Because the application needs to know and deal with
sensitive information, such as the client secret, refresh token, and authorization code, it
needs to be able to store and use these values with caution. For example, it needs to have
mechanisms for encrypting the client secret and refresh token when storing them, and to
use HTTPS for secure communication with the authorization server. The communication
between the client application and the authorization server needs to happen over TLS so that
network intruders can't see the information being exchanged.
Figure 2.7 The implicit grant type allows a client application to obtain an access token.
With the implicit grant type, when the user attempts to log in to an application, the client
application initiates the login flow by creating an implicit grant request. This request should
contain the client ID and the redirect_uri. The redirect_uri, as with the authorization
code grant type, is used by the authorization server to redirect the user agent back to the
client application when authentication is successful. The following is a sample authorization
request the client application sends to the authorization endpoint of the authorization server
under the implicit grant type (this is just a sample, so don’t try it out as is):
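Listing 2.12 A sample authorization request under implicit grant type
GET https://localhost:8085/oauth/authorize?
response_type=token&
client_id=application_id&
redirect_uri=https%3A%2F%2Fweb.application.domain%2Flogin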
As you can see in the HTTP requests, the only difference between the authorization code
grant type's initial request and the implicit grant type's initial request (listing 2.12) is that
the response_type parameter in this case is token. This indicates to the authorization server that the client
application is interested in getting an access token as the response to the implicit
authorization request. As with the authorization code grant type, here too scope is an
optional parameter that the user agent (or the client application) can provide to ask the
authorization server to issue a token with the required scopes.
Once the authorization server receives this request (listing 2.12), it validates the client ID
and the redirect_uri, and if those are valid, it presents the user with the login page of the
authorization server (assuming that no active user session is running on the browser against
the authorization server). When the user enters their credentials, the authorization server
validates them and, only if a scope is provided in the request, presents the user with a
consent page to acknowledge that the application is given the required permissions to
perform the actions denoted by the scope parameter.
Note that the user provides credentials on the login page of the authorization server, so
only the authorization server gets to know the user’s credentials. When the user has
consented to the required scopes, the authorization server issues an access token and
provides it to the user agent on the redirect_uri itself as a URI fragment. The following is
an example of such a redirect:
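https://web.application.domain/login#access_token=de09bec4-a821-40c8-863a-104dddb30204&token_type=bearer&expires_in=3599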
The response from the authorization server carries the same set of
parameters we discussed under section 2.3.1 (listing 2.2). The access_token and token_type
parameters are required, and optionally the response from the authorization server can
include the expires_in, scope, and state parameters. If the state parameter was included in
the authorization request, the authorization response must have the state parameter with
the same value as in the request.
When the user agent (web browser) receives this redirect, it makes an HTTPS request to
the web.application.domain/login URL. Because the access_token field is provided as a
URI fragment (denoted by the # character in the URL), that particular value doesn’t get
submitted to the server on web.application.domain. Only the authorization server that
issued the token and the user agent (web browser) get to know the value of the access
token. The implicit grant type doesn't provide a refresh token to the user agent. As we
discussed earlier in this section, because the value of the access token is passed in the URL,
it will remain in the browser history, where it could possibly be exposed.
The implicit grant type doesn’t require your client application to maintain any sensitive
information, such as a client secret or a refresh token. This fact made it a good candidate for
use in SPAs, where rendering the content happens on web browsers through JavaScript.
These types of applications execute mostly on the client side (browser); therefore, these
applications are incapable of handling sensitive information such as client secrets. But still,
the security concerns in using the implicit grant type outweigh its benefits, and
it's no longer recommended, even for SPAs. OAuth 2.1 has removed the implicit
grant type from the specification. As discussed in the previous section, the
recommendation is to use the authorization code grant type with no client secret, even for
SPAs. In chapter 10 we discuss in detail the security concerns associated with the implicit
grant type.
In contrast to OAuth 2.0, the OAuth 2.1 specification proposes three types of clients:
confidential, credentialed, and public. Both the confidential and public clients carry the
same meaning as we discussed before with respect to OAuth 2.0. At the time OAuth 2.0
defined the confidential client type, it was assumed that the authorization server issues these
credentials to a client application during the application registration process, with proper
verification. And most of the time, given a client id, the authorization server knows the
application developer who registered the client application.
But with the introduction of RFC 7591: OAuth 2.0 Dynamic Client Registration Protocol
(https://tools.ietf.org/html/rfc7591), one can register a client application dynamically and
obtain a client ID and a client secret. An authorization server that supports RFC 7591
provides an endpoint to dynamically register client applications, and this endpoint can be
protected or open. If it's open, anyone having access to the environment can register an
application.
The OAuth 2.1 specification proposes to differentiate client applications that carry verified
secrets (the confidential clients) from the client applications that register themselves with an
open dynamic client registration endpoint. To identify the latter type of clients, OAuth 2.1
introduces a new client type called credentialed. The credentialed clients have credentials (a
client ID and a client secret); however, the authorization server has not verified the identity
of the client application before issuing them.
Figure 2.8 The client application requests an access token along with the expected set of scopes. When the
access token is a self-contained JWT, the resource server validates the token by itself, without talking to the
authorization server.
retrieve the token. So, even if someone steals the token, they won’t be able to use it outside
the TLS channel, which was used to obtain the token.
The OAuth 2.0 Token Binding specification was built on top of few other specifications:
Token Binding over HTTP (https://tools.ietf.org/html/rfc8473), Transport Layer
Security (TLS) Extension for Token Binding Protocol Negotiation
(https://tools.ietf.org/html/rfc8472), and The Token Binding Protocol Version 1.0
(https://tools.ietf.org/html/rfc8471). Also, for the solution proposed in this specification to
work, the authorization servers and the resource servers had to support the new TLS
extension RFC 8472 proposed, and browsers too had to have built-in support.
So, in practice, due to the lack of implementation support, OAuth 2.0 token binding never
became popular. You can learn more about OAuth 2.0 token binding from this blog
(https://medium.facilelogin.com/oauth-2-0-token-binding-e84cbb2e60), which I wrote in
2017.
After OAuth 2.0 token binding failed to become mainstream, mostly due to the lack of
support from other systems, the IETF OAuth working group came up with another approach
to support proof-of-possession tokens with the OAuth 2.0 Demonstrating Proof-of-Possession
at the Application Layer specification (https://tools.ietf.org/html/draft-ietf-oauth-dpop-02),
which is at the draft stage at the time of this writing. This approach, commonly known as
DPoP, does not require any changes to browsers or TLS implementations; the proposed
changes can be implemented at the application layer. In chapter 10 we discuss DPoP in
detail. When an authorization server returns an access token as per the DPoP specification,
it has to set the value of the token_type parameter to DPoP.
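For example, a token response under DPoP might look like the following (a sketch only; the token values are illustrative):
{
"access_token":"de09bec4-a821-40c8-863a-104dddb30204",
"token_type":"DPoP",
"expires_in":3599
}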
In contrast, if the token is a self-contained token, the resource server can validate the token itself; there’s no need
to talk to the authorization server. A self-contained token is a signed JWT or a JWS (see chapter 4). The JWT Profile for
OAuth 2.0 Access Tokens (which is in its tenth draft at the time of writing), developed under the IETF OAuth working
group, defines the structure for a self-contained access token.
In addition to the above RFCs, there are a few interesting discussions happening in the IETF
OAuth working group that are well worth being aware of as an application developer. The
following list highlights some of the draft specifications under discussion at the time of this
writing.
Since the introduction of OAuth 2.0 in 2012, its usage has gone well beyond the initial
expectations. OAuth 2.0 is the de facto standard for securing APIs, microservices, and much
more. Also, OAuth 2.0 is not only being used over HTTP. In the microservices world, for
example, OAuth 2.0 is being used to secure gRPC communications, Kafka topics, and so on.
The Microservices Security in Action (Manning, 2020) book, which I co-authored with Nuwan
Dias, explains how to use OAuth 2.0 to secure microservices. Open Banking is another area
where OAuth 2.0 is seeing rapid adoption. For all of these use cases, security and
interoperability are a top priority, which OAuth 2.1 tries to address.
The following list shows some of the key changes OAuth 2.1 has introduced:
• Implicit and resource owner password grant types are no more. As we already discussed
in this chapter, even with OAuth 2.0, these two grant types are not recommended.
• In addition to the confidential and public client types in OAuth 2.0, OAuth 2.1
introduces a new client type: credentialed. In section 2.4, we discussed the
credentialed client type.
• The authorization code grant is extended with the functionality from Proof Key for
Code Exchange (RFC 7636). We discuss RFC 7636 in detail in chapter 5.
• Under OAuth 2.0, when a client application accesses a resource, it can send the
access token either as an HTTP header or a query parameter. OAuth 2.1 removes the
support to send an access token as a query parameter.
• While validating an authorization request, the authorization server validates the
redirect_uri from the request against one already registered at the authorization server
during application registration. With OAuth 2.1, the authorization server must make
sure that the redirect_uri in the authorization request exactly matches the
redirect_uri registered at the authorization server.
2.9 Summary
• OAuth 2.0 is an authorization framework developed by the Internet Engineering Task
Force (IETF) OAuth working group. It's defined in RFC 6749
(https://tools.ietf.org/html/rfc6749).
• The fundamental focus of OAuth 2.0 is to fix the access delegation problem, which is
to let someone else access a resource you own, on your behalf.
• OpenID Connect is a standard developed by the OpenID Foundation, on top of the
OAuth 2.0 specification.
• There are four key roles defined in the OAuth 2.0 specification: client application,
authorization server, resource server, and resource owner.
• An OAuth 2.0 grant type defines a request/response flow to get an access token from
the authorization server.
• A grant type defines four key components: authorization request, authorization
response, access token request, and access token response. However, not all
grant types implement all four key components mentioned above.
• The OAuth 2.0 RFC identifies five main grant types: authorization code, implicit,
resource owner password, client credentials, and refresh token. Each grant type outlines
the steps for obtaining an access token.
• An access token can be either a reference token or a self-contained token.
• OAuth 2.1 does not introduce drastic new changes on top of OAuth 2.0. It tries to
simplify OAuth 2.0 for developers.
• OAuth 2.1 dropped resource owner password and implicit grant types from the
specification.
• OpenID Connect authentication flows and how they differ from OAuth 2.0 grant types
• How implicit authentication flow works with a single-page application
• How authorization code flow works with a single-page application
• Why you need to pick authorization code flow over implicit flow
• Securing a React-based single-page application with OpenID Connect
With the heavy adoption of APIs, over time, single-page applications (SPAs) have become one
of the most popular options for building client applications on the web. If you are new to
single-page application architecture, we recommend you first go through the book SPA
Design and Architecture: Understanding Single Page Web Applications by Emmit Scott
(Manning Publications, 2015). Also, in this chapter we assume you have a good knowledge of
OAuth 2.0, which is the fundamental building block of OpenID Connect. In case you don't,
please go through chapter 2 before reading the rest of this chapter.
Under OAuth 2.0 terminology, a SPA is identified as a public client application. 1 In
principle, a public client application is unable to hide any secrets from its users. Most of
the time a SPA is an application written in JavaScript that runs on the browser; so, anything
on the browser is visible to the users of that application. This is a key deciding factor in how
you want to use OpenID Connect to secure a SPA.
In this chapter we’ll teach you different OpenID Connect authentication flows and how
those flows work with a SPA. You will also learn how to build a SPA using React and then log
1 In chapter 2, under the section 2.4 we discuss OAuth 2.0 client types and their characteristics.
in to it via OpenID Connect. React is the most popular JavaScript library for developing user
interfaces. If you are new to React, please go through appendix A first.
Figure 3.1 OpenID authentication flows define how the client application communicates with the OpenID
provider to authenticate an end user. Some communications happen via the web browser and some happen
directly between the client application and the OpenID provider.
In section 3.3 you learn how implicit flow works and in section 3.9 how authorization code
flow works. We’ll discuss hybrid flow in detail in chapter 6.
Table 3.1 The differences in the terminology, OAuth 2.0 vs. OpenID Connect

OAuth 2.0 | OpenID Connect
Authorization request: Initiated from the client application to the authorization server. The scope and redirect_uri are optional parameters in the authorization request. | Authentication request: Initiated from the client application to the OpenID provider. The scope and redirect_uri are required parameters in the authentication request.
Authorization response: Initiated from the authorization server to the client application | Authentication response: Initiated from the OpenID provider to the client application
Access token request: Initiated from the client application to the authorization server | ID token request: Initiated from the client application to the OpenID provider
Access token response: Initiated from the authorization server to the client application | ID token response: Initiated from the OpenID provider to the client application
The OpenID Connect specification defines the authentication flows in a self-contained
manner. So we should not confuse the OAuth 2.0 grant types with OpenID Connect
authentication flows. The authorization code flow in OpenID Connect is not the same as the
authorization code grant type in OAuth 2.0, and the implicit flow in OpenID Connect is not
the same as the implicit grant type in OAuth 2.0.
2 RFC 6749 in fact introduced five grant types. However, the behavior of the refresh token grant type is quite different from the other four: authorization code,
implicit, password and client credentials. When we say OAuth 2.0 defines four grant types, we refer to those four core grant types. As discussed in
chapter 2, OAuth 2.1 has removed the implicit and password grant types from the specification.
Figure 3.2 The client application uses the implicit authentication flow to communicate with the OpenID
provider to authenticate the user. With the implicit authentication flow, all the communications between the
client application and the OpenID provider happen via the user agent.
The request the client application generates in step 1 of figure 3.2 is called an
authentication request. You may recall from chapter 2 that in OAuth 2.0, under the
implicit grant type, the request initiated from the client application to the OAuth 2.0
authorization server is called an authorization request.
The following listing shows an example of an authentication request. This is in fact a URL
constructed by the client application, which takes the user to the authorization endpoint of
the OpenID provider when the user clicks on the login link (this is just a sample, so don't
try it out as is).
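Listing 3.1 A sample authentication request under the implicit authentication flow
https://openid-provider.example.com/authorize?
client_id=application_id&
redirect_uri=https%3A//app.example.com/redirect_uri&
scope=openid profile& #A
response_type=id_token token& #B
state=Xd2u73hgj59435&
nonce=89hj37b3gd3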
#A The scope values are separated by a space. However, when you type this on the browser, the browser will URL
encode the space, so the space will be replaced by %20.
#B The response_type values are separated by a space. However, when you type this on the browser, the browser will
URL encode the space, so the space will be replaced by %20.
Let’s go through the query parameters added to the authentication request by the client
application, as shown in listing 3.1. The definition of these parameters are consistent across
all three authentication flows the OpenID Connect defines, however, the values may change.
• client_id: This is an identifier the OpenID provider uses to uniquely identify a client
application. The client application gets a client_id after registering itself at the
OpenID provider. For registration at the OpenID provider, you can either follow an
out-of-band mechanism provided by the OpenID provider or use the OpenID Connect
dynamic client registration API. 3 The client_id is a required parameter in the
authentication request, and is originally defined in the OAuth 2.0 specification, which
we discussed in detail in chapter 2.
• redirect_uri: This is an endpoint that belongs to the client application. After
successfully authenticating the user and getting the user's consent to share the
requested data with the client application, the OpenID provider redirects the user to
the redirect_uri endpoint along with the requested tokens (step 5 of figure 3.2).
During the client registration process at the OpenID provider, you need to share with
the OpenID provider the exact URI you use for the redirect_uri parameter in the
authentication request.
The OpenID provider will do one-to-one matching of the value of the redirect_uri in
the authentication request against the one already registered by the client application.
Most OpenID providers do an exact match between these two URIs. However, some
OpenID providers let the client applications register multiple URIs, and some let the
client applications define a regular expression for the validation of the redirect_uri.
3 An out-of-band mechanism could be a developer registering a client application using the UI provided by the OpenID provider.
The OpenID Connect specification defines four scope values (profile, email, address
and phone) in addition to the openid scope. A client application can use any of these
scope values to request claims from the OpenID provider. In chapter 5, we discuss in
detail how to use scopes to request claims.
In the implicit flow there are two possible values: id_token or id_token token. If
the value of the response_type is id_token, then the authorization endpoint will only
return back an ID token, and if the response_type is id_token token, then the
authorization endpoint will return back an ID token and an access token.
If you set the value of the response_mode parameter to query, then all the response
parameters are added to the redirect_uri as a query string, as shown below.
https://app.example.com/redirect_uri?token=XXXXX&id_token=YYYYYY
If you set the value of the response_mode parameter to fragment, then all the
response parameters are added to the redirect_uri as a URI fragment as shown
below.
https://app.example.com/redirect_uri#token=XXXXX&id_token=YYYYYY
In addition to the query and fragment, the OAuth 2.0 Form Post Response
Mode (https://openid.net/specs/oauth-v2-form-post-response-mode-1_0.html)
specification defines another response_mode called form_post, and we’ll
discuss that in chapter 6.
The response_type and response_mode parameters are related to each other. If you
do not specify a response_mode parameter in the authentication request, then the
default response_mode associated with the corresponding response_type gets
applied automatically. If the response_type is id_token or id_token token
(implicit flow), for example, then the corresponding default response_mode is
fragment (table 3.2). That means, when you use the implicit flow, the
OpenID provider sends back the response parameters as a URI fragment.
In section 3.6 we discuss the differences between a URI fragment and a
query string.
Table 3.2 The default response_mode values for the corresponding response_type
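response_type | Default response_mode
code | query
id_token | fragment
id_token token | fragment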
Figure 3.3 The client application uses the implicit authentication flow to communicate with the OpenID provider
to authenticate the user. This duplicates figure 3.2 for readability purposes.
• state: The value of the state parameter is just a string, which is added to the
authentication request by the client application, and the OpenID provider must return
the same value (unchanged) in the response (step 5 in figure 3.3). In section
3.5 we discuss the use cases of the state parameter, and in chapter 10 we
discuss how to use the state parameter to mitigate some possible security threats.
In addition to the authentication request parameters we discussed in the above list, there
are a few more optional ones: display, prompt, max_age, ui_locales, id_token_hint, and
acr_values. We'll discuss them in detail in chapter 6.
THE OPENID PROVIDER VALIDATES THE AUTHENTICATION REQUEST AND REDIRECTS THE USER BACK TO THE BROWSER FOR
AUTHENTICATION
Once the OpenID provider validates the authentication request from the client application, it
checks whether the user has a valid login session under the OpenID provider's domain. Here
the domain is the HTTP domain name that you use to access the OpenID provider using a
web browser. If the user has logged into the OpenID provider already from the same web
browser, then there exists a valid login session, unless it's expired.
If the user does not have a valid login session, then the OpenID provider will challenge
the user to authenticate (step 2 in figure 3.3) and will also get the user's consent to share
the requested claims with the client application. In step 3 of figure 3.3 the user types the
login credentials, and in step 4 of figure 3.3 the browser posts the credentials to the OpenID
provider. Steps 2, 3, and 4 are outside the scope of the OpenID Connect specification and
up to the OpenID providers to implement in the way they prefer. Figure 3.4 shows a
sample login page the Google OpenID provider pops up during the login flow.
Figure 3.4 A sample login screen for user authentication from the Google OpenID provider.
THE OPENID PROVIDER RETURNS BACK THE REQUESTED TOKENS TO THE CLIENT APPLICATION
In step 5 of figure 3.5, the OpenID provider returns back the requested tokens to the client
application. If the client application, for example, requested only an ID token in step 1, by
having id_token as the value of the response_type parameter in the authentication
request, then the OpenID provider only returns back an ID token as shown below.
https://app.example.com/redirect_uri#id_token=YYYYYYYY&state=Xd2u73hgj59435
If the value of the response_type parameter was id_token token, then the OpenID
provider will return back both the ID token and the access token as shown below.
https://app.example.com/redirect_uri#access_token=XXXXXX&token_type=Bearer&expires_in=3600&
id_token=YYYYYYYY&state=Xd2u73hgj59435
In both the above cases the OpenID provider returns the tokens as a URI fragment. That's
because, if you do not explicitly mention the response_mode parameter in the authentication
request, for the implicit flow the default value of the response_mode is fragment.
Figure 3.5 In step 5 the OpenID provider returns back the requested tokens to the client application.
Let’s go through the parameters in the URI fragments added to the authentication response
by the OpenID provider (step-5 in figure 3.5). The definition of these parameters are
consistent across all three authentication flows the OpenID Connect defines, however, the
values may change.
• access_token: The value of the access_token parameter carries the OAuth 2.0
access token. The OpenID provider adds an access_token to the response only if the
response_type is id_token token, under the implicit flow. A client application can
use this access token to securely access the OpenID provider's userinfo endpoint to
retrieve claims with respect to the logged-in user, or to access a business API. In
chapter 5, we discuss the userinfo endpoint in detail. In case the client application
does not need to access any OAuth 2.0-secured APIs, it should use just id_token for
the response_type; then there won't be any access_token in the authentication response.
The OpenID provider (or the authorization server under the OAuth 2.0 terminology)
does not necessarily need to respect the scope value in the authentication request all
the time. Based on the consent provided by the user and other policies, the OpenID
provider can decide which scopes out of all the scopes in the authentication request it
wants to respect. So, a client application should not always expect to get an access
token bound to the requested scope values.
• id_token: The value of the id_token parameter carries the OpenID Connect ID
token, which is a JWT. This is a required parameter, and we discuss the ID token in
detail in chapter 5.
• token_type: The value of the token_type parameter carries the type of the OAuth
2.0 access token. This is a required parameter and we discussed this in detail in
chapter 2.
• expires_in: The value of the expires_in parameter carries the validity of the OAuth
2.0 access token in seconds, calculated from the time it is issued. This is an optional
but recommended parameter, and we discussed this in detail in chapter 2.
• state: The value of the state parameter copies the value of the state parameter
from the authentication request. The value of the state parameter in the response
must be exactly the same as the one found in the request. In section 3.6 we discuss
how to use the state parameter. This is a required parameter only if the
authentication request carries a state parameter.
• scope: If the scope of the access token in the response is different from the
requested scope, the OpenID provider must include the corresponding scope value in
the response, otherwise the client application can safely assume the token is issued
for the requested scope values.
One important thing you might have already noticed in the authentication response from
the OpenID provider is that there is no refresh_token. A refresh_token is defined in the
OAuth 2.0 specification and is used by the client applications to refresh (extend the token
expiration) the access_token and the id_token. However, the implicit flow in OpenID
Connect does not return back a refresh_token.
If you don’t have a refresh_token, the client application won’t be able to renew the
access_token (or the ID token) it got from the OpenID provider, and in that case the client
application has to initiate a new authentication request to get a new access_token and an ID
token. This is one of the reason (not the only reason, we discuss more in section 3.11)
people prefer to use the authorization code flow over implicit flow. We discuss authorization
code flow in detail in section 3.9 and refresh tokens in detail in chapter 6.
Figure 3.6 The client application uses implicit authentication flow to communicate with the OpenID provider to
authenticate the user.
Once the client application gets the tokens in the authentication response, it can use
JavaScript to extract out the access_token and the id_token from the URI fragment. In
practice, the OpenID provider does an HTTP redirect (with a 302 status code) to the
redirect_uri corresponding to the client application, and the client application delivers an
HTML page with JavaScript to the browser, which extracts out the ID token and the access
token from the URI fragment and does the validation (see figure 3.6).
The following code listing shows a JavaScript code segment, adapted from the
OpenID Connect Implicit Client Implementer's Guide 1.0 (https://openid.net/specs/openid-
connect-implicit-1_0.html), that extracts out the URI fragment from the browser location bar
and posts it to a backend endpoint for validation.
Listing 3.2 A JavaScript code that validates the tokens in the URI fragment
<script type = "text/javascript" >
req.onreadystatechange = function(e) {
if (req.readyState == 4) {
if (req.status == 200) {
// If the response from the POST is 200 OK, perform a redirect
window.location = 'https://' +
window.location.host + '/redirect_after_login'
}
// if the OAuth response is invalid, generate an error message
else if (req.status == 400) {
alert('There was an error processing the token')
} else {
alert('Something other than 200 was returned')
}
}
};
req.send(postBody)
</script>
rather, they want to reuse one client_id, with multiple redirect_uris by different
applications (see figure 3.7).
Figure 3.7 Multiple client applications reuse the same client_id but with different redirect_uris.
If you are to follow this pattern, you need to do this cautiously by knowing the drawbacks of
this approach, as listed below.
• Since you are using a single client_id to identify multiple applications, the OpenID
provider won't be able to recognize authentication requests generated by different
applications independently. Even though it's still possible to differentiate
authentication requests from each other by looking at the redirect_uri, in practice,
most OpenID providers are not built that way.
• Most OpenID providers make the access_token unique by the client_id, the user,
the associated scopes, and the status of the token. The status of a token can be
ACTIVE, EXPIRED, REVOKED, LOCKED, and so on. None of these are defined in the
OpenID Connect or OAuth 2.0 specifications; rather, they're implemented by different
OpenID providers in the way they want. For a given client_id, user, set of scopes,
and status, the OpenID provider can find one access_token.
Then again, if the same user tries to log in to the same client application several times
even before the first issued token expires, the number of tokens the OpenID
provider has to maintain could explode if the OpenID provider generates a new token
for each login of the user. To fix this token explosion problem, some OpenID providers
return the same access_token (not the ID token) back to the client application for
subsequent login requests from the same client application for the same user for the
same set of scopes, if the original token is not expired.
In the context of this topic, the above solution to the token explosion problem will
result in sharing the same access_token with multiple applications, because they do
share the same client_id. One workaround for this is to use one additional scope
value to represent the client application, and it has to be unique for a given application.
This is a special scope value, which acts as a signal to the OpenID provider, rather than
a scope that requires the user's consent. Since the requested scopes differ by
application, even though they share the same client_id, the OpenID provider will still
generate different tokens by application.
• Even if you register multiple redirect_uris at the OpenID provider, the OpenID
provider still does one-to-one validation of the redirect_uri in the authentication
request against each of the registered redirect_uris to see whether there is a
match.
So, if you are using a regular expression pattern for redirect_uri validation, you
need to make sure that it's tested properly to match only the URIs you want to allow.
However, as discussed in chapter 2, with OAuth 2.1, the authorization server (or the
OpenID provider) must make sure that the redirect_uri in the authorization
request exactly matches the redirect_uri registered at the authorization server.
client application. From the client application’s point of view, this provides a way to correlate
an authentication response from the OpenID provider with the authentication request it
generates (figure 3.8).
Figure 3.8 The client application adds a random, nonguessable string as the value of the state parameter to
the authentication request, and the OpenID provider returns the same value in the authentication response,
back to the client application.
• In an ecommerce application, for example, a user can add certain items to the
shopping cart, and at the point they decide to check out, the login with OpenID
Connect will kick in and redirect the user to the OpenID provider. However, when
the user returns from the OpenID provider to the client application, the user
expects the same shopping cart with all the items they picked before being redirected
to the OpenID provider.
In a typical web application, this is handled by maintaining a session with the backend
web server, and correlated with the browser via a session cookie. In a SPA, you can
use the HTML5 session storage of the browser to store the shopping cart items against
a key, which is a randomly generated identifier. The value of the key is non-guessable
and non-static, and generated once for each browser session.
This key can go as the value of the state parameter in the authentication request;
and when the user returns from the OpenID provider, the client application can
find the state value in the authentication response and use it as a key to load the
saved shopping cart from the browser session storage, as the sketch following this
list shows.
• In section 1.9.4 of chapter 1 we discussed bootstrapping trust with external identity
providers as a benefit of having a single trusted identity provider for a client
application. So, the client application only connects to its own trusted OpenID
provider, and then that OpenID provider helps connect the client application to
other external identity providers (figure 3.9). These external identity providers can
support OpenID Connect, SAML, and so on.
Figure 3.9 The internal identity provider bootstraps trust with partner identity providers, and does
claim/protocol transformations, as expected by the client applications connected to it. All the client
applications only need to trust the internal identity provider, and should only need to support the federation
protocol the internal identity provider supports.
The OpenID provider the client application directly connects to is responsible for
getting a response from the external identity providers and transforming that
response in a way that the response it (the OpenID provider) generates is understood
by the corresponding client application. To do that, the OpenID provider has to cache
(or store) the initial request it got from the client application and should be able to
correlate it to the response it gets from the external identity provider.
Most OpenID providers implement this using the state parameter in the
authentication request. Before the OpenID provider redirects the user to an external
identity provider, it generates a random string as the correlation handle and stores
the authentication request from the client application against it. The OpenID provider
adds the correlation handle as the value of the state parameter in the authentication
request it generates, and when it receives a response from an external identity
provider, the OpenID provider expects the same correlation handle to be in the state
parameter of the response; that helps it retrieve the initial authentication request
from the cache.
• Finally, and most importantly, one of the primary use cases of the state parameter is
security. The state parameter helps protect a client application from a cross-site
request forgery (CSRF) attack, which we discuss in detail in chapter 10.
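Coming back to the shopping cart example in the first bullet above, the following is a
minimal JavaScript sketch of that pattern; the cartItems variable and the storage layout are
illustrative assumptions, not part of any particular library:
// Before redirecting to the OpenID provider: generate a random,
// nonguessable key, and store the cart items against it.
var bytes = new Uint8Array(20);
crypto.getRandomValues(bytes);
var stateValue = Array.from(bytes, function (b) {
  return b.toString(16).padStart(2, '0');
}).join('');
sessionStorage.setItem(stateValue, JSON.stringify(cartItems));
// ...add stateValue as the state parameter to the authentication request...

// After the OpenID provider redirects back: read the state value from the
// authentication response (a query string here; with the implicit flow it
// arrives in the URI fragment) and restore the saved cart.
var returnedState = new URLSearchParams(window.location.search).get('state');
var savedCart = JSON.parse(sessionStorage.getItem(returnedState) || '[]');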
If you set the value of the response_mode parameter to query, then all the response
parameters are added to the redirect_uri as a query string, as shown below.
https://app.example.com/redirect_uri?token=XXXXX&id_token=YYYYYY
If you set the value of the response_mode parameter to fragment, then all the response
parameters are added to the redirect_uri as a URI fragment, as shown below.
https://app.example.com/redirect_uri
#access_token=XXXXXX&token_type=Bearer&expires_in=3600&id_token=YYYYYYYY&state=Xd2u7
3hgj59435
When the OpenID provider redirects the user back to the client application, it sets the HTTP
response status code to 302 and sets the Location header to the redirect_uri, either with
a query string or with a URI fragment. Then the browser, after receiving the 302 from the
OpenID provider, extracts out the URL from the Location header and does an HTTP GET to
the extracted URL. The following list shows the key differences between a URI fragment
and a query string.
• Any parameter in a URI fragment never leaves the browser. When the browser does
an HTTP GET to the URL that comes from the Location header of the 302 response
from the OpenID provider, the HTTP GET request goes to the backend web server,
but the URI fragment remains on the browser address bar. However, a query string
attached to a URL goes to the backend web server.
• As defined in RFC 2616 (https://tools.ietf.org/html/rfc2616), the HTTP Referer
header does not carry anything from the URI fragment, but everything in the query
string is included.
Usually, when you click on a link on a web page, the URL of the current web page
goes as the Referer header of the HTTP request to the new web page, unless you
have set up a policy not to include the Referer header. Having certain information
in the Referer header could lead to possible security issues, and in chapter 10 we
discuss them in detail and how to mitigate those.
The following code listing shows how to use the java.security.SecureRandom Java class to
generate a random, nonguessable string. Neil Madden, the author of the book API Security
in Action (Manning, 2020), suggests this approach in his blog
(https://neilmadden.blog/2018/08/30/moving-away-from-uuids/). The code here is a
minimal sketch of that approach.
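import java.security.SecureRandom;
import java.util.Base64;

public class RandomStringGenerator {

    private static final SecureRandom random = new SecureRandom();
    private static final Base64.Encoder encoder =
            Base64.getUrlEncoder().withoutPadding();

    // Generates a random, nonguessable string with 160 bits of entropy,
    // which can be used, for example, as a state value.
    public static String generate() {
        byte[] buffer = new byte[20];
        random.nextBytes(buffer);
        return encoder.encodeToString(buffer);
    }
}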
Listing 3.5 An authentication request that redirects the user to the Google OpenID provider
https://accounts.google.com/o/oauth2/v2/auth?
client_id=450443992251-7gft9u.apps.googleusercontent.com&
redirect_uri=https://localhost:3000/redirect_uri&
scope=openid profile&
response_type=id_token token&
state=caf7871khs872&
nonce=89hj37b3gd3
Once you copy and paste the above request into your browser location bar, it will take you
to Google (the OpenID provider) for authentication. If you are not logged in already, you
may see a screen similar to figure 3.10. The same screen will display the name of the client
application you used during the application registration process, and the set of attributes
Google is about to share with the client application. We used profile as the scope value (in
addition to openid) in the authentication request, and the list of attributes (name, email
address, and so on) shown in figure 3.10 are related to that. You'll learn more about
requesting user attributes using scopes in chapter 5.
Figure 3.10 Google login screen for user authentication. The name of the client application you used during
application registration process is displayed on the screen (Book Club), along with the user attributes Google is
going to share with the client application.
Once you complete the authentication flow at the Google OpenID provider by entering your
own credentials, it will redirect you back to the client application (to the redirect_uri we
provided). However, since we do not have any application running on that address, the
response from the Google OpenID provider will remain on the browser location bar, as
shown in the following code (listing 3.6). In practice, you will get a lengthy string for the
values of the access_token and id_token parameters, which we have replaced in listing 3.6
with the text <ACCESS_TOKEN> and <ID_TOKEN>, respectively. In the response from Google,
we got both the access_token and the id_token; as you might have rightly guessed already,
that is because we used id_token token as the response_type in the authentication request
(listing 3.5). Had we used just id_token as the response_type, you wouldn't see the
access_token in the response.
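Listing 3.6 The authentication response from the Google OpenID provider
https://localhost:3000/redirect_uri#access_token=<ACCESS_TOKEN>&token_type=Bearer&
expires_in=3599&id_token=<ID_TOKEN>&state=caf7871khs872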
3.8.3 An overview of the ID token returned back from the Google OpenID provider
In section 3.8.2 we got an authentication response from the Google OpenID provider, which
includes an ID token. In this section we’ll delve deep into the attributes that you find in the
ID token (listing 3.6). As you learnt in chapter 1, the ID token is a JSON Web Token (JWT).
In chapter 4, we discuss JWT in detail.
The ID token you got from Google has three parts, separated from each other by a period
(.). If a JWT has three parts, it is a JSON Web Signature (JWS), and if it has five parts, it is
a JSON Web Encryption (JWE). In chapter 4, you'll learn about both JWS and JWE in detail.
To decode the JWT you got in the response from Google, in listing 3.6, let's use
https://jwt.io. Just copy the value of the id_token parameter into the Encoded text area of
the web site (jwt.io), and you will get the values decoded. Then again, you need to treat
your ID tokens as secrets and never use public sites like jwt.io to decode any production ID
tokens. Once you decode the ID token, you will see the decoded payload of the JWT as
shown in the following listing.
Listing 3.7 The decoded payload of the ID token returned from Google
{
"iss": "https://accounts.google.com",
"azp": "450443992251-9l17d82cli8npa9cdrvcp1g9m17gft9u.apps.googleusercontent.com",
"aud": "450443992251-9l17d82cli8npa9cdrvcp1g9m17gft9u.apps.googleusercontent.com",
"sub": "104063262378861625904",
"at_hash": "Krnm3SB1v_UR00j50VzLoQ",
"nonce": "89hj37b3gd3",
"name": "Prabath Siriwardena",
"picture": "https://lh3.googleusercontent.com/a-
/AOh14GiriDTmbf8tcSKzMkFYvYwYuBMUmGFdtEBqpvRGOA=s96-c",
"given_name": "Prabath",
"family_name": "Siriwardena",
"locale": "en",
"iat": 1601313517,
"exp": 1601317117,
"jti": "4fea08bd4386e45d6f8b869520ace3f7a4f80bde"
}
The JWT payload in listing 3.7 has two types of claims: the claims related to the end user
and the claims related to token validation. The sub, name, picture, given_name,
family_name, and locale claims are related to the end user, and all of them are standard
claims defined in the OpenID Connect specification. In chapter 5 we'll go through these
claims in detail.
The rest of the claims in listing 3.7 are defined in two specifications. Some are defined in
the JWT specification (jti, iat, exp, aud, iss) and some are defined in the OpenID Connect
specification (nonce, at_hash, azp). Chapter 4 covers in detail all the claims defined by the
JWT specification, and in section 3.7.4 we'll explain nonce, at_hash, and azp. Even though
the iat, exp, aud, and iss claims are defined in the JWT specification, the OpenID Connect
specification makes all of them mandatory in an ID token.
There are a few more claims (auth_time, acr, and amr) the OpenID Connect specification
introduced with respect to token validation, and we'll discuss them in detail in chapter 6.
Then again, most of the time, as an application developer you will not write code to do these
validations yourself; rather, you will use some OpenID Connect library. Still, if you want to
peek at the claims without pasting a token into a public web site, a few lines of code will do,
as the following sketch shows.
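The following is a minimal JavaScript sketch that base64url-decodes the payload of an ID
token so you can inspect its claims locally; it does not verify the signature, which you should
always leave to a proper JWT library:
// Decode the payload (the second of the three parts) of a JWS-format ID token.
// NOTE: this only decodes the claims; it does NOT verify the signature.
function decodeIdTokenPayload(idToken) {
  var payload = idToken.split('.')[1]; // header.payload.signature
  var base64 = payload.replace(/-/g, '+').replace(/_/g, '/'); // base64url to base64
  return JSON.parse(atob(base64));
}
// idToken holds the raw JWT string from the authentication response.
var claims = decodeIdTokenPayload(idToken);
console.log(claims.iss, claims.aud, claims.exp, claims.nonce);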
Security issues with the implicit flow and how to mitigate those
We discussed the OAuth 2.0 implicit grant type in chapter 2. The IETF OAuth working group recommends not using
the implicit grant type due to a couple of security issues explained in the OAuth 2.0 Security Best Current Practice
document (https://tools.ietf.org/html/draft-ietf-oauth-security-topics-15), mostly the access token replay and access
token leakage attacks. We discuss these two attacks in chapter 10. However, the implicit flow in OpenID Connect is
not exactly the implicit grant type in OAuth 2.0; and OpenID Connect adds protection to prevent the implicit flow from
both the access token replay and access token leakage attacks.
The at_hash parameter we discussed in section 3.7.4 binds the access token issued along with the ID token to
the ID token. Since the ID token itself has replay protection with the nonce parameter, binding the access token to
the ID token also protects the access token from being replayed. We have dedicated chapter 10 to discussing security
issues related to OpenID Connect and OAuth 2.0; so, we'll defer a detailed discussion on this to chapter 10.
Figure 3.11 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.
THE CLIENT APPLICATION INITIATES A LOGIN REQUEST VIA THE BROWSER (STEP 1)
In step 1 of figure 3.11, the client application initiates a login request via the browser. In
the case of a SPA, we can expect the user to click on a login link on the web page of the
client application, and the browser does an HTTP GET to the authorization endpoint of the
OpenID provider. Listing 3.8 shows an example of an authentication request under the
authorization code authentication flow.
Listing 3.8 Authentication request generated by the client application (authorization code flow)
https://accounts.google.com/o/oauth2/v2/auth?
response_type=code&
client_id=424911365001.apps.googleusercontent.com&
scope=openid email&
redirect_uri=https%3A//app.example.com/redirect_uri&
state=Xd2u73hgj59435&
[email protected]&
nonce=0394852-3190485-2490358
In section 3.3 we discussed the usage of all the parameters in listing 3.8, so we won't
duplicate that discussion here. However, a few parameters need attention:
• response_type: The value of the response_type parameter in the authentication
request defines which tokens the authorization endpoint of the OpenID provider
should return to the client application.
In the authorization code flow there is only one possible value: code. The client
application expects the authorization endpoint of the OpenID provider to return a
code in the authentication response. This is the key parameter in the authentication
request that differentiates the authorization code flow from the implicit flow.
• response_mode: The value of the response_mode parameter in the authentication
request defines how the client application expects the response from the OpenID
provider. For the authorization code flow, the default value of the response_mode
parameter is query. So, the client application expects the OpenID provider to return
the code and the corresponding parameters in the query string to the redirect_uri.
THE OPENID PROVIDER VALIDATES THE AUTHENTICATION REQUEST AND REDIRECTS THE USER BACK TO THE BROWSER FOR
AUTHENTICATION (STEP 2)
Once the OpenID provider validates the authentication request from the client application, it
checks whether the user has a valid login session under the OpenID provider's domain. If the
user has already logged into the OpenID provider from the same web browser, then there
exists a valid login session, unless it's expired.
If the user does not have a valid login session, then the OpenID provider will challenge
the user to authenticate (step 2 in figure 3.12) and will also get the user's consent to share
the requested claims with the client application. In step 3 of figure 3.12 the user types the
login credentials, and in step 4 of figure 3.12 the browser posts the credentials to the OpenID
provider. Steps 2, 3, and 4 are outside the scope of the OpenID Connect specification, and
it's up to the OpenID providers to implement them in the way they prefer.
Figure 3.12 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.
THE OPENID PROVIDER RETURNS THE AUTHORIZATION CODE TO THE CLIENT APPLICATION (STEP 5)
In step 5 of figure 3.12, the OpenID provider returns the authorization code along with the
state parameter to the client application in a query string to the redirect_uri (as shown
in the following line of code). Once the client application receives the request, it needs to
make sure that the value of the state parameter in the response exactly matches the
value of the state parameter in the authentication request.
https://app.example.com/redirect_uri?code=YDed2u73hXcr783d&state=Xd2u73hgj59435
THE CLIENT APPLICATION EXCHANGES THE AUTHORIZATION CODE FOR AN ID TOKEN AND AN ACCESS TOKEN (STEP 6)
Unlike in the implicit flow, in the authorization code flow the client application does not get
the ID token or the access token in the authentication response. To get an ID token and an
access token, the client application has to talk to the token endpoint of the OpenID
provider, as shown in step 6 of figure 3.12. The following listing shows an example request
to the token endpoint of the OpenID provider, which carries the authorization code from the
authentication response. The token request defined in the authorization code authentication
flow in OpenID Connect is identical to the token request defined under the authorization code
grant type in the OAuth 2.0 specification, which we discussed in chapter 2.
Listing 3.9 Request to the token endpoint of the OpenID provider (authorization code flow)
POST /token HTTP/1.1
Host: oauth2.googleapis.com
Content-Type: application/x-www-form-urlencoded
code=YDed2u73hXcr783d&
client_id=your_client_id&
redirect_uri=https%3A//oauth2.example.com/code&
grant_type=authorization_code
One key thing to notice in the token request in listing 3.9 is that the client application does
not authenticate to the OpenID provider. So, we only use the client_id in the request, and
the client application does not need to have a client_secret or any other authentication
mechanism. As you already learned in chapter 2, a SPA is called a public client in OAuth 2.0
terminology, and a public client does not have the capability to protect any secrets.
Since a SPA runs in the browser, it can't hide any secrets from the end user. Anything
that you hide in the browser is visible to the end user. So, there is no point in having any
credentials for a SPA. In chapter 6, we'll discuss how to use the authorization code flow with
a traditional (server-side) web application, where the web application will use a client_id
and client_secret to authenticate to the token endpoint of the OpenID provider.
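If you were to implement step 6 yourself instead of relying on an OpenID Connect library, the token request is
just a form-encoded POST. The following is a minimal Java sketch of the request in listing 3.9, using the JDK's
built-in HTTP client; the endpoint and parameter values are the placeholders from the listing.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class TokenRequest {

    public static void main(String[] args) throws Exception {
        // form-encoded body carrying the authorization code; there is no
        // client_secret, because a SPA is a public client.
        String form = "code=YDed2u73hXcr783d"
                + "&client_id=your_client_id"
                + "&redirect_uri=https%3A//oauth2.example.com/code"
                + "&grant_type=authorization_code";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://oauth2.googleapis.com/token"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // the JSON response carries the access_token, id_token, and refresh_token.
        System.out.println(response.body());
    }
}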
THE OPENID PROVIDER RETURNS AN ID TOKEN AND AN ACCESS TOKEN TO THE CLIENT APPLICATION (STEP 7)
In step 7 of figure 3.12, the OpenID provider returns an ID token and an access token to
the client application, as shown in the following listing.
Listing 3.10 The response from the token endpoint of the OpenID provider
{
  "access_token": "1/fFAGRNJru1FTz70BzhT3Zg",
  "expires_in": 3920,
  "token_type": "Bearer",
  "id_token": "<ID_TOKEN>",
  "refresh_token": "<REFRESH_TOKEN>"
}
The only difference between the response in listing 3.10 and the response you get from the
token endpoint under the OAuth 2.0 authorization code grant type (which we discussed in
chapter 2) is that here we have an additional parameter called id_token, which carries the
ID token. Also, unlike in the implicit flow, in the authorization code flow the client application
gets a refresh_token, which can be used to renew the access_token and also the
id_token. We talk about refreshing tokens in detail in chapter 6.
• The implicit flow does not return a refresh_token, while the authorization code flow
does. Under the implicit flow, if the access_token expires, then the client application
has to re-initiate the login flow to obtain a new access_token.
• This point is not directly related to SPAs; however, if you implement OpenID Connect
login with the implicit flow in a native mobile application or a desktop application, it
could possibly be vulnerable to token interception, which we discuss in detail in
chapter 10. Under the authorization code flow, this attack can be mitigated by
implementing the Proof Key for Code Exchange (PKCE) OAuth 2.0 profile
(https://tools.ietf.org/html/rfc7636), which we discuss in chapter 6 in detail.
• Finally, it's better to understand why the implicit flow was added to the OpenID
Connect specification. When you use the authorization code flow with a SPA, to
exchange the code for an ID token and/or an access token, you need to do a direct
HTTP request from JavaScript running in the browser to the token endpoint of the
OpenID provider. Most of the time, the domain of the token endpoint is different from
the domain of the client application, and by default, browsers won't permit cross-
domain calls. That means a client application running on the app.example.com domain
won't be able to do a direct HTTP request from JavaScript to the op.example.com
domain.
That was the reason OpenID Connect introduced the implicit flow. Unlike in the
authorization code flow, with the implicit flow there are no direct HTTP requests
between a client application running in the browser and the OpenID provider.
However, over time, cross-origin resource sharing (CORS) policies have become
popular. CORS enables you to do cross-domain calls selectively. So, you can use the
authorization code flow by enabling CORS for the client application's domain to access
the token endpoint of the authorization server. That means there is no reason to use
the implicit flow now. In chapter 5, you'll learn more details about CORS.
To build the React application, run the following command from the sample01 directory on the
command console. This command will create a directory called build and copy all the files that
you need to deploy to your production web server into the build directory.
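Assuming sample01 was bootstrapped with create-react-app, the build command is the standard npm
build script:

npm run build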
In this example we use a node server as our web server. You can start the node server using
the following command, run from the sample01 directory.
The above command starts the node server on localhost port 3000 by default; if you visit
http://localhost:3000 in your web browser, you will see a welcome message. This is the
simplest React application you can have; in the next two sections, you'll learn how to secure
this application with OpenID Connect.
Listing 3.11 Parameters with sample values required to communicate with the OpenID provider
client_id: D4ZoMSpsxqgvUuiC6j5ROnEYea0a
redirect_uri: https://localhost:3000
Authorization endpoint: https://localhost:9443/oauth2/authorize
Token endpoint: https://localhost:9443/oauth2/token
Issuer: https://localhost:9443/oauth2/token
To install the @facilelogin/oidc-react package, run the following command from the sample01
directory. Once the command runs successfully, you'll find a new entry added to the
sample01/package.json file under the dependencies section for the
@facilelogin/oidc-react package.
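The install command is the standard npm one:

npm install @facilelogin/oidc-react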
import React from 'react';
import ReactDOM from 'react-dom';
// assuming the index file imports the provider and the App component like this
import { OIDCProvider } from '@facilelogin/oidc-react';
import App from './components/App';

ReactDOM.render(
  <OIDCProvider
    domain="localhost:9443"
    tokenEp="https://localhost:9443/oauth2/token"
    authzEp="https://localhost:9443/oauth2/authorize"
    clientId="D4ZoMSpsxqgvUuiC6j5ROnEYea0a"
    issuer="https://localhost:9443/oauth2/token"
    redirectUri={window.location.origin}>
    <App />
  </OIDCProvider>,
  document.getElementById('book-club-app')
);
Now you can replace the code in sample01/src/components/App.js with the following code,
which adds a login button to the welcome page.
Listing 3.13 Updated App.js code that renders the login button to initiate the login flow
import React from 'react';
import { useAuth0 } from '@facilelogin/oidc-react';

function App() {
  const { isLoading, isAuthenticated, error, user, loginWithRedirect, logout } = useAuth0();

  if (isLoading) {
    return <div>Loading...</div>;
  }
  if (error) {
    return <div>Oops... {error.message}</div>;
  }
  if (isAuthenticated) {
    console.log(user.id);
    return (
      <div>
        Hello {user.sub}{' '}
        <button onClick={() => logout({ returnTo: window.location.origin })}>
          Log out
        </button>
      </div>
    );
  } else {
    return <button onClick={loginWithRedirect}>Log in</button>;
  }
}

export default App;
Now you can build the updated React application and start the node server using the
following two npm commands.
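Assuming the same create-react-app and serve setup as before, these are:

npm run build
npx serve -s build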
Once the node server has successfully started, you can visit http://localhost:3000 and click
the login button to initiate the login flow; you will get redirected to the OpenID
provider's login page.
3.12 Summary
• OpenID Connect defines three authentication flows: authorization code flow,
implicit flow, and hybrid flow.
• An authentication flow in OpenID Connect uses OAuth 2.0 grant types, but an
authentication flow is more than a grant type. It defines additional request/response
parameters on top of what is already defined by OAuth 2.0 for grant types.
• The OpenID Connect specification defines the authentication flows in a self-contained
manner, so we should not confuse the OAuth 2.0 grant types with OpenID Connect
authentication flows.
• The implicit authentication flow uses id_token or id_token token as the value of the
response_type parameter in the authentication request and gets the ID token (and,
with id_token token, also the access token) from the authorization endpoint of the
OpenID provider.
• The authorization code authentication flow uses code as the value of the
response_type parameter in the authentication request and gets the access token
and the ID token from the token endpoint of the OpenID provider.
• In practice, some SPAs use implicit flow as well as authorization code flow. However,
most new SPAs use only authorization code flow.
Imagine that your state's Department of Motor Vehicles (DMV) creates a JWT to
represent your driver's license, with your personal information, which includes your name,
address, eye color, hair color, gender, date of birth, license expiration date, and license
number. All these items are attributes, or claims, about you, and we can also call them
attribute assertions. The DMV is the issuer of these attribute assertions, and also the issuer
of the JWT that embeds those assertions.
Anyone who gets a JWT can decide whether to accept what's in it as true, based on the
level of trust they have in the issuer of the token (in this case, the DMV). But before
accepting a JWT, how do you know who issued it? The issuer of a JWT signs it by using their
private key, and the JWT itself carries an identifier corresponding to the issuer under a special
attribute called iss. In the scenario illustrated in figure 4.1, a bartender, who is the
recipient of the JWT, can verify the signature of the JWT and see who signed the token.
Figure 4.1 A JWT is used as a container to transport assertions from one place to another in a
cryptographically safe manner. The bartender, who is the recipient of the JWT, accepts the JWT only if they
trust the DMV, the issuer of the JWT.
In addition to the attribute assertions, a JWT can carry authentication and authorization
assertions. In fact, a JWT is a container; you can fill it with anything you need. An
authentication assertion might carry an identifier corresponding to the user the issuer of the
JWT asserts, and how the issuer authenticated the user before issuing the assertion. In the
DMV use case, an authentication assertion might be your first name and last name (or even
your driver's license number), or however you are known to the DMV. Later in this chapter you'll
learn how the issuer embeds authentication assertions into a JWT.
An authorization assertion is about the user's entitlements, or what the user is entitled to
do. Based on the assertions the JWT brings in from the issuer, the recipient of the JWT can
decide how to act. In the DMV example, if the DMV decides to embed the user's age as an
attribute in the JWT, that data is an attribute assertion, and a bartender can do the math to
calculate whether the user is old enough to buy a beer. Also, without sharing the user's age
with the bartender, the DMV may decide to include an authorization assertion stating that
the user is old enough to buy a beer. In that case, a bartender will accept the JWT and let
the user buy a beer. The bartender wouldn't know the user's age; however, they will let the
user buy a beer because the DMV authorized the user to do so.
In addition to carrying a set of assertions about the user, a JWT plays another role behind
the scenes. Apart from the end user's identity, a JWT also carries the issuer's identity, which
is the DMV in this case. The issuer's identity is implicitly embedded in the signature of the
JWT. By looking at the corresponding public key while validating the signature of the JWT,
the recipient can figure out who the token issuer is. Also, the JWT carries an identifier
corresponding to the issuer under a special attribute called iss.
Figure 4.2 A JWT formatted as a JWS. It has three parts: the JWT header, which is also known as the JOSE
header; the JWT body, which is also known as the claims set; and the signature.
As you can see in figure 4.2, a JWT, which is also a JWS token, has three parts, with a dot (.)
separating each part (in section 4.4 you'll learn that when a JWT is a JWE, it has five parts):
• The first part is known as the JSON Object Signing and Encryption (JOSE) header.
We'll discuss the JOSE header in detail in sections 4.3 and 4.4.
• The second part is the claims set, or body (or payload). In general, under JWS we call
this the JWS body; however, when a JWS becomes a JWT, we call it a claims set.
We'll discuss these differences in section 4.3.
• The third part is the signature of the JOSE header and the claims set. We'll discuss in
section 4.3 how an issuer generates a signature and how a client application verifies it.
1 In computing, serialization is the process of translating a data structure or object state into a format that can be stored or transmitted and reconstructed
later (https://en.wikipedia.org/wiki/Serialization)
#A The cryptographic algorithm used to sign the JOSE header and the body of a JWT. RFC 7518, the JSON Web
Algorithms specification, defines identifiers for these algorithms.
You already learned that a JWT has to be either a JWS or a JWE token. So, exactly which
attributes go into the JOSE header depends on whether the JWT is a JWS or a JWE token. The
JWT specification does not define any new attributes for the JWT JOSE header; all the
header attributes a JWT inherits come from the JWS and JWE specifications. However, the
JWT specification talks about two header attributes that it inherits from the JWS
specification and explains their usage with respect to a JWT. The following list describes
those two header attributes.
• The typ attribute in the JOSE header is used to define the media type of the complete
JWT. A media type is an identifier that defines the format of content transmitted
on the Internet. There are two types of components that process a JWT:
JWT implementations and JWT applications. Nimbus is a JWT implementation in Java.
The Nimbus library knows how to build and parse a JWT. A JWT application can be
anything that uses JWTs internally. A JWT application uses a JWT implementation
(such as Nimbus) to build or parse a JWT. In this case, the typ attribute is just
another attribute for the JWT implementation: the implementation will not try to
interpret its value, but the JWT application would. The typ attribute helps JWT
applications differentiate the content of a JWT when values that are not JWTs could
also be present in an application data structure along with a JWT object. This is an
optional attribute, and if it is present for a JWT, it is recommended to use JWT as the
media type.
{
"alg": "RS256",
"typ": "JWT"
}
• The cty attribute in the JOSE header is used to define structural information about the
JWT. In the case of a nested JWT, this attribute must be present in the JOSE header,
and its value must be JWT. However, in non-nested cases, using it is not
recommended. A nested JWT is a JWT that encloses another JWT. In section 4.11, we
discuss nested JWTs in detail.
So, what's the difference between the typ and cty attributes? The typ attribute says
whether the overall structure is a JWT or not. If the cty attribute is present and its
value is set to JWT, then it's an indication that it's a nested JWT.
{
"alg": "RS256",
"cty": "JWT"
}
The JWT specification (RFC 7519) defines seven attributes: sub, aud, nbf, iss, exp, iat, and
jti. None of these are mandatory; it's up to the other specifications that rely on JWT to
define what is mandatory and what is optional. For example, the OpenID Connect
specification makes the iss, iat, aud, and exp attributes mandatory.
These seven attributes that the JWT specification defines are registered in the Internet
Assigned Numbers Authority (IANA) JSON Web Token Claims registry
(https://www.iana.org/assignments/jwt/jwt.xhtml). However, you can introduce your own
custom attributes to the JWT claims set (in chapter 5 we discuss how to introduce custom
claims into an ID token). In the following sections, we discuss these seven attributes in
detail.
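To make these attributes concrete, the following sketch builds a claims set carrying all seven registered
attributes, plus one custom claim, with the Nimbus library that this chapter's samples use; the values are
placeholders.

import java.util.Date;
import java.util.UUID;
import com.nimbusds.jwt.JWTClaimsSet;

public class ClaimsSetExample {

    public static void main(String[] args) {
        Date now = new Date();
        JWTClaimsSet claimsSet = new JWTClaimsSet.Builder()
                .subject("peter")                                  // sub
                .audience("app.example.com")                       // aud
                .notBeforeTime(now)                                // nbf
                .issuer("https://issuer.example.com")              // iss
                .expirationTime(new Date(now.getTime() + 600_000)) // exp, 10 minutes ahead
                .issueTime(now)                                    // iat
                .jwtID(UUID.randomUUID().toString())               // jti
                .claim("department", "engineering")                // a custom claim
                .build();
        System.out.println(claimsSet.toJSONObject());
    }
}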
Even though the iss attribute in a JWT is optional, the OpenID Connect specification makes
it a required attribute in the ID token. So, when a JWT is used within the context of the
OpenID Connect protocol (as an ID token), it must have the iss attribute.
The exp attribute in the JWT claims set expresses the expiration time of the JWT in seconds,
calculated from 1970-01-01T0:0:0Z as measured in Coordinated Universal Time (UTC). Any
recipient of a JWT must make sure that the time represented by the exp attribute is not in
the past when accepting a JWT; in other words, that the token is not expired. The iat
attribute in the JWT claims set expresses the time when the JWT was issued. That too is
expressed in seconds and calculated from 1970-01-01T0:0:0Z as measured in UTC.
The time difference between iat and exp in seconds isn’t the lifetime of the JWT when
there’s an nbf (not before) attribute present in the claims set. You shouldn’t start processing
a JWT (or accept it as a valid token) before the time specified in the nbf attribute. The value
of nbf is also expressed in seconds and calculated from 1970-01-01T0:0:0Z as measured in
UTC. When the nbf attribute is present in the claims set, the lifetime of a JWT is calculated
as the difference between the exp and nbf attributes. However, in most cases, the value of
nbf is equal to the value of iat.
According to the JWT specification, exp, iat, and nbf are optional attributes. However,
the OpenID Connect specification makes the exp and iat attributes mandatory in an ID token
and does not talk about the nbf attribute. So, OpenID Connect client applications should
always look for the exp and iat attributes while verifying the ID token, and must not expect
the nbf attribute to be present all the time. However, if the nbf attribute is present in the
token, the corresponding application must validate it.
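A minimal validation sketch over a parsed Nimbus claims set could look like the following; an OpenID
Connect library normally does this for you, and the 60-second clock-skew allowance here is an assumption,
not something the specifications mandate.

import java.util.Date;
import com.nimbusds.jwt.JWTClaimsSet;

public class TimeClaimsValidator {

    static final long CLOCK_SKEW_MILLIS = 60_000; // assumed 60-second allowance

    static void validate(JWTClaimsSet claims) {
        Date now = new Date();

        // exp and iat are mandatory in an ID token.
        Date exp = claims.getExpirationTime();
        if (exp == null || now.getTime() > exp.getTime() + CLOCK_SKEW_MILLIS) {
            throw new IllegalStateException("token is expired or exp is missing");
        }
        if (claims.getIssueTime() == null) {
            throw new IllegalStateException("iat is missing");
        }

        // nbf is optional, but if present it must be validated.
        Date nbf = claims.getNotBeforeTime();
        if (nbf != null && now.getTime() + CLOCK_SKEW_MILLIS < nbf.getTime()) {
            throw new IllegalStateException("token is not valid yet");
        }
    }
}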
4.3 What does a JSON Web Signature (JWS) token look like?
In section 4.2 you learned that a JWT is either a JWS or a JWE token, but the reverse is not
always true: not every JWS or JWE token is a JWT. The JWT we went through in section 4.2
is also a JWS token. In this section we delve deep into JWS, and you'll learn the structure of
a JWS token and what differences there are between a JWT and a JWS token.
RFC 7515: JSON Web Signature (JWS) defines the structure and the processing rules
of a JWS token (https://tools.ietf.org/html/rfc7515). Unlike the JWT specification, which is
developed under the IETF OAuth working group, the JWS specification is developed under the
IETF JOSE working group. JWS provides two standard ways to represent a signed message.
The message to be signed, also known as the JWS payload (figure 4.3), can be anything,
such as a JSON payload, an XML payload, or binary content. One way to represent a signed
message with JWS is to use compact serialization, and the other way is to use JSON
serialization.
Figure 4.3 A JWT formatted as a JWS. It has three parts: the JWT header, which is also known as the JOSE
header; the JWT body, which is also known as the claims set; and the signature.
As you learned in section 4.2, we don't call every JWS a JWT. A JWS becomes a JWT
only when it follows compact serialization. Then again, that's not 100% precise. As per the
JWS specification, whether a JWS token is compact serialized or JSON serialized, the
message (the JWS payload) it signs can be anything, not necessarily a JSON payload.
However, if we are to call a JWS token a JWT, then it has to be a compact-serialized JWS
token that signs a JSON payload. In other words, for a generic JWS, the content it protects
(or signs) can be represented in any format, but for a JWS to become a JWT, the JWS
payload must be a JSON object (see figure 4.4). In fact, JWS payload is the generic name
for the content that is to be protected, and we call the JWS payload a claims set only when
the JWS becomes a JWT.
Figure 4.4 A JWS token can be either JSON serialized or compact serialized, and the JWS payload can be any
content: XML, JSON, binary, and so on. A JWT is a compact-serialized JWS (or JWE) token where the payload is
a JSON payload.
With JSON serialization, the JWS is represented as a JSON payload (listing 4.3). It's not
called a JWT. The payload parameter in the JSON-serialized JWS can carry any value. The
message being signed is represented in listing 4.3 as a JSON message with all its related
metadata.
Listing 4.3 An example of a JWS token with JSON serialization and all related metadata
{
  "payload":"eyJpc3MiOiJqb2UiLA0KICJleHAiOjEzMDA4MTkzODA...",
  "signatures":[
    {
      "protected":"eyJhbGciOiJSUzI1NiJ9",
      "header":{
        "kid":"2010-12-29"
      },
      "signature":"cC4hiUPoj9Eetdgtv3hF80EGrhuB__dzERat0"
    },
    {
      "protected":"eyJhbGciOiJFUzI1NiJ9",
      "header":{
        "kid":"e9bc097a-ce51-4036-9562-d2ade882db0d"
      },
      "signature":"DtEhU3ljbEg8L38VWAfUAqOyKAM6..."
    }
  ]
}
Unlike a JWT, a JSON-serialized JWS can carry multiple signatures corresponding to the
same payload. In listing 4.3, the signatures JSON array carries two elements, and each
element carries a different signature of the same payload. The protected and header
attributes inside each element of the signatures JSON array define the metadata related to
the corresponding signature. Since the focus of this chapter is on JWT, we don't intend to
discuss JSON-serialized JWS tokens in detail. However, if you are interested in learning more
about JWS JSON serialization, please check chapter 7 of the book Advanced API Security.
Now run your Java program to create a JWS with the following command (from the
chapter04/sample01/lib directory). If it executes successfully, it prints the base64url-
encoded JWS:
eyJhbGciOiJSUzI1NiJ9.eyJzdWIiOiJwZXRlciIsImF1ZCI6IiouZWNvbW0uY29tIiwibmJmIj
oxNTMzMjcwNzk0LCJpc3MiOiJzdHMuZWNvbW0uY29tIiwiZXhwIjoxNTMzMjcxMzk0LCJpYXQiO
jE1MzMyNzA3OTQsImp0aSI6IjVjNGQxZmExLTc0MTItNGZiMS1iODg4LTliYzc3ZTY3ZmYyYSJ9
.aOkwoXAsJHz1oD-N0Zz4-dvZBtz7oaBXyoysfTKy2vV6C_Sfw05w10Yg0oyQX6VBK8tw68Tair
pA9322ZziTcteGxaNb-Hqn39krHT35sD68sNOkh7zIqLIIJ59hisO81kK11g05Nr-nZnEv9mfHF
vU_dpQEP-Dgswy_lJ8rZTc
You can decode this JWS token by using the JWT decoder available at https://jwt.io. The
following is the decoded JWS claims set, or payload:
{
"sub": "peter",
"aud": "app.example.com",
"nbf": 1533270794,
"iss": "iss.example.com",
"exp": 1533271394,
"iat": 1533270794,
"jti": "5c4d1fa1-7412-4fb1-b888-9bc77e67ff2a"
}
Take a look at the code that generated the JWT. It's straightforward and self-explanatory
with comments. You can find the complete source code in the
sample01/src/main/java/com/manning/oidc/chapter04/sample01/RSASHA256JWTBuilder.java
file. The following code listing does the core work of JWT generation. It accepts the token
issuer's private key as an input parameter and uses it to sign the JWT using the RSA-SHA256
signing algorithm.
// create the signed JWT with the JWS header and the JWT claims set.
SignedJWT signedJWT = new SignedJWT(jwsHeader, jwtClaims);
// sign the JWT with the token issuer's RSA private key.
signedJWT.sign(new RSASSASigner(privateKey));
// serialize the signed JWT into its compact form.
String jwtInText = signedJWT.serialize();
return jwtInText;
}
{
"alg": "RS256"
}
4.5.2 The jku carries a URL pointing to a JSON Web Key set
The jku attribute in the JOSE header carries a URL, which points to a JSON Web Key (JWK)
set. This JWK set represents a collection of JSON-encoded public keys, where one of the keys
is used to sign the JWS token. 2 Whatever protocol is used to retrieve the key set, it should
provide integrity protection. If keys are retrieved over HTTP, then instead of plain HTTP,
HTTPS (or HTTP over TLS) should be used. The jku is an optional attribute as per the JWS
specification. However, under the context of OpenID Connect, the jku attribute must
not be used to load the keys to verify an ID token; instead, the OpenID provider (the
issuer of the ID token) and the client application (the recipient of the ID token) must
communicate the keys used for signing by some other means, which we discuss in the
following sections.
{
"jku": "https://example.com/jwks.json"
}
2 A JSON Web Key (JWK) is a JSON representation of a cryptographic key, and a JSON Web Key Set (JWKS) is a representation of multiple JWKs. The RFC
7517 (https://tools.ietf.org/html/rfc7517) provides the structure and the definition of a JWK.
4.5.3 The jwk carries the public key corresponding to the signature
The jwk attribute in the JOSE header represents the public key corresponding to the key that
is used to sign the JSON payload. The key is encoded as per the JSON Web Key (JWK)
specification (https://tools.ietf.org/html/rfc7517). The jku parameter, which we discussed in
section 4.5.2, points to a link that holds a set of JWKs, while the jwk parameter embeds the
key into the JOSE header itself. The jwk is an optional parameter. However, as discussed in
section 4.5.2, under the context of OpenID Connect, the jwk attribute must not be
used to load the keys to verify an ID token; instead, the OpenID provider (the issuer of
the ID token) and the client application (the recipient of the ID token) must communicate
the keys used for signing by some other means, which we discuss in the following sections.
{
"jwk": <Embeds the JWK>
}
4.5.4 The kid represents an identifier for the key used to sign the message
The kid attribute of the JOSE header represents an identifier for the key that is used to sign
the JOSE payload. Using this identifier, the recipient of the JWS should be able to locate the
key. In OpenID Connect implementations, this is the most widely used approach. The actual
key used to sign the message is exchanged between the OpenID provider and the client
application, and when the client application has to validate the ID token, it looks up the
corresponding key using the kid attribute in the JOSE header.
If the token issuer uses the kid parameter in the JOSE header to let the recipient know
about the signing key, then the corresponding key should be exchanged “somehow” between
the token issuer and the recipient beforehand. How this key exchange happens is out of the
scope of the JWS specification. If the value of the kid parameter refers to a JWK, then the
value of this parameter should match the value of the kid parameter in the JWK. The kid is
an optional parameter in the JOSE header.
{
"kid": "wkek18392-199kkjh39-2983j7h"
}
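For example, assuming the OpenID provider publishes its keys at a JWKS endpoint (the URL below is a
placeholder), a client application using the Nimbus library could resolve the verification key by the kid in
the JOSE header as follows:

import java.net.URL;
import java.security.interfaces.RSAPublicKey;
import com.nimbusds.jose.jwk.JWKSet;
import com.nimbusds.jose.jwk.RSAKey;
import com.nimbusds.jwt.SignedJWT;

public class KeyByKid {

    static RSAPublicKey resolveKey(SignedJWT signedJWT) throws Exception {
        // load the OpenID provider's published JWK set (placeholder URL).
        JWKSet jwkSet = JWKSet.load(new URL("https://localhost:9443/oauth2/jwks"));
        // pick the key whose kid matches the kid in the JOSE header.
        String kid = signedJWT.getHeader().getKeyID();
        RSAKey rsaKey = (RSAKey) jwkSet.getKeyByKeyId(kid);
        return rsaKey.toRSAPublicKey();
    }
}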
intermediate CAs (certificate authorities) and the root CA. The x5u is an optional parameter in
the JOSE header. However, as discussed in section 4.5.2, under the context of OpenID
Connect, the x5u attribute must not be used when building the ID token.
{
"x5u": "https://example.com/x509.pem"
}
{
"x5c": <PEM encoded X.509 certificate or chain of certificates>
}
{
"x5t": "Khk48kek39kk3..."
}
Just as with the x5t attribute, the x5t#s256 attribute in the JOSE header represents
the base64url-encoded SHA-256 thumbprint of the X.509 certificate corresponding to the key
used to sign the JSON payload. The only difference between x5t#s256 and x5t is the
hashing algorithm. The x5t#s256 is an optional parameter in the JOSE header.
{
"x5t#S256": "Xjeklr39kdj3d..."
}
{
"crit": ["exp"]
}
Figure 4.5 A JWT formatted as a JWS. It has three parts: the JWT header, which is also known as the JOSE
header; the JWT body, which is also known as the claims set; and the signature.
1. First, to build the JOSE header, we construct a JSON object that includes all the header
attributes, which express the cryptographic properties of the JWS token. As
discussed before, the token issuer should advertise in the JOSE header (figure 4.5)
the public key corresponding to the key used to sign the message. This can be
expressed via any of these header
elements: jku, jwk, kid, x5u, x5c, x5t, and x5t#s256. However, as you learned
in section 4.5, when building an ID token under the context of OpenID Connect, you
cannot use the jku, jwk, x5u, and x5c attributes.
2. Compute the base64url-encoded value of the UTF-8 encoded JOSE header from
step 1 to produce the first element of the JWS token.
3. Construct the payload or the content to be signed; this is the JWS payload. The payload
is not necessarily JSON; it can be any content. However, as you learned already in
this chapter, if the JWS is to be a JWT, the JWS payload must be a JSON payload, and
we call it the claims set.
4. Compute the base64url-encoded value of the JWS payload from step 3 to produce
the second element (figure 4.5) of the JWS token.
5. Build the message to compute the digital signature or the MAC. The message is
constructed as ASCII(BASE64URL-ENCODE(UTF8(JOSE header)) ‘.’ BASE64URL-
ENCODE(JWS payload)).
6. Compute the signature over the message constructed in the previous step, following
the signature algorithm defined by the JOSE header element alg. The message is
signed using the private key corresponding to the public key advertised in the JOSE
header.
7. Compute the base64url-encoded value of the JWS signature produced in step 6,
which is the third element (figure 4.5) of the serialized JWS token.
8. Now we have all the elements to build the JWS token by concatenating the three
base64url-encoded elements from steps 2, 4, and 7, in that order, with a dot (.)
separating each element.
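Putting these eight steps together, the following sketch signs and then verifies a compact-serialized JWS
with the Nimbus library; the in-memory key pair stands in for the token issuer's real keys.

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.interfaces.RSAPublicKey;
import com.nimbusds.jose.JWSAlgorithm;
import com.nimbusds.jose.JWSHeader;
import com.nimbusds.jose.crypto.RSASSASigner;
import com.nimbusds.jose.crypto.RSASSAVerifier;
import com.nimbusds.jwt.JWTClaimsSet;
import com.nimbusds.jwt.SignedJWT;

public class JWSRoundTrip {

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keyPair = generator.generateKeyPair();

        // steps 1-4: the JOSE header and the claims set (the JWS payload).
        JWSHeader jwsHeader = new JWSHeader.Builder(JWSAlgorithm.RS256).build();
        JWTClaimsSet jwtClaims = new JWTClaimsSet.Builder()
                .subject("peter").issuer("sts.ecomm.com").build();

        // steps 5-8: sign with the private key and compact-serialize.
        SignedJWT signedJWT = new SignedJWT(jwsHeader, jwtClaims);
        signedJWT.sign(new RSASSASigner(keyPair.getPrivate()));
        String jwtInText = signedJWT.serialize();

        // the recipient parses the token and verifies the signature
        // with the issuer's public key.
        SignedJWT parsed = SignedJWT.parse(jwtInText);
        boolean valid = parsed.verify(new RSASSAVerifier((RSAPublicKey) keyPair.getPublic()));
        System.out.println("signature valid: " + valid);
    }
}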
4.7 What does a JSON Web Encryption (JWE) token look like?
In section 4.3, we stated that a JWT is a compact-serialized JWS. It can also be a
compact-serialized JSON Web Encryption (JWE) token. Like a JWS, a JWE represents an
encrypted message using compact serialization or JSON serialization. In this section we delve
deep into JWE, and you'll learn the structure of a JWE token and what differences there are
between a JWT and a JWE token.
JWS addresses the integrity and nonrepudiation aspects of the data contained in it, while
JWE protects the data for confidentiality. If you transfer some data in a JWS, for example,
the recipient of the JWS would know if the token got modified in the middle, by verifying the
signature of the token. If the signature verification fails, that means the message the
recipient got is not the same as the original message. In other words, the signature of the JWS
protects the integrity of the data it transfers. Also, the signature verification helps to achieve
nonrepudiation, which means the sender of the JWS cannot later dispute that they
did not send the message. However, JWS does not prevent anyone from seeing the content
of the message while in transit, unless you send the message over TLS. But when you use
JWE to transfer some data, irrespective of whether the transport channel uses TLS or not,
only the intended recipient of the token can see the content of the token.
A JWE is called a JWT only when compact serialization is used. Then again, as we
discussed in section 4.3, to be precise, for us to call a JWE a JWT, it has to be compact
serialized, and the JWE payload must be a JSON object. In a generic JWE token, the
JWE payload can be anything: XML, JSON, and so on. However, for a JWE to become a JWT,
the JWE payload must be a JSON object, and once the JWE payload is a JSON object,
we call it a claims set.
A compact-serialized JWE (see figure 4.6) has five parts; each part is base64url-encoded and
separated by a dot (.). The JOSE header is the part of the JWE that carries metadata related
to the encryption. The JWE encrypted key, initialization vector, and authentication tag are
related to the cryptographic operations performed during the encryption. We won’t talk about
those in detail here. If you’re interested, we recommend chapter 8 of the book, Advanced
API Security: OAuth 2.0 and Beyond. Finally, the ciphertext part of the JWE includes the
encrypted payload.
You might have noticed in figure 4.6 that there is no JWE payload. The ciphertext in
figure 4.6 is produced by encrypting the JWE payload, and that's why the JWE payload itself
does not appear in the figure.
With JSON serialization, the JWE is represented as a JSON payload. It isn't called a JWT.
The ciphertext attribute in the JSON-serialized JWE carries the encrypted value of any
payload, which can be JSON, XML, or even binary. The actual payload is encrypted and
represented in listing 4.6 as a JSON message with all related metadata. Since the focus of
this chapter is on JWT, we don't intend to discuss JSON-serialized JWE tokens in detail.
However, if you are interested in learning more about JWE JSON serialization, please check
chapter 8 of the book Advanced API Security: OAuth 2.0 and Beyond.
Listing 4.6 An example of a JWE token with JSON serialization and all related metadata
{
  "protected":"eyJlbmMiOiJBMTI4Q0JDLUhTMjU2In0",
  "unprotected":{
    "jku":"https://server.example.com/keys.jwks"
  },
  "recipients":[
    {
      "header":{
        "alg":"RSA1_5",
        "kid":"2011-04-29"
      },
      "encrypted_key":"UGhIOguC7IuEvf_NPVaXsGMoLOmwvc1G"
    },
    {
      "header":{
        "alg":"A128KW",
        "kid":"7"
      },
      "encrypted_key":"6KB707dM9YTIgHtLvtgWQ8mKwboJW3of9locizkDTHzBC2IlrT1oOQ"
    }
  ],
  "iv":"AxY8DCtDaGlsbGljb3RoZQ",
  "ciphertext":"KDlTtXchhZTGufMYmOYGS4HffxPSUrfmqCHXaI9wOGY",
  "tag":"Mz-VPPyU4RlcuYv1IwIvzw"
}
You can find this sample in the chapter04 directory of the book's sample repository. Before
you delve into the Java code that you'll use to build the JWE, try to build the sample and run
it. Run the following Maven command from the chapter04/sample02 directory. If everything
goes well, you should see the BUILD SUCCESS message at the end:
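mvn clean install    # assuming the sample follows a standard Maven build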
Now run your Java program to create a JWE with the following command (from the
chapter04/sample02/lib directory). If it executes successfully, it prints the base64url-
encoded JWE:
eyJlbmMiOiJBMTI4R0NNIiwiYWxnIjoiUlNBLU9BRVAifQ.Cd0KjNwSbq5OPxcJQ1ESValmRGPf
7BFUNpqZFfKTCd-9XAmVE-zOTsnv78SikTOK8fuwszHDnz2eONUahbg8eR9oxDi9kmXaHeKXyZ9
Kq4vhg7WJPJXSUonwGxcibgECJySEJxZaTmA1E_8pUaiU6k5UHvxPUDtE0pnN5XD82cs.0b4jWQ
HFbBaM_azM.XmwvMBzrLcNW-oBhAfMozJlmESfG6o96WT958BOyfjpGmmbdJdIjirjCBTUATdOP
kLg6-YmPsitaFm7pFAUdsHkm4_KlZrE5HuP43VM0gBXSe-41dDDNs7D2nZ5QFpeoYH7zQNocCjy
bseJPFPYEw311nBRfjzNoDEzvKMsxhgCZNLTv-tpKh6mKIXXYxdxVoBcIXN90UUYi.mVLD4t-85
qcTiY8q3J-kmg
JWE Header:{"enc":"A128GCM","alg":"RSA-OAEP"}
JWE Content Encryption Key: Cd0KjNwSbq5OPxcJQ1ESValmRGPf7BFUNpqZFfKTCd-9
XAmVE-zOTsnv78SikTOK8fuwszHDnz2eONUahbg8eR9oxDi9kmXaHeKXyZ9Kq4vhg7WJPJXS
UonwGxcibgECJySEJxZaTmA1E_8pUaiU6k5UHvxPUDtE0pnN5XD82cs
Initialization Vector: 0b4jWQHFbBaM_azM
Ciphertext: XmwvMBzrLcNW-oBhAfMozJlmESfG6o96WT958BOyfjpGmmbdJdIjirjCBTUA
TdOPkLg6-YmPsitaFm7pFAUdsHkm4_KlZrE5HuP43VM0gBXSe-41dDDNs7D2nZ5QFpeoYH7z
QNocCjybseJPFPYEw311nBRfjzNoDEzvKMsxhgCZNLTv-tpKh6mKIXXYxdxVoBcIXN90UUYi
Authentication Tag: mVLD4t-85qcTiY8q3J-kmg
Decrypted Payload:
{
  "sub": "peter",
  "aud": "*.ecomm.com",
  "nbf": 1533273878,
  "iss": "sts.ecomm.com",
  "exp": 1533274478,
  "iat": 1533273878,
  "jti": "17dc2461-d87a-42c9-9546-e42a23d1e4d5"
}
NOTE If you get any errors while executing the previous command, check whether you executed the
command from the correct location. It has to be from inside the chapter04/sample02/lib directory, not from
the chapter04/sample02 directory. Also make sure that the value of the -cp argument is within double
quotes.
Now take a look at the code that generated the JWE. It's straightforward and self-
explanatory with code comments. You can find the complete source code in the
sample02/src/main/java/com/manning/oidc/chapter04/sample02/RSAOAEPJWTBuilder.java
file. The method in the following listing does the core work of JWE encryption. It accepts the
token recipient's public key as an input parameter and uses it to encrypt the JWE with RSA-
OAEP.
aud.add("*.ecomm.com");
// create the encrypted JWT with the JWE header and the JWT claims set.
EncryptedJWT encryptedJWT = new EncryptedJWT(jweHeader, jwtClaims);
// encrypt the JWT with the token recipient's RSA public key, following RSA-OAEP.
encryptedJWT.encrypt(new RSAEncrypter(publicKey));
// serialize the encrypted JWT into its compact form.
String jwtInText = encryptedJWT.serialize();
return jwtInText;
}
{
"alg": "RSA-OAEP"
}
Typically, asymmetric key encryption is resource intensive and does not perform well when
it has to encrypt a large amount of data. Because of that, in most cases a symmetric
key does the data encryption, and then an asymmetric key encrypts the symmetric key.
4.9.2 The enc represents the algorithm used for content encryption
The enc attribute in the JOSE header represents the name of the algorithm that is used for
content encryption. This algorithm should be a symmetric Authenticated Encryption with
Associated Data (AEAD) algorithm. 3 This is a required attribute in the JOSE header; failure to
include it in the header will result in a token parsing error. The value of the enc parameter
is a string, which is picked from the JSON Web Signature and Encryption Algorithms registry
defined by the JSON Web Algorithms (JWA) specification
(https://tools.ietf.org/html/rfc7518). If the value of the enc parameter is not picked from
the preceding registry, then it should be defined in a collision-resistant manner, but that
won't give any guarantee that the particular algorithm is identified by all JWE
implementations. It's always better to stick to the algorithms defined in the JWA
specification.
3 Authenticated encryption (AE) and authenticated encryption with associated data (AEAD) are forms of encryption, which simultaneously assure the
confidentiality and authenticity of data (https://en.wikipedia.org/wiki/Authenticated_encryption).
{
"enc": "A256GCM" #A
}
#A Advanced Encryption Standard (AES) in Galois/Counter Mode (GCM) algorithm as defined in the JWA specification.
{
"zip": "DEF"
}
4.9.4 The jku carries a URL, which points to a JSON Web Key set
The jku parameter in the JOSE header carries a URL, which points to a JSON Web Key (JWK)
set. This JWK set represents a collection of JSON-encoded public keys, where one of the keys
is used to encrypt the Content Encryption Key (CEK). Whatever protocol is used to retrieve
the key set, it should provide integrity protection. If keys are retrieved over HTTP, then
instead of plain HTTP, HTTPS (or HTTP over TLS) should be used. The jku is an optional
parameter. However, as in JWS, under the context of OpenID Connect, the jku
attribute must not be used when building the ID token; instead, the OpenID provider
(the issuer of the ID token) and the client application (the recipient of the ID token) must
communicate the keys used for encryption by some other means.
{
"jku": "https://example.com/jwks.json"
}
4.9.5 The jwk attribute carries the public key corresponding to the CEK
The jwk attribute in the JOSE header represents the public key corresponding to the key that
is used to encrypt the Content Encryption Key (CEK). The key is encoded as per the JSON Web
Key (JWK) specification. The jku parameter, which we discussed before, points to a link that
holds a set of JWKs, while the jwk parameter embeds the key into the JOSE header itself.
The jwk is an optional parameter. However, as in JWS, under the context of OpenID
Connect, the jwk attribute must not be used when building the ID token.
{
"jwk": <Embeds the JWK>
}
4.9.6 The kid carries an identifier for the key used to encrypt the CEK
The kid parameter of the JOSE header represents an identifier for the key that is used to
encrypt the Content Encryption Key (CEK). Using this identifier, the recipient of the JWE
should be able to locate the key. If the token issuer uses the kid parameter in the JOSE
header to let the recipient know about the encryption key, then the corresponding key should
be exchanged “somehow” between the token issuer and the recipient beforehand. How this
key exchange happens is out of the scope of the JWE specification. If the value of the kid
parameter refers to a JWK, then the value of this parameter should match the value of the
kid parameter in the JWK. The kid is an optional parameter in the JOSE header.
{
"kid": "a7ejkje-eo38mehr-38klen"
}
{
"x5u": "https://example.com/x509.pem"
}
4.9.8 The x5c carries the X.509 certificate embedded into the token
The x5c attribute in the JOSE header represents the X.509 certificate (or the certificate
chain) that corresponds to the public key used to encrypt the Content Encryption
Key (CEK). This is similar to the jwk parameter we discussed before, but in this case, instead
of a JWK, it's an X.509 certificate (or a chain of certificates). The certificate or the certificate
chain is represented in a JSON array of certificate value strings. Each element in the array
should be a base64-encoded DER PKIX certificate value. The certificate corresponding to the
key used to encrypt the Content Encryption Key (CEK) should be the very first entry in the
JSON array, and the rest are the certificates of intermediate CAs (certificate authorities) and
the root CA. The x5c is an optional parameter in the JOSE header. However, as in JWS, under
the context of OpenID Connect, the x5c attribute must not be used when building
the ID token.
{
"x5c": <PEM encoded X.509 certificate or chain of certificates>
}
{
"x5t": "Xdelr79e..."
}
The x5t#s256 attribute in the JOSE header represents the base64url-encoded SHA-256
thumbprint of the X.509 certificate corresponding to the key used to encrypt the Content
Encryption Key (CEK). The only difference between x5t#s256 and x5t is the hashing
algorithm. The x5t#s256 is an optional parameter in the JOSE header.
{
"x5t#S256": "Xdweelw4kr79e..."
}
{
"crit": ["exp"]
}
1. Figure out the key management mode by the algorithm used to determine the Content
Encryption Key (CEK) value. This algorithm is defined by the alg attribute in the JOSE
header (figure 4.8). There is only one alg element per JWE token.
2. Compute the CEK and calculate the JWE Encrypted Key (figure 4.8) based on the key
management mode picked in the previous step. The CEK is later used to encrypt the JSON
payload. There is only one JWE Encrypted Key element in the JWE token.
3. Compute the base64url-encoded value of the JWE Encrypted Key, which is produced in
the previous step. This is the second element of the JWE token (figure 4.8).
4. Generate a random value for the JWE Initialization Vector. Irrespective of the
serialization technique, the JWE token will carry the base64url-encoded
value of the JWE Initialization Vector. This is the third element of the JWE token (figure
4.8).
5. If token compression is needed, the JSON payload in plaintext must be compressed
following the compression algorithm defined under the zip header element.
6. Construct the JSON representation of the JOSE header and find the base64url-encoded
value of the JOSE header with UTF-8 encoding. This is the first element of the JWE
token (figure 4.8).
7. To encrypt the JSON payload, we need the CEK (which we already have), the JWE
Initialization Vector (which we already have), and the Additional Authenticated Data
(AAD). Compute the ASCII value of the encoded JOSE header from the previous step and
use it as the AAD.
8. Encrypt the (optionally compressed) JSON payload from step 5 using the CEK,
the JWE Initialization Vector, and the Additional Authenticated Data (AAD), following
the content encryption algorithm defined by the enc header element.
9. The algorithm defined by the enc header element is an AEAD algorithm, and after the
encryption process, it produces the ciphertext and the Authentication Tag.
10. Compute the base64url-encoded value of the ciphertext, which is produced in
step 9. This is the fourth element of the JWE token (figure 4.8).
11. Compute the base64url-encoded value of the Authentication Tag, which is produced
in step 9. This is the fifth element of the JWE token (figure 4.8).
12. Now we have all the elements to build the JWE token by concatenating the five
base64url-encoded elements, in the order of the JOSE header, the JWE Encrypted Key,
the JWE Initialization Vector, the ciphertext, and the Authentication Tag, with a dot (.)
separating each element.
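The following sketch walks these steps with the Nimbus library, which computes the CEK, the
initialization vector, and the authentication tag internally; the in-memory key pair stands in for the token
recipient's real keys.

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.interfaces.RSAPrivateKey;
import java.security.interfaces.RSAPublicKey;
import com.nimbusds.jose.EncryptionMethod;
import com.nimbusds.jose.JWEAlgorithm;
import com.nimbusds.jose.JWEHeader;
import com.nimbusds.jose.crypto.RSADecrypter;
import com.nimbusds.jose.crypto.RSAEncrypter;
import com.nimbusds.jwt.EncryptedJWT;
import com.nimbusds.jwt.JWTClaimsSet;

public class JWERoundTrip {

    public static void main(String[] args) throws Exception {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keyPair = generator.generateKeyPair();

        // alg (key management) and enc (content encryption) in the JOSE header.
        JWEHeader jweHeader =
                new JWEHeader(JWEAlgorithm.RSA_OAEP_256, EncryptionMethod.A128GCM);
        JWTClaimsSet jwtClaims = new JWTClaimsSet.Builder()
                .subject("peter").issuer("sts.ecomm.com").build();

        // encrypt with the recipient's public key and compact-serialize.
        EncryptedJWT encryptedJWT = new EncryptedJWT(jweHeader, jwtClaims);
        encryptedJWT.encrypt(new RSAEncrypter((RSAPublicKey) keyPair.getPublic()));
        String jweInText = encryptedJWT.serialize();

        // the recipient decrypts with the matching private key.
        EncryptedJWT parsed = EncryptedJWT.parse(jweInText);
        parsed.decrypt(new RSADecrypter((RSAPrivateKey) keyPair.getPrivate()));
        System.out.println(parsed.getJWTClaimsSet().toJSONObject());
    }
}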
Figure 4.9 A nested JWT, where the enclosing JWT is a JWE token and the enclosed JWT is a JWS token.
Decrypting the ciphertext component of the JWE token produces the enclosed JWT, or the JWS token.
There are two parts in a nested JWT: the enclosing JWT and the enclosed JWT. In figure 4.9,
the enclosing JWT is a JWE token, and the enclosed JWT is a JWS token. Decrypting the
ciphertext component of the JWE token (the enclosing JWT) produces the JWS token.
According to the OpenID Connect specification, if you are to encrypt an ID token, it
has to be signed first and then encrypted. Signing an ID token produces a JWS token,
and encrypting that JWS token produces a JWE token, which is also a nested JWT.
Figure 4.10 A nested JWT, where both the enclosing JWT and the enclosed JWT are JWS tokens.
Figure 4.10 shows another form of a nested JWT, where both the enclosing JWT and the
enclosed JWT are JWS tokens. In both figure 4.9 and figure 4.10, the payload of the enclosing
JWT is not a JSON payload; in both cases, the payload is a JWT. So, in a JWT, if the
payload is another JWT, we must use the cty JOSE header attribute in the enclosing JWT and
set its value to JWT. We discussed the cty attribute in section 4.2.1.
{
"cty": "JWT"
}
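As a concrete illustration of the sign-then-encrypt pattern in figure 4.9, the following Nimbus sketch wraps
an already-signed JWT inside a JWE and sets the cty header to JWT; the recipient's public key is assumed
to be available from elsewhere.

import java.security.interfaces.RSAPublicKey;
import com.nimbusds.jose.EncryptionMethod;
import com.nimbusds.jose.JWEAlgorithm;
import com.nimbusds.jose.JWEHeader;
import com.nimbusds.jose.JWEObject;
import com.nimbusds.jose.Payload;
import com.nimbusds.jose.crypto.RSAEncrypter;
import com.nimbusds.jwt.SignedJWT;

public class NestedJWTBuilder {

    // signedJWT: the enclosed JWS token, already signed by the issuer.
    // recipientKey: the RSA public key of the token recipient.
    static String buildNestedJWT(SignedJWT signedJWT, RSAPublicKey recipientKey)
            throws Exception {
        JWEHeader jweHeader =
                new JWEHeader.Builder(JWEAlgorithm.RSA_OAEP_256, EncryptionMethod.A128GCM)
                        .contentType("JWT") // signals that the payload is a nested JWT
                        .build();
        // the payload of the enclosing JWE is the whole signed JWT.
        JWEObject jweObject = new JWEObject(jweHeader, new Payload(signedJWT));
        jweObject.encrypt(new RSAEncrypter(recipientKey));
        return jweObject.serialize();
    }
}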
In the nested JWT we had in figure 4.9, the use case is obvious. Having an enclosing JWE
token helps to keep the content of the enclosed JWS token protected for confidentiality. But
in the second case (figure 4.10), why do we need to sign a JWT twice? This is useful when one
application has to share a JWT it received from a token issuer with another application. In a
microservices deployment, for example, say one microservice receives a JWT, and this
microservice needs to authenticate to another microservice and also has to pass the original
context of the client who invoked the first microservice. In that case, the first microservice
can create a nested JWT by signing the original JWT it received from the client and pass the
nested JWT to the second microservice. Figure 4.11 illustrates this use case.
Figure 4.11 A nested JWT, where both the enclosing JWT and the enclosed JWT are JWS tokens. A token
issuer issues the enclosed JWT, and the client application passes it to the Inventory microservice. Then, the
Inventory microservice builds a nested JWT, signs the enclosing JWT with its own private key, and sends it to
the Order Processing microservice.
Apart from the two forms of nested JWTs we discussed so far in this section, there can be
another form, where both the enclosing and enclosed JWTs are JWEs. However, according to
the OpenID Connect specification, if a JWT is encrypted, it must be signed first, so this form
of a nested JWT is discouraged in an OpenID Connect flow.
One key limitation in all the forms of nested JWTs we discussed so far is that there is
no way for the enclosing JWT to have its own claims set; rather, the payload of the enclosing
JWT is another JWT. If you take the use case we illustrated in figure 4.11, there can be a
requirement where the Inventory microservice has to share some claims with the Order
Processing microservice along with the nested JWT. To address this requirement in a
standard way, there is a proposed draft specification at the IETF OAuth working group called
the Nested JSON Web Token specification (https://tools.ietf.org/html/draft-yusef-oauth-
nested-jwt-03).
This draft specification introduces a new value of the cty header attribute, called NJWT,
and a new attribute for the JWT claims set, called njwt. For a nested JWT to support this
model, the value of the cty header attribute has to be NJWT, and the enclosed JWT is
set as the value of the njwt attribute in the claims set. The following listing shows the JOSE
header and the claims set of such a nested JWT.
Listing 4.7 JOSE header and the claims set of a nested JWT
{
  "cty": "NJWT" #A
}
{
  "sub": "peter",
  "aud": "app.example.com",
  "nbf": 1533270794,
  "iss": "issuer.example.com",
  "exp": 1533271394,
  "iat": 1533270794,
  "jti": "5c4d1fa1-7412-4fb1-b888-9bc77e67ff2a",
  "njwt": <Enclosed JWT> #B
}
#A The value of the cty attribute must be set to NJWT in the JOSE header of the enclosing JWT. The JOSE header will have
other header attributes as well.
#B The njwt attribute in the enclosing JWT’s claims set carries the enclosed JWT
4.12 Summary
• The JWT is one of the key building blocks of the OpenID Connect standard.
• A JWT (pronounced jot) is a container that carries different types of assertions or
claims from one place to another in a cryptographically safe manner.
• A JWT is always a JWS or a JWE token. But the reverse is not always true.
• There are two forms of serialization for JWS and JWE tokens: the compact
serialization and JSON serialization.
• A JWS token becomes a JWT when it is compact serialized and the JWS payload is a
JSON payload.
• A JWE token becomes a JWT when it is compact serialized and the JWE payload is a
JSON payload.
• RFC 7519, developed under the IETF OAuth working group, defines the structure
and the processing rules of a JWT.
• RFC 7515, developed under the IETF JOSE working group, defines the structure
and the processing rules of a JWS token.
• RFC 7516, developed under the IETF JOSE working group, defines the structure
and the processing rules of a JWE token.
• A nested JWT is a JWT that carries another JWT as the payload.
• According to the OpenID Connect specification, if you are to encrypt an ID token, it
has to be signed first, and then encrypted, which will produce a nested JWT.
1 A claim dialect is a way of grouping a set of related claims. We can call all the claims that the OpenID Connect specification supports, the OpenID
Connect claim dialect.
The following list is an overview of the available options in OpenID Connect for transporting
claims between a client application and an OpenID provider (figure 5.1).
• A client application can request claims using the scope parameter in the
authentication request to the OpenID provider. This is the most popular approach to
request claims and almost all the OpenID provider implementations support it. We
discuss this approach in detail in section 5.2.
• A client application can request claims by talking to the userinfo endpoint of the
OpenID provider. The userinfo endpoint is a standard endpoint OpenID Connect
introduced to share claims with client applications. This approach is useful when
you use OpenID Connect in the hybrid mode (with code id_token token as the
response_type) and the length of the ID token is too long. We discuss this approach
in detail in section 5.3.
• A client application can request claims using the claims parameter in the OpenID
Connect authentication request. This gives you more control to selectively pick which
claims you need, without binding them to a scope value. We discuss this approach in
detail in section 5.4.
Figure 5.1 A client application can request claims from the OpenID provider in multiple ways. The most
popular way is to request claims using the scope parameter, and the OpenID provider can return claims either
in the ID token or via the userinfo endpoint. The other way to request claims is to use the claims parameter in
the OpenID Connect authentication request.
5.2.1 Requesting claims using the scope parameter from the Google OpenID provider
In this section we'll take you through a simple example to show how eBay requests claims
from the Google OpenID provider. If you visit ebay.com and click on log in, then you can pick
either Google or Apple for logging in. The options provided by eBay may vary at the time
you read the book, but that won't affect our discussion in this chapter. In addition to Google
and Apple, you will also see an option to log in with Facebook. However, login with Facebook
does not support OpenID Connect, so we won't worry about it here.
Let's assume you picked “log in with Google.” Then you'll be redirected to the Google
OpenID provider, and if you look at the browser location bar, you'll find the URL in
listing 5.1. This URL carries a set of query parameters, and all of them are standard
parameters defined in the OpenID Connect specification, except for the flowName
parameter. The flowName is a custom parameter that is specific to the Google OpenID
provider. However, the focus of this section is only on the scope parameter in the request.
Listing 5.1 An authorization code flow request to the Google OpenID provider
https://accounts.google.com/o/oauth2/v2/auth/identifier?locale=en_US&
client_id=510718330363&
scope=openid email profile& #A
response_type=code&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fwww.ebay.com%2Fsignin%2Fggl%2Fcb&
state=dl4xLjEjaV4xI3BeMSNyXjEjSV4zI2ZeMCN0XlVsNDF&
flowName=GeneralOAuthFlow #B
#A The scope parameter carries one or more values that are known to the OpenID provider and are used to request
claims.
#B A custom parameter that is specific to the Google OpenID provider
As we discussed in chapter 1, any OpenID Connect request must have openid as a
scope, and you can have other additional scopes as well. In listing 5.1, you can see two
other values defined under the scope parameter: email and profile. This is the most
common way a client application requests claims from an OpenID provider: using scope
values. The OpenID provider should know how to interpret the value of the scope parameter,
in this example email and profile, and return the corresponding claims to the client
application. The values email and profile are standard scopes defined by the OpenID
Connect specification, and we discuss all the standard scopes OpenID Connect defines in
section 5.2.3.
Listing 5.2 An authorization code flow request to the Apple OpenID provider
https://appleid.apple.com/auth/authorize?locale=en_US&
response_type=code%20id_token&
scope=name& #A
response_mode=form_post&
redirect_uri=https%3A%2F%2Ffanyv88.com%3A443%2Fhttps%2Fwww.ebay.com%2Fsignin%2Fapple%2Fcb&
state=dl4xLjEjaV4xI2ZeMCNyXjEjcF4xI0leMyN0XlVsN&
client_id=com.ebay.www
#A eBay uses name as the value of the scope parameter to request the user’s name from Apple.
In listing 5.2, eBay uses name as the value of the scope parameter to request the user's
name from Apple. However, as you might have already guessed, in this request we don't
pass openid as a scope. But as per the OpenID Connect specification, you must pass
openid as a scope in the authentication request; if you don't, then the behavior is
undefined, or in other words, the behavior may differ from one OpenID provider to another.
The Apple OpenID provider happens to respond correctly regardless of whether you pass
openid as a scope value. Still, it's always better to adhere to the OpenID Connect
specification and always pass openid as a scope value, in addition to the other values the
scope parameter carries.
Table 5.1 The standard scopes defined by the OpenID Connect specification
profile The profile scope value represents multiple claims: name, family_name,
given_name, middle_name, nickname, preferred_username, profile,
picture, website, gender, birthdate, zoneinfo, locale, and
updated_at. When you pass profile as a scope value, you request all these
claims from the OpenID provider. These 14 claims are known as standard claims in
OpenID Connect. They are called standard claims because their meaning is well
defined in the OpenID Connect specification. The OpenID Connect specification
defines 20 standard claims, including these 14, and we discuss those in detail in
section 3.2.4.
email The email scope value represents two claims: email and email_verified.
These two claims too are part of the 20 standard claims the OpenID Connect
specification defines. When you pass email as a scope value, you request these
two claims from the OpenID provider.
address The address scope value represents the standard claim address. When
you pass address as a scope value, you request the address claim from the
OpenID provider.
phone The phone scope value represents two claims: phone_number and
phone_number_verified. These two claims too are part of the 20 standard
claims the OpenID Connect specification defines. When you pass phone as a
scope value, you request these two claims from the OpenID provider.
DO NOT REDEFINE A STANDARD SCOPE! As a best practice, you should not try to redefine the
meaning of the standard scopes that are already defined in the OpenID Connect specification. That would
cause confusion and break interoperability between OpenID providers and client applications.
In section 5.2.3 we discussed how 19 out of the 20 standard claims are mapped to the four
standard scopes OpenID Connect defines. The only standard claim that is not mapped to a
scope is sub. The sub is a special identifier in the ID token, which is used to identify the
owner of the token. The following table lists all 20 standard claims the OpenID Connect
specification defines.
Table 5.2 The standard claims defined by the OpenID Connect specification
sub An identifier owned by the OpenID provider, which represents the end-user.
preferred_username The username preferred by the end-user. The value of this claim may not be unique
across all the users managed by the OpenID provider.
email The end-user’s email address. The value of this claim may not be unique across all
the users managed by the OpenID provider.
email_verified The value of this claim is true if the email of the end-user has been verified by the
OpenID provider, else false.
gender The gender of the end-user. The OpenID Connect specification defines male and female as
possible values; if neither is applicable, the OpenID provider can pick
its own.
zoneinfo The end-user’s time zone, for example, America/Los_Angeles.
locale The end-user’s locale, represented as a language tag, for example, en-US.
phone_number The end-user’s phone number. The value of this claim may not be unique across all
the users managed by the OpenID provider.
phone_number_verified The value of this claim is true if the phone_number of the end-user has been verified
by the OpenID provider, else false.
updated_at The time the end-user’s information was last updated, represented as the number of
seconds since the epoch.
In addition to the standard claims, an OpenID provider can define its own custom claims,
which you’ll find in the corresponding OpenID provider’s documentation. If you look at the Google OpenID provider documentation
available at https://developers.google.com/identity/protocols/oauth2/openid-connect, for
example, you will find the custom claim hd, which is the hosted G Suite domain of the user.
THE OPENID PROVIDER SHOULD KNOW HOW TO INTERPRET THE VALUE OF THE SCOPE
PARAMETER In section 5.2.3, we discussed how eBay uses name as a scope value to
request claims from the Apple OpenID provider. The name is not a standard scope. However, as we
discussed in section 5.2.4, name is a standard claim. The rule is that you can’t pass a standard claim name as a
scope value in the OpenID Connect authentication request and expect the corresponding claim value in the
response. If you use anything other than the standard scope values in the authentication request, the
behavior of such scope values must be defined by the corresponding OpenID provider’s documentation. So
we can conclude that the name value eBay passes to the Apple OpenID provider is a custom scope, not a
standard claim.
2 As we discussed in chapter 3, not all the OpenID Connect authentication flows return both the ID token and the access token. In the implicit
authentication flow, for example, when the response_type parameter is set to id_token, the OpenID provider only returns the ID token.
For the claims requested via the scopes discussed in section 5.2.3, you can guarantee that those are returned to the client
application via the ID token. In section 5.3 you’ll learn how to use an access token issued by
an OpenID provider to request claims by talking to the userinfo endpoint. In fact, the userinfo
endpoint of the OpenID provider is secured with OAuth 2.0, and the claims returned by it are
bound to the scope of the corresponding access token the OpenID provider returns along
with the ID token.
Listing 5.4 An authorization code flow request to the Google OpenID provider
https://accounts.google.com/o/oauth2/v2/auth?
client_id=4504439922519l17d82cli8.apps.googleusercontent.com&
redirect_uri=https://localhost:3000/redirect_uri&
scope=openid profile&
response_type=code&
state=caf7871khs872&
nonce=89hj37b3gd3
The most important parameter in listing 5.4 is the scope. That’s the parameter we focus
on in this section, and here we set its value to openid profile. So, we use the standard
scope profile to request attributes from the Google OpenID provider, and expect the
standard claims grouped under the profile scope (see table 5.1) in the response (more
precisely, in the ID token).
Once you copy and paste the request in listing 5.4 into your browser location bar, it
will take you to the Google OpenID provider for authentication. If you are not logged in already,
you may see a screen similar to figure 5.2. The same screen will display the name of
the client application you used during the application registration process, and the set of
attributes Google is about to share with the client application. We used profile as a scope
value (in addition to openid) in the authentication request, and the list of attributes shown in
figure 5.2 corresponds to that.
Figure 5.2 Google login screen for user authentication. The name of the client application you used during
the application registration process is displayed on the screen, along with the user attributes Google is going to
share with the client application.
Even though the profile scope groups 14 standard claims under it, the Google OpenID
provider shares only 4 claims with the client application: name, email_address, locale,
and picture. This is quite important to notice. There is no guarantee that the OpenID
provider will return all the claims a client application requests via scopes. It’s
up to the OpenID provider and also to the end-user to decide which claims they want to
share with the client application.
Once you complete the authentication flow at the Google OpenID provider, it will redirect
you back to the client application (to the redirect_uri we provided). However, since we do
not have any application running on that address, the response from the Google OpenID provider
will remain in the browser location bar, as shown in the following listing.
Listing 5.5 The response from the Google OpenID provider with the authorization code
https://localhost:3000/redirect_uri?
state=caf7871khs872&
code=4/5wFzvDar86R-AJECIT&
scope=profile openid https://www.googleapis.com/auth/userinfo.profile&
authuser=0&
prompt=consent
In listing 5.5, in the response from the Google OpenID provider, we got the authorization
code along with four other parameters. If you are familiar with the typical response you get
from an authorization endpoint, you may find this response from the Google OpenID provider a
bit strange. As per the OpenID Connect specification (and the OAuth 2.0 RFC), the response
from the authorization endpoint should carry only two parameters: code and state. 3
USE OF SCOPE WITH AUTHORIZATION CODE AND IMPLICIT AUTHENTICATION FLOWS The
scope is a parameter defined in the OAuth 2.0 RFC, and the OpenID provider is not required to return it
from the authorization endpoint. As per the RFC, the OpenID provider is only required to send back the scope
parameter in the response from the token endpoint, for the OpenID Connect authorization code authentication
flow. However, when using the implicit authentication flow, as discussed in section 3.7.2, the OpenID provider is
supposed to return the scope parameter in the response from the authorization endpoint.
However, the response from the Google OpenID provider includes three additional
parameters: scope, authuser, and prompt. These parameters are not defined in either the
OpenID Connect specification or the OAuth 2.0 RFC. Then again, neither of these specifications restricts
an OpenID provider from returning custom parameters in the response, so you should be
able to safely ignore them, or else consult the documentation provided by the corresponding
OpenID provider and handle them accordingly. These additional parameters the Google
OpenID provider returns are not documented anywhere, so we believe they are possibly
used by Google applications themselves. 4
3 In section 3.9, we discussed in detail how the OpenID Connect authorization code flow works.
4 The Google OpenID provider documentation is available at https://developers.google.com/identity/protocols/oauth2/openid-connect.
We got the value of the code parameter from the authorization endpoint of the Google OpenID
provider in listing 5.5. In listing 5.6, we first export the values of client_id,
client_secret, redirect_uri, and code to four environment variables. You may have your
own values for these four environment variables.
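The following sketch shows what such a request could look like; the token endpoint URL is taken from Google’s OAuth 2.0 documentation, and the client ID, client secret, and code values are placeholders you need to replace with your own.
Listing 5.6 Exchanging the authorization code at the token endpoint of the Google OpenID provider
\> export CLIENT_ID=your_client_id
\> export CLIENT_SECRET=your_client_secret
\> export REDIRECT_URI=https://localhost:3000/redirect_uri
\> export CODE=4/5wFzvDar86R-AJECIT
\> curl -X POST \
-d "client_id=$CLIENT_ID" \
-d "client_secret=$CLIENT_SECRET" \
-d "redirect_uri=$REDIRECT_URI" \
-d "code=$CODE" \
-d "grant_type=authorization_code" \
https://oauth2.googleapis.com/token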
The values of client_id, client_secret, and redirect_uri in listing 5.6 belong to the
client application that you registered at the Google OpenID provider, and will be the same for
all the requests your application sends to the token endpoint of the Google OpenID provider.
However, the value of the code will change for each request. The following code listing shows
the response from the token endpoint of the Google OpenID provider.
Listing 5.7 The response from the token endpoint of Google OpenID provider
{
"access_token": "ACCESS_TOKEN",
"expires_in": 3599,
"scope": "https://www.googleapis.com/auth/userinfo.profile openid",
"token_type": "Bearer",
"id_token": "ID_TOKEN"
}
For clarity, in listing 5.7 we’ve replaced the value of the access_token parameter with
ACCESS_TOKEN, and the value of the id_token parameter with ID_TOKEN.
The value of the access_token is a lengthy string generated by the Google OpenID provider,
and the value of the id_token is a JWT.
As you learned in chapter 4, a JWT can be a JSON Web Signature (JWS) or a JSON Web
Encryption (JWE). The value of the id_token parameter can be either of them. However, the
Google OpenID provider returns a JWS. A JWS has three parts: the header, the body
(claims set), and the signature.
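If you want to peek inside the ID token yourself, you can base64-decode its second part. The following is a minimal sketch, assuming the ID_TOKEN environment variable carries the JWT; depending on the token’s length, you may need to append base64 padding (= characters) for the decode to succeed.
\> echo "$ID_TOKEN" | cut -d '.' -f2 | base64 --decode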
The following listing shows the decoded body of the id_token, which carries the claims we
requested using the profile scope. In section 3.7.3, we had a detailed discussion of the
parameters included in the ID token. So, if you need any clarification on the content of listing
5.9, please refer to section 3.7.3.
So far you’ve learned how a client application can use scopes to request
claims from the OpenID provider. In the example we had in section 5.2, based on the
scopes, the OpenID provider included all or some of the requested claims in the ID token it
returned from the token endpoint.
In the approach we discuss in this section, the client application talks to a special
endpoint at the OpenID provider called the userinfo endpoint to request claims. You’ll first
learn what the userinfo endpoint is and why OpenID Connect introduced this special
endpoint, and then we’ll take you through an example with the Google OpenID provider.
Figure 5.3 The client application first gets an access token from the OpenID provider and then uses that access
token to request claims from the userinfo endpoint. The claims issued from the userinfo endpoint are bound to the
scope of the corresponding access token.
All three of these flows get you an access token, except the implicit flow with the response_type
parameter set to id_token. Having an access token is a primary requirement for a
client application to retrieve claims from the userinfo endpoint (see figure 5.3).
The userinfo endpoint is protected with OAuth 2.0; that’s why a client application
must have an access token to access it. The client application must first complete an OpenID
Connect authentication flow that returns an access token, and then use that access token to
talk to the userinfo endpoint of the OpenID provider; the userinfo endpoint returns
the requested claims of the corresponding user as a JSON payload.
To refresh your memory from chapter 3, the following table lists the tokens you get under
each authentication flow, based on the value of the response_type parameter in the
authentication request.
Table 5.3 The authentication flows defined by the OpenID Connect specification
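authorization code (response_type: code) Returns an ID token and an access token, both from the token endpoint.
implicit (response_type: id_token) Returns an ID token only, from the authorization endpoint.
implicit (response_type: id_token token) Returns an ID token and an access token, both from the authorization endpoint.
hybrid (response_type: code id_token, code token, or code id_token token) Returns an ID token and an access token, split between the authorization endpoint and the token endpoint depending on the response_type value.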
Unlike the ID token, which is a JWT, the response from the userinfo endpoint is a plain JSON
payload, unless the client application specifically requested a JWT. However, in most
cases, like the ID token, the claims included in the response from the userinfo endpoint
are also based on the value of the scope parameter in the OpenID Connect authentication
request. In section 5.5 we discuss an approach you can use to request claims from an
OpenID provider without relying on the scope value in the authentication request.
1. To onboard OAuth 2.0 client applications that worked around OAuth 2.0 to
authenticate users.
2. To onboard legacy client applications that didn’t have the cryptographic capabilities to
validate an ID token.
REASON 1 FOR THE OPENID CONNECT SPECIFICATION TO INTRODUCE THE USERINFO ENDPOINT
Prior to OpenID Connect, people worked around OAuth 2.0 to share user claims between an
identity provider (an OAuth 2.0 authorization server) and client applications. In section 1.5, we
discussed how client applications use Login with Facebook to authenticate users.
Facebook supports only OAuth 2.0 (at the time of this writing); however, it introduced an
endpoint (https://graph.facebook.com/me) where a client application can authenticate
with an access token corresponding to a Facebook user and retrieve their claims.
This endpoint Facebook introduced is quite similar to the userinfo endpoint we have in
OpenID Connect. However, it’s not only Facebook that introduced such endpoints to work
around OAuth 2.0 to share user claims; many other OAuth 2.0 authorization
servers did the same. OpenID Connect standardized this approach with the introduction of
the userinfo endpoint. The userinfo endpoint also helped migrate to OpenID Connect the OAuth 2.0 client
applications that worked around OAuth 2.0 to authenticate users.
These client applications used to call a custom endpoint hosted at the OpenID provider and
secured with OAuth 2.0 to get user claims. To migrate to OpenID Connect, they only had to use
the userinfo endpoint instead of the custom endpoint.
REASON 2 FOR THE OPENID CONNECT SPECIFICATION TO INTRODUCE THE USERINFO ENDPOINT
A client application that receives user claims in a JWT must validate the integrity of the token
by verifying its signature. In chapter 4, we discussed the JWT verification process in
detail. In the OpenID Connect implicit authentication flow and in some of the hybrid flows
(where the response_type is code id_token or code id_token token), the OpenID
provider returns the ID token in the response from the authorization endpoint.
The response from the authorization endpoint always flows via the browser. That means
the end-user sees what’s in it and has the ability to change the content of the ID token. So,
the client application should not accept any ID token that flows via the browser without
validating its integrity.
However, some time back, not all programming languages had libraries to validate a JWT,
and the userinfo endpoint helped onboard the applications developed in those
languages to OpenID Connect. Such a client application can take the access token it
gets from the authorization endpoint of the OpenID provider and use it to retrieve user claims
from the userinfo endpoint. The communication between the client application and the
userinfo endpoint must happen over TLS, and the client application can rely on TLS to protect the
integrity of the communication channel.
5.3.4 Using the userinfo endpoint of the Google OpenID provider with cURL
In this section you’ll learn how to retrieve claims from the userinfo endpoint of the Google
OpenID provider, using cURL. We assume you have completed the examples in section 5.2.6
and have a valid access token issued by the Google OpenID provider. If the access
token has expired, you can redo the same steps in section 5.2.6 to get a new one.
The following cURL command uses that access token to authenticate to the userinfo
endpoint and retrieve user claims. Here we first export the value of the access token to the
TOKEN environment variable and then run the cURL command against the userinfo endpoint
of the Google OpenID provider. As you learned already, the userinfo endpoint is an OAuth 2.0
protected endpoint, so we need to pass the access token in the Authorization HTTP header
with the Bearer prefix.
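The following is a minimal sketch of that command; the userinfo endpoint URL shown here comes from Google’s OpenID Connect discovery document at the time of this writing.
\> export TOKEN=ACCESS_TOKEN
\> curl -H "Authorization: Bearer $TOKEN" \
https://openidconnect.googleapis.com/v1/userinfo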
The following listing shows the response from the userinfo endpoint, which is a JSON payload
(not a JWT) that carries the user claims we requested via the scope parameter.
Listing 5.10 The response from the userinfo endpoint that returns user claims
{
"sub": "108063262378861625804",
"name": "Prabath Siriwardena",
"given_name": "Prabath",
"family_name": "Siriwardena",
"picture": "https://lh3.googleusercontent.com/a-
/AOh14GiriDTmbf8tcSKzMkFYvYwYuBMUmGFdtEBqpvRGOA\u003ds96-c",
"locale": "en"
}
If you compare the JSON response in listing 5.10 with the JWT claims set in listing 5.9
that is part of the ID token, you’ll find that in both cases the Google OpenID provider returns
the same set of user claims. However, the JWT claims set has additional claims (for example,
iss, aud, exp, and so on), but those are not directly related to the user; rather, they are used by
the client application to validate the JWT.
same origin. An origin of a given URL consists of the URI scheme, hostname, and port. Given
the URL http://localhost:8080/login, the following elements compose the origin:
• http—The URI scheme
• localhost—The hostname/IP-address
• 8080—The port
The sections after the port aren’t considered to be part of the origin; therefore, /login isn’t
part of the origin. The same-origin policy exists to prevent a malicious script
on one website from accessing data on other websites unintentionally. The same-origin policy
applies only to data access, not to loading CSS, images, and scripts, so you can write
web pages that link to CSS, images, and scripts of other origins. To be precise,
you can load scripts from other origins, but you cannot use those scripts to invoke an
endpoint in a different origin. Figure 5.4 illustrates this scenario.
Figure 5.4 In a web browser, the same-origin policy ensures that scripts running on a particular web page can
make requests only to services running on the same origin.
1. The browser loads an HTML file (index.html) from the domain example.com. This
request is successful.
2. The index.html file loaded into the browser makes a request to the same domain
(example.com) to load CSS and JavaScript files; it also loads data (makes an HTTP
request) from the domain example.com. All requests are successful because
everything is from the same domain as the web page itself.
3. The index.html file loaded into the browser makes a request to a domain named
example.org to load CSS and JavaScript files. This request, although made to a
different domain (example.org) from a web page loaded from another domain
(example.com), is successful because it’s loading only CSS and JavaScript.
4. The index.html file loaded into the browser loads data (makes an HTTP request) from
an endpoint on domain example.org. This request fails because, by default, the
browser doesn’t allow web pages in one domain (example.com) to make HTTP
requests to endpoints in other domains (example.org) unless the request is for CSS,
JavaScript, or images.
The server responds to this preflight request with the following headers:
• Access-Control-Allow-Credentials—Indicates whether the server allows the
request originator to send credentials in the form of authorization headers, cookies, or
TLS client certificates. This header carries a Boolean value: true or false.
• Access-Control-Allow-Headers—Indicates the list of headers allowed by the
particular resource on the server. If the server allows more than is requested via the
Access-Control-Request-Headers header, it returns only what is requested.
• Access-Control-Allow-Methods—Indicates the list of HTTP methods allowed by the
particular resource on the server. If the server allows more than is requested via the
Access-Control-Request-Method, it returns only the one requested (such as GET).
• Access-Control-Allow-Origin—Indicates the cross-origin allowed by the server.
The server may support more than one origin, but what is returned in this particular
header is the value of the Origin header requested if the server supports cross-origin
requests from the domain of the request originator (such as http://localhost:8080).
• Access-Control-Max-Age—Indicates for how long, in seconds, browsers can cache
the response to the particular preflight request (such as 3600).
Upon receiving the response to the preflight request, the web browser validates the response
headers to determine whether the target server allows the cross-origin request. If the
response headers to the preflight request don’t correspond to the request to be sent
(perhaps the HTTP method isn’t allowed, or one of the required headers is missing in the
Access-Control-Allow-Headers list), the browser stops the cross-origin request from being
executed, and will show an error message stating that a cross-origin request was blocked.
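To make this concrete, the following is a minimal sketch of a preflight request you could send with cURL; the http://example.org/data endpoint and the origin are hypothetical.
\> curl -i -X OPTIONS http://example.org/data \
-H "Origin: http://example.com" \
-H "Access-Control-Request-Method: GET" \
-H "Access-Control-Request-Headers: Authorization"
A server that permits the cross-origin request would respond with headers such as Access-Control-Allow-Origin: http://example.com, Access-Control-Allow-Methods: GET, Access-Control-Allow-Headers: Authorization, and Access-Control-Max-Age: 3600.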
To request independent claims from an OpenID provider, you need to use the claims
parameter in the OpenID Connect authentication request. The claims parameter is part of
the request object we discussed in section 3.12.
With the claims parameter in the authentication request, a client application can tell the
OpenID provider which claims it expects in the ID token and which claims it expects in the
response from the userinfo endpoint. When you use scopes to request user claims, it is not
possible to instruct the OpenID provider to return two different sets of claims, one set in
the ID token and the other in the response from the userinfo endpoint.
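The following sketch of listing 5.11 shows the value of such a claims parameter; it is the same claims object you’ll see again inside the request object in listing 5.13.
Listing 5.11 The claims parameter requesting claims via the userinfo endpoint and the ID token
"claims": {
  "userinfo": { #A
    "given_name": {"essential": true}, #B
    "nickname": null, #C
    "email": {"essential": true},
    "picture": null
  },
  "id_token": { #D
    "gender": null,
    "birthdate": {"essential": true}
  }
}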
#A This JSON Object carries the claim identifiers that are expected to be in the response from the userinfo endpoint.
#B The essential keyword indicates to the OpenID provider, this is a required claim for the client application to
function properly.
#C The null keyword indicates to the OpenID provider that there is no special requirement in requesting this claim
#D This JSON object carries the claim identifiers that are expected to be in the ID token
In listing 5.11, the client application requests the given_name, nickname, email, and picture
claims from the userinfo endpoint and expects the gender and birthdate claims to be in the ID
token. All of these are standard claim identifiers defined in the OpenID Connect
specification, and you can find the complete list in table 5.2.
The following listing shows the complete OpenID Connect authentication request that
carries the claims parameter. This example uses the authorization code
authentication flow; however, it works in a similar manner for the implicit and hybrid flows as
well. The claims parameter is inside the JWT that goes under the request parameter.
That’s why you cannot see it in the following listing.
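The following is a sketch of such a request; the endpoint and the truncated request JWT are illustrative, and the decoded claims set of that JWT appears in listing 5.13.
Listing 5.12 An authentication request that carries the claims parameter inside the request object
https://server.example.com/authorize?
 client_id=4504439922519l17d82cli8.apps.googleusercontent.com&
 response_type=code&
 scope=openid&
 state=af9ifjsldky&
 request=eyJhbGciOiJSUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzNkJoZFJrcXQzIi4uLg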
The following listing shows the decoded claims set of the JWT. There you can find the value
of the claims parameter. In section 3.12 we already discussed the structure of the request
parameter.
Listing 5.13 The decoded request parameter that carries the claims parameter
{
"iss": "s6BhdRkqt3",
"aud": "https://server.example.com",
"response_type": "code",
"client_id": "4504439922519l17d82cli8.apps.googleusercontent.com",
"redirect_uri": "https://localhost:3000/redirect_uri",
"scope": "openid",
"state": "af9ifjsldky",
"nonce": "n-0S6_WzA2Mj",
"max_age": 86400,
"claims":{
"userinfo":
{
"given_name": {"essential": true},
"nickname": null,
"email": {"essential": true},
"picture": null
},
"id_token":
{
"gender": null,
"birthdate": {"essential": true}
}
}
}
Figure 5.5 The client application can access the userinfo endpoint of the OpenID provider with the provided
access_token to retrieve user claims.
The API or the microservice does not need to know user claims. However, if you pass it an
access token with a scope that is bound to a set of user claims, the API or the
microservice can use that access token to talk to the userinfo endpoint of the OpenID
provider and retrieve user claims, which is not desirable behavior. So, instead of using
scopes to request claims, if you use the claims parameter in the OpenID Connect
authentication request to instruct the OpenID provider to include user claims in the ID token,
then whoever gets the corresponding access token won’t be able to retrieve user claims.
However, if the authentication request from the client application to the OpenID provider
asks for user claims in the response from the userinfo endpoint, the API
or the microservice can still use the corresponding access token to talk to the userinfo endpoint
of the OpenID provider and retrieve user claims. So, as a developer, when you decide which
claims you want from the userinfo endpoint and which you want embedded into the ID token, you need to be
mindful of who should have access to those claims.
An OpenID provider can also support custom claims that are not defined in the OpenID
Connect specification. If you take the Google OpenID provider, for
example, you will find the custom claim hd, which is the hosted G Suite domain of the user.
The way a client application requests claims from an OpenID provider does not change
based on whether it’s a custom claim or a standard claim. However, the OpenID provider you
use must support custom claims; in other words, the OpenID provider you use should let
you define custom claims, and most open source and commercial OpenID provider
products do support that.
If you have a client application that allows login with multiple OpenID providers, you
cannot expect consistency among different OpenID providers in how they handle custom
claims. For example, to request the billing address of a user, you may use the custom claim
billing_address with one OpenID provider, and the custom claim billingAddress with
another OpenID provider.
In section 1.9.5, we discussed a pattern for handling such cases by delegating
the claim transformation logic to a centralized identity provider. For example, your
application expects the ID token to carry the billing_address identifier all the time,
irrespective of which OpenID provider the user picks, and the centralized identity provider takes
care of transforming the different claim identifiers from different OpenID providers into the one
your application expects. However, how you do the claim transformation is outside the
scope of OpenID Connect.
USING CLAIMS TO CONTROL ACCESS TO A CLIENT APPLICATION How you control access at the
client application is outside the scope of OpenID Connect; OpenID Connect only helps you by bringing
user claims from the OpenID provider to the client application. The client application can define its own
access control policies in whatever way suits it best and evaluate those policies against the claims the ID
token brings in or the claims you get from the userinfo endpoint. In chapter 8 you’ll learn how to define
access control policies using the Open Policy Agent (OPA).
As per our discussion so far in the book, the OpenID provider holds and manages user
claims. These claims fall under the normal type. In that case, you can think of an OpenID
provider as a claims provider too. However, in certain enterprise use cases there are
specialized claims providers, which are not necessarily OpenID providers.
Identity proofing services are a good example of claims providers. Identity proofing is the
process of a user reliably identifying themselves to an identity proofing service. For example,
Evident ID (https://www.evidentid.com/) is a company that provides identity-proofing
services. You can upload a scanned copy of your US driving license to Evident ID and Evident
ID will verify the authenticity of the driving license and extract your first name, last name,
date of birth and driving license number from it.
Some identity proofing services make the proofing process more reliable by asking the
user to hold the driving license close to their face in front of the camera, making sure
the photo on the driving license is of the same person who is holding it
in front of the camera. Only if that verification is successful will the identity proofing service
extract the corresponding claims from the driving license and record those against
the corresponding user.
So, you can call an identity-proofing service a claims provider that holds verified claims,
and, to repeat, it does not necessarily need to be an OpenID provider. However, most
identity-proofing services expose an API, which an OpenID provider can consume to
retrieve verified claims.
Figure 5.6 The OpenID provider builds the ID token by aggregating claims from different claims providers. An
identity proofing service is a good example of a claims provider.
A given OpenID provider can connect to multiple claims providers (figure 5.6), and during a
user’s login process, the OpenID provider builds the ID token by aggregating all the claims it
gets from the connected claims providers. In the identity-proofing example, how the OpenID
provider should build the ID token with aggregated claims is defined in the OpenID Connect
for Identity Assurance specification.
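The following sketch of listing 5.14 shows the shape of such an ID token; the src1 name and the truncated JWT value are illustrative.
Listing 5.14 An ID token that carries aggregated claims as per OpenID Connect for Identity Assurance
{
  "iss": "https://op.example.com",
  "sub": "108063262378861625804",
  "_claim_names": {
    "verified_claims": "src1"
  },
  "_claim_sources": {
    "src1": {"JWT": "eyJhbGciOiJSUzI1NiJ9.eyJ2ZXJpZmllZF9jbGFpbXMiOi4uLg"}
  }
}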
Listing 5.14 uses two elements in the ID token, _claim_names and _claim_sources. Each
of them carries a JSON object. Each element in the _claim_names JSON object carries a
claim identifier, and the value of that claim identifier refers to an element in the
_claim_sources JSON object. Each element in the _claim_sources JSON object holds
another JSON object, and each of those JSON objects comes from a claims provider.
In listing 5.14, the name of the first element (and the only element) in the _claim_sources
JSON object is src1, and this is the element referred to from the verified_claims
element in the _claim_names JSON object. The value of the src1 element is another JSON
object, and according to the OpenID Connect specification, this JSON object must have an
element called JWT. This JWT carries all the claims provided by the corresponding claims
provider.
The identifiers _claim_names, _claim_sources, and JWT are defined in the OpenID
Connect specification, while the verified_claims identifier is defined in the OpenID Connect
for Identity Assurance specification. However, if you use aggregated claims for use
cases outside identity proofing, you don’t need to worry about the OpenID Connect for
Identity Assurance specification, and you can use whatever claim identifiers you need
under the _claim_names JSON object. The following listing shows a more generic example of
aggregated claims.
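The following is a sketch of such a generic example; the claim identifiers and the truncated JWT value are illustrative.
Listing 5.15 A generic example of aggregated claims
{
  "name": "Prabath Siriwardena",
  "_claim_names": { #A
    "email": "src1" #B
  },
  "_claim_sources": { #C
    "src1": {"JWT": "eyJhbGciOiJSUzI1NiJ9.eyJlbWFpbCI6Ii4uLiJ9"} #D
  }
}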
#A Each element in the _claim_names JSON object carries a claim identifier, and the value of that claim identifier
refers to an element in the _claim_sources JSON object.
#B A claim identifier and the corresponding value can be found under the src1 child element under _claim_sources
element
#C Each element in the _claim_sources JSON object holds another JSON object, and each of those JSON objects
comes from a claims provider
#D The JWT carries aggregated claims
In section 5.7.3 we discuss how to verify an ID token that carries aggregated claims.
Figure 5.7 The OpenID provider builds the ID token including the endpoint details of the claims providers, so the client
application can talk to them directly and retrieve user claims.
The following code listing shows an example of an ID token that carries distributed claims, as
per the OpenID Connect for Identity Assurance specification.
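The following is a sketch of such an ID token; the endpoint URL and the access token value are illustrative.
Listing 5.16 An ID token that carries distributed claims
{
  "iss": "https://op.example.com",
  "sub": "108063262378861625804",
  "_claim_names": {
    "verified_claims": "src1"
  },
  "_claim_sources": { #E
    "src1": {
      "endpoint": "https://claims-provider.example.com/claim_source", #F
      "access_token": "ksj3n283dke" #G
    }
  }
}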
#E Each element in the _claim_sources JSON object holds another JSON object, and each of those JSON objects
comes from a claims provider
#F Points to an endpoint of the distributed claims provider
#G The access token to authenticate to the endpoint of the distributed claims provider
In listing 5.16, each element in the _claim_sources JSON object defines an endpoint that
belongs to a claims provider, along with an optional access token for authentication. This
endpoint should return the corresponding claims in a JWT. The following listing is an example
of an ID token that returns both aggregated claims and distributed claims.
5.8 Summary
• A client application can use claims to identify the user, communicate with the user
(for example, use the user’s email address to send a newsletter), build a personalized user
experience, and perform application-specific authorization checks.
• A client application can request claims from an OpenID provider using scopes or
the claims parameter in the OpenID Connect authentication request.
• An OpenID provider can return claims to the client application either in the ID token
or via the userinfo endpoint.
• The OpenID Connect specification defines 20 standard claims and 4 standard scopes;
19 out of these 20 standard claims are mapped to the 4 standard scopes.
• The OpenID Connect specification defines three types of claims based on how they are
issued by an OpenID provider: normal claims, aggregated claims, and distributed
claims.
• The OpenID provider that the client application trusts directly issues a normal claim.
In other words, the OpenID provider the client application trusts asserts those
claims.
• Aggregated claims are issued by a claims provider the client application trusts and
returned to the client application via the ID token issued by the OpenID provider the
client application trusts.
• The OpenID provider does not embed distributed claims into the ID token. Rather,
the ID token carries the corresponding claims provider’s endpoint information, so the
client application can talk to it directly and retrieve the claims.
• How to secure access to a server-side web application using an OpenID Connect agent
• How to secure access to a server-side web application using an OpenID Connect proxy
• How to store tokens securely
In chapter 3 we discussed how to log in to a single-page application (SPA) using OpenID
Connect, following the implicit and authorization code flows. Even though SPAs are the most
popular application type at the time of this writing, there are still many server-side web
applications out there that we need to worry about.
Technically speaking, a SPA is also a web application, so when we say a server-side web
application, we mean a web application that has some logic running in the backend. It can be,
for example, a web application you develop using JavaServer Pages (JSP)/servlets, C#,
or any other framework that needs a runtime at the backend to execute its logic.
As we discussed in chapter 3, ideally the opposite of a SPA is a multi-page application
(MPA). But not all MPAs are server-side web applications; there can be MPAs that do not
need to run any backend logic in a web server. That’s why we use the term
server-side to identify the type of web applications we discuss in this chapter.
However, in this chapter we may use the terms server-side web application, web application,
and application interchangeably.
To demonstrate how to integrate OpenID Connect with a server-side web application, in
this chapter we use the Apache Tomcat web server to host the web application. However, based
on the language/framework you’d like to use to build your web application, you can pick
the application server to host it. If you use PHP to build your application, for
example, you can probably host it on the Apache Web Server.
Figure 6.1 An agent that runs in the same process as the server-side web application intercepts all the
requests coming to the web application and, for unauthenticated requests, initiates an OpenID Connect login flow.
1. The agent intercepts all the requests coming to the application to check whether each
request is already authenticated or not (step 1 in figure 6.1). Ideally this is done by
looking for an already known cookie in the request, which maps to some session
data the application stores in the backend (server side). In a Java EE application
running on a Tomcat server, for example, this can be tracked by using the
JSESSIONID cookie. Typically, the application server (for example, Tomcat) takes care of
managing the user session, so the agent only needs to check whether an
authenticated session exists or not.
2. For any unauthenticated request, the agent initiates an OpenID Connect login flow
and redirects the user to an OpenID provider (step 2 in figure 6.1).
3. The agent validates the OpenID Connect response it receives from the OpenID
provider and injects the user information it finds in the ID token into
memory, so the web application can simply read that information from memory, with
zero knowledge of OpenID Connect (step 9 in figure 6.1). Once the agent successfully
validates the OpenID Connect response from the OpenID provider, it creates a login
session for the corresponding user.
It looks like the tasks the OpenID Connect agent carries out require some heavy lifting! If
you were to implement an OpenID Connect agent, in addition to a deep understanding of
the OpenID Connect specification, you would also need to know how the corresponding application
server handles user sessions. The good news is that, as an application developer, you do not need to
worry about writing these agents yourself. You can find an agent that fits your needs,
especially one that works with your application server. In this chapter, we are using the Asgardeo
Tomcat OIDC Agent (https://github.com/asgardeo/asgardeo-tomcat-oidc-agent), which is an
open-source OpenID Connect agent implementation for Java EE applications running on
Apache Tomcat.
Figure 6.2 A proxy that runs outside the process of the server-side web application intercepts all the
requests coming to the web application and, for unauthenticated requests, initiates an OpenID Connect login flow.
Typically, a proxy is responsible for the following tasks; you’ll find this list is very
similar to what we discussed in section 6.1.1 with respect to agent-based single sign-on:
1. The proxy intercepts all the requests coming to the application to check whether each
request is already authenticated or not. Ideally this is done by looking for an already
known cookie in the request (step 1 in figure 6.2).
2. For any unauthenticated request, the proxy initiates an OpenID Connect login flow
and redirects the user to an OpenID provider (step 2 in figure 6.2). For example, an
Apache server or an Nginx server can act as a proxy. The mod_auth_openidc module
(https://github.com/zmartzone/mod_auth_openidc), for example, running on an
Apache server knows how to initiate an OpenID Connect login flow with an OpenID
provider it trusts. Similarly, the lua-resty-openidc module
(https://github.com/zmartzone/lua-resty-openidc) running on an Nginx server knows
how to initiate an OpenID Connect login flow. In section 6.4 you’ll learn how to secure
access to your server-side web application with the lua-resty-openidc module running on
an Nginx server.
3. The proxy validates the OpenID Connect response it receives from the OpenID
provider and injects the user information it finds in the ID token into the
HTTP request from the proxy to the web application, so the web application can simply
read that information from the HTTP request, with zero knowledge of OpenID
Connect. Once the proxy successfully validates the OpenID Connect response from the
OpenID provider, it also creates a login session for the corresponding user.
Proxy-based single sign-on is the preferred option over agent-based single sign-on in the
following cases:
• When you have to work with legacy web applications that are not designed to support
OpenID Connect.
• When you are allowed to make only minimal changes to your web applications.
• When you have a large number of web applications developed using different
technology stacks.
Some time back, I worked with a large financial institution in the USA to implement proxy-based
single sign-on using the mod_auth_openidc module running on an Apache server across 200+
server-side web applications. These web applications were written in different programming
languages and had their own built-in authentication mechanisms. These applications had
evolved over a couple of decades and were still working fine, so no one wanted to make any
serious changes to them; that was the key motivation for going with proxy-based
single sign-on. Also, since they had 200+ applications written in multiple programming
languages, if we had used agent-based single sign-on, they would have needed to find an OpenID
Connect agent written in each programming language and maintain all of those.
1. Download the sample web application as a WAR (web archive) file, which is written in
Java.
2. Update the web application configuration to include the Tomcat agent for OpenID
Connect integration.
3. Register the web application at an OpenID Provider and get a client ID and client
secret for the web application.
4. Configure the client ID and client secret you got from the OpenID Provider in the web
application.
5. Build the updated web application and deploy it in Tomcat; and start the Tomcat
server.
The Asgardeo OpenID Connect agent is a servlet filter. If you are familiar with Java,
you most probably know what a servlet filter is. For everyone else, think of a servlet filter
as a component that you can configure to intercept all the requests coming to a specific
path of your web application. Any Java EE application server supports servlet filters. So,
you can deploy the same servlet filter that you deployed in the Tomcat server in the JBoss
application server as well.
Let’s have a look at the code that integrates the OpenID Connect agent with the web
application. In the web.xml file, inside the oidc-sample-app/WEB-INF directory, you’ll find the
following code, which defines the fully qualified name of the Java class that implements the
OpenID Connect agent logic as a servlet filter.
Listing 6.1 Defining the Asgardeo OpenID Connect servlet filter in WEB-INF/web.xml file
<filter>
<filter-name>OIDCAgentFilter</filter-name> #A
<filter-class>io.asgardeo.tomcat.oidc.agent.OIDCAgentFilter</filter-class> #B
</filter>
#A The name of the servlet filter. This can be any name and will be referred from other places in the web.xml file.
#B The fully-qualified class name of the servlet filter implementation. This is in fact the OpenID Connect agent
implementation.
The following code listing shows how to set up the servlet filter defined in listing 6.1 against
certain paths of the web application. The OpenID Connect agent (or the servlet filter)
protects access to these paths.
Listing 6.2 Configuring the servlet filter against different paths in WEB-INF/web.xml file
<filter-mapping> #A
<filter-name>OIDCAgentFilter</filter-name> #B
<url-pattern>/logout</url-pattern> #C
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>/oauth2client</url-pattern> #D
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>*.jsp</url-pattern> #E
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>*.html</url-pattern> #F
</filter-mapping>
#A A filter mapping defines a mapping between a filter name and a path; the application server makes sure any
request coming to that path goes through the corresponding servlet filter first
#B The name of the servlet filter, as defined in listing 6.1
#C This servlet filter will intercept all the requests coming to the /logout path of this web application
#D This servlet filter will intercept all the requests coming to the /oauth2client path of this web application
#E This servlet filter will intercept all the requests coming to any file that has the .jsp extension
#F This servlet filter will intercept all the requests coming to any file that has the .html extension
In addition to the servlet filter implementation, the Asgardeo OpenID Connect agent also
comes with an event listener implementation (SSOAgentContextEventListener). This
listener is responsible for loading certain values from WEB-INF/classes/oidc-sample-
app.properties. The oidc-sample-app.properties file carries a set of properties related
to the OpenID provider the web application trusts for authentication, as well as some
properties related to the web application itself, as shown in the following listing.
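The following is a sketch of that properties file; the exact property names and placeholder values are assumptions based on the Asgardeo agent’s documentation, so check the agent’s sample configuration for the authoritative names.
Listing 6.3 A sample oidc-sample-app.properties file (sketch)
consumerKey=<your-client-id> #A
consumerSecret=<your-client-secret> #B
skipURIs=/oidc-sample-app/index.html #C
errorPage=/error.jsp #D
callBackURL=http://localhost:8080/oidc-sample-app/oauth2client #E
scope=openid #F
authorizeEndpoint=<authorize-endpoint-of-your-OpenID-provider> #G
logoutEndpoint=<logout-endpoint-of-your-OpenID-provider> #H
tokenEndpoint=<token-endpoint-of-your-OpenID-provider> #I
issuer=<issuer-identifier-of-your-OpenID-provider> #J
jwksEndpoint=<jwks-endpoint-of-your-OpenID-provider> #K
postLogoutRedirectURI=http://localhost:8080/oidc-sample-app/index.html #L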
#A The client ID corresponding to this web application, which you get from the OpenID provider
#B The client secret corresponding to this web application, which you get from the OpenID provider
#C Defines pages or the URIs the OpenID Connect agent should not worry about
#D Where to take the user, in case of an error
#E This is the same callback URL (or the redirect URI) you configure at the OpenID provider at the time you registered
your web application.
#F The value of the scope parameter
#G The authorize endpoint of the OpenID provider. This OpenID Connect agent will redirect any unauthenticated
requests to this endpoint.
#H The logout endpoint of the OpenID provider. When the user initiates logout, the web application will redirect the
user to this endpoint. You’ll learn about logout in chapter 7.
#I The token endpoint of the OpenID provider. The OpenID connect agent uses this endpoint to exchange the
authorization code it got from the authorize endpoint to an access token and an ID token.
#J The issuer of the ID token. The OpenID provider defines the value of the issuer and is included in the ID token. The
OpenID Connect agent only accepts ID tokens from an issuer it knows (trusts).
#K This is an endpoint defined by the OpenID provider, which provides the public key associated with the signature of
the ID token.
#L The endpoint belongs to the web application, where the OpenID provider will redirect the user after logout. You’ll
learn about logout in chapter 7.
The following code listing shows how the web application defines two listeners in the WEB-
INF/web.xml file. Earlier in this section we discussed the SSOAgentContextEventListener. In
addition to that, the OpenID Connect agent defines another listener called JKSLoader. The
JKSLoader listener is responsible for loading properties from the WEB-INF/classes/jks.properties
file. The jks.properties file defines a set of properties corresponding to the certificates that
are required to make a secure connection with the OpenID provider.
<listener>
<listener-class>io.asgardeo.tomcat.oidc.agent.SSOAgentContextEventListener</listener-class>
</listener>
<listener>
<listener-class>io.asgardeo.tomcat.oidc.agent.JKSLoader</listener-class>
</listener>
Figure 6.3 The server-side web application uses the authorization code flow to communicate with
the OpenID provider to authenticate the user.
Figure 6.3 is exactly the same as figure 3.11 in chapter 3; however, in
step 6, the request from the web application to the token endpoint of
the authorization server is slightly different. If you recall from chapter 3, even in the case of
a single-page application, the request to the token endpoint is very similar to
what you see in listing 6.5. However, the token request from the single-page application
(listing 3.4) didn’t carry an HTTP Authorization header, whereas in listing 6.5, the token
request from the server-side web application carries an HTTP Authorization header. The HTTP
Authorization header is used to authenticate the client application to the token endpoint of
the authorization server.
Listing 6.5 Request to the token endpoint of the OpenID provider (authorization code flow)
POST /token HTTP/1.1
Host: oauth2.googleapis.com
Content-Type: application/x-www-form-urlencoded
Authorization: Basic XXXXXXX
code=YDed2u73hXcr783d&
client_id=your_client_id&
redirect_uri=https%3A//oauth2.example.com/code&
grant_type=authorization_code
As you already learned in chapter 2, a single-page application is called a public client under
the OAuth 2.0 specification, and a public client does not have the capability to protect any
secrets. Since a SPA runs in the browser, it can’t hide any secrets from the end user.
Anything you hide in the browser is visible to the end user, so there’s no point in having
credentials for a SPA. That’s the reason its token request does not carry the HTTP Authorization header.
A server-side web application is called a confidential client under the OAuth 2.0
specification, because it can securely store secrets at the server side. The server-side web
application (either the agent or the proxy) generates the HTTP Authorization header in listing
6.5 by base64-encoding the client ID and client secret it got from the OpenID provider. The
following code snippet shows how the base64-encoding takes place.
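This is a minimal sketch, with placeholder credentials; the -n flag omits the trailing newline, which would otherwise corrupt the encoded value.
\> echo -n "your_client_id:your_client_secret" | base64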
Figure 6.4 The application stores the user data extracted from an ID token in a data structure that can be
looked up by the identifier that comes in the cookie from the browser.
Figure 6.4 illustrates how the session identifier from the cookie maps to the data
structure that carries the data from the token, stored in the web application’s memory. As
you might have rightly guessed, this approach of storing ID token data in memory
has a limitation. When you have multiple servers hosting the same web application behind a
load balancer (figure 6.5), you need to find a way to replicate the data structure you keep in
memory across all the servers; otherwise, when the load balancer routes a request carrying
the session cookie to a server that does not have the corresponding data structure in memory,
the request will result in an error. The other approach to fixing this is to
configure the load balancer to send all the requests within a given session to the same web
server. This is called session-aware load balancing.
Figure 6.5 When the load balancer routes a request carrying the session cookie to a
server that does not have the corresponding data structure in memory, the request results in an error.
Just like the ID token, in most cases the access token is also treated as a session
token, unless your web application has a specific requirement to use the access token to
perform certain tasks on behalf of the user even after the user has logged out from the
browser session. If you log in to meetup.com, for example, using your Google account via
OpenID Connect, meetup.com can use the access token Google provides to
publish events to your Google calendar. However, if meetup.com didn’t persist the access
token, it wouldn’t be able to update the event it published to your Google calendar while you
are not logged into meetup.com. It’s common for meetups to get rescheduled
sometimes, and if meetup.com didn’t persist your access token, it wouldn’t be able to update
your Google calendar with the latest changes. If you are to persist an access token, you need
to treat it as confidential data and store it in an encrypted format.
Even if you don’t persist the access token, you can still perform certain tasks
on behalf of the user while the user is not logged into the web application by using
the refresh token. You need to persist the refresh token in an encrypted format, and
whenever you need an access token, you can talk to the token endpoint of the authorization
server, authenticate with the refresh token, and get a new access token. In section 6.5 we
discuss refresh tokens in detail.
The refresh token grant type lets the client application renew an access token without
having to prompt the user of the application to log in again. To use the refresh token grant
type, the application should receive an access token and a refresh token in the token
response from the OpenID provider. Figure 6.6 illustrates the refresh token grant flow.
Figure 6.6 The refresh token grant type allows a token to be renewed when it expires.
The following listing shows a cURL request to refresh an access token. This looks very
similar to listing 2.5 in chapter 2, except that in listing 6.6 we pass openid as a value under
the scope parameter.
Listing 6.6 A sample cURL request with refresh token grant type
\> curl \
-u application_id:application_secret \
-H "Content-Type: application/x-www-form-urlencoded" \
-d "grant_type=refresh_token&
refresh_token=heasdcu8-as3t-hdf67-vadt5-asdgahr7j3ty3&
scope=openid" \
https://localhost:8085/oauth/token
The response from the OpenID provider to the cURL request in listing 6.6 may or may not
include an ID token. The OpenID Connect specification does not mandate including the ID
token in the token refresh response, and you need to check the documentation of your
OpenID provider to clarify that. If the OpenID provider includes the ID token in the
response, the ID token must adhere to the following rules the OpenID Connect specification
defines. These rules relate to certain claims (or attributes) the ID token carries, and
in section 3.8.4 of chapter 3 we explained the use of all of those attributes.
• The iss claim value of the ID token must be the same as in the ID Token issued when
the original authentication occurred.
• The sub claim value of the ID token must be the same as in the ID Token issued
when the original authentication occurred.
• The iat claim of the ID token must represent the time that the new ID Token is
issued.
• The aud claim value of the ID token must be the same as in the ID Token issued when the
original authentication occurred.
• If the ID Token contains an auth_time claim, its value must represent the time of
the original authentication, not the time that the new ID token is issued.
• The azp claim value of the ID token must be the same as in the ID Token issued
when the original authentication occurred and if no azp claim was present in the
original ID Token, one must not be present in the new ID token.
Then again, what’s the point of refreshing an ID token along with the access token? In
section 6.4 we explained that the web application validates the ID token and stores its
values in memory for future reference. The ID token has its own expiry time too, and it is
not necessarily the same as that of the access token. If an ID token is not expired, you do not
need to refresh it, and in practice most web application implementations do not
refresh the ID token even when it is expired; rather, the web application relies on the claims it
received in the original ID token throughout the user’s web session. This approach
of not refreshing an expired ID token has one major drawback: if you use the claims the ID
token carries to make access control decisions, you are making those decisions based on
stale data.
Whether to refresh an ID token or not is a decision you need to make consciously. For
example, in the case of a single-page application, I would not bother refreshing an ID token.
A single-page application is mostly driven by APIs, which are secured with OAuth 2.0, and the
ID token effectively plays a very minor role. You'll use the ID token to identify who the user
is only for display purposes; all the APIs will rely on the access token to find who the user is
and what actions the user can perform on them. In the case of a server-side web application,
whether to refresh the ID token depends on how much you rely on the claims the ID token
carries.
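For example, a minimal sketch of such a conscious check, assuming the web application stored the decoded ID token payload at login, might look like this:
// Returns true if the cached ID token claims are past their expiry (exp is in
// seconds since epoch); allow a small clock skew to avoid borderline failures.
function idTokenExpired(claims, clockSkewSeconds = 60) {
  const now = Math.floor(Date.now() / 1000);
  return claims.exp + clockSkewSeconds < now;
}
// If this returns true and you make access control decisions based on these
// claims, refresh the ID token (refresh token grant with scope=openid) rather
// than reusing stale data.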
• Download and deploy the sample web application behind the OpenResty server, so the
OpenResty server can forward the requests to the sample web application.
• Start the OpenResty server and also the Tomcat server that runs the sample web
application.
• Test the login flow by visiting the web application. All the requests to the web
application will be proxied through the OpenResty server.
#A The OpenID Connect metadata endpoint. The lua-resty-openidc module talks to this endpoint to discover the
authorization and token endpoints of the OpenID provider.
#B The client id from the OpenID provider. You need to replace this value with your own value.
#C The client secret from the OpenID provider. You need to replace this value with your own value.
#D Carries scope values and must include at least openid.
#E Defines what to do in case of an error condition
#F Extracts the sub attribute from the ID token and sets its value in the X-USER HTTP header. The upstream web
application can find the name of the user by reading the X-USER header from the HTTP request.
6.7 Summary
• Agent-based single sign-on and proxy-based single sign-on are the most common
ways of integrating OpenID Connect into a web application.
• With agent-based single sign-on, an agent running alongside the web application
intercepts all the requests coming to the web application to check whether each
request is already authenticated; if not, it initiates an OpenID Connect login flow
with the OpenID provider it trusts.
• Unlike agent-based single sign-on, with proxy-based single sign-on the proxy runs
outside of your server-side web application in its own process and intercepts all the
requests coming to the web application.
• If an ID token is not expired, you do not need to refresh it, and in practice most
web application implementations do not refresh the ID token even if it has expired;
rather, the web application relies on the claims it received in the original ID token
throughout the same web session of the user.
Logging out
However, this is not a common expectation. Some time back, Gmail provided an option to
log out from all active sessions (across all devices) with one click, but at the time of this
writing it is not available. Still, you can visit https://myaccount.google.com/device-activity
and selectively log out from any of the devices you are currently logged in to with your
Google account.
In this chapter you will learn how to implement single logout with OpenID Connect to
address the most common expectation of single logout. That is, when you log out from one
application, you'll be logged out from all the applications running under the current browser
session that are connected to the same identity provider, and also from the identity provider
itself.
In sections 7.3, 7.4 and 7.5, you’ll learn the merits of each of these three approaches.
One is loaded from the OpenID provider's domain while the other is from the application's
domain itself. Figure 7.1 elaborates the message flow that happens between an application
and an OpenID provider that supports the session management specification during the
login flow. This will also help you recall what's happening during the OpenID Connect login
flow, which we discussed in detail in chapter 3.
Figure 7.1 The client application uses authorization code authentication flow to communicate with an OpenID
provider that supports session management to authenticate the user.
Figure 7.1 shows the OpenID Connect authorization code login flow. In chapter 3 we
discussed this flow in detail, so you should already be familiar with the messages passed
between the OpenID provider and the application. However, when the application interacts
with an OpenID provider that supports session management, and if the session management
feature is enabled at the OpenID provider end, in the response from the OpenID provider to
the application (step 5 in figure 7.1), you'll find one more parameter, called session_state,
along with the code parameter. The value of session_state is an opaque string the
OpenID provider generates to track the login state of the user. The following code snippet
shows a sample response from the OpenID provider, which carries the session_state
parameter.
https://app.example.com/redirect_uri?code=YDed2u73hXcr783d&state=Xd2u73hgj59435&session_state=xfhet67jhj3gj
7.3.2 The role of iframes loaded from the client application’s domain and the
OpenID provider’s domain
In this section you’ll learn what happens at the client application once the OpenID Connect
login flow is completed. As shown in figure 7.2, once the OpenID Connect login flow is
completed, the client application loads two iframes into the browser. One iframe is loaded
from the OpenID provider’s domain and the other iframe is loaded from the client
application’s domain. The iframe loaded from the client application’s domain will have access
to the session_state parameter, while the iframe loaded from the OpenID provider’s
domain has access to the opbs cookie.
Figure 7.2 Once the login is completed, the client application loads two iframes into the browser. One is from
its own domain and the other iframe is loaded from the OpenID Provider’s domain.
Once the iframes are loaded into the browser, they start talking to each other. The client
application’s iframe is developed by the corresponding application developer, while the
iframe loaded from the OpenID provider’s domain is developed and managed by the OpenID
provider, and is available to all the client applications. The following code listing shows a
sample JavaScript code for the client application’s iframe.
Listing 7.1 The client application’s iframe that communicates with the OpenID provider’s
iframe
var stat = "unchanged";
var mes = client_id + " " + session_state;
var targetOrigin = "https://op.example.com"; // Validates origin
var opFrameId = "op";
var timerID;
function check_session() {
var win = window.parent.frames[opFrameId].contentWindow;
win.postMessage(mes, targetOrigin);
}
function setTimer() {
check_session();
timerID = setInterval(check_session, 5 * 1000);
}
function receiveMessage(e) {
if (e.origin !== targetOrigin) {
return;
}
stat = e.data; // "changed" or "unchanged", as reported by the OpenID provider's iframe
}
window.addEventListener("message", receiveMessage, false);
As shown in listing 7.1, the role of the client application's iframe is to periodically
communicate with the OpenID provider's iframe to check whether the value of
session_state corresponding to the logged-in user has changed (see figure 7.3). The client
application's iframe uses the window.postMessage() API (https://developer.mozilla.org/en-
US/docs/Web/API/Window/postMessage) to pass a message, which it constructs by
concatenating the client_id of the application with the session_state, to the OpenID
provider's iframe. You may recall from section 7.3.1 that at the end of the login flow, the
OpenID provider passes the session_state parameter to the client application. If the value
of session_state has changed, that means the user has logged out from the OpenID
provider, so the client application can also log out the user.
Figure 7.3 The client application's iframe periodically talks to the OpenID provider's iframe to check the
status of the user's login session.
The following code listing shows the sample code of the OpenID provider iframe.
Listing 7.2 The OpenID provider’s iframe that communicates with the client application’s
iframe
window.addEventListener("message", receiveMessage, false);
function receiveMessage(e) {
// Recalculates the session_state (see section 7.3.3) and sets stat accordingly
e.source.postMessage(stat, e.origin);
}
Upon receiving a message from the client application's iframe, the OpenID provider's iframe
first extracts the client_id and session_state from it. In the next section you'll learn
how the session_state parameter is constructed.
know the client_id, the origin, the salt, and the value written to the opbs cookie. The client
application's iframe passes the client_id and the origin to the OpenID provider's iframe,
and the OpenID provider's iframe can find the value of the salt by decoding the
session_state passed to it by the client application's iframe (the salt is the portion that
follows the period in the session_state value). The salt is a generated value that brings
randomness to the session_state.
With that, the only missing value needed to calculate the session_state is the value written
to the opbs cookie. Since the OpenID provider's iframe is loaded from the OpenID provider's
domain itself, it can access the opbs cookie. Now the OpenID provider's iframe can calculate
the session_state in the following way.
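The session management specification leaves the exact algorithm to the OpenID provider; its non-normative example derives the value as a salted SHA-256 hash over the client_id, the origin, and the opbs cookie value. The following Node.js sketch assumes that example algorithm and a hex-encoded digest, both of which a particular OpenID provider may do differently:
const crypto = require('crypto');

// session_state = hash(client_id + " " + origin + " " + opbs + " " + salt) + "." + salt
function computeSessionState(clientId, origin, opbs, salt) {
  const hash = crypto
    .createHash('sha256')
    .update(clientId + ' ' + origin + ' ' + opbs + ' ' + salt)
    .digest('hex');
  return hash + '.' + salt;
}

// The OpenID provider's iframe recomputes this value with the opbs cookie it
// reads from its own domain and compares it with the session_state the client
// application's iframe sent; the salt is the portion after the period.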
The only reason the value of the session_state calculated by the OpenID provider's iframe
can differ from the value passed to it by the client application's iframe is a change to the
opbs cookie.
Only the OpenID provider can override the value of the opbs cookie, and it will do so
when it finds that the corresponding user has logged out from the OpenID provider. So, if
the user initiates logout from another application that runs on the same browser as your
application and is connected to the same OpenID provider, the OpenID provider indirectly
notifies your application about the logout event by updating the value of the cookie. When
the value of the opbs cookie changes, the value of session_state passed to the OpenID
provider's iframe by the client application's iframe won't match the value the OpenID
provider's iframe derives. That's an indication to the client application that the user has
logged out from the OpenID provider, so the client application too can initiate its own logout
routines, which you'll learn about in the next section.
Once the application clears its local state and logs out the user from the application, it
constructs the following request and redirects the user to the logout endpoint of the OpenID
provider. Since this is initiated by the client application itself, in the OpenID Connect domain
it's widely known as OpenID Connect Relying Party (RP) Initiated Logout, and it is defined in
the specification at https://openid.net/specs/openid-connect-rpinitiated-1_0.html. The
following code snippet shows a sample HTTP GET request from the client application to the
OpenID provider's logout endpoint.
https://op.example.com/logout?id_token_hint=<ID_TOKEN>&state=XD2uedhgj59458&post_logout_redirect_uri=https://app.example.com/post_logout
The value of the id_token_hint parameter carries the ID token the OpenID provider issued
to the client application during the login flow. This is a recommended parameter; however, if
the request also carries the post_logout_redirect_uri parameter, then id_token_hint
becomes mandatory.
The post_logout_redirect_uri parameter carries a URL the OpenID provider should
redirect the user back to after it completes the logout flow. The value of the
post_logout_redirect_uri parameter must already be known to the OpenID provider, so
the client application should share it with the OpenID provider at the time of application
registration. In most cases, if no post_logout_redirect_uri is provided at application
registration, the OpenID provider uses the already registered redirect_uri as the
post_logout_redirect_uri. In the logout request itself, post_logout_redirect_uri is an
optional parameter, and if it is not present, the OpenID provider uses the value of the
post_logout_redirect_uri already registered with it.
The above explanation of the post_logout_redirect_uri parameter probably raised a
question in your mind! If the value of the post_logout_redirect_uri parameter the
application sends in the logout request must match the value already registered with the
OpenID provider, why does the application have to send it in the logout request in the first
place? Why can't the OpenID provider always use the value of the
post_logout_redirect_uri parameter registered with it to redirect the user back after
completing the logout flow? The reason is that you can register multiple
post_logout_redirect_uris with the OpenID provider and use one of them in the logout
request to the OpenID provider.
The state parameter in the logout request is an optional parameter. If the state
parameter is present in the logout request, the OpenID provider must return the same
value in the state parameter to the post_logout_redirect_uri. The state parameter
helps the client application correlate the front-channel state before and after redirecting the
user to the OpenID provider's logout endpoint. The value of the state parameter can be
anything the client application generates.
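Putting the parameters together, a client application could construct the logout request with a few lines of JavaScript; the endpoint and redirect URI below are illustrative values from the earlier snippet:
// idToken holds the ID token received at login (kept by the application).
const params = new URLSearchParams({
  id_token_hint: idToken,
  post_logout_redirect_uri: 'https://app.example.com/post_logout', // must be pre-registered
  state: crypto.randomUUID(), // echoed back by the OpenID provider for correlation
});
window.location.assign('https://op.example.com/logout?' + params.toString());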
1. Download the sample web application as a WAR (web archive) file, which is written in
Java.
2. Update the web application configuration to include the Tomcat agent for OpenID
Connect integration.
3. Register the web application at an OpenID Provider and get a client id and client
secret for the web application.
4. Configure the client id and client secret you got from the OpenID Provider in the web
application.
5. Build the updated web application and deploy it in Tomcat; and start the Tomcat
server.
6. Test the login flow by visiting the web application.
7. Once the login flow is completed you can experience the logout flow by clicking the
logout button on the application.
of not duplicating the steps to set up the samples in this chapter, and keeping them in the
GitHub repository; so, as the software versions used in the samples change, we have the
freedom to update the instructions in the README.md file as well as the code of the samples.
The Asgardeo OpenID Connect agent is developed as a servlet filter. If you are familiar with
Java, you most probably know what a servlet filter is. For all the others, think of a servlet
filter as a component that can be configured to intercept all the requests coming to a
specific path of your web application. Any Java EE application server supports servlet filters,
so you can deploy the same servlet filter that you deploy in the Tomcat server in the JBoss
application server as well.
Let's have a look at the code that integrates the OpenID Connect agent with the web
application. In the web.xml file, inside the oidc-sample-app/WEB-INF directory, you'll find the
following code, which defines the fully qualified name of the Java class that implements the
OpenID Connect agent logic as a servlet filter.
Listing 7.3 Defining the Asgardeo OpenID Connect servlet filter in WEB-INF/web.xml file
<filter>
<filter-name>OIDCAgentFilter</filter-name> #A
<filter-class>io.asgardio.tomcat.oidc.agent.OIDCAgentFilter</filter-class> #B
</filter>
#A The name of the servlet filter. This can be any name and will be referred to from other places in the web.xml file.
#B The fully-qualified class name of the servlet filter implementation. This is in fact the OpenID Connect agent.
The following code listing shows how to set up the servlet filter defined in listing 7.3 against
certain paths of the web application. The OpenID Connect agent (or the servlet filter) will
protect access to these paths.
Listing 7.4 Configuring the servlet filter against different paths in WEB-INF/web.xml file
<filter-mapping> #A
<filter-name>OIDCAgentFilter</filter-name> #B
<url-pattern>/logout</url-pattern> #C
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>/oauth2client</url-pattern> #D
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>*.jsp</url-pattern> #E
</filter-mapping>
<filter-mapping>
<filter-name>OIDCAgentFilter</filter-name>
<url-pattern>*.html</url-pattern> #F
</filter-mapping>
#A A filter mapping defines a mapping between a filter name and a path; the application server makes sure any
request coming to that path goes through the corresponding servlet filter
#B The name of the servlet filter, as defined in listing 7.3
#C This servlet filter will intercept all the requests coming to the /logout path of this web application
#D This servlet filter will intercept all the requests coming to the /oauth2client path of this web application
#E This servlet filter will intercept all the requests coming to any file that has the .jsp extension
#F This servlet filter will intercept all the requests coming to any file that has the .html extension
In addition to the servlet filter implementation, the Asgardeo OpenID Connect agent also
comes with an event listener implementation (SSOAgentContextEventListener), which is
responsible for loading certain values from WEB-INF/classes/oidc-sample-app.properties. The
oidc-sample-app.properties file carries a set of properties related to the OpenID provider the
web application trusts for authentication, as well as some properties related to the web
application itself, as shown in the following listing.
#A The client id corresponding to this web application, which you get from the OpenID provider
#B The client secret corresponding to this web application, which you get from the OpenID provider
#C Defines pages or the URIs the OpenID Connect agent should not worry about
#D Where to take the user, in case of an error
#E This is the same callback URL (or the redirect URI) you configure at the OpenID provider at the time you registered
your web application.
#F The authorize endpoint of the OpenID provider. The OpenID Connect agent will redirect any unauthenticated
requests to this endpoint.
#G The logout endpoint of the OpenID provider. When the user initiates logout, the web application will redirect the
user to this endpoint. You'll learn more about logout later in this chapter.
#H The token endpoint of the OpenID provider. The OpenID Connect agent uses this endpoint to exchange the
authorization code it got from the authorize endpoint for an access token and an ID token.
#I The issuer of the ID token. The OpenID provider defines the issuer value, which is included in the ID token. The
OpenID Connect agent only accepts ID tokens from an issuer it knows (trusts).
#J This is an endpoint defined by the OpenID provider, which provides the public key associated with the signature of
the ID token.
#K An endpoint of the web application to which the OpenID provider will redirect the user after logout.
The following code listing shows how the web application defines two listeners in the WEB-
INF/web.xml file. Earlier in this section we discussed the SSOAgentContextEventListener;
in addition to that, the OpenID Connect agent defines another listener called JKSLoader. The
JKSLoader listener is responsible for loading properties from the WEB-INF/classes/jks.properties
file. The jks.properties file defines a set of properties corresponding to the certificates that
are required to make a secure connection with the OpenID provider.
<listener>
<listener-class>io.asgardio.tomcat.oidc.agent.SSOAgentContextEventListener</listener-class>
</listener>
<listener>
<listener-class>io.asgardio.tomcat.oidc.agent.JKSLoader</listener-class>
</listener>
https://op.example.com/logout?id_token_hint=<ID_TOKEN>&state=XD2uedhgj59458&post_logout_redirect_uri=https://app.example.com/post_logout
Please check section 7.3.4 for a detailed explanation of each of the request parameters in
the above code snippet.
7.4.2 The OpenID provider responding to the client application's logout request
In this section you'll learn how the OpenID provider responds to a client application's logout
request, following the front-channel logout specification. Figure 7.4 explains the logout
flow from end to end. The client application initiates logout by talking to the OpenID
provider's logout endpoint, and then the OpenID provider responds by loading a set of
iframes under the OpenID provider's domain, which send logout requests to the other client
applications that have active login sessions on the same browser.
Figure 7.4 Client application 1 initiates logout by talking to the OpenID provider's logout endpoint, and the
OpenID provider responds by loading a set of iframes under the OpenID provider's domain, which send
logout requests to other client applications that have active login sessions on the same browser.
The OpenID provider can figure out the active sessions the user has on the same browser,
and the corresponding client applications, via the cookies the logout request brings in. Then
the OpenID provider loads an iframe for each application into the browser, under the OpenID
provider's domain name. Each iframe then, in parallel, does an HTTP GET to the
corresponding application's logout URL. If an application supports front-channel logout, it has
to register a logout URL with the corresponding OpenID provider. The following code snippet
shows a logout request the OpenID provider initiates via its iframe.
https://app.example.com/logout?iss=<ISSUER>&sid=<SESSION_ID>
The value of the iss parameter carries the issuer identifier of the OpenID provider. This is
the same identifier the OpenID provider embeds into the ID tokens it issues during the login
flow, under the iss claim. The value of the sid parameter carries a session identifier. An
OpenID provider that supports front-channel logout adds a unique ID it generates for a login
session to the ID token it issues to the client application during the login flow.
When each client application receives the logout request from the OpenID provider along
with the iss and sid parameters, it also gets all the cookies attached to the corresponding
client application's domain via the browser. With those, the client application can uniquely
identify the user session and then initiate the logout routine.
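On the client application side, the registered logout URL could be handled along these lines. This is a sketch using Express; the session store and its lookup-by-sid capability are assumptions about your application, not something the specification prescribes:
const express = require('express');
const app = express();

// Hypothetical session store keyed by the sid claim from the ID token.
const sessionStore = { destroyBySid: (sid) => { /* remove the session mapped to sid */ } };

// The OpenID provider's iframe does an HTTP GET to this registered logout URL.
app.get('/logout', (req, res) => {
  const { iss, sid } = req.query;
  // Accept logout requests only from the expected issuer.
  if (iss === 'https://op.example.com' && sid) {
    sessionStore.destroyBySid(sid);
  }
  res.sendStatus(200);
});

app.listen(8080);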
To build the React application, run the following command from the sample01 directory on
the command console. This command creates a directory called build with all the files that
you want to deploy into your production web server:
\> npm run build
In this example, we use a node server as our web server. You can start it using the following
command, run from the sample02 directory:
\> npm start
The above command starts the node server on localhost port 3000 by default; if you visit
http://localhost:3000 in your web browser, you will see a welcome message. This is the
simplest React application you can have; in the next two sections, you'll learn how to secure
this application with OpenID Connect.
provider, you need the following parameters to secure access to the React application
with OpenID Connect.
Listing 7.7 Parameters with sample values required to communicate with the OpenID provider
client_id: D4ZoMSpsxqgvUuiC6j5ROnEYea0a
redirect_uri: https://localhost:3000
Authorization endpoint: https://localhost:9443/oauth2/authz
Token endpoint: https://localhost:9443/oauth2/token
Logout endpoint: https://localhost:9443/oauth2/logout
Issuer: https://localhost:9443
ReactDOM.render(
<OIDCProvider
domain="localhost:9443"
tokenEp="https://localhost:9443/oauth2/token"
authzEp="https://localhost:9443/oauth2/authorize"
clientId="D4ZoMSpsxqgvUuiC6j5ROnEYea0a"
issuer="https://localhost:9443/oauth2/token"
redirectUri={window.location.origin}>
<App />
</OIDCProvider>,
document.getElementById('book-club-app')
);
Now you can replace the code in sample02/src/components/App.js with the following, which
adds a login button to the welcome page.
Listing 7.9 Updated App.js code that renders the login button to initiate the login flow
import React from 'react';
import { useAuth0 } from '@facilelogin/oidc-react';
function App() {
const {isLoading,isAuthenticated,error,user,loginWithRedirect,logout,} = useAuth0();
if (isLoading) {
return <div>Loading...</div>;
}
if (error) {
return <div>Oops... {error.message}</div>;
}
if (isAuthenticated) {
console.log(user.id);
return (
<div>
Hello {user.sub}{' '}
<button onClick={() => logout({ returnTo: window.location.origin })}>
Log out
</button>
</div>
);
} else {
return <button onClick={loginWithRedirect}>Log in</button>;
}
}
export default App;
Now you can build the updated React application and start the node server using the
following two npm commands:
\> npm run build
\> npm start
Once the node server has successfully started, you can visit http://localhost:3000 and click
the login button to initiate the login flow; you will be redirected to the OpenID provider's
login page. Once the login flow is completed, you can click the logout button on the client
application to initiate the logout flow.
7.5 Summary
• With single logout, once you log out from one application, you'll be logged out from all
the applications running under the current browser session that are connected to the
same identity provider.
• The OpenID Connect core specification does not talk about logout. Logout is defined in
three other specifications developed by the OpenID working group:
• The session management specification (https://openid.net/specs/openid-connect-
session-1_0.html) defines how to implement logout functionality between a client
application and an OpenID provider using two iframes. This is the first logout
specification the OpenID working group introduced.
• The front-channel logout specification (https://openid.net/specs/openid-connect-
frontchannel-1_0.html) defines how to implement logout functionality between a
client application and an OpenID provider entirely using the browser (front channel),
without requiring the client application to load an iframe from the OpenID provider.
• The back-channel logout specification (https://openid.net/specs/openid-connect-
backchannel-1_0.html) defines how to implement logout functionality between a
client application and an OpenID provider using direct (back-channel) communication
between the OpenID provider and the client application, rather than relying on the
browser (front channel).
called role-based access control. When you access a Facebook page, for example, as a
member, you can view the posts by other members but won't be able to delete any of
them. However, if you are the admin of that Facebook page, you can delete any post by a
member. Here, Facebook decides what you can do on a Facebook page by the roles assigned
to you.
As discussed in chapter 5, in most cases in OpenID Connect, an application gets the user
claims via the ID token returned to it by the OpenID provider. However, to make an access
control decision, the claims alone are not enough. They are the inputs, but we also need
policies! In our previous example, who can delete a post on a Facebook page is defined by a
policy.
When we build an application, we need a way to represent policies, as well as a policy
engine to evaluate policies against certain inputs. In this chapter we use Open Policy Agent
(OPA) as our policy engine. You'll learn in this chapter how to set up OPA and use its REST
API to make access control decisions based on the claims an ID token carries.
Figure 8.1 Components of a typical access-control system. The PAP defines access-control policies and then
stores those in the policy store. At runtime, the PEP intercepts all the requests, builds an authorization
request, and talks to the PDP. The PDP loads the policies from the policy store and any other missing
information from the PIP, evaluates the policies, and passes the decision back to the PEP. Based on the
response from the PDP, the PEP decides whether the request should be dispatched to the corresponding
resource or not.
Most of the time, PAP implementations come with their own user interface or expose the
functionality via an API. Some access-control systems don’t have a specific PAP; rather, they
read policies directly from the filesystem, so you need to use third-party tools to author
these policies. Once you define the policies via a PAP, the PAP writes the policies to a policy
store. The policy store can be a database, a filesystem, or even a service that’s exposed via
HTTP.
The PEP is the component that enforces access control policies before a request hits a
protected resource. The protected resource can be a server-side web application, a single-
page application, a mobile application, an API, or even a microservice. How you design and
implement the PEP differs based on the type of the resource it wants to protect. If the
resource is a server-side web application written in Java, for example, then you can
implement the PEP as a servlet filter. The servlet filter intercepts all the requests directed to
a given web application, so it's a good place to centrally enforce access control policies.
However, this approach is language-specific. In other words, if you want to protect a server-
side application written in C#, then you would need to implement a similar kind of
interceptor using C#.
The best way to implement a PEP for a server-side web application in a language-agnostic
way is to front the web application with a proxy and run the PEP at the proxy. This proxy
can be an Apache server or an Nginx server, for example. If it's an Apache server, you would
need to build the PEP as an Apache module, and if it's an Nginx server, you would need to
build the PEP as an Nginx module.
When the PEP intercepts a request, it extracts certain parameters from the request—for
example, certain claims from the ID token—and creates an authorization request. Then it
talks to the PDP to check whether the request is authorized or to find out what actions the
corresponding user is eligible to perform on the protected resource.
When the PEP talks to the PDP to check authorization, the PDP loads all the
corresponding policies from the policy store. And while evaluating an authorization request
against the applicable policies, if there is any required but missing information, the PDP will
talk to a PIP. For example, let's say we have an access-control policy that says a user can
buy a beer only if the logged-in user's age is greater than 21, but the authorization request
carries only the logged-in user's name. The age is the missing information here, and the PDP
talks to a PIP to find the corresponding user's age. We can connect multiple PIPs to a PDP,
and each PIP can connect to a different data source.
Figure 8.2 An application or a PEP can integrate with the OPA policy engine via its HTTP REST API or via the Go
API.
When you run the OPA server as a standalone deployment, it exposes a set of REST APIs
that PEPs can connect to and check authorization. In figure 8.2, the OPA engine acts as the
PDP.
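To make that concrete, a PEP could check authorization by POSTing an input document to OPA's data API, as in the following sketch. The path segment (authz/orders) mirrors a policy's package name and, like the input attributes, is illustrative:
// Queries OPA's REST (data) API and returns true only if the policy allows the request.
async function isAllowed(input) {
  const res = await fetch('http://localhost:8181/v1/data/authz/orders', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ input }), // OPA expects the query input under the "input" key
  });
  const body = await res.json(); // e.g., {"result": {"allow": true}}
  return body.result !== undefined && body.result.allow === true;
}

isAllowed({ method: 'POST', path: 'orders', role: 'manager' }).then(console.log);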
The open source distribution of the OPA server doesn’t come with a policy authoring tool
or a user interface to create and publish policies to the OPA server. But you can use a tool
like Visual Studio (VS) Code to create OPA policies, and OPA has a plugin for VS Code
(https://marketplace.visualstudio.com/items?itemName=tsandall.opa). If you decide to
embed the OPA server (instead of using it as a hosted server) as a library in your application,
you can use the Go API (provided by OPA) to interact with it.
Once you have the policies, you can use the OPA API to publish them to the OPA server.
When you publish those policies via the API, the OPA engine keeps them in memory only.
You'll need to build a mechanism to publish policies every time the server boots up. The
other option is to copy the policy files to the filesystem, and the OPA server will pick them up
when it boots up. If any policy changes occur, you'll need to restart the OPA server.
However, there is an option to ask the OPA server to load policies dynamically from the
filesystem, but that’s not recommended in a production deployment. Also, you can push
policies to the OPA server by using a bundle server; we discuss that in detail in section 8.2.5.
OPA has a PIP design to bring external data into the PDP (the OPA engine). This
model is quite similar to the model we discussed in the previous paragraph with respect to
policies. In section 8.2.5, we detail how OPA brings in external data.
To start the OPA server, run the following command from the chapter08/sample01 directory.
This loads the OPA policies from the chapter08/sample01/policies directory (in section 8.2.5,
we discuss OPA policies in detail):
\> sh run_opa.sh
{
"addrs":[
":8181"
],
"insecure_addr":"",
"level":"info",
"msg":"Initializing server.",
"time":"2019-11-05T07:19:34Z"
}
You can run the following command from the chapter08/sample01 directory to test the OPA
server. The chapter08/sample01/policy_1_input_1.json file carries the input data for the
authorization request in JSON format (in section 8.2.4, we discuss authorization requests in
detail):
{"result":{"allow":true}}
recommends defense in depth and ensuring that communication between it and its clients is
secured via mTLS.
In this section, we discuss how to protect the OPA server with mTLS. This ensures that all
the communications between the OPA server and its client applications are encrypted. Also,
only legitimate clients with the proper keys can talk to the OPA server. To protect the OPA
server with mTLS, we need to accomplish the following tasks:
• Generate a public/private key pair for the OPA server
• Generate a public/private key pair for the OPA client
• Generate a public/private key pair for the certificate authority (CA)
• Sign the public key of the OPA server with the CA’s private key to generate the OPA
server’s public certificate
• Sign the public key of the OPA client with the CA’s private key to generate the OPA
client’s public certificate
To perform all these tasks, we can use the chapter08/sample01/keys/gen-key.sh script with
OpenSSL. Let’s run the following Docker command from the chapter08/sample01/keys
directory to spin up an OpenSSL Docker container. You’ll see that we mount the current
location (which is chapter08/sample01/keys) from the host filesystem to the /export
directory on the container filesystem:
Once the container boots up successfully, you’ll find a command prompt where you can type
OpenSSL commands. Let’s run the following command to execute the gen-key.sh file that
runs a set of OpenSSL commands, to generate keys that are required to secure the OPA
server with mTLS:
# sh /export/gen-key.sh
Once this command executes successfully, you’ll find the keys corresponding to the CA in the
chapter08/sample01/keys/ca directory, the keys corresponding to the OPA server in the
chapter08/sample01/keys/opa directory, and the keys corresponding to the OPA client in the
chapter08/sample01/keys/client directory.
In case you’re already running the OPA server, stop it by pressing Ctrl-C on the
corresponding command console. To start the OPA server with TLS support, use the following
command from the chapter08/sample01 directory:
\> sh run_opa_tls.sh
{
"addrs":[
":8181"
],
"insecure_addr":"",
"level":"info",
"msg":"Initializing server.",
"time":"2019-11-05T19:03:11Z"
}
You can run the following command from the chapter08/sample01 directory to test the OPA
server. The chapter08/sample01/policy_1_input_1.json file carries the input data for the
authorization request in JSON format. Here we use HTTPS to talk to the OPA server:
{"result":{"allow":true}}
Let’s check what’s in the run_opa_tls.sh script, shown in the following listing. The code
annotations in the listing explain what each argument means.
#A Instructs the OPA server to load policies from policies directory, which is mounted to the OPA container
#B The OPA server finds key/certificate for the TLS communication from the keys directory, which is mounted to the
OPA container
#C Port mapping, maps the container port to the host port
#D Name of the OPA Docker image
#E Runs the OPA server by loading policies and data from the policies directory, which is mounted to the OPA
container
#F Certificate used for the TLS communication
#G Private key used for the TLS communication
#H Starts the OPA engine under the server mode
Now the communication between the OPA server and the OPA client (curl) is protected with
TLS. But still, anyone having access to the OPA server's IP address can access it over TLS.
There are two ways to protect the OPA endpoint with authentication: token authentication
and mTLS.
With token-based authentication, the client has to pass an OAuth 2.0 token in the HTTP
Authorization header as a bearer token, and you also need to write an authorization policy. 1
In this section, we focus on securing the OPA endpoint with mTLS.
If you’re already running the OPA server, stop it by pressing Ctrl-C on the corresponding
command console. To start the OPA server enabling mTLS, run the following command from
the chapter08/sample01 directory:
\> sh run_opa_mtls.sh
Let’s check what’s in the run_opa_mtls.sh script, shown in the following listing. The code
annotations explain what each argument means.
#A The public certificate of the CA. All the OPA clients must carry a certificate signed by this CA
#B Enables mTLS authentication
You can use the following command from the chapter08/sample01 directory to test the OPA
server, which is now secured with mTLS:
Here, we use HTTPS to talk to the OPA server, along with the certificate and the key
generated for the OPA client at the start of this section. The key and the certificate of the
OPA client are available in the chapter08/sample01/keys/client directory. Since we have
secured the OPA server with mTLS, only the trusted client applications can connect to it now.
package authz.orders.policy1 #A
default allow = false #B
allow { #C
input.method = "POST" #D
input.path = "orders"
input.role = "manager"
}
allow {
input.method = "POST"
input.path = ["orders",dept_id] #E
input.deptid = dept_id
input.role = "dept_manager"
}
#A The package name of the policy. Packages let you organize your policies into modules, just as with programming
languages.
#B By default, all requests are disallowed. If this isn't set and no allow rules are matched, OPA returns an
undefined decision.
#C Declares the conditions to allow access to the resource
#D The input document is an arbitrary JSON object handed to OPA and includes use-case-specific information. In this
example, the input document includes a method, path, role, and deptid. This condition requires that the method
parameter in the input document be POST.
#E The value of the path parameter in the input document must match this value, where dept_id is bound to the
deptid parameter from the input document.
The policy defined in listing 8.3, which you’ll find in the policy_1.rego file, has two allow
rules. For an allow rule to return true, every statement within the allow block must return
true. The first allow rule returns true only if a user with the manager role is the one doing
an HTTP POST on the orders resource. The second allow rule returns true if a user with the
dept_manager role is the one doing an HTTP POST on the orders resource under their own
department.
Let's evaluate this policy with two different input documents. The first is the input
document in listing 8.4, which you'll find in the policy_1_input_1.json file. Run the following
curl command from the chapter08/sample01 directory and it returns true, because the
inputs in the request match the first allow rule in the policy (listing 8.3):
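The input file isn't reproduced here, but given the first allow rule in listing 8.3, policy_1_input_1.json would carry attributes along these lines (a sketch, not the verbatim file; the input wrapper is what OPA's data API expects in the request body):
{
  "input": {
    "method": "POST",
    "path": "orders",
    "role": "manager"
  }
}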
{"result":{"allow":true}}
Let's try another input document, as shown in listing 8.5, which you'll find in the
policy_1_input_2.json file. Run the following curl command from the chapter08/sample01
directory and it returns true, because the inputs in the request match the second allow
rule in the policy (listing 8.3). You can see how the response from the OPA server changes by
changing the values of the inputs:
{"result":{"allow":true}}
Now let's have a look at a slightly improved version of the policy in listing 8.3. You can find
this new policy in listing 8.6, and it's already deployed to the OPA server you're running.
Here, our expectation is that if a user has the manager role, they can do HTTP
PUTs, POSTs, or DELETEs on any orders resource, and if a user has the dept_manager role,
they can do HTTP PUTs, POSTs, or DELETEs only on the orders resource in their
own department. Also, any user, regardless of role, should be able to do HTTP GETs on
any orders resource under their own account. The annotations in the following listing explain
how the policy is constructed.
allow {
allowed_methods_for_manager[input.method] #A
input.path = "orders"
input.role = "manager"
}
allow {
allowed_methods_for_dept_manager[input.method] #B
input.deptid = dept_id
input.path = ["orders",dept_id]
input.role = "dept_manager"
}
allow { #C
input.method = "GET"
input.empid = emp_id
input.path = ["orders",emp_id]
}
allowed_methods_for_manager = {"POST","PUT","DELETE"} #D
allowed_methods_for_dept_manager = {"POST","PUT","DELETE"} #E
#A Checks whether the value of the method parameter from the input document is in the
allowed_methods_for_manager set
#B Checks whether the value of the method parameter from the input document is in the
allowed_methods_for_dept_manager set
#C Allows anyone to access the orders resource under their own employee ID
#D The definition of the allowed_methods_for_manager set
#E The definition of the allowed_methods_for_dept_manager set
Let's evaluate this policy with the input document in listing 8.7, which you'll find in the
policy_2_input_1.json file. Run the following curl command from the chapter08/sample01
directory and it returns true, because the inputs in the request match the first allow
rule in the policy (listing 8.6):
{
"result":{
"allow":true,
"allowed_methods_for_dept_manager":["POST","PUT","DELETE"],
"allowed_methods_for_manager":["POST","PUT","DELETE"]
}
}
You can also try out the same curl command as shown here with two other input
documents: policy_2_input_2.json and policy_2_input_3.json. You can find these files inside
the chapter08/sample01 directory.
allow { #D
policy = policies[_] #E
policy.method = input.method #F
policy.path = input.path
policy.scopes[_] = input.scopes[_]
}
This policy consumes all the external data from the JSON file
chapter08/sample01/order_policy_data.json (listing 8.9), which we need to push to the OPA
server using the OPA data API. Assuming your OPA server is running on port 8181, you can
run the following curl command from the chapter08/sample01 directory to publish the data
to the OPA server. Keep in mind that here we're pushing only the external data, not the
policy. The policy that consumes the data is already on the OPA server; you can find it in the
chapter08/sample01/policies/policy_3.rego file:
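The exact command is in the sample repository; as a sketch, publishing data via OPA's data API is an HTTP PUT to /v1/data/{path}, where the path shown here (order_policy_data) is an assumption about how the policy references the data. If your server is running with mTLS enabled, you'd also pass the client certificate and key:
\> curl -k -X PUT https://localhost:8181/v1/data/order_policy_data \
-H "Content-Type: application/json" \
-d @order_policy_data.json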
Now you can run the following curl command from the chapter08/sample01 directory with
the input message, which you’ll find in the JSON file
chapter08/sample01/policy_3_input_1.json (in listing 8.10) to check if the request is
authorized:
{"result":{"allow":true}}
With the push data approach, you control when you want to push the data to the OPA server.
For example, when the external data gets updated, you can push the updated data to the
OPA server. This approach, however, has its own limitations. When you use the data API to
push external data into the OPA server, the OPA server keeps the data in cache (in memory),
and when you restart the server, you need to push the data again.
#A A Docker bind mount, which mounts the policies directory under the current path of the host machine to the
policies directory of the container filesystem
#B Runs the OPA server by loading policies and data from the policies directory
The OPA server you already have running has the policy and the data we're going to discuss
in this section. Let's first check the external data file (order_policy_data_from_file.json),
which is available in the chapter08/sample01/policies directory. This is the same file you saw
in listing 8.9 except for a slight change to the file's structure. You can find the updated data
file in the following listing.
You can see in the JSON payload that we have a root element called
order_policy_data_from_file. The OPA server derives the package name corresponding to
this data set as data.order_policy_data_from_file, which is used in the policy in the
following listing. This policy is exactly the same as the one in listing 8.8 except that the
package name has changed.
allow {
policy = policies[_]
policy.method = input.method
policy.path = input.path
policy.scopes[_] = input.scopes[_]
}
Now you can run the following curl command from the chapter08/sample01 directory with
the input message (chapter08/sample01/policy_4_input_1.json) from listing 8.10 to check
whether the request is authorized:
{"result":{"allow":true}}
One issue with loading data from the filesystem is that when there’s any update, you need to
restart the OPA server. There is, however, a configuration option (see
chapter08/sample01/run_opa_mtls_watch.sh) to ask the OPA server to load policies
dynamically (without a restart), but that option isn’t recommended for production
deployments. In practice, if you deploy an OPA server in a Kubernetes environment, you can
keep all your policies and data in a Git repository and use an init container along with the
OPA server in the same pod to pull all the policies and data from Git when you boot up the
corresponding pod. And when there’s an update to the policies or data, we need to restart
the pods.
OVERLOAD
The overload approach to bringing external data into the OPA server uses the input
document itself. When the PEP builds the authorization request, it can embed external data
into the request. Say, for example, the orders API knows that anyone wanting to do an HTTP
POST to it needs to have the create_order scope. Rather than pre-provisioning all the
scope data into the OPA server, the PEP can send it along with the authorization request.
Let's have a look at a slightly modified version of the policy in listing 8.8. You can find the
updated policy in the following listing.
Listing 8.14 OPA policy using external data that comes with the request
package authz.orders.policy5
import input.external as policies
allow {
policy = policies[_]
policy.method = input.method
policy.path = input.path
policy.scopes[_] = input.scopes[_]
}
You can see that we used the input.external package name to load the external data from
the input document. Let’s look at the input document in the following listing, which carries
the external data with it.
Now you can run the following curl command from the chapter08/sample01 directory with
the input message from listing 8.15 (chapter08/sample01/policy_5_input_1.json) to check
whether the request is authorized:
{"result":{"allow":true}}
Reading external data from the input document doesn't work all the time. For example,
there must be a trust relationship between the OPA client (or the policy enforcement point)
and the OPA server. Next we discuss an alternative way of sending data in the input
document that requires less trust and is especially applicable for end-user external data.
BUNDLE API
To bring in external data to an OPA server under the bundle API approach, first you need to
have a bundle server. A bundle server is an endpoint that hosts a bundle. For example, the
bundle server can be an AWS S3 bucket or a GitHub repository. A bundle is a gzipped tarball,
which carries OPA policies and data files under a well-defined directory structure. 4
Once the bundle endpoint is available, you need to update the OPA configuration file with
the bundle endpoint, the credentials to access the bundle endpoint (if it’s secured), the
polling interval, and so on, and then pass the configuration file as a parameter when you spin
up the OPA server. 5 Once the OPA server is up, it continuously polls the bundle API to get the
latest bundle after each predefined time interval.
If your data changes frequently, you’ll find some drawbacks in using the bundle API. The
OPA server polls the bundle API after a predefined time interval, so if you frequently update
the policies or data, you could make authorization decisions based on stale data. To fix that,
you can reduce the polling time interval, but then again, that will increase the load on the
bundle API.
JWT and then read data from it. This is the approach we’ll be using to load data into OPA
with OpenID Connect, which we discuss in section 8.3.
Once you have the ID token, you can build the input document as in listing 8.16. There we
use the value of the ID token as the value of the token parameter. The listing shows only a
part of the ID token, but you can find the complete input document in the
chapter08/sample01/policy_6_input_1.json file.
The following listing shows the policy corresponding to the input document in listing 8.16.
The code annotations here explain all key instructions.
Listing 8.17 OPA policy using external data that comes with the request in an ID token
package authz.orders.policy6
allow {
input.method = "GET"
input.empid = emp_id
input.path = ["orders",emp_id]
token.payload.authorities[_] = "ROLE_USER"
}
#A The PEM-encoded certificate of the OpenID provider to validate the ID token, which corresponds to the private key
that signs the ID token
#B Verifies the signature of the ID token following the RSA SHA256 algorithm
#C Decodes the ID token
#D Checks whether the ID token is expired
#E Finds the current time in seconds; now_ns() returns time in nanoseconds.
Now you can run the following curl command from the chapter08/sample01 directory with
the input message from listing 8.16 (chapter08/sample01/policy_6_input_1.json) to check
whether the request is authorized:
{"result":{"allow":true}}
In listing 8.17, to do the ID token validation, we first needed to validate the signature and
then check the expiration. OPA has a built-in function, called
io.jwt.decode_verify(string, constraints), that validates all of that in one go. 7 For
example, you can use this function to validate the signature, expiration (exp), not before use
(nbf), audience, issuer, and so on.
7 You can find all the OPA functions to verify JWT at http://mng.bz/aRv9.
8.5 Summary
• Claim-based access control, also known as attribute-based access control, is about
controlling access to an application based on different attributes of the logged-in user.
• In a typical access control system, we find five key components (figure 8.1): the
policy administration point (PAP), policy enforcement point (PEP), policy decision point
(PDP), policy information point (PIP), and policy store.
• OPA is an open source, lightweight, general-purpose policy engine.
• OPA was designed to run on the same server as the PEP that needs authorization
decisions. As such, the first layer of defense for PEP-to-OPA communication is the fact
that the communication is limited to localhost.
• To define access-control policies, OPA introduces a new declarative language called
Rego.
• OPA provides a way to pass an ID token (a JWT) in the input document. The OPA
server can verify the ID token and then read data from it.
\> node -v
v12.18.1
To build applications with Expo, you need to have the Expo command line interface (CLI) and
Expo Go, which is a mobile client app. To install the Expo CLI, please follow the instructions
listed here: https://docs.expo.dev/get-started/installation/#expo-cli. The Expo Go client app
is available on both the Apple App Store and the Google Play Store, and you can install it on
your mobile device. The Expo Go app lets you access, from your mobile device, the mobile
application you develop and expose via the Expo CLI. In section 9.1.3 you'll learn more
about how to use the Expo CLI and the Expo Go app together.
The xcrun simctl list command lists all the supported device types, runtimes, devices, and
device pairs. To start the emulator with the device you prefer, run xcrun simctl boot with
the name of the device you want, and then open the Simulator app.
Figure 9.1 The iOS phone emulator, which is installed as part of the Xcode installation on a Mac computer.
If you are not using a Mac, you can find an iOS emulator that matches your platform from
here: https://www.lifewire.com/best-iphone-emulators-4580594. If you are looking for an
Android emulator, you can pick one from here: https://www.androidauthority.com/best-
android-emulators-for-pc-655308/. Before you proceed to section 9.1.3, please make sure
the mobile emulator you picked is working fine! In the examples in this chapter, we are
using an iOS emulator running on a Mac.
Once the command successfully completes, you'll find the application it created under the
chapter09 directory. If you try to run the same command again, you'll get an error and
should use a new directory name.
From the chapter09 directory, you can either run npm run android or npm run ios to
start your application with the respective phone emulator. Make sure you have a phone
emulator installed on your computer before running the command (section 9.1.2). Here we
are going to start the sample application with the iOS emulator, using the following
commands:
\> cd chapter09
\> npm run ios
Now you should see the iOS emulator, which runs the sample application as shown in figure
9.2.
Figure 9.2 The iOS phone emulator is running a React Native sample application developed with Expo. The
first screen requests permission to open the application. Once permission is given, the app is loaded into
the phone emulator.
Once you click the Reload button (figure 9.2), you should see your application (figure
9.3). The text that appears in the application is picked up from the chapter09/App.js file. You
can open the project with your favorite IDE (for example, Visual Studio Code) and play
around with it.
Figure 9.3 The iOS phone emulator is running a React Native sample application developed with Expo Go.
You can also access the mobile app you just developed via your iPhone or Android phone,
using the Expo Go app we installed in section 9.1.1. In the terminal where you started the
application with the npm run ios command, you'll also see a QR code (figure 9.4). Scan it
using your phone; it'll open the Expo Go app on your mobile device and then load your
application.
Figure 9.4 To load your mobile application to the Expo Go app, scan this QR code using your mobile device.
Finally, we can run the following command to prebuild native modules for your application.
This step is usually done to improve the performance of the application by compiling native
code ahead of time instead of at runtime. This command should be run in your project
directory (chapter09), where the package.json file of your project is located. The
prebuilding process may take a few minutes to complete, and the result of the prebuilding
will be saved in the node_modules directory, so you don't have to run this command again
unless you add or update any native modules in your project.
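With the Expo CLI, the prebuild command would presumably be the following, run from the chapter09 directory:
\> npx expo prebuild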
During the above process, you'll generate the bundle identifiers for your React Native
application. A bundle identifier is a string that uniquely identifies your application. It typically
takes the form of a reverse-DNS style string, and you need to choose a bundle identifier that
is unique to your application and will not be used by any other.
Once you get the following prompt, you can type a name for your Android package or your
iOS bundle identifier. For example, we are using com.manning.chapter09 as our bundle
identifier. We use these package names and bundle identifiers to construct the callback URL
for our application when registering it with Auth0 (section 9.2).
What would you like your Android package name to be? com.manning.chapter09
What would you like your iOS bundle identifier to be? com.manning.chapter09
Now, if you look at the chapter09/app.json file, you should see it updated with the bundle
identifier you created.
{
"expo": {
...
"ios": {
"supportsTablet": true,
"bundleIdentifier": "com.manning.chapter09"
},
}
}
After prebuilding the application, you should use the following command to start the
application with the iOS emulator. For the Android emulator, you can simply replace ios with
android. The npm run ios command, which we used before, won't work on a prebuilt
application.
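Assuming the Expo CLI here as well, the command would presumably be the following (or npx expo run:android for the Android emulator):
\> npx expo run:ios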
In section 9.2, you’ll learn how to extend this sample application to support logging in with
OpenID Connect.
In this section we are using the React Native SDK provided by Auth0. That is just our
preference; we do not recommend one SDK over another, and all these SDKs provide good
documentation to get started. Since we used Expo Go in section 9.1, the Auth0 SDK is a
better option for us, as it added Expo support in version 2.16.0, released in December
2022.
In the same way you built the single-page application in React (section 3.11), here too
we first need to create an application at the OpenID provider you picked. Since we are using
the Auth0 React SDK in this example, you can register with Auth0 and create an application
there. Here (check https://github.com/openidconnect-in-
action/samples/blob/master/IDPs.md), we explain how to set up a React Native application
in Auth0. Since these details could change over time, we wanted to keep them outside the
book, so we can update the instructions as and when they change.
When you register your application with an OpenID provider, you need a URL where the
OpenID provider can redirect the user after authentication. This is also called the callback
URL. You may recall that we used it in chapter 3 when we were building the React single-
page application, and also in chapter 6, when we were building the server-side web
application. In both cases, the callback URL was an HTTP endpoint. However, when you
are registering the callback URL with Auth0 for your React Native application, it must be in
the following format.
• For iOS:
{IOS_BUNDLE_IDENTIFIER}://YOUR_DOMAIN/ios/{IOS_BUNDLE_IDENTIFIER}/callba
ck
• For Android:
{ANDROID_PACKAGE}://YOUR_DOMAIN/android/{ANDROID_PACKAGE}/callback
You may recall that in section 9.1.3, we generated an iOS bundle identifier (or the
Android package) while we were prebuilding the application; there we used
com.manning.chapter09 as our iOS bundle identifier. You can find your domain identifier
after logging in to Auth0 under the Settings section. It is the value of the Tenant Name; for
example, dev-iyql2ytv. Now, we have all the ingredients to construct the callback URL.
Now we are ready to start integrating OpenID Connect for login into the sample application
we developed in section 9.1. First, we need to install the React Native dependencies on our
computer. The following command uses npm to install the Auth0 React Native SDK:
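The Auth0 React Native SDK is published to npm as the react-native-auth0 package, so the
install command, run from the chapter09 directory, would be similar to the following:
\> npm install react-native-auth0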
Listing 9.1 does not have a section called plugins. You can add the following plugins section
directly under the expo element, as shown in the following listing:
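Assuming the Expo config plugin that ships with the react-native-auth0 package, the updated
app.json would look similar to the following; YOUR_DOMAIN is a placeholder for your own
Auth0 domain:
{
  "expo": {
    ...
    "plugins": [
      [
        "react-native-auth0",
        { "domain": "YOUR_DOMAIN" } #A
      ]
    ]
  }
}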
#A The Auth0 domain name. You can find your domain identifier after logging in to Auth0 under the Settings section. It
is the value of the Tenant Name; for example, dev-iyql2ytv.
Now that we have updated our app configuration with the Auth0 React Native plugin, we
need to prebuild the application again to generate the native source code. Run the following
command from the chapter09 directory.
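As before, assuming the standard Expo CLI:
\> npx expo prebuild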
Now, we are ready to enable OpenID Connect for our React Native application. Open up
chapter09/App.js and replace its content with the following. You can also find the same
content in the https://github.com/openidconnect-in-action/samples/blob/master/chapter09/App.js file.
return (
<View style={styles.container}>
{loggedIn && <Text>You are logged in as {user.name}</Text>}
{!loggedIn && <Text>You are not logged in</Text>}
<Button
onPress={loggedIn ? onLogout : onLogin}
title={loggedIn ? 'Log Out' : 'Log In'}
/>
</View>
);
};

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: 'center',
    alignItems: 'center',
    backgroundColor: '#F5FCFF',
  }
});
#A This function will redirect the user to the OpenID provider. In this case to Auth0.
#B Logout is not implemented in this example.
#C Replace YOUR_DOMAIN with your Auth0 domain name. You can find your domain identifier after logging in to Auth0
under the Settings section. It is the value of the Tenant Name; for example, dev-iyql2ytv. Also, replace CLIENT_ID
with the client ID corresponding to your application registered with Auth0.
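The fragment above omits the imports, the login handlers, and the component wiring; the
repository file linked above is the source of truth. As a rough sketch, assuming the useAuth0
hook and the Auth0Provider component from the react-native-auth0 SDK, a complete App.js
could look similar to the following:
import React from 'react';
import { Button, StyleSheet, Text, View } from 'react-native';
import { useAuth0, Auth0Provider } from 'react-native-auth0';

const Home = () => {
  const { authorize, user } = useAuth0();
  const loggedIn = user !== undefined && user !== null;

  // Redirects the user to the OpenID provider, in this case Auth0 (#A).
  const onLogin = async () => {
    try {
      await authorize();
    } catch (e) {
      console.log(e);
    }
  };

  // As noted in annotation #B, logout is not implemented in this example.
  const onLogout = () => {
    console.log('Logout is not implemented in this example');
  };

  return (
    <View style={styles.container}>
      {loggedIn && <Text>You are logged in as {user.name}</Text>}
      {!loggedIn && <Text>You are not logged in</Text>}
      <Button
        onPress={loggedIn ? onLogout : onLogin}
        title={loggedIn ? 'Log Out' : 'Log In'}
      />
    </View>
  );
};

// Replace YOUR_DOMAIN and CLIENT_ID with the values from your Auth0 tenant (#C).
const App = () => (
  <Auth0Provider domain="YOUR_DOMAIN" clientId="CLIENT_ID">
    <Home />
  </Auth0Provider>
);

const styles = StyleSheet.create({
  container: {
    flex: 1,
    justifyContent: 'center',
    alignItems: 'center',
    backgroundColor: '#F5FCFF',
  }
});

export default App;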
Now, we are ready to start the application. Run the following command from the
chapter09 directory:
\> npx expo start --ios
The above command will launch your application on the iOS emulator, as shown in figure
9.5. When you click the Log In button, you'll be redirected to the login page of the
OpenID provider, in this case Auth0.
Figure 9.5 The iOS emulator shows the React Native application, now secured with OpenID Connect. When you
tap the Log In button, you’ll be redirected to the OpenID provider’s login page.
Figure 9.6 The native mobile app spins up the system browser and redirects the user to the OpenID provider,
following the OpenID Connect protocol.
As per figure 9.6, when a user taps the login button on a mobile application, the mobile
application talks to the corresponding mobile operating system, via the native API, and
spins up the system browser with the required set of parameters. Then the system
browser redirects the user to the OpenID provider, following the OpenID Connect protocol.
To launch the system browser using React Native, you could probably use something
similar to the following code snippet; the OpenID Connect React Native SDKs use this
approach. You can call this function with the desired URL to launch the system browser and
redirect the user to the specified URL; for example, in our case, to the authorize endpoint of
the OpenID provider with the corresponding query parameters.
Listing 9.4 A sample React Native code snippet that opens up the system browser
import { Linking } from 'react-native'; #A
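As a minimal sketch (openBrowser is an illustrative name of ours, not an SDK function), the
rest of such a snippet could look like this:
import { Linking } from 'react-native';

// Launch the system browser and redirect the user to the given URL; in our
// case, the authorize endpoint of the OpenID provider with its query
// parameters.
const openBrowser = async (url) => {
  const supported = await Linking.canOpenURL(url);
  if (supported) {
    await Linking.openURL(url);
  }
};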
Launching the system browser to initiate the OpenID Connect login flow is one approach
(figure 9.6); the other approach is to use an embedded web view within the mobile
application itself and redirect the user to the OpenID provider in the web view.
When integrating OpenID Connect with a native mobile application, using a web view and
using the system browser both have pros and cons. Here are some of the main pros of using
a web view.
• Control: By using a web view, you have full control over the user experience and the
appearance of the login page.
• Isolation: By using a web view, you can isolate the login flow from the rest of the app,
which can reduce the risk of security vulnerabilities.
• Speed: Web views can be faster and more responsive than system browsers, which
can improve the overall user experience.
In conclusion, whether to use a web view or the system browser for integrating OpenID
Connect with a native mobile app depends on the specific requirements and constraints of
your application. Both approaches have their advantages and disadvantages, and it's
important to weigh them carefully before making a decision. However, from a security
point of view, it is recommended to use the system browser over a web view.
Let's revisit figure 9.6. For your convenience, we have duplicated it as figure 9.7.
Something important to note here is the code_challenge parameter, which is passed along
with the login request to the authorize endpoint of the OpenID provider. The
code_challenge parameter, along with the code_verifier parameter that is part of the
request to the token endpoint of the OpenID provider, was initially introduced by the Proof
Key for Code Exchange (PKCE) by OAuth Public Clients RFC
(https://datatracker.ietf.org/doc/html/rfc7636) from the IETF OAuth working group, and now
it's part of the OAuth 2.1 draft specification. PKCE is a best practice to prevent the code
interception attack, which we discuss in detail in section 10.5. PKCE was initially
introduced targeting native mobile applications; however, now it's recommended for all
types of applications that use OpenID Connect for login.
Figure 9.7 The native mobile app spins up the system browser and redirects the user to the OpenID provider,
following the OpenID Connect protocol (a copy of figure 9.6).
As per figure 9.7, after the OpenID provider authenticates the user successfully, it will
redirect the user back to the registered callback URL. You may recall from section 9.2 that
the callback URLs are of the following formats (unlike the HTTP callback URL we had for
single-page apps):
• For iOS:
{IOS_BUNDLE_IDENTIFIER}://YOUR_DOMAIN/ios/{IOS_BUNDLE_IDENTIFIER}/callba
ck
• For Android:
{ANDROID_PACKAGE}://YOUR_DOMAIN/android/{ANDROID_PACKAGE}/callback
When the OpenID provider sends an HTTP 302 response to the system browser, it adds the
authorization code it generates as a query parameter to the callback URL, and then adds the
callback URL to the Location HTTP header in the response. Then it's up to the browser to
interpret the value of the Location header and act upon it. If the Location header
carries an HTTP endpoint, the browser will do an HTTP GET to it. If the Location header
carries a value in the above format, the browser will pass control to the mobile
operating system to locate the native mobile application with the corresponding
IOS_BUNDLE_IDENTIFIER or ANDROID_PACKAGE. Once the mobile operating system finds
the corresponding mobile application, it will pass control over to it, and the mobile
application will have access to the authorization code.
The following code snippet shows how to retrieve a secret stored in the keychain.
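The snippet itself is elided here; as a sketch, assuming the third-party react-native-keychain
library (which is not part of the Auth0 SDK), storing and retrieving a secret could look similar
to the following:
import * as Keychain from 'react-native-keychain';

// Store a secret (for example, a refresh token) in the iOS Keychain or the
// Android KeyStore-backed storage, and read it back later.
const storeAndRetrieve = async (refreshToken) => {
  await Keychain.setGenericPassword('tokens', refreshToken);
  const credentials = await Keychain.getGenericPassword();
  if (credentials) {
    console.log('Retrieved the stored secret:', credentials.password);
  }
};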
9.5 Summary
• Expo helps you test React Native applications on Android and iOS platforms, without
building anything locally. It comes with a universal runtime and libraries to help us
build native mobile applications using React Native.
• A mobile emulator helps you to emulate a mobile device, for example an iOS or an
Android device, on your computer.
• A bundle identifier is a string that uniquely identifies your application. It typically
takes the form of a reverse-DNS style string; and you need to choose a bundle
identifier that is unique to your application and will not be used by any others.
• OAuth 2.0 defines two types of clients, based on their ability to manage the secrets that a
client application uses to authenticate to an authorization server. Confidential
clients are applications that can manage their secrets securely, while all other
applications, which are incapable of managing their own secrets securely, fall under the
public client category.
• Both the single-page application and the native mobile application are considered
public clients under the OAuth 2.0 terminology.
• The Keychain on iOS and the KeyStore on Android provide a secure environment for
storing secrets, as well as access controls to ensure that only the authorized
components of the mobile application can access the secrets.
• In December 2014 a security researcher was able to use the Open Redirector
vulnerability in Facebook and Microsoft Live OAuth 2.0 implementations to get access
to an access token issued by the Microsoft Live server:
https://www.yassineaboukir.com/blog/how-I-discovered-a-1000$-open-redirect-in-
facebook/.
• In February 2013 a security researcher was able to get hold of Facebook user access
tokens by exploiting a vulnerability in Facebook’s OAuth 2.0 implementation and
Chrome: http://homakov.blogspot.com/2013/02/hacking-facebook-with-oauth2-and-
chrome.html.
In this chapter you'll learn about possible attacks against different OpenID Connect
authentication flows with respect to different application types. You'll also learn the security
best practices you need to follow in your implementations to mitigate such attacks.
Figure 10.1 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.
For an attacker to get hold of the authorization code, they have to intercept either step-5
or step-6; and to get hold of the access token, refresh token, or ID token, they have to
intercept step-7. In the following sections you'll learn different techniques an attacker can
follow to intercept these communication paths.
The Firefox add-on Modify Headers lets you add, modify, and filter the HTTP request headers
sent from the browser to a web server. Another Firefox add-on, SSO Tracer, lets you track all
the message flows between an identity provider and a client application via the browser. Neither
of these is harmful; but, then again, if an attacker can fool a user into installing malware as a
browser plugin, the malware could easily bypass all your browser-level security protections, even TLS,
to get hold of the tokens passed from the OpenID provider to the client application. In that
way, an attacker can get hold of the authorization code, access token, refresh token, and ID
token sent to the client application from the OpenID provider.
It's not just about an attacker installing a plugin in the user's browser; when
there are many extensions installed in your browser, each one of them expands the attack
surface. Attackers need not write new plugins; rather, they can exploit security vulnerabilities in
an already installed plugin to get hold of the tokens that flow through the browser.
How do you prevent such an attack? This attack happens at the client side, so the only way
to prevent it is to educate users not to install random browser plugins.
However, in a corporate environment you can enforce policies to make sure only trusted
browser plugins can be installed and that they are updated regularly.
Figure 10.2 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.
As per figure 10.2, when the user clicks Login on the native mobile app, it initiates the
OpenID Connect authorization code authentication flow by spinning up the system browser.
On an Apple iPhone, for example, when you click the Login button on the Facebook app, it spins
up the Safari web browser to initiate the OpenID Connect login flow. At the end of the OpenID
Connect handshake, the OpenID provider redirects the user to the provided redirect_uri
with the authorization code. This is no different from what you have learnt so far with respect
to login with OpenID Connect to an SPA or a server-side web application. However, the value
of the redirect_uri here is a special one provided by the mobile application, which initiated
the login flow. At the time you install the mobile app on the device, it registers this
redirect_uri with the mobile operating system. This is called a custom URL scheme.
Once an app has a registered custom URL scheme with the corresponding mobile operating
system, whenever the OpenID provider redirects the user on a system browser, to a
redirect_uri that matches with the registered custom URL scheme, the mobile operating
system passes the entire request to the corresponding mobile app. This also includes the
authorization code; and then the mobile app can exchange the authorization code to an access
token and an ID token.
The caveat here is that the mobile operating system allows you to register multiple apps against
the same custom URL scheme. So, if an attacker is able to make you install their app on your
device with the same custom URL scheme as another app that you use, then when you try
to log in to the legitimate app, the mobile operating system will hand over the authorization
code to the attacker's app, because both apps are registered under the same custom URL
scheme.
How do you prevent such an attack? The IETF OAuth working group introduced a new RFC
called Proof Key for Code Exchange (PKCE) by OAuth Public Clients
(https://datatracker.ietf.org/doc/html/rfc7636) to mitigate such attacks. Initially this was
introduced for public clients such as mobile apps and SPAs; however, today it is recommended
for any kind of OpenID Connect client application. Also, emphasizing the value of this RFC,
the OAuth working group decided to include the PKCE recommendations in the upcoming
OAuth 2.1 core RFC as well. In the next section we discuss PKCE in detail.
10.5 Using proof key for code exchange (PKCE) to prevent code
interception
Proof key for code exchange (PKCE) is a best practice defined by the IETF OAuth working group
in RFC 7636 to prevent the code interception attack, which we discussed in section 10.4. In
this section you will learn how PKCE works and how it prevents the code interception attack.
Prior to initiating login, the client application generates a random string and calculates a
hash of it. The code_challenge parameter in the authentication request carries that hashed
value, and the code_challenge_method parameter carries an identifier corresponding to the
hashing algorithm. When the value of the code_challenge_method parameter is set to S256,
the hashing algorithm is SHA-256.
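For example, a minimal Node.js sketch (ours, not from the book's samples; it needs a recent
Node.js version for the base64url encoding) that generates the two values could look like
this:
const crypto = require('crypto');

// A high-entropy random string; RFC 7636 allows 43-128 characters.
const codeVerifier = crypto.randomBytes(32).toString('base64url');

// code_challenge = BASE64URL(SHA-256(code_verifier)) for the S256 method.
const codeChallenge = crypto
  .createHash('sha256')
  .update(codeVerifier)
  .digest('base64url');

console.log('code_verifier:', codeVerifier);
console.log('code_challenge:', codeChallenge);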
In addition to the code_challenge and code_challenge_method parameters, PKCE also introduces
another parameter that goes with the access token request, called code_verifier. This
parameter carries the random string the client application generated before. In fact, the value
of the code_challenge parameter is the hashed value of the code_verifier. Following is an
example of a token request that carries the code_verifier parameter:
code=YDed2u73hXcr783d&
client_id=your_client_id&
redirect_uri=https://app.example.com/redirect_uri&
grant_type=authorization_code&
code_verifier=WW.geSsl9AzL8QWMl6En5dF2DhlH27mZfQ7C6T82ELN
After receiving the authentication request (listing 10.1) from the client application, the
OpenID provider has to store the values of the code_challenge and code_challenge_method
parameters against the authorization code it issues. Then, when it receives the token request
(listing 10.2) with the authorization code and the code_verifier parameter, the OpenID
provider has to load the corresponding code_challenge and code_challenge_method into
memory and check whether the hashed code_verifier (computed following the hashing
method stated in the code_challenge_method) matches the code_challenge it already
knows. If it's a successful match, the OpenID provider will accept the provided
authorization code; if not, it will reject the request.
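A simplified sketch of that check on the OpenID provider side (ours, not a complete token
endpoint implementation) could look like this:
const crypto = require('crypto');

// Returns true if the presented code_verifier matches the code_challenge
// stored against the authorization code.
function verifyPkce(codeVerifier, storedChallenge, method) {
  if (method === 'S256') {
    const computed = crypto
      .createHash('sha256')
      .update(codeVerifier)
      .digest('base64url');
    return computed === storedChallenge;
  }
  // With the plain method, the verifier and the challenge are the same string.
  return codeVerifier === storedChallenge;
}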
How would this prevent the code interception attack? Figure 10.3 shows the flow of
events in a typical OpenID Connect login flow. This is in fact the same diagram you saw in
figure 10.1. Let's assume the attacker was able to get hold of the authorization code in step-
5. However, when we have enabled PKCE at the OpenID provider, to exchange the stolen
authorization code for an access token and ID token (in step-6), the attacker must also know
the code_verifier. The client application generates the code_verifier before it initiates
step-1 and keeps it in memory, and the attacker who intercepts step-5 won't have access
to it.
Figure 10.3 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.
PKCE provides protection against the code interception attack that happens by intercepting
step-5. The code interception technique we discussed in section 10.4 with respect to a
mobile app won't be successful when PKCE is in use. However, PKCE does not fully prevent
the other two code interception techniques we discussed in sections 10.2 and 10.3.
For example, if the OpenID provider is using a vulnerable TLS implementation, then the
attacker does not need to worry about intercepting step-5; rather, it can intercept step-7
directly and get the access token and ID token. In the same manner, if the attacker can make
the user install a browser plugin, then they need not worry about intercepting step-5 to get
the authorization code; rather, they can intercept step-7 and get the access token and the ID token.
In addition to helping prevent the code interception attack in a mobile environment, PKCE
also helps you prevent other types of attacks, which we discuss later in this chapter.
Figure 10.4 An attacker can successfully execute a CSRF vulnerability in a web application by sending a
carefully crafted link to the victim and making them click on it.
Every time the banking website invokes an API, the browser will automatically attach the
cookies to the API request. An attacker can exploit this behavior to make the user submit a
malicious request unintentionally. For example, the attacker can send an email with a link that
carries a carefully crafted message (as shown below) and make the user click on it.
https://api.mybank.com/transfer?from=101090909090&to=121391939890&amount=5000
Once the user clicks on the link, the browser generates an HTTP request to the
corresponding banking API; and if the user already has a valid login session with the website,
the browser will automatically attach the corresponding cookies to the HTTP request, and the
API will successfully execute the malicious request. This kind of CSRF attack has some
conditions to satisfy, which you might have already figured out. The following list spells out
those conditions.
• The user (or the victim) must have a valid login session with the corresponding backend
server.
• The corresponding backend server must use cookies to protect its APIs.
• The attacker must somehow make the user click on the link.
The API corresponding to the operation the attacker wants to perform accepts, in the above
example, inputs as query parameters. Can we prevent a CSRF attack by not accepting
query params as inputs for such operations, and instead expecting the HTTP body to carry the
inputs? If the API does not accept query params as inputs, then the attacker won't be able to
send a link like the one below to the victim and make them click on it.
https://api.mybank.com/transfer?from=101090909090&to=121391939890&amount=5000
But would that make the attacker's job any harder? No. The attacker can host their own web
page and make the user click on a link pointing to that web page, as shown below.
https://app.evil.com/index.html
Then, under the above index.html, the attacker can add HTML code as shown below, which
does an HTTP POST to the corresponding API, carrying the required input parameters in the
HTTP body; the API request will execute successfully.
<script type="text/javascript">
  // Auto-submit the hidden form named "bank" once the page loads.
  window.onload = function() {
    document.forms["bank"].submit();
  }
</script>
<!-- A hidden form that does an HTTP POST to the banking API, carrying the
     same inputs as the earlier link in the HTTP body -->
<form name="bank" action="https://api.mybank.com/transfer" method="POST">
  <input type="hidden" name="from" value="101090909090"/>
  <input type="hidden" name="to" value="121391939890"/>
  <input type="hidden" name="amount" value="5000"/>
</form>
If you protect your business APIs with OAuth 2.0 tokens instead of cookies, then you can
prevent a CSRF attack against your business APIs. With OAuth 2.0, you pass the access token
in the HTTP Authorization header, and that prevents an attacker from executing a CSRF attack
(figure 10.5).
What if the attacker hosts a website that performs a prompt=none call to the
authorize endpoint of the OpenID provider, gets the authorization code, and then
exchanges the authorization code for an access token and an ID token? If the attacker can do
that, it can use the access token it got from the OpenID provider to talk to the business
APIs. Figure 10.5 shows the flow of events related to this scenario.
Figure 10.5 The request will not carry the OAuth token in the HTTP Authorization header by default, so it will
be rejected by the web server.
As per figure 10.5, in step-5 and step-6, when the browser makes a direct call to the
authorize and token endpoints of the OpenID provider, the browser will only permit those calls
if the attacker's website domain and the OpenID provider's domain are the same, or if you
have enabled a cross-origin resource sharing (CORS) policy at the OpenID provider that lets the
attacker's website domain make direct HTTP requests to the OpenID provider. So, having a
restrictive CORS policy at the OpenID provider helps you prevent this attack.
Figure 10.6 The client application uses authorization code authentication flow to communicate with the
OpenID provider to authenticate the user.
With OpenID Connect, an attacker can try to execute a session fixation attack, with the
help of a CSRF vulnerability in the OpenID Connect implementation (see figure 10.7).
Figure 10.7 The attacker first logs into the OpenID provider, but will block redirect to the web server and
extract the authorization code.
In step-1 of figure 10.7, the attacker first logs into the OpenID provider with their
own account. This is a prerequisite for any kind of session fixation attack: the attacker
and the victim must have accounts with the same OpenID provider. In step-2, after the
OpenID provider verifies the attacker's credentials, it will issue an authorization code. This
authorization code goes to the corresponding web application via the browser. In step-3, the
attacker prevents the authorization code from getting to the website and copies the redirect URI
with all its parameters (which include the authorization code); and in step-4 the attacker executes a
CSRF attack by sending the copied URI to the victim.
When the victim clicks on the link, the website exchanges the attacker's
authorization code for an access token and an ID token, and now the victim is logged into the
website as the attacker. In other words, the victim's session is now fixed.
10.6.3 Preventing the session fixation attack with the state parameter
How do we prevent this session fixation attack? To prevent it, we need
to fix the CSRF vulnerability in our OpenID Connect implementation. There are two ways to do
that: one is to use the state parameter, and the other is to use PKCE (which we discussed in
section 10.5.2). Figure 10.8 shows how to use the state parameter to avoid a possible
CSRF vulnerability. You learnt about the state parameter in chapter 3; it's a recommended,
but optional, parameter in the OpenID Connect authorization code flow.
Figure 10.8 The client application includes a random, nonguessable string as the value of the state parameter
and also stores the same value on the browser.
In step-1 in figure 10.8, before the client application starts the login flow, it generates
a random value as the state parameter. It also adds the value of the state parameter to
the browser session. For a server-side web application, this can be added as a cookie, and for a
single-page application you can store the value of the state parameter in the session storage. In
step-5, when the client application gets back the authorization code, it also gets back, in the
response, the same state parameter it added to the request in step-1. Now, the client application
must make sure the value it stored in the browser session exactly matches the value of
the state parameter it got from the OpenID provider. If they match, the client application
can proceed to the code exchange; otherwise, it should return an error.
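For a single-page application, a minimal sketch of this pattern, using the browser's session
storage as mentioned above, could look like the following:
// Before starting the login flow, generate a random, nonguessable state
// value and keep a copy in the browser session.
const state = crypto.randomUUID();
sessionStorage.setItem('oidc_state', state);
// ... redirect the user to the authorize endpoint with &state=... attached ...

// When the OpenID provider redirects back with the authorization code:
const params = new URLSearchParams(window.location.search);
if (params.get('state') !== sessionStorage.getItem('oidc_state')) {
  throw new Error('state mismatch; aborting the login flow');
}
// The values match, so the client application can proceed to the code exchange.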
How does the above approach with the state parameter fix the session fixation attack?
The URL the attacker copies from their own login flow would be something similar to the
following. Here the value of code corresponds to the attacker's account, and the
value of the state parameter corresponds to the state value stored in the attacker's
browser.
https://app.example.com/redirect_uri?code=YDed2u73hXcr783d&state=Xd2u73hgj59435
When the attacker passes the above URL to the victim and the victim clicks on it, the
client application checks whether the value of the state parameter in the request matches the
value of the state parameter in the victim's browser. That check will fail because either the
victim's browser session does not have a state value, or it won't match the attacker's state
value. So, the session fixation effort by the attacker fails.
Figure 10.9 Using PKCE, the client application can prevent a session fixation attack. The client generates a
code_challenge and a code_verifier before sending the login request to the OpenID provider, and later uses
the code_verifier when exchanging the authorization code for an access token.
In step-1 in figure 10.9, the client application in the attacker's browser generates the
code_verifier parameter. However, when the victim clicks on the link the attacker
provided and the client application tries to exchange the authorization code provided by the
attacker, that request will fail, because the client application does not know the code_verifier
corresponding to the attacker's authorization code. So, the session fixation effort by the
attacker fails.
Figure 10.10 The attacker intercepts the communication between the browser and the web server, and updates
the OpenID provider the user picked to one that is under the attacker's control.
As per figure 10.10, the client application provides login options with multiple OpenID
providers, say op.foo.com and op.bar.com. Neither of these OpenID providers is
under the control of the client application. In step-1 of figure 10.10, the victim picks op.foo.com
in the browser, and the attacker intercepts the request and changes the selection to
op.bar.com. Here we assume the communication between the browser and the client
application is not protected with TLS. The OpenID Connect specification (or the OAuth 2.0 RFC)
does not talk about how to protect the communication between the browser and the client
application; it is outside the scope of OpenID Connect, and it's purely up to the web application
developers to decide what to do. Since there is no confidential data passed in this flow,
sometimes web application developers may not worry about using TLS. At the same
time, a few vulnerabilities have been discovered in TLS implementations in the past, so an
attacker could possibly use such vulnerabilities to intercept the communication between the
browser and the client application even if TLS is used.
The client application only gets the modified request from the attacker, who intercepted
the communication. So, in step-2 the client application thinks the victim picked op.bar.com and
redirects the user to op.bar.com. In step-3 the attacker again intercepts the redirection from
the client application and modifies it to go to op.foo.com. The way redirection
works is that the web server (in this case the client application) sends back a response to the
browser with a 302 status code and a Location HTTP header. If the communication
between the browser and the client application is not on TLS, then this response is not
protected, even if the Location HTTP header contains an https URL. Since we already assumed
the communication between the browser and the client application can be intercepted by the
attacker, the attacker can modify the Location HTTP header in the response to go to
op.foo.com, which is the original selection of the victim.
In step-4 the client application gets the authorization code and now talks to
op.bar.com to validate it. Since the client application supports multiple OpenID providers, how
does it know, given an authorization code, which OpenID provider it corresponds to? Just by looking
at the authorization code, the client application cannot decide which OpenID provider the
code belongs to. So, here we assume the client application tracks the OpenID provider by
some session variable (or by using the session storage).
In step-5, op.bar.com gets hold of the victim's authorization code, which was issued by
op.foo.com. So, in the case of a single-page application, which is a public client,
op.bar.com can exchange the authorization code for an access token and use that to access
business APIs on behalf of the victim.
There are no records of this attack being carried out in practice, but at the same time
we cannot totally rule it out. One way to prevent such an attack is to have a separate redirect
URI for each OpenID provider. With this, the client application knows which OpenID provider
the corresponding authorization code belongs to, which helps prevent the OpenID provider
mix-up.
10.8 Summary
• An attacker can exploit a vulnerability in the TLS communication and then read the
value of a token. Padding Oracle On Downgraded Legacy Encryption (POODLE), Browser
Exploit Against SSL/TLS (BEAST), Compression Ratio Info-leak Made Easy (CRIME),
Browser Reconnaissance and Exfiltration via Adaptive Compression of Hypertext
(BREACH), Heartbleed are some of the popular vulnerabilities found in TLS
implementations in the past.
• An attacker can fool a user into installing malware as a browser plugin, which could easily
bypass all your browser-level security protections, even TLS, to get hold of the
tokens passed from the OpenID provider to the client application.
• The IETF OAuth working group introduced a new RFC called Proof Key for Code
Exchange (PKCE) by OAuth Public Clients
(https://datatracker.ietf.org/doc/html/rfc7636) to mitigate authorization code
interception attacks.
• The goal of a cross-site request forgery (CSRF) attack is to trick the user (or the victim)
into submitting a malicious request unintentionally.
React fundamentals
React is an open source JavaScript library developed by Facebook for developing user
interfaces for web-based applications. Over time it has become one of the most popular
JavaScript libraries, used by many major companies such as Facebook, Netflix, Instagram,
AirBnB, Medium, and Twitter. In this appendix we discuss React fundamentals
and everything you need to know to follow the samples in chapter 2 of the book. React
heavily uses features introduced in modern JavaScript, which is also called Next-Gen JS or
ES6+. So in this appendix you will also learn some of the new features introduced in modern
JavaScript.
If you're interested in understanding JavaScript in detail, we recommend JavaScript: The
Definitive Guide: Master the World's Most-Used Programming Language (O'Reilly Media, 2020) by
David Flanagan or Eloquent JavaScript: A Modern Introduction to Programming (No Starch
Press, 2018) by Marijn Haverbeke. If you'd like to learn React in detail, we recommend the
book Learning React: Modern Patterns for Developing React Apps (O'Reilly Media, 2020) by
Alex Banks and Eve Porcello. Also, the book React Hooks in Action (Manning, 2020) by John
Larsen is another good book to learn React in detail.
1 JavaScript engines are different from browser engines. For example, Google Chrome browser uses the Blink browser engine with V8 JavaScript engine,
and Safari uses WebKit browser engine with JavaScriptCore JavaScript engine.
function testVar(override) {
var num = 10;
if (override) {
var num = 15;
}
console.log("value of num ",num);
}
testVar(true); // prints 15
testVar(false); // prints 10
Here you can see we have defined the variable num in two places: at the level of the testVar
function and within the if-block. If you look carefully at the output of the program, you will
realize that the value of the variable num defined within the if-block overrides the value of
the num variable defined at the function level. In other words, the scope of any variable
defined with the var keyword in JavaScript is not just limited to the code block that contains
it. Its value is visible within the function that contains the variable declaration. To be more
precise, a variable declared with the var keyword within a function, directly or inside another
code block, is visible to any other code block within the same function. So, any code block
within the function can override the value of any variable declared with var at the function
level or within a code block inside the function.
If a variable is defined outside of a function using the var keyword (also known as a
global variable), then you can still refer to that variable within a function. However, if you re-
declare a variable within a function with the same name as a global variable, JavaScript
treats it as a new variable. So, within the function, if you want to refer to the global variable
that carries the same name as the local variable, you need to access
that variable via the global object, for example as globalThis.num. And if you re-declare a
new variable with the same name within a function, any changes you make to your local
variable won't affect the global variable. The following code block demonstrates this
behavior.
var num = 25
function testVar() {
var num = 10;
console.log("value of num ", globalThis.num); // prints 25
console.log("value of num ", num); // prints 10
}
testVar();
console.log("value of num ",num); // prints 25
In the following code block, let’s rewrite the testVar function by using the let keyword.
When you use let to define a variable, that makes the scope of the corresponding variable
only visible within the block that defines the variable. In the following code block, for
example, the value of the num variable defined at the function level remains unchanged even
after we execute the if-block.
function testVar(override) {
var num = 10;
if (override) {
let num = 15;
}
console.log("value of num ",num);
}
testVar(true); // prints 10
testVar(false); // prints 10
The behavior of the const keyword is quite similar to the behavior of the let keyword, except
that you cannot reassign the value of any variable defined as const. Also, unlike let, when you
declare a variable using the const keyword, you must also assign a value to it at the time of
the declaration. The following code block shows the behavior of the const keyword. Here we
define the function-level num variable as a constant. However, even if we declare another
variable under the same name (num) using the let keyword within the if-block, it will have
no impact on the function-level constant.
function testVar(override) {
const num = 10;
if (override) {
let num = 15;
}
console.log("value of num ",num);
}
testVar(true); // prints 10
testVar(false); // prints 10
Finally, there is one more important difference between variables declared with the var
keyword and variables declared with the let and const keywords. When you declare a
variable using the var keyword, you can refer to that variable from any location within the
function, before or after the declaration. This is possible due to a feature in
JavaScript called hoisting: all variable declarations with the var keyword are hoisted to
the top of the corresponding function. But when you use the let and const keywords, those
variable declarations are not hoisted, so you can refer to those variables only after the
corresponding variable declaration.
NESTED FUNCTIONS
A nested function is a function that is defined within another function. JavaScript supports
nested functions. The following code block shows an example of a nested function, which is
also called an inner function. The variables defined within the outer function are visible to the
inner function.
function foo() {
let num = 100;
function bar() { // this is a nested function
console.log(num); // prints 100
}
bar(); // calls the nested function
}
foo();
FUNCTION EXPRESSIONS
You can also define a function in JavaScript as an expression, with or without a name. The
following code block shows an example of a function expression. Here we assign the function
expression to a constant called greet, so the definition of the function cannot be overridden
later. Also, in this example, we don't specify a name for the function.
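For example:
const greet = function() {
  console.log("Hello!");
};

greet(); // prints Hello!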
Let's have a look at another example of a function expression. The function expression
shown in the following code block carries a name. Even though this function has a name,
you'll be able to use it only within the expression itself; you cannot use the name of
the function to invoke it outside of the function expression. In most cases, having a
name for a function is useful when you want to invoke the same function recursively within the
function itself.
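For example:
const factorial = function fact(num) {
  // The name fact is visible only within the function expression itself.
  return num <= 1 ? 1 : num * fact(num - 1);
};

console.log(factorial(5)); // prints 120
console.log(fact(5)); // Uncaught ReferenceError: fact is not defined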
There is a major difference between a function you declare using the function keyword and a
function expression. As discussed at the beginning of this section, when you use the keyword
function to define a function, those function declarations are hoisted to the top of the
JavaScript program behind the scenes; but when you define a function as an expression, you
can invoke that function only after the corresponding expression has executed. JavaScript
does not hoist function expressions.
HIGHER-ORDER FUNCTIONS
In JavaScript a function can accept another function as an input parameter, as well as return
another function. A function that accepts or returns another function is called a higher-order
function. In the following code block, let’s have a look at an example. Here the sendMessage
is a higher-order function that accepts two string arguments and a function. The one who
invokes the sendMessage function decides how the communication should happen with the
recipient and based on that passes the corresponding function as the messenger.
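A sketch matching that description (the sample values are ours):
// sendMessage is a higher-order function: it accepts a function (messenger)
// as its third argument and delegates the delivery to it.
const sendMessage = (recipient, message, messenger) => {
  messenger(recipient, message);
};

// One possible messenger: deliver the message over email.
const email = (recipient, message) =>
  console.log(`emailing ${recipient}: ${message}`);

sendMessage("peter@example.com", "Hello!", email);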
As you might have already guessed, we don't use the function keyword to define an arrow
function. Also, there's no need to have a name for the function. Prior to the arrow in the arrow
function, we define the input parameters to the function within parentheses. If we have only
one input parameter, then we don't need the parentheses. After the arrow is the body of the
function. If there are multiple statements in the body of the function, then we need to wrap
the function body with a pair of curly braces; but if the function has only one return
statement, then we don't need curly braces, and we can also skip the return keyword.
The following code block shows an example of an arrow function that accepts a number as an
input parameter and has one statement that returns the square of the provided number.
const sqrt = num => num * num;
console.log(sqrt(4)); // prints 16
If an arrow function does not accept any input parameters, then it must have an empty
pair of parentheses; for example, const pi = () => 3.14;. You can also assign a default value
to an input parameter of a function; when the caller does not pass a matching argument, the
parameter takes its default value. In the following code block, the num parameter of the
sqrt function defaults to 2, so calling sqrt() with no argument prints 4.
const sqrt = (num = 2) => num * num;
console.log(sqrt()); // prints 4
console.log(sqrt(4)); // prints 16
The default value of an optional parameter of a function can also be an expression. Let's
revisit the code example we had in section A.2.3. In the following code block we make the
messenger an optional argument and set its default value to email, which points to another
function.
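A sketch of that change, reusing the functions from the earlier example:
const email = (recipient, message) =>
  console.log(`emailing ${recipient}: ${message}`);

// messenger is now optional and defaults to the email function.
const sendMessage = (recipient, message, messenger = email) => {
  messenger(recipient, message);
};

sendMessage("peter@example.com", "Hello!"); // uses the email messenger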
You can also define a default value for the messenger argument in the following way. Here
we define a function for the messenger argument as an arrow function inline. This is a good
example of where to use an arrow function, because of its compact nature.
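For example:
const sendMessage = (recipient, message,
    messenger = (r, m) => console.log(`emailing ${r}: ${m}`)) => {
  messenger(recipient, message);
};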
You can also use template literals along with a function name, as shown in the following code
block. Any template literal appearing right next to a function name (for example, right after
the function name helloName) is passed to the corresponding function as input arguments.
This type of template literal is also known as a tagged template literal.
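A sketch of such a tag function (the greeting logic is ours):
// A tag function receives the literal string parts and the substitution
// values as separate arguments.
const helloName = (strings, name) =>
  `${strings[0]}${name.toUpperCase()}${strings[1]}`;

const name = "peter";
console.log(helloName`Hello ${name}!`); // prints Hello PETER!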
A function defined with a fixed number of parameters simply ignores any extra arguments
passed to it:
const sum = (num1, num2) => num1 + num2;
console.log(sum(2,3,4)); // prints 5; the third argument is silently ignored
ES6 introduced the rest operator to handle these kinds of cases in a better way. You define a
rest parameter with three dots (...), and it must be the last parameter in the function
declaration. The following code block shows a valid and an invalid function declaration with
the rest operator.
const sum = (num1, num2, ...nums) => {/* function body */} // valid
const sum = (...nums, num1) => {/* function body */} // invalid: the rest parameter must be last
The following code block shows a complete example of how to use the rest operator. Here we
treat the values you get to the rest parameter (...nums) as an array and execute forEach
function on it.
const sum = (...nums) => {let tot=0; nums.forEach(num => tot+=num); return tot;}
console.log(sum (2,3,4)); // prints 9
You can also use the spread operator to break an array into individual elements and pass
those to a function. The following code block demonstrates that use case.
const arr = [1, 2];
console.log(sum(...arr)); // prints 3
const book = {name: "React in Action", author: "Mark Tielens Thomas", publisher:
"Manning"};
Let’s go through another example. In the following code snippet we do object destructuring
for the input parameters of an arrow function.
const book = {name: "React in Action", author: "Mark Tielens Thomas", publisher:
"Manning"};
A.2.9 Modules
ES6 introduced first-class support for modularity in JavaScript with the import and export
statements. In this section we discuss how ES6 modules work in a browser environment. We
do not intend to discuss modules in detail here, but we'll teach you just enough to get
started with React. In section A.7 we discuss modules again with respect to React.
Prior to ES6, JavaScript followed a shared-everything principle. All the variables and
functions you define under different script tags of an HTML page are shared with each
other. You can't restrict access to a given JavaScript function to only the other functions
defined within that particular script tag. The following code listing (listing A.1) illustrates
this with an example.
If you open up the code in the following listing in a web browser, you'll find the number
100 printed on the console (you can also access the same via
https://prabath.s3.amazonaws.com/a1.html). That confirms that JavaScript code defined
under one script tag has access to the JavaScript code defined under another script tag.
This would still be the case even if you load the JavaScript under each script tag from a URL
using the src attribute (rather than defining it inline as in listing A.1).
<body>
<script>
let x = 10;
function getSqrt(num) {
return num*num;
}
</script>
<script>
console.log(getSqrt(x)); // prints 100
</script>
</body>
With the modularity concept introduced in ES6, JavaScript code defined under each script
tag is treated as an independent module when we set the type attribute of the script tag
to module. If you open up the following code snippet (listing A.2) as an HTML file in your
browser, you'll find the error Uncaught ReferenceError: getSqrt is not defined on the console.
<body>
<script type="module">
let x = 10;
function getSqrt(num) {
return num*num;
}
</script>
<script type="module">
console.log(getSqrt(x)); // Uncaught ReferenceError: getSqrt is not defined
</script>
</body>
If you want to refer to a function defined within one script tag (or a module) from another script
tag (or a module), first you need to export it from the module that defines the function and
import it from the module that wants to access it. You can import a function only if
the module that defines that function exports it. Let's go through an example. Here we have
externalized the module that defines the getSqrt function into an independent JavaScript
file called math_mod.js. You cannot import a function from an inline module, and that's why
we have to define the getSqrt function in the math_mod.js file.
The following code snippet shows the content of the math_mod.js file.
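export let x = 10;

export function getSqrt(num) {
  return num * num;
}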
The following code snippet shows the HTML page content that loads a module from the
math_mod.js file.
<body>
<script type="module" src="./math_mod.js"></script>
<script type="module">
import {x, getSqrt} from "./math_mod.js";
console.log(getSqrt(x)); // prints 100
</script>
</body>
If you save the above code snippet into an HTML file and try to open it in a
browser, you will notice an error message on the console. Even though normal script tags
support loading a JavaScript source from the local filesystem, when you have module as the
type of the script tag, the browser makes sure cross-origin resource sharing (CORS)
rules are enforced. We discuss CORS in chapter 3; cross-origin requests are not
supported for files loaded from the local file system. So, if you need to test this use
case, you need to host both the math_mod.js file and the HTML file on a web server.
We've uploaded both these files to Amazon S3, which you can access from
https://prabath.s3.amazonaws.com/a4.html. Once you open this file, you will see 100
printed on the console.
<body>
<div id="book-club-app" />
<script type="text/babel">
const App = () => (
<h1>Welcome to the Book Club!</h1>
);
ReactDOM.render(<App />,
document.getElementById("book-club-app"));
</script>
</body>
</html>
The React program in listing A.5 produces the following output in figure A.1, when you open
the index.html file using a web browser.
Congratulations! You have successfully run your first React program. Now let’s break down
the code in listing A.5 to multiple sections.
First let's focus on the code inside the script tag within the HTML body tag. You can find
the same in the following code listing. As you might have already guessed, here we
have defined an arrow function (see section A.2.3), which takes no arguments, and assigned
it to a constant with the name App. The objective of this arrow function is to construct and
return a React element, and we call this function a React component, or to be precise, a React
function component. A valid React function component accepts zero or one input parameters.
In the following listing it accepts no parameters; however, in section A.5 we have an example
that accepts one input parameter.
The code inside the body of the arrow function (listing A.6) looks very similar to HTML, but it
is not HTML. It's written in JSX, which is an extension to JavaScript. JSX allows us to define React
elements in a way that looks very similar to HTML. So, you don't need to learn much
about JSX syntax if you are already familiar with HTML. However, browsers do not
understand JSX, so we use the Babel (https://babeljs.io) JavaScript library to compile the
JSX code into JavaScript that the browsers understand. To use Babel on the browser we need
to do two things.
• Load the Babel JavaScript library
<script src="https://unpkg.com/babel-standalone@6/babel.min.js"></script>
• Embed the JSX code within a script tag with the type text/babel
<script type="text/babel"></script>
Once the App function constructs the React element, we need to talk to the ReactDOM API,
using the following code line (which you can also find in listing A.5), to add the React
element to the browser's DOM. Here we first find the book-club-app element in our HTML
code (which is an HTML div tag) with the JavaScript function document.getElementById
and then add the App element to it.
ReactDOM.render(<App />,document.getElementById("book-club-app"));
To use the ReactDOM API from the browser, we need to load the ReactDOM JavaScript
library using the following script tag.
<script src="https://unpkg.com/react-dom@16/umd/react-dom.development.js"></script>
We’ve talked about all the key elements in listing A.5 except the following script tag.
<script src="https://unpkg.com/react@16/umd/react.development.js"></script>
This loads the React JavaScript library that helps the browser understand React elements.
In code listing A.5 we have not explicitly defined any React elements. But we still need
to load this React JavaScript library because, when Babel compiles JSX code to JavaScript,
the result is JavaScript that creates React elements.
In the rest of this appendix we’ll discuss how to improve this simple React program in
steps to build a production-grade React application. In doing that we’ll introduce you to some
important concepts in React.
You can find the complete HTML page in appendix-a/sample02/public/index.html. To see the
output of the program, you can open the index.html file using your favorite web browser.
Listing A.7 The App React element loads the Books element
<script type="text/babel">
Listing A.8 The App React element accepts a set of properties from the caller
<script type="text/babel">
ReactDOM.render(<App name="Peter"/>,document.getElementById("book-club-app"));
</script>
The following code line, which adds the <App /> element into the browser DOM, defines,
under the <App /> element, the attributes it needs to pass to the App function component. The
props input parameter the App function component accepts is in fact a JavaScript object, and
it is populated from the attributes defined under the corresponding element. In this case,
props will have a property called name that carries the value Peter.
ReactDOM.render(<App name="Peter"/>,document.getElementById("book-club-app"));
Let's have a look at how the App function component reads the values from the props input
parameter. As shown in the following code snippet, to refer to any JavaScript from JSX code,
you need to have it within curly braces. Here we read the value of the name attribute defined
in the <App /> element as props.name.
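A sketch of such a component (the greeting text is ours):
const App = (props) => (
  <h1>Welcome to the Book Club, {props.name}!</h1>
);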
There is another, better way of reading values from the props input parameter. To do that,
we need to change the function definition of the App component in the following way. Here
we destructure (section A.2.8) the props object and specify the exact names of the
attributes we expect within curly braces, and the JSX code directly refers to an attribute
using the corresponding name.
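A sketch of the destructured version (again, the greeting text is ours):
const App = ({ name }) => (
  <h1>Welcome to the Book Club, {name}!</h1>
);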
ReactDOM.render(<App name="Peter"/>,document.getElementById("book-club-app"));
The lifetime of a React component begins at the time you load it into the browser with the
ReactDOM.render method, and ends when you reload the application, probably by refreshing
the browser or calling the ReactDOM.render method again. You may recall from our previous
examples that we don't call the ReactDOM.render method to explicitly render every React
component. We explicitly call the ReactDOM.render method to load our core React component
(for example, <App />), and other React components are loaded by the core component
implicitly.
Let’s say we want to build a React component that keeps asking the user to type a
number and shows the sum of the numbers typed by the user, one after the other. A user
types 2, for example, and the React component adds 2 to 0 and displays 2. Then the user
types 4 and the React component adds 4 to 2 (the previous sum) and displays 6. This is a
stateful React component and it has to maintain the state of the sum between the user
interactions.
To manage state in a React component we use hooks. React introduced hooks from React
version 16.8 onward. As the name implies a hook in React helps you hook up additional
functionality into your React function components to make your life easier. In other words, a
React hook is a function that wraps some functionality, which you can use from your function
component. You can use the useState React hook, for example, to manage state of your
React component. There are many other React hooks and we’ll discuss some of them later in
this appendix.
In the following code listing we demonstrate how to use useState React hook within a
function component. The following listing only shows React code that will be compiled by
Babel. You can find the complete HTML page in appendix-a/sample04/public/index.html.
Listing A.9 Using useState React hook to manage state of a React component
<script type="text/babel">
const App = () => {
const [sum,setSum] = React.useState(0);
return (
<div>
<h1>Sum of Numbers: {sum}</h1>
<input type="text"
onKeyPress={event=>event.key=="Enter" ?
setSum(parseInt(sum)+parseInt(event.target.value)):undefined} />
</div>
);
}
The most important code line in listing A.9 is the following code snippet that calls the
useState function. The response from the useState function is destructured into two
constants. The setSum constant is a function we can use to update the value of the constant
sum. Since sum is a constant, we cannot directly update its value, and must use the setSum
function all the time. When invoking the useState function, we can pass the initial value of
the sum constant; here we pass 0.
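const [sum,setSum] = React.useState(0);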
Let's have a look at the following code snippet that invokes the setSum function when we
type a number in the input box and press the Enter key. Here we define an arrow function
inline for the onKeyPress event of the input box. Since this event is triggered for every
key, we need to first check whether the key pressed is the Enter key; then we read the
value of the sum constant from the state, parse it as an integer, add the number typed in the
input box, and call the setSum function to update the state of sum. That's all we need to do,
and React will automatically update the value of sum in all the places within that React
component that refer to the sum constant.
<input type="text"
onKeyPress={event=>event.key=="Enter" ?
setSum(parseInt(sum)+parseInt(event.target.value)):undefined} />
So far, Babel compiled the JSX code into React elements (or the JavaScript the browser
understands) while loading the HTML page into the web browser. This is not an optimal way
of developing a React application, and it's also not how you develop a React application in a
production environment. However, we followed that approach in all the examples so far
because it helps you understand how React works. In this section we'll teach you how to
organize your React application in a modular way.
Let's first be clear about our objectives. Ultimately we want to produce a single page (an
HTML page) with everything we need to render our application on the browser. This is exactly
what we did in all our examples so far in this appendix. So, we need to do the same, but in a
different way. Ideally we need to find a way to maintain the HTML code and the React
components independently from each other in our code repository (GitHub) and use some kind
of a tool to aggregate all the HTML code, React components, and all the other JavaScript files
to build a single HTML file. The rest of this section takes you through how you can structure
a React project in a better way.
With all the JavaScript ultimately being added into the index.html file (via the bundle.js
file), the JavaScript in the index.js file (listing A.11) can find the div element that we already
have in the index.html file and add the <App /> React element to it.
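Listing A.11 itself ships with the sample code; as a minimal sketch, an index.js of this
shape could look like the following (the root id of the target div is an assumption, not
necessarily what the sample uses):

import React from "react";
import ReactDOM from "react-dom";
import App from "./components/App";

// Attach the core React component to the div defined in index.html.
ReactDOM.render(<App />, document.getElementById("root"));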
There are three important statements at the top of the code in listing A.11. In the first
import statement we import React from the react module. In our previous examples (see
listing A.5) we used the following script tag to do the same.
<script src="https://unpkg.com/react@16/umd/react.development.js"></script>
As you learnt at the beginning of this appendix, React is a JavaScript library. A JavaScript
library can be distributed as a JavaScript file or as a node package (there are other ways
too). In our previous examples we used a file to load the React JavaScript library from
https://unpkg.com/react@16/umd/react.development.js. We also used other files to load
the ReactDOM (https://unpkg.com/react-dom@16/umd/react-dom.development.js) and
Babel (https://unpkg.com/babel-standalone@6/babel.min.js) JavaScript libraries.
However, the objective of this section is to have a single script file, the bundle.js file, that
aggregates all the JavaScript we use in our application, so we don’t need to ask the
browser to download multiple scripts from different locations. The tool we introduce later
in this section to build our React application will load the react node package (a node
package is a distribution unit that carries multiple JavaScript modules) from the central npm
(node package manager) registry and will add the corresponding JavaScript (from the react
node package) to the bundle.js file.
In section A.2.9 we discussed JavaScript modules, and you may recall how we imported
one module from another. There we had to refer to the module we wanted to load by
pointing to the corresponding JavaScript file that defines that module, as shown in the
following code.
<script type="module">
  import {x, getSqrt} from "./math_mod.js";
  console.log(getSqrt(x)); // prints 100
</script>
In listing A.11 we used two methods to load modules. Just as in section A.2.9, we load the
App.js module by pointing to the JavaScript file that defines it, as shown in the following
code.

import App from "./components/App";

This import statement imports the App component from the file components/App.js.
You can find the App.js file under the sample05/src/components directory; App.js defines a
JavaScript module and exports the App component.
We also use another type of import in listing A.11, as shown in the following code, where
we load two modules, react and react-dom, by their package names.

import React from "react";
import ReactDOM from "react-dom";

To be precise, the tool that processes this script should understand how to find these two
modules and load them. The npm tool downloads such node packages from the npm central
registry, available at https://registry.npmjs.org, into your project (as you’ll see shortly).
Still, if you’d like, you can instruct npm to use a different registry.
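For example, the following command (with a hypothetical registry URL) tells npm where to
download packages from:

\> npm config set registry https://my-registry.example.com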
You surely have a question now! If we use import statements like the above in our JavaScript
modules, and they run on the browser, how does the browser know where to find the
corresponding JavaScript associated with those imported modules? At the time of this
writing, browsers do not support that type of import statement; you always need to specify
where to load the JavaScript associated with the module you want to import, by pointing to
the corresponding JavaScript file, as you saw before (for example, import {x, getSqrt} from
"./math_mod.js";).
Now we have another question. If the browser does not support importing a module just by
its name, why did we use such imports in listing A.11? We use Babel and webpack to compile
and bundle our module code in a way that is understood by the browsers, and all the code
will be included in the bundle.js file that we are going to generate shortly.
Let’s have a look at the components/App.js file (listing A.12). This file defines the App
function component and exports it. The following listing shows the content of the App.js
file. By now you should know how everything in this file works.
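Listing A.12 itself ships with the sample code; as a minimal sketch, an App.js of this shape
could look like the following (the rendered markup is an assumption, not the exact content
of the sample):

import React from "react";

// A simple function component, exported as this module’s default export.
const App = () => {
  return <h1>Hello React!</h1>;
};

export default App;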
INSTALLING NODE.JS
Node.js is a JavaScript runtime that runs on the V8 JavaScript engine. You can download and
install Node.js following the instructions available at https://nodejs.org/. If the
installation is successful, the following command should return the version of Node.js you
installed.
\> node -v
v12.18.1
We intend to run our React application on the browser; so, why do we need another
JavaScript engine with Node.js? There are two reasons why we need to install Node.js.
1. We use a tool called npm to download the JavaScript modules our React application
uses from the npm central registry. Once we download all the modules our React
application depends on, we can aggregate them together to generate the bundle.js
file. The npm tool comes with Node.js.
2. We use Babel to compile Next Gen JavaScript and JSX code into JavaScript that most
browsers understand. You may recall that in the examples we had before, we embedded
Babel as a JavaScript library in our HTML code, and Babel did the compilation on the fly
while loading the HTML page. That’s not an optimal way of doing things; so, here we
want Babel to do the compilation before we generate the bundle.js file, and the browser
should understand everything in the bundle.js file. To run Babel we need a JavaScript
engine, and we use Node.js for that.
INITIALIZE THE PROJECT WITH NPM
The node package manager, or npm, is a tool that comes with Node.js. If you have installed
Node.js, then you also have npm on your system. To initialize your React project, run the
following command from the appendix-a/sample05 directory. Here we pass -y as an
argument, so npm will use the default settings.

\> npm init -y
Now if you look inside the sample05 directory, you will find a file with the name
package.json. The following listing shows the auto-generated content of the package.json
file. Since we passed -y as an argument to the above npm command, the content of the
package.json file is generated using the default settings. The package.json file carries
references to all the node packages our React application depends on. We don’t see them in
the generated file yet, but in the next section, when we install the packages we need using
the npm tool, npm will automatically update this file.
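As a rough sketch, the auto-generated package.json looks something like the following (the
exact fields can vary with the npm version; the name is derived from the directory name):

{
  "name": "sample05",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC"
}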
Next, let’s install the react and react-dom node packages by running the following
command from the sample05 directory.

\> npm install react react-dom

After you run the above command, the npm tool updates the package.json file. The following
listing shows the updated file, with the newly added dependencies section, which carries
references to the react and react-dom packages.
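The dependencies section would look roughly like the following; the version numbers here
are illustrative, and you’ll see whatever versions npm resolved at install time:

"dependencies": {
  "react": "^16.13.1",
  "react-dom": "^16.13.1"
}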
In addition to the dependencies our React application needs at runtime to run on the
browser, we also need a few more node packages to help with the build process of the
project. We need the node packages babel-loader, @babel/core, @babel/preset-env, and
@babel/preset-react to translate the Next Gen JavaScript and JSX code to the JavaScript
that is understood by the browsers. And we also need the node packages webpack and
webpack-cli to aggregate all the JavaScript we need for our React application and generate
the bundle.js file. Let’s run the following npm command from the sample05 directory to
install those packages. Here we pass the --save-dev argument to instruct npm that we only
need these dependencies at development time.

\> npm install --save-dev babel-loader @babel/core @babel/preset-env @babel/preset-react webpack webpack-cli
The following listing shows the updated package.json file, with the newly added
devDependencies section.
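Again as a rough sketch, the devDependencies section would look something like the
following (version numbers illustrative):

"devDependencies": {
  "@babel/core": "^7.10.4",
  "@babel/preset-env": "^7.10.4",
  "@babel/preset-react": "^7.10.4",
  "babel-loader": "^8.1.0",
  "webpack": "^4.43.0",
  "webpack-cli": "^3.3.12"
}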
Apart from updating the package.json file, the npm install command also creates a new
directory called node_modules under the sample05 directory. This directory includes all the
JavaScript files corresponding to the npm packages we installed. We do not need to make
any changes to those files.
CONFIGURE WEBPACK
To configure webpack, create a file with the name webpack.config.js under the sample05
directory, with the following content. Webpack starts from src/index.js, runs every .js file
outside node_modules through babel-loader, and writes the aggregated output to
public/bundle.js.

const path = require("path");

module.exports = {
  // The module webpack starts reading from, following all imports.
  entry: "./src/index.js",
  // Where to write the aggregated output.
  output: {
    path: path.join(__dirname, "public"),
    filename: "bundle.js"
  },
  module: {
    // Compile every .js file (outside node_modules) with babel-loader.
    rules: [{ test: /\.js$/, exclude: /node_modules/, loader: "babel-loader" }]
  }
};
To run the project build with webpack, we also need to update the package.json file under
the sample05 directory with the following content. Here we add a new element called build
under the scripts element of the package.json file, so when we execute npm run build, the
webpack command in the following code gets executed.
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1",
"build": "webpack --mode production"
},
CONFIGURE BABEL
To configure Babel, we need to create a file with the name .babelrc under the sample05
directory. This file instructs Babel on how to carry out the compilation process. In this file we
can define a set of Babel presets. A Babel preset brings in a set of plugins that get
executed during the compilation process. The @babel/preset-env preset brings in all the
features you need to compile ES6 or later JavaScript, and the @babel/preset-react preset
brings in all the Babel plugins you need for a React application. The following listing shows
the content of the .babelrc file.
{
"presets": ["@babel/preset-env", "@babel/preset-react"]
}
To build the project, run the following command from the sample05 directory.

\> npm run build

Once the build is successful, you can find a new file called bundle.js created under the
sample05/public directory. To see how the application works, open the
sample05/public/index.html file using your favorite web browser.
The Create React App tool automates everything we did by hand in section A.7: with a
single command, you can create a complete React application.
Let’s install the Create React App tool with the following command. This is only a one-
time installation, and you can run it from anywhere on your local machine. Here we pass -g
to instruct npm to install the create-react-app package globally, so it’s available to any
React project you create.

\> npm install -g create-react-app
To create a React application using the tool, run the following command from the appendix-a
directory. Here we use another tool that comes with the Node.js installation called npx,
which is also known as a package runner. We pass hello-react as an argument to the
command, which becomes the name of our React application.

\> npx create-react-app hello-react
Once the command runs successfully, it creates a directory with the name of the React
application, hello-react. Under the hello-react directory you’ll find a directory structure and a
set of files similar to what we created in section A.7. One thing to notice here is that in the
package.json file you won’t find all the packages we had before in section A.7. The trick is
the react-scripts package (which we didn’t have in section A.7): react-scripts is a special
package that embeds the packages related to Babel, webpack, and many more.
The template React application created by the above command is designed to run in a
web browser. If you just try to open the index.html file using your web browser, you will
encounter some errors due to the locations of some files. The best way to test the app is to
run it on a web server, using the following command from the hello-react directory. It
will spin up a web server and host the React application on localhost port 3000 by default.

\> npm start
Local: http://localhost:3000
On Your Network: http://10.0.0.129:3000