Wa04 WebArchitecture

Web Application II course, prof. G. Malnati - Politecnico di Torino, 2021

When we create a web application, we are creating a distributed system: what the user sees is the result of the interaction between their own browser, a remote server that is somewhere else, and possibly a database that can be even farther away. So, it's the result of several computations which exchange messages and cooperatively create the final response, what we see on the screen.

So, we need a network, a client that connects to the server, and a server that connects to the database or to other systems. The communication is based on messages.

We create distributed systems first of all for scalability: we can face problems larger than those that fit on a single computer. Our computation can be split across several different computers and the partial results then put back together. Often it has to do not with the size of the problem but with the number of users we need to satisfy at the same time: we need to manage many concurrent users.

High availability: distributed systems are at the same time more fragile and more robust. They consist of several parts, so those parts can break independently (fragility). At the same time those parts can be redundant: they can exist in several copies, and if one collapses, another is still alive (robustness). A stand-alone system either works or doesn't work at all; if we have many degrees of functioning, a partially broken system is much better than a totally broken one.

Resource sharing: sometimes we have a resource with limited capacity, and we want to make it available to a lot of people.

However, distributed systems come with a lot of drawbacks, so we should know about them. They have a much larger complexity, because each single part can fail independently of the others. They are more complex to test, to deploy, and to operate. Since all those systems are inherently concurrent (they run separately), we don't have a common notion of time: one PC's clock can run faster or slower than another's, and neither can be taken as the true one.



We need to cope with partial failures. A single process either works completely or is broken. Here instead we can have something that works but slowly, something that is partially broken, or a perfectly working system with a broken network in between.

We don’t have a global clock; we coordinate them with messages.

Security: it is much easier to tamper with the system by injecting code here and there or by modifying messages.

We need to choose a suitable architecture. The architecture is just a picture of how all the parts fit together. We have to be sure that the structure we adopt supports the properties we need, such as robustness and reliability. We should also take care to avoid common mistakes.

We typically use the software architecture to define the structure of the system: how the elements are made and which relationships exist among them. By seeing such a picture, we can better understand how the system behaves and make choices with respect to potential failures, risks, or opportunities.



There are some major keywords we should know. One is idempotence. When we exchange messages, sometimes those messages fail: the communication cannot deliver the message. So, if I get no answer to my message, I can repeat it, but repeating carries a big risk. If I send the message again, the system may now have two identical messages in the queue, and this can be a problem: getting the same message twice may produce a different result from getting it once. The communication is not completely reliable.

Immutability: we should never delete or alter anything; we should only append. It means having information grow monotonically. Changing things in place is dangerous, because there is a delay between the command and the change, and we cannot tell what is happening in between.

Location independence: what an application does should not depend on where it is deployed. This is extremely important, because sometimes we have a distributed system with several web services spread around the world.

Versioning: both sides should understand what the other is saying (which alphabet, which syntax), so messages and interfaces should carry a version. We may change them later on; if we don't provide any indication of version, changes become impossible.

People dealing with networks tend to make a well-known set of assumptions which are wrong. Failing to properly account for these items creates unusable systems.

We must design our system knowing that something can go wrong. When we have a monolithic system (a stand-alone application), either it works or it breaks: if the power goes down, it is completely unusable. If we have a distributed system, we can have one part offline while the others keep working. Systems should be designed in a way that all problems can be detected; we need to understand what is going on.

In a stand-alone application we don't really need a log; in distributed applications we need a log file where all systems convey their messages, so that we can understand what happened. If we don't write down what is happening, we are not able to reconstruct what went wrong and what needs to be adjusted. If we have a failing system, it should be automatically restarted, but that means we need a way to automatically detect that the system is failing. The client, when it realizes that a given component doesn't answer anymore, should stop bombarding it with requests, otherwise the situation becomes worse.

Once the problem has been corrected, requests should start again, because the operation needs to be completed. This means that the client should be resilient. We have many clients; in a simple system we can have one client (the browser) where the user is connected, a server somewhere else where information is managed, and maybe a database behind it where information is persisted. The server itself is a client for the database, and the database too can fail for some reason.

Whoever plays the role of the client and tries to connect to a server needs to remember what it is doing, so that it can later resume and complete the operations that were not finished, or at least track in a reliable way that an operation is unfinished and undo what was already done.

Dealing with these things manually, by saying "I have a system admin who oversees the network and checks that everything is running OK", works only for toy systems. If we have a real system with many connected machines, a single person is unable to track the complexity of the whole system. We need an automatic way to detect failures and restart machines.

Of course, using an orchestrator does not really impact the way we write the software; we can write the software without knowing an orchestrator will be used. Still, orchestration sometimes requires that the software expose some features, like a liveness check that the orchestrator can invoke to understand whether the system is still responding. We should also have some kind of readiness check, as in the sketch below.
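As a minimal sketch, this is what such probe endpoints could look like in Spring (the framework these notes use later); the paths, class names, and the DatabaseChecker dependency are illustrative, not a standard API:

    import org.springframework.http.ResponseEntity
    import org.springframework.web.bind.annotation.GetMapping
    import org.springframework.web.bind.annotation.RestController

    // Illustrative dependency, not a real library class.
    interface DatabaseChecker { fun isReachable(): Boolean }

    @RestController
    class HealthController(private val db: DatabaseChecker) {

        // Liveness: the process is up and able to answer at all.
        @GetMapping("/health/live")
        fun live(): ResponseEntity<String> = ResponseEntity.ok("UP")

        // Readiness: the process can actually serve traffic,
        // e.g. its database connection works.
        @GetMapping("/health/ready")
        fun ready(): ResponseEntity<String> =
            if (db.isReachable()) ResponseEntity.ok("READY")
            else ResponseEntity.status(503).body("NOT_READY")
    }

An orchestrator such as Kubernetes would poll URLs like these periodically, restarting the process when liveness fails and withholding traffic while readiness fails.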

Sometimes we also need to adopt suitable architectural patterns. We already use patterns in software: we use the composite pattern when we need a particular structure, organizing classes in a given way, and so on. In large systems, we need some major building blocks (API gateways, circuit breakers, caches, and so on). A circuit breaker is a piece of code that stands between a caller and a destination, checks that the communication goes smoothly, and, when the destination stops working, fails fast and provides hints about when the destination is alive again.
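A minimal, library-free sketch of the circuit-breaker idea (real projects would typically reach for an existing implementation; the thresholds here are illustrative):

    import java.time.Duration
    import java.time.Instant

    // After maxFailures consecutive failures the breaker "opens" and fails
    // fast; once the cooldown elapses it lets one call through to probe
    // whether the destination is alive again.
    class CircuitBreaker(
        private val maxFailures: Int = 5,
        private val cooldown: Duration = Duration.ofSeconds(30)
    ) {
        private var failures = 0
        private var openedAt: Instant? = null

        @Synchronized
        fun <T> call(operation: () -> T): T {
            openedAt?.let { opened ->
                if (Instant.now().isBefore(opened.plus(cooldown)))
                    throw IllegalStateException("circuit open: failing fast")
                openedAt = null            // cooldown over: try again once
            }
            return try {
                val result = operation()
                failures = 0               // success closes the circuit
                result
            } catch (e: Exception) {
                if (++failures >= maxFailures) openedAt = Instant.now()
                throw e
            }
        }
    }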

One class of failures is related to the fact that the network is intrinsically unreliable. Even extremely powerful and well-organized connections can break.

An API call between a client and a server is not like a function call. It looks like one on the surface, but it is not. When we have a function call, it is the same processor that jumps to a different address and executes something: either it runs or the whole process is blocked. It's local.

In a network connection instead, the client can still be alive, the server can still be alive, but the path in between can break. The client sends a message to the server (the request) and, if everything goes fine, within a given time a response comes back, which can be a success (request accepted) or a failure. This is already a first difference, because in code we make calls that return a value or throw an exception, taking a different path. In API calls we don't have exceptions per se; we always have a response.

There is another problem here. When we send a request, it is possible that the wire is cut and the message doesn't reach the server, or that the message did reach the server and the wire was cut a few seconds later, while the server was sending back the answer. From our point of view, we cannot see the difference: we sent the request and got nothing back. If the server did receive the request, it probably operated on it and performed any side action that was requested (for example, sending an email to somebody), but we cannot know, because we cannot receive the answer. Or the server didn't receive it. So we really don't know whether our request may already have caused some effects or not; or maybe we don't get any answer because the server itself is broken.

From the client's point of view, having sent a request and gotten nothing back, the only way not to be blocked forever is to define a timeout. In a function call we don't have a timeout. How long should the timeout be? If it is too short, we may trigger a false failure, declaring a failure when the server was merely slow; if we make it too long, the system becomes totally unresponsive, because we are still waiting for the answer and the client knows nothing about what happened.

In any case a timeout can tell us that something went wrong, but we cannot know what.
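As a sketch, here is a remote call with explicit timeouts using the JDK HTTP client (the URL and the chosen durations are illustrative):

    import java.net.URI
    import java.net.http.HttpClient
    import java.net.http.HttpRequest
    import java.net.http.HttpResponse
    import java.net.http.HttpTimeoutException
    import java.time.Duration

    fun main() {
        // Without explicit timeouts, a cut wire could block us forever.
        val client = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(2))
            .build()

        val request = HttpRequest.newBuilder()
            .uri(URI.create("https://example.com/api/items"))  // hypothetical endpoint
            .timeout(Duration.ofSeconds(5))                    // max wait for the response
            .GET()
            .build()

        try {
            val response = client.send(request, HttpResponse.BodyHandlers.ofString())
            println("status: ${response.statusCode()}")
        } catch (e: HttpTimeoutException) {
            // We know something went wrong, but not what: the request may or
            // may not have reached the server and produced side effects.
            println("timed out: outcome unknown")
        }
    }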

Having idempotent APIs helps make this problem less dramatic. A function or request is idempotent if receiving it one or more times has the same effect as receiving it once. If we ask our server to send a mail to someone, idempotency requires being able to tell whether it was already sent. Some operations are idempotent by nature, like reading some information without modifying it. If we set a value, saying "turn the light on", and the light is already on, nothing happens. Messages that contain a total description of the final state, of what I expect to find, are reasonably safe from this point of view, as the sketch below shows.
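A tiny illustration of the difference (the Light class is just a stand-in for any piece of server state):

    class Light(var on: Boolean = false)
    val light = Light()

    // Delta command: NOT idempotent. Delivered twice because of a retry,
    // the light ends up back in the wrong state.
    fun toggleLight() { light.on = !light.on }

    // Command carrying the total description of the final state: idempotent.
    // Delivered once or twice, the result is the same.
    fun setLight(on: Boolean) { light.on = on }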

There is one situation which happens very often when we have a web system that is approached via a REST API.

REST (representational state transfer) is a paradigm for creating APIs. It observes that most of the time a client or web browser application needs to operate, through a server, on a database, and on the database it sometimes needs to read some data, update some existing data, delete some data, or create some new data. These four operations map onto the operations that the database performs (the CRUD operations), and HTTP has four main methods which map very well onto CRUD. HTTP has GET, which defines an operation that doesn't alter the state of the server, a read-only operation: GET is good for expressing a request to read something; it is safe, does not change anything, and is idempotent. HTTP also has DELETE, useful to remove something. It is not safe, since it changes the server, but it is idempotent: once something has been deleted and no longer exists, deleting it again has no further effect.

We also have PUT, which is idempotent and useful for updating: since it sets the content of a piece of data to a precise value, receiving it twice is not a problem. Finally, we have POST for those operations that are neither safe (they change the state of the server) nor idempotent.

In the REST paradigm, we have URLs that describe the things in our database. We can address the content of a table with a precise URL: we can have /customer to read the whole table, or /customer/id to read a precise record. We can also delete records: deleting something works fine, and we can change the information contained in a customer with PUT. But if we have a new customer, we don't have an ID for it: it cannot exist yet in the database. We can read all the records, but we cannot search for a non-existent ID, and inventing an ID ourselves is dangerous, because two different clients can do the same operation at the same time. For these situations, in the REST architecture we typically send a POST to the /customer URL without any further detail: we send the whole record we want to create, apart from the ID, relying on the fact that the server takes the data, stores it in the DB, and the DB, equipped with some feature like auto-increment, assigns the ID atomically.
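A sketch of how the /customer URLs just described could map onto a Spring controller (the DTO and the CustomerService facade are illustrative):

    import org.springframework.web.bind.annotation.*

    data class CustomerDTO(val id: Long?, val name: String)

    interface CustomerService {                    // illustrative facade
        fun findAll(): List<CustomerDTO>
        fun findById(id: Long): CustomerDTO
        fun update(id: Long, dto: CustomerDTO): CustomerDTO
        fun delete(id: Long)
        fun create(dto: CustomerDTO): CustomerDTO
    }

    @RestController
    @RequestMapping("/customer")
    class CustomerController(private val service: CustomerService) {

        @GetMapping                   // safe and idempotent: read the whole table
        fun list() = service.findAll()

        @GetMapping("/{id}")          // safe and idempotent: read one record
        fun one(@PathVariable id: Long) = service.findById(id)

        @PutMapping("/{id}")          // idempotent, not safe: full update
        fun update(@PathVariable id: Long, @RequestBody dto: CustomerDTO) =
            service.update(id, dto)

        @DeleteMapping("/{id}")       // idempotent, not safe: remove
        fun delete(@PathVariable id: Long) = service.delete(id)

        @PostMapping                  // neither safe nor idempotent: create;
        fun create(@RequestBody dto: CustomerDTO) =
            service.create(dto)       // the server assigns the ID atomically
    }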

This kind of schema breaks if the communication between client and server has problems. The client sends a POST request to the server asking to create a new user. For some reason, the client does not receive any answer; it may happen that the server received the request, went to the DB, and inserted the record with a new ID, but the answer could not be propagated back. The client doesn't know, and sends another request. If it tries again, the server receives a new request and creates another record with a new ID and the same data.

To deal with this, we can rely on primary keys being unique: we have numbers, and the database assigns them progressively. Since the DB has its internal coherency tools, it is safe internally, and if it receives several requests it can assign unique numbers consistently.

A second possibility is that the client invents the ID itself. If we pick it from a small interval, the probability of making a disaster is huge. If we take 256 bits and make them random, the probability that two clients generate the same ID becomes very low; we can also take 512 bits. The larger the domain, the smaller the probability of generating identical keys (identical to one already in the DB, or to one produced by another client).

So one possibility is to have the client invent its own ID. IDs created in this way are typically called UUIDs (universally unique identifiers). This choice, though, can make the DB really expensive: it needs to manage the table in a reliable way, with a quick way to fetch the right information and return it. If records are written in order, lookups can be really fast.

If records are not written in order (we insert, delete, and modify them, so they end up out of order) and IDs are random, the only way to fetch information quickly is to create an index, an auxiliary structure managed by the DB: a secondary table that we don't see, holding the sorted identifiers so records can be found quickly.

Indexes made of auto-increment numbers are compact. Indexes made of UUIDs are sparse; they tend to use a huge amount of disk and, as a consequence, a huge amount of RAM in the DBMS. Having IDs generated by the client fixes the idempotency problem only partially.

There are other possibilities: the client can add an extra field to the request, a UUID, while we still let the server create its own ID. This UUID is not the ID of the record that is going to be created; it identifies the current operation, and the server can store it somewhere temporarily. In that way, if it receives the same request again, it discards the duplicate. This consumes space but gives some kind of resiliency, as sketched below.
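A sketch of this idempotency-key idea; the in-memory map stands in for the temporary storage the server would really use:

    import java.util.UUID
    import java.util.concurrent.ConcurrentHashMap

    // The client generates one UUID per logical operation and resends the
    // same UUID on every retry of that operation.
    data class CreateCustomerRequest(val requestId: UUID, val name: String)

    class CustomerCreator {
        // Already-seen operation IDs, mapped to the record ID the server
        // assigned. In-memory here; a real system would persist this.
        private val seen = ConcurrentHashMap<UUID, Long>()
        private var nextId = 0L

        @Synchronized
        fun create(req: CreateCustomerRequest): Long =
            seen.getOrPut(req.requestId) {
                // Executed only the first time this requestId is seen; a
                // retried duplicate just gets the original result back.
                ++nextId            // stands in for the real INSERT
            }
    }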

It is not necessary everywhere, only in those situations where the data we create are really important.

Idempotency comes up in several other situations, like the problem of topology changes. Some applications need to work all the time. Being always on is impossible, because things sometimes break, and in some situations we need to turn something off and turn something else on. Systems need to be highly available. We can create, for example, two DBMSs: one is called the master and the other the slave. The slave just mimics the operations of the master, but if something goes wrong and the master shuts down for some reason, the secondary will step in.

Now the secondary will start responding and creating new entries. Then the master will resume, but it is not yet aligned with the secondary, because meanwhile the secondary went ahead. If the master starts answering again, it may create content that already exists on the slave.

This is potentially problematic, so we use UUIDs and we don't rely on the DBMS's ability to create unique identifiers. Another way to generate an ID automatically is to make a hash of the record: if the record is the same, the hash will be the same (content-based storage). We can assign a hash at the beginning, but when the record changes, the hash no longer reflects the content.
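A sketch of such a content-based ID, hashing the record's bytes with SHA-256:

    import java.security.MessageDigest

    // Identical records get identical IDs, so re-inserting a duplicate is
    // harmless. This only works while the record stays immutable: change
    // the content and the ID no longer matches it.
    fun contentId(record: ByteArray): String =
        MessageDigest.getInstance("SHA-256")
            .digest(record)
            .joinToString("") { "%02x".format(it) }

    fun main() {
        println(contentId("name=Alice;city=Torino".toByteArray()))
    }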

We also have the problem of the inhomogeneity of networks and applications. Suppose we create a successful piece of software, like SAP (an administrative database and server system adopted by a great many companies, which use it to track invoices, employees, and so on). Sometimes bugs affect it, so we need to improve it. But how many versions of SAP exist in the world? A lot: some companies update it and others do not.

The problem is that if we need to fix things, we need to know what is available where and track it. The
software is typically versioned.

When we use git or another versioning system for code, we can manage this by replacing the old version. Just having a git repository with different versions of the code is not enough, though: we also need pipelines to test and integrate it.

Versioning data is more complex. When we update from version 17 to 18, maybe we change the way some tables are created, because of some major problem or because a new version is out. We may need to create a new table or add a column to an existing table, changing the schema. That's a problem, because we cannot simply replace the version of the software: we need to migrate data. Migrating data by one version can be easy, but if we need to migrate from 17 to 21, we cannot do it in one step. And if we go forward, maybe we cannot go backward: if we deploy a new version that is bugged and the DB has already been migrated, we cannot turn back.

There are tools that help us make those transitions, automating most of the scripts we need in order to apply changes, check that everything is fine, and so on. The most challenging thing is versioning the APIs. The easiest case is when the client is JavaScript code served by the same server it talks to, so the API is the only point of contact. Changing the API is sometimes necessary because clients have new needs. The API links two code bases (client and server).



Sometimes we just add a new method that previously didn't exist: clients written against the old API can still query the new server, which performs the same operations as before plus something more. Sometimes, instead, we need to drop a method because it is dangerous, so we have to remove it immediately: we need to make changes that are no longer backward compatible.
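One common convention (illustrative, not the only option) is to embed the version in the URL, so that old and new clients can coexist while a breaking change rolls out:

    import org.springframework.web.bind.annotation.*

    // Old clients keep calling /api/v1; new clients use /api/v2, whose
    // payload has a different, incompatible shape.
    @RestController
    @RequestMapping("/api/v1/customers")
    class CustomerControllerV1 {
        @GetMapping("/{id}")
        fun one(@PathVariable id: Long) =
            mapOf("id" to id, "name" to "Alice Rossi")                        // v1 shape
    }

    @RestController
    @RequestMapping("/api/v2/customers")
    class CustomerControllerV2 {
        @GetMapping("/{id}")
        fun one(@PathVariable id: Long) =
            mapOf("id" to id, "firstName" to "Alice", "lastName" to "Rossi")  // v2 shape
    }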

What we said so far is valid for any kind of distributed application.

A web application is a special kind of distributed system where at least one segment is based on HTTP. Typically it is the first segment, because we want to use a ubiquitous client (the browser) that every user already has on their PC, so they don't have to download anything.

The idea of using HTTP is that network configurations impose many restrictions, but almost everywhere in the world HTTP is open. We could create any other kind of protocol, but we would run into the problem that our protocol gets broken because there are firewalls in between that prevent communication.

We need to stick to what HTTP offers us.

A web application is an application. We create applications because people need to perform some kind of job, and they benefit from the fact that computers assist them. Since the web is distributed, differently from what happens for stand-alone applications where we have one client at a time, here we can have many clients at the same time, so applications are necessarily concurrent. They should be capable of safely dealing with different, possibly diverging requests arriving at the same time, while guaranteeing coherence.

When we pass from sequential code to concurrent code, complexity explodes. The server is in charge of guaranteeing that the application behaves according to the business logic, the purpose for which the application is built. The data provided by the user needs to be stored on a server, processed, and routed either back to the user issuing the request or to other subsystems.



We have a user that uses the frontend software, typically a browser with some JavaScript, HTML, and CSS code, and starts issuing a flow of requests to the server, which applies some business logic, potentially operates on some storage (a DB, the file system, or other), creates responses, and sends them back to the client, which interprets them.

When we design a web application, we do a lot of things. Designing a web application is a broad process that spans different domains and dimensions. The first step in designing a system is defining the purpose of that system (why we are designing it); we need to make it clear.

Then we need to define how it should behave, how the purpose is reached. Designing a system means defining a purpose and understanding the way in which we need to reach that purpose.

The behavior is based on an overall structure (the architecture) and then on a lot of details about how that structure is really built (the implementation). We usually design systems with a progressive-detail approach: we start broad, and then we go into detail to define better what each part does (divide et impera). This is possible for systems that are complicated but not complex. If we have a truly complex system (there is a huge amount of theoretical study behind this distinction), we cannot apply the principle of dividing it into parts.

Then we have module-level design, where we go into each component, define how the components communicate with each other, and set many other constraints needed to fit properly into the overall picture.

We call software architecture a picture representing the main components of the system with their relationships, how they interact with each other.

An architecture is a mental model adopted in order first to conceive, then to create, and later to operate the system; if we lack such a mental model, all of this becomes impossible. Choosing a software architecture that follows a known pattern means not repeating known mistakes and benefiting from already-discovered advantages.

Designing an architecture is a task of compromise, because there are forces that constrain how we can operate: costs, time to market, performance (how fast the system will be), maintainability (how much the system will cost in the years to come to keep it running and able to follow every changing requirement from customers), quality (how good the system will be), security (what the risks of misuse of the system are), and others.

This says that we cannot just do what we want when we design the architecture.

We typically come up with a tiered approach, splitting the system into a set of layers, each one operating on top of the others. We typically have the final users (a community of users, not only one) operating on the front end, which talks via HTTP to the web application. The backend starts with the web tier, which is in charge of accepting and processing HTTP requests; we also need some place where information is maintained and stored, called the data tier.

If we start looking inside, we see something more. The client tier is typically hosted in a web browser. This is convenient, because people already have their browser. It is also a problem, because different browsers exist and they are not all equivalent, especially if we want to use some particular features.

Browsers offer generic sandboxed execution environments that provide access to remote systems via HTTP. HTTP is accepted as the protocol mostly because it is the only protocol browsers can speak.

We must say that modern browsers offer really high performance, an impressive software architecture and set of capabilities, and very good behavior in network communication.

There are situations where, instead of creating a web application based on the browser, we create stand-alone applications (the mobile world). This sometimes happens on embedded devices, which have small resources and prefer a dedicated application. Sometimes we need it because browsers have sandboxes, limitations on what the software can do.

The server tier is built using a layered approach. In the server we don't have one huge program running; rather, the program is created as a set of layers, each one aimed at a specific feature. Each layer only interacts with the adjacent one, limiting dependencies and specifying contact points in the interface.

In this way we get modularity and separation of concerns. The code can have few or many lines. We split it among many different teams of people, each working on a different aspect of it, so it is necessary that the communication between modules is well defined. Those layers can be part of a single monolithic process or can be implemented as separate microservices.

A typical web tier consists of an inner block which receives requests from the client, produces responses, and operates on the data tier or on some remote services. In the web layer we may distinguish a front-facing part, labeled as public, the one exposed to the client, consisting of a set of entry points (typically URLs where we can get and put information), and a private part which operates internally in order to store information in the data tier or to manage the information forwarded to subsystems.

Typically, the presentation layer will be totally public, the exposed part; we create controllers for it. Conversely, the data access layer is totally private, the one that knows how information is stored and maintained inside the system.

In the middle is the place where the business logic stays. The data layer just knows how to store and retrieve pieces of information in tables. The service layer is the heart of our system: it is the place that knows the rules, which operations are legal, for which kinds of users, when, and so on.

The value of the application stands in the service layer; the other parts are pieces of software necessary for adapting incoming data or adapting requests coming from another component. They contain no intelligence by themselves, apart from code-level optimizations. The domain lives in the service layer: the service layer knows the real purpose of what we are doing.

The service layer has a front-facing interface, and one toward the backend as well. Data which flows outward from the service is called a DTO (data transfer object), and it is a representation of the information we want to send to the client: public data. Data managed in the database is called the domain model, or entities.

We may choose in our private part to have two tables, but we don’t need to reveal this information to our
client, so we can put this information together in one larger object and send it to the client. In that way we
can evolve the database independently of the user interface and vice versa.

Data can be immediately stored in the DB or file system, or it can be sent to a remote service which will operate on top of it. If we are building a shop, some data is stored inside the system and some other data, like payments, is handled outside. For example, the bank returns a validation of the payment and the application acts accordingly: we have a chain of responsibility.

The web tier is organized in several layers.

The public part contains the presentation layer. We create the presentation of our services in terms of controllers (objects in charge of responding to HTTP requests coming from the end user and producing HTTP responses). The controller typically only deals with taking the transmitted information into and out of the HTTP envelope; no processing from the point of view of the domain happens here. Information is encoded in DTOs. Conversely, in the private part we have the data access layer, which only considers how information is persisted in the DB or file system; this part contains entities. In between we have the service layer, the place where the business logic happens.

The service layer is typically implemented by objects that conform to the service stereotype. We may have several objects in the service layer, each in charge of a specific group of functionalities. The methods in the service layer implement the business logic.

The objects we deal with are often known not by their specific class but by the interface they implement.

All requests appear to be independent of each other: the HTTP protocol does not remember what happened before and does not want to know what happens after. Each request is treated on its own. So, instead of instantiating the service layer once on a single PC, we can instantiate it several times on many different PCs, because each request is considered independent of the others. We then have a load balancer that distributes the incoming requests to the different servers connected to it. This allows us to scale horizontally.
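In application terms, the balancer's job reduces to something like the following sketch (real deployments use a dedicated component such as nginx or HAProxy rather than code like this; the addresses are illustrative):

    import java.util.concurrent.atomic.AtomicInteger

    // Round-robin choice among equivalent backends: because the web tier is
    // stateless, any instance can serve any request.
    class RoundRobinBalancer(private val servers: List<String>) {
        private val next = AtomicInteger(0)
        fun pick(): String =
            servers[Math.floorMod(next.getAndIncrement(), servers.size)]
    }

    fun main() {
        val lb = RoundRobinBalancer(listOf("10.0.0.1:8080", "10.0.0.2:8080"))
        repeat(4) { println(lb.pick()) }   // alternates between the two servers
    }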



A typical configuration could be something like this: the client knows only one well-known address, that of the load balancer, a thin process that knows a set of servers which really host the implementation and distributes the incoming requests among them. All those servers share the same data tier, because while the HTTP protocol is stateless, the application is not. Once we log into the system, it knows we logged in: we now have an identity, so we can do specific actions that before we could not do.

The web tier is stateless, but the application is not; the state is kept in the data tier. When we GET, POST, or PUT some information to the web server, it reacts to the request by storing or updating some information in the data tier. If more requests come in referring to the same user or the same entity, the data tier will consistently provide the updated information.

This means that we may have many web servers and one single powerful database behind them, capable of supporting all the transactions that the various instances of the web tier request.

At the top of the web layer there is the presentation layer, which is totally public. It is in charge of processing all incoming requests and producing suitable responses. Those requests will be compliant with the HTTP protocol, so they contain a method, a URL, and possibly a query string. The request also carries headers, which are inspected in order to produce a suitable response. The methods are specified by the HTTP protocol (GET, POST, PUT, DELETE). GET should be both safe and idempotent. Safe means that no change happens on the backend because of this request: the information remains the same. Idempotent means that receiving two or more identical requests has the same effect as receiving only one of them.



POST is neither safe nor idempotent. It is used to model those situations where we need to update the set of information that the backend manages. Receiving two POSTs produces a different result from receiving only one of them.

PUT is idempotent but not safe: potentially, a change occurs in the backend. DELETE is not safe but idempotent; not safe because it changes a piece of information.

HTTP also defines PATCH, a variation of PUT useful when we need a partial update of a record. It is problematic because it may or may not be idempotent, and it is not safe. We also have the OPTIONS method, relevant when we work with JavaScript: servers typically deny requests generated by the browser from another origin unless a specific CORS policy has been enabled. The browser checks this with a preflight request: first it queries the URL it wants to contact with a special OPTIONS request, which simply asks the server which operations it supports and which restrictions apply. OPTIONS is automatically implemented by the framework, but we can customize the behavior. There is also the TRACE method, useful for debugging.

Then we have URLs: a sequence of segments, where a segment is defined as a run of letters, numbers, underscores, and a few other characters, separated by slashes. The structure of a URL may resemble the structure of a path name on a file system: we divide it into segments using slashes. URLs are basically pieces of data that we send to the server; they can be used to identify a resource on a server.

One of the difficulties web implementors sometimes have is figuring out how to name the URLs: we have to name the endpoints and understand how to call them. Inventing good URLs matters, since they convey parameters for the method we want to invoke. For example, if we have a web application that manages a set of purchase items, we can have a URL beginning with /item followed by a number that represents the ID of a single item. Or we may have a more sophisticated one, /item/id/manufacturer/id: in this way we can carry both the ID of the item and the ID of the manufacturer.

URLs may contain a query string. Typically we segment URLs using slashes, but the URL proper ends where a question mark appears, and the part after it is called the query string. It is used to send information from the client without using a body: HEAD, GET, and DELETE don't have any body in the request, while POST and PUT may have one.

Since sometimes we need to perform a query using GET and pass some parameters, they get encoded in the URL via the query string. What comes after the question mark is usually segmented using '&', splitting it into blocks that are typically interpreted as key=value pairs, where key and value are URL-encoded: they are strings of letters and numbers, where '+' stands for a space and all the other characters are replaced by their hexadecimal encoding. Encoded this way, they remain perfectly readable and easy to guess. The query string lets us pass information from client to server; the server may use it in a way consistent with the verb being used.
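A small illustration of this encoding with the JDK (the parameter names and values are made up):

    import java.net.URLEncoder
    import java.nio.charset.StandardCharsets

    fun main() {
        val params = mapOf("name" to "Mario Rossi", "city" to "Torino")

        // Each key and value is URL-encoded; pairs are joined with '&'.
        val queryString = params.entries.joinToString("&") { (k, v) ->
            URLEncoder.encode(k, StandardCharsets.UTF_8) + "=" +
                URLEncoder.encode(v, StandardCharsets.UTF_8)
        }

        // Prints: /customer?name=Mario+Rossi&city=Torino  (space becomes '+')
        println("/customer?$queryString")
    }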

There are headers that appear only in responses and headers that appear only in requests; a header such as Content-Type tells us how to interpret the body.

A body can be anything: an array of bytes. The way in which we interpret it is determined by the presence of certain headers.

The presentation layer is characterized not only by the capability of decoding and interpreting incoming requests, but also by the ability to deal with several requests at the same time, and it is free to choose how to implement this. It can choose a one-at-a-time policy, which is not so interesting, or it must be able to deal with concurrent requests, which means a thread-safe implementation.

Spring internally manages concurrency using two different approaches. The first, which is the most common and the oldest, is based on having by default a very large thread pool whose threads concurrently wait for incoming requests; once a thread finishes its part, it goes back to the pool. It works fine; the only limitation is that we need to pre-allocate a very large number of resources, and sometimes we want to increase the pool.
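A sketch of this thread-per-request model with a plain JDK executor (the pool size and the simulated work are illustrative):

    import java.util.concurrent.Executors

    fun main() {
        // A fixed pool of workers waits for incoming jobs; each request
        // occupies one thread for its whole, possibly blocking, lifetime.
        val pool = Executors.newFixedThreadPool(200)

        repeat(5) { i ->
            pool.submit {
                // Stand-in for handling one HTTP request, including
                // blocking calls towards the database.
                println("request $i handled by ${Thread.currentThread().name}")
            }
        }
        pool.shutdown()
    }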

The second approach is the asynchronous one: we do not need many threads, but those threads must never block. A thread could block because, for example, it is invoking an operation that happens to be served by another process. By making operations asynchronous, we ask to be called back when the value we requested is ready, so that in the meantime we can process another request. The way the code is written becomes more complicated.

The fact that we need to manage several requests coming from different clients also has the potential complication that we need to figure out what those requests mean. The HTTP protocol is stateless, but the application is not: it must recognize whether a request comes from the same client that, just a few milliseconds before, asked for something else. To deal with this, sessions are used.

A session is just a way of assigning a unique random ID to each client. Sessions are based on a cookie, a piece of information that a server can hand to a client inside a response. Typically, a web server that manages sessions can keep a concurrent hash map, which can be safely accessed by many threads at the same time, using the session ID as the key and a secondary map as the value, where we store whatever information can be useful.

This information never leaves the server; only the session ID travels back and forth to the client. Of course, this is fine if we have only one web server. If we have two instances behind a load balancer sharing the same application, information kept in one instance should be shared with the other, so we need some kind of library that keeps the two maps aligned.
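A sketch of the session table described above (servlet containers implement this for us; the structure is just for illustration):

    import java.util.UUID
    import java.util.concurrent.ConcurrentHashMap

    // Thread-safe map from session ID to a per-client map of attributes.
    // Only the session ID travels in a cookie; the attributes stay here.
    object Sessions {
        private val store =
            ConcurrentHashMap<String, ConcurrentHashMap<String, Any>>()

        fun create(): String {
            val id = UUID.randomUUID().toString()   // unguessable random ID
            store[id] = ConcurrentHashMap()
            return id
        }

        fun attributes(id: String): MutableMap<String, Any>? = store[id]
    }

    fun main() {
        val id = Sessions.create()                  // sent back as a cookie
        Sessions.attributes(id)?.put("user", "alice")
        println(Sessions.attributes(id)?.get("user"))   // alice
    }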

The presentation layer consists of objects which, in the Spring framework, are annotated as @Controller or @RestController. We call controllers the classes responsible for handling the presentation layer. The methods of a controller are mapped onto URLs. The parameters of those methods may be the URL itself or part of it, headers, the query string or part of it, the session, or some extra information the framework can provide. The values returned by those methods will be either the content to return to the client or an indication of how to generate that content. Those methods generate responses consisting of full pages, or of plain data structures which are typically returned to the client as JSON objects (the typical SPA approach).
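A sketch of this parameter binding in a Spring controller (names are illustrative; depending on the Spring version, the session type comes from javax or jakarta):

    import jakarta.servlet.http.HttpSession
    import org.springframework.web.bind.annotation.*

    @RestController
    class ItemController {
        // Path, query string, header, and session are all injected by the
        // framework as method parameters.
        @GetMapping("/items/{id}")
        fun item(
            @PathVariable id: Long,                          // from the URL itself
            @RequestParam(required = false) q: String?,      // from the query string
            @RequestHeader(value = "Accept-Language", required = false)
            lang: String?,                                   // from a request header
            session: HttpSession                             // framework-managed session
        ): Map<String, Any?> =
            mapOf("id" to id, "q" to q, "lang" to lang,
                  "visits" to session.getAttribute("visits"))  // returned as JSON
    }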

The service layer is the core of our application; it's where the business logic is. If the application is about purchasing tickets, the service layer performs the purchase: all the rules are here. But the service layer has no side effects by itself; side effects are all delegated to its own service provider, which is the data layer. Typically, elementary operations map one to one onto the methods the service provides (if we buy a ticket, there is a method for buying a ticket). This is helpful because when we start reading a specification, we start imagining the project, and for each action we imagine a method. This helps to understand the logic behind it.

If the requirements are very sophisticated, they can be split into different areas, each with a service object holding all the methods for the kind of operations happening there. We read the specification and the requirements, and we design the service layer.

The operations that the service layer implements will typically need to be atomic: atomic in the sense of a transaction, that is, we either see all of the operations or none of them, with no intermediate steps. Typically we consider those methods part of a transaction, and the operations on the database should reflect this.

Typically, the service layer is stateless: we design objects, but they have only methods, no attributes. State is not kept in memory; we keep it in the database.

The service layer receives DTOs as the parameters of its methods and returns DTOs to the layer above, but internally it transforms DTOs into the representation the service needs: domain models, or entities. We do that because we want to be sure that we can evolve the database representation independently of the way in which we evolve the communication with the client.

As we learn more about our application, we may decide to change the way we model the data in the DB without the client realizing it. This also has a very good side effect: it makes our layers testable and mockable. Testable, so we can write unit tests against them; mockable, so we can replace a real object with a mock (fake) object that contains just the elements necessary for the test to succeed.
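A sketch of this separation, with an illustrative entity, DTO, and a transactional service method:

    import org.springframework.stereotype.Service
    import org.springframework.transaction.annotation.Transactional

    // Private representation, shaped for the database.
    class Customer(var id: Long? = null, var name: String, var internalNotes: String = "")

    // Public representation, shaped for the client: internalNotes never leaves.
    data class CustomerDTO(val id: Long?, val name: String)

    fun Customer.toDTO() = CustomerDTO(id, name)

    interface CustomerRepository {        // illustrative; see the data layer below
        fun findById(id: Long): Customer?
        fun save(c: Customer): Customer
    }

    @Service
    class CustomerService(private val repo: CustomerRepository) {

        // Atomic: either the whole operation becomes visible, or none of it.
        @Transactional
        fun rename(id: Long, newName: String): CustomerDTO {
            val entity = repo.findById(id)
                ?: throw NoSuchElementException("no customer $id")
            entity.name = newName
            repo.save(entity)
            return entity.toDTO()         // only DTOs flow out of the service
        }
    }

In a unit test, CustomerRepository can be replaced by a fake in-memory implementation, so the business logic is exercised without any database.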



Domain models are the way in which we represent the information our application needs to manage, in a form suitable for the specific DBMS we have chosen. We can start by choosing a traditional SQL database, where we have tables and columns containing simple values. Later on we may realize that, due to the nature of the application and the way people tend to use it, a graph database or a document database would be better, and we can change everything, mapping it onto the same DTOs as before.

The implementation of the domain model is private, and it is relevant for improving the performance of the system.

Entities, or domain objects, are handled by the data access layer, which persists and retrieves the information we ask for. Typically, the data access layer only executes CRUD operations. Of course, those elementary operations are under the control of the service layer, which knows the rules.

The data access layer is so trivial that it is usually generated automatically; there is no need to write anything, and it is generated in a consistent and reliable way. Typically we use a framework for this, an ORM (object-relational mapper) or an ODM (object-document mapper). In any case, whatever the kind of data we use, the general assumption is that any individual piece of information has a primary key that distinguishes it from every other piece.
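With Spring Data, for example, the whole data access layer can reduce to an interface declaration (a sketch; depending on the version, the JPA annotations come from javax or jakarta):

    import jakarta.persistence.Entity
    import jakarta.persistence.GeneratedValue
    import jakarta.persistence.Id
    import org.springframework.data.repository.CrudRepository

    @Entity
    class Customer(
        @Id @GeneratedValue
        var id: Long? = null,      // the primary key every record must have
        var name: String = ""
    )

    // Nothing to implement: the framework generates the CRUD operations
    // (save, findById, findAll, deleteById, ...) from this declaration.
    interface CustomerRepository : CrudRepository<Customer, Long>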

The data access layer only contacts the DBMS; it does not store anything on its own. It interacts with the data tier. The data tier is based on a single DBMS or, sometimes, on a bunch of different DBMSs which can manipulate various kinds of information. We can have more than one because sometimes an application needs to represent very heterogeneous kinds of information, some of which fit much better some kind of NoSQL database.

We can also have files in the data tier. This can be a problem, because it prevents the server from being instantiated on different machines: files are typically local to one file system.



Here we have a more realistic idea of what a web system might really be in the world. Everything starts with the end user, who opens their browser and types something, like a URL, in the address bar. The journey of that URL starts earlier than we might expect: the name of the host must be resolved into an IP address. When we type the URL, the browser first goes to the DNS server and asks for the site; the DNS gives back an IP address, and then the browser starts querying that IP address over a TCP connection, trying to send the request itself. The IP address is usually that of a load balancer, a machine that spreads requests somewhere.

The load balancer can be totally dumb, choosing a random machine connected to its backend, or it can perform some kind of matching on the incoming request, seeing whether the user is asking for resources available in another place.

Usually the load balancer chooses one machine at random and hands the TCP connection over to that machine. One server will react to it and start processing (fetch the headers, understand the URL, the body, etc., and the application logic will be triggered). The application logic, in response to the incoming data, may operate on a DBMS to fetch or store data, or may sometimes contact a caching service, to avoid repeating heavy operations and to have ready-made answers to the most requested information.

Possibly the retrieved information will be enough to generate a response; sometimes it's not. For example, it is possible that the request contains the information necessary to trigger a process by inserting it in a job queue, so that the server processes it later on.

Sometimes other kinds of services necessary to the web request come into play (if we make a purchase, Amazon contacts the bank to take the money: the Amazon web application contacts the bank's web service, waits for the bank to validate the purchase, and then goes on doing further things).

Tracking everything that is happening, and being able to get an overall idea of how the business is going, requires not only looking at the single item but collecting all the requests. For that reason we need a data firehose, used to collect the data. Sometimes the information can instead be fed to a cloud storage where it is made available via a CDN (content delivery network), which sits near the end user: a dumb server that just passes information to the user. Instead of computing the information every time, we have dumb servers that simply deliver it.



Typically, in order to manage such a large ecosystem, we don't want to write all the software by hand; we need to leverage existing components. Some of those components are very high level and are huge systems in themselves: a DBMS or a message broker (which sends pieces of information from somewhere to somewhere else) is very large and complicated.

A component may exist as a stand-alone process or be provided as a library to be embedded into another component. This means that a web application is not a single process; it is a family of processes that can grow quite fast. We might have hundreds of processes that we have to keep alive and rebalanced so that they act as needed. For this reason, we need an orchestrator.

We can try to understand some major models. The easiest one is the monolithic system, where we have one application server and one single database. It already has its own complications, but conceptually it is pretty easy. Having everything in a single element is very fragile: if the server stops, the application is offline. For this reason it is the least reliable model.

Something which looks more reliable is based on the idea of having several (two or three) application servers and one single database. Those application servers are preceded by a load balancer. This model is better and scales much more: we can increase the number of servers even dynamically. Of course, this requires a stateless design of the application, since we need several instances cooperating together.

Of course, we can also have a solution based on multiple application servers and multiple databases. It can be several smaller application servers, each with its own database, coordinated together by an API gateway, or an even more complex arrangement. This is the most reliable, and the most complex, model.



CAP theorem: when we have two copies of the same data, it is difficult to guarantee the ACID properties that a single database has, so we move from a fully transactional, isolated approach to accepting BASE properties.

A typical large system will probably be based on an approach like this. While the client tier consists of several different technologies (React, Angular, etc.), those client tiers will contact a common API gateway. It is just a front-facing component which offers the set of URLs implementing the operations our application needs. The API gateway by itself just implements the cross-cutting concerns, that is: security checks (is the request allowed to proceed or not, who is the user issuing it and what kind of permissions do they have), routing (to which specific subsystem in the backend does this request need to be redirected), logging (we need to track what comes in, so that if anything wrong happens we are able to reconstruct the sequence of operations), rate limiting (we need to be responsive, but we don't want one single client to consume all the power, so we inspect incoming connections and prevent any single connection from taking more bits per second than we want to allow), and aggregation (when we need to produce a larger set of information which comes from different sources).

The API gateway simply redirects each incoming request to one of the subsystems where the information is stored. Here we have three different subsystems: one manages the set of users enrolled in a course, so we have a student microservice where users are managed (profile, scores); then we have the course microservice, the platform that delivers the courses (video lectures, logic for test evaluation); and also a payment microservice in charge of billing the users' credit cards.



We go from one monolithic application to a group of processes, each one having some piece of
information. We need to face the problem of distributing the state.

A state is the condition or stage of the being of something. A user can be logged into a system or logged out of it: that is a piece of state. We can have some items in a shopping cart, and remove one or add another, and it has a state; at some point we can pay, and then the shopping cart is empty and the money is taken from the wallet.

Virtually everything has a state, and we use states to perform actions and take decisions. As long as we have one single process, state is easy: we store it in variables. Whenever we need to make a decision, we have an if instruction, and if the condition is true we perform some actions, otherwise other actions.

But when the application and the set of information it knows are not stored in a single process, we can no longer test many pieces of state in a single expression at the same time, and we have a problem. The problem is that state is something which evolves, and it evolves as a consequence of events.

We conceive events as an abstraction of something that happens in no time: there is a before and an after, and something has changed. So we imagine that events magically turn a state into a different value, atomically.

In a distributed system the state is a problem, because the information is split among different parts. The information may be replicated, and propagation takes time. State evolution is no longer atomic: it is possible to observe something which has already changed somewhere but not somewhere else, because it takes time. So, when we need to make a decision, we look at the local copy, and it may no longer be true, already invalidated, but we don't know unless we wait for an infinite time. We either need to cope with inconsistency (we accept that we can make the wrong choice and will need to correct it later, undoing something) or we need a way to grab a global idea of what is happening (locking can be a solution, but it limits scalability: if for whatever decision we take we need everybody else to wait, we can have problems). The larger the distribution, the less locking is possible: we look for lock-less algorithms, or we want to find a way to have a state which does not suffer so much from the ambiguity that may arise.

So, as our system goes from being stand-alone to multiple instances, whether those instances are equal to or different from each other, we start facing three problems that characterize everything.

The first one is the problem of consistency: can we guarantee strong consistency? Do all nodes know exactly the same things at every moment? Never.

If a system were strongly consistent, we could propagate the state information among all the nodes, so that independently of which server we hit, we always get the same answer. Instead, we need to deal with weak consistency.

Staleness/freshness: when I query a single node for a piece of information, I would like to know whether the information I'm getting is the freshest possible, that is, whether no other node has more recent information.



The final problem is distributed transactions: how do we implement them? If I need to guarantee the typical properties, atomicity, durability, isolation, and consistency, across different systems, how can I do it? On Black Friday on Amazon there are few pieces of a product, so only a few buyers can take them. Two things need to happen for the purchase to be successful: the item must still be available, and it is fetched from the warehouse where things are kept; and we need to have money, and only the bank knows how much money we have. The two pieces of information live in different places.

If you want to buy, you need to perform a local transaction to grab the item, and a transaction with the bank to get the money. But they are independent of each other: one may fail and the other succeed. Typically, most people who buy do have money, so the bigger risk is that the item runs out. Let's suppose I first grab the item, then go to the bank and try to perform the payment, and the credit card is expired. The purchase is not possible, so what should I do with the item I took? That transaction is already committed. I need to undo the transaction, not roll it back, because it already finished: I need to create a secondary, compensating transaction to undo the first one.

Meanwhile someone else may have been rejected because there was no item left; they will try again and now find it. We need a way to guarantee that everything becomes consistent again, even if something breaks during the operation, as in the sketch below.
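A sketch of this compensation logic, with hypothetical warehouse and bank services, each committing its own local transaction:

    interface Warehouse {
        fun reserveItem(itemId: Long): Boolean   // local transaction 1
        fun releaseItem(itemId: Long)            // compensating transaction
    }

    interface Bank {
        fun charge(cardId: String, amount: Int): Boolean   // local transaction 2
    }

    // There is no global rollback across the two systems: a failure in the
    // second step is repaired by explicitly undoing the first.
    fun purchase(wh: Warehouse, bank: Bank,
                 itemId: Long, cardId: String, price: Int): Boolean {
        if (!wh.reserveItem(itemId)) return false     // sold out
        return if (bank.charge(cardId, price)) {
            true                                      // both steps succeeded
        } else {
            wh.releaseItem(itemId)                    // undo the committed reservation
            false
        }
    }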

This has to do with the correctness of our application. We need to create applications that are both performant and correct.

When we create a typical monolithic application, to optimize our program we typically choose to use indexes, or we change the algorithm to make it faster, or we improve it in some other way. The typical heuristics that apply when writing stand-alone applications do not work at all in a distributed setting.

That is because the time scales involved in a distributed application are so different that everything changes. The problem is that for us humans it is very difficult to appreciate that.

What is fast or slow for a CPU depends on its clock.

When we have computer to computer communication, everything is slower.



Typically we tend to design our systems based on a synchronous pattern. The most common operation we perform is function invocation: we call something and we remain blocked until the function returns. This is good when the function we call is actively computing something. But when we contact a DBMS, we send the SQL statement and the thread remains blocked waiting for the answer, while the computation does not happen in our process, on our CPU, but probably on another machine.

Having blocking functions is convenient, because it makes the code easy to read and write. When we write a file, we invoke the write system call that performs the actual writing: we fill the buffer, and it must not change until the controller fetches it and properly stores it. Here we remain blocked and do nothing. But this is convenient, because we know that when the call returns, the operation has succeeded or failed.

Conversely, if we perform the operation asynchronously, we just send a message and keep doing other things. While we do them, the operation may not even have started, so what we perform is subject to a hypothesis. To know that the operation has really finished we need a callback, which is invoked by another thread. This makes programming much harder, but potentially much more effective: we waste far fewer resources.
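A sketch with the JDK's CompletableFuture: the calling thread sends the work off and registers a callback instead of blocking (the sleep stands in for network and DBMS time):

    import java.util.concurrent.CompletableFuture

    fun main() {
        // Simulates a slow remote call executed on another thread.
        val pending: CompletableFuture<String> = CompletableFuture.supplyAsync {
            Thread.sleep(100)
            "query result"
        }

        // Register what to do when the value is ready; some pool thread
        // will run this callback, not necessarily the current one.
        pending.thenAccept { result -> println("got: $result") }

        println("free to process other requests meanwhile")
        pending.join()   // only so this demo doesn't exit before the callback
    }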

There are several operations which block. Whenever we perform I/O (read or write) we block waiting for the data to be ready. When we perform networking or synchronization (grabbing a mutex, waiting on a condition variable) we block. Those blocked periods are CPU cycles that we could use in some more convenient way.

Sometimes we delay on purpose, for example when we call sleep.

However, keeping a thread blocked makes the call easier to write and maintain (an imperative style of coding). Typically we keep the illusion of working in a synchronous world and are concurrent at the same time by using a lot of threads. Those threads start out blocked, waiting for some job to perform. They allocate a lot of memory, all the stacks. Whenever a request comes in, a thread is unblocked, fetches it, and processes it; we don't care which thread does it. Then the thread blocks again because, for example, we need to take something from the DB. All of this works perfectly, just wasting resources, as long as we do not need to scale too much.

If one machine alone is not enough for the required performance, at that point it's better to change the way in which we program.

Typically, the problem shows up in larger systems. We buy machines in integer units (1, 2, 3...); as long as we need to perform anything at all, we need at least one machine. If the load increases, everything works as long as that machine can accommodate the traffic. But there are situations, like Black Friday, when we must face a huge wave of incoming requests, and we cannot design the system to cope with all that traffic when the normal load is much smaller (we would pay a lot). We need a way to make the system elastic: quickly adapting, and, as soon as the big wave has passed, scaling down again.

There are two important metrics to evaluate how well a system behaves: one is latency, the other is throughput.

Latency is the average delay experienced from when a request is made to when the corresponding response is generated. Throughput, instead, is the rate at which we can process new requests: we measure latency in seconds or milliseconds, and throughput in transactions per second.

Note that the two are different. If we go to the post office to send a letter and we find a big queue, from the office's point of view this is the most convenient situation, because all the employees inside are fully busy: whenever an employee becomes free, the queue moves on. Employees are used at their full potential rate, but people are arriving faster than the employees can serve them. This is a good condition for the employees but a bad situation for the people in the queue.

Conversely, if people arrive at the post office spread out along the day, maybe they arrive so spread out that on average there is always a free employee, so there is no queue at all. But from the point of view of the front office this is not good, because there are idle employees doing nothing.

If we need to characterize how a web system behaves, we are interested in both of them.

When we look at our distributed system, we can see time flowing (from the top of the slide to the bottom) and the histories of several different clients. Each client is represented by a vertical line, where blocks represent the portions of time when it is doing something and a thin line the time when it is blocked waiting for something.

So, a client performs a request that reaches one of the server threads, which starts processing it. To process it, the server probably needs to invoke the DBMS, making a query; once the DBMS has answered, the answer is returned and evaluated, then a second query needs to be performed, and so on.

These operations overlap, as different clients make requests. Of course, as the number of clients increases, the threads allocated on the server become fully busy. If the number of clients becomes larger than the number of threads, clients need to wait in a queue to be served. The waiting is due partially to the need to grab a new connection from the single socket the server is listening on, and partially to the need to contact the DBMS, which is managed by its own, separate set of threads, usually much smaller than the number of threads the server has.

From a single client's point of view, we typically want to wait as little as possible. Knowing how much time one individual operation takes is useless, because how long the server really takes to respond depends on the request, on the overall load the server may be experiencing, etc., so we need a statistical approach.

Let's start with one client continuously asking for something, 1000 times in a row. We measure the overall period from the first request to the last answer and divide it by the number of requests: this is the average serving time.

The number of requests divided by the overall time gives me, on average, the number of requests per second that the server is producing for one single client. This is the throughput.

If making 1000 requests takes 1 second, we can say that on average the server serves 1000 requests per second (see the sketch below).
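
A minimal sketch of this single-client measurement, assuming a hypothetical endpoint at http://localhost:8080/api:

    import java.net.URI
    import java.net.http.HttpClient
    import java.net.http.HttpRequest
    import java.net.http.HttpResponse

    fun main() {
        val client = HttpClient.newHttpClient()
        val request = HttpRequest.newBuilder(URI.create("http://localhost:8080/api")).build()

        val n = 1000
        val start = System.nanoTime()
        repeat(n) {
            client.send(request, HttpResponse.BodyHandlers.discarding()) // one blocking request
        }
        val elapsed = (System.nanoTime() - start) / 1e9                  // seconds

        println("average serving time: ${1000 * elapsed / n} ms")        // latency
        println("throughput: ${n / elapsed} requests/second")            // throughput
    }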

Since I am one single client continuously asking the server for information, whether the server is made of one single thread or of many threads is irrelevant: if it's only me, I can assume that all resources are for me.

At most, since while I am making one request I cannot make another one, I can say that one thread of the server is fully dedicated to me. Now, if there are two clients and the server has two threads, it can maintain the same per-client throughput, because another thread of the server can keep continuously serving the other client.

If the system is perfectly scalable, even with 3 clients each of them sees the same throughput, so the overall throughput is N times the individual one.

With 2000 clients this does not work anymore, because the server is not infinitely scalable. So, when we have many clients at the same time, the overall throughput the server can produce is not the single-client throughput multiplied by the number of clients, but a more complex function.

We call such a function C(N): a function which changes as the number of clients increases. If the server were perfectly scalable, C(N) would be exactly N, i.e. the system would grow linearly.

But we have a problem: our server has, say, 150 threads. As long as we have fewer than 150 clients, all clients can be served together; but as soon as we have more than 150 clients, somebody has to wait. This is the first cause, called resource contention.

If the threads need to connect to a database, they need to grab a database connection; but we may have only a few database connections, so threads have to wait. This means that instead of growing linearly, the throughput curve starts bending.

There is also a second-order problem: the system must remain coherent. The operations that a client performs somewhere impact the shared information that we need to keep consistent. This is a physical law: as the number of clients goes to infinity, the throughput tends to 0.

When we say that some operations require locks, it means that we prevent others from entering that area. If everybody wants locks, we get stuck.

The Universal Scalability Law is modeled on queueing theory and says that the overall throughput the system is going to produce can be approximated by the formula in the slide, which in its usual form reads C(N) = γN / (1 + α(N−1) + βN(N−1)).

We say that the overall throughput grows linearly with a γ parameter but is reduced by a denominator containing two other parameters, α and β. α represents resource contention and accounts for a linear term in the denominator that causes the growth to bend at a given point, reducing the slope until the curve approaches a horizontal line.

If β were 0, as the number of clients goes to infinity the throughput would tend to a constant, γ/α with the formula above: the maximum rate of operations we can achieve. β represents the coherency cost, which comes from the fact that sometimes we incur delays, for example from locks, or from the fact that I spread my information around (I send copies) and need to collect it back to make my decisions.

In any case, β is never going to be exactly 0. If the system were perfectly linear, i.e. α and β both 0, the expected throughput would grow linearly, C(N) = γN.

If α > 0 and β = 0, the curve bends and grows more slowly than the ideal linear version. As α increases, the system operates in two regions: in the green one the system grows almost linearly, which is good; when the load increases, the linear contention term takes over, the bending becomes larger, and the curve flattens toward its horizontal asymptote.

If the number of clients becomes too large with respect to the coherency cost, we enter the red zone and the throughput actually goes down.

γ does not change the shape of the curve. We can estimate α and β only empirically. The more I reduce β, the larger the number of clients N at which the peak occurs. Improving α also has an effect, but it is much more important to make β small (a small numeric sketch follows).
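
Written down in Kotlin, the law is just this formula; the parameter values in the example are invented, purely to show the rise-flatten-fall shape:

    // Universal Scalability Law: predicted throughput with N concurrent clients.
    // gamma = ideal per-client rate, alpha = contention cost, beta = coherency cost.
    fun usl(n: Int, gamma: Double, alpha: Double, beta: Double): Double =
        gamma * n / (1 + alpha * (n - 1) + beta * n * (n - 1))

    fun main() {
        // Invented parameters: watch the curve rise, flatten, then fall.
        for (n in listOf(1, 10, 100, 500, 1000, 5000)) {
            println("N=%4d -> %.1f req/s".format(n, usl(n, 100.0, 0.02, 1e-5)))
        }
    }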

So, to measure α, β and γ there are some tools. One of them is Apache Bench: we invoke it telling it to make a given number of requests to a certain URL using a given number of concurrent clients, for example as shown below.
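
For example, 1000 total requests issued by 50 concurrent clients against a hypothetical endpoint:

    ab -n 1000 -c 50 http://localhost:8080/api/ping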

We have a server listening on a port, to which we can add a simple endpoint on GET, and then we run the tool. Once we have collected the measurements, we can use a library like Coda Hale's usl4j, to which we give the measurements and which computes the interesting values (we need at least 6 measurements).

We can write a Kotlin function that takes a list of pairs: the number of concurrent clients and the number of requests per second the system achieved. We build a model from them, and we can then ask the model: for this level of concurrency, what throughput should we expect? Once we have the model, we simply invoke it and get back the estimated throughput.
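
A rough sketch of that Kotlin function, assuming the library in question is Coda Hale's usl4j; the calls below follow its documented usage, but treat the exact names as an assumption (the numbers are invented):

    import com.codahale.usl4j.Measurement
    import com.codahale.usl4j.Model

    fun main() {
        // (concurrent clients, requests/second) pairs, e.g. collected with Apache Bench;
        // the fitter needs at least 6 measurements.
        val points = listOf(
            doubleArrayOf(1.0, 950.0), doubleArrayOf(2.0, 1850.0), doubleArrayOf(4.0, 3600.0),
            doubleArrayOf(8.0, 6200.0), doubleArrayOf(16.0, 9200.0), doubleArrayOf(32.0, 10500.0)
        )
        val model = points.stream()
            .map { Measurement.ofConcurrency().andThroughput(it) }
            .collect(Model.toModel())

        // Ask the model the expected throughput at a given level of concurrency.
        println("expected at 100 clients: ${model.throughputAtConcurrency(100.0)} req/s")
    }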

This will give us an idea of how well the system is behaving.

We can now look at a situation where we have to create a web system to manage users, meaning registering users and logging them in. We want to put into evidence something that may or may not already be evident.

Typically, a system that needs to manage accounts comes with a set of requirements like the ones in the slide.

From the first requirement we can see that "suitable" is not enough: it is ambiguous. If we want unique names, we need to check each name before deciding whether it is acceptable, so we increase β. "Suitable" can also mean having a strong enough password: we need to force users to use complex passwords even if they do not care. Then we need a valid email address.
