10 - Reliable, Maintainable and Scalable
Last week we learned the main characteristics of a graph database, and how to create
and query this type of database.
This week we will look at the different architectures available for designing
reliable, maintainable and scalable data-intensive applications.
So let's start with some important concepts.
Data-intensive applications store data so that they or another application can find it
again later in a database.
They put the result of an expensive operation in a cache to speed up reads.
They allow users to search data by keyword or filter it in various ways.
They can send a message to another process to be handled asynchronously by stream
processing and periodically crunch a large amount of accumulated data through batch
processing.
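To make the caching idea concrete, here is a minimal sketch in Python; the function expensive_report and its simulated delay are hypothetical stand-ins for a slow database query, and the point is only that remembering the result of an expensive operation speeds up later reads.

```python
import functools
import time

# A minimal sketch of the caching idea: remember the result of an expensive
# operation so that later reads are fast. expensive_report and its fake delay
# are hypothetical stand-ins for a slow database query or computation.

@functools.lru_cache(maxsize=1024)
def expensive_report(user_id: int) -> dict:
    time.sleep(0.5)  # simulate the expensive work
    return {"user_id": user_id, "score": user_id * 42}

start = time.perf_counter()
expensive_report(7)   # first call pays the full cost
first = time.perf_counter() - start

start = time.perf_counter()
expensive_report(7)   # second call is served from the in-memory cache
second = time.perf_counter() - start

print(f"first call: {first:.3f}s, cached call: {second:.6f}s")
```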
An example of a data-intensive application is an online game.
Such an application has to manage several thousand concurrent users, and can
scale out at several points as needed.
Traditional database management systems have been utilized for data-intensive
applications.
However, as system requirements and the volume and availability of data increase,
the task is no longer that simple.
Furthermore, there are various approaches to caching, several ways of building search
indexes and so on.
When building an application,
we still need to figure out which tools and which approaches are most appropriate
for the task at hand, and it can be hard to combine tools when you need to do
something that a single tool cannot do alone.
Therefore, it is important to keep in mind some questions that should be asked and
answered when designing data-intensive systems.
For example, how do you ensure that the data remains correct and complete, even
when things go wrong internally?
How do you provide consistently good performance to clients, even when parts of your
system are degraded?
How do you scale to handle an increase in load?
What does a good API for the service look like?
There are many factors that may influence the design of a data system, such as the
skills and experience of the people involved, legacy system dependencies, the
timescale for delivery, your organization's tolerance of different kinds of risk,
regulatory constraints, et cetera.
Those factors depend very much on the situation.
A data-intensive application should be reliable, scalable, and maintainable, even
when the workload increases.
Reliability is an important characteristic of data-intensive applications.
Reliability means that the system continues to work correctly even in the face of
adversity.
The application performs functions expected by the user.
It can tolerate the user making mistakes or using the software in unexpected ways.
Its performance is good enough for the required use case under the expected load and
data volume.
The system prevents any unauthorized access.
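As a rough illustration of two of these reliability ideas, the sketch below, in Python with hypothetical function names, tolerates a user mistake by validating input instead of crashing, and copes with a simulated transient fault by retrying.

```python
import random
import time
from typing import Optional

def parse_age(raw: str) -> Optional[int]:
    """Tolerate a user mistake: return None instead of raising on bad input."""
    try:
        age = int(raw)
    except ValueError:
        return None
    return age if 0 <= age <= 150 else None

def flaky_operation() -> str:
    """Hypothetical stand-in for a call that sometimes fails transiently."""
    if random.random() < 0.5:
        raise ConnectionError("transient fault")
    return "ok"

def with_retries(attempts: int = 3, delay: float = 0.1) -> str:
    """Retry a transient fault a few times before giving up."""
    for attempt in range(1, attempts + 1):
        try:
            return flaky_operation()
        except ConnectionError:
            if attempt == attempts:
                raise
            time.sleep(delay)  # brief pause, then try again

print(parse_age("42"), parse_age("not a number"))  # 42 None
print(with_retries())
```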
In the case of scalability, as a system grows in data volume, traffic volume or
complexity, there should be reasonable ways of dealing with that growth.
Scalability is a system's ability to cope with increasing load.
Discussing scalability means considering questions like, if the system grows in a
particular way, what are our options for coping with the growth?
How can we add computing resources to handle the additional load?
A data-intensive application should perform well as the workload increases.
How we describe that workload depends on the architecture of the system:
for instance, the number of requests per second to a web server, the ratio of reads
to writes in a database, the number of simultaneously active users in a chat room,
the hit rate on a cache, or something else.
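One way to picture these load parameters is as a handful of numbers you measure and watch. The sketch below is a minimal Python illustration with made-up field names and values, not a standard API; each system chooses the parameters that matter for its own architecture.

```python
from dataclasses import dataclass

# A sketch of "load parameters" as plain numbers. The field names and the
# example values are hypothetical.

@dataclass
class LoadParameters:
    requests_per_second: float   # e.g. incoming HTTP requests to a web server
    read_write_ratio: float      # reads divided by writes in the database
    active_chat_users: int       # simultaneously active users in a chat room
    cache_hit_rate: float        # fraction of lookups served from the cache

current_load = LoadParameters(
    requests_per_second=1200.0,
    read_write_ratio=10.0,
    active_chat_users=850,
    cache_hit_rate=0.92,
)

# Discussing scalability means asking: if one of these numbers doubles,
# what are our options for coping with the growth?
print(current_load)
```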
A data-intensive application should be maintainable.
Over time, many different people will work on the system, in engineering and in
operations, both maintaining current behavior and adapting the system to new use
cases, and they should all be able to work on it productively.
There are three design principles of software systems that help them to be
maintainable.
Operability means making it easy for the operations team to keep the system running
smoothly.
Simplicity means making it easy for new engineers to understand the system by
removing as much complexity as possible from the system.
Evolvability means making it easy for engineers to change the system in the future,
adapting it for unanticipated use cases as requirements change.
In the case of operability, a good operations team should be responsible for the
following, and more.
Monitoring the health of the system and quickly restoring service if it goes into a bad
state.
Tracking down the cause of problems such as system failures or degraded
performance.
Keeping software and platforms up to date including security patches.
Keeping tabs on how different systems affect each other, so that a problematic change
can be avoided before it causes damage.
Anticipating future problems and solving them before they occur.
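For the monitoring task above, a minimal sketch might look like the following Python loop; the health-check URL, the polling interval and the logging-only reaction are hypothetical placeholders for a real monitoring and alerting setup.

```python
import logging
import time
import urllib.request

# A minimal sketch of "monitoring the health of the system": poll a health
# endpoint and log when the service looks degraded. The URL, interval and
# reaction are hypothetical placeholders.

logging.basicConfig(level=logging.INFO)
HEALTH_URL = "http://localhost:8080/health"  # assumed endpoint

def check_once() -> bool:
    try:
        with urllib.request.urlopen(HEALTH_URL, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        return False

def monitor(interval_seconds: float = 30.0, max_checks: int = 3) -> None:
    for _ in range(max_checks):
        if check_once():
            logging.info("service healthy")
        else:
            # In a real operations setup this would page someone or trigger
            # an automated restore; here we only log the bad state.
            logging.warning("service unhealthy, investigate or restore")
        time.sleep(interval_seconds)

if __name__ == "__main__":
    monitor(interval_seconds=5.0, max_checks=3)
```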
Software applications should be as simple as possible.
Small software projects can have delightfully simple and expressive code but as
projects get larger, they often become very complex and difficult to understand.
This complexity slows down everyone who needs to work on the system, increasing the
cost of maintenance.
An application must be prepared to evolve, because system requirements change all
the time: you learn new facts, previously unanticipated use cases emerge, business
priorities change, users request new features, new platforms replace old platforms,
legal or regulatory requirements change, growth of the system forces architectural
changes, et cetera.
Well, we have learned some characteristics that a data-intensive application should
have.
However, there are different types of information systems.
We will review how to implement them in the next session. I hope you enjoyed it.