01-Migrating Enterprise Databases To The Cloud
Migration
Migrating Enterprise Databases
to the Cloud
In every course, it's a good idea to start with some theory about what the course is
about and why it's important. So let's talk briefly about why customers want to move
to the cloud and discuss some real-world problems you may encounter.
Learning objectives
● Get a high-level solution overview of use
cases, customers, and competitors.
In this module, you get a high-level overview of why customers want to move their
database workloads to Google Cloud, what it entails, and what our competitors offer
in this space.
You learn how to assess different database architectures and how they affect your
projects.
Lastly, you learn to architect database solutions for high availability, scalability, and
durability.
Agenda
Solution Overview
Let's get started with an overview of database migration problems and some solutions
provided not only by Google Cloud, but by other providers as well.
Discussion: List real-world challenges when
migrating databases to the cloud
Take a minute to consider some real-world challenges you may encounter when you
migrate databases to the cloud.
Data provides the foundation of most applications
● Many dependent applications
● Security concerns
● Potential downtime
Getting the data into the cloud enables easier adoption of cloud services.
An opportunity for the customer to save on hardware, licenses, and administration.
<yellow box>
Databases can be the hardest part of an application to move. Often, many dependent
applications rely on the database for their data. And those applications may be
constantly adding new data to the database. Moving the database can break those
connections.
There are also security concerns. When hackers are attacking your systems, they are
usually after your data. You need to design and architect your system in a way that
protects the database, but allows dependent applications to have access.
For some applications, defining a maintenance window and bringing the database
down for that window of time is acceptable. For mission-critical applications though,
downtime needs to be minimized and sometimes avoided completely. This
complicates your migration project significantly. Automation, database replication, and
extensive testing will be required to ensure that when you “flip the switch” to move
from the old database to the new one, everything will continue to work seamlessly.
<blue box>
Because the database is so important to the operation of its dependent applications,
moving it to the cloud is often a critical first step when migrating applications. After the
database is moved and running, moving other applications is comparatively easy.
Customers often want to take advantage of the many services Google provides as
part of a digital transformation. When you’re running in the cloud, you can more easily
take advantage of Google's advanced machine learning and big data processing
services, for example.
<green box>
Moving the database can also be a big cost-saving opportunity for customers.
Between hardware, maintenance, licensing, and administration, the database tier is
likely the most expensive part of an application. Moving to Google Cloud can help
reduce all of those costs. When their database is in the cloud, customers can work to
optimize their databases and move to even cheaper, completely managed data
services like BigQuery, Firestore, and Spanner.
Customers choose to move databases to the cloud
for many reasons
Cost
Security
Scalability and
high availability
Easier to
Manage
There are many great reasons customers want to move their databases and
applications to Google Cloud. Google has data centers in regions all over the world,
and each region is divided into multiple zones. You can deploy your database to
multiple zones or even multiple regions for greater scalability and fault tolerance.
Google will manage all the hardware for you, which simplifies the management of
your databases. If you use a managed database service like Cloud SQL or Spanner,
Google does practically all the maintenance for you.
Sometimes people think moving to the cloud is less secure. On the contrary, Google's
security is unmatched. If you know what you are doing, moving to Google Cloud can
in fact enhance the security of your applications and data.
And of course, decreasing the total cost of ownership of running your applications is
often a primary driver for moving to Google Cloud.
Customers choose Google Cloud to take advantage
of its advanced capabilities
Machine learning
Hyperscale data centers
Artificial intelligence
Global networking
Kubernetes
Opportunities
Automation
Security
Cost savings
Compliance
Open source
Google Cloud offers advanced capabilities that only the largest organizations could
replicate in their own data centers.
<green>
There is practically unlimited compute power and storage provided by Google's many
hyperscale data centers located all around the globe. The data centers are connected
by Google's fast and reliable global network.
<yellow>
All resources in Google Cloud can be automated for easier management and cost
savings. Google also embraces open-source technologies and is a big contributor to
the open-source community. Google uses a custom Linux distribution for its own
servers. Some very popular open-source technologies like TensorFlow and
Kubernetes originated at Google.
<blue>
Google is a leader in advanced technologies, which is often why customers choose
Google over other cloud providers. Some customers want to add machine learning
capabilities to their applications using TensorFlow or use one of Google's artificial
intelligence APIs. Many customers want to simplify their data center management
using Kubernetes.
<red>
Enhanced security is also an important factor when customers move to the cloud.
Because Google Cloud is already certified by many government and industry
compliance standards, running on Google Cloud can make compliance easier for
many customers.
Competitors also offer compelling services
Other cloud providers, competing with Google, also offer compelling platforms and
services.
<AWS>
Amazon Web Services is the largest cloud provider in the world. They offer a
managed database solution called RDS. RDS supports SQL Server, Oracle, MySQL,
and other databases. Amazon also has a customized version of MySQL called Aurora
that is optimized to run in their cloud. RDS has many advanced features like
automated backups and administration and automated replication for high availability.
<Azure>
For customers who rely heavily on Windows, Microsoft Azure, as you would expect,
provides very strong integration with Microsoft products and tools. Azure SQL
Database provides SQL Server as a service. There is also a cloud-based Azure
Active Directory service. Azure also provides support for open-source technologies.
You can run Linux or Windows virtual machines and many other databases like
MySQL and MongoDB. Azure also supports Kubernetes through Azure Kubernetes
Service (AKS).
<Oracle>
Oracle also provides their own cloud for those customers who rely on Oracle
databases. Oracle Cloud automates the provisioning of Oracle databases. It differs
from the managed service provided by AWS RDS in that it supports all Oracle
features and all Oracle versions, which RDS does not.
Leverage Google Cloud services on a path
to digital transformation
In this section, you learn about the different database architectures you will encounter.
In one sense, this is a history lesson on how database architectures have evolved.
However, in large organizations, you might find a mix of database architectures,
depending on who created the databases and what they were designed for.
Client-server databases have been a standard since
the 1980s
● Highly normalized.
Client-server databases are fast and secure and are the preferred architecture for
many DBAs.
Three-tier architectures pulled the business logic
from the database into the application tier
● Developers advocate that business
logic is the domain of the code.
● The database becomes just the storage
platform:
○ Fewer stored procedures
○ Performance suffers
Client → Application Server → Database Server
The database becomes more of a storage tier; the middle tier handles the complexity.
In a 3-tier architecture, there are fewer, if any, stored procedures. While this makes
programmers happy, database performance overall can suffer, especially when
accessing multiple tables within a transaction.
In this architecture, clients don't connect directly to the database; instead, they
connect through an intermediary application server.
Service-oriented architectures expose the
application logic over the internet
● Database details are encapsulated
by a service.
As the internet has grown, and more and more different types of clients have
appeared, there has been a trend toward service-oriented architectures. Databases
sit behind a firewall, and their functionality is encapsulated, or hidden, behind a
service that is made available over the network.
Clients don’t need to know anything about the details of the database. Clients only
connect to the service via HTTP and pass the data back and forth using a text-based
format like XML or JSON.
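To make this concrete, here is a small Python sketch of a service facade. It is purely illustrative (the `customers` table, function names, and data are hypothetical, not part of any real API): the client only sends and receives JSON, and the database schema stays hidden behind the service.

```python
import json
import sqlite3

# A minimal service facade: clients exchange only JSON text across the
# service boundary; table names, columns, and SQL stay hidden inside.
def handle_get_customer(request_json: str, conn: sqlite3.Connection) -> str:
    request = json.loads(request_json)
    row = conn.execute(
        "SELECT id, name FROM customers WHERE id = ?", (request["id"],)
    ).fetchone()
    if row is None:
        return json.dumps({"error": "not found"})
    # The client never learns anything about the database details.
    return json.dumps({"id": row[0], "name": row[1]})

# In-memory database standing in for the real backend.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

response = handle_get_customer(json.dumps({"id": 1}), conn)
```

In a real service-oriented system, the request and response would travel over HTTP, but the key point is the same: the JSON contract is all the client ever sees.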
Of course in the real world, you won't find a company that uses one architecture or
the other. You'll find a mix of all of these architectures depending on who designed
and programmed the systems, and when.
In older applications, you might find a mix of architectures within a single database,
depending on when new features and data were added over time.
This can lead to confusion when you are analyzing applications and databases you
want to migrate.
For example, in a 3-tier architecture, the application server is a dependency for the
client. At the same time, the application server is a dependent of the database.
Understanding the target database architecture is
important when planning a database migration
Most difficult (client-server) ←→ Least difficult (service-oriented)
Understanding the architecture of the database you are trying to move is important
when planning a migration project.
Generally, client-server databases are harder to move because there is more logic
provided by the stored procedures that needs to be verified and tested. Any error
would probably break the clients, and more dependent applications tend to be
connected directly to the database.
3-tier or N-tier applications have most of the business logic in the application code, so
there is less testing required in the database tier. Also, there tend to be fewer
dependents, because the clients connect to the application server, which in turn
connects to the database.
Service-oriented applications can be the easiest to move to the cloud because the
database and all its details are hidden behind the service layer. The service can be
used to synchronize the source and target databases during the migration.
Eventually, the new database takes over and the old one can be turned off. There
is no need to worry about the clients because they are simply passing text between
themselves and the service.
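The synchronization idea above can be sketched as a dual-write pattern in the service layer. This is a toy Python model, not a prescribed migration tool; all names are hypothetical.

```python
# Dual-write sketch: during the migration, the service writes to both the
# old and new databases so they stay in sync until cutover.
class MigratingService:
    def __init__(self):
        self.old_db = {}
        self.new_db = {}
        self.cut_over = False

    def write(self, key, value):
        self.new_db[key] = value
        if not self.cut_over:
            self.old_db[key] = value   # keep the legacy database current

    def read(self, key):
        # Before cutover, reads come from the old database; after, the new one.
        source = self.new_db if self.cut_over else self.old_db
        return source[key]

svc = MigratingService()
svc.write("invoice:7", "paid")
svc.cut_over = True                    # "flip the switch"
after = svc.read("invoice:7")
```

Because clients only talk to the service, flipping `cut_over` is invisible to them.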
Agenda
Solution Overview
After you move your databases and applications, there is an opportunity for optimizing
them to run in the cloud. Let's briefly talk about how you could do that.
Microservice architectures are the trend when
developing cloud-native applications
● Large applications are divided into
multiple smaller independent ones.
Microservice architectures have become popular as more and more applications have
been migrated to the cloud.
With microservices, large applications are divided into smaller, simpler, independent
services. Each microservice should be responsible for its own data. This is a big
change. You want to design your services to minimize the dependencies between
them. Microservices should be loosely coupled. That means one service can use
another without understanding its details. Data passed between services is simply
text in JSON or XML format.

Because each microservice should be responsible for its own data, when applications
are optimized for the cloud, databases also have to be split into multiple, smaller
pieces.
Some developers make the mistake of having one main database service provide the
data to all the microservices in an application. This complicates your deployment
without providing all the benefits of a microservice architecture.
There are many benefits of microservices. One benefit is the ability to choose which
type of database is most appropriate for each individual service. For some services, a
relational database is required. But for others, using a NoSQL database or a data
warehouse can save money and make programming easier.
Choose the right storage type for each microservice
With monolithic apps, the database you choose has to be one that works for the entire
application. That means you probably need a relational database. Relational
databases are great, and they work for nearly any application. However, relational
databases tend to be more expensive and require more administration than other,
simpler types of databases. With microservices, you can keep the relational database
when the service requires it, but you can use NoSQL databases or data warehousing
solutions when they are a better fit for the microservice use case.
<relational>
Use relational databases for online transaction processing use cases when strong
schemas and strong consistency are required.
<NoSQL>
NoSQL databases can be simpler and less expensive than relational systems. Use a
NoSQL database when you prefer or can tolerate a schemaless database or when a
higher degree of flexibility is needed.
<Data Warehouse>
If a microservice is for big data, analytics, or object storage, consider a data
warehousing solution like BigQuery or Cloud Storage. These solutions are extremely
inexpensive and support unlimited amounts of data.
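The guidance above can be summarized as a small decision sketch. This is not an official Google Cloud tool, just an illustrative Python function mapping a microservice's requirements onto a storage category.

```python
# Hedged decision sketch: pick a storage category for a microservice
# based on its data requirements, mirroring the guidance above.
def choose_storage(needs_acid: bool, schemaless_ok: bool, analytics: bool) -> str:
    if analytics:
        # Big data, analytics, or object storage workloads.
        return "data warehouse (e.g. BigQuery or Cloud Storage)"
    if needs_acid and not schemaless_ok:
        # Strong schemas and strong consistency required.
        return "relational (e.g. Cloud SQL or Spanner)"
    if schemaless_ok:
        # Flexible or schemaless data is a good NoSQL fit.
        return "NoSQL (e.g. Firestore or Bigtable)"
    return "relational (e.g. Cloud SQL or Spanner)"

print(choose_storage(needs_acid=True, schemaless_ok=False, analytics=False))
```

A real decision would also weigh latency, cost, and team expertise, but the three axes above capture the core trade-off described in this section.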
Database-as-a-service products reduce
administrative overhead and cost
Cloud SQL Cloud Spanner Firestore Memorystore Cloud Bigtable BigQuery Cloud Storage
Google Cloud provides a complete collection of database and data storage solutions.
There are two managed database services: Cloud SQL and Spanner. Cloud SQL
allows you to run MySQL, PostgreSQL, and SQL Server databases. Spanner is a
Google-created relational database service that is massively scalable and can easily
be deployed across multiple regions for extremely high availability and low latency
around the world.
For data warehousing and analytics using SQL, use BigQuery. Use Cloud Storage for
cheap object storage.
Agenda
Solution Overview
A system that is scalable is one that continues to work even as the number of users
and the amount of data grow. Highly available systems are fault-tolerant and continue
to work even when there is a failure of some of the nodes. Let's talk about how you
can architect your databases to achieve scalability and high availability.
High availability is achieved by deploying
infrastructure across more than one zone in a region
● In the cloud, a zone is considered a fault boundary:
○ Applications deployed to multiple zones continue to work even if a zone fails.
● Route traffic to application instances through a load balancer:
○ Load balancers are regional (or sometimes global) resources.
<diagram: Project > Region > Zone A, Zone B, Zone C, each containing a server>
High availability is achieved by deploying infrastructure across more than one zone in
a region. In Google Cloud, each region is divided into three zones. Think of a zone as
a fault boundary. No single thing that can fail would cause resources deployed in
different zones to be unavailable.
For your applications to be highly available, your databases must be too. Failover
replicas are used to ensure database reliability. Deploy two databases in different
zones. One is the master and one is the failover. Requests go to the master. All data
is synchronized with the failover. If the master ever goes down, the failover takes
requests until the master is back up and running.
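The master/failover behavior can be modeled with a toy Python sketch. Everything here is illustrative (zone names, class names); a managed service like Cloud SQL handles this for you.

```python
# Toy failover sketch: a primary database in one zone, a synchronized
# standby in another. A zone failure redirects traffic to the standby.
class Database:
    def __init__(self, zone):
        self.zone = zone
        self.up = True
        self.data = {}

class FailoverPair:
    def __init__(self, primary, standby):
        self.primary, self.standby = primary, standby

    def write(self, key, value):
        target = self.primary if self.primary.up else self.standby
        target.data[key] = value
        # Synchronous replication: keep the standby's copy identical.
        if target is self.primary and self.standby.up:
            self.standby.data[key] = value

    def read(self, key):
        target = self.primary if self.primary.up else self.standby
        return target.data[key]

pair = FailoverPair(Database("us-central1-a"), Database("us-central1-b"))
pair.write("order:1", "shipped")
pair.primary.up = False          # simulate a zone failure
value = pair.read("order:1")     # served by the standby
```

Because the standby already holds a synchronized copy, clients see no data loss when the primary's zone goes down.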
Scalable databases continue to work as the number
of users and amount of data grow very large
Scalable databases continue to work as the number of users and amount of data
grow very large. To handle a large volume of writes, you split the database into pieces
called shards. Then use multiple nodes, or servers, to process different shards.
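A common way to assign keys to shards is by hashing, sketched below in Python. The node layout and key names are hypothetical; real systems add rebalancing, replication, and routing layers on top.

```python
import hashlib

# Hash-based sharding sketch: route each key to one of N shards so that
# writes spread evenly across the nodes that own those shards.
def shard_for(key: str, num_shards: int) -> int:
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % num_shards

# Three nodes, each acting as the owner of one shard for simplicity.
nodes = {i: {} for i in range(3)}

def write(key: str, value: str) -> None:
    nodes[shard_for(key, len(nodes))][key] = value

write("user:1001", "alice")
write("user:1002", "bob")
```

The same `shard_for` function is used on every write and read, so a given key always lands on the same node.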
You can process high volumes of reads by creating multiple copies of the database
called replicas. The read replicas handle analytics and reporting use cases, while the
master is left to handle the writes. The master ensures that the data is synchronized
with the read replicas when changes are made.
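The read/write split described here can be sketched as follows. This is a simplified in-memory model (not a database driver): writes go to the primary and are pushed to every replica, while reads rotate across replicas.

```python
import itertools

# Read/write splitting sketch: the master takes all writes and
# synchronizes them to replicas; reads round-robin across replicas.
class ReplicatedStore:
    def __init__(self, num_replicas: int):
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]
        self._cycle = itertools.cycle(range(num_replicas))

    def write(self, key, value):
        self.primary[key] = value
        for replica in self.replicas:       # synchronize each replica
            replica[key] = value

    def read(self, key):
        # Rotating across replicas keeps read load off the primary.
        return self.replicas[next(self._cycle)][key]

store = ReplicatedStore(num_replicas=2)
store.write("report:today", 42)
result = store.read("report:today")
```

In practice a connection pool or proxy performs this routing, but the division of labor is the same: writes to the master, reads to the replicas.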
For global applications, create read replicas in
multiple regions
● One or more read replicas can be added for very large applications:
○ The master is responsible for the data.
○ It synchronizes the data with the replicas.
<diagram: Project > Region 1 > Zone A (master DB server), Zone B (failover DB server)>
Customers with users all over the world can create read replicas in multiple regions.
Requests from users are routed to the region geographically closest to them. This can
be automated using Google's global load balancers.
A master will still handle the writes and synchronize those changes to the replicas.
When synchronizing across regions, you just have to be aware of the increased
latency around the world. If you are using asynchronous replication across regions,
the replicas can have stale data for a short period of time. This is known as eventual
consistency, as opposed to immediate consistency.
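Eventual consistency is easy to see in a toy model. The sketch below (all names hypothetical) queues cross-region changes instead of applying them immediately, so the replica briefly serves stale data.

```python
from collections import deque

# Eventual-consistency sketch: cross-region replication is asynchronous,
# so the remote replica applies each change some time after the write.
class AsyncReplica:
    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = deque()

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))   # queued, not yet applied

    def replicate_one(self):
        # Simulates the replication stream delivering one change.
        if self._pending:
            key, value = self._pending.popleft()
            self.replica[key] = value

db = AsyncReplica()
db.write("price", 10)
stale = db.replica.get("price")     # replica hasn't caught up yet
db.replicate_one()
fresh = db.replica["price"]         # now eventually consistent
```

The window between the write and `replicate_one` is the replication lag: reads in the remote region during that window see the old value.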
Distributed databases use clusters of nodes and
multiple shards to process high-volume data
● When writes happen so quickly that a single server can't keep up, split the data
into pieces (shards):
○ Clusters of servers consist of nodes.
○ Each node is responsible for some of the shards.
● For high availability or low latency, replicate clusters across multiple zones
or regions.
<diagram: Project > Region > Zone A containing a cluster of three nodes>
Quiz
What are some things you should do when migrating an enterprise database to the
cloud?
A. Make an inventory of the client applications that use the database.
B. Make sure you have a dependable backup and restore process in place.
C. Use unit and integration testing for all stored procedures and data migration code.
D. All of the above
The answer is D: All of the above. Inventorying client applications, ensuring that you
have a dependable backup and restore process, and implementing thorough,
automated testing are all good ideas to ensure a smooth and successful database
migration.
Quiz
You have a microservice responsible for managing customer accounts and orders.
Strong consistency, ACID transactions, and strong schemas are all important. What
type of database would you use?
A. Relational
B. NoSQL
C. Document
D. Object
Quiz
You have a microservice responsible for managing customer accounts and orders.
Strong consistency, ACID transactions, and strong schemas are all important. What
type of database would you use?
A. Relational
B. NoSQL
C. Document
D. Object
The answer is A: Relational. When strong schemas, strong consistency, and ACID
transactions are required, a relational database is the best fit.
Quiz
You’re deploying an application for high availability and have created web servers in
multiple zones in a region. What should you do for the database to also be highly
available?
The answer is B: Create a failover replica in another zone. In the cloud, a zone is
considered a fault boundary. Resources in different zones share no common points of
failure. So, having a redundant server in another zone would be adequate for high
availability.