PostgreSQL for Data Architects
Jayadevan Maymala
BIRMINGHAM - MUMBAI
PostgreSQL for Data Architects
All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, without the prior written
permission of the publisher, except in the case of brief quotations embedded in
critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy
of the information presented. However, the information contained in this book is
sold without warranty, either express or implied. Neither the author, nor Packt
Publishing, and its dealers and distributors will be held liable for any damages
caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
ISBN 978-1-78328-860-1
www.packtpub.com
Reviewers
Pascal Charest
Bahman Movaqar
Ângelo Marcos Rigo
Hans-Jürgen Schönig
Stéphane Wirtel
Commissioning Editor
Anthony Albuquerque
Acquisition Editor
Sonali Vernekar
Copy Editor
Relin Hedly
Proofreaders
Maria Gould
Clyde Jenkins
Chris Smith
Jonathan Todd
Indexer
Hemangini Bari
Graphics
Sheetal Aute
Abhinash Sahu
About the Author
When he is not working on open source technologies, he spends time reading and
updating himself on economic and political issues.
I'd like to thank my lovely wife, Nahid, who has taught me how to
be strong.
Ângelo Marcos Rigo has worked in web development since 1998, focusing on
content management systems. For the past 7 years, he has been
managing, customizing, and developing extensions for Moodle LMS. He can be
reached at his website http://www.u4w.com.br/novosite/index.php for CMS or
Moodle LMS consulting. He has reviewed Moodle Security, Packt Publishing.
Did you know that Packt offers eBook versions of every book published, with PDF
and ePub files available? You can upgrade to the eBook version at
www.PacktPub.com and, as a print book customer, you are entitled to a discount
on the eBook copy. Get in touch with us at [email protected] for more details.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital
book library. Here, you can search, access, and read Packt's entire library of books.
Why subscribe?
• Fully searchable across every book published by Packt
• Copy and paste, print, and bookmark content
• On demand and accessible via a web browser
Table of Contents
Preface
PostgreSQL is an incredibly flexible and dependable open source relational database.
Harnessing its power will make your applications more reliable and extensible
without increasing costs. Using PostgreSQL's advanced features will save you work
and increase performance, once you've discovered how to set it up.
PostgreSQL for Data Architects will teach you everything you need to learn in order to
get a scalable and optimized PostgreSQL server up and running.
The book starts with basic concepts (such as installing PostgreSQL from source) and
covers theoretical aspects (such as concurrency and transaction management). After
this, you'll learn how to set up replication, use load balancing to scale horizontally,
and troubleshoot errors.
As you continue through this book, you will see the significant impact of
configuration parameters on performance, scalability, and transaction management.
Finally, you will get acquainted with useful tools available in the PostgreSQL
ecosystem that are used to analyze PostgreSQL logs, set up load balancing, and
perform recovery.
Chapter 2, Server Architecture, covers the important processes started when we start a
PostgreSQL cluster and how they work along with the memory structures to provide
the functionality expected from a database management system.
Chapter 3, PostgreSQL – Object Hierarchy and Roles, explains various object types and
objects provided by PostgreSQL. Important concepts such as databases, clusters,
tablespaces, and schemas are covered in this chapter.
Chapter 5, Data Modeling with SQL Power Architect, talks about how we can model
tables and relationships with SQL Power Architect. Some of the aspects that should
be considered when we choose a design tool are also covered in this chapter.
Chapter 6, Client Tools, covers two client tools (pgAdmin, a UI tool, and psql, a
command-line tool). Browsing database objects, generating queries, and generating
the execution plan for queries using pgAdmin are covered. Setting up the
environment variables for connecting from psql, viewing the history of SQL commands
executed, and meta-commands are also covered in this chapter.
Chapter 7, SQL Tuning, explains query optimization techniques. To set the context,
some patterns about database use and theory on how the PostgreSQL optimizer
works are covered.
Chapter 8, Server Tuning, covers PostgreSQL server settings that have a significant
impact on query performance. These include memory settings, cost settings,
and so on. Two object types, partitions and materialized views, are also
covered in this chapter.
Chapter 9, Tools to Move Data in and out of PostgreSQL, covers common tools/utilities,
such as pg_dump, pg_bulkload, and copy used to move data in and out
of PostgreSQL.
Chapter 10, Scaling, Replication, and Backup and Recovery, covers methods that
are commonly used to achieve scalability and availability. A step-by-step method
to achieve horizontal scalability using PostgreSQL's streaming replication and
pgpool-II is also presented. Point-in-time recovery for PostgreSQL is also covered
in this chapter.
Chapter 12, PostgreSQL – Extras, covers quite a few topics. Some interesting data types
that every data architect should be aware of, a couple of really useful extensions, and
a tool to analyze PostgreSQL log files are covered. It also covers a few interesting
features available in PostgreSQL 9.4.
Conventions
In this book, you will find a number of text styles that distinguish among different
kinds of information. Here are some examples of these styles and an explanation of
their meaning.
Code words in text, database table names, folder names, filenames, file extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows:
"We will use the following wget command to download the source."
A block of code/SQL at psql prompt as well as the output from the server at psql is
set as follows:
CREATE TABLE emp(id serial, first_name varchar(50));
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about
this book—what you liked or disliked. Reader feedback is important for us as it helps
us develop titles that you will really get the most out of.
If there is a topic that you have expertise in and you are interested in either writing
or contributing to a book, see our author guide at www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to
help you to get the most from your purchase.
Errata
Although we have taken every care to ensure the accuracy of our content, mistakes
do happen. If you find a mistake in one of our books—maybe a mistake in the text or
the code—we would be grateful if you could report this to us. By doing so, you can
save other readers from frustration and help us improve subsequent versions of this
book. If you find any errata, please report them by visiting
http://www.packtpub.com/submit-errata, selecting your book, clicking on the
Errata Submission Form
link, and entering the details of your errata. Once your errata are verified, your
submission will be accepted and the errata will be uploaded to our website or added
to any list of existing errata under the Errata section of that title.
Piracy
Piracy of copyrighted material on the Internet is an ongoing problem across all
media. At Packt, we take the protection of our copyright and licenses very seriously.
If you come across any illegal copies of our works in any form on the Internet, please
provide us with the location address or website name immediately so that we can
pursue a remedy.
We appreciate your help in protecting our authors and our ability to bring you
valuable content.
Questions
If you have a problem with any aspect of this book, you can contact us at
[email protected], and we will do our best to address the problem.
Installing PostgreSQL
This chapter gives you an overview of the process of installing PostgreSQL from
source. The system used for installation and for the examples in the following
sections is a 64-bit CentOS (6.4) machine. Other Unix/Linux systems typically have
similar commands. For those using Windows systems, there is a set of utilities
available at http://sourceforge.net/projects/unxutils/, which makes it
possible to execute most of the Unix commands (find, grep, cut, and so on) in the
Windows environment. The steps to install PostgreSQL on Windows are very different
from those for Unix/Linux systems and are not covered in this chapter.
Installation options
There are many possible ways to install PostgreSQL on a system. For Windows,
downloading the graphical installer and running it is the easiest route. For Linux
systems such as Red Hat Enterprise Linux or CentOS, we could use either the
Yellowdog Updater, Modified (yum) or the RPM Package Manager (rpm, originally
Red Hat Package Manager) commands to install PostgreSQL. For Ubuntu, PostgreSQL
can be installed using the apt-get command, which in turn works with Ubuntu's
Advanced Packaging Tool (APT). While these options work, we do not get to see what
is happening when we execute these commands, except, of course, that the database
gets installed.
Then there are situations where we might want to build from the source.
Assume that all we have is one production server and one development or staging
server. We are on version 9.3. Version 9.4 is about to be released and there are
quite a few interesting features in 9.4 that we want to try out. If we want to install 9.4
in the test server and use it alongside 9.3, without the installations stepping on each
other's toes, compiling from the source with the --prefix= option and specifying
different installation directories is the right approach. We could also set different
default ports. It's also possible that the new version (source) is ready, but the
package for our Linux distribution is not ready yet.
We might use a flavor of Linux for which an installation package is not available at
all. Installation from source is the way forward in these situations. One advantage
with installing from the source is that we don't have to worry too much about
which package to download, the version of operating system (CentOS 6.3 or 6.4?),
architecture (32 bit or 64 bit), and so on. These are more or less irrelevant. Of course,
we should be using an operating system/architecture that is supported by the
database, but that's about it! We also need to download and install all the tools and
utilities necessary to compile and make the software, in this case, PostgreSQL.
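As a rough sketch of the side-by-side scenario described above, each version can
be given its own installation directory and default port at configure time. The
directory names and port 5433 below are illustrative assumptions, not values from
the text; --prefix and --with-pgport are options of PostgreSQL's configure script.
The helper only prints the command lines, so it can be tried without a source tree.

```shell
#!/bin/sh
# Print the configure command line for one PostgreSQL version.
# $1 = installation directory (hypothetical), $2 = default port (hypothetical).
configure_cmd() {
    printf './configure --prefix=%s --with-pgport=%s\n' "$1" "$2"
}

# The existing 9.3 installation keeps the standard port 5432.
configure_cmd /opt/pgsql-9.3 5432
# 9.4 goes into its own directory and listens on a different port by default.
configure_cmd /opt/pgsql-9.4 5433
```

Each printed command would be run from the top of the matching source tree,
followed by the usual build and install steps.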
We can see a number of versions, all the way from version 1, when it was called
Postgres95, up to the latest production and beta versions. If you belong to the
group who believe that one shouldn't try software that is not at least a few months
old, so that its teething issues are resolved, you should opt for the last-but-one
version. Otherwise, it's a good idea to opt for the latest stable version. The
latest versions have added quite a few very useful features, such as materialized
views and an improved set of JSON functions and operators.
Executing this command will give us a window that looks like this:
Chapter 1
As we can see, the tarred and gzipped source code comes to about 21 MB. As an
aside, the installation files for Oracle, the big RDBMS out there, weigh over 2.2 GB.
The tar command is used to create or extract tape archive (tar) files. In the
preceding command, the x option is used to extract, v (for verbose) is used so that
we can see the list of files and folders getting extracted, and the f option passes
the name of the file to be extracted. If the preceding tar command does not work,
we might need to add the z option, so that the command becomes tar -xzvf. Some
versions of tar are intelligent enough to figure out whether it is a gzipped file
or not and will unzip it automatically. The untarred, unzipped files come to
around 115 MB.
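The flags described above can be tried safely on a throwaway archive. The
following is a minimal sketch; the file and directory names are made up for the
demonstration and have nothing to do with the real PostgreSQL tarball.

```shell
#!/bin/sh
# Build a tiny gzipped tarball so the extraction flags can be tried safely.
workdir=$(mktemp -d)
mkdir -p "$workdir/demo-src/doc"
echo "hello" > "$workdir/demo-src/README"
tar -czf "$workdir/demo-src.tar.gz" -C "$workdir" demo-src

# Extract into a separate directory:
#   x = extract, z = filter through gzip, v = list each entry, f = archive name.
mkdir "$workdir/out"
tar -xzvf "$workdir/demo-src.tar.gz" -C "$workdir/out"

# Compare compressed and extracted sizes, as the text does for the real source.
du -sh "$workdir/demo-src.tar.gz" "$workdir/out/demo-src"
```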
The find command searches for files meeting specific criteria. Here, we instructed
find to limit itself to scanning just one level of subdirectories using -maxdepth 1.
We used the -type option along with d to tell find that we need files of type
directory, as shown in the following screenshot:
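The same two options can be tried on a small scratch tree. This is only a sketch:
the directory names below are created for the demonstration and merely mimic the
layout of a source tree.

```shell
#!/bin/sh
# Create a small directory tree (hypothetical names) to search.
top=$(mktemp -d)
mkdir -p "$top/src/backend" "$top/doc" "$top/contrib"
touch "$top/README"

# -maxdepth 1 : do not descend below the immediate children of $top
# -type d     : report directories only, skipping regular files such as README
dirs=$(find "$top" -maxdepth 1 -type d | sort)
echo "$dirs"
```

Note that find also lists the starting directory itself, while the nested
src/backend directory is left out because it lies two levels down.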
• src: This directory has most of the core code, namely, code for the backend
processes, optimizer, storage, client utilities (such as psql) and code to
take care of replication, and so on. It also contains the makefiles for various
distributions. For example, we have the files Makefile.hpux,
Makefile.linux, Makefile.openbsd, and Makefile.sco under src/makefile.
• doc: This directory has the source for documentation written in DocBook,
DocBook being an application of Standard Generalized Markup Language
(SGML). It's possible to generate documentation in an HTML format, PDF
format, and a few other formats.
• contrib: This directory is where many extensions are available. These
are add-on modules that do not form part of the core installation, but can
be installed as needed. For example, those who have to connect to other
PostgreSQL databases can install the Foreign Data Wrapper extension:
postgres_fdw. For those who want to access the contents of a file on the
server from a table, there is the file_fdw extension.
• config: This directory contains a few macros that help you configure and
compile the package.
Now let's move on to the dependencies, configuration options, and the actual
installation itself.
A compiler is also necessary. GNU Compiler Collection (GCC) is one such toolset
that is included in almost all the Unix systems. The gcc -v command will provide
you with the version of gcc as well as options with which it was configured on the
system, as shown in the following screenshot:
The process of building a package from source involves preprocessing the source
(including the header files, expanding macros, and so on), compiling, assembly, and
linking (linking the libraries). The make utility automates the process of building the
executable from source code. The make command uses a makefile, which contains
rules on how to build the executables.
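To make the four stages concrete, they can be run by hand on a one-file C program.
This is a sketch that assumes gcc is installed; the file names are hypothetical,
and the script skips the build if no compiler is found. The make utility runs
steps like these for every source file, according to the rules in the makefile.

```shell
#!/bin/sh
# Walk through the build stages by hand for a one-file C program.
# Assumes gcc is installed; the file names are hypothetical.
build=$(mktemp -d)
cd "$build" || exit 1

cat > hello.c <<'EOF'
#include <stdio.h>
int main(void) { puts("built"); return 0; }
EOF

if command -v gcc >/dev/null 2>&1; then
    gcc -E hello.c -o hello.i   # preprocess: include headers, expand macros
    gcc -S hello.i -o hello.s   # compile: C to assembly
    gcc -c hello.s -o hello.o   # assemble: assembly to object code
    gcc hello.o -o hello        # link: object code plus libraries to executable
    ./hello
else
    echo "gcc not found; skipping the build"
fi
```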
Other than GNU Make and a compiler, there is nothing else that is really necessary
to continue. However, it is better to have at least the following two components:
• readline: The GNU readline library is very useful once we start using
psql, the PostgreSQL command-line client, which is covered later. Having
readline helps us work in a very "bash-like" environment, using Tab
to autocomplete/suggest table names and up and down keys to browse
command history, and so on and so forth. It also helps to have zlib in
place before we proceed with the installation.
• zlib: This compression library can be handy when we are taking backups
(a process definitely to be followed when we have a database).
Adding SQL/XML support will also be useful as sooner or later we will want to
extract data from tables in an XML format or load data from the XML files to tables.
Still, this might not be as useful as the other two, namely, readline and zlib.
We can vi /tmp/config.txt and verify that there are over 80 options that can be
used. These options can be broadly grouped into the following categories:
When we run ./configure, it's likely that we get an output like this:
The output tells us that readline is not available. However, if we list installation
packages, it is very much there. The reason is that readline-devel is missing. It
contains files needed by programs (such as psql) that use the readline library. This
can be installed using the following command:
yum install readline-devel.x86_64
It also installs ncurses-devel. You will have to execute the command as root or
by using sudo. You might also get a similar error for zlib, although zlib itself
is already installed. Again, the corrective action is to install the development
package, in this case, zlib-devel. Once this is done, we can run configure again
and it should go through without any issues, as shown in the following screenshot:
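The missing piece in both cases is the set of header files that the -devel
packages provide. A minimal sketch of a pre-check follows. The function name and
the INCLUDE_DIR variable are hypothetical, and the header paths
readline/readline.h and zlib.h are the conventional locations, assumed here
rather than taken from the text.

```shell
#!/bin/sh
# Report whether a development header exists under the include directory.
# INCLUDE_DIR is a hypothetical override; it defaults to /usr/include.
has_header() {
    [ -f "${INCLUDE_DIR:-/usr/include}/$1" ]
}

for header in readline/readline.h zlib.h; do
    if has_header "$header"; then
        echo "found:   $header"
    else
        echo "missing: $header (install the matching -devel package)"
    fi
done
```

A header reported as missing suggests that configure would stop at the
corresponding library check.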
Two files are now created in the current directory, in addition to a few more files
in subdirectories: config.status and config.log. The config.status file is a bash
script that can be executed to recreate the configuration. The config.log file can
be reviewed to understand the various options used, variables, and errors, if any.
The config.log file may contain a few errors that are marked fatal even though the
compilation process can still be completed without any issue.
The process takes a few minutes to complete, and in the end says PostgreSQL,
contrib, and documentation successfully made. Ready to install, as
shown in the following screenshot: