Commit 66f75e3

Merge branch 'fix_s3_access' into 'master'
custom postgres confings for clones support + fix for dumps on S3 See merge request postgres-ai/terraform-postgres-ai-database-lab!30
2 parents 48bb929 + 05bd036 commit 66f75e3

6 files changed

+54
-194
lines changed

README.md

Lines changed: 4 additions & 179 deletions
Original file line number | Diff line number | Diff line change
@@ -1,192 +1,17 @@
11
[[_TOC_]]
22

3-
# How to setup Database Lab using Terraform in AWS
3+
# Database Lab Terraform Module
44

55
This [Terraform Module](https://www.terraform.io/docs/language/modules/index.html) is responsible for deploying the [Database Lab Engine](https://gitlab.com/postgres-ai/database-lab) to cloud hosting providers.
66

77
Your source PostgreSQL database can be located anywhere, but DLE with other components will be created on an EC2 instance under your AWS account. Currently, only "logical" mode of data retrieval (dump/restore) is supported – the only available method for managed PostgreSQL cloud services such as RDS Postgres, RDS Aurora Postgres, Azure Postgres, or Heroku. "Physical" mode is not yet supported, but it will be in the future. More about various data retrieval options for DLE: https://postgres.ai/docs/how-to-guides/administration/data.
88

9-
## Supported Cloud Platforms:
9+
## Supported Cloud Platforms
1010
- AWS
1111

12-
## Prerequisites
13-
- [AWS Account](https://aws.amazon.com)
14-
- [Terraform Installed](https://learn.hashicorp.com/tutorials/terraform/install-cli) (minimal version: 1.0.0)
15-
- AWS [Route 53](https://aws.amazon.com/route53/) Hosted Zone (For setting up TLS) for a domain or sub-domain you control
16-
- You must have AWS Access Keys and a default region in your Terraform environment (See section on required IAM Permissions)
17-
- The DLE runs on an EC2 instance which can be accessed using a selected set of SSH keys uploaded to EC2. Use the Terraform parameter `aws_keypair` to specify which EC2 Keypair to use
18-
- Required IAM Permissions: to successfully run this Terraform module, the IAM User/Role must have the following permissions:
19-
* Read/Write permissions on EC2
20-
* Read/Write permissions on Route53
21-
* Read/Write permissions on Cloudwatch
12+
## Installation
2213

23-
## Configuration overview
24-
- :construction: Currently, it is supposed that you run `terraform` commands on a Linux machine. MacOS and Windows support is not yet implemented (but planned).
25-
- It is recommended to clone this Git repository and adjust for your needs. Below we provide the detailed step-by-step instructions for quick start (see "Quick start") for a PoC setup
26-
- To configure parameters used by Terraform (and the Database Lab Engine itself), you will need to modify `terraform.tfvars` and create a file with secrets (`secret.tfvars`)
27-
- This Terraform module can be run independently or combined with any other standard Terraform module. You can learn more about using Terraform and the Terraform CLI [here](https://www.terraform.io/docs/cli/commands/index.html)
28-
- The variables can be set in multiple ways with the following precedence order (lowest to highest):
29-
- default values in `variables.tf`
30-
- values defined in `terraform.tfvars`
31-
- values passed on the command line
32-
- All variables starting with `postgres_` represent the source database connection information for the data (from that database) to be fetched by the DLE. That database must be accessible from the instance hosting the DLE (that one created by Terraform)
33-
34-
## How-to guide: using this Terraform module to set up DLE and its components
35-
The following steps were tested on Ubuntu 20.04 but supposed to be valid for other Linux distributions without significant modification.
36-
37-
1. SSH to any machine with internet access, it will be used as deployment machine
38-
1. Install Terraform https://learn.hashicorp.com/tutorials/terraform/install-cli. Example for Ubuntu:
39-
```shell
40-
sudo apt-get update && sudo apt-get install -y gnupg software-properties-common curl
41-
curl -fsSL https://apt.releases.hashicorp.com/gpg | sudo apt-key add -
42-
sudo apt-add-repository "deb [arch=amd64] https://apt.releases.hashicorp.com $(lsb_release -cs) main" # Adjust if you have ARM platform.
43-
sudo apt-get update && sudo apt-get install terraform
44-
# Verify installation.
45-
terraform -help
46-
```
47-
1. Get TF code for Database Lab:
48-
```shell
49-
git clone https://gitlab.com/postgres-ai/database-lab-infrastructure.git
50-
cd database-lab-infrastructure/
51-
```
52-
1. Edit `terraform.tfvars` file. In our example, we will use Heroku demo database as a source:
53-
```config
54-
dle_version_full = "2.4.1"
55-
56-
aws_ami_name = "DBLABserver*"
57-
aws_keypair = "YOUR_AWS_KEYPAIR"
58-
59-
aws_deploy_region = "us-east-1"
60-
aws_deploy_ebs_availability_zone = "us-east-1a"
61-
aws_deploy_ec2_instance_type = "t2.large"
62-
aws_deploy_ec2_instance_tag_name = "DBLABserver-ec2instance"
63-
aws_deploy_ebs_size = "40"
64-
aws_deploy_ebs_type = "gp2"
65-
aws_deploy_allow_ssh_from_cidrs = ["0.0.0.0/0"]
66-
aws_deploy_dns_api_subdomain = "tf-test" # subdomain in aws.postgres.ai, fqdn will be ${dns_api_subdomain}-engine.aws.postgres
67-
68-
# Source – two options. Choose one of two:
69-
# - direct connection to source DB
70-
# - dump stored on AWS S3
71-
72-
# option 1 – direct PG connection
73-
source_type = "postgres" # source is working dome postgres database
74-
source_postgres_version = "13"
75-
source_postgres_host = "ec2-3-215-57-87.compute-1.amazonaws.com" # an example DB at Heroku
76-
source_postgres_port = "5432"
77-
source_postgres_dbname = "d3dljqkrnopdvg" # an example DB at Heroku
78-
source_postgres_username = "bfxuriuhcfpftt" # an example DB at Heroku
79-
80-
# option 2 – dump on S3. Important: your AWS user has to be able to create IAM roles to work with S3 buckets in your AWS account
81-
# source_type = 's3' # source is dump stored on demo s3 bucket
82-
# source_pgdump_s3_bucket = "tf-demo-dump" # This is an example public bucket
83-
# source_pgdump_path_on_s3_bucket = "heroku.dmp" # This is an example dump from demo database
84-
85-
dle_debug_mode = "true"
86-
dle_retrieval_refresh_timetable = "0 0 * * 0"
87-
postgres_config_shared_preload_libraries = "pg_stat_statements,logerrors" # DB Migration Checker requires logerrors extension
88-
89-
platform_project_name = "aws_test_tf"
90-
```
91-
1. Create `secret.tfvars` containing `source_postgres_password`, `platform_access_token`, and `vcs_github_secret_token`. An example:
92-
```config
93-
source_postgres_password = "dfe01cbd809a71efbaecafec5311a36b439460ace161627e5973e278dfe960b7" # an example DB at Heroku
94-
platform_access_token = "YOUR_ACCESS_TOKEN" # to generate, open https://console.postgres.ai/, choose your organization,
95-
# then "Access tokens" in the left menu
96-
vcs_github_secret_token = "vcs_secret_token" # generate a personal access token with scope of "repo"
97-
```
98-
To generate a personal GitHub access token with the scope of "repo", open the [guide on GitHub Docs](https://docs.github.com/en/github/authenticating-to-github/keeping-your-account-and-data-secure/creating-a-personal-access-token) and follow the instructions.
99-
100-
Note that the "repo" scope essentially gives full access to all user-controlled repositories. Should you have any concerns about which repositories the DLE can have access to, consider using a separate GitHub account that has access to the reduced number of repositories.
101-
1. Initialize
102-
```shell
103-
terraform init
104-
```
105-
1. Set environment variables with AWS credentials:
106-
```shell
107-
export AWS_ACCESS_KEY_ID = "keyid" # todo: how to get it?
108-
export AWS_SECRET_ACCESS_KEY = "accesskey"
109-
```
110-
1. Deploy:
111-
```shell
112-
terraform apply -var-file="secret.tfvars" -auto-approve
113-
```
114-
1. If everything goes well, you should get an output like this:
115-
```config
116-
vcs_db_migration_checker_verification_token = "gsio7KmgaxECfJ80kUx2tUeIf4kEXZex"
117-
dle_verification_token = "zXPodd13LyQaKgVXGmSCeB8TUtnGNnIa"
118-
ec2_public_dns = "ec2-11-111-111-11.us-east-2.compute.amazonaws.com"
119-
ec2instance = "i-0000000000000"
120-
ip = "11.111.111.11"
121-
platform_joe_signing_secret = "lG23qZbUh2kq0ULIBfW6TRwKzqGZu1aP"
122-
public_dns_name = "demo-api-engine.aws.postgres.ai" # todo: this should be URL, not hostname – further we'll need URL, with protocol – `https://`
123-
```
124-
125-
1. To verify result and check the progress, you might want to connect to the just-created EC2 machine using IP address or hostname from the Terraform output. In our example, it can be done using this one-liner (you can find more about DLE logs and configuration on this page: https://postgres.ai/docs/how-to-guides/administration/engine-manage):
126-
```shell
127-
echo "sudo docker logs dblab_server -f" | ssh [email protected] -i postgres_ext_test.pem
128-
```
129-
130-
Once you see the message like:
131-
```
132-
2021/07/02 10:28:51 [INFO] Server started listening on :2345.
133-
```
134-
– it means that the DLE server started successfully and is waiting for you commands
135-
136-
1. Sign in to the [Postgres.ai Platform](https://console.postgres.ai/) and register your new DLE server:
137-
1. Go to `Database Lab > Instances` in the left menu
138-
1. Press the "Add instance" button
139-
1. `Project` – specify any name (this is how your DLE server will be named in the platform)
140-
1. `Verification token` – use the token generated above (`verification_token` value); do NOT press the "Generate" button here
141-
1. `URL` – use the value generated above // todo: not convenient, we need URL but reported was only hostname
142-
1. Press the "Verify URL" button to check the connectivity. Then press "Add". If everything is right, you should see the DLE page with green "OK" status:
143-
<img src="/uploads/8371e7f79de199aa017ff2df82b8f704/image.png" width="400" />
144-
1. Add Joe chatbot for efficient SQL optimization workflow:
145-
1. Go to the "SQL Optimization > Ask Joe" page using the left menu, click the "Add instance" button, specify the same project as you defined in the previous step
146-
1. `Signing secret` – use `platform_joe_signing_secret` from the Terraform output
147-
1. `URL` – use `public_dns_name` values from the Terraform output with port `444`; in our example, it's `https://demo-api-engine.aws.postgres.ai:444`
148-
1. Press "Verify URL" to check connectivity and then press "Add". You should see:
149-
<img src="/uploads/252e5f74cd324fc4df301bbf7c2bdd25/image.png" width="400" />
150-
151-
Now you can start using Joe chatbot for SQL execution plans troubleshooting and verification of optimization ideas. As a quick test, go to `SQL Optimization > Ask Joe` in the left menu, and enter `\dt+` command (a psql command to show the list of tables with sizes). You should see how Joe created a thin clone behind the scenes and immediately ran this psql command, presenting the result to you:
152-
<img src="/uploads/d9e9e1fdafb0ded3504691cec9018868/image.png" width="400" />
153-
154-
1. Set up [DB migration checker](https://postgres.ai/docs/db-migration-checker). Prepare a repository with your DB migrations(Flyway, Sqitch, Liquibase, etc.):
155-
1. Add secrets:
156-
- `DLMC_CI_ENDPOINT` - an endpoint of your Database Lab Migration Checker service – use `vcs_db_migration_checker_registration_url` from the Terraform output
157-
- `DLMC_VERIFICATION_TOKEN` - verification token for the Database Lab Migration Checker API – use `vcs_db_migration_checker_verification_token` from the Terraform output
158-
1. Configure a new workflow in the created repository (see an example of configuration: https://github.com/postgres-ai/green-zone/blob/master/.github/workflows/main.yml)
159-
- add a custom action: https://github.com/marketplace/actions/database-lab-realistic-db-testing-in-ci
160-
- provide input params for the action (the full list of available input params)
161-
- provide environment variables:
162-
- `DLMC_CI_ENDPOINT` - use a CI Checker endpoint from the repository secrets
163-
- `DLMC_VERIFICATION_TOKEN` - use a verification token from the repository secrets
164-
165-
1. Install and try the client CLI (`dblab`)
166-
1. Follow the [guide](https://postgres.ai/docs/how-to-guides/cli/cli-install-init) to install Database Lab CLI
167-
1. Initialize CLI:
168-
```shell
169-
dblab init --environment-id=<ANY NAME FOR ENVIRONMENT> --url=https://<public_dns_name> --token=<your_personal_token_from_postgres_ai_platform>
170-
```
171-
1. Try it:
172-
```shell
173-
dblab instance status
174-
```
175-
It should return the OK status:
176-
```json
177-
{
178-
"status": {
179-
"code": "OK",
180-
"message": "Instance is ready"
181-
},
182-
...
183-
}
184-
```
185-
186-
## Important Note
187-
When the DLE creates new database clones, it makes them available on incremental ports in the 6000 range (e.g. 6000, 6001, ...). The DLE CLI will also report that the clone is available on a port in the 6000 range. However, please note that these are the ports when accessing the DLE from `localhost`. This Terraform module deploys [Envoy](https://www.envoyproxy.io/) to handle SSL termination and port forwarding to connect to DLE generated clones.
188-
189-
Bottom Line: When connecting to clones, add `3000` to the port number reported by the DLE CLI to connect to the clone. for example, if the CLI reports that a new clone is available at port `6001` connect that clone at port `9001`.
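
The port rule in the note above is plain arithmetic. A minimal sketch follows; `external_clone_port` is a hypothetical helper name used only for illustration, and the `3000` offset is taken from the note rather than read from the Envoy configuration:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the module): convert the clone port
# reported by the DLE CLI into the port exposed through Envoy.
# Assumes the fixed offset of 3000 described in the note above.
external_clone_port() {
  local reported_port="$1"
  echo $(( reported_port + 3000 ))
}

# The CLI reports 6001, so connect to the clone on 9001.
external_clone_port 6001
```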
14+
Follow the [how-to guide](https://postgres.ai/docs/how-to-guides/administration/install-database-lab-with-terraform) to install Database Lab with Terraform on AWS
19015

19116
## Known Issues
19217
### Certificate Authority Authorization (CAA) for your Hosted Zone

dle-logical-init.sh.tpl

Lines changed: 26 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -4,10 +4,10 @@ set -x
44

55
sleep 20
66
#run certbot and copy files to envoy
7-
# to avoid restrinctions from letsencrypt like "There were too many requests of a given type ::
7+
# to avoid restrictions from letsencrypt like "There were too many requests of a given type ::
88
# Error creating new order :: too many certificates (5) already issued for this exact set of domains
99
# in the last 168 hours: demo-api-engine.aws.postgres.ai: see https://letsencrypt.org/docs/rate-limits/"
10-
# follwing three lines were commented out and mocked up. In real implementation inline certs have to be
10+
# following three lines were commented out and mocked up. In real implementation inline certs have to be
1111
# removed and letsencrypt generated certs should be used
1212

1313

@@ -100,8 +100,14 @@ for i in $${!disks[@]}; do
100100
done
101101

102102
# Adjust DLE config
103-
mkdir ~/.dblab
103+
mkdir -p ~/.dblab/postgres_conf/
104+
104105
curl https://gitlab.com/postgres-ai/database-lab/-/raw/${dle_version_full}/configs/config.example.logical_generic.yml --output ~/.dblab/server.yml
106+
curl https://gitlab.com/postgres-ai/database-lab/-/raw/${dle_version_full}/configs/postgres/pg_hba.conf \
107+
--output ~/.dblab/postgres_conf/pg_hba.conf
108+
curl https://gitlab.com/postgres-ai/database-lab/-/raw/${dle_version_full}/configs/postgres/postgresql.conf --output ~/.dblab/postgres_conf/postgresql.conf
109+
cat /tmp/postgresql_clones_custom.conf >> ~/.dblab/postgres_conf/postgresql.conf
110+
105111
sed -ri "s/^(\s*)(debug:.*$)/\1debug: ${dle_debug_mode}/" ~/.dblab/server.yml
106112
sed -ri "s/^(\s*)(verificationToken:.*$)/\1verificationToken: ${dle_verification_token}/" ~/.dblab/server.yml
107113
sed -ri "s/^(\s*)(timetable:.*$)/\1timetable: \"${dle_retrieval_refresh_timetable}\"/" ~/.dblab/server.yml
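
The `sed -ri "s/^(\s*)(key:.*$)/\1key: value/"` lines in this hunk rewrite a single YAML key while re-emitting its captured indentation. A minimal standalone sketch of that pattern follows. The function name `set_yaml_debug` and the sample file contents are illustrative assumptions, and GNU sed is assumed because `-r` and `\s` are GNU extensions:

```shell
#!/usr/bin/env bash
# Sketch: rewrite one YAML key in place while keeping its indentation.
# set_yaml_debug and the sample config are hypothetical, for illustration only.
set_yaml_debug() {
  local file="$1" value="$2"
  # Group 1 captures the leading whitespace; \1 writes it back unchanged,
  # so the key stays at the same nesting level.
  sed -ri "s/^(\s*)(debug:.*$)/\1debug: ${value}/" "${file}"
}

config_file="$(mktemp)"
printf 'global:\n  debug: false\n  engine: postgres\n' > "${config_file}"
set_yaml_debug "${config_file}" true
cat "${config_file}"
```

Because the pattern matches any line that starts with optional whitespace followed by `debug:`, it rewrites every such key in the file, not only the first one.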
@@ -117,11 +123,14 @@ sed -ri "s/:13/:${source_postgres_version}/g" ~/.dblab/server.yml
117123
case "${source_type}" in
118124

119125
postgres)
126+
# Mount directory to store dump files.
127+
extra_mount="--volume /var/lib/dblab/dblab_pool_00/dump:/var/lib/dblab/dblab_pool/dump"
128+
120129
sed -ri "s/^(\s*)(host: 34.56.78.90$)/\1host: ${source_postgres_host}/" ~/.dblab/server.yml
121130
sed -ri "s/^(\s*)(port: 5432$)/\1port: ${source_postgres_port}/" ~/.dblab/server.yml
122131
sed -ri "s/^(\s*)( username: postgres$)/\1 username: ${source_postgres_username}/" ~/.dblab/server.yml
123132
sed -ri "s/^(\s*)(password:.*$)/\1password: ${source_postgres_password}/" ~/.dblab/server.yml
124-
#restore pg_dump via pipe - without saving it on the disk
133+
# restore pg_dump via pipe - without saving it on the disk
125134
sed -ri "s/^(\s*)(parallelJobs:.*$)/\1parallelJobs: 1/" ~/.dblab/server.yml
126135
sed -ri "s/^(\s*)(# immediateRestore:.*$)/\1immediateRestore: /" ~/.dblab/server.yml
127136
sed -ri "s/^(\s*)(# forceInit: false.*$)/\1 forceInit: true /" ~/.dblab/server.yml
@@ -134,10 +143,14 @@ case "${source_type}" in
134143
s3)
135144
# Mount S3 bucket if it's defined in Terraform variables
136145
mkdir -p "${source_pgdump_s3_mount_point}"
137-
s3fs ${source_pgdump_s3_bucket} ${source_pgdump_s3_mount_point} -o iam_role -o use_cache=/tmp -o allow_other
146+
s3fs ${source_pgdump_s3_bucket} ${source_pgdump_s3_mount_point} -o iam_role -o allow_other
138147

148+
extra_mount="--volume ${source_pgdump_s3_mount_point}:${source_pgdump_s3_mount_point}"
149+
139150
sed -ri "s/^(\s*)(- logicalDump.*$)/\1#- logicalDump /" ~/.dblab/server.yml
140151
sed -ri "s|^(\s*)( dumpLocation:.*$)|\1 dumpLocation: ${source_pgdump_s3_mount_point}/${source_pgdump_path_on_s3_bucket}|" ~/.dblab/server.yml
152+
sed -ri '/is always single-threaded./{n;s/.*/ parallelJobs: '${postgres_dump_parallel_jobs}'/}' ~/.dblab/server.yml
153+
sed -ri '/jobs to restore faster./{n;s/.*/ parallelJobs: '$(getconf _NPROCESSORS_ONLN)'/}' ~/.dblab/server.yml
141154
;;
142155

143156
esac
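
The two `parallelJobs` lines added in this hunk use a different sed idiom: match a marker comment, step to the next line with `n`, and overwrite that line. A minimal sketch under stated assumptions follows. `set_line_after_marker` and the sample file are hypothetical, GNU sed is assumed, and the marker is treated as a regular expression, so a `.` in it matches any character:

```shell
#!/usr/bin/env bash
# Sketch: replace the line that follows a marker line.
# set_line_after_marker and the sample config are hypothetical.
# The replacement must not contain "/" or "&" in this simple form.
set_line_after_marker() {
  local file="$1" marker="$2" replacement="$3"
  # On a marker match: n prints the marker line and loads the next line,
  # then s/.*/.../ overwrites that next line.
  sed -ri "/${marker}/{n;s/.*/${replacement}/}" "${file}"
}

config_file="$(mktemp)"
printf '# jobs to restore faster.\n    parallelJobs: 2\n    forceInit: false\n' > "${config_file}"
# Use the number of online CPUs, as the restore setting in the diff does.
set_line_after_marker "${config_file}" "jobs to restore faster." \
  "    parallelJobs: $(getconf _NPROCESSORS_ONLN)"
cat "${config_file}"
```

Keying on the comment text rather than on `parallelJobs:` itself is what lets the dump and restore settings receive different values, since both keys share the same name.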
@@ -148,9 +161,10 @@ sudo docker run \
148161
--privileged \
149162
--publish 2345:2345 \
150163
--volume /var/run/docker.sock:/var/run/docker.sock \
151-
--volume /var/lib/dblab/dblab_pool_00/dump:/var/lib/dblab/dblab_pool/dump \
152164
--volume /var/lib/dblab:/var/lib/dblab/:rshared \
153165
--volume ~/.dblab/server.yml:/home/dblab/configs/config.yml \
166+
--volume /root/.dblab/postgres_conf:/home/dblab/configs/postgres \
167+
$extra_mount \
154168
--env DOCKER_API_VERSION=1.39 \
155169
--detach \
156170
--restart on-failure \
@@ -162,13 +176,15 @@ for i in {1..30000}; do
162176
sleep 10
163177
done
164178

179+
curl https://gitlab.com/postgres-ai/database-lab/-/raw/${dle_version_full}/scripts/cli_install.sh | bash
180+
sudo mv ~/.dblab/dblab /usr/local/bin/dblab
165181
dblab init \
166182
--environment-id=tutorial \
167183
--url=http://localhost:2345 \
168184
--token=${dle_verification_token} \
169185
--insecure
170186

171-
#configure and run Joe Bot container
187+
# Configure and run Joe Bot container.
172188
cp /home/ubuntu/joe.yml ~/.dblab/joe.yml
173189
sed -ri "s/^(\s*)(debug:.*$)/\1debug: ${dle_debug_mode}/" ~/.dblab/joe.yml
174190
sed -ri "s/^(\s*)( token:.*$)/\1 token: ${platform_access_token}/" ~/.dblab/joe.yml
@@ -186,8 +202,8 @@ sudo docker run \
186202
--detach \
187203
postgresai/joe:latest
188204

189-
#configure and run DB Migration Checker
190-
curl https://gitlab.com/postgres-ai/database-lab/-/raw/master/configs/config.example.run_ci.yaml --output ~/.dblab/run_ci.yaml
205+
# Configure and run DB Migration Checker.
206+
curl https://gitlab.com/postgres-ai/database-lab/-/raw/${dle_version_full}/configs/config.example.run_ci.yaml --output ~/.dblab/run_ci.yaml
191207
sed -ri "s/^(\s*)(debug:.*$)/\1debug: ${dle_debug_mode}/" ~/.dblab/run_ci.yaml
192208
sed -ri "s/^(\s*)( verificationToken: \"secret_token\".*$)/\1 verificationToken: ${vcs_db_migration_checker_verification_token}/" ~/.dblab/run_ci.yaml
193209
sed -ri "s/^(\s*)( url: \"https\\:\\/\\/dblab.domain.com\"$)/\1 url: \"http\\:\\/\\/dblab_server\\:2345\"/" ~/.dblab/run_ci.yaml
@@ -200,4 +216,4 @@ sudo docker run --name dblab_ci_checker -it --detach \
200216
--volume /var/run/docker.sock:/var/run/docker.sock \
201217
--volume /tmp/ci_checker:/tmp/ci_checker \
202218
--volume ~/.dblab/run_ci.yaml:/home/dblab/configs/run_ci.yaml \
203-
postgresai/dblab-ci-checker:2.4.1
219+
postgresai/dblab-ci-checker:${dle_version_full}
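
One structural change in this diff is that the hard-coded dump volume in `docker run` becomes an `$extra_mount` variable chosen per `source_type`. The sketch below isolates that selection as a plain shell function. The name `select_extra_mount` and the sample mount point are assumptions for illustration; in the real template the `${...}` values are substituted by Terraform before the script runs, whereas here they are ordinary shell arguments:

```shell
#!/usr/bin/env bash
# Sketch: choose the extra Docker volume flag from the source type.
# select_extra_mount is a hypothetical helper, not part of the module.
select_extra_mount() {
  local source_type="$1"
  local s3_mount_point="$2"
  case "${source_type}" in
    postgres)
      # Direct connection: mount the pool directory that stores dump files.
      echo "--volume /var/lib/dblab/dblab_pool_00/dump:/var/lib/dblab/dblab_pool/dump"
      ;;
    s3)
      # Dump on S3: expose the s3fs mount point at the same path in the container.
      echo "--volume ${s3_mount_point}:${s3_mount_point}"
      ;;
    *)
      echo "unsupported source type: ${source_type}" >&2
      return 1
      ;;
  esac
}

# Print the flag that would be passed to docker run for an S3 source.
select_extra_mount s3 /s3/tf-demo-dump
```

Leaving `$extra_mount` unquoted in the `docker run` call, as the diff does, is what lets the shell split it into the `--volume` flag and its argument.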
