## workload example for single index analysis dashboard

This example prepares and runs a repeatable workload designed for the postgres_ai monitoring “Single index analysis” dashboard. It also shows how to deploy `pg_index_pilot`, generate controlled index bloat, and let `pg_index_pilot` automatically rebuild indexes when bloat exceeds the configured threshold during periodic runs.

### prerequisites

- Postgres instance
- `pg_cron` extension available for scheduling periodic execution
- `pgbench` installed for workload generation

### prepare the dataset in the target database

Create a table with several indexes and populate it with 10 million rows in the target database (e.g., `workloaddb`). The example uses an `items` table in the `test_pilot` schema (create the schema first if it does not exist).
```bash
psql -U postgres -d workloaddb <<'SQL'
create schema if not exists test_pilot;

drop table if exists test_pilot.items cascade;
drop sequence if exists test_pilot.items_id_seq;

create sequence test_pilot.items_id_seq as bigint;

create table test_pilot.items (
    id bigint primary key default nextval('test_pilot.items_id_seq'::regclass),
    email text not null,
    status text not null,
    data jsonb,
    created_at timestamptz not null default now(),
    amount numeric(12,2) not null default 0,
    category integer not null default 0,
    updated_at timestamptz
);

alter sequence test_pilot.items_id_seq owned by test_pilot.items.id;

create index items_category_idx on test_pilot.items(category);
create index items_status_idx on test_pilot.items(status);
create index items_created_at_idx on test_pilot.items(created_at);
create index items_email_idx on test_pilot.items(email);
create index idx_items_data_gin on test_pilot.items using gin (data);

insert into test_pilot.items (email, status, data, created_at, amount, category, updated_at)
select
    'user'||g||'@ex',
    (g % 10)::text,
    jsonb_build_object('k', g),
    now() - (g % 1000) * interval '1 sec',
    (g % 1000) / 10.0,
    (g % 10),
    now()
from generate_series(1, 10000000) g;

select setval('test_pilot.items_id_seq', (select coalesce(max(id),0) from test_pilot.items));
SQL
```
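To sanity-check what the generator produces, this small shell sketch (an illustration only; the authoritative logic is the SQL above) recomputes the insert's expressions for one sample row id:

```shell
# Mirror the SQL expressions from the insert above for a sample row g.
# Illustration only; not part of the setup.
g=1234567
email="user${g}@ex"                 # 'user'||g||'@ex'
status=$(( g % 10 ))                # (g % 10)::text
category=$(( g % 10 ))              # (g % 10)
amount=$(awk -v g="$g" 'BEGIN { printf "%.2f", (g % 1000) / 10.0 }')  # numeric(12,2)
echo "email=${email} status=${status} amount=${amount} category=${category}"
```

So row 1234567 lands in status/category bucket 7 with amount 56.70; each of the 10 buckets receives one tenth of the 10 million rows.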
### deploy pg_index_pilot

```bash
# Clone the repository
git clone https://gitlab.com/postgres-ai/pg_index_pilot
cd pg_index_pilot

# 1) Create the control database
psql -U postgres -c "create database index_pilot_control;"

# 2) Install required extensions in the control database
psql -U postgres -d index_pilot_control -c "create extension if not exists postgres_fdw;"
psql -U postgres -d index_pilot_control -c "create extension if not exists dblink;"

# 3) Install schema and functions in the control database
psql -U postgres -d index_pilot_control -f index_pilot_tables.sql
psql -U postgres -d index_pilot_control -f index_pilot_functions.sql
psql -U postgres -d index_pilot_control -f index_pilot_fdw.sql
```
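If you repeat the deployment often, a small wrapper that can dry-run the steps helps catch mistakes before touching the instance. This is a hypothetical convenience script, not part of `pg_index_pilot`; the `DRY_RUN` switch and `run` helper are illustrative:

```shell
# Hypothetical dry-run wrapper around the deploy steps above.
# DRY_RUN=1 (default) only prints the commands; DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@" || { echo "failed: $*" >&2; exit 1; }
  fi
}

run psql -U postgres -c "create database index_pilot_control;"
run psql -U postgres -d index_pilot_control -c "create extension if not exists postgres_fdw;"
run psql -U postgres -d index_pilot_control -c "create extension if not exists dblink;"
run psql -U postgres -d index_pilot_control -f index_pilot_tables.sql
run psql -U postgres -d index_pilot_control -f index_pilot_functions.sql
run psql -U postgres -d index_pilot_control -f index_pilot_fdw.sql
```

Review the printed commands, then re-run with `DRY_RUN=0` to apply them.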
### register the target database via FDW

Replace placeholders with actual connection details for your target database (the database where the workload and indexes live; in the examples below it is `workloaddb`).

```bash
psql -U postgres -d index_pilot_control <<'SQL'
create server if not exists target_workloaddb foreign data wrapper postgres_fdw
    options (host '127.0.0.1', port '5432', dbname 'workloaddb');

create user mapping if not exists for current_user server target_workloaddb
    options (user 'postgres', password 'your_password');

insert into index_pilot.target_databases(database_name, host, port, fdw_server_name, enabled)
values ('workloaddb', '127.0.0.1', 5432, 'target_workloaddb', true)
on conflict (database_name) do update
    set host = excluded.host,
        port = excluded.port,
        fdw_server_name = excluded.fdw_server_name,
        enabled = true;
SQL
```
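If you register more than one target database, it can be less error-prone to generate the registration SQL from variables instead of editing it by hand. A minimal sketch (the variable names and `target_<db>` server-naming convention are illustrative; review the output before piping it to `psql -d index_pilot_control`):

```shell
# Generate FDW registration SQL from connection variables (illustrative helper).
TARGET_DB="${TARGET_DB:-workloaddb}"
TARGET_HOST="${TARGET_HOST:-127.0.0.1}"
TARGET_PORT="${TARGET_PORT:-5432}"
FDW_SERVER="target_${TARGET_DB}"

cat <<SQL
create server if not exists ${FDW_SERVER} foreign data wrapper postgres_fdw
    options (host '${TARGET_HOST}', port '${TARGET_PORT}', dbname '${TARGET_DB}');

create user mapping if not exists for current_user server ${FDW_SERVER}
    options (user 'postgres', password 'your_password');

insert into index_pilot.target_databases(database_name, host, port, fdw_server_name, enabled)
values ('${TARGET_DB}', '${TARGET_HOST}', ${TARGET_PORT}, '${FDW_SERVER}', true)
on conflict (database_name) do update
    set host = excluded.host,
        port = excluded.port,
        fdw_server_name = excluded.fdw_server_name,
        enabled = true;
SQL
```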
Verify environment readiness:

```bash
psql -U postgres -d index_pilot_control -c "select * from index_pilot.check_fdw_security_status();"
psql -U postgres -d index_pilot_control -c "select * from index_pilot.check_environment();"
```
### schedule periodic runs with pg_cron (run from the primary database)

Install `pg_cron` in the primary database (e.g., `postgres`) and schedule execution of `index_pilot.periodic` in the control database using `cron.schedule_in_database`:

```sql
select cron.schedule_in_database(
    'pg_index_pilot_daily',
    '0 2 * * *',
    'call index_pilot.periodic(real_run := true);',
    'index_pilot_control' -- run in control database
);
```

Adjust the cron expression to taste; for a faster demo, `*/20 * * * *` runs the job every 20 minutes.
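A malformed cron expression is an easy mistake when editing the schedule. This tiny shell check (an illustration; `pg_cron` itself validates expressions when you call `cron.schedule_in_database`) confirms the expression has the five classic fields, minute hour day-of-month month day-of-week:

```shell
# Check that a cron expression has the five classic fields (illustrative only;
# pg_cron performs its own validation on schedule).
SCHEDULE="${SCHEDULE:-0 2 * * *}"
nfields=$(echo "$SCHEDULE" | awk '{ print NF }')
if [ "$nfields" -eq 5 ]; then
  echo "ok: '$SCHEDULE' has 5 fields"
else
  echo "bad cron expression: '$SCHEDULE'" >&2
fi
```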
Behavior: when `index_pilot.periodic(true)` runs, it evaluates index bloat in the registered target database(s). If bloat for an index exceeds the configured `index_rebuild_scale_factor` at the time of a run, an index rebuild is initiated.
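The decision rule can be pictured with a toy calculation. The actual bloat estimate is internal to `pg_index_pilot`; this sketch uses made-up sizes purely to show the shape of the comparison against `index_rebuild_scale_factor`:

```shell
# Toy threshold check: numbers and formula are illustrative only;
# pg_index_pilot computes its own bloat estimate internally.
current_size_mb=300      # index size now
baseline_size_mb=100     # size observed right after a fresh rebuild
scale_factor=2           # example index_rebuild_scale_factor

ratio=$(awk -v c="$current_size_mb" -v b="$baseline_size_mb" 'BEGIN { printf "%.2f", c / b }')
needs_rebuild=$(awk -v r="$ratio" -v f="$scale_factor" 'BEGIN { if (r > f) print "yes"; else print "no" }')
echo "size ratio=${ratio}, rebuild needed: ${needs_rebuild}"
```

Here the index has grown to 3x its post-rebuild baseline, exceeding the factor of 2, so a rebuild would be triggered at the next periodic run.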
### run the workload with pgbench

Use two concurrent pgbench jobs: one generates updates that touch ranges of `id`, and the other performs point lookups by `id`. This mix creates index bloat over time; when bloat exceeds the configured threshold during a periodic run, `pg_index_pilot` triggers a rebuild.
1) Create workload scripts on the machine where `pgbench` runs:

```bash
cat >/root/workload/update.sql <<'SQL'
\set id random(1,10000000)
update test_pilot.items
set updated_at = clock_timestamp()
where id between :id and (:id + 100);
SQL

cat >/root/workload/longselect.sql <<'SQL'
\set id random(1,10000000)

select 1 from test_pilot.items where id = :id;

\sleep 300s
SQL
```
2) Start pgbench sessions against the target database (example: `workloaddb`):

```bash
# Updates at a limited rate (-R 50), 4 clients, 4 threads
pgbench -n -h 127.0.0.1 -U postgres -d workloaddb -c 4 -j 4 -R 50 -P 10 -T 1000000000 -f /root/workload/update.sql

# Long selects, 2 clients, 2 threads
pgbench -n -h 127.0.0.1 -U postgres -d workloaddb -c 2 -j 2 -P 10 -T 1000000000 -f /root/workload/longselect.sql
```
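To see roughly how fast this mix generates bloat, a back-of-envelope on the update job (assuming the `-R 50` rate cap is sustained and each update statement touches about 101 ids; real numbers vary with HOT updates and page fill):

```shell
# Rough arithmetic on row-version churn from the rate-limited update job.
# Assumptions: sustained 50 tps, ~101 rows per update statement.
rate_tps=50
rows_per_update=101
per_sec=$(( rate_tps * rows_per_update ))
per_hour=$(( per_sec * 3600 ))
echo "~${per_sec} row versions/sec, ~${per_hour} per hour"
```

At roughly 5,000 new row versions per second, dead index entries accumulate quickly against a 10-million-row table, which is what makes the bloat threshold reachable within hours.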
Tip: run pgbench in tmux so workloads continue running after disconnects. Example:

```bash
tmux new -d -s pgbench_updates 'env PGPASSWORD=<password> pgbench -n -h 127.0.0.1 -U postgres -d workloaddb -c 4 -j 4 -R 50 -P 10 -T 1000000000 -f /root/workload/update.sql'
# Optional: run long selects in another tmux session
tmux new -d -s pgbench_selects 'env PGPASSWORD=<password> pgbench -n -h 127.0.0.1 -U postgres -d workloaddb -c 2 -j 2 -P 10 -T 1000000000 -f /root/workload/longselect.sql'
```
Let these processes run continuously. The updates will steadily create index bloat; at each scheduled run, `index_pilot.periodic(true)` evaluates bloat and, if thresholds are exceeded, initiates index rebuilds.

### monitor results

- In the postgres_ai monitoring included with this repository, use:
  - `Single index analysis` for targeted inspection