author: Marko Kreen, 2007-07-24 22:19:52 +0000
committer: Marko Kreen, 2007-07-24 22:19:52 +0000
commit: aa311a2891db4088f7054eb7eb3f7d7187497f41
tree: 140c57717153bf62b40f3cc4a067b5d1cbb1a405
parent: 72b755e34829c98c3ac5da23797a17539fae7253
doc update
-rw-r--r-- doc/Makefile 3
-rw-r--r-- doc/TODO.txt 77
-rw-r--r-- doc/londiste.ref.txt 261
-rw-r--r-- doc/overview.txt 23
-rw-r--r-- doc/pgq-sql.txt 1
5 files changed, 324 insertions, 41 deletions
diff --git a/doc/Makefile b/doc/Makefile
index 069781d4..1bf04d29 100644
--- a/doc/Makefile
+++ b/doc/Makefile
@@ -10,7 +10,8 @@ all:
upload:
devupload.sh overview.txt $(wiki)
- devupload.sh londiste.txt $(wiki)/LondisteUsage
+ #devupload.sh londiste.txt $(wiki)/LondisteUsage
+ devupload.sh londiste.ref.txt $(wiki)/LondisteReference
devupload.sh pgq-sql.txt $(wiki)/PgQdocs
devupload.sh pgq-nodupes.txt $(wiki)/PgqNoDupes
devupload.sh walmgr.txt $(wiki)/WalMgr
diff --git a/doc/TODO.txt b/doc/TODO.txt
index a1600946..8628ee91 100644
--- a/doc/TODO.txt
+++ b/doc/TODO.txt
@@ -1,58 +1,65 @@
-web:
- - walmgr
- - pgqadm
- - todo
-
-
-londiste link <qname>
-londiste unlink <qname>
-londiste add --skip-truncate (needs new field in table_state)
-
-pgqadm reg-copy que cons1 cons2
-pgqadm reg-move que cons1 cons2
+High-priority
+============
-queue-rename
-show-batch-events
-del-event
+* docs: londiste, pgq/python, pgq/sql, skytools, walmgr
-Immidiate
-=========
+Larger things
+-------------
+* chained replication, switchover
+* Write PgQ triggers in C
-* londiste swithcover support / deny triggers
-* deb: /etc/skylog.ini should be conffile
-* RemoteConsumer/SerialConsumer/pgq_ext sanity, too much duplication
+Smaller things
+--------------
* londiste * remove tbl should work also if table is already dropped
-
+* RemoteConsumer/SerialConsumer/pgq_ext sanity, too much duplication
* backend modules need to be ported to 8.3
+* pgqadm: separate period for retry queue processing
+* londiste: create tables on subscriber
+* pgqadm: Utility commands:
+ reg-copy que cons1 cons2
+ reg-move que cons1 cons2
+ queue-rename
+ show-batch-events
+ del-event
-Near future
+Low-priority
============
-* pgqadm: separate priod for retry queue processing
-* londiste: create tables on subscriber
+Larger things
+-------------
+* denytriggers on subscriber
+* Quote SQL identifiers, keep combined name, rule will be "Split schema as first dot"
+* londiste: good fkey support:
+ store them in subscriber db and apply when both tables are in sync.
+* skylog/logdb: publish sample logdb schema, with some tools
+* londiste: allow table redirection on subscriber side
+
+Smaller things
+--------------
* skytools: switch for silence for cron scripts
-* docs: londiste, pgq/python, pgq/sql, skytools
-* txid: decide on renaming functions
-* logtriga: way to switch off logging for some connection
-* pgq: separately installable fkeys for all tables for testing
+* pgq: drop_fkeys.sql for live envs
* logdb: hostname
-* pgq_ext: solve event tracking
* contrib/*.sql loading from python - need to check db version
+* DBScript: failure to write pidfile should be logged (cron scripts)
* ideas from SlonyI:
- force timestamps to ISO
- when buffering queries, check their size
-* DBScript: failure to write pidfile should be logged (crontscripts)
-* logtriga/textbuf.c should be converted to StringInfo to get rid of
- buffer management code.
Just ideas
===========
-* logtriga: use pgq.insert_event_directly?
-* pgq/sql: rewrite pgq.insert_event in C or logtriga in plpython?
* skytools: config-less operation?
* skytools: config from database?
* skytools: partial sql parser for log processing
-* pgqadm ticker logic into db, to make easier other implementations?
+
+Dropped
+-------
+* txid: decide on renaming functions
+
+walmgr
+======
+
+- copy master config to slave
+- slave needs to decide which config to use
diff --git a/doc/londiste.ref.txt b/doc/londiste.ref.txt
new file mode 100644
index 00000000..9e4ee368
--- /dev/null
+++ b/doc/londiste.ref.txt
@@ -0,0 +1,261 @@
+
+[[TableOfContents]]
+
+= Notes =
+
+== PgQ daemon ==
+
+Londiste runs as a consumer on PgQ. Thus `pgqadm.py ticker` must be running
+on the provider database.
+
+== Table Names ==
+
+Londiste always uses fully schema-qualified table names internally.
+If a table name without a schema is given on the command line, it just
+puts "public." in front of it, without looking at search_path.
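The naming rule above can be sketched in Python as follows. The function name `fq_table_name` is hypothetical and is not taken from the Londiste source; the sketch assumes that any name containing a dot is already schema-qualified.

```python
def fq_table_name(name):
    """Return a schema-qualified table name.

    Assumption: a name that contains a dot is already qualified.
    search_path is deliberately not consulted.
    """
    if "." in name:
        return name
    return "public." + name
```

Under these assumptions, `fq_table_name("users")` gives `public.users`, while `fq_table_name("app.users")` is returned unchanged.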
+
+== PgQ events ==
+
+''' Table change event '''
+
+These events are inserted by triggers on tables.
+
+ * ev_type = 'I' / 'U' / 'D'
+ * ev_data = partial SQL statement - the part between `[]` is removed:
+ * `[ INSERT INTO table ] (column1, column2) values (value1, value2)`
+ * `[ UPDATE table SET ] column2=value2 WHERE pkeycolumn1 = value1`
+ * `[ DELETE FROM table WHERE ] pkeycolumn1 = value1`
+ * ev_extra1 = table name with schema
+
+This partial SQL format is used for two reasons: to conserve space
+and to make it possible to redirect events to another table.
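A minimal sketch of how a consumer might turn such an event back into a full statement, based only on the three patterns listed above. The function `rebuild_sql` and its `redirect` parameter are hypothetical names used for illustration, not Londiste's actual code, and no identifier quoting is attempted.

```python
def rebuild_sql(ev_type, ev_data, ev_extra1, redirect=None):
    """Re-attach the removed statement prefix to a table change event.

    ev_extra1 carries the schema-qualified source table name.
    redirect (hypothetical) names another table to apply the event to.
    """
    table = redirect or ev_extra1
    prefixes = {
        "I": "INSERT INTO %s ",
        "U": "UPDATE %s SET ",
        "D": "DELETE FROM %s WHERE ",
    }
    if ev_type not in prefixes:
        raise ValueError("not a table change event: %r" % ev_type)
    return prefixes[ev_type] % table + ev_data + ";"
```

Because the table name lives only in `ev_extra1`, redirecting an event needs no parsing of `ev_data`: passing a different table name is enough.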
+
+''' Registration change event '''
+
+These events are inserted by the `provider add` and `provider remove`
+commands. The full list of registered tables is then sent to the queue
+so subscribers can update their own registrations.
+
+ * ev_type = 'T'
+ * ev_data = comma-separated list of table names.
+
+Currently subscribers only remove tables that were removed from the provider.
+In the future, subscribers could also automatically add
+tables that were added on the provider.
+
+== Log file ==
+
+The normal Londiste log consists only of statistics log lines: key-value
+pairs between `{}`. Their meaning:
+
+ * count: how many events were in the batch.
+ * ignored: how many of them were ignored, because the table is not registered on the subscriber or not yet in sync.
+ * duration: how long the batch processing took.
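A small sketch for pulling these statistics out of a log line. The exact line layout is an assumption here (comma-separated `key: value` pairs inside one pair of braces, after an arbitrary prefix); `parse_stats` is a hypothetical helper, so adjust the parsing to match real log output.

```python
import re

def parse_stats(line):
    """Extract key-value pairs from the first {...} group in a log line.

    Assumed layout: "<prefix> {count: 10, ignored: 2, duration: 0.153}".
    Values are returned as strings; lines without braces yield {}.
    """
    match = re.search(r"\{(.*?)\}", line)
    if match is None:
        return {}
    stats = {}
    for pair in match.group(1).split(","):
        key, sep, value = pair.partition(":")
        if sep:
            stats[key.strip()] = value.strip()
    return stats
```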
+
+= Commands for managing provider database =
+
+== provider install ==
+
+{{{
+londiste.py <config.ini> provider install
+}}}
+
+Installs code into provider and subscriber database and creates queue.
+Equivalent to doing the following by hand:
+
+{{{
+CREATE LANGUAGE plpgsql;
+CREATE LANGUAGE plpython;
+\i .../contrib/txid.sql
+\i .../contrib/pgq.sql
+\i .../contrib/londiste.sql
+select pgq.create_queue(queue name);
+}}}
+
+Notes:
+
+ * If the PostgreSQL modules are not installed on the same machine
+ as the Python scripts, the commands need to be run by hand.
+
+ * The schema/tables are installed under the user Londiste is configured to run as.
+ If you prefer to run Londiste under a non-admin user, they should also
+ be installed by hand.
+
+== provider add ==
+
+{{{
+londiste.py <config.ini> provider add <table name> ...
+}}}
+
+Registers table on provider database and adds trigger to the table
+that will send events to the queue.
+
+== provider remove ==
+
+{{{
+londiste.py <config.ini> provider remove <table name> ...
+}}}
+
+Unregisters the table on the provider side and removes triggers from the table.
+The event about table removal is also sent to the queue, so
+all subscribers unregister the table on their end as well.
+
+== provider tables ==
+
+{{{
+londiste.py <config.ini> provider tables
+}}}
+
+Shows registered tables on provider side.
+
+== provider seqs ==
+
+{{{
+londiste.py <config.ini> provider seqs
+}}}
+
+Shows registered sequences on provider side.
+
+= Commands for managing subscriber database =
+
+== subscriber install ==
+
+{{{
+londiste.py <config.ini> subscriber install
+}}}
+
+Installs code into subscriber database.
+Equivalent to doing the following by hand:
+
+{{{
+CREATE LANGUAGE plpgsql;
+\i .../contrib/londiste.sql
+}}}
+
+This will be done under the Londiste user. If the tables should be
+owned by someone else, it needs to be done by hand.
+
+== subscriber add ==
+
+{{{
+londiste.py <config.ini> subscriber add <table name> ... [--expect-sync | --skip-truncate | --force]
+}}}
+
+Registers table on subscriber side.
+
+Switches
+
+ * --expect-sync: Table is tagged as in-sync so initial COPY is skipped.
+ * --skip-truncate: When doing initial COPY, don't remove old data.
+ * --force: Ignore table structure differences.
+
+== subscriber remove ==
+
+{{{
+londiste.py <config.ini> subscriber remove <table name> ...
+}}}
+
+Unregisters the table from subscriber. No events will be applied
+to the table anymore. Actual table will not be touched.
+
+== subscriber resync ==
+
+{{{
+londiste.py <config.ini> subscriber resync <table name> ...
+}}}
+
+Tags tables as "not synced". Later the `replay` process will notice this
+and launch a `copy` process to sync the table again.
+
+= Replication commands =
+
+== replay ==
+
+The actual replication process. Should be run as a daemon with the `-d` switch,
+because it needs to be always running.
+
+Its main task is to get batches from PgQ and apply them in one transaction.
+
+Basic logic:
+ * Get batch from PgQ queue on provider. See if it is already applied to
+ subscriber, skip the batch in that case.
+ * Management actions, can do transactions on subscriber:
+ * Load table state from subscriber, to be up-to-date on registrations
+ and `copy` processes running in parallel.
+ * If a `copy` process wants to give table over to main process,
+ wait until `copy` process catches-up.
+ * If there is a table that is not synced and no `copy` process
+ is already running, launch new `copy` process.
+ * If there are sequences registered on subscriber, look up their latest
+ state on provider and apply it to subscriber.
+ * Event replay, all in one transaction on subscriber:
+ * Apply events from the batch, only for tables that are registered
+ on subscriber and are in sync.
+ * Store tick_id on subscriber.
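The skip-and-apply part of this logic can be illustrated with an in-memory model. Everything here is hypothetical: `FakeSubscriber` stands in for the subscriber database, and the "already applied" check is assumed to be a comparison against the last stored tick_id, which the text above does not spell out. Management actions (table state loading, `copy` hand-over, sequences) are left out.

```python
class FakeSubscriber:
    """In-memory stand-in for subscriber state (illustration only)."""

    def __init__(self, synced_tables):
        self.last_tick = 0                    # last stored tick_id
        self.synced_tables = set(synced_tables)
        self.applied = []                     # events that were "applied"


def replay_batch(sub, tick_id, events):
    """Apply one batch; return False if the batch was skipped."""
    # Assumption: a batch counts as already applied when its tick_id
    # is not newer than the one stored on the subscriber.
    if tick_id <= sub.last_tick:
        return False
    # Only events for registered, in-sync tables are applied.
    staged = [ev for ev in events if ev["table"] in sub.synced_tables]
    # In a real database the next two steps share one transaction,
    # so events and tick_id are stored together or not at all.
    sub.applied.extend(staged)
    sub.last_tick = tick_id
    return True
```

Storing the tick_id in the same transaction as the events is what makes it safe to re-fetch a batch after a crash: the batch is either fully applied and recorded, or not applied at all.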
+
+== copy ==
+
+Internal command for initial SYNC. Launched by `replay` if it notices
+that some tables are not in sync. The reason for copying tables in a
+separate process is to avoid locking down the main replay process for
+a long time.
+
+Basic logic:
+ * Register on the same queue in parallel with different name.
+ * One transaction on subscriber:
+ * Drop constraints and indexes.
+ * Truncate table.
+ * COPY data in.
+ * Restore constraints and indexes.
+ * Tag the table as `catching-up`.
+ * When catching-up, the `copy` process acts as regular
+ `replay` process but just for one table.
+ * When it reaches queue end, when no more batches are immediately
+ available, it hands the table over to main `replay` process.
+
+= Utility commands =
+
+== repair ==
+
+It tries to achieve a state where tables should be in sync, then compares
+them and writes out SQL statements that would fix the differences.
+
+Syncing happens by locking provider tables against updates and then waiting
+until `replay` has applied all pending changes to the subscriber database. As this
+is a dangerous operation, it has a hardwired limit of 10 seconds for locking. If
+the `replay` process does not catch up in that time, the locks are released and the
+operation is canceled.
+
+Comparing happens by dumping out the table from both sides, sorting the dumps and
+then comparing them line by line. As this is a CPU- and memory-hungry operation,
+good practice is to run the `repair` command on a third machine, to avoid
+consuming resources on either provider or subscriber.
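The line-by-line comparison of two sorted dumps can be sketched as a merge-style walk. This is an illustration of the technique only, not the `repair` implementation: `diff_sorted` is a hypothetical helper, rows are treated as opaque text lines, and no SQL is generated.

```python
def diff_sorted(provider_rows, subscriber_rows):
    """Compare two sorted lists of dumped rows in a single pass.

    Returns (missing, extra): rows present only on the provider and
    rows present only on the subscriber.
    """
    missing, extra = [], []
    i = j = 0
    while i < len(provider_rows) and j < len(subscriber_rows):
        p, s = provider_rows[i], subscriber_rows[j]
        if p == s:
            i += 1
            j += 1
        elif p < s:
            missing.append(p)   # subscriber lacks this row
            i += 1
        else:
            extra.append(s)     # subscriber has a row provider lacks
            j += 1
    # Whatever is left on either side has no counterpart.
    missing.extend(provider_rows[i:])
    extra.extend(subscriber_rows[j:])
    return missing, extra
```

In this sketch a row whose non-key columns differ shows up once in each list; pairing such rows by primary key would be the step that turns them into an UPDATE rather than an INSERT plus a DELETE.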
+
+== compare ==
+
+It syncs tables like `repair`, but just runs SELECT count(*) on both sides,
+which is a slightly cheaper but also less precise way of checking
+whether tables are in sync.
+
+= Config file =
+
+{{{
+[londiste]
+job_name = test_to_subcriber
+
+# source database, where the queue resides
+provider_db = dbname=provider port=6000 host=127.0.0.1
+
+# destination database
+subscriber_db = dbname=subscriber port=6000 host=127.0.0.1
+
+# the queue to listen on
+pgq_queue_name = londiste.replika
+
+# where to log
+logfile = ~/log/%(job_name)s.log
+
+# pidfile is used for avoiding duplicate processes
+pidfile = ~/pid/%(job_name)s.pid
+
+}}}
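The `%(job_name)s` placeholders in the sample follow Python's config-file interpolation syntax. A minimal sketch of how they expand, using the standard library `configparser` from Python 3; whether Londiste's own config loader behaves identically is an assumption, and `load_config` is a hypothetical helper.

```python
import configparser

SAMPLE = """
[londiste]
job_name = test_to_subcriber
provider_db = dbname=provider port=6000 host=127.0.0.1
subscriber_db = dbname=subscriber port=6000 host=127.0.0.1
pgq_queue_name = londiste.replika
logfile = ~/log/%(job_name)s.log
pidfile = ~/pid/%(job_name)s.pid
"""

def load_config(text):
    """Parse the [londiste] section into a dict of expanded values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    # Reading values through the section proxy applies interpolation.
    return dict(parser["londiste"])

cfg = load_config(SAMPLE)
```

With this sketch, `cfg["logfile"]` comes out as `~/log/test_to_subcriber.log`, so renaming the job changes the log file and pidfile paths together. The `~` is not expanded by the parser.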
diff --git a/doc/overview.txt b/doc/overview.txt
index 3dee45db..447ab383 100644
--- a/doc/overview.txt
+++ b/doc/overview.txt
@@ -30,30 +30,45 @@ Replication engine written in Python. It uses PgQ as transport mechanism.
Its main goals are robustness and easy usage. Thus its not as complete
and featureful as Slony-I.
-Docs: ./LondisteUsage
+[http://pgsql.tapoueh.org/londiste.html Tutorial] written by Dimitri Fontaine.
+
+Reference Docs: ./LondisteReference
''' Features '''
* Tables can be added one-by-one into set.
* Initial COPY for one table does not block event replay for other tables.
* Can compare tables on both sides.
+ * Supports sequences.
* Easy installation.
''' Missing features '''
- * No support for sequences. Thus its not possible to use it for keeping
- failover server up-to-date. We use WalMgr for that.
-
* Does not understand cascaded replication, when one subscriber acts
as provider to another one and it dies, the last one loses sync with the first one.
In other words - it understands only pair of servers.
''' Sample usage '''
{{{
+## install pgq on provider:
+$ pgqadm.py provider_ticker.ini install
+
+## run ticker on provider:
+$ pgqadm.py provider_ticker.ini ticker -d
+
+## install Londiste in provider
$ londiste.py replic.ini provider install
+
+## install Londiste in subscriber
$ londiste.py replic.ini subscriber install
+
+## start replication daemon
$ londiste.py replic.ini replay -d
+
+## activate tables on provider
$ londiste.py replic.ini provider add users orders
+
+## add tables to subscriber
$ londiste.py replic.ini subscriber add users
}}}
diff --git a/doc/pgq-sql.txt b/doc/pgq-sql.txt
index 9414594e..4e662f9d 100644
--- a/doc/pgq-sql.txt
+++ b/doc/pgq-sql.txt
@@ -188,4 +188,3 @@ When all done, notify core about it:
{{{
select pgq.finish_batch(batch_id)
}}}
-