author | Marko Kreen | 2007-07-24 22:19:52 +0000 |
---|---|---|
committer | Marko Kreen | 2007-07-24 22:19:52 +0000 |
commit | aa311a2891db4088f7054eb7eb3f7d7187497f41 (patch) | |
tree | 140c57717153bf62b40f3cc4a067b5d1cbb1a405 | |
parent | 72b755e34829c98c3ac5da23797a17539fae7253 (diff) |
doc update
-rw-r--r-- | doc/Makefile | 3 |
-rw-r--r-- | doc/TODO.txt | 77 |
-rw-r--r-- | doc/londiste.ref.txt | 261 |
-rw-r--r-- | doc/overview.txt | 23 |
-rw-r--r-- | doc/pgq-sql.txt | 1 |
5 files changed, 324 insertions, 41 deletions
diff --git a/doc/Makefile b/doc/Makefile
index 069781d4..1bf04d29 100644
--- a/doc/Makefile
+++ b/doc/Makefile
@@ -10,7 +10,8 @@ all:
 
 upload:
 	devupload.sh overview.txt $(wiki)
-	devupload.sh londiste.txt $(wiki)/LondisteUsage
+	#devupload.sh londiste.txt $(wiki)/LondisteUsage
+	devupload.sh londiste.ref.txt $(wiki)/LondisteReference
 	devupload.sh pgq-sql.txt $(wiki)/PgQdocs
 	devupload.sh pgq-nodupes.txt $(wiki)/PgqNoDupes
 	devupload.sh walmgr.txt $(wiki)/WalMgr
diff --git a/doc/TODO.txt b/doc/TODO.txt
index a1600946..8628ee91 100644
--- a/doc/TODO.txt
+++ b/doc/TODO.txt
@@ -1,58 +1,65 @@
-web:
-  - walmgr
-  - pgqadm
-  - todo
-
-
-londiste link <qname>
-londiste unlink <qname>
-londiste add --skip-truncate (needs new field in table_state)
-
-pgqadm reg-copy que cons1 cons2
-pgqadm reg-move que cons1 cons2
+High-prority
+============
 
-queue-rename
-show-batch-events
-del-event
+* docs: londiste, pgq/python, pgq/sql, skytools, walmgr
 
-Immidiate
-=========
+Larger things
+-------------
+* chained replication, switchover
+* Write PgQ triggers in C
 
-* londiste swithcover support / deny triggers
-* deb: /etc/skylog.ini should be conffile
-* RemoteConsumer/SerialConsumer/pgq_ext sanity, too much duplication
+Smaller things
+--------------
 * londiste * remove tbl should work also if table is already dropped
-
+* RemoteConsumer/SerialConsumer/pgq_ext sanity, too much duplication
 * backend modules need to be ported to 8.3
+* pgqadm: separate priod for retry queue processing
+* londiste: create tables on subscriber
+* pgqadm: Utility commands:
+    reg-copy que cons1 cons2
+    reg-move que cons1 cons2
+    queue-rename
+    show-batch-events
+    del-event
 
-Near future
+Low-priority
 ============
 
-* pgqadm: separate priod for retry queue processing
-* londiste: create tables on subscriber
+Larger things
+-------------
+* denytriggers on subscriber
+* Quote SQL identifiers, keep combined name, rule will be "Split schema as first dot"
+* londiste: good fkey support:
+  store them in subscriber db
and apply when both tables are in sync.
+* skylog/logdb: publish sample logdb schema, with some tools
+* londiste: allow table redirection on subscriber side
+
+Smaller things
+--------------
 * skytools: switch for silence for cron scripts
-* docs: londiste, pgq/python, pgq/sql, skytools
-* txid: decide on renaming functions
-* logtriga: way to switch off logging for some connection
-* pgq: separately installable fkeys for all tables for testing
+* pgq: drop_fkeys.sql for live envs
 * logdb: hostname
-* pgq_ext: solve event tracking
 * contrib/*.sql loading from python - need to check db version
+* DBScript: failure to write pidfile should be logged (crontscripts)
 * ideas from SlonyI:
   - force timestamps to ISO
   - when buffering queries, check their size
-* DBScript: failure to write pidfile should be logged (crontscripts)
-* logtriga/textbuf.c should be converted to StringInfo to get rid of
-  buffer management code.
 
 Just ideas
 ===========
 
-* logtriga: use pgq.insert_event_directly?
-* pgq/sql: rewrite pgq.insert_event in C or logtriga in plpython?
 * skytools: config-less operation?
 * skytools: config from database?
 * skytools: partial sql parser for log processing
-* pgqadm ticker logic into db, to make easier other implementations?
+
+Dropped
+-------
+* txid: decide on renaming functions
+
+walmgr
+======
+
+- copy master config to slave
+- slave needs to decide which config to use
diff --git a/doc/londiste.ref.txt b/doc/londiste.ref.txt
new file mode 100644
index 00000000..9e4ee368
--- /dev/null
+++ b/doc/londiste.ref.txt
@@ -0,0 +1,261 @@
+
+[[TableOfContents]]
+
+= Notes =
+
+== PgQ daemon ==
+
+Londiste runs as a consumer on PgQ. Thus `pgqadm.py ticker` must be running
+on provider database.
+
+== Table Names ==
+
+Londiste internally uses table names always fully schema-qualified.
+If table name without schema is given on command line, it just
+puts "public." in front of it, without looking at search_path.
+
+== PgQ events ==
+
+''' Table change event '''
+
+Those events will be inserted by triggers on tables.
+
+ * ev_type = 'I' / 'U' / 'D'
+ * ev_data = partial SQL statement - the part between `[]` is removed:
+   * `[ INSERT INTO table ] (column1, column2) values (value1, value2)`
+   * `[ UPDATE table SET ] column2=value2 WHERE pkeycolumn1 = value1`
+   * `[ DELETE FROM table WHERE ] pkeycolumn1 = value1`
+ * ev_extra1 = table name with schema
+
+Such partial SQL format is used for 2 reasons - to conserve space
+and to make possible to redirect events to another table.
+
+''' Registration change event '''
+
+Those events will be inserted by `provider add` and `provider remove`
+commands. Then full registered tables list will be sent to the queue
+so subscribers can update their own registrations.
+
+ * ev_type = 'T'
+ * ev_data = comma-separated list of table names.
+
+Currently subscribers only remove tables that were removed from provider.
+In the future it's possible to make subscribers also automatically add
+tables that were added on provider.
+
+== log file ==
+
+Londiste normal log consist just of statistics log-lines, key-value
+pairs between `{}`. Their meaning:
+
+ * count: how many event was in batch.
+ * ignored: how many of them was ignores - table not registered on subscriber or not yet in sync.
+ * duration: how long the batch processing took.
+
+= Commands for managing provider database =
+
+== provider install ==
+
+{{{
+londiste.py <config.ini> provider install
+}}}
+
+Installs code into provider and subscriber database and creates queue.
+Equivalent to doing following by hand:
+
+{{{
+CREATE LANGUAGE plpgsql;
+CREATE LANGUAGE plpython;
+\i .../contrib/txid.sql
+\i .../contrib/pgq.sql
+\i .../contrib/londiste.sql
+select pgq.create_queue(queue name);
+}}}
+
+Notes:
+
+ * If the PostgreSQL modules are not installed on same machine
+   the Python scripts are, the commands need to be done by hand.
+
+ * The schema/tables are installed under user Londiste is configured to run.
+   If you prefer to run Londiste under non-admin user, they should also
+   be installed by hand.
+
+== provider add ==
+
+{{{
+londiste.py <config.ini> provider add <table name> ...
+}}}
+
+Registers table on provider database and adds trigger to the table
+that will send events to the queue.
+
+== provider remove ==
+
+{{{
+londiste.py <config.ini> provider remove <table name> ...
+}}}
+
+Unregisters table on provider side and removes triggers on table.
+The event about table removal is also sent to the queue, so
+all subscriber unregister table from their end also.
+
+== provider tables ==
+
+{{{
+londiste.py <config.ini> provider tables
+}}}
+
+Shows registered tables on provider side.
+
+== provider seqs ==
+
+{{{
+londiste.py <config.ini> provider seqs
+}}}
+
+Shows registered sequences on provider side.
+
+= Commands for managing subscriber database =
+
+== subscriber install ==
+
+{{{
+londiste.py <config.ini> subscriber install
+}}}
+
+Installs code into subscriber database.
+Equivalent to doing following by hand:
+
+{{{
+CREATE LANGUAGE plpgsql;
+\i .../contrib/londiste.sql
+}}}
+
+This will be done under Londiste user, if the tables should be
+owned by someone else, it needs to be done by hand.
+
+== subscriber add ==
+
+{{{
+londiste.py <config.ini> subscriber add <table name> ... [--excect-sync | --skip-truncate | --force]
+}}}
+
+Registers table on subscriber side.
+
+Switches
+
+ * --excect-sync: Table is tagged as in-sync so initial COPY is skipped.
+ * --skip-truncate: When doing initial COPY, don't remove old data.
+ * --force: Ignore table structure differences.
+
+== subscriber remove ==
+
+{{{
+londiste.py <config.ini> subscriber remove <table name> ...
+}}}
+
+Unregisters the table from subscriber. No events will be applied
+to the table anymore. Actual table will not be touched.
+
+== subscriber resync ==
+
+{{{
+londiste.py <config.ini> subscriber resync <table name> ...
+}}}
+
+Tags tables are "not synced." Later replay process will notice this
+and launch `copy` process to sync the table again.
+
+= Replication commands =
+
+== replay ==
+
+The actual replication process. Should be run as daemon with `-d` switch,
+because it needs to be always running.
+
+It main task is to get a batches from PgQ and apply them in one transaction.
+
+Basic logic:
+ * Get batch from PgQ queue on provider. See if it is already applied to
+   subsciber, skip the batch in that case.
+ * Management actions, can do transactions on subscriber:
+   * Load table state from subscriber, to be up-to-date on registrations
+     and `copy` processes running in parallel.
+   * If a `copy` process wants to give table over to main process,
+     wait until `copy` process catches-up.
+   * If there is a table that is not synced and no `copy` process
+     is already running, launch new `copy` process.
+   * If there are sequences registered on subscriber, look latest state
+     of them on provider and apply it to subscriber.
+ * Event replay, all in one transaction on subscriber:
+   * Apply events from the batch, only for tables that are registered
+     on subscriber and are in sync.
+   * Store tick_id on subscriber.
+
+== copy ==
+
+Internal command for initial SYNC. Launched by `replay` if it notices
+that some tables are not in sync. The reason to do table copying in
+separate process is to avoid locking down main replay process for
+lond time.
+
+Basic logic:
+ * Register on the same queue in parallel with different name.
+ * One transaction on :
+   * Drop constraints and indexes.
+   * Truncate table.
+   * COPY data in.
+   * Restore constraints and indexes.
+ * Tag the table as `catching-up`.
+ * When catching-up, the `copy` process acts as regular
+   `replay` process but just for one table.
+ * When it reaches queue end, when no more batches are immidiately
+   available, it hands the table over to main `replay` process.
+
+= Utility commands =
+
+== repair ==
+
+it tries to achieve a state where tables should be in sync and then compares
+them and writes out SQL statements that would fix differences.
+
+Syncing happens by locking provider tables against updates and then waiting
+unitl `replay` has applied all pending changes to subscriber database. As this
+is dangerous operation, it has hardwired limit of 10 seconds for locking. If
+`replay process does not catch up in that time, locks are releases and operation
+is canceled.
+
+Comparing happens by dumping out table from both sides, sorting them and
+then comparing line-by-line. As this is CPU and memory-hungry operation,
+good practice is to run the `repair` command on third machine, to avoid
+consuming resources on neither provider nor subscriber.
+
+== compare ==
+
+it syncs tables like repair, but just runs SELECT count(*) on both sides,
+to get a little bit cheaper but also less precise way of checking
+if tables are in sync.
+
+= Config file =
+
+{{{
+[londiste]
+job_name = test_to_subcriber
+
+# source database, where the queue resides
+provider_db = dbname=provider port=6000 host=127.0.0.1
+
+# destination database
+subscriber_db = dbname=subscriber port=6000 host=127.0.0.1
+
+# the queue where to listen on
+pgq_queue_name = londiste.replika
+
+# where to log
+logfile = ~/log/%(job_name)s.log
+
+# pidfile is used for avoiding duplicate processes
+pidfile = ~/pid/%(job_name)s.pid
+
+}}}
diff --git a/doc/overview.txt b/doc/overview.txt
index 3dee45db..447ab383 100644
--- a/doc/overview.txt
+++ b/doc/overview.txt
@@ -30,30 +30,45 @@ Replication engine written in Python. It uses PgQ as transport mechanism.
 Its main goals are robustness and easy usage. Thus its not as complete and featureful as Slony-I.
-Docs: ./LondisteUsage
+[http://pgsql.tapoueh.org/londiste.html Tutorial] written by Dimitri Fontaine.
+
+Reference Docs: ./LondisteReference
 
 ''' Features '''
 
  * Tables can be added one-by-one into set.
  * Initial COPY for one table does not block event replay for other tables.
  * Can compare tables on both sides.
+ * Supports sequences.
  * Easy installation.
 
 ''' Missing features '''
 
- * No support for sequences. Thus its not possible to use it for keeping
-   failover server up-to-date. We use WalMgr for that.
-
  * Does not understand cascaded replication, when one subscriber acts as provider to another one and it dies, the last one loses sync with the first one. In other words - it understands only pair of servers.
 
 ''' Sample usage '''
 {{{
+## install pgq on provider:
+$ pgqadm.py provider_ticker.ini install
+
+## run ticker on provider:
+$ pgqadm.py provider_ticker.ini ticker -d
+
+## install Londiste in provider
 $ londiste.py replic.ini provider install
+
+## install Londiste in subscriber
 $ londiste.py replic.ini subscriber install
+
+## start replication daemon
 $ londiste.py replic.ini replay -d
+
+## activate tables on provider
 $ londiste.py replic.ini provider add users orders
+
+## add tables to subscriber
 $ londiste.py replic.ini subscriber add users
 }}}
 
diff --git a/doc/pgq-sql.txt b/doc/pgq-sql.txt
index 9414594e..4e662f9d 100644
--- a/doc/pgq-sql.txt
+++ b/doc/pgq-sql.txt
@@ -188,4 +188,3 @@ When all done, notify core about it:
 {{{
 select pgq.finish_batch(batch_id)
 }}}
-
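
Note on the table change event format added in doc/londiste.ref.txt: the partial SQL in `ev_data` only becomes a full statement once the removed prefix is put back in front of it. A minimal Python sketch of that step follows, assuming the three prefixes are exactly as listed in the reference text. The function `rebuild_sql` and its `dest_table` parameter are hypothetical names used for illustration and are not taken from Londiste's code; identifiers are not quoted here.

```python
# Hypothetical helper, not part of Londiste: rebuild a full SQL statement
# from a table change event as described in doc/londiste.ref.txt.

# The statement prefix that the trigger strips from ev_data, per event type.
_PREFIX = {
    "I": "INSERT INTO {table} ",
    "U": "UPDATE {table} SET ",
    "D": "DELETE FROM {table} WHERE ",
}


def rebuild_sql(ev_type, ev_data, ev_extra1, dest_table=None):
    """Return the full statement for an 'I', 'U' or 'D' event.

    ev_extra1 carries the schema-qualified source table name.
    dest_table (an assumption of this sketch) redirects the event
    to another table, which the partial format is meant to allow.
    """
    if ev_type not in _PREFIX:
        raise ValueError("not a table change event: %r" % (ev_type,))
    table = dest_table or ev_extra1
    # No identifier quoting is done; table names are used as given.
    return _PREFIX[ev_type].format(table=table) + ev_data


if __name__ == "__main__":
    print(rebuild_sql("I", "(id, name) values (1, 'bob')", "public.users"))
    print(rebuild_sql("D", "id = 1", "public.users", dest_table="archive.users"))
```

Because the table name is kept out of `ev_data`, redirecting an event only means passing a different table name; the event payload itself is left untouched.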