TWiki
>
Main Web
>
DeveloperInformation
>
DatabaseReplication
(2007-03-18,
SteveMarple
)
(raw view)
E
dit
A
ttach
%TOC% ---+ Database Replication %ICON{bubble}% I propose GAIA implements its own method of _lazy replication_ as described below as the options described in DatabaseReplicationOptions do not seem appropriate for GAIA. -- Main.SteveMarple - 16 Mar 2007 ---++ Assumptions * One master database. * Many read-only slave databases. * Slave databases are connected to the master database with a wide area network (WAN), so assume network is slow and unreliable. ---++ Requirements * Must be able to update at any time. * Must scale to many hosts (at least 50). * Must tolerate network outages. * Update must not interrupt any connections already using the database. * Must be able to change the master database (expect only infrequent changes). ---++ Implementation Use a form of _lazy replication_, making use of =rsync= to transfer database dumpfiles to the slave database servers. * Master database periodically creates a dump file. Create new files each time and leave some old files present so that any existing =rsync= operations are not broken. Remove files dumpfiles older than a certain age. * The name of the master host should be defined outside of the database (eg by a DNS hostname) otherwise if the master database goes down we have no automatic way to propagate the details of change of database master.<br> %ICON{bubble}% If the master goes down, how do we decide who gets to be the master? -- Main.MikkoSyrjaesuo - 16 Mar 2007<br>%ICON{bubble}% Someone with appropriate authority should change the DNS entry so that =master.gaia-vxo.org= points to a suitable host. It shouldn't matter which host is chosen, (as long as it has a reliable network connection) and isn't affected by any of the [[#Restrictions][restrictions]] on which hosts can be the master.-- Main.SteveMarple - 17 Mar 2007 * The slave database server uses a web service to obtain the name of the latest dumpfile and initiates =rsync= to update the last dumpfile it obtained. * The slave database server creates a new database (eg gaia_YYYYMMDDhhmmss) and initialises it with the latest dumpfile. Any local database configuration scripts are then run (eg to install stored procedures which may not contained in the dumpfile). The slave database server then updates its record of which database the GAIA PHP code should access. * The slave database can then delete at its leisure any old GAIA databases. =DROP= on databases which are in use will fail, that is not a problem and will not break any existing connections; all old databases should get dropped on either the first or second attempt.<br> %ICONURL{todo}% Investigate the effect PHP's persistent connections will have. As long as persistent connections timeout after a given period they should not cause any problems. ---+++ Notes * The first thing the master script(s) should do is to test if the host holds the master database. If it *does not* then quit. * The first thing the slave script(s) should do is to test if the host holds the master database. If it *does* then quit. These actions should allow any host to swap between master and slave modes. ---++ Restrictions * =rsyncd= must be available on the master node so that the slave nodes can download the database dump file. This poses a restriction on which hosts should be selected if the master node is taken offline.
E
dit
|
A
ttach
|
P
rint version
|
H
istory
: r4
<
r3
<
r2
<
r1
|
B
acklinks
|
V
iew topic
|
Ra
w
edit
|
M
ore topic actions
Topic revision: r4 - 2007-03-18 - 04:57:07 -
SteveMarple
Main
Log In
or
Register
Main Web
Create New Topic
Index
Search
Changes
Notifications
RSS Feed
Statistics
Preferences
Webs
Main
Sandbox
TWiki
Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki?
Send feedback