summaryrefslogtreecommitdiff
path: root/README
diff options
context:
space:
mode:
authorJoerg Jaspert <joerg@debian.org>2008-11-05 23:49:58 +0100
committerJoerg Jaspert <joerg@debian.org>2008-11-05 23:49:58 +0100
commitfb9943e7fbf1e8ffc83f0ea135ac9f75f44c2b97 (patch)
tree1bc4819e8468f49cad680acde4671b74f766b121 /README
parent2fb0a787b8d2f3ff2dc7bc5335ab937f9b88ad3b (diff)
readme
add pretty large readme Signed-off-by: Joerg Jaspert <joerg@debian.org>
Diffstat (limited to 'README')
-rw-r--r--README216
1 files changed, 202 insertions, 14 deletions
diff --git a/README b/README
index 0adab69..9687b85 100644
--- a/README
+++ b/README
@@ -1,23 +1,211 @@
Archvsync
=========
-This is the centralized archvsync configuration. It is a simple git
-repository that is currently hosted on ftp-master.debian.org AKA ries.
+This is the central repository for the Debian mirror scripts. The scripts
+in this repository are written for the purposes of maintaining a Debian
+archive mirror (and shortly, a Debian bug mirror), but they should be
+easily generalizable.
-Use the following conventions in here:
+Currently the following scripts are available:
- - As much as possible should be in the master branch (and as such
- affect every machine).
+ * ftpsync - Used to sync an archive using rsync
+ * runmirrors - Used to notify leaf nodes of available updates
+ * dircombine - Internal script to manage the mirror user's $HOME
+ on debian.org machines
+ * typicalsync - Generates a typical Debian mirror
+ * udh - We are lazy, just a shorthand to avoid typing the
+ commands, ignore... :)
- - Use machine specific branches for those things that *have* to be
- machine specific. Examples for such cases are the
- .ssh/authorized_keys files as well as the actual list of hosts to
- push to / mirror from.
+Usage
+=====
+For impatient people, short usage instruction:
+ - Create a dedicated user for the whole mirror.
+ - Create a seperate directory for the mirror, writeable by the new user.
+ - Place the ftpsync script in the mirror user's $HOME/ (or $HOME/bin)
+ - Place the ftpsync.conf.sample into $HOME/etc as ftpsync.conf and edit
+ it to suit your system. You should at the very least change the TO=
+ line.
+ - Setup .ssh/authorized_keys for the mirror user and place the public key of
+ your upstream mirror into it. Preface it with
+no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty,command="~/bin/ftpsync",from="IPADDRESS"
+ and replace $IPADDRESS with that of your upstream mirror.
+ - You are finished
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
-!!! NOTE: Do NOT store ssh private keys or any kind of passwords in !!!
-!!! here. The existance of etc/secrets is just for the existance of the !!!
-!!! directory. Nothing more. !!!
-!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
+In order to receive different pushes or syncs from different archives,
+name the config file ftpsync-$ARCHIVE.conf and call the ftpsync script
+with the commandline "sync:archive:$ARCHIVE". Replace $ARCHIVE with a
+sensible value.
+
+
+
+Debian mirror script minimum requirements
+=========================================
+As always, you may use whatever scripts you want for your Debian mirror.
+However, if you want to be listed as a Primary mirror it must support
+the following minimal functionality:
+
+ - Must perform a 2-stage sync
+ The archive mirroring must be done in 2 stages. The first rsync run
+ must ignore the index files. The correct exclude options for the
+ first rsync run are:
+ --exclude Packages* --exclude Sources* --exclude Release* --exclude ls-lR*
+ The first stage must not delete any files.
+
+ The second stage should then transfer the above excluded files and
+ delete files that no longer belong on the mirror.
+
+ Rationale: If archive mirroring is done in a single stage, there will be
+ periods of time during which the index files will reference files not
+ yet mirrored.
+
+ - Must not ignore pushes whil(e|st) running.
+ If a push is received during a run of the mirror sync, it MUST NOT
+ be ignored. The whole synchronization process must be rerun.
+
+ Rationale: Most implementations of Debian mirror scripts will leave the
+ mirror in an inconsistent state in the event of a second push being
+ received while the first sync is still running. It is likely that in
+ the near future, the frequency of pushes will increase.
+
+ - Should understand multi-stage pushes.
+ The script should parse the arguments it gets via ssh, and if they
+ contain a hint to only sync stage1 or stage2, then ONLY those steps
+ SHOULD be performed.
+
+ Rationale: This enables us to coordinate the timing of the first
+ and second stage pushes and minimize the time during which the
+ archive is desynchronized. This is especially important for mirrors
+ that are involved in a round robin or GeoDNS setup.
+
+ The minimum arguments the script has to understand are:
+ sync:stage1 Only sync stage1
+ sync:stage2 Only sync stage2
+ sync:all Do everything. Default if none of stage1/2 are
+ present.
+ There are more possible arguments, for a complete list see the
+ ftpsync script in our git repository.
+
+
+
+ftpsync
+=======
+
+This script is based on the old anonftpsync script. It has been rewritten
+to add flexibilty and fix a number of outstanding issues.
+
+Some of the advantages of the new version are:
+ - Nearly every aspect is configurable
+ - Correct support for multiple pushes
+ - Support for multi-stage archive synchronisations
+ - Support for hook scripts at various points
+ - Support for multiple archives, even if they are pushed using one ssh key
+
+ Correct support for multiple pushes
+ -----------------------------------
+ When the script receives a second push while it is running and syncing
+ the archive it won't ignore it. Instead it will rerun the
+ synchronisation step to ensure the archive is correctly synchronised.
+
+ Scripts that fail to do that risk ending up with an inconsistent archive.
+
+
+ Can do multi-stage archive synchronisations
+ -------------------------------------------
+ The script can be told to only perform the first or second stage of the
+ archive synchronisation.
+
+ This enables us to send all the binary packages and sources to a
+ number of mirrors, and then tell all of them to sync the
+ Packages/Release files at once. This will keep the timeframe in which
+ the mirrors are out of sync very small and will greatly help things like
+ DNS RR entries or even the planned GeoDNS setup.
+
+
+ Can run hook scripts
+ --------------------
+ ftpsync currently allows 5 hook scripts to run at various points of the
+ mirror sync run.
+
+ Hook1: After lock is acquired, before first rsync
+ Hook2: After first rsync, if successful
+ Hook3: After second rsync, if successful
+ Hook4: Right before leaf mirror triggering
+ Hook5: After leaf mirror trigger (only if we have slave mirrors; HUB=true)
+
+ Note that Hook3 and Hook4 are likely to be called directly after each other.
+ The difference is that Hook3 is called *every* time the second rsync
+ succeeds even if the mirroring needs to re-run due to a second push.
+ Hook4 is only executed if mirroring is completed.
+
+
+ Support for multiple archives, even if they are pushed using one ssh key
+ ------------------------------------------------------------------------
+ If you get multiple archives from your upstream mirror (say Debian,
+ Debian-Backports and Volatile), previously you had to use 3 different ssh
+ keys to be able to automagically synchronize them. This script can do it
+ all with just one key, if your upstream mirror tells you which archive.
+ See "Commandline/SSH options" below for further details.
+
+
+For details of all available options, please see the extensive documentation
+in the sample configuration file.
+
+
+Commandline/SSH options
+=======================
+Script options may be set either on the local command line, or passed by
+specifying an ssh "command". Local commandline options always have
+precedence over the SSH_ORIGINAL_COMMAND ones.
+
+Currently this script understands the options listed below. To make them
+take effect they MUST be prepended by "sync:".
+
+Option Behaviour
+stage1 Only do stage1 sync
+stage2 Only do stage2 sync
+all Do a complete sync (default)
+archive:foo Sync archive foo (if the file $HOME/etc/ftpsync-foo.conf
+ exists and is configured)
+callback Call back when done (needs proper ssh setup for this to
+ work). It will always use the "command" callback:$HOSTNAME
+ where $HOSTNAME is the one defined in config and
+ will happen before slave mirrors are triggered.
+
+So, to get the script to sync all of the archive behind bpo and call back when
+it is complete, use an upstream trigger of
+ssh $USER@$HOST sync:all sync:archive:bpo sync:callback
+
+
+runmirrors
+==========
+This script is used to tell leaf mirrors that it is time to synchronize
+their copy of the archive. This is done by parsing a mirror list and
+using ssh to "push" the leaf nodes. You can read much more about the
+principle behind the push at [1], essentially it tells the receiving
+end to run a pre-defined script. As the whole setup is extremely limited
+and the ssh key is not usable for anything else than the pre-defined
+script this is the most secure method for such an action.
+
+This script supports two types of pushes: The normal single stage push,
+as well as the newer multi-stage push.
+
+The normal push, as described above, will simply push the leaf node and
+then go on with the other nodes.
+
+The multi-staged push first pushes a mirror and tells it to only do a
+stage1 sync run. Then it waits for the mirror (and all others being pushed
+in the same run) to finish that run, before it tells all of the staged
+mirrors to do the stage2 sync.
+
+This way you can do a nearly-simultaneous update of multiple hosts.
+This is useful in situations where periods of desynchronization should
+be kept as small as possible. Examples of scenarios where this might be
+useful include multiple hosts in a DNS Round Robin entry.
+
+For details on the mirror list please see the documented
+runmirrors.mirror.sample file.
+
+
+[1] http://blog.ganneff.de/blog/2007/12/29/ssh-triggers.html