Current version: 0.8 (2004.02.15)
sync2cd is an incremental archiving tool. It allows backing up complete filesystem hierarchies to multiple backup media (e.g. CD-R). Files are archived incrementally, i.e. only new or changed files are stored during an archive operation.
All entity types are supported: directories, files, symlinks, named pipes, sockets, block and character devices.
sync2cd is released under the GNU General Public License. The current version can be downloaded here.
sync2cd requires at least Python 2.2, and provides an installer based on distutils. This means that installation to the default location (/usr) is as simple as:
python setup.py install
To install sync2cd to a specific location, e.g. /usr/local, enter:
python setup.py install --prefix=/usr/local
sync2cd has been tested with Python 2.2.2 and 2.3.3 on Linux. It might work with other version and platform combinations, YMMV. If you were able to make it work with another configuration, please drop me a line (and, if necessary, send me a patch ;-)
Basic usage information is output with sync2cd.py --help:
[joe@hobbes sync2cd]$ ./sync2cd.py --help
sync2cd.py 0.8    Synchronization to CD-R
Copyright 2003-2004 Remy Blank

Usage: sync2cd.py [commands] [options] ConfigFile

Commands:
  -c, --create            Create a new archive descriptor
  -g, --graft-list        Output a graft list for an archive
  -h, --help              Show this text
  -p, --print             Print archive information
  -s, --status            Print current synchronization status

Options:
  -a N, --archive N       Operate on archive number N
  -m N, --medium-size N   Set archive medium size to N
  -v, --verbose           Be more verbose
Commands define what will be done and what will be output to stdout. Several commands can be specified at the same time, and will be executed in a sensible order (e.g. --create before --graft-list).
Options allow passing arguments to the selected commands. If the same option is specified on the command line and in the configuration file, the command line takes precedence.
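-a N, --archive N
Operate on archive number N. Note that this option will have no effect with --create and will be overridden by the newly created archive.
-g, --graft-list
Output a graft list for an archive, for use by mkisofs with the -graft-points option.
-m N, --medium-size N
Set archive medium size to N. Corresponds to the function MediumSize() in the configuration file.
-p, --print
Print archive information. If --verbose is also specified, the list of files contained in the archive is also output.
-s, --status
Print current synchronization status. If --verbose is also specified, the list of files that need to be archived is also output.
Here are a few basic examples: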
Create a new archive and burn a CD-R (insert a blank disk):
sync2cd.py -c -g pictures | mkisofs -J -r -graft-points -path-list - -quiet | cdrecord -v -waiti -data -
Check how much data will be stored on the next archive:
sync2cd.py -s pictures
Print the contents of archive number 2:
sync2cd.py -p -a 2 -v pictures
The configuration file is actually Python code calling functions defined in sync2cd.py and passing configuration information. The functions available are described below.
BaseDir()
Set the current working directory to Dir before starting. All paths specified with Input() are relative to this directory. This option corresponds to the -C or --directory option of tar.
Default: . (current directory)
Example: BaseDir("/home")
ExcludeGlob()
Exclude files matching the shell pattern Pattern from the archive. Several exclude patterns can be specified. The pattern matching is done against the path relative to BaseDir(). If a directory matches an exclude pattern, it is not recursed into.
As usual with shell patterns, a * wildcard matches zero or more characters except path separators (e.g. "/" on *nix). A new wildcard, **, matches zero or more characters, including path separators.
Example: ExcludeGlob("Music/Country/**.mp3")
This excludes all mp3 files in Music/Country and in all subdirectories.
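The difference between the two wildcards can be illustrated by translating a pattern into a regular expression. This is only a sketch of the matching rules described here, not sync2cd's own code: glob_to_regex and matches_glob are hypothetical names, the pattern is assumed to have to match the whole relative path, "/" is assumed to be the path separator, and all characters other than * are treated literally.

```python
import re

def glob_to_regex(pattern):
    """Translate a pattern using '*' and '**' into a regular expression."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")       # '**' may cross path separators
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")    # '*' stops at a path separator
            i += 1
        else:
            parts.append(re.escape(pattern[i]))  # everything else is literal
            i += 1
    return "".join(parts)

def matches_glob(pattern, path):
    """Assumption: the pattern must match the entire relative path."""
    return re.fullmatch(glob_to_regex(pattern), path, re.DOTALL) is not None

print(matches_glob("Music/Country/**.mp3", "Music/Country/Old/song.mp3"))   # True
print(matches_glob("Music/Playlists/*.m3u", "Music/Playlists/old/mix.m3u")) # False
```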
ExcludeRegexp()
Exclude files matching the regular expression Pattern from the archive. Several exclude patterns can be specified. The pattern matching is done against the path relative to BaseDir(). If a directory matches an exclude pattern, it is not recursed into.
For more information about regular expression syntax in Python, see this page.
Example: ExcludeRegexp("Music/Country/.*\\.mp3")
This excludes all mp3 files in Music/Country and in all subdirectories (note escaping of "\").
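The escaping rule is easiest to see in a short Python session. The sketch below shows that the doubled backslash in a normal string denotes a single backslash in the pattern, and how such a pattern could be tested against relative paths. The helper is_excluded is a hypothetical name, and matching the whole path (rather than a prefix or substring) is an assumption about sync2cd's behaviour.

```python
import re

# In a normal string literal the backslash must be doubled; in a raw string
# it is written once. Both spellings denote the same pattern.
pattern = "Music/Country/.*\\.mp3"
assert pattern == r"Music/Country/.*\.mp3"

def is_excluded(path, patterns):
    """Assumption: a path is excluded when some pattern matches it entirely."""
    return any(re.fullmatch(p, path) for p in patterns)

print(is_excluded("Music/Country/Old/song.mp3", [pattern]))  # True
print(is_excluded("Music/Country/notes.txt", [pattern]))     # False
```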
HashFunction()
Specify the hash function to be used to check files for content modification. Currently supported: md5 (128 bits), sha1 (160 bits).
Default: md5
Example: HashFunction("sha1")
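To give an idea of what checking a file for content modification involves, here is a minimal sketch using Python's standard hashlib module. It is not taken from sync2cd: hash_file and content_changed are hypothetical names, and comparing against a previously recorded hex digest is an assumption about how such a check might work.

```python
import hashlib

def hash_file(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file's content, reading it in chunks."""
    digest = hashlib.new(algorithm)          # "md5" or "sha1"
    with open(path, "rb") as stream:
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def content_changed(path, recorded_digest, algorithm="md5"):
    """Assumption: a file counts as modified when its digest differs."""
    return hash_file(path, algorithm) != recorded_digest
```

Reading in chunks keeps memory use constant even for large files such as videos.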
Input()
Add a file or directory to be archived. Several inputs can be specified. The use of a directory name always implies that the subdirectories below should be included in the archive. Path must be a relative path specification, and is interpreted relative to BaseDir().
Example: Input("Music")
MediumSize()
Set the maximum size of an archive to Value. This is typically used to span a backup over multiple media. Value is an integer giving the size in bytes, or a string containing a number optionally followed by one of the suffixes k, M, G, T, P or E.
Default: 0 (no limit)
Example: MediumSize("690M")
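The accepted values can be illustrated with a small parser. This is a sketch only: parse_size is a hypothetical name, the number is assumed to be a non-negative integer, and the suffixes are assumed to be binary multiples (k = 1024, M = 1024**2, and so on), which the text here does not specify.

```python
import re

# Assumption: each suffix is a further power of 1024.
SUFFIXES = "kMGTPE"

def parse_size(value):
    """Return a size in bytes from an integer or a string such as "690M"."""
    if isinstance(value, int):
        return value
    match = re.fullmatch(r"(\d+)([kMGTPE]?)", value.strip())
    if match is None:
        raise ValueError("invalid size: %r" % (value,))
    number, suffix = match.groups()
    exponent = SUFFIXES.index(suffix) + 1 if suffix else 0
    return int(number) * 1024 ** exponent

print(parse_size("690M"))  # 723517440 under the binary-multiple assumption
```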
Here are a few examples of configuration files.
Archive an mp3 collection:
MediumSize("690M")                  # Fit archives on CD-R
HashFunction("sha1")                # Use a good hash
BaseDir("/home")                    # cd to this directory
Input("Music")                      # Archive this tree
Exclude("Music/atrontc.vtc")        # Exclude generated files
Exclude("Music/Playlists/*.m3u")
Archive a digital photo and video collection. Note how backslashes are escaped in Python strings. For more information about regular expressions and Python strings, see this page.
MediumSize("690M")                  # Fit archives on CD-R
HashFunction("sha1")                # Use a good hash
BaseDir("/home/mirror/hobbes")      # cd to this directory
Input("pictures")                   # Archive photos
Input("videos")                     # and videos
# Exclude thumbnails and small versions that are generated
Exclude("pictures**/thumb/t_*.jpg")
ExcludeRegexp("pictures/([^/]+/)*small/s_[^/]+\\.jpg")
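Since a configuration file is ordinary Python code, one way a program could read it is to execute it in a namespace that exposes the configuration functions. The sketch below illustrates that idea and is not sync2cd's actual loader: load_config and the settings dictionary are hypothetical, and only three of the functions are stubbed out. The defaults follow the ones documented for BaseDir() and MediumSize().

```python
def load_config(source):
    """Run configuration text and collect the values it passes in.

    Hypothetical helper; the real loader may work differently.
    """
    settings = {"base_dir": ".", "inputs": [], "medium_size": 0}

    def BaseDir(path):
        settings["base_dir"] = path

    def Input(path):
        settings["inputs"].append(path)    # several inputs may be given

    def MediumSize(value):
        settings["medium_size"] = value

    namespace = {"BaseDir": BaseDir, "Input": Input, "MediumSize": MediumSize}
    exec(source, namespace)                # the config file is plain Python
    return settings


config_text = '''
MediumSize("690M")   # Fit archives on CD-R
BaseDir("/home")     # cd to this directory
Input("Music")       # Archive this tree
'''
print(load_config(config_text))
```

Because the file is executed, it can also contain comments, variables or loops, which is why string escaping follows Python's rules.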
(Not that so many people actually asked...)
I lost a backup medium. How will my incremental backup remain consistent?
Just remove the descriptor corresponding to the lost medium from the descriptor directory, and make a new backup. The files that were stored on the lost medium and still exist will be put on the new medium.
Only a subset of the files that need to be archived fit on a medium. How does sync2cd select which files are to be stored on the next archive?
The oldest files are stored on the next archive (based on the file modification time).
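An oldest-first selection could look roughly like the sketch below. It is an illustration under assumptions, not sync2cd's algorithm: select_for_next_archive is a hypothetical name, files are given as (path, modification time, size) tuples, and selection is assumed to stop at the first file that no longer fits rather than looking for smaller files further down the list.

```python
def select_for_next_archive(files, medium_size):
    """Pick files oldest first until the medium is full.

    files: iterable of (path, mtime, size) tuples.
    medium_size: capacity in bytes; 0 means no limit.
    """
    selected, used = [], 0
    for path, mtime, size in sorted(files, key=lambda f: f[1]):  # oldest first
        if medium_size and used + size > medium_size:
            break    # assumption: stop at the first file that does not fit
        selected.append(path)
        used += size
    return selected


pending = [("b.jpg", 200, 40), ("a.jpg", 100, 50), ("c.jpg", 300, 30)]
print(select_for_next_archive(pending, 100))  # ['a.jpg', 'b.jpg']
```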
Backup's good. Now, where's restore?
Well, the whole point of a backup is never to use it, right? ;-)
Seriously, restore is in the works. In the meantime, all files are stored in the original hierarchy on the CD/DVD, so that a simple cp can be used to restore single files or complete hierarchies. Note however that this will not restore file and directory attributes, and symbolic links will not be restored either.
The following features will be added to sync2cd as time permits.
If you are using or trying to use sync2cd, I would be happy to hear from you! I'm especially interested in the following:
In any case, just drop me an e-mail.