Here's a possible approach to creating a backup of a directory tree (your home directory would be a good place to start). First, it takes advantage of the
fact that the /tmp directory is writable by
everyone so that no special permissions are needed to create files
and directories there. By using /tmp you can manage all of this
without being root.
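As an aside, you can even let the system pick a unique directory name under /tmp so you don't collide with anything another user left there. mktemp isn't used in the scripts here, but it's handy for this sort of thing; the name template below is only an illustration:
# Create a private working directory under /tmp; no root privileges needed.
WORKDIR=$(mktemp -d /tmp/backup.XXXXXX)
echo "working in $WORKDIR"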
Next it uses the find command to
collect all the file names and the full path for each file and spit
them out to tar. The tar command (Tape Archiver) will create an archive file, optionally compress it, and store it in the /tmp directory. The -T - option tells tar to read that list of names from its standard input (and --no-recursion keeps it from adding each directory's contents a second time). The -X option instructs tar not to include any files or directories that match the patterns listed in the exclude file. This prevents archiving a bunch of useless stuff like the web browser's cache files, X desktop settings and other things that either change often or are specific to this machine. If
you want to archive everything then leave off the -X option.
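The exclude file itself is nothing fancy, just a list of patterns, one per line. The patterns below are only an illustration of the kind of clutter you might skip; tailor them to whatever lives in your own home directory (a trailing * makes each pattern also cover everything underneath that directory):
echo "./.netscape/cache*
./.mozilla/*/Cache*
./.kde/share/cache*
./tmp*" > exclude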
Notice that the script begins its
search in the current directory (.). It assumes that you will first
cd into this directory. The reason is so that tar won't include the full
path. This way you can un-tar it in any directory and it will
duplicate the structure of the current directory only. If you use a full path name like /home/acme, then the tar file
will preserve that path and when you un-tar the archive it will put
everything back into the same location. In short, it will overwrite
the existing directories on whatever machine it is un-tarred on. Best
bet, cd into the directory you want to
archive first.
find . -depth -print | tar -zcvf /tmp/save.tar.gz -X exclude --no-recursion -T -
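As a quick illustration of that relative-path point, restoring is just the reverse: cd into wherever you want the tree re-created and un-tar there. The target directory below is only an example:
mkdir /tmp/restore
cd /tmp/restore
tar -zxvf /tmp/save.tar.gz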
This next part will create a temporary
.netrc file (assuming that the script is
run in the home directory). This .netrc file is used to automatically log in to an ftp server someplace and, optionally, execute some
commands. Here the entire .netrc file is echoed into place by this
script.
Following the login is a command to
execute a "macro definition" (macdef)
with the name init. The init macro name is a special one that ftp understands to mean, “Do this right after a successful login.” So, it'll log you in and
immediately execute all the commands up to
the first blank line.
First I tell it to turn off prompting,
then I tell it to print a hash mark (#) for every 1 KB of data transferred.
Next it will cd into the www directory.
Finally it will upload (mput) the files I
archived above.
echo "machine acme.com login acme password whizbang
macdef
init
prompt
hash
cd
www
mput /tmp/saved.tar.gz
Finally I issue
the command to quit ftp and make sure to follow that with a blank
line so ftp will know that this ends the
macro definition. Since all of this was being echoed, redirect the
output to the .netrc file (I'm in my home directory of course).
quit

">.netrc
Now that there's a .netrc file I can go ahead and do the ftp ...
ftp acme.com
Let's remove the archive since it's not
needed any more.
rm /tmp/save.tar.gz
To finish up, I put another version of the .netrc file in place for other ftp sites ...
echo "
machine freon.net login will password pollywop
machine faraway.net login edna password pressing
machine ftpsite.com login beme password pulpie
macdef init
hash
prompt
cd www
ls

">.netrc
The script is an example of how you can automate transferring files to another machine, either for backup or because you want the files available for access elsewhere.
NOTE: Everything above that's shown as a command is part of a single shell script (or shell function), so you can just copy and paste it as is. Change the directory names, file names, host names and user names to match your setup.
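Pulled together, the whole first example might look something like the sketch below. It assumes your home directory is the tree being backed up, and the host, login, password and exclude file are the same placeholders used throughout this article; change them to match your setup:
#!/bin/sh
# Back up the home directory and upload the archive by ftp.
cd "$HOME"

# Collect the file names with find and hand them to tar on its standard
# input; anything matching a pattern in ./exclude is skipped.
find . -depth -print | tar -zcvf /tmp/save.tar.gz -X exclude --no-recursion -T -

# Build a .netrc that logs in and runs the init macro automatically.
echo "machine acme.com login acme password whizbang
macdef init
prompt
hash
cd www
mput /tmp/save.tar.gz
quit

" > .netrc
chmod 600 .netrc

# Do the transfer, then remove the local copy of the archive.
ftp acme.com
rm /tmp/save.tar.gz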
Another example adds some stuff to give you more control over what gets archived and where it goes:
if [ -d /tmp/saved ]
then
    rm -r /tmp/saved
fi
mkdir /tmp/saved
The first part above checks to see if the directory already exists and, if it does, removes it; either way it then re-creates it fresh. Testing first just prevents the shell from complaining if the directory doesn't exist when I try to remove it (there are other ways to handle this, of course).
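One of those other ways, for what it's worth, is to skip the test entirely and lean on options that keep the commands quiet:
# -f keeps rm from complaining if /tmp/saved doesn't exist yet,
# and -p keeps mkdir from complaining if it already does.
rm -rf /tmp/saved
mkdir -p /tmp/saved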
Next I check to see if an earlier version of the archive exists and delete it
if it does:
if [ -f /root/save.tar.gz ]
then
    rm /root/save.tar.gz
fi
Here again I use the find command to collect all the file
names along with the full path for each ...
find /root -depth |cpio -pdv /tmp/saved
This time I use cpio with the -p (pass-through) option to copy the files named by the find command into the directory I created in the previous step above. The find command just returns file names with their path information and cpio copies the named files someplace else. With this command line you can
copy entire directory trees to other locations. Next I cd
into another directory and repeat the command,
cd /home
find acme -depth |cpio -pdv /tmp/saved
By first changing to the parent directory of the one I want to copy, the find command returns only the relative path information for the files I'm copying. The advantage in this case is
that the resulting directory structure will only have the stuff I
care about. Since I'm copying to the /tmp/saved directory I want the
shortest paths possible:
/tmp/saved/acme
Without the cd first, the path would be:
/tmp/saved/home/acme
I don't need that extra directory level, hence the cd command. Next I
cd to the /tmp/saved directory where I just copied everything. I will
create the compressed archive here for the same reason as before: to
simplify the path information.
cd /tmp/saved
tar -zcvf /root/save.tar.gz -X /root/exclude .
Notice that the archive was actually created in the /root directory. Now that I
have the archive, I can cd to where it's located, delete the /tmp/saved directory, and recover the disk space:
cd /root
rm -rf /tmp/saved
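Before counting on the archive, it doesn't hurt to sanity-check its contents. This listing step is just a suggestion, not part of the script itself:
# List what went into the archive without extracting anything.
tar -ztvf /root/save.tar.gz | less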
Now we can do something different with the archive file we've just
created. Earlier we used ftp and the .netrc file
to automatically upload the archive to another machine somewhere.
This time we'll try something else.
Let's suppose that we created this archive script on another machine and
now we need to execute the script on that
machine. Rather than logging in over there and running the script,
try this:
ssh other-host.work.net /root/cleanup
sleep 120
scp other-host.work.net:save.tar.gz /root
What this does is connect to the remote host and execute the archive
script (cleanup) there. This will do all the stuff there we talked
about above, only this time the archive is created on the other machine. Since ssh doesn't return until the remote script finishes, the two-minute sleep is really just a safety margin before we copy the archive from the remote machine to this machine. There are other ways to do this but
the idea here is to present a couple of uses for ssh.
The ssh utilities let you connect to other machines
through an encrypted connection.
This means that no one can see what you're doing. While it's still
possible for someone to intercept the data you're transmitting and
receiving, they can't make any sense of it because it's encrypted and
will look like indecipherable garbage. Only the machine holding the matching key can decrypt the data. Encrypted ssh connections are about all most networks allow these days, so telnet and even plain ftp are rarely acceptable.
The third line above uses the ssh version of rcp
(remote copy), scp, to copy files to and from other
machines. There is also sftp for ftp-style connections. Yet another way to copy files around is:
rsync -az -e ssh acme@acme.work.nks.net:save.tar.gz .
This uses the rsync command to
copy the archive from the remote host to this host and also invokes
ssh to create an encrypted connection. The rsync
command is one of the handier ways to shuffle stuff around and it has
a bunch of options and capabilities. One of
its neater features is keeping files on two or more machines
identical. Files will only be copied if they are different and even
then only the differences are actually copied. You don't have to copy
entire files every time so the transfer times are usually quite small
once the initial copy is made.
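As a sketch of that mirroring use (the directory and host names are just the examples from above), keeping a whole tree synchronized looks like this; run it again later and only the changed parts travel over the wire:
# Mirror the local directory tree into backup/ on the remote host.
# -a preserves permissions, times and symlinks; -z compresses in transit.
rsync -az -e ssh /home/acme acme@acme.work.nks.net:backup/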