Distributed Remote Backups with Git and Etckeeper

Git is a powerful tool for many tasks, and it makes a wonderful VCS that can drive simple utilities and automation.  Recently I used git and remote repositories to deploy code and data fanned out to multiple locations.  Here are some examples, along with a look at Etckeeper, a related wrapper tool that adds revision control around git (or another VCS) with some extra features.


Remote Repositories in Git

Sometimes you want to keep multiple copies of your data across one or more git repositories.  You work with a primary repository but want a remote clone elsewhere (for redundancy or read-only purposes).

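This works most easily with newly created remote repos.  Add them all with the git remote command and push your branch out.  The commands below assume a remote named origin already exists; set-url --add appends each extra URL to it.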

git remote set-url --add origin git@gitlab.com:you/repoclone
git remote set-url --add origin git@hobo.house:you/repoclone
git push -u origin master

You should see your new remote URLs listed when you run git remote -v.
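To try the fan-out without touching a real server, here is a minimal sketch that uses two local bare repositories as stand-ins for the remotes.  The scratch directory and the mirror-a/mirror-b names are made up for illustration; swap in your own URLs.

```shell
#!/bin/sh
# Sketch: fan one push out to two remotes (local bare repos stand in for servers).
set -eu

demo=$(mktemp -d)                          # hypothetical scratch directory
git init -q --bare "$demo/mirror-a.git"    # stand-in for the first remote
git init -q --bare "$demo/mirror-b.git"    # stand-in for the second remote

git init -q "$demo/work"
cd "$demo/work"
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "first commit"

# The first URL creates the remote; --add appends a second push target.
git remote add origin "$demo/mirror-a.git"
git remote set-url --add origin "$demo/mirror-b.git"

# One push updates every URL configured for origin.
branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"

# Lists one fetch URL and a push line for each configured URL.
git remote -v
```

Fetches still come from the first URL only, so treat the extra URLs as push-only mirrors.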

You can add as many remotes as you want, and any commits to the current repo will fan out to all of them with a git push.  This might be useful with a script that auto-commits backups: if you manage flat files, something is committed only when changes occur.
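An auto-commit script along those lines might look like the sketch below.  The autocommit function name, the commit message, and the scratch repository are my own placeholders, not anything git ships with.

```shell
#!/bin/sh
# Sketch: commit only when the working tree actually changed.
set -eu

# Hypothetical helper: stage and commit everything in a repo if anything changed.
autocommit() {
    repo=$1
    if [ -n "$(git -C "$repo" status --porcelain)" ]; then
        git -C "$repo" add -A
        git -C "$repo" commit -q -m "auto backup $(date +%F)"
        echo "committed"
    else
        echo "nothing to commit"
    fi
}

repo=$(mktemp -d)                  # scratch repo for the demonstration
git init -q "$repo"
git -C "$repo" config user.email "demo@example.com"
git -C "$repo" config user.name "Demo"

echo "v1" > "$repo/data.txt"
first=$(autocommit "$repo")        # a new file exists, so this commits
second=$(autocommit "$repo")       # nothing changed since, so this is a no-op
echo "$first / $second"
```

In a real backup job you would follow a successful commit with git push so the change fans out to the remotes.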

Etckeeper the Roomba

Etckeeper provides a useful wrapper and extra recovery sub-tools around git.  The default behaviour is to manage the /etc/ directory and commit updates as content changes, including some basic RPM or Apt package reversion capabilities.  This could serve as an extra layer of configuration file revisioning and reporting.

Etckeeper will also provide the following features you won’t get with barebones git + cron or something similar:

  • Keep a .gitignore for files that shouldn’t be kept in a VCS
  • Do a daily commit if there are uncommitted changes in /etc
  • Hook into your packaging system to do commits before and after package changes
  • Maintain critical file metadata, such as the permissions that configuration files need for proper system operation.

You’ll require the EPEL repository on CentOS/RHEL systems.

yum install etckeeper -y

If you want to have Etckeeper create and manage your /etc/ files:

etckeeper init

Edit etckeeper.conf to set PUSH_REMOTE="origin", or tell sed to do it.  You can list multiple remotes separated by spaces.

sed -i -e 's/PUSH_REMOTE=""/PUSH_REMOTE="origin"/g' /etc/etckeeper/etckeeper.conf
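For more than one remote, the same edit can be sketched against a scratch copy of the config first.  The temp file below only imitates the relevant line of etckeeper.conf, and "backup" is a hypothetical second remote name.

```shell
#!/bin/sh
# Sketch: set PUSH_REMOTE to two space-separated remotes in a scratch config.
set -eu

conf=$(mktemp)    # stand-in for /etc/etckeeper/etckeeper.conf
printf '%s\n' '# remotes to push to after each commit' 'PUSH_REMOTE=""' > "$conf"

# Rewrite the whole PUSH_REMOTE line, whatever it held before.
sed -e 's/^PUSH_REMOTE=.*/PUSH_REMOTE="origin backup"/' "$conf" > "$conf.new"
mv "$conf.new" "$conf"

grep '^PUSH_REMOTE=' "$conf"
```

Matching on the start of the line rather than on the empty value means the command still works if the setting was already changed once.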

Add a remote, as we did earlier, to receive copies of your changes.  Run this from /etc, where Etckeeper keeps its repository.

git remote add origin git@gitlab.com:you/etc-backup.git

Etckeeper Scheduling

By default a cron job runs it once a day.  That’s fine for most people, so skip ahead if it suits you.  To disable this and use systemd instead, uncomment AVOID_DAILY_AUTOCOMMITS=1 in etckeeper.conf and enable the timer:

sed -i -e 's/#AVOID_DAILY_AUTOCOMMITS=1/AVOID_DAILY_AUTOCOMMITS=1/g' /etc/etckeeper/etckeeper.conf
systemctl enable etckeeper.timer
systemctl start etckeeper.timer

You should start to see automated commits when files in /etc/ change or your package database is modified.

When using your package manager you should also see the etckeeper plugin.

Etckeeper has built-in subcommands for easy reversion to previous git copies of your data.  Vultr has a good guide that goes more in-depth on etckeeper git recovery subcommands.

Simple Etckeeper Recovery

Etckeeper has some useful wrapper commands around git internals.  For example, here is how to check and revert a configuration file from the automatic /etc/ git backup.

Check the git log to find where the change was introduced:

etckeeper vcs log

Obtain the git commit hash of the last change that altered something you want to revert, then inspect it:

etckeeper vcs diff 2dc5dee44f3c0b55274e7f31bc255dcb22134a35

We see that the commit hash ending in 4a35 introduces a change in /etc/someconfig that we want to revert.  The commit ending in 8891 is the previous commit, but we want to revert only the contents of /etc/someconfig.

etckeeper vcs checkout 9fc268314c93323dcf2462e3f0276200be008891 /etc/someconfig

Now only /etc/someconfig will be reverted from the last change.
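Since etckeeper vcs hands its arguments to git, you can rehearse the same single-file revert in a scratch repository with plain git.  The file names and contents below are invented for the demonstration.

```shell
#!/bin/sh
# Sketch: restore one file from an earlier commit and leave the rest alone.
set -eu

repo=$(mktemp -d)                  # scratch repo, nothing to do with /etc
cd "$repo"
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"

echo "port=22" > someconfig
echo "alpha" > other
git add -A
git commit -q -m "good state"
good=$(git rev-parse HEAD)         # the commit we want someconfig back from

echo "port=2222" > someconfig
echo "beta" > other
git commit -q -am "change both files"

# Check out only someconfig from the earlier commit.
git checkout "$good" -- someconfig

cat someconfig other
```

The restored file is left staged, so commit it afterwards if you want the revert recorded in history.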

Using Etckeeper outside /etc

For managing content besides /etc/ use the -d option.  This could be useful for sub-directory level revision control.

etckeeper init -d /srv/data

Add any remote repositories to push to as needed, from inside /srv/data:

git remote add origin git@gitlab.com:you/data.git

Force an initial etckeeper commit:

etckeeper commit -d /srv/data 'initial sync commit' && git push

If you want Etckeeper to push automatically after every new commit:

cat > /etc/etckeeper/commit.d/60-push <<EOF
#!/bin/sh
git push
EOF
chmod +x /etc/etckeeper/commit.d/60-push

You can tie this into cron if you like, or alias the command.  This entry runs at the top of every hour:

0 * * * * etckeeper commit -d /srv/data 'hourly auto commit'

Extending Further

There is a lot you can do with git and automation, and I won’t come close to covering even a fraction of it here.  You can get a long way just by using git branches for different environments that still share a common codebase, or by promoting infrastructure code (Ansible, Puppet, etc.) to production through a code review tool like Gerrit.

You can also use git hooks (post-commit, for example), either on their own with multiple remote repositories or triggered via Etckeeper, to extend usage even further.
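As a rough sketch of the hook idea, a post-commit hook can push every new commit on its own.  The scratch paths are placeholders and a local bare repository stands in for the real remote.

```shell
#!/bin/sh
# Sketch: a post-commit hook that pushes each new commit to origin.
set -eu

demo=$(mktemp -d)                       # hypothetical scratch directory
git init -q --bare "$demo/mirror.git"   # stand-in for a remote server
git init -q "$demo/work"
cd "$demo/work"
git config user.email "demo@example.com"
git config user.name "Demo"
git remote add origin "$demo/mirror.git"

# git runs .git/hooks/post-commit after every successful commit.
mkdir -p .git/hooks
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
git push -q origin HEAD
EOF
chmod +x .git/hooks/post-commit

echo "data" > file.txt
git add file.txt
git commit -q -m "commit that triggers a push"

# The mirror now holds the commit, with no explicit git push on our side.
git -C "$demo/mirror.git" log --oneline
```

If origin has several URLs configured, the same hook fans each commit out to all of them.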

About Will Foster

hobo devop/sysadmin, all-around nice guy.