Backups

I've started doing more work on servers and recently decided to move my documentation to a MediaWiki running on my internal server. While setting this up, I started thinking about backups and data integrity. When I back up data, I always do two things:

  1. Encryption - In the words of Tom Lawrence, "Data at rest is ALWAYS encrypted." If you don't actively need the data, why risk it being accessible to hackers? Encrypting your data means thieves can't use it if they get ahold of your backup drives.
  2. Checksums - I like to take the extra 5 seconds to run checksums on critical files. Of course, larger files will take more time to run the checksum, but the key is that they are CRITICAL files. Why risk a bad transfer/decryption causing a massive headache when an algorithm can tell you that your data came out the same way it went in?
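That verification takes only a couple of commands. A minimal sketch, using a hypothetical file name "important.tar" as a stand-in for a real archive:

```shell
# Create a stand-in for a critical file (hypothetical name and contents).
echo "example data" > important.tar
# Record its SHA-1 digest...
sha1sum important.tar > important.sha
# ...then verify it after a transfer or decryption.
sha1sum -c important.sha
```

`sha1sum -c` exits non-zero when a file no longer matches its recorded digest, which makes it easy for a script to catch a bad transfer automatically.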

In the case of my MediaWiki, I'm almost always updating the contents of the pages and the uploaded files (mostly device configurations for projects) at the same time. I wanted a script which does all of my backup operations for me and allows me to FTP into the server and grab the files quickly. I also want to avoid manually entering passwords during the script run so that I can eventually set up a cron job and send the files to a NAS or file sync tool.

For reference, I'm running a LEMP stack (Linux, Nginx, MySQL, PHP) on Ubuntu Server 16.04.2 as an AWS EC2 instance. My bash script is stored in the home directory of the user "filebackup". I also added the user which I use for FTP and SSH access to the "filebackup" group, so the output files, which allow reading and writing from the group, can be copied off the server.

#!/bin/bash

# Timestamp (day_month_year-time+zone) used to distinguish backup files
NOW=$(date +%d_%b_%Y-%H%M%z)

cd ~/ || exit 1
# Dump the "wiki" database; the password comes from ~/.my.cnf
mysqldump --single-transaction -h localhost -u backup --default-character-set=binary wiki > "sql_$NOW.sql"

# Archive LocalSettings.php, the images folder, and the SQL dump
# (-C avoids embedding the full home-directory path in the tar)
cd /var/web/services/mediawiki || exit 1
tar -cvf "$HOME/MWBackup_$NOW.tar" LocalSettings.php images -C "$HOME" "sql_$NOW.sql"

cd ~/ || exit 1
# Checksum the unencrypted archive (the checksum/ directory must exist)
sha1sum "MWBackup_$NOW.tar" > "checksum/MWBackup_$NOW.sha"

# Symmetric encryption; the passphrase is the contents of ~/password
gpg -c --batch --passphrase-fd 0 "MWBackup_$NOW.tar" < password

# Remove the unencrypted copies
rm "sql_$NOW.sql" "MWBackup_$NOW.tar"

#create file ~/.my.cnf containing:
#[mysqldump]
#user=
#password=
#
#Grant SELECT on the wiki database to the MySQL user.
#
#create file ~/password
#its entire contents is the gpg encryption passphrase
#
#chmod 600 both files (.my.cnf and password) and chown them to the correct user.
#chown these files to a user with no ssh/ftp access for increased security.
#Add the FTP user to the primary group of the above user.
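The "Grant Select" note above can be done from any MySQL admin account. A sketch, assuming the database is "wiki" and the dump user is "backup" as in the script; LOCK TABLES is only needed if you ever dump without --single-transaction:

```shell
# Run as a MySQL admin user; gives the backup user read access to the wiki DB.
mysql -u root -p <<'SQL'
GRANT SELECT, LOCK TABLES ON wiki.* TO 'backup'@'localhost';
FLUSH PRIVILEGES;
SQL
```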

The script is fairly self-contained and simple. It:

  1. Sets the current date, time, and timezone as a variable to distinguish our files.
  2. Performs a mysqldump of the database "wiki" using the user "backup".
    • This does require a .my.cnf file in the user's home directory. The file includes [mysqldump], the SQL username, and the password.
  3. Archives our MediaWiki's "images" folder, the LocalSettings.php file, and the mysqldump.
  4. Creates a checksum file for the archive.
  5. Encrypts the archive.
  6. Removes the unencrypted tar and sql files.
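Restoring works in reverse of steps 4-6. A sketch, assuming the same ~/password file; the timestamp in the filename is illustrative:

```shell
# Illustrative timestamp; substitute the one on your archive.
STAMP=01_Jan_2017-1200+0000
# Decrypt using the same passphrase file the backup script used.
gpg -d --batch --passphrase-fd 0 "MWBackup_$STAMP.tar.gpg" < password > "MWBackup_$STAMP.tar"
# Confirm the decrypted archive matches the recorded checksum.
sha1sum -c "checksum/MWBackup_$STAMP.sha"
# Unpack LocalSettings.php, images/, and the SQL dump.
tar -xvf "MWBackup_$STAMP.tar"
```

Note that newer GnuPG releases (2.1+) may also need --pinentry-mode loopback for --passphrase-fd to work non-interactively.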

Of course, not everyone will use this process, and I will likely move "LocalSettings.php" into my server config backups since it won't change much. This is just a simple way to get started backing up data. Over time, the process will be refined, and I will eventually sit around wondering, "Why did I do that?" As long as the backup system is solid and the data is stored properly, an unrefined backup system is significantly better than no backup system. Below is an example of my ".my.cnf" file with the username and password removed:

[mysqldump] 
user=##USERNAME##
password=##PASSWORD##
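When the time comes for the cron job mentioned earlier, it's one line in the "filebackup" user's crontab; the schedule and script path here are hypothetical:

```shell
# crontab -e as the filebackup user, then add:
# min hour day month weekday  command
30 2 * * * /home/filebackup/backup.sh >> /home/filebackup/backup.log 2>&1
```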

Thanks to Tom Lawrence for his YouTube videos and responses to my comments.