We all know that backups must be scheduled periodically. A full
backup (a snapshot of the data at a point in time) can be done
in MySQL with several tools. For example, InnoDB Hot
Backup
provides online nonblocking physical backup of
the InnoDB
data files, and
mysqldump provides online logical backup.
This discussion uses mysqldump.
MySQL Enterprise. For expert advice on backups and replication, subscribe to the MySQL Enterprise Monitor. For more information, see http://www.mysql.com/products/enterprise/advisors.html.
Assume that we make a backup on Sunday at 1 p.m., when load is
low. The following command makes a full backup of all our
InnoDB
tables in all databases:
shell> mysqldump --single-transaction --all-databases > backup_sunday_1_PM.sql
This backup acquires a global read lock on all tables (using
FLUSH TABLES WITH READ
LOCK
) at the beginning of the dump. As soon as this
lock has been acquired, the binary log coordinates are read and
the lock is released. If long updating statements are running
when the FLUSH
statement is
issued, the MySQL server may get stalled until those statements
finish. After that, the dump becomes lock-free and does not
disturb reads and writes on the tables.
We assumed earlier that our tables are InnoDB
tables, so
--single-transaction
uses a
consistent read and guarantees that data seen by
mysqldump does not change. (Changes made by
other clients to InnoDB
tables are not seen
by the mysqldump process.) If we do also have
other types of tables, we must assume that they are not changed
during the backup. For example, for the
MyISAM
tables in the mysql
database, we must assume that no administrative changes are
being made to MySQL accounts during the backup.
The resulting .sql
file produced by
mysqldump contains a set of SQL
INSERT
statements that can be
used to reload the dumped tables at a later time.
Full backups are necessary, but they are not always convenient. They produce large backup files and take time to generate. They are not optimal in the sense that each successive full backup includes all data, even that part that has not changed since the previous full backup. After we have made the initial full backup, it is more efficient to make incremental backups. They are smaller and take less time to produce. The tradeoff is that, at recovery time, you cannot restore your data just by reloading the full backup. You must also process the incremental backups to recover the incremental changes.
To make incremental backups, we need to save the incremental
changes. The MySQL server should always be started with the
--log-bin
option so that it
stores these changes in a file while it updates data. This
option enables binary logging, so that the server writes each
SQL statement that updates data into a file called a MySQL
binary log. Looking at the data directory of a MySQL server that
was started with the --log-bin
option and that has been running for some days, we find these
MySQL binary log files:
-rw-rw---- 1 guilhem guilhem 1277324 Nov 10 23:59 gbichot2-bin.000001 -rw-rw---- 1 guilhem guilhem 4 Nov 10 23:59 gbichot2-bin.000002 -rw-rw---- 1 guilhem guilhem 79 Nov 11 11:06 gbichot2-bin.000003 -rw-rw---- 1 guilhem guilhem 508 Nov 11 11:08 gbichot2-bin.000004 -rw-rw---- 1 guilhem guilhem 220047446 Nov 12 16:47 gbichot2-bin.000005 -rw-rw---- 1 guilhem guilhem 998412 Nov 14 10:08 gbichot2-bin.000006 -rw-rw---- 1 guilhem guilhem 361 Nov 14 10:07 gbichot2-bin.index
Each time it restarts, the MySQL server creates a new binary log
file using the next number in the sequence. While the server is
running, you can also tell it to close the current binary log
file and begin a new one manually by issuing a
FLUSH LOGS
SQL
statement or with a mysqladmin flush-logs
command. mysqldump also has an option to
flush the logs. The .index
file in the data
directory contains the list of all MySQL binary logs in the
directory. This file is used for replication.
The MySQL binary logs are important for recovery because they form the set of incremental backups. If you make sure to flush the logs when you make your full backup, then any binary log files created afterward contain all the data changes made since the backup. Let's modify the previous mysqldump command a bit so that it flushes the MySQL binary logs at the moment of the full backup, and so that the dump file contains the name of the new current binary log:
shell>mysqldump --single-transaction --flush-logs --master-data=2 \
--all-databases > backup_sunday_1_PM.sql
After executing this command, the data directory contains a new
binary log file, gbichot2-bin.000007
. The
resulting .sql
file includes these lines:
-- Position to start replication or point-in-time recovery from -- CHANGE MASTER TO MASTER_LOG_FILE='gbichot2-bin.000007',MASTER_LOG_POS=4;
Because the mysqldump command made a full backup, those lines mean two things:
The .sql
file contains all changes made
before any changes written to the
gbichot2-bin.000007
binary log file or
newer.
All data changes logged after the backup are not present in
the .sql
, but are present in the
gbichot2-bin.000007
binary log file or
newer.
On Monday at 1 p.m., we can create an incremental backup by
flushing the logs to begin a new binary log file. For example,
executing a mysqladmin flush-logs command
creates gbichot2-bin.000008
. All changes
between the Sunday 1 p.m. full backup and Monday 1 p.m. will be
in the gbichot2-bin.000007
file. This
incremental backup is important, so it is a good idea to copy it
to a safe place. (For example, back it up on tape or DVD, or
copy it to another machine.) On Tuesday at 1 p.m., execute
another mysqladmin flush-logs command. All
changes between Monday 1 p.m. and Tuesday 1 p.m. will be in the
gbichot2-bin.000008
file (which also should
be copied somewhere safe).
The MySQL binary logs take up disk space. To free up space, purge them from time to time. One way to do this is by deleting the binary logs that are no longer needed, such as when we make a full backup:
shell>mysqldump --single-transaction --flush-logs --master-data=2 \
--all-databases --delete-master-logs > backup_sunday_1_PM.sql
Deleting the MySQL binary logs with mysqldump
--delete-master-logs can be dangerous if your server
is a replication master server, because slave servers might
not yet fully have processed the contents of the binary log.
The description for the PURGE BINARY
LOGS
statement explains what should be verified
before deleting the MySQL binary logs. See
Section 12.6.1.1, “PURGE BINARY LOGS
Syntax”.
User Comments
We had one problem where we had too many bin files. As our database was heavily used the file size was quite big there were too many of them.
Please be sure to have a policy where you remove your old transactional log files as mentioned in the last example.
I could not found the disk pace was mentioned in Binary LOG Manual page.
Here's a scrip i wrote to backup my db, using a user that's not "mysql".
There's an entry in the sudoers file to allow the user to read the log files
The backup (mysql) user has "super,reload,all" privileges + its password in .my.cnf
The backup path is on a NAS filesystem.
What's missing is a function to remove old full backups if needed.
------------------------------------------------------
#!/bin/bash
MODE=$1
MYSQLUSER=backup
BINLOGPATH=/var/log/mysql
BINLOGNAME=mysql-bin
ARCHIVEPATH=/usr/app/security/var/data/backups/mysql/
setRead() {
# change properties on binlog path
sudo chmod o+rx $BINLOGPATH
sudo chmod o+r $BINLOGPATH/*
}
resetRead() {
# restore properties on binlog path
sudo chmod o-r $BINLOGPATH/*
sudo chmod o-rx $BINLOGPATH
}
copyBinlogs() {
# copy binlogs to archive dir
echo "Copying binlogs"
setRead;
for FILE in `cat $BINLOGPATH/$BINLOGNAME.index`
do
SFILE=(${FILE/*\//})
if [ ! -f $ARCHIVEPATH/$SFILE.gz ]
then
echo "- binlog $SFILE"
cp $FILE $ARCHIVEPATH
gzip $ARCHIVEPATH/$SFILE
fi
done
resetRead;
}
saveBinlogs() {
# move latest binlogs to savedir
echo "Saving binlogs"
SAVEDIR=`date +%Y%m%d`
echo $SAVEDIR
mkdir -p $ARCHIVEPATH/$SAVEDIR
mv $ARCHIVEPATH/*.gz $ARCHIVEPATH/$SAVEDIR
}
# --- Main ---
case "$MODE" in
'full')
# Weekly backup
# - copy all bin logs to backup directory
# - clean path of backup directory
echo "Weekly backup"
DATE=`date +%Y%m%d`
mysqldump -u $MYSQLUSER --single-transaction --flush-logs --master-data=2 \
--all-databases --delete-master-logs | gzip > $ARCHIVEPATH/$SAVEDIR/full-$DATE.sql.gz
copyBinlogs;
saveBinlogs;
;;
'incremental')
# Daily backup
# - flush logs
# - copy all bin logs to backup directory if not already done
echo "Mysql daily backup"
mysqladmin -u $MYSQLUSER flush-logs
copyBinlogs;
;;
*)
echo "Usage: mysql_backup [full|incremental]"
;;
esac
------------------------------------------------------
You should never use --delete-master-logs as the warning specifies. The above script is quite dangerous if you have many replication slaves.
A much better alternative is to do
echo 'PURGE MASTER LOGS BEFORE DATE_SUB( NOW( ), INTERVAL 7 DAY);' | /usr/bin/mysql
Regarding copyBinlogs() - it may be wise to preserve the creation and modification times, with 'cp -p', so it is easy to order them correctly when recovering.
further to the comment on delete-master-logs - see
http:://bugs.mysql.com/bug.php?id=24733
In other words, delete-master-logs is bad news for versions before the fix for that bug went in.
Add your own comment.