By Guilhem Bichot

Guilhem Bichot is a software developer with MySQL AB, and lives in France.
Here we discuss how to recover data after a crash, in two cases: when the disk data survives the crash, and when the disk itself is destroyed.
The version of MySQL we will use is MySQL 4.1.8. We'll assume data is stored in the InnoDB storage engine of MySQL, which has support for transactions and automatic crash recovery. We'll also assume the MySQL server is under load at the time of the crash; if it were not, no recovery would ever be needed.
In the first case, the MySQL disk data is still available after restart. The InnoDB data files do not contain consistent data because of the crash, but InnoDB reads its logs, finds there the list of pending committed and non-committed transactions (those not yet flushed to the data files), automatically rolls back those which were not committed, and flushes to the data files those which were committed. Information about this recovery process is conveyed to the user through the MySQL error log. Excerpt:
InnoDB: Database was not shut down normally.
InnoDB: Starting recovery from log files...
InnoDB: Starting log scan based on checkpoint at
InnoDB: log sequence number 0 13674004
InnoDB: Doing recovery: scanned up to log sequence number 0 13739520
InnoDB: Doing recovery: scanned up to log sequence number 0 13805056
InnoDB: Doing recovery: scanned up to log sequence number 0 13870592
InnoDB: Doing recovery: scanned up to log sequence number 0 13936128
...
InnoDB: Doing recovery: scanned up to log sequence number 0 20555264
InnoDB: Doing recovery: scanned up to log sequence number 0 20620800
InnoDB: Doing recovery: scanned up to log sequence number 0 20664692
InnoDB: 1 uncommitted transaction(s) which must be rolled back
InnoDB: Starting rollback of uncommitted transactions
InnoDB: Rolling back trx no 16745
InnoDB: Rolling back of trx no 16745 completed
InnoDB: Rollback of uncommitted transactions completed
InnoDB: Starting an apply batch of log records to the database...
InnoDB: Apply batch completed
InnoDB: Started
mysqld: ready for connections
In the second case, the MySQL disk data is not available after restart: some blocks of disk data are not readable anymore, and MySQL will fail to start. Ok, we re-format the disk, or install a fresh new one, and it's time to recover our MySQL data from backups... ah, so we needed to make backups before! Let's step back in time and design a backup policy.
We all know backups must be scheduled periodically. Full backups (snapshots of the data at a point in time) can be done in MySQL with several tools. InnoDB Hot Backup provides online (non-blocking) physical (data file copy) backup. mysqldump provides online logical backup; we'll use it in the rest of this discussion. Example:
mysqldump --single-transaction --all-databases > backup_sunday_1_PM.sql
This makes a backup of all our InnoDB tables in all databases, without disturbing the reads and writes on these tables. The content of the .sql file is a bunch of SQL INSERT statements. (We assume this is the full backup of Sunday 1 PM, when load is low.)
Full backups are necessary, but they are not always convenient. They produce big backup files and thus take time. They are not optimal in the sense that they back up again the data which didn't change since the last full backup. Incremental backups are more economical (though they trade backup time for restore time).
To do incremental backups we need to have the increments. To make the MySQL server store the increments in a file while it updates data, the server should always be started with the --log-bin option. Then each SQL statement which updates data will be written into a file (which we call a "MySQL binary log"). If we look at the data directory of a MySQL server which has been running with --log-bin for some days, we find these MySQL binary logs:
-rw-rw----  1 guilhem  guilhem   1277324 Nov 10 23:59 gbichot2-bin.001
-rw-rw----  1 guilhem  guilhem         4 Nov 10 23:59 gbichot2-bin.002
-rw-rw----  1 guilhem  guilhem        79 Nov 11 11:06 gbichot2-bin.003
-rw-rw----  1 guilhem  guilhem       508 Nov 11 11:08 gbichot2-bin.004
-rw-rw----  1 guilhem  guilhem 220047446 Nov 12 16:47 gbichot2-bin.005
-rw-rw----  1 guilhem  guilhem    998412 Nov 14 10:08 gbichot2-bin.006
-rw-rw----  1 guilhem  guilhem       361 Nov 14 10:07 gbichot2-bin.index
Every time it restarts, the MySQL server stops writing to the current MySQL binary log and creates a new one, which from then on becomes the current one (the one which gets written to). Such a switch can also be forced manually with the FLUSH LOGS SQL command. The .index file contains the list of all MySQL binary logs in the directory (it is used for replication).
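As a small illustration of that .index file (a sketch of our own, not from the article), the last line names the binary log the server is currently writing to. We simulate the file with a temporary copy; the exact "./" path prefix of the entries is an assumption about the format:

```shell
# Sketch: the last line of the .index file names the current binary log.
# We simulate the file here instead of reading the real data directory;
# the "./" prefix on entries is an assumption about the exact format.
idx=$(mktemp)
printf '%s\n' ./gbichot2-bin.005 ./gbichot2-bin.006 > "$idx"
current=$(tail -n 1 "$idx")
echo "$current"     # prints ./gbichot2-bin.006
rm -f "$idx"
```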
Those MySQL binary logs are the increments. Let's change our mysqldump invocation a bit so that it forces a switch of MySQL binary logs at the moment of the full backup, and additionally records the name of the new current binary log in the dump:
mysqldump --single-transaction --flush-logs --master-data=2 --all-databases > backup_sunday_1_PM.sql
We now have in the data directory a new file, gbichot2-bin.007. In our .sql file there are these lines:
-- Position to start replication or point-in-time recovery from
-- CHANGE MASTER TO MASTER_LOG_FILE='gbichot2-bin.007',MASTER_LOG_POS=4;
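A recovery script can pull the starting log name out of that comment line; here is a minimal sketch of our own (the sed pattern is an assumption, matching the excerpt above):

```shell
# Sketch: extract the binary log name recorded by --master-data=2, i.e. the
# point where incremental recovery must begin.  The sample line and the sed
# pattern are based on the dump excerpt above.
line="-- CHANGE MASTER TO MASTER_LOG_FILE='gbichot2-bin.007',MASTER_LOG_POS=4;"
start_log=$(printf '%s\n' "$line" | sed -n "s/.*MASTER_LOG_FILE='\([^']*\)'.*/\1/p")
echo "$start_log"    # prints gbichot2-bin.007
```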
This means that all data changes logged in MySQL binary logs older than gbichot2-bin.007 are present in the .sql file, while all data changes logged in gbichot2-bin.007 or newer are not. On Monday 1 PM, to do an incremental backup, we just issue mysqladmin flush-logs; this creates gbichot2-bin.008. All changes between the Sunday 1 PM full backup and Monday 1 PM are in the file gbichot2-bin.007. We copy this precious file to our backup safe place (tape, spare machine, DVD-RW, etc.).
On Tuesday 1 PM, let's run mysqladmin flush-logs again. All changes between Monday 1 PM and Tuesday 1 PM are in the file gbichot2-bin.008. We copy this precious file to our backup safe place too.
These MySQL binary logs use disk space, so we should make room from time to time. A good idea is to delete the binary logs we won't need anymore, for example at the moment we do the full backup:
mysqldump --single-transaction --flush-logs --delete-master-logs --master-data=2 --all-databases > backup_sunday_1_PM.sql
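Putting the schedule together, a hypothetical crontab could look like the following. The times, paths, and file names are illustrative assumptions, not from the article, and the daily job should be complemented by copying the freshly closed binary log to the safe place:

```
# Hypothetical crontab sketch (times and paths are assumptions).
# Sunday 1 PM: full backup, which also deletes the old binary logs.
0 13 * * 0    mysqldump --single-transaction --flush-logs --delete-master-logs --master-data=2 --all-databases > /backups/full_sunday.sql
# Other days, 1 PM: close the current binary log so it can be archived.
0 13 * * 1-6  mysqladmin flush-logs
```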
Now suppose we have a crash, say on Wednesday 8 AM, which destroys the disk. We restore the last full backup we have, the one of Sunday 1 PM. As the backup file is just a set of SQL statements, restoring is very easy:
mysql < backup_sunday_1_PM.sql
We now have the data as it was on Sunday 1 PM. To apply the incremental backups to it, we just fetch them from the backup safe place and do:
mysqlbinlog gbichot2-bin.007 | mysql
mysqlbinlog gbichot2-bin.008 | mysql
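When there are many incremental logs, a small sketch of our own can pick which archived logs to replay, starting from the log name recorded in the dump. Plain string comparison is enough here because the fixed-width numeric suffix makes names sort chronologically:

```shell
# Sketch: given the starting log from the dump's CHANGE MASTER comment and
# the list of archived logs, print the ones to replay, in order.
start=gbichot2-bin.007
printf '%s\n' gbichot2-bin.006 gbichot2-bin.007 gbichot2-bin.008 |
    sort | awk -v s="$start" '$0 >= s'
# prints:
# gbichot2-bin.007
# gbichot2-bin.008
```

Each selected log would then be fed to mysqlbinlog | mysql as above.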
We now have the data as it was on Tuesday 1 PM. We are still missing the changes from that date to the date of the crash. To avoid missing them, we would have needed the MySQL server to store its MySQL binary logs in a safe location (RAID disks, SAN...) different from the place where it stores its data files, so that these logs were not on the destroyed disk. If we had done this (and we will do it from now on, promise!), we would have the file gbichot2-bin.009, and we could apply it to get our data exactly as it was at the moment of the crash (no data loss). We could even be more flexible: if instead of a crash the problem was a wrong DROP DATABASE, we could apply gbichot2-bin.009 up to just before that wrong DROP DATABASE, so that we get our data as it was just before the wrong statement. For example, if you know the wrong statement happened around 7:30 AM, you could ask to get the data at its state of 7 AM (taking a margin of 30 minutes):
mysqlbinlog --stop-datetime="2004-11-17 07:00:00" gbichot2-bin.009 | mysql
That's a bit fuzzy (you lose approximately half an hour of changes), but it is sufficient in some cases; in other cases you really can't afford to lose anything, so you'll want to run mysqlbinlog up to just before the DROP DATABASE. There are several easy ways to do that; here's one:
mysqlbinlog gbichot2-bin.009 > a_temp_file.sql
We edit a_temp_file.sql in our favorite editor and use the text search function to search for "DROP DATABASE"; the editor takes us there immediately, and we see:
# at 79
#041117 07:26:08 server id 1  log_pos 79  Query  thread_id=1  exec_time=0  error_code=0
use test;
DELETE FROM `test`.`hea` WHERE col=879865;
# at 138
#041117 07:26:08 server id 1  log_pos 138  Query  thread_id=1  exec_time=0  error_code=0
DROP DATABASE our_cherished_database;
# at 198
#041117 07:26:08 server id 1  log_pos 198  Query  thread_id=1  exec_time=0  error_code=0
DELETE FROM `test`.`hea3`;
We just delete all lines starting from the DROP DATABASE, save a_temp_file.sql, and run mysql < a_temp_file.sql.
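Another way, a sketch of our own rather than the article's method, avoids editing altogether: find the "# at" offset of the bad event in mysqlbinlog's text output and pass it to mysqlbinlog's --stop-position option, which stops before the first event at or past that offset. Here we simulate mysqlbinlog's output with (a shortened form of) the excerpt above:

```shell
# Sketch: locate the "# at N" offset of the bad DROP DATABASE event;
# replaying with --stop-position=N then stops just before it.  The excerpt
# is simulated; the awk logic is our assumption.
excerpt='# at 79
DELETE FROM `test`.`hea` WHERE col=879865;
# at 138
DROP DATABASE our_cherished_database;
# at 198'
stop_pos=$(printf '%s\n' "$excerpt" | awk '/^# at /{pos=$3} /^DROP DATABASE/{print pos; exit}')
echo "$stop_pos"     # prints 138
```

Then: mysqlbinlog --stop-position=$stop_pos gbichot2-bin.009 | mysql.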
And we are done!
For good sleep:
- always run the MySQL server with --log-bin, or even --log-bin=some_safe_media_different_from_data_disk if you have such safe media; this will anyway be good for disk load balancing (a performance improvement);
- make periodic full backups, using the mysqldump command shown earlier;
- make periodic incremental backups with mysqladmin flush-logs.
Note 1: real-life examples would require passing the --user and --password options (with the right user and password) to the mysqldump and mysql programs, so that the MySQL server allows them to connect.
Note 2: deleting the MySQL binary logs with mysqldump --delete-master-logs can be dangerous if your server is a replication master; the Replication section of our manual explains what should be verified before deleting the MySQL binary logs.