Russell Dyer is the Editor of the MySQL Knowledge Base and the author of MySQL in a Nutshell (O'Reilly 2005), which has been translated into German, Japanese, and Polish. He has written on MySQL for several magazines including ONlamp.com, SysAdmin, and Unix Review. He's from New Orleans, but lives in Milan, Italy. See his personal web site for more of his articles on MySQL: http://russell.dyerhouse.com/
By Russell Dyer
A utility like mysqldump is very useful for backing up MySQL data and table schema. However, backups are usually made only once a day or at some other regular interval. If data is lost on your system several hours after the last backup is completed, you may not be able to recover it. There are a few methods by which you can continuously protect data. One method that you can use to be able to recover interim data is to use binary logs. A binary log will record all SQL transactions on a server as they are executed. Some administrators try to minimize any logging because of concerns about server performance. However, on average, binary logging will slow a server by only one percent. In return, though, it allows for up to the minute data recovery.
Binary logging will record in a binary file all SQL transactions executed and attempted on the server. By using the mysqlbinlog utility, the contents of the binary log file can be extracted so that the SQL statements may easily be rerun. To enable binary logging, add the following line to your server's options file (i.e., /etc/my.cnf or c:\my.ini, depending on your system) in the [mysqld] group:
log-bin = /var/log/mysql/bin.log
The exact path to use will depend on your filesystem and your preferences. You may have to create the directory where logs are to be stored and change the ownership and permissions for the directory on your filesystem accordingly. It's recommended that the path where you backup data and keep log files should be on a separate hard drive for added safety. The suffix log in the file name above will be replaced automatically with a six digit number. You can limit binary logging to specific databases or you can omit certain databases from the log. For more information on fine tuning this logging, see the documentation on binary logs. After making the entry above to your options file, you will need to restart the MySQL server for it to take effect.
In addition to making regular backups with something like mysqldump, you may want to synchronize the starting of binary logs with the backups. This can be done by flushing the logs when you run a backup. You can add something like the following line to cron or a similar scheduling utility:
mysqladmin -u root -pmypwd flush-logs
In this line, the mysqladmin utility is used to flush the server's logs. You could instead use the mysql client to execute the FLUSH LOGS statement.
With daily backups and the binary logs running, it's a simple, albeit sometimes tedious procedure to recover data on a MySQL server. As an example, suppose that your data is backed up each day at midnight using mysqldump. Suppose further that one day (say, at 10:00 a.m. on April 20, 2005) your data is lost somehow and you want to recover it. To begin with, you may want to stop the MySQL server and then restart it in such a way that you will be the only user to have access to it. This can be done like so:
mysqld --socket=/tmp/mysql_restore.sock --skip-networking
The --socket option will temporarily name a different socket file for Unix systems. On Windows systems, you would provide a named pipe (e.g., MySQL_restore) instead of a path and file name. You would enter something like this:
mysqld-nt --enable-named-pipe \ --socket=MySQL_restore --skip-networking
On Windows systems that are not NT based, named pipes are unavailable and as such you will have to use TCP/IP. In this situation you may want to start the server using the --port option with a port other than 3306. You can use any non-privileged port unused by other services.
The --skip-networking option prevents users from accessing the server via TCP/IP and the socket file or named pipes.. It will only permit connections from the localhost. This still allows access from a console physically on the server, or through a secure shell connection. This option with a different socket name will prevent outsiders and API scripts on the server from being able to access the database until you're able to restore the data.
Once you have the server under your exclusive control, you can then proceed without worrying about more problems occurring from users trying to access the server while you're recovering the data. The first step is to restore the dump file from the previous night's backup like so:
mysql -u root -pmypwd --socket=/tmp/mysql_restore.sock \ < /var/backup/20050420.sql
Again, the actual names of the paths and files will be different for your server. The line here will restore the data as of the previous night, or rather as of the start of the day. To restore the transactions since the dump file was created, we will use the mysqlbinlog utility. If you are flushing the logs each day when making your nightly backups, you would restore the entire binary log file from the command-line like so:
mysqlbinlog /var/log/mysql/bin.123456 \ | mysql -u root -pmypwd \ --socket=/tmp/mysql_restore.sock
In this example, the results of mysqlbinlog are piped (notice the vertical bar) to the mysql client for processing. When it's finished, restart the MySQL server without the temporary socket file and network restriction.
These couple of steps will restore the data as it stood at the time of the previous backup and then all SQL statements that were entered since the logs were flushed will be re-entered. To determine the name of the latest binary log file, you can get a listing of the directory which contains the logs.
The scenario in the previous section is based on the assumption that you want to restore all of the server's data. However, if the reason you need to perform the recovery is because some data was inadvertently deleted by an SQL statement, a simple full recovery using the binary logs will only repeat the mistake. For such a situation, a better solution would be to run the mysqlbinlog as shown above, but instead of piping the results to mysql, redirect them to a text file so that you can edit it before passing the file on to MySQL. To redirect the output from mysqlbinlog, you would enter something like this:
mysqlbinlog /var/log/mysql/bin.123456 \ > /tmp/mysql_restore.sql
This will create a simple text file in the /tmp directory (adjust for your server) that you can edit with a text editor like vi or notepad.exe. Don't use a word processor because it may add binary formatting codes to the file, which will cause problems when you hand the file over to mysql. After you've deleted the unwanted SQL statements and saved the restore file, you can then run it through the mysql client like so:
mysql -u root -pmypwd \ --socket=/tmp/mysql_restore.sock \ < /tmp/mysql_restore.sql
Notice that in this case, a less-than sign is used to redirect the standard input (i.e., STDIN).
As of version 4.1.4 of MySQL, a few options have been added to mysqlbinlog that make tweaking a restore with a binary log easier and may eliminate the need to manually modify a restoration. One pair of options added was the --start-date and the --stop-date options. For example, suppose exactly at 10:00 a.m. on April 20, 2005 you executed an SQL statement which deleted a large table. You could restore the previous night's backup, and then run the mysqlbinlog like so:
mysqlbinlog --stop-date="2005-04-20 9:59:59" /var/log/mysql/bin.123456 | mysql -u root -pmypwd \ --socket=/tmp/mysql_restore.sock
This will recover all of the data up until the date and time given as a datetime format for the --stop-date option. If you did not detect the erroneous SQL statement that you entered until hours later, you will probably want to recover the activity that occurred afterwards, as well. Based on this, you could run the mysqlbinlog again with a start date and time like so:
mysqlbinlog --start-date="2005-04-20 10:01:00" \ /var/log/mysql/bin.123456 \ | mysql -u root -pmypwd \ --socket=/tmp/mysql_restore.sock
In this line, the SQL statements logged from 10:01 a.m. will be run. The combination of executing of the previous night's dump file and the two lines of the mysqlbinlog will restore everything up until one second before 10:00 a.m. and everything from 10:01 a.m. on. Of course, you should examine the log to be sure of the exact times. The next section explains how you might do this.
Instead of using a specific time, you can use two other mysqlbinlog options: --start-position and --stop-position. They work the same as the start and stop date options, except that you will have to give position numbers found in the log. Using log positions may be more accurate recovery method, especially if you have many transactions that occurred around the same time as a damaging SQL statement. To determine the position numbers, you could run mysqlbinlog for a range of times around when the unwanted transaction was executed using a start and stop time, and redirect the results to a text file for you to examine. You could do this like so:
mysqlbinlog --start-date="2005-04-20 9:55:00" \ --stop-date="2005-04-20 10:05:00" \ /var/log/mysql/bin.123456 \ > /tmp/mysql_restore.sql
This will create a small text file in the /tmp directory which will show the SQL statements around the time you executed the deleterious SQL statement. You would then open this file with a text editor and look for the statement you don't want to repeat. Once you know the position where you want to stop the recovery and the position in which you'd like to resume a recovery, make note of the position numbers. Positions are labeled with log_pos followed by a number. After restoring the backup file, with the position numbers in hand, you would then enter something like the following from the command-line:
mysqlbinlog --stop-position="368312" \ /var/log/mysql/bin.123456 \ | mysql -u root -pmypwd \ --socket=/tmp/mysql_restore.sock mysqlbinlog --start-position="368315" \ /var/log/mysql/bin.123456 \ | mysql -u root -pmypwd \ --socket=/tmp/mysql_restore.sock
The first line above will recover all of the transactions up until the stop position given. The next line will recover all transactions from the starting position given until the end of the binary log. By the way, because the output of mysqlbinlog includes SET TIMESTAMP statements before each SQL statement recorded, the recovered data and related MySQL logs will reflect the original times that the transactions were executed.
With binary logging, complete recovery up to any particular moment can be achieved fairly easily and quickly with a standard installation of MySQL. The key is to perform backups and to flush the logs on a regular basis. The frequency will depend on your particular needs and policies. With these safeguards in place, it's just a matter of being familiar with the procedures for recovering your data so that you may act quickly and accurately when called upon. If you cannot afford the time it may take, however, to recover a database by the methods explained here--especially when you consider that you may have to lock users out of the server while restoring data--you may want to consider other methods of protecting data and ensuring consistency of service such as Replication or Clustering.