Robin Schumacher is MySQL's Director of Product Management and has over 13 years of database experience in DB2, MySQL, Oracle, SQL Server and other database engines. Before joining MySQL, Robin was Vice President of Product Management at Embarcadero Technologies.
I still remember the day. I was working for a large systems integrator on a very major account, and had responsibility for some DB2 databases, but also had the unfortunate duty of also taking care of some legacy IMS databases (hierarchical IBM databases used quite a lot ‘back in the day’), which I knew next to nothing about. One morning I was asked to make a small change to a very large and prominent IMS database – just remove one record – what could go wrong? The guy who knew IMS real well was gone so I put together a program to do the job (yes, you actually had to write a program to do it), and ran it after hours. I came in the next day and tried to access some records in the database. Nothing. All gone, except for – ta-da – the one record I wanted to remove! At the instant I realized my mistake, I was acutely aware of two things: (1) I was in serious trouble if I didn’t have a backup, and (2) I was in desperate need of a complete change in underwear.
Well, fortunately for me, I did have a backup that I restored (although it took a while to run), and I learned my lesson well that day: Always practice good disaster recovery techniques. DBA’s usually focus on things like performance tuning and the like, which to be sure, are important, but in the end a DBA is paid to do one thing very well: protect the data. Don’t do that, and you’ll get an all expense paid trip to the unemployment office. This being the case, as a DBA, I always made sure I knew my database’s backup/restore utilities inside and out and that I routinely practiced various restore scenarios on my test servers with the backups I took of my databases each day.
When it comes to MySQL, there are a variety of different methods you can use to backup your databases, with the most popular being the mysqldump
utility as shown in one of our polls that ran on the MySQL website a while back:
Although the above methods serve many backup/restore use cases well, there’s always room for improvement. And in MySQL 6.0 (now in alpha), a new backup and restore utility is being introduced that many will find beneficial. Let’s take a quick look at the new backup/restore tool and take it for a quick test drive through a number of common scenarios to see how well it performs.
MySQL 6.0’s backup/restore utility is a completely new tool in the server that you should find quite easy to use and work with. One feature of the new utility that I especially like is that it allows for backup and restore commands to be issued right from a mysql
client command prompt, which means no shelling out to an operating system prompt and/or scripting shell scripts to run a backup. In this aspect, the new backup tool mirrors what a DBA would find in Microsoft SQL Server’s backup/restore repertoire (which, of all the other database vendors, I found in my experience to be the best/easiest/most idiot-proof to use).
The syntax in the current alpha release looks like the following:
BACKUP {DATABASE | SCHEMA} { * | db_name [, db_name] ... } TO 'image_file_name' [WITH COMPRESSION [COMPRESSION_ALGORITHM [=] algorithm_name]]; RESTORE FROM 'image_file_name';
Keep in mind that as of this writing, MySQL 6.0 is in alpha and additional syntax changes will likely be made for upcoming beta releases. To see the most current syntax for MySQL 6.0’s backup and restore utility, see the 6.0 manual at: http://dev.mysql.com/doc/refman/6.0/en/backup-database-restore.html.
Beside having easy to use and remember syntax, one much anticipated feature of the new 6.0 backup is the ability to do online backups of the MyISAM storage engine, which, up until this time, had to backed up in an offline fashion. When I say the 6.0 backup for MyISAM is online, I mean it is non-blocking for DML activity – so any INSERT
, UPDATE
, and DELETE
activity can run unhindered while the backup is in progress. It should be noted that any DDL (ALTER TABLE
, etc.) will be suspended and blocked while the backup is process.
In addition to being online for DML activity, the MyISAM backup in 6.0 is also a “native” backup and is not “logical” in nature. A native backup differs from a logical backup in that the backup file is taken from the actual underlying O/S files and is not a readable SQL-based file like a logical backup is. For example, the backup files produced by the mysqldump
utility are logical-based backups. The benefit of a native backup is that it typically runs faster and take less storage space that a logically-based backup.
In addition to being able to do MyISAM online (non-blocking DML) backups, the 6.0 backup utility can also do the same for transactional databases such as InnoDB, Falcon, and other transactional-oriented engines that support consistent snapshot. For those of you using MySQL Cluster, note that the new 6.0 backup won’t support Cluster primarily because Cluster already has its own online backup.
The difference for MyISAM vs. transactional engines at this time is that while MyISAM has a native backup, transactional engine backups are logical in nature. In upcoming versions of the 6.0 backup, native drivers will be developed for MySQL’s key transactional engines.
In terms of how the new 6.0 backup utility differs from mysqldump
, the following table communicates the core differences:
With a brief overview out of the way, let’s now take a quick test drive of the new MySQL 6.0 backup. All the below tests were performed on a small (1GB RAM, single CPU, etc.) Fedora Core Linux box with MySQL 6.0.6, so keep in mind that the performance, etc., that you experience will vary depending on your box and other factors.
Let’s first try the new backup on a 1GB+ MyISAM-only database that has the following data objects:
The backup is performed as follows:
mysql> backup database gimm to '/mybackups/gimm.backup'; +-----------+ | backup_id | +-----------+ | 1 | +-----------+ 1 row in set (58.64 sec)
To test the online, non-blocking DML nature of the backup, I ran a few update
statements while the backup was running:
mysql> update broker set broker_last_name = 'MORTONS' WHERE broker_last_name = 'MORTON'; Query OK, 1 row affected (0.19 sec) Rows matched: 1 Changed: 1 Warnings: 0
mysql> update broker set broker_last_name = 'POTT' WHERE broker_last_name = 'POTTS'; Query OK, 1 row affected (0.19 sec) Rows matched: 1 Changed: 1 Warnings: 0
No problems were found at all – the DML statements ran without hesitation during the backup.
Compare this to a mysqldump
performed on the same database, which ran about 4 minutes…
mysqldump
-uroot -pthisisntmypassword --port=3309 --extended-insert --quick gimm > gimm.dmp
And where this update statement hung until the backup completed…
mysql> update broker set broker_last_name = 'POTTS' WHERE broker_last_name = 'POTT';
… and you can see the new MySQL backup ran nearly 75% faster for this MyISAM database and allowed DML work to continue uninterrupted, which is a welcome change from the current mysqldump
capabilities. Now I should say that in my early tests, the first MyISAM backup I executed ran longer (about 2 more minutes), with every other backup taken afterwards running in the time shown above. The MySQL backup team is exploring why this is the case so I’m sure we’ll see a fix down the road.
You can view backup and restore executions such as the one above via a new metadata table that’s currently stored in the mysql database. It gives you a fair amount of detail on your backup (e.g. size of the backup, number of objects backed up, etc.) and recovery executions for a particular server:
mysql> select * from online_backup\G *************************** 1. row *************************** backup_id: 1 process_id: 0 binlog_pos: 14860603 binlog_file: ./my5-bin.000001 backup_state: complete operation: backup error_num: 0 num_objects: 9 total_bytes: 1149371056 validity_point_time: 2008-08-19 14:14:04 start_time: 2008-08-19 14:13:05 stop_time: 2008-08-19 14:14:04 host_or_server_name: NULL username: root backup_file: /mybackups/gimm.backup user_comment: command: backup database gimm to '/mybackups/gimm.backup' drivers: MyISAM
You can see from the above output that the backup size was a little over 1.1GB, which nearly mirrors the size of the MyISAM database being backed up.
Another metadata table lets you view the operations in progress and has this layout:
mysql> desc online_backup_progress; +-------------+---------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------------+---------------------+------+-----+---------+-------+ | backup_id | bigint(20) unsigned | NO | | NULL | | | object | char(30) | NO | | NULL | | | start_time | datetime | YES | | NULL | | | stop_time | datetime | YES | | NULL | | | total_bytes | bigint(20) | YES | | NULL | | | progress | bigint(20) unsigned | YES | | NULL | | | error_num | int(11) | NO | | 0 | | | notes | char(100) | YES | | NULL | | +-------------+---------------------+------+-----+---------+-------+
Before 6.0 GA’s, expect these tables to become like the MySQL slow and general query logs in that you can specify tables or files for their data storage.
Restoring the database was quite easy using the new restore
command:
mysql> use mysql Database changed mysql> drop database gimm; Query OK, 9 rows affected (1.34 sec) mysql> restore from '/mybackups/gimm.backup'; +-----------+ | backup_id | +-----------+ | 2 | +-----------+ 1 row in set (56.48 sec) mysql> select count(*) from gimm.client_transaction_hist3; +----------+ | count(*) | +----------+ | 14678550 | +----------+ 1 row in set (0.00 sec)
Restoring the same database with a mysqldump
file (done in typical shell> mysql db_name < backup-file.sql
fashion) took 8 minutes 40 seconds, which means the new 6.0 restore function delivered an 89% reduction in response time as a typical mysqldump
restore – certainly a nice-to-have given that critical database restores in production environments usually need to be done as quickly as possible.
Let’s now try the new backup on a 1GB+ InnoDB-only database that mirrors the MyISAM database we just tested with:
mysql> backup database gim2 to '/mybackups/gim2.backup'; +-----------+ | backup_id | +-----------+ | 10 | +-----------+ 1 row in set (3 min 21.83 sec)
The following update and others were issued during the backup to ensure DML was not suspended:
mysql> update gim2.office_location set office_city='WASHINGTON' where office_city = 'WASHINGTONS'; Query OK, 1 row affected (0.00 sec) Rows matched: 1 Changed: 1 Warnings: 0
As you can see, the backup for the InnoDB database ran quite a bit longer than the MyISAM backup due to the fact that it’s a logical-based backup as opposed to a native backup. In the current alpha of MySQL 6.0.6, it also ran a little slower (10 seconds) than a mysqldump
of the same database. However, the new MySQL 6.0 did have an advantage in terms of overall backup size footprint; the mysqldump
backup file was about 500MB larger than the 6.0 backup file.
Restoring the InnoDB database was performed as follows:
mysql> restore from '/mybackups/gim2.backup'; +-----------+ | backup_id | +-----------+ | 15 | +-----------+ 1 row in set (4 min 44.11 sec)
Restoring the same InnoDB database with a mysqldump
file took 11 minutes 30 seconds, which means that, although there wasn’t much of a speed difference between mysqldump
and the 6.0 backup for InnoDB, the new 6.0 restore function delivered a 59% reduction in response time vs. a typical mysqldump
restore, which is, again, nice to have.
The new 6.0 backup works just fine with databases using multiple engines (except for MySQL Cluster). For example, in the test MyISAM database we’ve been using, let’s switch out one of the MyISAM tables for an InnoDB table and see what happens:
mysql> use gimm; Database changed mysql> show tables; +--------------------------+ | Tables_in_gimm | +--------------------------+ | broker | | client | | client_transaction | | client_transaction_hist | | client_transaction_hist2 | | client_transaction_hist3 | | investment | | investment_type | | office_location | +--------------------------+ 9 rows in set (0.02 sec) mysql> alter table client_transaction_hist2 engine=innodb; Query OK, 1120500 rows affected (23.33 sec) Records: 1120500 Duplicates: 0 Warnings: 0 mysql> backup database gimm to '/mybackups/gimmi.backup'; +-----------+ | backup_id | +-----------+ | 21 | +-----------+ 1 row in set (1 min 41.45 sec)
So we see a slight reduction in response time as the backup uses the native MyISAM driver for all but one table, and then does a logical backup of the one million row InnoDB table.
The MySQL 6.0 backup has a compression feature that allows you to compress the size of a backup file down using standard zlib compression. Invoking compression is very easy and is done with just a single keyword.
For example, let’s test compression with our 1GB MyISAM database:
mysql> backup database gimm to '/mybackups/gimm.backup' with compression; +-----------+ | backup_id | +-----------+ | 16 | +-----------+ 1 row in set (2 min 15.38 sec)
And then try compression with the 1GB+ InnoDB database:
mysql> backup database gim2 to '/mybackups/gim2.backup' with compression; +-----------+ | backup_id | +-----------+ | 17 | +-----------+ 1 row in set (3 min 38.69 sec)
Notice that in the current alpha (6.0.6), the MyISAM backup takes more than double the time to execute over a non-compressed backup, whereas the logical-based InnoDB backup doesn’t suffer much from using compression at all. MySQL’s backup team is currently looking into the response time hit on the MyISAM backup, so stay tuned for updates.
In both cases, however, an 80%+ reduction in the backup file footprint is realized, which is a nice benefit. For example, the MyISAM (1GB+ sized) and InnoDB compressed backup size is 214MB vs. 1.1GB for a non-compressed MyISAM backup and a 1.6GB InnoDB backup.
Doing a restore of the 1GB MyISAM database was accomplished with the normal restore command:
mysql> use mysql; Database changed mysql> drop database gimm; Query OK, 9 rows affected (1.39 sec)
mysql> restore from '/mybackups/gimm.backup'; +-----------+ | backup_id | +-----------+ | 18 | +-----------+ 1 row in set (44.39 sec)
mysql> select count(*) from gimm.client_transaction_hist3; +----------+ | count(*) | +----------+ | 14678550 | +----------+ 1 row in set (0.00 sec)
As you can see, the restore of the MyISAM database ran actually faster than the non-compressed restore. Restoring the 1GB+ InnoDB database from a compressed backup took around the same time as the non-compressed restore:
mysql> drop database gim2; Query OK, 9 rows affected (0.34 sec) mysql> restore from '/mybackups/gim2.backup'; +-----------+ | backup_id | +-----------+ | 20 | +-----------+ 1 row in set (4 min 30.38 sec)
mysql> use gim2; Database changed mysql> show tables; +--------------------------+ | Tables_in_gim2 | +--------------------------+ | broker | | client | | client_transaction | | client_transaction_hist | | client_transaction_hist2 | | client_transaction_hist3 | | investment | | investment_type | | office_location | +--------------------------+ 9 rows in set (0.00 sec)
In MySQL 6.0, the first cut of the new backup utility does have a number of restrictions that you should be aware of, the important of which includes the following:
mysqldump
of the mysql database that is done in keeping with the dynamics of your instance (i.e. the more changes in logons, etc., you do, the more frequent you should run mysql database backups). The first version of MySQL 6.0’s new backup offers some nice benefits, but naturally, can’t do everything in the initial cut. The current roadmap for the backup/restore utility looks like this at the current time:
There are other enhancements on the table right now, but the above represents the major items on user’s wish-lists. If there’s any feedback or roadmap items you’d like to include, please don’t hesitate to contact me (robin.schumacher@sun.com).
More than a few years has passed since I totally trashed a major customer’s database, but fortunately, I made everything right in the end by having a basic disaster recovery plan in place. As someone responsible for MySQL, you need to ensure you have a solid plan in place for protecting your data and being able to recover it in the event of a catastrophe, whether that involves a total server melt-down or a user with too powerful permissions deleting a few rows of critical corporate data.
The new backup and restore utility in MySQL 6.0 can play a strong role in your disaster recovery framework, so it would be wise for you to become familiar with it now. You can download MySQL 6.0 and kick the tires of the new backup very easily by visiting http://dev.mysql.com/downloads/mysql/6.0.html. And being that it’s currently in alpha, we certainly would appreciate you reporting any bugs via our online bug system at: http://bugs.mysql.com/. Lastly, don’t forget to read through the MySQL 6.0 manual to become familiar with the details of the new utility – you can find that at: http://dev.mysql.com/doc/refman/6.0/en/backup-database-restore.html.
We certainly appreciate all the help and feedback you can provide with the new backup and restore utility – and thanks as always for your support of MySQL and Sun.