Robin Schumacher is MySQL's Director of Product Management and has over 13 years of database experience in DB2, MySQL, Oracle, SQL Server and other database engines. Before joining MySQL, Robin was Vice President of Product Management at Embarcadero Technologies.
MySQL 5.0's Pluggable Storage Engine Architecture, Part 2: A Quick Test Drive
By Robin Schumacher, MySQL AB
October 26, 2005
Part 1 of MySQL 5.0's Pluggable Storage Engine Architecture gave you an overview Let's take a quick test drive to show how simple it is to use MySQL's pluggable storage engine architecture and see the impact different storage engines can have on performance. All tests below were conducted on a Dell Red Hat Fedora Core 4 box with a Pentium 4 3.00 GHz processor (hyperthreading enabled) and 1GB of RAM.
For example, let's say you have a very insert-intensive/logging application that requires the absolute fastest response times possible when it comes to handling incoming insert activity. Reads of loaded data for analysis purposes is also required.
You want to test out different scenarios and find the best possible solution, so you settle on a simple iterative test of inserting three million sample logging records into a table to test insert load speed.
First, because you have some legacy systems on another database, you decide to first give it a try and see how well it performs for your simple test. You create the necessary test table and procedure, and then execute your test:
SQL> desc test_insert; Name Null? Type ----------------------------------------- -------- ------------------- C1 NUMBER(38) C2 VARCHAR2(20) C3 DATE SQL> CREATE OR REPLACE PROCEDURE P_TEST_INSERT 2 AS 3 v_CTR NUMBER; 4 BEGIN 5 6 -- start loop for insert 7 8 FOR CTR IN 1..3000000 LOOP 9 10 INSERT INTO TEST_INSERT VALUES (1, 'sample audit string',sysdate); 11 12 END LOOP; 13 14 END; 15 / Procedure created. SQL> set timing on; SQL> exec p_test_insert; Elapsed: 00:03:03.01 SQL>
The legacy database doesn't do too bad and comes in at about one million records a minute. You now decide to try MySQL as you've heard MySQL is used on many high traffic web sites for its ability to handle heavy volumes of insert activity. You first decide to view all the available storage engines for MySQL 5.0 by issuing a SHOW ENGINES from the MySQL command line utility:
mysql> show engines; +------------+---------+----------------------------------------------------------------+ | Engine | Support | Comment | +------------+---------+----------------------------------------------------------------+ | MyISAM | DEFAULT | Default engine as of MySQL 3.23 with great performance | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | | HEAP | YES | Alias for MEMORY | | MERGE | YES | Collection of identical MyISAM tables | | MRG_MYISAM | YES | Alias for MERGE | | ISAM | NO | Obsolete storage engine, now replaced by MyISAM | | MRG_ISAM | NO | Obsolete storage engine, now replaced by MERGE | | InnoDB | YES | Supports transactions, row-level locking, and foreign keys | | INNOBASE | YES | Alias for INNODB | | BDB | YES | Supports transactions and page-level locking | | BERKELEYDB | YES | Alias for BDB | | NDBCLUSTER | NO | Clustered, fault-tolerant, memory-based tables | | NDB | NO | Alias for NDBCLUSTER | | EXAMPLE | NO | Example storage engine | | ARCHIVE | YES | Archive storage engine | | CSV | NO | CSV storage engine | | FEDERATED | YES | Federated MySQL storage engine | | BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) | +------------+---------+----------------------------------------------------------------+
You then duplicate your test and first target the InnoDB storage
engine as you've heard it is very close to legacy databases in
terms of functionality. First you validate your test object is using
the InnoDB storage engine through the SHOW CREATE TABLE command:
Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 5 to server version: 5.0.12-beta-max Type 'help;' or '\h' for help. Type '\c' to clear the buffer. mysql> use test Database changed mysql> show create table insert_test\G *************************** 1. row *************************** Table: insert_test Create Table: CREATE TABLE `insert_test` ( `c1` int(11) default NULL, `c2` varchar(20) default NULL, `c3` date default NULL ) ENGINE=INNODB DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
Now you create a test stored procedure and run your test:
mysql> delimiter // mysql> create procedure test_insert() -> begin -> declare v_ctr mediumint; -> set v_ctr = 0; -> while v_ctr < 3000000 -> do -> insert into insert_test values (1,'sample audit string',now()); -> set v_ctr = v_ctr + 1; -> end while; -> end -> // Query OK, 0 rows affected (0.05 sec) mysql> delimiter ; mysql> call test_insert(); Query OK, 1 row affected (3 min 4.75 sec)
Performance of MySQL using the InnoDB storage engine is very close to the legacy database. You then decide to try the MyISAM storage engine, which is the default storage engine that comes with MySQL. Testing the MyISAM storage engine requires only one simple DDL statement to change the current InnoDB table to MyISAM (the table was also truncated beforehand so it would be empty for the test):
mysql> truncate table insert_test; Query OK, 0 rows affected (0.05 sec) mysql> alter table insert_test engine=myisam; Query OK, 0 rows affected, 0 warning (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> call test_insert(); Query OK, 1 row affected (1 min 49.88 sec)
MyISAM does pretty well beating the legacy database and InnoDB by over 73 seconds, which equates to about a 40% improvement in overall performance. While you're pleased, you've heard about a new storage engine in MySQL 5.0 called Archive that's supposed to be even more efficient for insert activity, plus it offers the added benefit of transparent data compression (with reported storage savings being in the neighborhood of 80% in some cases), row-level locking, and a consistent/snapshot read. As these features help support your application, you decide to give it try. Testing the Archive engine again requires only one simple DDL statement to change the current MyISAM table to Archive:
mysql> truncate table insert_test; Query OK, 0 rows affected (0.05 sec) mysql> alter table insert_test engine=archive; Query OK, 0 rows affected, 0 warning (0.03 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> call test_insert(); Query OK, 1 row affected (1 min 13.83 sec)
The MySQL 5.0 Archive engine beats the MyISAM insert test by 36 seconds, giving it a 33% improvement over MyISAM and a whopping 60% response time improvement over the legacy system and InnoDB.
Beyond the above insert speed test results, you can see how easy it is to switch from within MySQL to what amounts to a completely new database engine — one that can offer a dramatically positive impact for the properly targeted application.
The beautiful thing is, you don't have to worry about any dependencies that have been previously defined on the former storage engine object. Whether you have created procedures, triggers, views, functions, base indexes (keep in mind, different storage engines support different indexing strategies), etc., they all are retained after the ALTER TABLE statement and support the new storage engine object. Moreover, all the same SQL, SQL functions, and management functions such as replication transparently work without any changes or intervention necessary on part of the DBA or developer.
After your test, you want to check table scan times for your legacy system and MySQL. Once again, whether a table is loaded with data or not, switching between MySQL storage engines is a piece of cake. First, you perform a full table scan of the three million row legacy table with the following results:
SQL> select count(*) from test_insert where c1 = 1; COUNT(*) ---------- 3000000 Elapsed: 00:00:01.43 SQL>
Then you try scan times for the MySQL Archive, MyISAM, and InnoDB storage engines:
mysql> show create table insert_test\G *************************** 1. row *************************** Table: insert_test Create Table: CREATE TABLE `insert_test` ( `c1` int(11) default NULL, `c2` varchar(20) default NULL, `c3` date default NULL ) ENGINE=ARCHIVE DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> select count(*) from insert_test where c1 = 1; +----------+ | count(*) | +----------+ | 3000000 | +----------+ 1 row in set (1.98 sec) mysql> alter table insert_test engine=myisam; Query OK, 3000000 rows affected, 0 warning (5.86 sec) Records: 3000000 Duplicates: 0 Warnings: 0 mysql> select count(*) from insert_test where c1 = 1; +----------+ | count(*) | +----------+ | 3000000 | +----------+ 1 row in set (1.86 sec) mysql> alter table insert_test engine=innodb; Query OK, 3000000 rows affected (35.89 sec) Records: 3000000 Duplicates: 0 Warnings: 0 mysql> select count(*) from insert_test where c1 = 1; +----------+ | count(*) | +----------+ | 3000000 | +----------+ 1 row in set (4.05 sec)
In these tests, the legacy system has a slight performance advantage in the full table scan with MyISAM coming in at less than half a second difference, followed closely by the Archive storage engine ( with each scan being under two seconds), and then finally InnoDB. Notice how easy and quick it is to change MySQL storage engines, even when millions of rows of data exist in a table.
Because of MySQL's unique architecture, you can easily disable certain engines if you choose. This can actually have positive benefits as some engines carry various amounts of overhead when enabled even though they are not used. "Unplugging" them can ensure that only the necessary amount of resources are used by the MySQL server.
Of course, you don't want to disable engines that carry specific feature sets that pertain to your particular application, but once you determine your system's needs, you should then evaluate which MySQL storage engines you can "unplug" from the server.
Some have worried that certain large-scale features such as replication, stored procedures, etc., are embedded in the various storage engines, and disabling or not using a certain engine will have far reaching implications. As we've already seen in the above discussion on the various MySQL layers, this isn't true.
Let's say you determine you don't need the InnoDB transactional engine for your particular application. Disabling InnoDB in MySQL is quite easy and involves setting only one configuration parameter in the MySQL configuration file:
# The MySQL Server [mysqld] Skip-innodb . . .
We then stop and start the MySQL server and check if InnoDB is truly disabled.
mysql> show engines; +------------+---------+----------------------------------------------------------------+ | Engine | Support | Comment | +------------+---------+----------------------------------------------------------------+ | MyISAM | DEFAULT | Default engine as of MySQL 3.23 with great performance | | MEMORY | YES | Hash based, stored in memory, useful for temporary tables | | HEAP | YES | Alias for MEMORY | | MERGE | YES | Collection of identical MyISAM tables | | MRG_MYISAM | YES | Alias for MERGE | | ISAM | NO | Obsolete storage engine, now replaced by MyISAM | | MRG_ISAM | NO | Obsolete storage engine, now replaced by MERGE | | InnoDB | DISABLED| Supports transactions, row-level locking, and foreign keys | | INNOBASE | DISABLED| Alias for INNODB | | BDB | YES | Supports transactions and page-level locking | | BERKELEYDB | YES | Alias for BDB | | NDBCLUSTER | NO | Clustered, fault-tolerant, memory-based tables | | NDB | NO | Alias for NDBCLUSTER | | EXAMPLE | NO | Example storage engine | | ARCHIVE | YES | Archive storage engine | | CSV | NO | CSV storage engine | | FEDERATED | YES | Federated MySQL storage engine | | BLACKHOLE | YES | /dev/null storage engine (anything you write to it disappears) | +------------+---------+----------------------------------------------------------------+
Then we test a variety of MySQL support features to ensure all is well. Let's first test stored procedures:
mysql> show create table SH_Part\G *************************** 1. row *************************** Table: SH_Part Create Table: CREATE TABLE `SH_Part` ( `Model` int(11) NOT NULL, `ProductID` int(11) NOT NULL, `Serial_Number` char(12) NOT NULL, `Sub_Category` char(5) NOT NULL, `Version` int(11) NOT NULL, `Part_Name` char(24) NOT NULL, `Comment1` char(30) NOT NULL, `Price` double NOT NULL, `VendorID` int(11) NOT NULL ) ENGINE=MyISAM DEFAULT CHARSET=latin1 1 row in set (0.00 sec) mysql> delimiter // mysql> create procedure testing () -> begin -> select count(*) from SH_Part; -> end -> // Query OK, 0 rows affected (0.03 sec) mysql> delimiter ; mysql> call testing(); +----------+ | count(*) | +----------+ | 40000 | +----------+ 1 row in set (0.00 sec) Query OK, 0 rows affected (0.00 sec)
Now, how about a trigger that encrypts data on insert:
mysql> create table test (userid int, username varchar(20), userssn varchar(15)); Query OK, 0 rows affected (0.00 sec) mysql> delimiter // mysql> create trigger t_test_insert before insert on test -> for each row -> begin -> set NEW.userssn = aes_encrypt(NEW.userssn,'password'); -> end -> // Query OK, 0 rows affected (0.01 sec) mysql> delimiter ; mysql> insert into test values (1, 'robin','432123454'); Query OK, 1 row affected, 0 warning (0.01 sec) mysql> select * from test; +--------+----------+-----------------+ | userid | username | userssn | +--------+----------+-----------------+ | 1 | robin | O╗¿▓A"⌡╕Qi►ô╫█v | +--------+----------+-----------------+ 1 row in set (0.00 sec)
We could continue on with all other MySQL 5.0 features, but the above tests should suffice to prove beyond any doubt that 5.0 features such as stored procedures, triggers, and views are not dependent on InnoDB or any other particular storage engine.
The marriage of open source freedom and the MySQL pluggable storage engine architecture means that you can extend the MySQL database server to include custom storage engines that meet specialized application needs that aren't 100% addressed by bundled MySQL storage engines. Many MySQL customers have done this and are experiencing great success in both small and large enterprises. To assist customers, MySQL ships with an Example storage engine that can be used to jump-start any custom storage engine project.
The subject matter for creating a custom storage engine is beyond the scope of this paper; however, a nice short tutorial on how to create a custom storage engine exists on the MySQL developer zone web site at http://dev.mysql.com/tech-resources/articles/creating-new-storage-engine.html. For anyone wishing to build a MySQL custom storage engine, the above referenced article offers a very nice start. MySQL and the open source community continue to develop new storage engines regularly both for specialized and mainstream purposes.
The growing popularity of MySQL 5.0 can be attributed to many factors, but one thing that is particularly attractive is MySQL's pluggable storage engine architecture, which affords great flexibility and choice to any IT professional that is charged with designing high-powered database applications. The benefits of working with a transparent data management support and services layer that is joined to interchangeable database storage engines is indeed unique in the database industry and is a combination that is hard to beat.