Robin Schumacher is MySQL's Director of Product Management and has over 13 years of database experience in DB2, MySQL, Oracle, SQL Server and other database engines. Before joining MySQL, Robin was Vice President of Product Management at Embarcadero Technologies.
As any experienced MySQL user knows, the MySQL server uses different storage engines that live underneath an upper layer of services, utilities, and SQL interfaces, to manage the actual I/O and data storage tasks. Each storage engine has its own set of characteristics, some of which are shared with other storage engines, but some of which are unique to a particular engine. One area that a storage engine may differ from other engines is in the way it manages data on disk. Some engines (like MyISAM, Archive, and InnoDB using the file-per-table option) map each table and index to separate and distinct files that can be targeted to most any accessible file system/directory. Other engines (like InnoDB and SolidDB) use a quasi-concept of tablespaces, with different files being specified in the MySQL configuration file and used for all tables and indexes in a MySQL instance.
The Falcon transactional storage engine uses the concept of user-defined tablespaces that are more in line with what one would find in Oracle (tablespaces) or SQL Server (filegroups). Let’s take a quick look at how tablespaces are used within Falcon and what you can and can’t do with them.
Unlike some storage engines that use a fixed page size unless you do some work at the compiler level (MyISAM is 2K; InnoDB is 16K), Falcon lets you assign a custom page size that will be used by all subsequently created tablespaces in a MySQL instance. Currently, you can specify a page size anywhere from 2-32K. The only catch? You only get to specify it once in the MySQL configuration file during the initial MySQL instance startup/creation. Note that if you want to change the current Falcon page size, you can backup any falcon objects, drop all storage containers and log files, reinitialize the MySQL instance with a new page size, and restore all your objects back to your databases.
With respect to internals, Falcon creates an initial datafile for a tablespace at four times the configured page size and auto-increments it in that amount each time more space is needed. You can change the amount of the initial allocation for all new tablespaces by specifying a number (representing MB) for the falcon_initial_allocation config parameter
. The first four pages of each tablespace datafile are not used for user data, but for management purposes. The first page is the datafile header, which identifies the file as a Falcon tablespace and specifies useful things like the page size, the on-disk structure version, the creation time, and more. The second page is a free space page, which contains bits that indicate whether the pages that follow are free or in use. The third page is the top of a lookup structure used to locate the lookup structures for table data, sequences, etc., and the fourth page is the top of a lookup structure that locates the top page of indexes.
Just like Oracle and SQL Server, Falcon creates a system tablespace for its own use (falcon_master.fts), a temporary tablespace for workarea handling (falcon_temporary.fts), and a default tablespace for you to use for your own tables, indexes, and BLOB data (falcon_user.fts). These tablespaces are created in the default MySQL data directory. Note that a tablespace can be used for multiple databases within the MySQL instance, so Falcon tablespaces are more like Oracle tablespaces (with schemas) than SQL Server filegroups in this respect.
The syntax used to create a Falcon tablespace is very simple (below example done on the Windows platform):
mysql> create tablespace gimdata -> add datafile 'c:/dev/falcon/gimdata.fts' -> engine=falcon; Query OK, 0 rows affected (0.03 sec)
To create a tablespace, you provide a unique name for the tablespace, a single datafile name (and path if desired), and use the ENGINE=FALCON
clause (done because the NDB/Cluster storage engine also uses tablespaces). Notice that you do not provide a size for the initial size of the tablespace’s datafile. As previously stated, Falcon initially creates a datafile the size equal to 4X the Falcon page size, so for the default Falcon page size (4K), this would equate to a 16K initial datafile size. Also note that you only provide one datafile name. Currently, each Falcon tablespace only supports one datafile with the limit per tablespace being 110 Terabytes. Expect tablespaces to support multiple datafiles in future Falcon versions. The final thing to note about tablespace datafiles is that they extend automatically to accommodate incoming data and they have no DDL defined/preassigned limit with respect to how large they can grow.
One good thing about Falcon tablespaces is that they are dynamic in nature; you can create and drop as many of them as you would like without impacting the running of the MySQL instance. This differs from InnoDB, which requires that you manually add additional datafiles to the MySQL configuration file, stop and restart the server to add more storage space (unless you are using the InnoDB file-per-table option).
Creating tables and indexes in Falcon tablespaces is easily done via the CREATE DDL
syntax:
mysql> create table t1 (c1 int) engine=falcon tablespace gimdata; Query OK, 0 rows affected (0.09 sec)
Once created, you populate and use the table as you would any other table and really give no second thoughts to the tablespace storage container itself. If, however, you find that you want to relocate existing Falcon objects to a new/different tablespace, you simply use the ALTER TABLE command and specify the new target tablespace:
mysql> alter table t1 tablespace gimdata2; Query OK, 0 rows affected (0.14 sec)
One important thing to understand in Falcon is that you cannot separate Falcon tables and indexes into separate tablespaces. Most database systems separate indexes from data because their index access strategy requires reading the index and data alternately, which leads to a bouncing of the disk head between the data and index. The idea is that putting them in separate files, preferably on different devices, reduces contention. Falcon reads the index first, then retrieves data in storage order, which eliminates that problem so it’s better for Falcon to keep its tables and indexes stored together.
Tablespaces that house Falcon objects cannot be dropped; only empty tablespaces can be removed with the DROP TABLESPACE < tablespace name > ENGINE=FALCON
command. Note that the actual Falcon datafiles are also removed from the server when a DROP TABLESPACE command is issued, which is nice from a cleanup standpoint.
If you want to know what objects are assigned to what tablespaces, currently you simply query the INFORMATION_SCHEMA object FALCON_TABLES
:
mysql> select * from information_schema.falcon_tables; +-------------+--------------------+-------------+ | SCHEMA_NAME | TABLE_NAME | TABLESPACE | +-------------+--------------------+-------------+ | GIMF | BROKER | FALCON_USER | | GIMF | CLIENT | FALCON_USER | | GIMF | CLIENT_TRANSACTION | FALCON_USER | | GIMF | INVESTMENT | FALCON_USER | | GIMF | INVESTMENT_TYPE | FALCON_USER | | GIMF | OFFICE_LOCATION | FALCON_USER | | GIMF | FTEST | GIMFDATA | | GIM | T1 | GIMDATA2 | +-------------+--------------------+-------------+
In the current Falcon Alpha, there is no way via the INFORMATION_SCHEMA
to see storage diagnostic info like total, used, and free space for tablespaces. Before the GA release of Falcon, there will either be a tablespace object available for this, or the space for Falcon tables and indexes will be reported in the INFORMATION_SCHEMA.TABLES
object, which can then be joined to the FALCON_TABLES
object to produce such reports.
One worry that DBA’s have always had is space fragmentation and the inevitable resulting reorganizations that must be done to reclaim wasted space and more efficiently organize individual object storage within the tablespace. Falcon completely reuses any empty pages that have resulted from deleted data, so the need to perform tablespace reorganizations is very unlikely from a “plug the holes” standpoint. Performing a reorg because of object fragmentation in the tablespace is another matter and might be needed depending on the shape the object is in and the activity it receives. No diagnostics are available yet to determine such a thing, but they are on the roadmap.
MySQL DBA’s wishing more freedom in storage management will be pleased with what they find with Falcon tablespaces. Although not perfect, Falcon tablespaces provide an easy-to-use way for dynamically allocating storage and assigning objects where desired. More feature enhancements are on the horizon for Falcon tablespaces, so stay tuned for coming additions.
For more information on Falcon, you can download the updated Falcon whitepaper, check out other articles on Falcon that are currently on the dev zone, review the Falcon documentation, and visit the Falcon online forum.
Of course, you can also download the latest MySQL 6.0 Alpha (of which Falcon is only the first new feature) to kick the tires of Falcon tablespaces yourself. Give them a try today and shoot me what you both like and hate at rschumacher@mysql.com. The latest Alpha build can be found at: http://dev.mysql.com/downloads/mysql/6.0.html.
Thanks again for your support of MySQL!
Read and post comments on this article in the MySQL Forums. There are currently 2 comments.