The following terms are useful to an understanding of MySQL Cluster or have specialised meanings when used in relation to it.
Cluster: In its generic sense, a cluster is a set of computers functioning as a unit and working together to accomplish a single task. NDB Cluster is the storage engine used by MySQL to implement data storage, retrieval, and management distributed amongst several computers. MySQL Cluster refers to a group of computers working together using the NDB engine to support a distributed MySQL database in a shared-nothing architecture using in-memory storage.
Configuration files: Text files containing directives and information regarding the cluster, its hosts, and its nodes. These are read by the cluster's management nodes when the cluster is started. See Section 16.4.4, « Fichier de configuration » for details.
Backup: A complete copy of all cluster data, transactions and logs, saved to disk or other long-term storage.
Restore: Returning the cluster to a previous state as stored in a backup.
Checkpoint: Generally speaking, when data is saved to disk, it is said that a checkpoint has been reached. More specific to Cluster, it is a point in time where all committed transactions are stored on disk. With regard to the NDB storage engine, there are two sorts of checkpoints which work together to ensure that a consistent view of the cluster's data is maintained:
Local Checkpoint (LCP): This is a checkpoint that is specific to a single node; however, LCP's take place for all nodes in the cluster more or less concurrently. An LCP involves saving all of a node's data to disk, and so usually occurs every few minutes. The precise interval varies, and depends upon the amount of data stored by the node, the level of cluster activity, and other factors.
Global Checkpoint (GCP): A GCP occurs every few seconds, when transactions for all nodes are synchronised and the redo-log is flushed to disk.
Cluster host: A computer making up part of a MySQL Cluster. A cluster has both a physical structure and a logical structure. Physically, the cluster consists of a number of computers, known as cluster hosts (or more simply as hosts). See also Node, Node group.
Node: This refers to a logical or functional unit of MySQL Cluster, sometimes also referred to as a cluster node. In the context of MySQl Cluster, we use the term node to indicate a process rather than a physical component of the cluster. There are three node types required to implement a working MySQL Cluster. These are:
Management (MGM) nodes: Manages the other nodes within the MySQL Cluster. It provides configuration data to the other nodes; starts and stops nodes; handles network partitioning; creates backups and restores from them, and so forth.
SQL (MySQL server) nodes: Instances of MySQL Server which serve as front ends to data kept in the cluster's storage nodes. Clients desiring to store, retrieve, or update data can access an SQL node just as they would any other MySQL Server, employing the usual authentication methods and API's; the underlying distribution of data between node groups is transparent to users and applications. SQL nodes access the cluster's databases as a whole without regard to the data's distribution across different storage nodes or cluster hosts.
Data nodes (also referred to as storage nodes): These nodes store the actual data. Table data fragments are stored in a set of node groups; each node group stores a different subset of the table data. Each of the nodes making up a node group stores a replica of the fragment for which that node group is responsible. Currently a single cluster can support a total of up to 48 data nodes.
It is possible for more than one node to co-exist on a single machine. (In fact, it is even possible to set up a complete cluster on one machine, although one would almost certainly not want to do this in a production environment.) It may be helpful to remember that, when working with MySQL Cluster, host refers to a physical component of the cluster whereas a node is a logical or functional component (that is, a process).
Note Regarding Obsolete Terms: In older versions of the MySQL Cluster documentation, data nodes were sometimes referred to as "Database nodes" or "DB nodes". In addition, SQL nodes were sometimes known as "client nodes" or "API nodes". This older terminology has been deprecated in order to minimize confusion, and for these reasons should be avoided.
Node group: A set of data nodes. All data nodes in a node group contain the same data (fragments), and all nodes in a single group should reside on different hosts. It is possible to control which nodes belong to which node groups.
Node failure: MySQL Cluster is not solely dependent upon the functioning of any single node making up the cluster; the cluster can continue to run if one or more nodes fail. The precise number of node failures that the cluster can tolerate depends upon the number of nodes and the cluster's configuration.
Node restart: The process of restarting a failed cluster node.
Initial node restart: The process of starting a cluster node with its filesystem removed. This is sometimes used in the course of software upgrades and in other special circumstances.
System crash (or system failure): This can occur when so many cluster nodes have failed that the cluster's state can no longer be guaranteed.
System restart: The process of restarting the cluster and reinitialising its state from disk logs and checkpoints. This is required after either a planned or an unplanned shutdown of the cluster.
Fragment: A portion of a database table; in the NDB storage engine, a table is broken up into and stored as a number of fragments. A fragment is sometimes also called a partition; however, "fragment" is the preferred term. Tables are fragmented in MySQL Cluster in order to facilitate load balancing between machines and nodes.
Replica: Under the NDB storage engine, each table fragment has number of replicas stored on other storage nodes in order to provide redundancy. Currently there may be up 4 replicas per fragment.
Transporter: A protocol providing data transfer between nodes. MySQL Cluster currently supports 4 different types of transporter connections: TCP/IP (local), TCP/IP (remote), SCI, and SHM (experimental in MySQL 4.1).
TCP/IP is, of course, the familiar network protocol that underlies HTTP, FTP, etc., on the Internet.
SCI (Scalable Coherent Interface) is a high-speed protocol used in building multiprocessor systems and parallel-processing applications. Use of SCI with MySQL Cluster requires specialised hardware and is discussed in Section 16.7.1, « Configurer le cluster MySQL avec les sockets SCI ». For a basic introduction to SCI, see this essay at dolphinics.com.
SHM stands for Unix-style
shared
memory segments. Where
supported, SHM is used automatically to connect nodes
running on the same host. This is experimental in MySQL
4.1, but we intend to enable it fully in MySQL 5.0. The
Unix
man page for shmop(2)
is a good
place to begin obtaining additional information about this
topic.
Note: The cluster transporter is internal to the cluster. Applications using MySQL Cluster communicate with SQL nodes just as they do with any other version of MySQL Server (via TCP/IP or Windows named pipes/Unix sockets). Queries can be sent and results retrieved using the standard APIs.
NDB: Refers to the storage engine used to enable MySQL Cluster. The NDB storage engine supports all the usual MySQL column types and SQL statements, and is ACID-compliant. This engine also provides full support for transactions (commits and rollbacks). "NDB" stands for Network Database.
Share-nothing architecture: The ideal architecture for a MySQL Cluster. In a true share-nothing setup, each node runs on a separate host. The advantage such an arrangement is that there no single host or node can act as single point of failure or as a performance bottle neck for the system as a whole.
In-memory storage: All data stored in each data node is kept in memory on the node's host computer. For each data node in the cluster, you must have available an amount of RAM equal to the size of the database times the number of replicas, divided by the number of data nodes. Thus, if the database takes up 1 gigabyte of memory, and you wish to set up the cluster with 4 replicas and 8 data nodes, a minimum of 500 MB memory will be required per node. Note that this is in addition to any requirements for the operating system and any applications running on the host.
Table: As is usual in the context of a relational database, the term "table" denotes an ordered set of identically structured records. In MySQL Cluster, a database table is stored in a data node as a set of fragments, each of which is replicated on additional data nodes. The set of data nodes replicating the same fragment or set of fragments is referred to as a node group.
Cluster programs: These are command-line programs used in running, configuring and administering MySQL Cluster. They include both server daemons:
ndbd
: The data node daemon (runs a data
node process)
ndb_mgmd
: The management server daemon
(runs a management server process)
and client programs:
ndb_mgm
: The management client
(provides an interface for executing management commands)
ndb_waiter
: Used to verify status of
all nodes in a cluster
ndb_restore
: Restores cluster data from
backup
For more about these programs and their uses, see Section 16.5, « Serveur de gestion du cluster MySQL ».
Event log: MySQL Cluster logs events by category (startup, shutdown, errors, checkpoints, etc.), priority, and severity. A complete listing of all reportable events may be found in Section 16.6.2, « Rapport d'événements générés par le cluster MySQL ». Event logs are of two types:
Cluster log: Keeps a record of all desired reportable events for the cluster as a whole.
Node log: A separate log is also kept for each individual node.
Under normal circumstances, it is necessary and sufficient to keep and examine only the cluster log. The node logs need be comsulted only for application development and debugging purposes.
This is a translation of the MySQL Reference Manual that can be found at dev.mysql.com. The original Reference Manual is in English, and this translation is not necessarily as up to date as the English version.