What is MySQL partitioning ?

MySQL partitioning makes data distribution of individual tables (typically we recommend partition for large & complex I/O table for performance, scalability and manageability) across multiple files based on partition strategy / rules. In very simple terms, different portions of table are stored as separate tables in different location to distribute I/O optimally. The user defined division of data by some rule is known as partition function, In MySQL we partition data by RANGE of values / LIST of values / internal hashing function / linear hashing function. By restricting the query examination on the selected partitions by matching rows increases the query performance by multiple times compared to the same query on a non partitioned table, This methodology is also called partition pruning (trimming of unwanted partitions), Please find below example of partition pruning:

CREATE TABLE tab1 (
    col1 VARCHAR(30) NOT NULL,
    col2 VARCHAR(30) NOT NULL,
    col3 TINYINT UNSIGNED NOT NULL,
    col4 DATE NOT NULL
)
PARTITION BY RANGE( col3 ) (
    PARTITION p0 VALUES LESS THAN (100),
    PARTITION p1 VALUES LESS THAN (200),
    PARTITION p2 VALUES LESS THAN (300),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

Write a SELECT query benefitting partition pruning:

SELECT col1, col2, col3, col4
FROM tab1
WHERE col3 > 200 AND col3 < 250;

What is explicit partitioning in MySQL and how is it different from partition pruning ? 

In MySQL we can explicitly select partition and sub-partitions when executing a statement matching a given WHERE condition, This sounds very much similar to partition pruning, but there is a difference:

  • Partition to be checked are explicitly mentioned in the query statement, In partition pruning it is automatic.
  • In explicit partition, the explicit selection of partitions is supported for both queries and DML statements, partition pruning applies only to queries.
  • SQL statements supported in explicit partitioning – SELECT, INSERT, UPDATE, DELETE, LOAD DATA, LOAD XML and REPLACE

Explicit partition example:

CREATE TABLE customer  (
    cust_id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
    cust_fname VARCHAR(25) NOT NULL,
    cust_lname VARCHAR(25) NOT NULL,
    cust_phone INT NOT NULL,
    cust_fax INT NOT NULL
) 
    PARTITION BY RANGE(cust_id)  (
        PARTITION p0 VALUES LESS THAN (100),
        PARTITION p1 VALUES LESS THAN (200),
        PARTITION p2 VALUES LESS THAN (300),
        PARTITION p3 VALUES LESS THAN MAXVALUE
);

Query explicitly mentioning partition:

mysql> SELECT * FROM customer PARTITION (p1);

RANGE partitioning
In RANGE partitioning you can partition values within a given range, Ranges should be contiguous but not overlapping, usually defined by VALUES LESS THAN operator, The following examples explain how to create and use RANGE partitioning for MySQL performance:

CREATE TABLE customer_contract(
    cust_id INT NOT NULL,
    cust_fname VARCHAR(30),
    cust_lname VARCHAR(30),
    st_dt DATE NOT NULL DEFAULT '1970-01-01',
    end_dt DATE NOT NULL DEFAULT '9999-12-31',
    contract_code INT NOT NULL,
    contract_id INT NOT NULL
)
PARTITION BY RANGE (contract_id) (
    PARTITION p0 VALUES LESS THAN (50),
    PARTITION p1 VALUES LESS THAN (100),
    PARTITION p2 VALUES LESS THAN (150),
    PARTITION p3 VALUES LESS THAN (200)
);

For example, let us suppose that you wish to partition based on the year contract ended:

CREATE TABLE customer_contract(
    cust_id INT NOT NULL,
    cust_fname VARCHAR(30),
    cust_lname VARCHAR(30),
    st_dt DATE NOT NULL DEFAULT '1970-01-01',
    end_dt DATE NOT NULL DEFAULT '9999-12-31',
    contract_code INT NOT NULL,
    contract_id INT NOT NULL
)
PARTITION BY RANGE (year(end_dt)) (
    PARTITION p0 VALUES LESS THAN (2001),
    PARTITION p1 VALUES LESS THAN (2002),
    PARTITION p2 VALUES LESS THAN (2003),
    PARTITION p3 VALUES LESS THAN (2004)
);

It is also possible to partition a table by RANGE, based on the value of a TIMESTAMP column, using the UNIX_TIMESTAMP() function, as shown in this example:

CREATE TABLE sales_forecast (
    sales_forecast_id INT NOT NULL,
    sales_forecast_status VARCHAR(20) NOT NULL,
    sales_forecast_updated TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
)
PARTITION BY RANGE ( UNIX_TIMESTAMP(sales_forecast_updated) ) (
    PARTITION p0 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-01-01 00:00:00') ),
    PARTITION p1 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-04-01 00:00:00') ),
    PARTITION p2 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-07-01 00:00:00') ),
    PARTITION p3 VALUES LESS THAN ( UNIX_TIMESTAMP('2008-10-01 00:00:00') ),
    PARTITION p4 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-01-01 00:00:00') ),
    PARTITION p5 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-04-01 00:00:00') ),
    PARTITION p6 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-07-01 00:00:00') ),
    PARTITION p7 VALUES LESS THAN ( UNIX_TIMESTAMP('2009-10-01 00:00:00') ),
    PARTITION p8 VALUES LESS THAN ( UNIX_TIMESTAMP('2010-01-01 00:00:00') ),
    PARTITION p9 VALUES LESS THAN (MAXVALUE)
);

LIST partitioning
The difference between RANGE and LIST partitioning is: In LIST partitioning, each partition is grouped on the selected list of values of a specific column. You can do it by PARTITION BY LIST (EXPR) where EXPR is the selected column for list partition, We have explained LIST partitioning with example below:

CREATE TABLE students (
    student_id INT NOT NULL,
    student_fname VARCHAR(30),
    student_lname VARCHAR(30),
    student_joined DATE NOT NULL DEFAULT '1970-01-01',
    student_separated DATE NOT NULL DEFAULT '9999-12-31',
    student_house INT,
    student_grade_id INT
)
PARTITION BY LIST(student_grade_id) (
    PARTITION P1 VALUES IN (1,2,3,4),
    PARTITION P2 VALUES IN (5,6,7),
    PARTITION P3 VALUES IN (8,9,10),
    PARTITION P4 VALUES IN (11,12)
);

HASH partitioning
HASH partitioning makes an even distribution of data among predetermined number of partitions, In RANGE and LIST partitioning you must explicitly define the partitioning logic and which partition given column value or set of column values are stored. In HASH partitioning MySQL take care of this, The following example explains HASH partitioning better:

CREATE TABLE store (
    store_id INT NOT NULL,
    store_name VARCHAR(30),
    store_location VARCHAR(30),
    store_started DATE NOT NULL DEFAULT '1997-01-01',
    store_code INT
)
PARTITION BY HASH(store_id)
PARTITIONS 4;

P.S. : If you do not include a PARTITIONS clause, the number of partitions defaults to 1.

LINEAR HASH partitioning
The LINEAR HASH partitioning utilizes a linear powers-of-two algorithm, Where HASH partitioning employs the modulus of the hashing function’s value. Please find below LINEAR HASH partitioning example:

CREATE TABLE store (
    store_id INT NOT NULL,
    store_name VARCHAR(30),
    store_location VARCHAR(30),
    store_started DATE NOT NULL DEFAULT '1997-01-01',
    store_code INT
)
PARTITION BY LINEAR HASH( YEAR(store_started) )
PARTITIONS 4;

KEY partitioning
KEY partitioning is very much similar to HASH, the only difference is, the hashing function for the KEY partitioning is supplied by MySQL, In case of MySQL NDB Cluster, MD5() is used, For tables using other storage engines, the MySQL server uses the storage engine specific hashing function which is based on the same algorithm as PASSWORD().

CREATE TABLE contact(
    id INT NOT NULL,
    name VARCHAR(20),
    contact_number INT,
    email  VARCHAR(50),
    UNIQUE KEY (id)
)
PARTITION BY KEY()
PARTITIONS 5;

P.S. – if the unique key column were not defined as NOT NULL, then the previous statement would fail.

Subpartitioning
SUBPARTITIONING  is also known as composite partitioning, You can partition table combining RANGE and HASH for better results, The example below explains SUBPARTITIONING better:

CREATE TABLE purchase (id INT, item VARCHAR(30), purchase_date DATE)
    PARTITION BY RANGE( YEAR(purchase_date) )
    SUBPARTITION BY HASH( TO_DAYS(purchase_date) )
    SUBPARTITIONS 2 (
        PARTITION p0 VALUES LESS THAN (2000),
        PARTITION p1 VALUES LESS THAN (2010),
        PARTITION p2 VALUES LESS THAN MAXVALUE
    );

It is also possible to define subpartitions explicitly using SUBPARTITION clauses to specify options for individual subpartitions:

CREATE TABLE purchase (id INT, item VARCHAR(30), purchase_date DATE)
    PARTITION BY RANGE( YEAR(purchase_date) )
    SUBPARTITION BY HASH( TO_DAYS(purchase_date) ) (
        PARTITION p0 VALUES LESS THAN (2000) (
            SUBPARTITION s0,
            SUBPARTITION s1
        ),
        PARTITION p1 VALUES LESS THAN (2010) (
            SUBPARTITION s2,
            SUBPARTITION s3
        ),
        PARTITION p2 VALUES LESS THAN MAXVALUE (
            SUBPARTITION s4,
            SUBPARTITION s5
        )
    );

Things to remember:

  • Each partition must have the same number of subpartitions.
  • Each SUBPARTITION clause must include (at a minimum) a name for the subpartition. Otherwise, you may set any desired option for the subpartition or allow it to assume its default setting for that option.
  • Subpartition names must be unique across the entire table. For example, the following CREATE TABLE statement is valid in MySQL 5.7:
CREATE TABLE purchase (id INT, item VARCHAR(30), purchase_date DATE)
    PARTITION BY RANGE( YEAR(purchase_date) )
    SUBPARTITION BY HASH( TO_DAYS(purchase_date) ) (
        PARTITION p0 VALUES LESS THAN (1990) (
            SUBPARTITION s0,
            SUBPARTITION s1
        ),
        PARTITION p1 VALUES LESS THAN (2000) (
            SUBPARTITION s2,
            SUBPARTITION s3
        ),
        PARTITION p2 VALUES LESS THAN MAXVALUE (
            SUBPARTITION s4,
            SUBPARTITION s5
        )
    );

MySQL partitioning limitations
MySQL partitioning also has limitations, We are listing down below the limitations of MySQL partitioning:

A PRIMARY KEY must include all columns in the table’s partitioning function:

CREATE TABLE tab3 (
column1 INT NOT NULL,
column2 DATE NOT NULL,
column3 INT NOT NULL,
column4 INT NOT NULL,
UNIQUE KEY (column1, column2),
UNIQUE KEY (column3)
)
PARTITION BY HASH(column1 + column3)
PARTITIONS 4;

Expect this error after running above script – ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table’s partitioning function

The right way of doing it:

CREATE TABLE table12 (
    column1 INT NOT NULL,
    column2 DATE NOT NULL,
    column3 INT NOT NULL,
    column4 INT NOT NULL,
    UNIQUE KEY (column1, column2, column3)
)
PARTITION BY HASH(column3)
PARTITIONS 5;
CREATE TABLE table25 (
    column11 INT NOT NULL,
    column12 DATE NOT NULL,
    column13 INT NOT NULL,
    column14 INT NOT NULL,
    UNIQUE KEY (column11, column13)
)
PARTITION BY HASH(column11 + column13)
PARTITIONS 5;

Most popular limitation of MySQLPrimary key is by definition a unique key, this restriction also includes the table’s primary key, if it has one. The example below explains this limitation better:

CREATE TABLE table55 (
    column11 INT NOT NULL,
    column12 DATE NOT NULL,
    column13 INT NOT NULL,
    column14 INT NOT NULL,
    PRIMARY KEY(column11, column12)
)
PARTITION BY HASH(column13)
PARTITIONS 4;
CREATE TABLE table65 (
    column20 INT NOT NULL,
    column25 DATE NOT NULL,
    column30 INT NOT NULL,
    column35 INT NOT NULL,
    PRIMARY KEY(column20, column30),
    UNIQUE KEY(column25)
)
PARTITION BY HASH( YEAR(column25) )
PARTITIONS 5;

Both of the above scripts will return this error – ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table’s partitioning function

The right way of doing it:

CREATE TABLE t45 (
    column50 INT NOT NULL,
    column55 DATE NOT NULL,
    column60 INT NOT NULL,
    column65 INT NOT NULL,
    PRIMARY KEY(column50, column55)
)
PARTITION BY HASH(column50 + YEAR(column55))
PARTITIONS 5;
CREATE TABLE table88 (
    column80 INT NOT NULL,
    column81 DATE NOT NULL,
    column82 INT NOT NULL,
    column83 INT NOT NULL,
    PRIMARY KEY(column80, column81, column82),
    UNIQUE KEY(column81, column82)
);

In above example, the primary key does not include all columns referenced in the partitioning expression. However, both of the statements are valid !

You can still successfully partition a MySQL table without unique keys – this also includes having no primary key and you may use any column or columns in the partitioning expression as long as the column type is compatible with the partitioning type, The example below shows partitioning a table with no unique / primary keys:

CREATE TABLE table_has_no_pk (column10 INT, column11 INT, column12 varchar(20))
PARTITION BY RANGE(column10) (
PARTITION p0 VALUES LESS THAN (500),
PARTITION p1 VALUES LESS THAN (600),
PARTITION p2 VALUES LESS THAN (700),
PARTITION p3 VALUES LESS THAN (800)
);

You cannot later add a unique key to a partitioned table unless the key includes all columns used by the table’s partitioning expression, The example below explains this much better:

ALTER TABLE table_has_no_pk ADD PRIMARY KEY(column10);
ALTER TABLE table_has_no_pk drop primary key;
ALTER TABLE table_has_no_pk ADD PRIMARY KEY(column10,column11);
ALTER TABLE table_has_no_pk drop primary key;

However, the next statement fails, because column10 is part of the partitioning key, but is not part of the proposed primary key:

mysql> ALTER TABLE table_has_no_pk ADD PRIMARY KEY(column11);
ERROR 1503 (HY000): A PRIMARY KEY must include all columns in the table's partitioning function
mysql> 

MySQL partitioning limitations (at storage engine level)

InnoDB

  • InnoDB foreign keys and MySQL partitioning are not compatible, Partitioned InnoDB tables cannot have foreign key references, nor can they have columns referenced by foreign keys, So you cannot partition InnoDB tables which have or referenced by foreign keys.
  • InnoDB does not support use of multiple disks for subpartition (MyISAM supports this feature)
  • Use ALTER TABLE … REBUILD PARTITION and ALTER TABLE … ANALYZE PARTITION than using ALTER TABLE … OPTIMIZE PARTITION

NDB storage engine 

  • We can only partition by KEY (including LINEAR KEY) in NDB storage engine.

FEDERATED storage engine 

  • Partitioning not supported in FEDERATED storage engine.

CSV storage engine

  • Partitioning not supported in CSV storage engine.

MERGE storage engine 

  • Tables using the MERGE storage engine cannot be partitioned. Partitioned tables cannot be merged.

MySQL functions shown in the following list are allowed in partitioning expressions:

  • ABS()
  • CEILING()
  • DATEDIFF()
  • DAY()
  • DAYOFMONTH()
  • DAYOFWEEK()
  • DAYOFYEAR()
  • EXTRACT()
  • FLOOR()
  • HOUR()
  • MICROSECOND()
  • MINUTE()
  • MOD()
  • MONTH()
  • QUARTER()
  • SECOND()
  • TIME_TO_SEC()
  • TO_DAYS()
  • TO_SECONDS()
  • UNIX_TIMESTAMP()
  • WEEKDAY()
  • YEAR()
  • YEARWEEK()

MySQL partitioning and locks 

Effect on DML statements

  • In MySQL 5.7, updating a partitioned MyISAM table cause only the affected partitioned to be locked.
  • SELECT statements (including those containing unions or joins) lock only those partitions that actually need to be read. This also applies to SELECT …PARTITION.
  • An UPDATE prunes locks only for tables on which no partitioning columns are updated.
  • REPLACE and INSERT lock only those partitions having rows to be inserted or replaced. However, if an AUTO_INCREMENT value is generated for any partitioning column then all partitions are locked.
  • INSERT … ON DUPLICATE KEY UPDATE is pruned as long as no partitioning column is updated.
  • INSERT … SELECT locks only those partitions in the source table that need to be read, although all partitions in the target table are locked.
  • Locks imposed by LOAD DATA statements on partitioned tables cannot be pruned.

Effect on DML statements

  • CREATE VIEW does not cause any locks.
  • ALTER TABLE … EXCHANGE PARTITION prunes locks; only the exchanged table and the exchanged partition are locked.
  • ALTER TABLE … TRUNCATE PARTITION prunes locks; only the partitions to be emptied are locked.
  • In addition, ALTER TABLE statements take metadata locks on the table level.

Effect on other statements

  • LOCK TABLES cannot prune partition locks.
  • CALL stored_procedure(expr) supports lock pruning, but evaluating expr does not.
  • DO and SET statements do not support partitioning lock pruning.

 

About MinervaDB Corporation 88 Articles
Independent and vendor neutral consulting, support, remote DBA services and training for MySQL, MariaDB, Percona Server, PostgreSQL and ClickHouse with core expertize in performance, scalability and high availability . We are an virtual corporation, all of us work from home on multiple timezones and stay connected via Email, Skype, Google Hangouts, Phone and IRC supporting over 250 customers worldwide
UA-155183614-1