Benefits of Range-Based Partitioning{1}


Database Partitioning is breaking up large amounts of data into smaller parts which can be beneficial if done correctly. The purpose behind database partitioning is to simplify managing large amounts of data and boost performance. Performance is boosted because segments of data can be stored in different directories and because MySQL will only search the segments that contain the data the query is asking for. The picture below illustrates an example of application requests being handled by limited resources which can slow down the system.

 

The two most commonly used partitioning methods are horizontal and vertical partitioning. In the article entitled “Improving Database Performance with Partitioning “  Robert Schumacher  focuses primarily on horizontal partitioning. This involves combining rows of data that contain attributes that fall within a specified range.  The combined rows are segmented into smaller parts and are assigned a position. The article points out, that historically DBA’s had to build separate tables to hold these parts. However this is now a built-in feature within most storage engines.  The author uses a non-partition table (non_part table) and a partitioned table(part_table) to demonstrate the differences in performances.  The partitioned table is created by the following:

mysql> CREATE TABLE part_tab
->   (  c1 int default NULL,
->      c2 varchar(30) default NULL,
->      c3 date default NULL
->
->      ) engine=myisam
->      PARTITION BY RANGE (year(c3)) (
PARTITION p0 VALUES LESS THAN (1995),
PARTITION p1 VALUES LESS THAN (1996) ,
PARTITION p2 VALUES LESS THAN (1997) ,

The author inserts 8 million records into both tables and executes a query to count only the records from 1995.  The non-partitioned table takes 38 seconds to execute this query while the partitioned table takes only 3.8 seconds.

Non-Partitioned Table:
mysql> select count(*) from no_part_tab where
-> c3 > date ‘1995-01-01′ and c3 < date ‘1995-12-31′;
+———-+
| count(*) |
+———-+
|   795181 |
+———-+
1 row in set (38.30 sec)

Partitioned Table:
mysql> select count(*) from part_tab where
-> c3 > date ‘1995-01-01′ and c3 < date ‘1995-12-31′;
+———-+
| count(*) |
+———-+
|   795181 |
+———-+
1 row in set (3.88 sec)

The author explains the executed query by an “explain command” in MySQL. As you can tell from below, only 798,458 were accessed in the partitioned as opposed to the eight million in the non-partitioned table.

mysql> explain select count(*) from no_part_tab where
-> c3 > date ‘1995-01-01′ and c3 < date ‘1995-12-31’\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: no_part_tab
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 8000000
Extra: Using where
1 row in set (0.00 sec)

mysql> explain partitions select count(*) from part_tab where
-> c3 > date ‘1995-01-01′ and c3 < date ‘1995-12-31’\G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: part_tab
partitions: p1
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 798458
Extra: Using where
1 row in set (0.00 sec)

To further reduce response time, you can also partition vertically by dropping a column in the partitioned table.  The article gave insight on the benefits and applications of database partitioning. In class we talked about different ways to partition data (range, hash, list, etc) and the article spoke widely on range-based partition.
Schumacher, R. (n.d.). MySQL :: Improving Database Performance with Partitioning. MySQL :: Developer Zone. Retrieved November 5, 2012, from http://dev.mysql.com/tech-resources/articles/partitioning.html