Monday, March 16, 2009

Memory (de)allocation, (de)fragmentation, and chkfrag!

(this blog post applies to MySQL Cluster 6.3 and MySQL Cluster 6.4/7.0)

A while ago I discussed Memory Allocation in MySQL Cluster. Here it comes again, but this time we also discuss fragmentation and a tool that helps you analyze whether you have any.

The main problem I want to address here is that "I filled up 100% of my DataMemory, then deleted 40% and I cannot create a new table!!". Now this is not such a common problem, but it can happen in systems with a lot of inserts/deletes (especially on VAR* columns), and it can be an issue if you need to create a new table. If the allocation patterns always hit the same set of tables, then this is less of an issue.
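
To make the scenario concrete, here is a minimal sketch (t1, t2 and their columns are made up for illustration):

CREATE TABLE t1 (
  id      INT NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (id)
) ENGINE=ndbcluster;

-- ... fill t1 until DataMemory is close to 100% used ...

-- Delete roughly 40% of the rows; the holes this creates live on pages
-- that are still owned by t1.
DELETE FROM t1 WHERE id % 10 < 4;

-- Creating a new table can still fail, because the freed space was never
-- returned to the shared DataMemory page pool.
CREATE TABLE t2 (a INT PRIMARY KEY) ENGINE=ndbcluster;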

Allocation
  1. If the table is empty and there is an insert - one page will be allocated from the page pool (DataMemory) and associated with that table.
  2. Subsequent inserts will fill up the data page until it is full.
  3. NEW: Attributes in a table are divided into a fixed-size part and a variable part (if the table has VAR* or even DYNAMIC attributes). The fixed-size part is stored together with the record header and holds a reference to the VAR*/DYNAMIC part. The VAR*/DYNAMIC part and the FIXED part are stored on different pages and are never mixed! (See the sketch after this list.)
  4. The allocator allocates one page at a time!
  5. If there is a hole on an already allocated page, and this hole fits the data that should be inserted, then this hole will be used.
  6. Point 5 means that space within a page that is already associated with a table will be reused.
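
To make point 3 concrete, here is a hedged sketch of how the columns of a table split between fixed-size and VAR* pages (the table and the per-column notes are my own illustration, not output from any tool):

CREATE TABLE example (
  id   INT NOT NULL,       -- fixed-size part, stored with the record header
  qty  INT,                -- fixed-size part
  name VARCHAR(200),       -- variable part: stored on separate VAR* pages,
                           -- referenced from the fixed-size part
  note VARCHAR(2000),      -- variable part as well
  PRIMARY KEY (id)
) ENGINE=ndbcluster;
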
Deallocation
  • From MySQL Cluster 6.3 there is per-page deallocation, so a page that becomes empty after a DELETE is released back to the DataMemory page pool (see the sketch after this list).
  • TRUNCATE/DROP TABLE will release the pages to the DataMemory page pool.
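
A minimal sketch of the two deallocation paths (t1 and its column are made up for illustration):

-- Per-page deallocation (MySQL Cluster 6.3+): rows are removed, and any
-- page that becomes completely empty goes back to the DataMemory pool.
DELETE FROM t1 WHERE created < '2009-01-01';

-- TRUNCATE/DROP release all of the table's pages back to the pool at once.
TRUNCATE TABLE t1;
DROP TABLE t1;
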
Fragmentation/Defragmentation
  • The pages will be fragmented when you insert/delete records.
  • OPTIMIZE TABLE can help to defragment pages within a table, and free (empty) pages will be put on the page pool (see Deallocation above).
  • In MySQL Cluster, rolling restarts of the data nodes will not defragment the FIXED-size pages, but VAR*/DYNAMIC pages will be defragmented.
But how do you know whether you have fragmentation or not? A new tool, chkfrag (beta), can help you.

Back to the problem: "I filled up 100% of my DataMemory, then deleted 40% and I cannot create a new table!!". Why is that?

Pretend that we have a MySQL Cluster up and running. We have only four pages and all of them are being used (figure 1):


We cannot create a new table at this stage, because we don't have any free pages that can be used to store records.

Now we delete 40% of the records. This frees up space on the pages (shown in figure 2). We still cannot create a new table, but we can of course insert new records into t1, provided they fit. OPTIMIZE TABLE can help to reduce fragmentation.
In its current form, and for the foreseeable future, OPTIMIZE TABLE will only compact the VAR* pages; fixed-size pages will not be compacted. And frankly, it is not great at defragmentation either. A number of ifs and buts about record sizes etc. make it rather inefficient (in most cases). It would be good to have the option to run it more aggressively. Figure 3 shows what it could look like after OPTIMIZE TABLE.
Currently, the most efficient defragmentation method is simply to perform a rolling restart (figure 4)... and chkfrag lets you know if you need one!
Example
Below follows an example of fragmentation and how chkfrag can help:

CREATE TABLE `t5` (
`id` varchar(32) CHARACTER SET utf8 NOT NULL,
`name` varchar(32) NOT NULL DEFAULT '',
`unit` int(11) DEFAULT NULL,
PRIMARY KEY (`id`,`name`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1

Then we use hugoLoad (a test program that isn't built by default in the src distribution of cluster) to load it with 1M records.

./hugoLoad -r 1000000 -d test t5
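
If you don't have hugoLoad built, a plain SQL loop gives a rough (and much slower) stand-in; load_t5 below is a made-up helper, not part of any tool:

DELIMITER //
CREATE PROCEDURE load_t5(IN n INT)
BEGIN
  DECLARE i INT DEFAULT 0;
  WHILE i < n DO
    -- MD5() gives a 32-character id that fits the varchar(32) primary key
    INSERT INTO t5 (id, name, unit) VALUES (MD5(i), CONCAT('name-', i), i);
    SET i = i + 1;
  END WHILE;
END//
DELIMITER ;

CALL load_t5(1000000);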

Checking fragmentation:
johan@homecomputer:~/dev/chkfrag$ src/chkfrag
Gathering information how much DataMemory is being used
Used DataMemory calculated: 338MB 10816 pages (32KB)
Used DataMemory reported: 324MB 10368 pages (32KB)
Total DataMemory reported: 512MB 16384 pages (32KB)
Percentage of VAR* in database: 42.01 percent
There is no fragmentation!

In this case there is a discrepancy between the calculated DataMemory and the reported DataMemory.
That is OK: when you multiply the difference by the percentage of VAR* in the database, the fragmentation comes out too small to matter, i.e., less than 10MB. By the way, that 10MB threshold is taken out of the blue... there is no science behind it.

Next, we delete 200,000 random records and check the fragmentation again.
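
A crude way to approximate that delete from the mysql client (not necessarily how it was done in this test):

-- RAND() < 0.2 removes roughly 20% of the 1M rows, i.e. about 200,000.
DELETE FROM t5 WHERE RAND() < 0.2;

Running chkfrag again after the delete: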

Gathering information how much DataMemory is being used
Used DataMemory calculated: 270MB 8640 pages (32KB)
Used DataMemory reported: 324MB 10368 pages (32KB)
Total DataMemory reported: 512MB 16384 pages (32KB)
Percentage of VAR* in database: 41.85 percent
Fragmentation of VAR* in database: 16.67 percent
Possibly lost memory (locked in by tables): 22.60MB

Recommendation:
1. Run OPTIMIZE TABLE on tables with a lot of VARCHAR/VARBINARY.
If you are not happy with the result of this then do step 2)!
2. Perform a rolling restart.

Running the OPTIMIZE TABLE:

mysql> optimize table t5;
+---------+----------+----------+----------+
| Table   | Op       | Msg_type | Msg_text |
+---------+----------+----------+----------+
| test.t5 | optimize | status   | OK       |
+---------+----------+----------+----------+
1 row in set (3 min 42.44 sec)

Checking the fragmentation:

Gathering information how much DataMemory is being used
Used DataMemory calculated: 270MB 8640 pages (32KB)
Used DataMemory reported: 324MB 10368 pages (32KB)
Total DataMemory reported: 512MB 16384 pages (32KB)
Percentage of VAR* in database: 41.85 percent
Fragmentation of VAR* in database: 16.67 percent
Possibly lost memory (locked in by tables): 22.60MB

Recommendation:
1. Run OPTIMIZE TABLE on tables with a lot of VARCHAR/VARBINARY.
If you are not happy with the result of this then do step 2)!
2. Perform a rolling restart.


Hmm... OPTIMIZE TABLE does not work so great (I have, however, seen it reclaim some pages sometimes...).
And after a rolling restart:

Gathering information how much DataMemory is being used
Used DataMemory calculated: 270MB 8640 pages (32KB)
Used DataMemory reported: 272MB 8704 pages (32KB)
Total DataMemory reported: 512MB 16384 pages (32KB)
Percentage of VAR* in database: 41.48 percent
Fragmentation of VAR* in database: 0.74 percent
Possibly lost memory (locked in by tables): 0.83MB

We have reclaimed 324MB-272MB=52MB!

Conclusion:
A DBA should never have to perform a rolling restart in order to defragment (and arguably not in any other case either). OPTIMIZE TABLE should be improved, and perhaps get a "FULL" flag indicating how thorough it should be.
