Friday, November 28, 2008

Disk data extent (de)allocation caveats

Here is a discussion about some caveats with allocation and deallocation of extents in MySQL Cluster (disk data).

Let's assume we have created a table space with one datafile having 5 extents (i.e the datafile is 5MB in since, since the size of each extent is 1MB).

Using the information_schema we can figure out how many free extents we have by executing the following query:

mysql> select free_extents, total_extents from information_schema.files where file_type='datafile';
+--------------+---------------+
| free_extents | total_extents |
+--------------+---------------+
| 5 | 5 |
| 5 | 5 |
+--------------+---------------+

The figure below shows a number of extents that are all free.
We then create a table using the table space and start populating it with data:

mysql> create table t1(a char(255)) engine=ndb tablespace ts1 storage disk;
mysql> insert into t1(a) values ('a');
mysql> insert into t1(a) select a from t1 limit 4096;



After inserting for while you can the table space is full (all extents have been filled):


mysql> select free_extents, total_extents from information_schema.files where file_type='datafile';
+--------------+---------------+
| free_extents | total_extents |
+--------------+---------------+
| 0 | 5 |
| 0 | 5 |
+--------------+---------------+

If I try to insert more data, then I will get:
ERROR 1114 (HY000): The table 't1' is full and I would have to add another datafile to my table space.

Now I decide to delete records:
mysql> delete from t1 limit 4096;

Eventually extents will become empty:

But there are a couple caveats to be aware of!!
Caveat one is that the extents that are free up by the deletion does not show up in the information schema. I still have zero free extents!!:

mysql> select free_extents, total_extents from information_schema.files where file_type='datafile';
+--------------+---------------+
| free_extents | total_extents |
+--------------+---------------+
| 0 | 5 |
| 0 | 5 |
+--------------+---------------+

However, I can insert more records , since there are holes in the extents that can be used.

Now I delete ALL (or almost all, it doesn't really matter) records in t1, so that table is completely empty!!

Caveat two comes when I try to create a new table using the same tablespace:

mysql> create table t2(a char(255)) engine=ndb tablespace ts1 storage disk;
Query OK, 0 rows affected (1.03 sec)

mysql> insert into t2 (a) values ('aaaa');
ERROR 1114 (HY000): The table 't2' is full

The problem is that:
  1. Extents are allocated/assigned to a table
  2. Empty extents followed by DELETE are not deallocated from the table.
Only DROP TABLE deallocate/deassociate the extents making them available to another table!!!

Thus I cannot do:

mysql> insert into t2 (a) values ('aaa')
ERROR 1114 (HY000): The table 't2' is full

But inserting into t1 is still possible:

mysql> insert into t1 (a) select a from t1 limit 4096;
Query OK, 4096 rows affected (0.17 sec)
Records: 4096 Duplicates: 0 Warnings: 0

mysql> insert into t1 (a) select a from t1 limit 4096;
Query OK, 4096 rows affected (0.16 sec)
Records: 4096 Duplicates: 0 Warnings: 0


I hope this will be fixed one day!

(btw, the cmon trunk on http://launchpad.net/cmon monitors the table spaces from today. I will release source distro shortly, but in the time being you can get the source using bazaar-vcs and do bzr branch lp:cmon).

Friday, November 21, 2008

MySQL Cluster 6.3.19

...has just been released:

You can download the GPL source distribution here. The config tool has been updated to use the new version.

Important things in 6.3.19:
  • MySQL Server 5.1.29
  • Faster --initial
  • Undo logs and data files (disk data) is created faster. Now 512K is written at a time instead of 32K.
  • New config parameter: InitFragmentLogFiles=sparse|full (you can choose if you want sparse redo log files or zero filled (full)).
For a full change log please see the reference manual.

To upgrade using severalnines.com/config:
  • copy cluster/config/config.ini to a safe place (if you have made changes to it)
  • copy cluster/config/my.cnf to a safe place (if you have made changes to
    comment out #skip_thread_priority in my.cnf if you have it set (it is deprecated).
  • generate a new config that mimics your current config so that you get exactly the same scripts (but a new version of them).
  • Unpack the mysqlcluster-63.tar.gz
  • copy the config.ini and the my.cnf from the safe place to mysqlcluster-63cluster/config/
  • cd mysqlcluster-63/cluster/scripts
  • sh download-and-compile.sh
  • sh dist.sh
  • sh rolling-restart.sh (will copy out the config files from cluster/config hence it was important to copy them back there)
CMON
If you are using cmon-0.10 (you should upgrade to cmon-0.11) or cmon-0.11:
  • stop cmon
  • start a mysql client:
  • use cmon
  • drop mysql_statistics;
  • drop mysql_variables;
  • start cmon
cmon will now recreate the mysql_statistics and mysql_variables tables. This is because the MySQL server in MySQL Cluster 6.3.19 (and 6.3.18) has added more status variables. If you don't do the above, then cmon cannot aggregate mysql statistics.

Disk data counters

I have measured the time it takes to do some operations with Cluster (6.3.19, which is about to be released very soon - you should upgrade) with respect to the disk, like starting it, creating table spaces etc.

Below all of this you can find the setup I used.

Initial start of Cluster (40GB of redo log files):
6.3.19: 3min 27sec

Create a 20GB undo file:
6.3.19: 6min 17sec

Create a 128MB data file for the tablespace:
6.3.19: ~3 sec

Insert 1M records a' 4096B (1 thread, batches of five):
6.3.19: 286 sec (3721.12 QPS)
(we can probably provision faster with bigger batches or more threads)

I then provisioned another 4M records (total of 5M records in DB).

Evil test: 100K Random reads (read 4096B) (5M records in DB):
6.3.19: 1290.42QPS (20 threads, io util is ~90% so we are almost completely io bound). So this is result is inline what we would expect when being io bound, especially since i have used four data nodes, each having one disk.

Setup:
  • Total of 4 data nodes on 4 computers.
  • Another computer was used to generate the inserts/reads
  • 8GB of RAM
  • Gig-E
  • 1 * 146GB SAS, 10KRPM
  • 128MB IndexMemory
  • 1024MB DataMemory
  • ODirect=1
  • SharedGlobalMemory=384M
  • NoOfFragmentLogFiles=40
  • FragmentLogFileSize=256M
  • DiskPageBufferMemory=3072M
  • Table space (one ts) with 100 datafiles a' 128MB (best practice is to use many but small data files instead of one big. This will be changed in 6.4 so that you can use one big data file).
    Point here is that you should have a few data files. One data file is bad, more than 128 is overkill since the data node won't keep more than that many data files open at once anyways. This affects how many data files the data node can write to in "parallel".
  • Extent size=1MB (which is quite ok)
  • Logfile group: One 20GB undofile and 128MB Undobuffer
    The undo file was a bit too big (not that it matter, i had the disk space) but I used 5366054928 extents out of 21474836480 (so ~25% was only used).
  • There is also a new configuration option in 6.3.19 which lets you create the data files
  • The disk data table looks like (data column will be stored on disk):
    create table dd (
    id integer primary key,
    ts integer,
    data varbinary(4096),
    index(ts)) engine=ndb TABLESPACE ts_1 storage disk;
  • Fedora core 9 ( uname -r --> 2.6.26.6-79.fc9.x86_64 )
(the config files were generated by severalnines.com/config )

Now, especially for inserts/writes there are quite a few things competing for the single disk:
  • REDO log
  • LCP
  • and the disk data itself (UNDO LOG + DATA FILES)
However, the best setup is to (if you have two disks)
  • Disk 1: REDO + LCP
  • Disk 2: UNDO LOG + DATA FILES
Three disks:
  • Disk 1: REDO + LCP
  • Disk 2: UNDO LOG
  • Disk 3: DATA FILES
If you have more than that then you can put DATA FILES and UNDO LOG on a RAID.

IMPORTANT: When you have done an --initial start, the files for the UNDO LOG and the DATA FILES are NOT removed. You have to remove them by hand. Otherwise you will get an error if you try to CREATE LOGFILE GROUP..

Thursday, November 20, 2008

cmon 0.11 released

cmon 0.11 which you can get from www.severalnines.com/cmon corrects the following bugs:
  • Graphs wrap half ways - https://bugs.launchpad.net/cmon/+bug/300222
  • Memory leaks (missed a couple of mysql_free_resultset) https://bugs.launchpad.net/cmon/+bug/300225
There are also some code cleanups.
To upgrade from earlier version:
  • start a mysql client on the mysql server(s) having the cmon database
  • use cmon
  • drop mysql_statistics table
  • drop mysql_variables table
  • start cmon which will then recreate those tables
The above is particularly important if you move from MySQL Cluster 6.3.17 or earlier to 6.3.18 (or later) because the number of statistic counters and variables has increased between the versions. cmon will build up the mysql_statistics and mysql_variables tables dynamically instead (if you have dropped them earlier).

Tuesday, November 04, 2008

CMON - News Flash! Monitor your mysql servers!

Here is a short summary of CMON news:
Feature update of cmon 0.1 (the same as cmon 1.2.2 plus):
  • monitor your mysql servers (add/remove mysql servers that should be monitored)
  • view status and variables of your mysql servers
  • view graphs of status variables for individual mysql servers as well as a grand total.
  • get recommendations on parameter tuning of my.cnf
  • updated web interface (small fixes made)
  • and some bug fixes..
To do
  • add AJAX support to web interface so that graphs are reloaded in a better way
  • finish plugin interface - so that it will be possible to write SNMP, SMTP, etc plugins
  • improve rules for spotting configuration problems in the monitored mysql servers.
  • improved overview of mysql variables.
  • replication channels
Screen shots

Display graphs (both for individual as well as aggregated over all mysql servers):


View the status of your mysql servers that CMON is monitoring (status, variables, graphs). Sub-optimal config values are highlighted in read!:



Add a mysql server that should be monitored by CMON...



Remove a MySQL Server from CMON..