Thursday, December 16, 2010

Semi-sync replication (5.5)

A bit unorthodox - I normally write only about MySQL Cluster here - but I wanted to understand how much latency semi-sync replication adds.

The test was very simple:
  • two mysql servers, interconnected (same switch) on a 1 Gig-E network
  • one table (see below)
  • comparing insert performance (one thread) with 'no replication at all' and 'semi-sync replication enabled' (the plugin setup is sketched after the table definition below).
  • bencher (I had to hack it a bit to make it work with vanilla MySQL) running one thread, inserting 4B + 128B + 4B = 136B of data per row
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data` varchar(255) DEFAULT NULL,
`ts` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `ts` (`ts`)
) ENGINE=InnoDB
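For reference (this is not from the original test run), semi-sync in MySQL 5.5 is enabled by loading the semisync plugins on the master and the slave and switching them on, roughly like this:

-- On the master (sketch; the timeout is just an example value):
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
SET GLOBAL rpl_semi_sync_master_timeout = 10000; -- ms to wait for the slave ack

-- On the slave:
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD; -- restart the IO thread so it registers as semi-sync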
Test 1 - no replication
src/bencher -t1 -r 30   -q "insert into t1(data) values('012345678901234567890123456789001234567890123456789012345678900123456789012345678901234567890012345678901567890123456789001234567')" \
-i "truncate table t1" -o no_replication

Summary:
--------------------------
Average Throughput = 7451.26 tps (stdev=969.24)
Average Latency (us)=134.31 (stdev=2904.26)
95th Percentile (us)=110.00
(I haven't looked into why the stdev is so high, but it looks like a few requests take a really long time.)

Test 2 - Semi-sync replication
src/bencher -t1 -r 30   -q "insert into t1(data) values('012345678901234567890123456789001234567890123456789012345678900123456789012345678901234567890012345678901567890123456789001234567')" \
-i "truncate table t1" -o no_replication

Summary:
--------------------------
Average Throughput = 2377.10 tps (stdev=210.42)
Average Latency (us)=421.46 (stdev=2644.61)
95th Percentile (us)=362.67
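To confirm that the slave actually acknowledged the commits during the run (not part of the original post, but a useful sanity check), the semi-sync status counters on the master can be inspected:

-- On the master: is semi-sync active, and how many commits were acked vs fell back to async?
SHOW STATUS LIKE 'Rpl_semi_sync_master_status';
SHOW STATUS LIKE 'Rpl_semi_sync_master_yes_tx';
SHOW STATUS LIKE 'Rpl_semi_sync_master_no_tx';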
From this simple test, the increased latency with semi-sync replication is not surprising at all: semi-sync adds another network hop to the slave mysql server, so the difference in latency between 'no replication' and 'semi-sync' is very much the expected network overhead (a similar factor as a read in MySQL Cluster vs a write, which requires 2PC and network communication between the data nodes).

The next step is to compare this with synchronous replication (MySQL Cluster), but that will be another time.

InnoDB config, for the record:
innodb_buffer_pool_size=2048M
innodb_log_file_size=256M
innodb_log_files_in_group=3
innodb_file_format=barracuda
innodb_flush_method = O_DIRECT
innodb_flush_log_at_trx_commit=2

Wednesday, November 03, 2010

DiskPageBufferMemory tuning and disk data statistics in MySQL Cluster 7.1.9

From MySQL Cluster 7.1.9 (not yet released) it is possible to get better stats on disk data tables. In fact, the statistics make it possible to tune the DiskPageBufferMemory parameter (similar to the InnoDB buffer pool) in order to avoid disk seeks. It is much (understatement) faster to fetch data from the DiskPageBufferMemory than from disk.

Here is an example/tutorial on how to use this information and how to check the hit ratio of the DiskPageBufferMemory. Next time, I will explain the other counters you can get from ndbinfo.diskpagebuffer.

Finally, no more educated guesswork is needed.

Let's take an example.

I have a table t1 with 650000 records:
CREATE TABLE `t1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`data1` varchar(512) DEFAULT NULL,
`data2` varchar(512) DEFAULT NULL,
PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=ndbcluster
data1 and data2 are non-indexed columns, so they will be placed on disk.
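The post assumes the tablespace ts_1 already exists. For reference, a minimal sketch of how a logfile group and tablespace for disk data tables can be created (the file names and sizes below are example values, not the ones used here):

-- Sketch only: example names and sizes
CREATE LOGFILE GROUP lg_1
ADD UNDOFILE 'undo_1.log'
INITIAL_SIZE 128M
UNDO_BUFFER_SIZE 8M
ENGINE NDBCLUSTER;

CREATE TABLESPACE ts_1
ADD DATAFILE 'data_1.dat'
USE LOGFILE GROUP lg_1
INITIAL_SIZE 1G
ENGINE NDBCLUSTER;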
Then I run random reads on this table with bencher:
 src/bencher -r 100000 -t 1 -q \
"SET @x=FLOOR(1 + (RAND() * 650000)); SELECT * FROM t1 WHERE id=@x"
DiskPageBufferMemory=64M
The initial performance, 340 seconds into the run:
Thread 0 - 274 tps (average tps measured after 335 secs)
Thread 0 - 273 tps (average tps measured after 340 secs)
and the hit ratio:
mysql> SELECT
node_id,
100*(sum(page_requests_direct_return)/
(sum(page_requests_direct_return)+sum(page_requests_wait_io))) as hit_ratio_pct
FROM diskpagebuffer GROUP BY node_id;
+---------+---------------+
| node_id | hit_ratio_pct |
+---------+---------------+
| 3 | 93.3577 |
| 4 | 93.6565 |
+---------+---------------+

2 rows in set (0.01 sec)
A hit ratio of ~93% - not so great.

So let's increase the DiskPageBufferMemory (I am using the Configurator scripts for this):
# change DiskPageBufferMemory to 512M in config.ini
vi ../config/config.ini
# do a rolling restart of the data nodes
./rolling-restart.sh --layer=storage
## then sit back and wait for the data nodes to be rolled..
hostx: (324s elapsed) Node 3: starting (Last completed phase 100) (mysql-5.1.51 ndb-7.1.9)
hostx: Data node 3 is STARTED
## then wait for node 4..
hosty: (98s elapsed) Node 4: starting (Last completed phase 4) (mysql-5.1.51 ndb-7.1.9)

Increasing DiskPageBufferMemory=512M
Of course, while doing the rolling restart, the same traffic is still running against the cluster. Let's check what the hit ratio is now (while node 3 is restarting).

695 seconds into the run we have:
mysql>  "see above"
+---------+---------------+
| node_id | hit_ratio_pct |
+---------+---------------+
| 4 | 91.2639 |
+---------+---------------+
1 row in set (0.00 sec)
and perf is...
Thread 0 - 200 tps (average tps measured after 690 secs)
Thread 0 - 200 tps (average tps measured after 695 secs)
not good at all.

After the increase to DiskPageBufferMemory=512M

When the rolling restart has completed the ./rolling-restart.sh prints out:
Cluster: Cluster has performed a rolling restart  
The DiskPageBufferMemory has now been changed to 512M.

Let's check what hit ratio we get:
mysql>  "see above"
+---------+---------------+
| node_id | hit_ratio_pct |
+---------+---------------+
| 3 | 98.2487 |
| 4 | 98.6386 |
+---------+---------------+
2 rows in set (0.01 sec)
Hit ratio is picking up and performance as well:
Thread 0 - 1141 tps (average tps measured after 1310 secs)
Thread 0 - 1149 tps (average tps measured after 1315 secs)
Now, we just have to wait for a while and let the system enter a steady state.
Then we can check back on the hit ratio and perhaps increase the DiskPageBufferMemory even more.

If you have a 99.5 - 99.9% hit ratio, I would say you are good.

After 1780 seconds into the run we have:
+---------+---------------+
| node_id | hit_ratio_pct |
+---------+---------------+
| 3 | 99.2035 |
| 4 | 99.5251 |
+---------+---------------+
2 rows in set (0.00 sec)

and
Thread 0 - 1669 tps (average tps measured after 1775 secs)
Thread 0 - 1674 tps (average tps measured after 1780 secs)
... wow .. tuning the DiskPageBufferMemory really helped. And 3300 secs into the run, perf and hit ratio are still climbing:
Thread 0 - 2329 tps (average tps measured after 3305 secs)
Thread 0 - 2330 tps (average tps measured after 3310 secs)

Good luck!

Friday, October 29, 2010

Pushed down JOINs - Webinar

On Thursday, November 04, at 0900PST/1700CET/1600GMT there is a webinar about Pushed Down Joins. This webinar will explain how Pushed Down Joins works, and some performance numbers.
Register here: http://mysql.com/news-and-events/web-seminars/display-583.html.

NDB Pushed JOINs means query shipping instead of data shipping; it drastically reduces the number of network hops between the MySQL Server and the data nodes, which in turn gives a tremendous performance improvement. For particular queries a 180x improvement has been measured.

Monday, October 11, 2010

MySQL Cluster - Performance (UPDATE on PK) - >120K tx/sec

This post follows on the previous post on SELECT performance. In this post I want to show three things:
  1. How many single row UPDATEs per second you can do on a Cluster with two data nodes (updating 64B of data by the PRIMARY KEY, no batching)
  2. Show how MySQL Cluster scales with threads and mysql servers
  3. How ndb_cluster_connection_pool affects performance
Next post will be what happens to INSERTs.

Setup
  • two data nodes
  • one to four mysql servers
  • interconnected with Gig-E (single NIC)
deployed on six computers (of varying quality, but not really modern, see below). www.severalnines.com/bencher was co-located with each mysql server to drive the load. The updates are not batched and look like:

UPDATE t1 SET data1='64B of data' WHERE id=[random];
Table looks like:
CREATE TABLE `t1` (
`id` bigint(20) NOT NULL AUTO_INCREMENT,
`data1` varchar(512) DEFAULT NULL,
`data2` varchar(512) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=ndbcluster

The configuration was generated using www.severalnines.com/config with the following options:
  • MySQL Cluster 7.1.7 (src)
  • Cluster Usage: Option 3) High read/High write
  • Cores: 8
  • Multi-connection: 16 (so that I could later easily switch between 1, 8, and 16 connections in the pool; a quick check of the value is sketched right after this list)
  • DataMemory=2500M
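(Not from the original post.) The connection pool size is set per mysqld in my.cnf; a quick way to confirm what a given mysqld is actually running with:

-- Read-only at runtime; configured in my.cnf
SHOW GLOBAL VARIABLES LIKE 'ndb_cluster_connection_pool';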
Hardware
  • Data nodes deployed on: 2x4 cores Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
  • 2 mysql servers deployed on: 2x2 cores with Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
  • 2 mysql servers deployed on: 2x4 cores with Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
Results

A few notes:
  • conn= 8 means ndb_cluster_connection_pool=8
  • 1 app means one pair of bencher + mysqld co-located on one server
  • With identical HW (2x4 cores) for the mysql server 130K updates per second should be possible.
  • Data nodes are quite loaded at 4 applications (from 64 - 128 threads), and the TC peaks at 86% CPU util.
  • The network was not saturated (about 200Mb/s each for TX and RX between the data nodes)
  • Latency numbers - I have them if someone is interested.
How to improve this more:
  • More cores, faster cores
  • more mysql servers (to scale out more)
  • eventually more data nodes
  • Infiniband
  • use ndbapi (nosql)

Wednesday, September 29, 2010

MySQL Cluster - Performance (SELECT on PK)

In this post I want to show three things:
  1. How many single row SELECTs per second (on the PRIMARY KEY, no batching) you can do on a Cluster with two data nodes
  2. Show how MySQL Cluster scales with threads and mysql servers
  3. How ndb_cluster_connection_pool affects performance
Next post will be what happens to INSERTs, and then UPDATEs.

Setup
  • two data nodes
  • one to four mysql servers
  • interconnected with Gig-E (single NIC)
deployed on six computers (of varying quality, see below). www.severalnines.com/bencher was co-located with each mysql server to drive the load. The reads were PK SELECTs like:

SELECT data1,data2 FROM t1 WHERE id=[random];
data1 and data2 are each 256B, so in total 512B was read. There were 1M records in total in table t1, but this does not matter as the SELECT is on the PK (hash index), so it could just as well have been 100B rows (if I had had the storage for it).

Table looks like:
CREATE TABLE `t1` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data1` varchar(512) DEFAULT NULL,
`data2` varchar(512) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=ndbcluster
but only 256B of each 512B varchar was filled with data.
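As an illustration only (the actual data was generated by bencher, not like this), a row with 256B in each column could be inserted as:

-- Illustration only: 256 bytes of filler in each 512-byte column
INSERT INTO t1 (data1, data2) VALUES (REPEAT('a', 256), REPEAT('b', 256));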

The configuration was generated using www.severalnines.com/config with the following options:
  • MySQL Cluster 7.1.7 (src)
  • Cluster Usage: Option 3) High read/High write
  • Cores: 8
  • Multi-connection: 16 (so that I could later easily switch between 1, 8, and 16 connections in the pool)
  • DataMemory=2500M
Hardware
  • Data nodes deployed on: Dual CPU Quad core Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
  • 2 mysql servers deployed on: Dual CPU Dual core with Intel(R) Xeon(R) CPU 5160 @ 3.00GHz
  • 2 mysql servers deployed on: Dual CPU Quad core Intel(R) Xeon(R) CPU E5345 @ 2.33GHz
Results


A few notes:
  • conn= 8 means ndb_cluster_connection_pool=8
  • 1 app means one pair of bencher + mysqld co-located on one server
  • See below why 4 apps does not give >200K tx/s

Conclusions
  • Going from one mysql server to two mysql servers gives a 99.9% throughput increase (32 threads).
    At this stage I wanted to try with four mysql servers, as the two mysql servers were already completely CPU bound.

  • Going from two mysql servers to four mysql servers gives a 66% throughput increase.

    However, two of the mysql servers were running on dual CPU/dual core machines instead of dual CPU/quad core. The dual CPU/dual core servers were saturated already at 16 threads connected to MySQL.

    I expect, with similar hardware, to get well above 200000 tx/s. The MySQL servers on the dual CPU/quad core machines could do 52000 tx/s, but the dual CPU/dual core ones could only handle ~38000 tx/s when executing the SELECTs from 32 threads.

    Hence, more than 200000tx/s should not be impossible, which would be a 92% increase in throughput compared to 2 mysql servers.

  • ndb_cluster_connection_pool=8 is a good value

  • ndb_cluster_connection_pool=1 is really bad

  • ndb_cluster_connection_pool=16 does not give any real benefit; I can imagine it will actually be worse with more data nodes and more mysql servers, because each data node has to read from more sockets. OTOH, epoll might help. Must be tested more :)

  • On my dual cpu/quad core machines the mysql server could scale up to 32 threads and then it hit the roof (100% CPU util including the benchmark program)

  • On my dual cpu/dual core machines the mysql server could scale up to 16 threads and then it hit the roof (100% CPU util including the benchmark program)

  • CPU-bound -- the mysql servers got completely CPU bound with 32 threads

  • It would be fun to test with 16 or 24 core machines, but I don't have those. It would also be fun to test with faster CPUs.

  • The data nodes were not saturated on the peak results, but pay attention to the TC thread as in this case it accounts for most CPU util within the data node.

  • Latency numbers - I have them if someone is interested.
How to improve this more:
  • More cores, faster cores
  • more mysql servers (to scale out more)
  • eventually more data nodes
  • Infiniband
  • use ndbapi (nosql)
I think, despite MySQL Cluster being a relational database and despite the fact that we are using SQL, it is possible to get some decent numbers and good scalability.

Thursday, September 02, 2010

Cluster - spins/contentions and thread stuck in..

I get a number of questions about contentions/"stuck in..". So here comes some explanation of:
  • Contention
  • Thread Stuck in
  • What you can do about it
In 99% of the cases the contentions written out in the out file of the data nodes (ndb_X_out.log) are nothing to pay attention to.

sendbufferpool waiting for lock, contentions: 6000 spins: 489200
sendbufferpool waiting for lock, contentions: 6200 spins: 494721


Each spin is a read from the L1 cache (4 cycles on a Nehalem at 3.2GHz, so about a nanosecond):
1 spin = 1.25E-09 seconds (1.25ns)

In the above we have:
(494721-489200)/(6200-6000)= 27 spins/contention
Time spent on a contention=27 x 1.25E-09=3.375E-08 seconds (0.03375 us)

So we don't have a problem..

Another example: here is a lock guarding a job buffer (JBA = Job Buffer A; in short, it handles signals for heartbeats and some other small things, while all normal traffic goes over Job Buffer B).

jbalock thr: 1 waiting for lock, contentions: 145000 spins: 3280892543
jbalock thr: 1 waiting for lock, contentions: 150000 spins: 3403539479


(3403539479-3280892543)/(150000-145000)=24529 spins/contention
Time spent on a contention: 3.06613E-05 seconds (30.66us )

This is a bit higher than I would have expected and I think more analysis is needed. However, I tend not to get these contentions on a busy system.

Ndb kernel thread X is stuck in ..

Ndb kernel thread 4 is stuck in: Job Handling elapsed=100
Watchdog: User time: 82 System time: 667
Ndb kernel thread 4 is stuck in: Job Handling elapsed=200
Watchdog: User time: 82 System time: 668
Ndb kernel thread 4 is stuck in: Job Handling elapsed=300
Watchdog: User time: 82 System time: 669
Ndb kernel thread 4 is stuck in: Job Handling elapsed=400
Watchdog: User time: 82 System time: 670

Here the important thing is to look at how User time and System time behave.
If User time is constant (as it is here - 82ms) but System time is growing (667, 668, etc.), it indicates that the OS kernel is busy.
Slow network? Sub-optimal kernel version? NIC drivers? swapping? some kernel process using too much cpu?

If User time is growing it is probably because the ndb kernel is overloaded.

What can you do about this?
  • In config.ini:
    RealtimeScheduler=1

    LockExecThreadToCPU=[cpuids]
  • check that cpuspeed is not running ( yum remove cpuspeed )
  • .. and finally ask us to optimize more!
Also, pay attention to whether you get the contentions on an idle Cluster or a busy Cluster.

Tuesday, April 27, 2010

MySQL Cluster - SPJ Preview - Feedback welcome

SPJ (preview, not production ready) is a new feature allowing some types of JOINs to be pushed down and executed inside the data nodes! This allows for, in many cases, much faster JOIN execution.

Now we would love to get your feedback on this new feature:

  • Does what we have right now improve performance for you?
  • Are there other types of JOINs we should support in order to improve performance in your application (currently only eq_ref is supported)?
  • What application are you using?

There are some limitations currently:

  • node failure handling of SPJ is not complete, so if a data node crashes, there are side-effects.
  • only eq_ref is supported - other JOINs are executed as normal.
  • Don't put this in production.
Obtaining the MySQL Cluster SPJ preview version:
  • You must know how to build MySQL Cluster from source
  • www.severalnines.com/config includes "7.1 SPJ Preview" (the scripts will compile and distribute the binaries for you) - just select "MySQL Cluster 7.1.3-SPJ" from the Cluster Version drop down.
  • The source code (if you don't use severalnines/config) can be found here.
How to test it:
  • Run a JOIN query with pushdown off and on (a concrete example is sketched after this list):
    mysql> set ndb_join_pushdown=0;
    mysql> run join query
    mysql> set ndb_join_pushdown=1; //ENABLES SPJ
    mysql> run join query again
    Did it help?
  • Run EXPLAIN on the queries (especially if SPJ did not help)
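For example (hypothetical tables, not from the original post), an eq_ref join is one where the inner table is looked up by its primary key or a unique index, and the plans can be compared with pushdown off and on:

-- Hypothetical schema: orders(id PK, customer_id), customers(id PK, name)
SET ndb_join_pushdown=0;
EXPLAIN SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id;
SET ndb_join_pushdown=1;
EXPLAIN SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id;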
Feedback:
  • Send us the EXPLAINs of the queries where SPJ did not help!
    or
  • Let us know if this feature was helpful for your application.
    and in any case please include
  • What kind of speed up (or decrease) was observed
  • What kind of cluster configuration/hw was used.
Please send your feedback to us: spj-feedback (at) sun (dot) com . Thank you!

Read more about SPJ:

Tuesday, April 20, 2010

Tuning your Cluster with ndbinfo (7.1) part 1 of X

The new ndbinfo interface in 7.1 is really useful to assist in tuning MySQL Cluster. Here is an example (more will follow):

I started with one test where I inserted two blobs (1KB + 1KB) in one table.
From 16 threads (co-located with one mysqld; two data nodes on separate computers) and one application driving the load, I reached about 6960TPS, and the utilization of the redo buffers (controlled by the parameter RedoBuffer in config.ini) looked like this:

mysql> select * from ndbinfo.logbuffers;
+---------+----------+--------+----------+----------+--------+
| node_id | log_type | log_id | log_part | total | used |
+---------+----------+--------+----------+----------+--------+
| 3 | REDO | 0 | 1 | 50331648 | 196608 |
| 3 | REDO | 0 | 2 | 50331648 | 294912 |
| 3 | REDO | 0 | 3 | 50331648 | 131072 |
| 3 | REDO | 0 | 4 | 50331648 | 229376 |
| 4 | REDO | 0 | 1 | 50331648 | 229376 |
| 4 | REDO | 0 | 2 | 50331648 | 262144 |
| 4 | REDO | 0 | 3 | 50331648 | 163840 |
| 4 | REDO | 0 | 4 | 50331648 | 229376 |
+---------+----------+--------+----------+----------+--------+
8 rows in set (0.01 sec)
Which is basically nothing.
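A convenience variant of the same query (not in the original post) expresses the buffer usage as a percentage per node and log part:

-- Redo buffer utilization in percent, per node and log part
SELECT node_id, log_part, 100 * used / total AS used_pct
FROM ndbinfo.logbuffers;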

I then increased the load and inserted 2 x 5120B BLOBs (from 16 threads, one MySQL server), and ran at an insert speed of 4320TPS:
mysql> select * from ndbinfo.logbuffers;
+---------+----------+--------+----------+----------+----------+
| node_id | log_type | log_id | log_part | total | used |
+---------+----------+--------+----------+----------+----------+
| 3 | REDO | 0 | 1 | 50331648 | 11468800 |
| 3 | REDO | 0 | 2 | 50331648 | 31522816 |
| 3 | REDO | 0 | 3 | 50331648 | 42008576 |
| 3 | REDO | 0 | 4 | 50331648 | 43057152 |
| 4 | REDO | 0 | 1 | 50331648 | 14090240 |
| 4 | REDO | 0 | 2 | 50331648 | 17432576 |
| 4 | REDO | 0 | 3 | 50331648 | 10321920 |
| 4 | REDO | 0 | 4 | 50331648 | 12615680 |
+---------+----------+--------+----------+----------+----------+

Above you can see that the redo buffers are used (the load is spread around, and it is hard to catch a moment where the load is even on all buffers), and now the application started to throw the error "Got temporary error 1221 'REDO buffers overloaded (increase RedoBuffer)' from NDBCLUSTER (1297)".

I can now follow the instruction to increase the RedoBuffer, but would it help in this case?
No, no and no.
The disk is too slow to keep up and cannot write out the redo log at the same rate as the application writes.

'iostat' gives:
> iostat -kx 1

Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
cciss/c0d0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
cciss/c0d1 0.00 27796.00 0.00 1454.00 0.00 115196.00 158.45 12.03 8.25 0.66 95.30
dm-0 0.00 0.00 0.00 29270.00 0.00 117080.00 8.00 274.79 9.33 0.03 95.20
dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-3 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-4 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
dm-5 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00


And here you can see that the disks are quite utilized. This means that I have two options now if I want to be able to sustain the 4320TPS insert load:
  • Increase the number of data nodes (computers) - instead of two computers I should have four, so that the load is spread across more hardware
  • Improve my disk subsystem (add better disks, e.g. 2-4 disk spindles to spread the load on), or put the REDO log on device cciss/c0d1 and the LCP on device cciss/c0d0.
The CPU - could that also have been a bottleneck in this case? No, it was not the issue. The CMVMI thread (one of the data node threads) spent 44.4% of its time polling data from the other nodes, and it reads in quite large packets, which is why it was the heaviest CPU user among the data node threads.
5453 root      20   0 6594m 4.1g 6956 R 44.4 51.9   4:05.64 ndbmtd
5471 root 20 0 6594m 4.1g 6956 S 32.5 51.9 3:39.07 ndbmtd
5474 root 20 0 6594m 4.1g 6956 R 26.6 51.9 2:25.55 ndbmtd
5475 root 20 0 6594m 4.1g 6956 S 23.7 51.9 2:25.01 ndbmtd
5476 root 20 0 6594m 4.1g 6956 R 23.7 51.9 2:20.83 ndbmtd
5473 root 20 0 6594m 4.1g 6956 R 21.7 51.9 2:26.57 ndbmtd

Wednesday, April 07, 2010

MySQL Cluster - BLOB performance and other things

At the UC 2010 I will have a session on MySQL Cluster Performance Tuning. This session will address a lot of the most common performance problems I see in my day to day job with Cluster. A small excerpt of the talk is below; many other things will be addressed in the session (JOINs, schema optimization, batching, indexes, parameter tuning, etc.). I hope to see you there!

First a bit on ndb_autoincrement_prefetch_sz and then blobs!

ndb_autoincrement_prefetch_sz

One thing is contention on auto_increments, which can really slow down performance.
By default ndb_autoincrement_prefetch_sz=1. This means that the mysqld will cache one auto_increment number and then go down to the data nodes to fetch the next number. It is better to let the mysqld cache more numbers, to avoid unnecessary round-trips.

From MySQL Cluster 7.0.13 you can set it up to 65536 (previously the max was 256).

With ndb_autoincrement_prefetch_sz=1024 the mysqld will cache 1024 numbers before fetching the next range of numbers from the data nodes.
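A minimal sketch (not from the original post) of setting it at runtime; it can also be put in my.cnf so it survives a mysqld restart:

-- Put ndb_autoincrement_prefetch_sz=1024 in my.cnf to make it permanent
SET GLOBAL ndb_autoincrement_prefetch_sz = 1024;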

Some numbers for inserting batches of 16 records from 8 concurrent threads on one mysqld:
ndb_autoincrement_prefetch_sz=1:    1211.91TPS
ndb_autoincrement_prefetch_sz=256:  3471.71TPS
ndb_autoincrement_prefetch_sz=1024: 3659.52TPS
Here we got a 3x improvement in INSERT performance. Nice!

This test is by no means trying to max out the cluster, far from it, just to illustrate how important it is to set the ndb_autoincrement_prefetch_sz.

BLOB/TEXT

Another is BLOB/TEXT attributes, which in many cases are overused in applications (e.g. there is no reason to store an 'email' as TEXT, or to use TEXT/BLOB if the data is less than about 8000B).

If you can change them to VARBINARY/VARCHAR (as LinuxJedi suggests), do it:

BLOBs/TEXTs are significantly slower compared to VARBINARY/VARCHAR (because the BLOBs are stored in a separate table, and need to be locked with at least a shared lock when accessed).
CREATE TABLE `t1_blob` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data1` blob,
`data2` blob,
PRIMARY KEY (`id`)
)ENGINE=ndbcluster
and inserted 200K records. sizeof(data1) = 1024B, sizeof(data2) = 1024B.

SELECT data1, data2 FROM t1 WHERE id=<rand>

Executed from 1 App - 8 Threads on one MySQLD and two data nodes resulted in:

data1 and data2 represented as BLOBs : 5844 TPS
data1 and data2 represented as VARBINARYs: 19206 TPS

Again, this is not maxing out the data nodes; it is just to show you the idea.
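The original post does not show the exact VARBINARY variant of the table; it presumably looked something like this (a sketch only):

-- Sketch of the VARBINARY variant used for the comparison (not the exact definition)
CREATE TABLE `t1_varbinary` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data1` varbinary(1024) DEFAULT NULL,
`data2` varbinary(1024) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=ndbcluster;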

Conclusion
If you can replace TEXT/BLOB with VARCHAR/VARBINARY (remember that the max record size in Cluster is 8052B) do it.

Thursday, March 25, 2010

UC2010 - MySQL Cluster Deploy and Perf Tuning BP

At the MySQL UC 2010 (on Tuesday 4/14 11:55 am) my colleague Joffrey Michaie and I will present MySQL Cluster - Deployment Best Practices. We will talk about what is important to think about when deploying MySQL Cluster, what to do, what not to do, operational aspects and a few other practical things.
This session will be a great follow-on to the introductory session on MySQL Cluster (and the tutorial).

After the Deployment session, same day at 2:00pm, I will also have a session on MySQL Cluster Performance Tuning Best Practices. In this session you will learn tips and tricks on how to tune your Cluster, e.g. how to design your schema and queries so they run optimally on MySQL Cluster.

So if you want to see what happens after this slide you should come and join us!

Wednesday, February 03, 2010

MySQL Cluster Configurator v3

Version 3.0 of the severalnines/Configurator has been released:
  • Improved scripts (a lot of cleanup), prefixing the script output with the hostname:
    E.g.:
    Cluster: Cluster Start
    Cluster: STARTING MANAGEMENT SERVERS
    ps-ndb01: Starting management server (nodeid=1)
    ps-ndb01: Copying ../config/config.ini to /etc/mysql
    ps-ndb01: Started management server (nodeid=1, pid=28253)
    ...

  • Reduced number of scripts
    start-cluster-initial.sh --> start-cluster.sh --initial
    rolling-restart-initial.sh --> rolling-restart.sh --initial
    start-ndbd-<host>-<id>-initial.sh --> start-ndbd-<host>-<id>.sh --initial
  • Better error handling during rolling restarts
  • Scripts are checking if Node Failure handling is ready before starting nodes
  • start-ndbd-<host>-<id>.sh - new parameters [--initial] [--nowait] [--ignore]
    --initial - initial node start (clears out the local ndb_fs)
    --nowait - script will exit as soon as data node is starting
    --ignore - ignore waiting for Node Failure handling to complete
  • Supports MySQL Cluster 7.1.1
Here is an example of how to try out 7.1.1 using the Configurator

1. Generate a Configuration (make sure you select "MySQL Cluster 7.1.1")
  • Currently 7.1.1 is only in source format so you need to have ncurses-devel/gcc/g++/make installed


2. Unpack the tarball you get and start the installation!
You are recommended to set up shared ssh keys between the "Frontend" (where you run the scripts) and the other nodes in the cluster, in order to avoid typing passwords all the time.
tar xvfz mysqlcluster-71.tar.gz
cd mysqlcluster-71
cd cluster/scripts/install/
./shared-ssh-keys.sh ###WARNING - This script is experimental, let me know if it works!
./download-and-compile.sh
./install-cluster.sh
./bootstrap.sh
cd ..
./start-cluster.sh --initial
Voila!

MySQL Cluster 7.1.1 (beta) - what's in there

The main new features in MySQL Cluster 7.1.1 (beta) are the following:
  • ndbinfo (aka ndb$info) - Finally!! System tables for MySQL Cluster describing live usage of a lot of resources (RedoBuffer, Redo Log space, counters etc etc). Looks promising - makes Cluster a lot more transparent. See examples below.
  • Cluster/J - for more information and to learn more about it, register for a webinar and meet the architect and developer of Cluster/J!
Severalnines/Configurator supports 7.1.1 - try it out!

NDBINFO - examples

Ever wondered how much of the RedoBuffer you are actually using?
mysql>  SELECT * FROM ndbinfo.logbuffers;
+---------+----------+--------+----------+----------+--------+
| node_id | log_type | log_id | log_part | total    | used   |
+---------+----------+--------+----------+----------+--------+
| 3 | REDO | 0 | 1 | 33554432 | 229376 |
| 3 | REDO | 0 | 2 | 33554432 | 229376 |
| 3 | REDO | 0 | 3 | 33554432 | 98304 |
| 3 | REDO | 0 | 4 | 33554432 | 229376 |
| 4 | REDO | 0 | 1 | 33554432 | 262144 |
| 4 | REDO | 0 | 2 | 33554432 | 65536 |
| 4 | REDO | 0 | 3 | 33554432 | 98304 |
| 4 | REDO | 0 | 4 | 33554432 | 262144 |
+---------+----------+--------+----------+----------+--------+
And how about the Redo log space?
mysql>  SELECT * FROM ndbinfo.logspaces;
+---------+----------+--------+----------+------------+-----------+
| node_id | log_type | log_id | log_part | total | used |
+---------+----------+--------+----------+------------+-----------+
| 3 | REDO | 0 | 0 | 3221225472 | 200278016 |
| 3 | REDO | 0 | 0 | 3221225472 | 201326592 |
| 3 | REDO | 0 | 0 | 3221225472 | 202375168 |
| 3 | REDO | 0 | 0 | 3221225472 | 202375168 |
| 4 | REDO | 0 | 0 | 3221225472 | 201326592 |
| 4 | REDO | 0 | 0 | 3221225472 | 201326592 |
| 4 | REDO | 0 | 0 | 3221225472 | 202375168 |
| 4 | REDO | 0 | 0 | 3221225472 | 202375168 |
+---------+----------+--------+----------+------------+-----------+
8 rows in set (0.00 sec)
Statistics ...
mysql> SELECT * FROM ndbinfo.counters;
+---------+------------+----------------+------------+--------------+-----------+
| node_id | block_name | block_instance | counter_id | counter_name | val |
+---------+------------+----------------+------------+--------------+-----------+
| 3 | DBLQH | 1 | 10 | OPERATIONS | 1747971 |
| 3 | DBLQH | 2 | 10 | OPERATIONS | 1748870 |
| 3 | DBLQH | 3 | 10 | OPERATIONS | 1749056 |
| 3 | DBLQH | 4 | 10 | OPERATIONS | 1777925 |
| 4 | DBLQH | 1 | 10 | OPERATIONS | 1747976 |
| 4 | DBLQH | 2 | 10 | OPERATIONS | 1749008 |
| 4 | DBLQH | 3 | 10 | OPERATIONS | 1749033 |
| 4 | DBLQH | 4 | 10 | OPERATIONS | 1777928 |
| 3 | DBTC | 0 | 1 | ATTRINFO | 470827014 |
| 3 | DBTC | 0 | 2 | TRANSACTIONS | 245150 |
| 3 | DBTC | 0 | 3 | COMMITS | 245137 |
| 3 | DBTC | 0 | 4 | READS | 25 |
| 3 | DBTC | 0 | 5 | SIMPLE_READS | 0 |
| 3 | DBTC | 0 | 6 | WRITES | 3512472 |
| 3 | DBTC | 0 | 7 | ABORTS | 13 |
| 3 | DBTC | 0 | 8 | TABLE_SCANS | 29 |
| 3 | DBTC | 0 | 9 | RANGE_SCANS | 0 |
| 4 | DBTC | 0 | 1 | ATTRINFO | 472683577 |
| 4 | DBTC | 0 | 2 | TRANSACTIONS | 218857 |
| 4 | DBTC | 0 | 3 | COMMITS | 218834 |
| 4 | DBTC | 0 | 4 | READS | 10 |
| 4 | DBTC | 0 | 5 | SIMPLE_READS | 1 |
| 4 | DBTC | 0 | 6 | WRITES | 3501562 |
| 4 | DBTC | 0 | 7 | ABORTS | 22 |
| 4 | DBTC | 0 | 8 | TABLE_SCANS | 3 |
| 4 | DBTC | 0 | 9 | RANGE_SCANS | 0 |
+---------+------------+----------------+------------+--------------+-----------+
26 rows in set (0.12 sec)
And there is a lot more!
mysql> show tables;
+-------------------+
| Tables_in_ndbinfo |
+-------------------+
| blocks |
| config_params |
| counters |
| logbuffers |
| logspaces |
| memoryusage |
| nodes |
| pools |
| resources |
| transporters |
+-------------------+
10 rows in set (0.01 sec)

Monday, February 01, 2010

Cluster Performance Tuning Webinar - EMEA

On the 2nd of February, at 0900GMT/1000CET, I will have a webinar on Cluster Performance Tuning for people located in the EMEA timezones.

For more information and registration:

http://mysql.com/news-and-events/web-seminars/display-480.html


In addition to what is mentioned in the Agenda I will also present numbers on typical operations, how to design requests, explain how the optimizer works with MySQL Cluster etc etc.

See you there!

-johan

Wednesday, January 20, 2010

Gearman meets MySQL Cluster (NDBAPI)

After a discussion with my colleague Stephane Varoqui we decided to see how Gearman and the NDBAPI could be used together. The result of the POC was a Gearman worker and a couple of clients (clients and workers use Google Protocol Buffers as the protocol). The worker can:
  • set/get/delete records on a single table in MySQL Cluster using the primary key
  • set/get/delete "any" type. It is not possible to dynamically add types; new types are added at compile time.
  • supports the following SQL data types: (UNSIGNED) INTEGER, (UNSIGNED) BIGINT, CHAR, VARCHAR/VARBINARY
  • supports the following Google Protocol Buffer scalars: int32, uint32, int64, uint64, string, bytes.
  • it does not handle many errors for the time being
and a client that can
  • create a message and send it to the Gearman Job Server
  • clients can be written in C++, Java, or Python (subject to what languages Google Protocol Buffers supports)
  • receive (deserialize) the data.
So basically this is a new, albeit simple, connector to MySQL Cluster! Hopefully someone will find it useful.

The code can be downloaded here and some short instructions are here, and if you guys out there think this is usable, then it might make it to launchpad. Let me know!

Here follows some information what has been done and how to use this.

First you have to create the relational tables (engine=ndb). I will use a 'Person' (for the rest of the examples) that I will persist to the database. I have created the following relational table:
CREATE TABLE `Person` (
`id` int(11) NOT NULL,
`name` varchar(128) DEFAULT NULL,
`message` varchar(1024) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1
The relational tables need to be translated to Google Protocol Buffers:
> cat proto/Person.proto
message Person {
required int32 id = 1; //PRIMARY KEY attributes are 'required'
optional string name = 2;
optional string message = 3;
}
All columns in the relational table must exist in the protocol buffer definition.

There is a union proto buffer called NdbMessage.proto that contains all the proto buffers that can be sent between the client and the worker:
> cat proto/NdbMessage.proto
// import the relevant protos that can be in
// the NdbMessage
import "Person.proto";

message NdbMessage {
enum Type { Person=1;}
// Identifies which field is filled in.
required Type type = 1;
// list of possible protos comes here:
optional Person person = 2;
}

The proto files then need to be run through the protocol buffer compiler:
> /usr/local/bin/protoc --proto_path=proto --cpp_out=`pwd` proto/Person.proto proto/NdbMessage.proto
# this generates .h and .cc files for Proto buffer files.
There are three clients for this, one for each operation (set,get,delete).

In ndbapi_set_client.cpp (the clients are based on the reverse_client.cpp in gearman) we invoke the 'ndbapi_set' function, which will be executed by the worker:

#include "NdbMessage.pb.h"
#include "Person.pb.h"

.
/**instantiate a NdbMessage object and associate a Person to it*/
NdbMessage m;
m.set_type(NdbMessage_Type_Person);
Person * p=m.mutable_person();
/* I must set all fields for now */
p->set_id(1);
p->set_name("Johan Andersson");
p->set_message("hello world, my first insert");

string s;
m.SerializeToString(&s);

...

const char * data = s.data();
result= (char*)gearman_client_do(&client,
"ndbapi_set",
NULL,
(void *)data,
(size_t)s.size(), /* size of the serialized NdbMessage */
&result_size,
&ret);

The worker (ndbapi_worker.cpp) registers three functions:
ret=gearman_worker_add_function(&worker, "ndbapi_get", 0, ndbapi_get,NULL);
ret=gearman_worker_add_function(&worker, "ndbapi_set", 0, ndbapi_set,NULL);
ret=gearman_worker_add_function(&worker, "ndbapi_delete", 0, ndbapi_delete,NULL);

And when the worker receives the function call to 'ndbapi_set' it has to deserialize the received message into a NdbMessage:

static void *ndbapi_set(gearman_job_st *job,
void *context,
size_t *result_size,
gearman_return_t *ret_ptr)
{
/** receive message and convert into c++ string
* construct the wanted object (dataObject) from parsing c++ string.
*/
const void * rawmessage;
rawmessage= gearman_job_workload(job);
string s((char*)rawmessage, gearman_job_workload_size(job));

NdbMessage dataObject;
if(! dataObject.ParseFromString(s))
{
*ret_ptr= GEARMAN_WORK_FAIL;
return NULL;
}

The worker then looks at the type of the message and gets the underlying object (in this case Person):

google::protobuf::Message * message;
switch(dataObject.type())
{
case NdbMessage_Type_Person:
{
message= (google::protobuf::Message*)&dataObject.person();
reflection = (google::protobuf::Reflection *)message->GetReflection();
descriptor = (google::protobuf::Descriptor*)message->GetDescriptor();
}
break;
/*
case NdbMessage_Type_MyType:
{
// the myType() .. is the name of the field in MyType.proto:
// MyType myType = ;
message= (google::protobuf::Message*)&dataObject.myType();
reflection = (google::protobuf::Reflection *)message->GetReflection();
descriptor = (google::protobuf::Descriptor*)message->GetDescriptor();
}
break;
*/
default:
cout << "unknown type" << endl;
*ret_ptr= GEARMAN_WORK_FAIL;
return NULL;
}
/* ... the NDBAPI transaction that performs the insert goes here (omitted) ... */
/* the insert was successful */
*result_size=0;
*ret_ptr= GEARMAN_SUCCESS;
return NULL;
}

In order to add a new type, you need to add a new 'case' to handle the type and how to get that object from the NdbMessage object (dataObject).

The worker loops over all fields in the received Proto Message, creates a transaction in the NDBAPI and executes it. Thus this part is agnostic to the type you give it, as long as the following is true:
  • The relational table only uses (UNSIGNED) INTEGER, (UNSIGNED) BIGINT, CHAR, VARCHAR/VARBINARY data types.
  • The .proto definition contains all columns in the relational table
  • The .proto file marks the PRIMARY KEY of the relational table as 'required'
  • For 'ndbapi_set' you need to set all columns in the table (I will fix that as soon as possible)
The data is then persisted in the table (currently the worker expects all tables to be stored in the 'test' database):
mysql> select * from Person;
+----+-----------------+------------------------------+
| id | name | message |
+----+-----------------+------------------------------+
| 1 | Johan Andersson | hello world, my first insert |
+----+-----------------+------------------------------+
1 row in set (0.00 sec)


The worker can also handle 'get' requests, using the get_client:
> ./get_client  1
name: Johan Andersson
message: hello world, my first insert
And there is also a client that does deletes (delete_client):
> ./delete_client  1
Delete successful: id=1
Summary/Conclusions
  • What are the performance implications of using Proto Buffer's reflection mechanism?
  • Proto Buffer only works for C++, Java, and Python - currently no support for PHP.
  • Is it better to use something else than Proto Buffers for this?
  • Gearman was super-easy to install so thanks for that!
  • Google Protocol Buffers was super-easy to install so thanks for that!
  • The worker also needs to be extended to support range searches and to make use of the batching interface, so that it is possible to persist either many types or many instances of a type in one batch.
  • NO-SQL -- YES-NDB !

Thursday, January 14, 2010

CMON - Install Instructions

CMON - the Cluster Monitor - has recently been released, and here is a little how-to covering installation from binary and source (at the end), deployment, etc.

1. Download the cmon binary package to a monitoring host

Currently there are binaries available for Linux 64-bit and 32-bit (statically linked), and they work with MySQL Cluster 7.0.9 and later.

In this case the monitoring host will be on 'ndb05' (IP address 10.0.1.5) - see 9. Suggested Deployment for a picture describing the setup.

The monitoring machine should have the following installed:
  • a mysql server up and running that will store the cmon database (if you build from source you need to have the mysql cluster libraries and include files)
  • apache (if you want to have the web interface) + php_mysql
  • rrdtool (if you want to have graphs in web interface)
The picture at the end of this post shows how you can deploy CMON.
> cd /usr/local
> wget http://www.severalnines.com/downloads/cmon/cmon-1.0.2-64bit-glibc23-mysqlcluster-709.tar.gz
> tar xvfz cmon-1.0.2-64bit-glibc23-mysqlcluster-709.tar.gz
> ln -s cmon-1.0.2-64bit-glibc23-mysqlcluster-709 cmon


2. Verify that you have a free API slot [mysqld] so that cmon can connect to MySQL Cluster

> ndb_mgm -e "show"
Cluster Configuration
---------------------
[ndbd(NDB)] 2 node(s)
id=3 @10.0.1.3 (mysql-5.1.39 ndb-7.0.9, Nodegroup: 0, Master)
id=4 @10.0.1.4 (mysql-5.1.39 ndb-7.0.9, Nodegroup: 0)

[ndb_mgmd(MGM)] 2 node(s)
id=1 @10.0.1.1 (mysql-5.1.39 ndb-7.0.9)
id=2 @10.0.1.2 (mysql-5.1.39 ndb-7.0.9)

[mysqld(API)] 6 node(s)
id=7 @10.0.1.1 (mysql-5.1.39 ndb-7.0.9)
id=8 @10.0.1.2 (mysql-5.1.39 ndb-7.0.9)
id=9 (not connected, accepting connect from any host)
id=10 (not connected, accepting connect from 10.0.1.5)

Yes, I have two slots - if I run cmon from host 10.0.1.5 then it is fine (id=10), but in this case I can connect cmon from any host (id=9). The point is that you must have a free slot accepting connections from "any host" or from an explicit host.

You must also have MemReportFrequency=30 (or some other value, in seconds) in order to get the memory usage events sent from MySQL Cluster. Otherwise, CMON will not be able to present memory utilization information!


3. Run the cmon install script

> cd /usr/local/cmon/bin
> ./cmon_install.sh

This script will install cmon, rrd, init.d scripts, and generate SQL scripts for GRANTs
Distribution: fedora

**** MYSQL CONNECTION ****
CMON and the RRD scripts needs a mysql connection to speak mysql server holding the cmon database.
Specify the BASEDIR where mysql is installed (default /usr/local/mysql/): /usr/local/mysql/mysql

Specify the hostname of the mysql server having the cmon database (default 'ndb05'): <return>
No hostname specified - using default 'ndb05'

Specify the port of the mysql server having the cmon database (default 3306): <return>
No port specified - using default 3306

Specify the password for the 'cmon' user (default no password): <return>
No password specified - using default (no password)

Specify the ndb-connectstring to the cluster (e.g, host_A;host_B): ndb01;ndb02
## Comment: ndb01;ndb02 are the hostnames of the two management
## servers in my cluster. YOU MUST SPECIFY THIS!

**** WWW interface ****
The www files will be copied to /var/www/html/cmon/
## Comment: The installation scripts tries to find the default
## location for www used in your distribution
Specify the WWWROOT of your webserver (default /var/www/html): <return>
Copying files to /var/www/html ..

**** RRD ****
## Comment: If you don't have RRD installed then graphs will not be available
## from the web client, but there is no other functional impact!
Specify the full path to 'rrdtool' (default is /usr/bin/): <return>
No path to rrd specified - using default /usr/bin/

The rrdtool stores data files in a data directory.
Specify the full path to the data directory (about 20MB free space will be neeeded).
RRD data directory (default is /data/rrd/): <return>
No RRD data directory specified - using default /data/rrd/
Saving ../etc/cmon.conf

**** INITD SCRIPTS ****
Do you want to install /etc/init.d/cmon (y/n)? : y
Specify the directory where CMON should write its pidfile (default /var/run/): <return>
chkconfig
Saving configuration to ../etc/init.d/cmon
Installing /etc/init.d/cmon
Done - Installed init.d scripts
Now you can start cmon with '/etc/init.d/cmon start'

**** INSTALL CRONTAB ****
cron schedules jobs every 5 minutes to update the rrd database and generarate graphs for the web interface. You are recommended to install the cron jobs
Do you want to install cron jobs for cmon (y/n)? : y
You need to issue the following GRANTs before starting CMON:

GRANT super, replication client ON *.* TO 'cmon'@'ndb05';
GRANT select,update,insert,delete,create ON cmon.* TO 'cmon'@'ndb05';


Configuration now complete - you can edit the /usr/local/cmon/bin/../etc/cmon.conf manually if you wish.

Configuration is now complete, but you need to apply the suggested GRANTs to the cmon database:

## COMMENT: Connect a mysql client to the cmon database and do
## (actual GRANTs are subject to your particular settings):
mysql> GRANT super, replication client ON *.* TO 'cmon'@'ndb05';
mysql> GRANT select,update,insert,delete,create ON cmon.* TO 'cmon'@'ndb05';

4. Start CMON

## COMMENT: Make sure CMON database and Cluster is started (so initialization won't fail)
> /etc/init.d/cmon start
Starting cmon version 1.0.0 with the following parameters:
--mysqlpasswd=
--mysqlhost=ndb05
--mysqlport=3306
--ndb-connectstring=ndb01;ndb02
--savetime-clusterlog=48 (hours)

If that doesn't look correct, kill cmon and restart with -? for help on the parameters, or change the params in /etc/init.d/cmon

You need to GRANT (and specify a password if you wish) the following on mysql on ndb05:

GRANT create,select,update,insert,delete on cmon.* to 'cmon'@'ndb05';
GRANT super on *.* to 'cmon'@'ndb05';

Testing connection to mysqld..
Connection ok..
Please wait while cmon is starting up..
Recreating missing tables
Done
Registering managed cluster with cluster id=1
Managed cluster has been registered - registered cluster id=1
cmon has started successfully.
Going to daemoinze.. - cmon will write a log in syslog from now
start on service executed successfully
5. Access the Cluster status from SQL
## COMMENT: connect a mysql client to the mysql server holding the cmon database.
mysql> use cmon;

Database changed

mysql> show tables;
+-------------------------+
| Tables_in_cmon |
+-------------------------+
| alarm |
| alarm_log |
| backup |
| backup_log |
| cluster |
| cluster_log |
| cluster_state |
| cluster_statistics |
| configurator_nodemap |
| diskdata |
| email_notification |
| mailserver |
| memory_usage |
| mysql_global_statistics |
| mysql_master_status |
| mysql_server |
| mysql_slave_status |
| mysql_statistics |
| mysql_variables |
| node_state |
| node_statistics |
| restore |
| restore_log |
| schema_object |
+-------------------------+
24 rows in set (0.00 sec)

mysql> select status from cluster_state;
+---------+
| status |
+---------+
| STARTED |
+---------+
1 row in set (0.00 sec)

mysql> select * from node_state;
+-----+--------+------------+-----------+-----------+---------------+---------+-------------+-----------------+---------------------+
| cid | nodeid | status | node_type | nodegroup | host | version | disconnects | last_disconnect | report_ts |
+-----+--------+------------+-----------+-----------+---------------+---------+-------------+-----------------+---------------------+
| 1 | 1 | CONNECTED | NDB_MGMD | NULL | 10.0.1.1 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 2 | CONNECTED | NDB_MGMD | NULL | 10.0.1.2 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 3 | STARTED | NDBD | 0 | 10.0.1.3 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 4 | STARTED | NDBD | 0 | 10.0.1.4 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 7 | CONNECTED | API | NULL | 10.0.1.1 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 8 | CONNECTED | API | NULL | 10.0.1.2 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 13 | NO_CONTACT | API | NULL | 0.0.0.0 | NULL | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 12 | NO_CONTACT | API | NULL | 0.0.0.0 | NULL | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 11 | NO_CONTACT | API | NULL | 0.0.0.0 | NULL | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 10 | NO_CONTACT | API | NULL | 0.0.0.0 | NULL | 0 | NULL | 2010-01-13 12:00:56 |
| 1 | 9 | CONNECTED | API | NULL | 10.0.1.5 | 7.0.9 | 0 | NULL | 2010-01-13 12:00:56 |
+-----+--------+------------+-----------+-----------+---------------+---------+-------------+-----------------+---------------------+
6. Web interface

Requires that you have a webserver with php-mysql installed.
Simply connect your web browser to e.g. 127.0.0.1:8080/cmon

7. Add MySQL Servers to be monitored

As MySQL Cluster is not able to distinguish between a plain API node and a MySQL Server, we need to tell CMON where our mysql servers are!
> cd /usr/local/cmon/bin
> ./cmon_add_mysqld.sh -ndb01 -p3306
Using default config file path ../etc/cmon.conf
Added MySQL Server

You need to execute the following GRANTs on mysqld@ndb01 for CMON agent to be able to connect:

GRANT REPLICATION SLAVE, SUPER ON *.* TO 'cmon'@'ndb05';
GRANT SELECT ON mysql.* TO 'cmon'@'ndb05';
And so on, you need to add every mysql server you want to monitor.

8. Scripts

All information is stored in SQL tables so it is easy to write scripts around it. Here you can find some example scripts.
## COMMENT: Put the scripts in /usr/local/cmon/bin
> cd /usr/local/cmon/bin
> sh get_cluster_status.sh
STARTED

> sh get_node_status.sh -n 4
STARTED
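Since everything is stored in plain SQL tables, you can of course also query them directly. A hypothetical example (not from the original post), using the node_state table shown earlier, that lists data nodes that are not started:

-- Hypothetical example: list data nodes that are not in STARTED state
SELECT nodeid, host, status
FROM cmon.node_state
WHERE node_type = 'NDBD' AND status <> 'STARTED';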
9. Suggested Deployment
Recommendation: If you have the 'mysqld' on 10.0.1.5, connect it also to Cluster and you can use it for administration!


10. Building from source

If you prefer to build from source then you have to have:
  • MySQL Cluster binaries/libs/includes installed on the build machine
  • gcc/g++/make
  • mysql_config and ndb_config on the PATH
> which mysql_config
/usr/local/mysql/mysql/bin/mysql_config
> which ndb_config
/usr/local/mysql/mysql/bin/ndb_config

## COMMENT: If 'mysql_config' or 'ndb_config' is not on the PATH,
## then you must make sure it is!

> wget http://www.severalnines.com/downloads/cmon/cmon-1.0.2.tar.gz
> tar xvfz cmon-1.0.2.tar.gz
> cd cmon-1.0.2
> ./configure --prefix=/usr/local/cmon/
## COMMENT: end the prefix with 'cmon' and life will be easier.
> make
> sudo make install
> cd /usr/local/cmon/
Proceed at section 2 above.
11. Bugs and problems

I would be very happy if you file bugs on launchpad! If you don't want to file a bug report, you can send an email to 'support (at) severalnines (dot) com'.
Also, if you have suggestions for improvements, let me know!