Tuesday, October 31, 2006

NdbScanFilter

Sometimes the NDBAPI is not perhaps the most straightforward. This might be useful if you are struggling with scan filters (filtering tuples in the ndbd kernel, like engine-condition-pushdown) in combination with strings. Please have a look at my previous blog about how to use strings in the NDBAPI.

If you are using string daratypes with the NdbScanFilter::COND_EQ then the following rules applies:
  • VARCHAR and VARBINARY strings must have their length prepended (1 or 2 bytes)
  • CHAR must be space padded
  • BINARY must be null byte padded
However, for NdbScanFilter::COND_LIKE the following applies:
  • VARCHAR and VARBINARY must NOThave the lenght prepended!
  • CHAR must NOT be space padded
  • BINARY must NOT be null byte padded
Many thanks to Pekka (MySQL) and Jim Dowling (MySQL) on this subject.

Monday, October 30, 2006

Using CHAR/VARCHAR data types in NDBAPI

Many people have problem how to use the string data types (char/varchar) from the NDBAPI. Here I explain how you can use them.

First of all, let's create a table we can use as an example:

CREATE TABLE `t1` (
`a` int(11) NOT NULL,
`b` char(64) DEFAULT NULL,
`c` varchar(255) DEFAULT NULL,
`d` varchar(256) DEFAULT NULL,
PRIMARY KEY (`a`)
) ENGINE=ndbcluster DEFAULT CHARSET=latin1;


CHAR

In the MySQL server CHARs are space padded to the length of the CHAR.

Example:
insert into t1(a,b) values (1, 'mysql'); -- 'mysql' is 5 bytes long.

The MySQL server will now space-pad the string up to the length (in this case 64 bytes, given the character set used).

However, the NDBAPI does not respect this and does not care if it is space-padded, null-padded, or even 5-padded. But, if you want to be able to query the data that you have inserted/updated from the NDBAPI from the MySQL server, then you have to space-pad the data, because the MySQL Server requires this. Otherwise, the data will not display properly. So, in the NDBAPI you are responsible to space-pad the data!

From the NDBAPI, the insert above would be represented as (no error handling):

int a=1;
char b[64];
char str = "mysql";
memset(b,32,sizeof(b)); //ascii 32 is 'space'
memcpy(b,str,strlen(str));

NdbTransaction * t = ndb>startTransaction();
NdbOperation * op = t>getNdbOperation("t1");

op>insertTuple();
op>equal("a", (const char*)&a);
op>setValue("b",b);
t>execute(NdbTransaction::Commit, NdbTransaction::AbortOnError, 1);


Upon retrieval using the NDBAPI, you will have to strip the trailing spaces yourself (stripping is not shown in the example):


char b[65]; // length of column C + 1 for null character
memset(b,0,sizeof(b));

NdbTransaction * t = ndb>startTransaction();
NdbOperation * op = t>getNdbOperation("t1");
op>committedRead();
op>equal("a", (const char*)&a);
op>getValue("b",b));
t>execute(NdbTransaction::Commit, NdbTransaction::AbortOnError, 1);


cout << "a= " << a << "b=" << b << endl;


Will printout
a=1 b=mysql<59 trailing spaces>


VARCHAR

The VARCHARs data type stores the length of the string in the first byte (please not that different character sets exists with different storage requirements), followed by the actual string. There is an exception to this. If the VARCHAR is longer than 256 characters, then the first two bytes store the length of the string that follows.

Example:
insert into t1(a,c) values (2, 'mysql cluster'); -- 'mysql cluster' is 13 bytes long.

Internally, the length (13 bytes) is stored in the first byte, and then the string follows. In order to insert this string using the NDBAPI, then we will have to put together a buffer that stores the lenght in the first byte, and then the rest of the string. From the NDBAPI, the insert above would be represented as (no error handling):

char * str = "mysql cluster";
char c[strlen(str) + 1]; // +1 to store the length of the string. memset(c,0,sizeof(c));
memcpy(&c[1],str,strlen(str));
c[0]=strlen(str);
NdbTransaction * t = ndb>startTransaction();
NdbOperation * op = t>getNdbOperation("t1");
op>insertTuple();
op>equal("a", (const char*)&a);
op>setValue("c",c);
t>execute(NdbTransaction::Commit, NdbTransaction::AbortOnError, 1);

And to retrieve:

int a=2;
char c[255 + 1]; // length of column c + 1 byte for length

memset(c,0,sizeof(c));

NdbTransaction * t = ndb>startTransaction();
NdbOperation * op = t>getNdbOperation("t1");
op>committedRead();
op>equal("a", (const char*)&a);
op>getValue("c",c);
t>execute(NdbTransaction::Commit, NdbTransaction::AbortOnError, 1);

int len=c[0];
char c_real[len+1]; //+1B for null-terminator
c_real[len]=0;
memcpy(c_real, &c[1], len);

cout << "a= " << a << "c_real=" << c_real << endl;

Will printout:
a= 2 c_real= mysql cluster



LONG VARCHAR

If a VARCHAR is longer than 256B, then it is said to be a LONG VARCHAR. In a LONG VARCHAR the length is stored in the first two bytes, then the string follows. Below is an example on how to prepare a LONG VARCHAR in the NDBAPI.

insert into t1(a,d) values (3, 'how LONGVARCHAR works in the ndbapi);

would be represented as:

int a=3;
char * str = "how LONGVARCHAR works in the ndbapi";
int len = strlen(str);

char d[len + 2]; // +2B to store the length of the string.
memset(d,0,sizeof(d));
memcpy(&d[2],str, len);

d[0]=(char) ((len & 0x00FF))>>0;
d[1]= (char)((len & 0xFF00)>>8);

NdbTransaction * t = ndb>startTransaction();
NdbOperation * op = t>getNdbOperation("t1");
op>insertTuple();
op>equal("a", (const char*)&a);
op>setValue("d",d);
t>execute(NdbTransaction::Commit, NdbTransaction::AbortOnError, 1);


And to read it:

int a=3;

char d[256 + 2]; // length of column d + 2 byte for length

memset(d,0,sizeof(d));

NdbTransaction * t = ndb>startTransaction();
NdbOperation * op = t>getNdbOperation("t1");
op>committedRead();
op>equal("a", (const char*)&a);
op>getValue("d",d);
t>execute(NdbTransaction::Commit, NdbTransaction::AbortOnError, 1);

/*get the length of the string*/

int len = ((int) ((unsigned char) d[0]) + ((int) ((unsigned char) d[1]) << 8));
char d_real[len+1]; //+1B for null-terminator
d_real[len]=0;
memcpy(d_real, &d[2], len);

cout << "a= " << a <<" d_real=" << d_real << endl;


Will printout:
a= 3 d_real= how LONGVARCHAR works in the ndbapi


BINARY/VARBINARY/LONGVARBINARY

The same principles apply, but read the reference manual about the padding of BINARY/VARBINARY.