Pages

Friday, August 22, 2014

Functions - Cassandra Query Language

CQL3 supports a few functions (more to come). Currently, it only support functions on values (functions that transform one or more column values into a new value) and in particular aggregation functions are not supported. The functions supported are described below:

Token

The token function allows to compute the token for a given partition key. The exact signature of the token function depends on the table concerned and of the partitioner used by the cluster.
The type of the arguments of the token depend on the type of the partition key columns. The return type depend on the partitioner in use:
  • For Murmur3Partitioner, the return type is bigint.
  • For RandomPartitioner, the return type is varint.
  • For ByteOrderedPartitioner, the return type is blob.
For instance, in a cluster using the default Murmur3Partitioner, if a table is defined by
CREATE TABLE users (
    userid text PRIMARY KEY,
    username text,
    ...
)
then the token function will take a single argument of type text (in that case, the partition key is userid (there is no clustering columns so the partition key is the same than the primary key)), and the return type will bebigint.

Uuid

The uuid function takes no parameters and generates a random type 4 uuid suitable for use in INSERT or SET statements.

Timeuuid functions

now

The now function takes no arguments and generates a new unique timeuuid (at the time where the statement using it is executed). Note that this method is useful for insertion but is largely non-sensical in WHERE clauses. For instance, a query of the form
SELECT * FROM myTable WHERE t = now()
will never return any result by design, since the value returned by now() is guaranteed to be unique.

minTimeuuid and maxTimeuuid

The minTimeuuid (resp. maxTimeuuid) function takes a timestamp value t (which can be either a timestamp or a date string) and return a fake timeuuid corresponding to the smallest (resp. biggest) possibletimeuuid having for timestamp t. So for instance:
SELECT * FROM myTable WHERE t > maxTimeuuid('2013-01-01 00:05+0000') AND t < minTimeuuid('2013-02-02 10:00+0000')
will select all rows where the timeuuid column t is strictly older than ‘2013-01-01 00:05+0000’ but strictly younger than ‘2013-02-02 10:00+0000’. Please note that t >= maxTimeuuid('2013-01-01 00:05+0000') would still not select a timeuuid generated exactly at ‘2013-01-01 00:05+0000’ and is essentially equivalent to t > maxTimeuuid('2013-01-01 00:05+0000').
Warning: We called the values generated by minTimeuuid and maxTimeuuid fake UUID because they do no respect the Time-Based UUID generation process specified by the RFC 4122. In particular, the value returned by these 2 methods will not be unique. This means you should only use those methods for querying (as in the example above). Inserting the result of those methods is almost certainly a bad idea.

dateOf and unixTimestampOf

The dateOf and unixTimestampOf functions take a timeuuid argument and extract the embedded timestamp. However, while the dateof function return it with the timestamp type (that most client, including cqlsh, interpret as a date), the unixTimestampOf function returns it as a bigint raw value.

Blob conversion functions

A number of functions are provided to “convert” the native types into binary data (blob). For every <native-type> type supported by CQL3 (a notable exceptions is blob, for obvious reasons), the functiontypeAsBlob takes a argument of type type and return it as a blob. Conversely, the function blobAsType takes a 64-bit blob argument and convert it to a bigint value. And so for instance, bigintAsBlob(3) is0x0000000000000003 and blobAsBigint(0x0000000000000003) is 3.

No comments:

Post a Comment