CQL supports a rich set of data types for columns defined in a table, including collection types. On top of those native and collection types, users can also provide custom types (through a JAVA class extending
The following table gives additional informations on the native data types, and on which kind of constants each type supports:
For more information on how to use the collection types, see the Working with collections section below.
Timestamp can be input in CQL as simple long integers, giving the number of milliseconds since the epoch, as defined above.
They can also be input as string literals in any of the following ISO 8601 formats, each representing the time and date Mar 2, 2011, at 04:05:00 AM, GMT.:
The time of day may also be omitted, if the date is the only piece that matters:
The use of the counter type is limited in the following way:
Deleting a map record is done with:
Writing
Lists also provides the following operation: setting an element by its position in the list, removing an element by its position in the list and remove all the occurrence of a given value in the list. However, and contrarily to all the other collection operations, these three operations induce an internal read before the update, and will thus typically have slower performance characteristics. Those operations have the following syntax:
AbstractType
loadable by Cassandra). The syntax of types is thus:Note that the native types are keywords and as such are case-insensitive. They are however not reserved ones.<type> ::= <native-type> | <collection-type> | <string> // Used for custom types. The fully-qualified name of a JAVA class <native-type> ::= ascii | bigint | blob | boolean | counter | decimal | double | float | inet | int | text | timestamp | timeuuid | uuid | varchar | varint <collection-type> ::= list '<' <native-type> '>' | set '<' <native-type> '>' | map '<' <native-type> ',' <native-type> '>'
The following table gives additional informations on the native data types, and on which kind of constants each type supports:
type | constants supported | description |
---|---|---|
ascii | strings | ASCII character string |
bigint | integers | 64-bit signed long |
blob | blobs | Arbitrary bytes (no validation) |
boolean | booleans | true or false |
counter | integers | Counter column (64-bit signed value). See Counters for details |
decimal | integers, floats | Variable-precision decimal |
double | integers | 64-bit IEEE-754 floating point |
float | integers, floats | 32-bit IEEE-754 floating point |
inet | strings | An IP address. It can be either 4 bytes long (IPv4) or 16 bytes long (IPv6). There is no inet constant, IP address should be inputed as strings |
int | integers | 32-bit signed int |
text | strings | UTF8 encoded string |
timestamp | integers, strings | A timestamp. Strings constant are allow to input timestamps as dates, see Working with dates below for more information. |
timeuuid | uuids | Type 1 UUID. This is generally used as a “conflict-free” timestamp. Also see the functions on Timeuuid |
uuid | uuids | Type 1 or type 4 UUID |
varchar | strings | UTF8 encoded string |
varint | integers | Arbitrary-precision integer |
Working with dates
Values of thetimestamp
type are encoded as 64-bit signed integers representing a number of milliseconds since the standard base time known as “the epoch”: January 1 1970 at 00:00:00 GMT.Timestamp can be input in CQL as simple long integers, giving the number of milliseconds since the epoch, as defined above.
They can also be input as string literals in any of the following ISO 8601 formats, each representing the time and date Mar 2, 2011, at 04:05:00 AM, GMT.:
2011-02-03 04:05+0000
2011-02-03 04:05:00+0000
2011-02-03 04:05:00.000+0000
2011-02-03T04:05+0000
2011-02-03T04:05:00+0000
2011-02-03T04:05:00.000+0000
+0000
above is an RFC 822 4-digit time zone specification; +0000
refers to GMT. US Pacific Standard Time is -0800
. The time zone may be omitted if desired— the date will be interpreted as being in the time zone under which the coordinating Cassandra node is configured.2011-02-03 04:05
2011-02-03 04:05:00
2011-02-03 04:05:00.000
2011-02-03T04:05
2011-02-03T04:05:00
2011-02-03T04:05:00.000
The time of day may also be omitted, if the date is the only piece that matters:
2011-02-03
2011-02-03+0000
Counters
Thecounter
type is used to define counter columns. A counter column is a column whose value is a 64-bit signed integer and on which 2 operations are supported: incrementation and decrementation (see UPDATE
for syntax). Note the value of a counter cannot be set. A counter doesn’t exist until first incremented/decremented, and the first incrementation/decrementation is made as if the previous value was 0. Deletion of counter columns is supported but have some limitations (see the Cassandra Wiki for more information).The use of the counter type is limited in the following way:
- It cannot be used for column that is part of the
PRIMARY KEY
of a table. - A table that contains a counter can only contain counters. In other words, either all the columns of a table outside the
PRIMARY KEY
have the counter type, or none of them have it.
Working with collections
Noteworthy characteristics
Collections are meant for storing/denormalizing relatively small amount of data. They work well for things like “the phone numbers of a given user”, “labels applied to an email”, etc. But when items are expected to grow unbounded (“all the messages sent by a given user”, “events registered by a sensor”, ...), then collections are not appropriate anymore and a specific table (with clustering columns) should be used. Concretely, collections have the following limitations:- Collections are always read in their entirety (and reading one is not paged internally).
- Collections cannot have more than 65535 elements. More precisely, while it may be possible to insert more than 65535 elements, it is not possible to read more than the 65535 first elements (see CASSANDRA-5428 for details).
- While insertion operations on sets and maps never incur a read-before-write internally, some operations on lists do (see the section on lists below for details). It is thus advised to prefer sets over lists when possible.
Maps
Amap
is a typed set of key-value pairs, where keys are unique. Furthermore, note that the map are internally sorted by their keys and will thus always be returned in that order. To create a column of type map
, use themap
keyword suffixed with comma-separated key and value types, enclosed in angle brackets. For example:WritingCREATE TABLE users ( id text PRIMARY KEY, given text, surname text, favs map<text, text> // A map of text keys, and text values )
map
data is accomplished with a JSON-inspired syntax. To write a record using INSERT
, specify the entire map as a JSON-style associative array. Note: This form will always replace the entire map.Adding or updating key-values of a (potentially) existing map can be accomplished either by subscripting the map column in an// Inserting (or Updating) INSERT INTO users (id, given, surname, favs) VALUES ('jsmith', 'John', 'Smith', { 'fruit' : 'apple', 'band' : 'Beatles' })
UPDATE
statement or by adding a new map literal:Note that TTLs are allowed for both// Updating (or inserting) UPDATE users SET favs['author'] = 'Ed Poe' WHERE id = 'jsmith' UPDATE users SET favs = favs + { 'movie' : 'Cassablanca' } WHERE id = 'jsmith'
INSERT
and UPDATE
, but in both case the TTL set only apply to the newly inserted/updated values. In other words,will only apply the TTL to the// Updating (or inserting) UPDATE users USING TTL 10 SET favs['color'] = 'green' WHERE id = 'jsmith'
{ 'color' : 'green' }
record, the rest of the map remaining unaffected.Deleting a map record is done with:
DELETE favs['author'] FROM users WHERE id = 'jsmith'
Sets
Aset
is a typed collection of unique values. Sets are ordered by their values. To create a column of type set
, use the set
keyword suffixed with the value type enclosed in angle brackets. For example:Writing aCREATE TABLE images ( name text PRIMARY KEY, owner text, date timestamp, tags set<text> );
set
is accomplished by comma separating the set values, and enclosing them in curly braces. Note: An INSERT
will always replace the entire set.Adding and removing values of a set can be accomplished with anINSERT INTO images (name, owner, date, tags) VALUES ('cat.jpg', 'jsmith', 'now', { 'kitten', 'cat', 'pet' });
UPDATE
by adding/removing new set values to an existing set
column.As with maps, TTLs if used only apply to the newly inserted/updated values.UPDATE images SET tags = tags + { 'cute', 'cuddly' } WHERE name = 'cat.jpg'; UPDATE images SET tags = tags - { 'lame' } WHERE name = 'cat.jpg';
Lists
Alist
is a typed collection of non-unique values where elements are ordered by there position in the list. To create a column of type list
, use the list
keyword suffixed with the value type enclosed in angle brackets. For example:Do note that as explained below, lists have some limitations and performance considerations to take into account, and it is advised to prefer sets over lists when this is possible.CREATE TABLE plays ( id text PRIMARY KEY, game text, players int, scores list<int> )
Writing
list
data is accomplished with a JSON-style syntax. To write a record using INSERT
, specify the entire list as a JSON array. Note: An INSERT
will always replace the entire list.Adding (appending or prepending) values to a list can be accomplished by adding a new JSON-style array to an existingINSERT INTO plays (id, game, players, scores) VALUES ('123-afde', 'quake', 3, [17, 4, 2]);
list
column.It should be noted that append and prepend are not idempotent operations. This means that if during an append or a prepend the operation timeout, it is not always safe to retry the operation (as this could result in the record appended or prepended twice).UPDATE plays SET players = 5, scores = scores + [ 14, 21 ] WHERE id = '123-afde'; UPDATE plays SET players = 5, scores = [ 12 ] + scores WHERE id = '123-afde';
Lists also provides the following operation: setting an element by its position in the list, removing an element by its position in the list and remove all the occurrence of a given value in the list. However, and contrarily to all the other collection operations, these three operations induce an internal read before the update, and will thus typically have slower performance characteristics. Those operations have the following syntax:
As with maps, TTLs if used only apply to the newly inserted/updated values.UPDATE plays SET scores[1] = 7 WHERE id = '123-afde'; // sets the 2nd element of scores to 7 (raises an error is scores has less than 2 elements) DELETE scores[1] FROM plays WHERE id = '123-afde'; // deletes the 2nd element of scores (raises an error is scores has less than 2 elements) UPDATE plays SET scores = scores - [ 12, 21 ] WHERE id = '123-afde'; // removes all occurrences of 12 and 21 from scores
No comments:
Post a Comment