Understand and use the DDL subset of CQL

Keyspace

A top-level namespace for a CQL table schema

  • Defines the replication strategy for a set of tables
    • One keyspace per application is a good practice
  • Data objects (e.g., tables) belong to a single keyspace
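
As a minimal sketch of the points above, a keyspace could be created as follows. The keyspace name `app_ks`, the replication class, and the replication factor are illustrative assumptions, not values taken from these notes.

```cql
-- Hypothetical keyspace; SimpleStrategy with a factor of 1 suits only a
-- single-node development setup.
CREATE KEYSPACE IF NOT EXISTS app_ks
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

-- Make app_ks the default keyspace for the statements that follow.
USE app_ks;
```

Every table created after `USE app_ks;` belongs to that keyspace and inherits its replication settings.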

How are primary key, partition key, and clustering columns defined?

  • Simple partition key, no clustering columns

    PRIMARY KEY ( partition_key_column )

  • Composite partition key, no clustering columns

    PRIMARY KEY ( ( partition_key_col1, ..., partition_key_colN ) )

  • Simple partition key and clustering columns

    PRIMARY KEY ( partition_key_column, clustering_column1, ..., clustering_columnM )

  • Composite partition key and clustering columns

    PRIMARY KEY ( ( partition_key_col1, ..., partition_key_colN ), clustering_column1, ..., clustering_columnM )
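
The four forms above might look like this in complete table definitions. All table and column names are hypothetical, and the statements assume a keyspace has already been selected with USE.

```cql
-- 1. Simple partition key, no clustering columns
CREATE TABLE users (
  user_id UUID,
  name    TEXT,
  PRIMARY KEY (user_id)
);

-- 2. Composite partition key, no clustering columns
--    (note the inner parentheses around the partition key columns)
CREATE TABLE totals_by_sensor_day (
  sensor_id UUID,
  day       TEXT,      -- e.g. '2024-01-15'
  total     INT,
  PRIMARY KEY ((sensor_id, day))
);

-- 3. Simple partition key and one clustering column
CREATE TABLE comments_by_video (
  video_id   UUID,
  comment_id TIMEUUID,
  comment    TEXT,
  PRIMARY KEY (video_id, comment_id)
);

-- 4. Composite partition key and one clustering column
CREATE TABLE events_by_sensor_day (
  sensor_id  UUID,
  day        TEXT,
  event_time TIMESTAMP,
  value      DOUBLE,
  PRIMARY KEY ((sensor_id, day), event_time)
);
```

In every form, the first element of PRIMARY KEY is the partition key. Any remaining columns are clustering columns that order rows within a partition.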

UUID

Universally unique identifier (128-bit value)

Format: hex{8}-hex{4}-hex{4}-hex{4}-hex{12}
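
A short sketch of UUID values in use, assuming a hypothetical table `users_by_id`. The `uuid()` function, available in recent Cassandra versions, generates a random UUID. A UUID literal is written without quotes.

```cql
CREATE TABLE users_by_id (
  user_id UUID PRIMARY KEY,
  name    TEXT
);

-- Let Cassandra generate the identifier
INSERT INTO users_by_id (user_id, name)
VALUES (uuid(), 'Alice');

-- Supply an explicit UUID literal (8-4-4-4-12 hex digits, unquoted)
INSERT INTO users_by_id (user_id, name)
VALUES (123e4567-e89b-12d3-a456-426614174000, 'Bob');
```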

TIMEUUID

  • Embeds a time value within a UUID
  • CQL function now() generates a new TIMEUUID
  • CQL function dateOf() extracts the embedded timestamp as a date
  • TIMEUUID values in clustering columns or in column names are ordered based on time
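
The following sketch illustrates these points with a hypothetical table `comments_by_time`, where a TIMEUUID clustering column keeps rows in time order. It uses dateOf() as described above. Cassandra 2.2 and later also offer toTimestamp(), which is preferred there because dateOf() is deprecated.

```cql
CREATE TABLE comments_by_time (
  video_id   UUID,
  comment_id TIMEUUID,
  comment    TEXT,
  PRIMARY KEY (video_id, comment_id)
) WITH CLUSTERING ORDER BY (comment_id DESC);   -- newest comments first

-- now() generates a new TIMEUUID at insertion time
INSERT INTO comments_by_time (video_id, comment_id, comment)
VALUES (123e4567-e89b-12d3-a456-426614174000, now(), 'First!');

-- Extract the time embedded in each TIMEUUID
SELECT comment, dateOf(comment_id)
FROM comments_by_time
WHERE video_id = 123e4567-e89b-12d3-a456-426614174000;
```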

TIMESTAMP

64-bit signed integer representing the number of milliseconds since January 1, 1970 at 00:00:00 GMT (the Unix epoch)
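
As an illustration, a TIMESTAMP value can be written either as an integer number of milliseconds or as a date string literal. The table `logins_by_user` and its columns are hypothetical.

```cql
CREATE TABLE logins_by_user (
  user_id    UUID,
  login_time TIMESTAMP,
  device     TEXT,
  PRIMARY KEY (user_id, login_time)
);

-- Integer form: milliseconds since the epoch
INSERT INTO logins_by_user (user_id, login_time, device)
VALUES (123e4567-e89b-12d3-a456-426614174000, 1700000000000, 'laptop');

-- String form of the same instant (2023-11-14 22:13:20 UTC)
INSERT INTO logins_by_user (user_id, login_time, device)
VALUES (123e4567-e89b-12d3-a456-426614174000, '2023-11-14 22:13:20+0000', 'phone');
```

Because both statements use the same primary key values, the second one overwrites the `device` value written by the first instead of creating a new row.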

COUNTER

  • Cassandra supports distributed counters
  • Useful for tracking a count
  • Counter column stores a number that can only be updated
    • Incremented or decremented
    • Cannot assign an initial value to a counter (initial value is 0)
  • Counter column cannot be part of a primary key
  • If a table has a counter column, all non-counter columns must be part of the primary key
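
A minimal sketch of a counter table, using the hypothetical names `video_views` and `views`. The only non-counter column is the primary key, and the counter is changed only through UPDATE.

```cql
CREATE TABLE video_views (
  video_id UUID PRIMARY KEY,
  views    COUNTER
);

-- Increment: the counter starts at 0, so no initial INSERT is needed
UPDATE video_views SET views = views + 1
WHERE video_id = 123e4567-e89b-12d3-a456-426614174000;

-- Decrement
UPDATE video_views SET views = views - 1
WHERE video_id = 123e4567-e89b-12d3-a456-426614174000;
```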

What is a secondary index?

  • Can index additional columns to enable searching by those columns
  • Cannot be created for
    • counter columns
    • static columns
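
For illustration, a secondary index on a regular column could be created and queried as below. The table `users_by_email`, the index name, and the sample value are assumptions made for this sketch.

```cql
CREATE TABLE users_by_email (
  email   TEXT PRIMARY KEY,
  name    TEXT,
  country TEXT
);

-- Index a non-key column so it can appear in a WHERE clause
CREATE INDEX IF NOT EXISTS users_country_idx
  ON users_by_email (country);

-- Without the index, this query would be rejected
-- (unless ALLOW FILTERING were added)
SELECT email, name FROM users_by_email WHERE country = 'FR';
```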

When do you want to use a secondary index?

  • Use with low-cardinality columns

    Columns that may contain a relatively small set of distinct values

Do not use:

  • On high-cardinality columns
  • On tables with counter columns
  • On frequently updated or deleted columns
  • To look for a row in a large partition unless narrowly queried
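
One way to read "narrowly queried" is to restrict the query to a single partition in addition to the indexed column, so that only that partition is searched. The sketch below assumes a hypothetical table `orders_by_customer` with an index on `status`.

```cql
CREATE TABLE orders_by_customer (
  customer_id UUID,
  order_id    TIMEUUID,
  status      TEXT,
  PRIMARY KEY (customer_id, order_id)
);

CREATE INDEX IF NOT EXISTS orders_status_idx
  ON orders_by_customer (status);

-- Partition key plus indexed column: the lookup stays within one partition
SELECT order_id, status
FROM orders_by_customer
WHERE customer_id = 123e4567-e89b-12d3-a456-426614174000
  AND status = 'SHIPPED';
```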