Blake Eggleston

CQLengine 0.8 Released

Originally posted on tech.shift.com

We've just released version 0.8 of cqlengine, the Python object mapper for CQL3. Below are the new features.

Table Polymorphism

The big announcement for this release is support for table polymorphism. Table polymorphism allows you to read and write multiple model types to a single table. This is very useful when you have multiple data types that you want to store in a single physical row in Cassandra.

For instance, suppose you want a table that stores pets owned by someone, and you want all the pets owned by a particular owner to appear in the same physical Cassandra row, regardless of type. You would set up your model class hierarchy like this:

import uuid

from cqlengine.columns import UUID, Text, Float
from cqlengine.models import Model

class Pet(Model):
    __table_name__ = 'pet'
    owner_id = UUID(primary_key=True)
    pet_id = UUID(primary_key=True, default=uuid.uuid4)
    pet_type = Text(polymorphic_key=True)
    name = Text()

    def eat(self, food):
        pass

    def sleep(self, time):
        pass

class Cat(Pet):
    __polymorphic_key__ = 'cat'
    cuteness = Float()

    def tear_up_couch(self):
        pass

class Dog(Pet):
    __polymorphic_key__ = 'dog'
    fierceness = Float()

    def bark_all_night(self):
        pass

After calling sync_table on each of these models, the columns defined in each model will be added to the pet table. Additionally, saving Cat and Dog models will save the instances to the pet table with the metadata needed to identify each row as either a cat or a dog.

Next, let's create some rows:

owner_id = uuid.uuid4()
Cat.create(owner_id=owner_id, name='fluffles', cuteness=100.1)
Dog.create(owner_id=owner_id, name='destructo', fierceness=5000.001)

Now if we query the Pet table for pets owned by our owner id, we will get a Dog instance and a Cat instance:

print(list(Pet.objects(owner_id=owner_id)))
[<Dog name='destructo'>, <Cat name='fluffles'>]

Note that querying one of the subtypes like:

list(Dog.objects(owner_id=owner_id))

will raise an exception if the query returns a type that's not a subclass of Dog.

Normally, you should perform queries from the base class, in this case Pet. However, if you do want the ability to query a table for only objects of a particular subtype, like Dog, make the polymorphic key column indexed (set index=True in the column definition, alongside polymorphic_key=True). When the polymorphic key column is indexed, queries against subtypes like Dog will automatically add a WHERE clause to the query that filters out other subtypes.
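To make the subtype filtering a little more concrete, here is a rough pure-Python sketch of the idea. It is not cqlengine's query code: build_select and all of its parameters are hypothetical names made up for illustration, and it only handles simple equality filters.

```python
def build_select(table, filters, poly_column=None, poly_key=None,
                 poly_indexed=False):
    """Return (cql, params) for a simple equality-only SELECT.

    Hypothetical helper for illustration, not part of cqlengine.
    `filters` is a list of (column, value) pairs supplied by the caller.
    """
    clauses = list(filters)
    # Only filter on the subtype when the key column is indexed; without
    # an index Cassandra would normally reject a condition on that column.
    if poly_key is not None and poly_indexed:
        clauses.append((poly_column, poly_key))
    cql = 'SELECT * FROM {0}'.format(table)
    if clauses:
        where = ' AND '.join('"{0}" = %s'.format(name) for name, _ in clauses)
        cql = '{0} WHERE {1}'.format(cql, where)
    params = [value for _, value in clauses]
    return cql, params
```

With poly_indexed=True, a query made through a subtype picks up an extra condition on the polymorphic key column. With poly_indexed=False the query is left alone, which is why rows of other subtypes can come back.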

To properly set up a polymorphic model structure, you do the following:

  1. Create a base model with a column set as the polymorphic_key (set polymorphic_key=True in the column definition)
  2. Create subclass models, and define a unique __polymorphic_key__ value on each
  3. Run sync_table on each of the subclass models

About the polymorphic key

The polymorphic key is what cqlengine uses under the covers to map logical CQL rows to the appropriate model type. The base model maintains a map of polymorphic keys to subclasses. When a polymorphic model is saved, its polymorphic key value is automatically saved into the polymorphic key column. You can set the polymorphic key column to any column type that you like, with the exception of container and counter columns, although Integer columns make the most sense.
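The mapping described above can be sketched in plain Python. This is a toy illustration under our own assumptions, not cqlengine's implementation: ToyPet, register, from_row and the choice of TypeError are all hypothetical, and rows are plain dicts rather than real query results.

```python
class ToyPet(object):
    """Toy stand-in for a polymorphic base model (not cqlengine's Model)."""

    # Name of the column that stores the polymorphic key.
    polymorphic_column = 'pet_type'
    # Map of polymorphic key values to subclasses, kept on the base class.
    _polymorphic_map = {}

    def __init__(self, **values):
        self.values = dict(values)
        key = getattr(type(self), '__polymorphic_key__', None)
        if key is not None:
            # The key is written automatically whenever a subtype is built.
            self.values[self.polymorphic_column] = key

    @classmethod
    def register(cls, subclass):
        """Class decorator that records a subclass under its unique key."""
        key = subclass.__polymorphic_key__
        if key in cls._polymorphic_map:
            raise ValueError('duplicate polymorphic key: %r' % (key,))
        cls._polymorphic_map[key] = subclass
        return subclass

    @classmethod
    def from_row(cls, row):
        """Build the right subtype for a row, based on its key column."""
        klass = cls._polymorphic_map.get(row[cls.polymorphic_column])
        if klass is None:
            raise ValueError('unknown polymorphic key in row')
        if not issubclass(klass, cls):
            # Mirrors querying Dog and getting back a row for a Cat.
            raise TypeError('%s is not a subclass of %s'
                            % (klass.__name__, cls.__name__))
        return klass(**row)


@ToyPet.register
class ToyCat(ToyPet):
    __polymorphic_key__ = 'cat'


@ToyPet.register
class ToyDog(ToyPet):
    __polymorphic_key__ = 'dog'
```

Calling ToyPet.from_row on a row whose pet_type is 'dog' returns a ToyDog, while ToyDog.from_row on a 'cat' row raises, which is the same shape of behavior as querying a subtype directly.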

VarInt column

Thanks to a pull request from Tommaso Barbugli, cqlengine now supports the varint data type, which stores integers of arbitrary size (in bytes).

from cqlengine.columns import Integer, VarInt
from cqlengine.models import Model

class VarIntDemo(Model):
    row_id = Integer(primary_key=True)
    bignum = VarInt()
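To give a feel for what "arbitrary size" means, here is a small sketch of a variable-length integer encoding: big-endian two's complement using as few bytes as the value needs, which is our understanding of how Cassandra serializes varint values. The helper names encode_varint and decode_varint are our own, the code is not taken from cqlengine, and it relies on int.to_bytes and int.from_bytes, so it needs Python 3.

```python
def encode_varint(value):
    """Encode an int as minimal-length big-endian two's complement bytes."""
    # Bits needed for the magnitude; one extra bit is reserved for the sign.
    bits = value.bit_length() if value >= 0 else (~value).bit_length()
    length = bits // 8 + 1
    return value.to_bytes(length, byteorder='big', signed=True)


def decode_varint(data):
    """Decode big-endian two's complement bytes back into an int."""
    return int.from_bytes(data, byteorder='big', signed=True)
```

Small values fit in a single byte, while something like 2 ** 100 simply takes more bytes, so the column is not limited to a fixed 32-bit or 64-bit range.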