Playhouse, a collection of addons

Peewee comes with numerous extras which I didn’t really feel like including in the main source module, but which might be interesting to implementers or fun to mess around with.

apsw, an advanced sqlite driver

The apsw_ext module contains a database class suitable for use with the apsw sqlite driver. With apsw, it is possible to use some of the more advanced features of sqlite. It also offers better performance than pysqlite and finer-grained control over query execution. For more information on the differences between apsw and pysqlite, check the apsw docs.

Example usage

from peewee import *
from playhouse.apsw_ext import *

db = APSWDatabase(':memory:')

class BaseModel(Model):
    class Meta:
        database = db

class SomeModel(BaseModel):
    col1 = CharField()
    col2 = DateTimeField()
    # etc, etc

apsw_ext API notes

class APSWDatabase(database, **connect_kwargs)
Parameters:
  • database (string) – filename of sqlite database
  • connect_kwargs – keyword arguments passed to apsw when opening a connection
transaction([lock_type='deferred'])

Functions just like the Database.transaction() context manager, but accepts an additional parameter specifying the type of lock to use.

Parameters: lock_type (string) – type of lock to use when opening a new transaction
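
For example, a minimal sketch (reusing the db and SomeModel defined in the snippet above) that opens an exclusive transaction:

import datetime

with db.transaction('exclusive'):
    # no other connection can read or write until the block exits
    SomeModel.create(col1='hello', col2=datetime.datetime.now())
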
class APSWAdapter(timeout)
Parameters: timeout (int) – sqlite busy timeout in seconds
register_module(mod_name, mod_inst)

Provides a way of globally registering a module. For more information, see the documentation on virtual tables.

Parameters:
  • mod_name (string) – name to use for module
  • mod_inst (object) – an object implementing the Virtual Table interface
unregister_module(mod_name)

Unregister a module.

Parameters: mod_name (string) – name of the module to unregister
class VirtualModel

A model subclass suitable for creating virtual tables.

Note

You must specify the name of the extension module you wish to use.

_extension_module

The name of the extension module to use with this virtual table.
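
A sketch pulling these pieces together, where HashModule is a hypothetical object implementing apsw’s virtual table interface and db is the APSWDatabase from the example above:

# HashModule is hypothetical; register_module is the APSWAdapter method above
db.adapter.register_module('hashes', HashModule())

class Hashes(VirtualModel):
    _extension_module = 'hashes'  # must match the registered module name

    class Meta:
        database = db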

Postgresql Extensions (hstore, ltree)

The postgresql extensions module provides a number of “postgres-only” features, currently hstore and ltree support.

Warning

In order to start using the features described below, you will need to use the extension PostgresqlExtDatabase class instead of PostgresqlDatabase.

The code below will assume you are using the following database and base model:

from playhouse.postgres_ext import *

ext_db = PostgresqlExtDatabase('peewee_test', user='postgres')

class BaseExtModel(Model):
    class Meta:
        database = ext_db

hstore support

Postgresql hstore is an embedded key/value store. With hstore, you can store arbitrary key/value pairs in your database alongside structured relational data, making it handy for data you might otherwise serialize to JSON.

Currently the postgres_ext module supports the following operations:

  • store and retrieve arbitrary dictionaries
  • filter by key(s) or partial dictionary
  • update/add one or more keys to an existing dictionary
  • delete one or more keys from an existing dictionary
  • select keys, values, or zip keys and values
  • retrieve a slice of keys/values
  • test for the existence of a key
  • test that a key has a non-NULL value

using hstore

To start with, you will need to import the custom database class and the hstore functions from playhouse.postgres_ext (see above code snippet). Then, it is as simple as adding a HStoreField to your model:

class House(BaseExtModel):
    address = CharField()
    features = HStoreField()

You can now store arbitrary key/value pairs on House instances:

>>> h = House.create(address='123 Main St', features={'garage': '2 cars', 'bath': '2 bath'})
>>> h_from_db = House.get(id=h.id)
>>> h_from_db.features
{'bath': '2 bath', 'garage': '2 cars'}

You can filter by keys or partial dictionary:

>>> House.select().where(features__contains='garage') # <-- all houses w/garage key
>>> House.select().where(features__contains=['garage', 'bath']) # <-- all houses w/garage & bath
>>> House.select().where(features__contains={'garage': '2 cars'}) # <-- houses w/2-car garage

Suppose you want to do an atomic update to the house:

>>> query = House.update(features=hupdate('features', {'bath': '2.5 bath', 'sqft': '1100'}))
>>> query.where(id=h.id).execute()
1
>>> h = House.get(id=h.id)
>>> h.features
{'bath': '2.5 bath', 'garage': '2 cars', 'sqft': '1100'}

Or, alternatively, an atomic delete:

>>> query = House.update(features=hdelete('features', 'bath'))
>>> query.where(id=h.id).execute()
1
>>> h = House.get(id=h.id)
>>> h.features
{'garage': '2 cars', 'sqft': '1100'}

Multiple keys can be deleted at the same time:

>>> query = House.update(features=hdelete('features', ['garage', 'sqft']))

You can select just keys, just values, or zip the two:

>>> for h in House.select(['address', hkeys('features', 'keys')]):
...     print h.address, h.keys

123 Main St [u'bath', u'garage']

>>> for h in House.select(['address', hvalues('features', 'vals')]):
...     print h.address, h.vals

123 Main St [u'2 bath', u'2 cars']

>>> for h in House.select(['address', hmatrix('features', 'mtx')]):
...     print h.address, h.mtx

123 Main St [[u'bath', u'2 bath'], [u'garage', u'2 cars']]

You can retrieve a slice of data, for example, all the garage data:

>>> for h in House.select(['address', hslice('features', 'garage_data', ['garage'])]):
...     print h.address, h.garage_data

123 Main St {'garage': '2 cars'}

You can check for the existence of a key and filter rows accordingly:

>>> for h in House.select(['address', hexist('features', 'has_garage', 'garage')]):
...     print h.address, h.has_garage

123 Main St True

>>> for h in House.select().where(hexist('features', ['garage'])):
...     print h.address, h.features['garage'] # <-- just houses w/garage data

123 Main St 2 cars

ltree support

Postgresql ltree is a hierarchical data store. With ltree, you can store hierarchical structures as a series of labels separated by a delimiter (".").

Currently the postgres_ext module supports the following operations:

  • store and retrieve label-trees
  • very complex filtering by labels
  • querying by prefix

using ltree

As with hstore, you will need to import the custom database class and the ltree functions from playhouse.postgres_ext (see above code snippet). Then, it is as simple as adding a LTreeField to your model:

class Category(BaseExtModel):
    name = CharField()
    path = LTreeField()

You can now store hierarchy information on Category instances. Let’s use the following data-set:

  • languages
    • dynamic
      • python
      • ruby
    • static
      • c
      • c++
      • java
>>> Category.create(name='java', path='languages.static.java') # last one...
>>> Category.get(name='python').path
'languages.dynamic.python'

You can filter by path. Complex queries are possible, so please refer to the ltree documentation for details.

>>> show = lambda q: [l.name for l in q]
>>> show(Category.select().where(path__startswith='languages.dynamic'))
[u'dynamic', u'python', u'ruby']

>>> show(Category.select().where(path__lmatch='*.static.c*'))
[u'c', u'c++']

>>> show(Category.select().where(path__lmatch_text='(static | dynamic) & (p* | c*) & !cpp'))
[u'python', u'c']

You can select subtrees, label indices, and more:

>>> q = Category.select(['name', lsubpath('path', -1, 1, 'leaf')])
>>> for cat in q:
...     cat.leaf

languages
dynamic
static
python
ruby
c
cpp
java

>>> q = Category.select([lindex('path', 'static', 'static_pos')])
>>> [x.static_pos for x in q]
[-1, -1, 1, -1, -1, 1, 1, 1]

>>> q = Category.select([nlevel('path', 'depth')]).where(name='python')
>>> print q.get().depth
3

pwiz, a model generator

pwiz is a little script that ships with peewee and is capable of introspecting an existing database and generating model code suitable for interacting with the underlying data. If you have a database already, pwiz can give you a nice boost by generating skeleton code with correct column affinities and foreign keys.

If you install peewee using setup.py install, pwiz will be installed as a “script” and you can just run:

pwiz.py -e postgresql -u postgres my_postgres_db > my_models.py

This will print a bunch of models to standard output. So you can do this:

pwiz.py -e postgresql my_postgres_db > mymodels.py
python # <-- fire up an interactive shell
>>> from mymodels import Blog, Entry, Tag, Whatever
>>> print [blog.name for blog in Blog.select()]

pwiz accepts the following options:

Option   Meaning              Example
------   -------------------  -------------------
-h       show help
-e       database backend     -e mysql
-H       host to connect to   -H remote.db.server
-p       port to connect on   -p 9001
-u       database user        -u postgres
-P       database password    -P secret
-s       postgres schema      -s public

The following are valid parameters for the engine:

  • sqlite
  • mysql
  • postgresql
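
Combining several of these options, a hypothetical invocation against a remote mysql database might look like:

pwiz.py -e mysql -H remote.db.server -p 3306 -u webuser -P secret my_mysql_db > my_models.py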

Signal support

Models with hooks for signals (a-la django) are provided in playhouse.signals. To use the signals, you will need all of your project’s models to be a subclass of playhouse.signals.Model, which overrides the necessary methods to provide support for the various signals.

from peewee import *
from playhouse.signals import Model, connect, post_save


class MyModel(Model):
    data = IntegerField()

@connect(post_save, sender=MyModel)
def on_save_handler(model_class, instance, created):
    put_data_in_cache(instance.data)

The following signals are provided:

pre_save
Called immediately before an object is saved to the database. Provides an additional keyword argument created, indicating whether the model is being saved for the first time or updated.
post_save
Called immediately after an object is saved to the database. Provides an additional keyword argument created, indicating whether the model is being saved for the first time or updated.
pre_delete
Called immediately before an object is deleted from the database when Model.delete_instance() is used.
post_delete
Called immediately after an object is deleted from the database when Model.delete_instance() is used.
pre_init
Called when a model class is first instantiated.
post_init
Called after a model class has been instantiated and the fields have been populated, for example when being selected as part of a database query.
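
As a sketch, a handler for one of these signals looks just like the post_save example above; here pre_delete is connected to the MyModel class from earlier (clean_up_files is a hypothetical helper):

from playhouse.signals import connect, pre_delete

@connect(pre_delete, sender=MyModel)
def on_delete_handler(sender, instance):
    # runs just before Model.delete_instance() removes the row
    clean_up_files(instance)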

Connecting handlers

Whenever a signal is dispatched, it will call any handlers that have been registered. This allows totally separate code to respond to events like model save and delete.

The Signal class provides a connect() method, which takes a callback function and two optional parameters for “sender” and “name”. If specified, the “sender” parameter should be a single model class and allows your callback to only receive signals from that one model class. The “name” parameter is used as a convenient alias in the event you wish to unregister your signal handler.

Example usage:

from playhouse.signals import *

def post_save_handler(sender, instance, created):
    print '%s was just saved' % instance

# our handler will only be called when we save instances of SomeModel
post_save.connect(post_save_handler, sender=SomeModel)

All signal handlers accept as their first two arguments sender and instance, where sender is the model class and instance is the actual model being acted upon.

If you’d like, you can also use a decorator to connect signal handlers. This is functionally equivalent to the above example:

@connect(post_save, sender=SomeModel)
def post_save_handler(sender, instance, created):
    print '%s was just saved' % instance

Signal API

class Signal

Stores a list of receivers (callbacks) and calls them when the “send” method is invoked.

connect(receiver[, sender=None[, name=None]])

Add the receiver to the internal list of receivers, which will be called whenever the signal is sent.

Parameters:
  • receiver (callable) – a callable that takes at least two parameters, a “sender”, which is the Model subclass that triggered the signal, and an “instance”, which is the actual model instance.
  • sender (Model) – if specified, only instances of this model class will trigger the receiver callback.
  • name (string) – a short alias
from playhouse.signals import post_save
from project.handlers import cache_buster

post_save.connect(cache_buster, name='project.cache_buster')
disconnect([receiver=None[, name=None]])

Disconnect the given receiver (or the receiver with the given name alias) so that it no longer is called. Either the receiver or the name must be provided.

Parameters:
  • receiver (callable) – the callback to disconnect
  • name (string) – a short alias
post_save.disconnect(name='project.cache_buster')
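
Disconnecting by receiver instead of by name works the same way:

post_save.disconnect(cache_buster)
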
send(instance, *args, **kwargs)

Iterates over the receivers and will call them in the order in which they were connected. If the receiver specified a sender, it will only be called if the instance is an instance of the sender.

Parameters: instance – a model instance
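
Normally playhouse.signals.Model calls send() for you when instances are saved or deleted, but for illustration it can be invoked by hand:

# some_instance is any model instance; created mirrors post_save's extra kwarg
post_save.send(some_instance, created=False)
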
connect(signal[, sender=None[, name=None]])

Function decorator that is an alias for a signal’s connect method:

from playhouse.signals import connect, post_save

@connect(post_save, name='project.cache_buster')
def cache_bust_handler(sender, instance, *args, **kwargs):
    # bust the cache for this instance
    cache.delete(cache_key_for(instance))

Sqlite Extensions

The sqlite extensions module provides a number of “sqlite-only” features, including granular transaction locking and support for custom aggregates, collations and user-defined functions.

Warning

In order to start using the features described below, you will need to use the extension SqliteExtDatabase class instead of SqliteDatabase.

The code below will assume you are using the following database and base model:

from playhouse.sqlite_ext import *

ext_db = SqliteExtDatabase('tmp.db')

class BaseExtModel(Model):
    class Meta:
        database = ext_db

Granular Transactions

Sqlite uses three different types of locks to control access during transactions. Details on the three types can be found in the sqlite documentation, but here is a quick overview:

deferred
Locks are not acquired until the last possible moment; multiple processes can continue to read the database.
immediate
A write lock is acquired immediately and no further writes are possible until it is released, but other processes can continue to read. Additionally, no other immediate or exclusive locks can be acquired.
exclusive
A lock is acquired and no further reads or writes are possible until it is released.

These various types of transactions can be opened using the granular_transaction context manager:

with ext_db.granular_transaction('exclusive'):
    # no other connections can read or write to the database now
    execute_some_queries()

# safe for other processes to read and write again
do_some_other_stuff()
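
Similarly, a sketch using the immediate lock, which blocks other writers while still allowing reads (update_some_rows is a hypothetical function):

with ext_db.granular_transaction('immediate'):
    # other connections may continue reading, but cannot write, until this exits
    update_some_rows()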

Custom aggregators, collations and user-defined functions

Sqlite allows you to register custom Python callables that act as aggregates, collations or plain functions, and can then be invoked as part of your queries. If you read the notes on the full-text search extension, you will see that the “sort by rank” feature is implemented as a user-defined function.

Python’s sqlite3 documentation gives a good overview of how these types of functions can be used.

  • custom aggregates

    class WeightedAverage(object):
        def __init__(self):
            # running totals for the weights and the weighted values
            self.total_weight = 0.0
            self.total_ct = 0.0

        def step(self, value, wt=None):
            # called once per row; accumulate the weight and the weighted sum
            wt = wt or 1.0
            self.total_weight += wt
            self.total_ct += wt * value

        def finalize(self):
            # called once at the end to produce the aggregate result
            if self.total_weight != 0.0:
                return self.total_ct / self.total_weight
            return 0.0

    # the '2' is the number of arguments the aggregate accepts
    ext_db.adapter.register_aggregate(WeightedAverage, 2, 'weighted_avg')
    
  • custom collations

    def collate_reverse(s1, s2):
        # invert the usual comparison to sort in reverse (cmp is Python 2)
        return -cmp(s1, s2)

    ext_db.adapter.register_collation(collate_reverse)
    
  • custom functions

    import hashlib

    def sha1(s):
        return hashlib.sha1(s).hexdigest()

    ext_db.adapter.register_function(sha1)
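
For an end-to-end check, the same callables can be registered on a raw connection with the stdlib sqlite3 module, which exposes equivalent hooks (this sketch uses sqlite3 in place of the peewee adapter purely for illustration):

import sqlite3

conn = sqlite3.connect(':memory:')
conn.create_aggregate('weighted_avg', 2, WeightedAverage)
conn.create_collation('collate_reverse', collate_reverse)
conn.create_function('sha1', 1, sha1)

conn.execute('CREATE TABLE results (score REAL, weight REAL)')
conn.executemany('INSERT INTO results VALUES (?, ?)', [(90, 2), (60, 1)])
print conn.execute('SELECT weighted_avg(score, weight) FROM results').fetchone()[0]
# prints 80.0, i.e. (90*2 + 60*1) / (2 + 1)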
    

Swee’pea, syntactic sugar for peewee

Calling it syntactic sugar is a bit of a stretch. I wrote this stuff for fun after learning about ISBL from a coworker.

At any rate, ISBL (Information Systems Base Language) is an old domain-specific language for querying relational data, developed at IBM in the 1970s. Here are some example SQL and ISBL queries:

-- query the database for all active users
SELECT id, username, active FROM users WHERE active = True

-- query for tweets and the username of the sender
SELECT t.id, t.message, u.username
FROM tweets AS t
INNER JOIN users AS u
    ON t.user_id = u.id
WHERE u.active = True

-- tables appear first -- the colon indicates a restriction (our where clause)
-- and after the modulo is the "projection", or columns we want to select
users : active = True % (id, username, active)

(tweets * users) : user.active = True % (tweet.id, tweet.message, user.username)

Pretty cool. In the above examples:

  • multiplication signifies a join, the tables to query (FROM)
  • a colon signifies a restriction, the columns to filter (WHERE)
  • modulo signifies a projection, the columns to return (SELECT)

I hacked up a small implementation on top of peewee. Since Python does not support the “:” (colon) character as an infix operator, I used the “power” operator to signify a restriction:

# active users
User ** (User.active == True)

# tweets with the username of sender
(Tweet * User) ** (User.active == True) % (Tweet.id, Tweet.message, User.username)

To try out swee’pea, simply replace from peewee import * with from playhouse.sweepea import * and start writing wacky queries:

from playhouse.sweepea import *

class User(Model):
    username = CharField()
    active = BooleanField()

class Tweet(Model):
    user = ForeignKeyField(User)
    message = CharField()

# have fun!
(User * Tweet) ** (User.active == True)