An inter-process, inter-computer lock for Cassandra

Wed
01/16/13

WARNING: This implementation of an inter-process, inter-computer lock works with Cassandra only if you know that you are directly dealing with a single Cassandra node at a time. The Cassandra C++ driver (probably all the drivers) makes use of a set of threads to connect to several Cassandra nodes and if the load of the current thread/node pair becomes too large, it will automatically switch to another thread/node pair. This means your messages may not be received in the order you sent them to the database cluster. As a result, the lock mechanism described below will not function as expected. On our end, we now use snaplock instead.

Here we go! I needed a very temporary lock to create a new user row in the Snap! C++ database. This is important to make sure that each user is unique. When users register, I could assign them a unique number. The problem with that method is that I could end up with two users having the same email address. Instead, we want to use the email address to register users (we could have used a username, but that forces us to ask the user for a user name, which we think should not be mandatory). The table is organized using the email address as the row key:

table["user"][email]["password"] = "password1";

This works very well, assuming that all the users to ever register have a unique email address. If not, then you could end up creating one user object in the database from two different people. To avoid this problem, you need a lock.

The Cassandra database system is really cool but by itself it does not offer a locking mechanism. Instead many people rely on ready-made tools that offer a full inter-process and inter-computer locking mechanism. This gives them what is necessary. However, thinking about it, I just couldn't see a reason why we couldn't implement a lock with Cassandra and sure enough, I found a page talking about using Lamport's bakery algorithm and this works like a charm.
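For readers who have not met it, here is a minimal in-memory sketch of Lamport's bakery algorithm using C++ threads. It is not the libQtCassandra implementation: the class name BakeryLock, the helper run_counter_demo, and the use of std::atomic arrays are all assumptions made for illustration. A Cassandra-backed version would presumably keep the "entering" flags and the tickets in database columns instead of in memory.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical in-memory bakery lock for a fixed number of participants.
class BakeryLock {
public:
    explicit BakeryLock(std::size_t n) : entering_(n), ticket_(n) {
        for (std::size_t i = 0; i < n; ++i) {
            entering_[i].store(false);
            ticket_[i].store(0);   // 0 means "not waiting for the lock"
        }
    }

    void lock(std::size_t id) {
        assert(id < ticket_.size());
        // Step 1: announce that we are picking a ticket.
        entering_[id].store(true);
        unsigned long highest = 0;
        for (auto const& t : ticket_) {
            highest = std::max(highest, t.load());
        }
        unsigned long const mine = highest + 1;
        ticket_[id].store(mine);
        entering_[id].store(false);

        // Step 2: wait for every participant that is ahead of us.
        for (std::size_t j = 0; j < ticket_.size(); ++j) {
            if (j == id) continue;
            // Do not compare tickets while j is still choosing one.
            while (entering_[j].load()) std::this_thread::yield();
            for (;;) {
                unsigned long const other = ticket_[j].load();
                if (other == 0) break;                      // j is not waiting
                if (other > mine) break;                    // j comes after us
                if (other == mine && j > id) break;         // tie broken by id
                std::this_thread::yield();
            }
        }
    }

    void unlock(std::size_t id) { ticket_[id].store(0); }

private:
    std::vector<std::atomic<bool>> entering_;
    std::vector<std::atomic<unsigned long>> ticket_;
};

// Each thread increments a plain (non-atomic) counter under the lock.
long run_counter_demo(std::size_t threads, int iterations) {
    BakeryLock lock(threads);
    long counter = 0;
    std::vector<std::thread> pool;
    for (std::size_t id = 0; id < threads; ++id) {
        pool.emplace_back([&lock, &counter, iterations, id] {
            for (int i = 0; i < iterations; ++i) {
                lock.lock(id);
                ++counter;
                lock.unlock(id);
            }
        });
    }
    for (auto& t : pool) t.join();
    return counter;
}
```

Calling run_counter_demo(4, 1000) should return 4000 if the lock really serializes the increments. Two participants can draw the same ticket number, which is why ties are broken with the participant id.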

I wrote an implementation in the libQtCassandra library so one can lock one's database to do work like I just described. The lock can be used as a scoped lock since it lasts only as long as the QCassandraLock object exists. For example, we could write the following:

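// declared outside the locked block so it remains visible after the lock is released
bool register_user(false);
{
  // obtain a lock on the stack (RAII)
  QtCassandra::QCassandraLock lock(context, key);
  QSharedPointer<QtCassandra::QCassandraCell> cell(table->row(key)->cell("IP"));
  cell->setConsistencyLevel(QtCassandra::CONSISTENCY_LEVEL_QUORUM);
  QtCassandra::QCassandraValue user_ip(cell->value());
  // if the "IP" column is still empty, nobody registered this email yet
  register_user = user_ip.nullValue();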
  if(register_user) {
    QtCassandra::QCassandraValue ip(getenv("REMOTE_ADDRESS"));
    ip.setConsistencyLevel(QtCassandra::CONSISTENCY_LEVEL_QUORUM);
    cell->setValue(ip);
  }
  // here the lock is released automatically
}
if(register_user) {
  // finish user registration
  // for example we can also save the email address:
  QtCassandra::QCassandraValue email(user_email);
  email.setConsistencyLevel(QtCassandra::CONSISTENCY_LEVEL_QUORUM);
  table->row(key)->cell("email")->setValue(email);
  ...
}
else {
  // an error occurred, this email address is already assigned to an account
  ...
}

In this example we notice that I create a lock, create one column named "IP" and use that to know whether the user exists or not. If not, then the registration can go on. If it already exists, then we're not the first to attempt registering with that email address and the registration fails.

Note that all the accesses within the locked area use QUORUM to access the database. You probably want to use QUORUM for all your writes anyway, but what's important here is that the synchronization won't work if you use a consistency of ONE (unless you have just one node in your cluster), even if only for the reads (i.e. you must read & write with QUORUM).
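The reason behind the QUORUM requirement can be checked with a little arithmetic: a read is guaranteed to see the latest write only when the replicas read plus the replicas written add up to more than the number of replicas. The sketch below illustrates that rule; the function names are hypothetical and it assumes that what matters is the replication factor (the number of copies of each row).

```cpp
#include <cassert>

// Number of replicas that must answer a QUORUM request.
inline int quorum(int replication_factor) {
    assert(replication_factor >= 1);
    return replication_factor / 2 + 1;
}

// True when every read set is forced to overlap every write set,
// so a read always reaches at least one replica holding the latest write.
inline bool reads_see_latest_write(int read_replicas,
                                   int write_replicas,
                                   int replication_factor) {
    assert(replication_factor >= 1);
    return read_replicas + write_replicas > replication_factor;
}
```

With three replicas, QUORUM is 2, so QUORUM reads and writes overlap (2 + 2 > 3). A read at ONE combined with a QUORUM write does not (1 + 2 is not greater than 3), which is why the reads need QUORUM too. With a single replica, ONE is enough (1 + 1 > 1).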

You can revert back to non-QUORUM reads after the lock was released. You probably want to continue with QUORUM writes for most of your data.

Note that since the lock is defined inside a class, it is fully RAII aware. This means it will automatically be released, even if you have a return or an exception in the middle of the block while the lock is in effect.
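To see why the release is automatic, here is a small stand-alone sketch of the same RAII pattern. ScopedFlag is a hypothetical stand-in for QCassandraLock: it only flips a boolean, whereas the real lock talks to the database.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical guard: "acquires" in the constructor, "releases" in the destructor.
class ScopedFlag {
public:
    explicit ScopedFlag(bool& held) : held_(held) {
        assert(!held_);   // the flag must not already be held
        held_ = true;
    }
    ~ScopedFlag() { held_ = false; }

    // a guard owns its lock, so copying it makes no sense
    ScopedFlag(ScopedFlag const&) = delete;
    ScopedFlag& operator=(ScopedFlag const&) = delete;

private:
    bool& held_;
};

// Throws while the guard is alive, then reports whether the flag is still held.
bool held_after_exception() {
    bool held = false;
    try {
        ScopedFlag guard(held);
        throw std::runtime_error("registration failed");
    } catch (std::runtime_error const&) {
        // stack unwinding already ran ~ScopedFlag() before we got here
    }
    return held;
}

// Returns early while the guard is alive; the returned value is read first,
// then the destructor releases the flag.
bool held_inside_scope(bool& held) {
    ScopedFlag guard(held);
    return held;
}
```

In both cases the flag is back to false once the scope is left, without any explicit release call.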

The lock requires you to initialize the name of your host. The computer you are running your application on must be registered within the context in which you want to do the lock. If you use many contexts with your application, you may limit the locks themselves to a single one of those contexts. See the QCassandraLock and the different Lock functions in the QCassandraContext object for detailed information about the locking mechanism.

Alexis Wilke's blog