Cassandra and production systems

We use Cassandra in our development process and have been running into problems with it. Unlike a typical SQL database, Cassandra is very lightweight and can at times fail to the point where a node is no longer usable.

When that happens on a development system, we can simply delete everything and restart fresh. Fine.

However, in a production system, you are going to be in trouble if you lose a node and that is the only node you have. In that case, you need a backup. The idea behind Cassandra, though, is to run many nodes so that the data is replicated automatically. If one node causes problems, you can always try a repair and, if that fails, just wipe the node and reinstall it. The node will replicate the data again automatically once it is back up.

On my end, I got this error while trying to read a specific row:

apache::thrift::transport::TTransportException

Since the system reads that row each time a user accesses the website running on Cassandra, the whole system went down. This is only a development system, so it is not a big problem in itself; I can just delete everything. In a production system, however, you would want to disconnect the node, attempt a repair, and then reconnect it (I am not too sure about the disconnect/reconnect, it may not be required).

Thus, I'm writing this post to urge you to have a bare minimum of 2 Cassandra nodes while running Snap! in a production system. Either that, or come up with a way to make backups of your data.
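
To illustrate why a second node helps, here is a minimal sketch of a read that falls back to another node when the first one fails. It is not the Snap! code: read_with_failover, read_row and transport_error are hypothetical names made up for this example, and transport_error only stands in for apache::thrift::transport::TTransportException so that the sketch compiles without the Thrift headers. It assumes a C++17 compiler (for std::optional). A real Cassandra client library may already handle this kind of failover for you.

```cpp
#include <functional>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for apache::thrift::transport::TTransportException,
// used so this sketch does not depend on the Thrift headers.
struct transport_error : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// Try each node in order and return the first successful read.
// read_row is supplied by the caller (an assumption of this sketch): it
// reads the row from the given host and throws transport_error on failure.
std::optional<std::string> read_with_failover(
        std::vector<std::string> const & hosts,
        std::function<std::string(std::string const &)> const & read_row)
{
    for(auto const & host : hosts)
    {
        try
        {
            return read_row(host);
        }
        catch(transport_error const &)
        {
            // this node could not serve the row; try the next one
        }
    }

    // every node failed (with a single node, one bad row gets us here)
    return std::nullopt;
}
```

With only one host in the list, a single transport error leaves the caller without data. With two or more hosts, the read still succeeds as long as one node can return the row.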

See also: Cassandra fails when trying to read certain rows.
