Cassandra

WARNING: This implementation of an inter-process, inter-computer lock works with Cassandra only if you know that you are directly dealing with a single Cassandra node at a time. The Cassandra C++ driver (probably all the drivers) makes use of a set of threads to connect to several Cassandra nodes and if the load of the current thread/node pair becomes too large, it will automatically switch to another thread/node pair. This means your messages may not be received in the order you sent them to the database cluster. As a result, the lock mechanism described below will not function as ...

As I was working on the antihammering plugin for Snap!, I wanted to use the count() feature to quickly find out how many hits occurred within a given period of time.

However, when I did that, I noticed that count() was capped at 100. The problem was that the predicate used at the lower layer in libQtCassandra set the maximum count to 100 by default.

I think that since the count function only counts the columns of interest, it runs really fast no matter what the maximum count is and whether or not you have a predicate. So I changed the lower layer implementation to force the ...

Sample Code:

QtCassandra::QCassandraRowPredicate row_predicate;
row_predicate.setCount(100);
...

QtCassandra::QCassandraColumnRangePredicate column_predicate;
column_predicate.setCount(100);
...

The Cassandra system allows you to read an array of rows or columns. This is done by a special query command sent to the database system.

The libQtCassandra library offers predicate classes that let you read a set of rows or columns all at once (see the example above). In general, reading more at once is better because it gives you a faster transfer rate to get one large block ...
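To illustrate why a predicate count matters, here is a minimal sketch of reading a row in fixed-size batches. It does not use libQtCassandra at all: `Row`, `read_slice()` and `count_columns()` are hypothetical names, and a sorted `std::map` stands in for one Cassandra row. The point is only that a single read capped at 100 columns sees at most 100 of them, while repeating the read from the last column received covers the whole row.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one Cassandra row: column name -> value,
// kept sorted by column name. Column names are assumed to be non-empty.
using Row = std::map<std::string, std::string>;

// Return up to `count` column names that sort strictly after `after`.
// Passing an empty `after` starts from the first column.
std::vector<std::string> read_slice(const Row& row,
                                    const std::string& after,
                                    std::size_t count)
{
    assert(count > 0);
    std::vector<std::string> names;
    for (auto it = row.upper_bound(after);
         it != row.end() && names.size() < count;
         ++it)
    {
        names.push_back(it->first);
    }
    return names;
}

// Count all the columns of a row by reading it in batches of `batch_size`.
std::size_t count_columns(const Row& row, std::size_t batch_size)
{
    std::size_t total = 0;
    std::string last; // last column name seen so far
    for (;;)
    {
        const std::vector<std::string> batch = read_slice(row, last, batch_size);
        total += batch.size();
        if (batch.size() < batch_size)
        {
            break; // a short batch means the end of the row was reached
        }
        last = batch.back();
    }
    return total;
}
```

With a row of 250 columns, `read_slice(row, "", 100)` returns only the first 100 names, whereas `count_columns(row, 100)` walks the row in three batches and reports all 250.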

Introduction

In order to run tests against what looks like a real cluster of Cassandra nodes, one wants to create multiple nodes, run them in parallel, and run one's software against that cluster.

So, I decided to create two racks on two separate computers. Each computer runs 3 VMs, each of which runs one instance of Cassandra. The VMs are installed with the most basic Ubuntu server (i.e. do not add anything when offered to install LAMP and other systems), Java, and Cassandra.

Network

Local Network

It is possible to set up a VM to make use of the local network (LAN). VirtualBox will ...

Introduction

The following are instructions to get Cassandra installed on your servers.

You may also want to install Cassandra directly from a file distributed on the Cassandra website (if you want to work with a cutting-edge version). In that case, the Java installation instructions still apply to you; however, the Cassandra installation then falls in your lap...


Java

Cassandra uses Java, so you want to install Java first. In order to get the latest version of Java directly from Oracle, you want to install the PPA update source as follows:

sudo vim ...

Snap! is being developed in a way that is quite different from many existing CMSs.

One of the features is to move pages being deleted to a trashcan, in effect, not actually deleting the data from the Cassandra cluster, but keeping it in a different place.

Actually, the "move page" feature in Snap! does not move anything. It makes a copy of the existing data in the specified destination and marks the old data as hidden. It also links the old and new data together and timestamps them.
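The copy, hide, link and timestamp steps can be sketched as follows. This is only an illustration of the idea, not the Snap! implementation: `Page`, `Site`, `move_page()` and all the field names are hypothetical, and a `std::map` stands in for the pages stored in the Cassandra cluster.

```cpp
#include <cassert>
#include <ctime>
#include <map>
#include <string>

// Hypothetical in-memory model of a page.
struct Page
{
    std::map<std::string, std::string> fields; // the page data
    bool        hidden = false;                // true once "moved" away
    std::string linked_to;                     // path of the other copy
    std::time_t moved_on = 0;                  // when the move happened
};

// Hypothetical site: page path -> page.
using Site = std::map<std::string, Page>;

// "Move" a page by copying it to `to` and hiding the original.
// Nothing is deleted. Returns false if the move cannot be done.
bool move_page(Site& site, const std::string& from,
               const std::string& to, std::time_t now)
{
    auto src = site.find(from);
    if (src == site.end() || src->second.hidden || site.count(to) != 0)
    {
        return false;
    }

    // 1. copy the existing data to the destination
    Page copy = src->second;
    copy.linked_to = from;
    copy.moved_on = now;
    site[to] = copy; // inserting in a std::map keeps `src` valid

    // 2. mark the old data as hidden, 3. link it, 4. timestamp it
    src->second.hidden = true;
    src->second.linked_to = to;
    src->second.moved_on = now;

    // the copy carries the same data as the original
    assert(site.at(to).fields == src->second.fields);
    return true;
}
```

Sending a page to the trashcan is then just `move_page(site, "/about", "/trashcan/about", std::time(nullptr))`, and both copies remain available afterward.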

There are several reasons for doing all of that work. First of all, the Cassandra cluster ...

Cassandra is very lightweight. Contrary to a standard database, it bases the safety of your data on the fact that it gets replicated many times, not on the fact that it is transported safely between you, its journal, and the drive.

There is a huge impact to being that lightweight, though. Once in a while, the tables, or more specifically, a node's journal, get mangled. When that happens, you can continue to use Cassandra for any data that appears before the mangled data. This gives you the impression that everything works when, in fact, something is awry in that node.

An interesting side effect ...

As I work more and more with Cassandra, I bump into more and more side effects of how the system works.

Yesterday I noticed that I would always get new entries for a set of pages I create on Snap! websites. These pages had a parameter, a list of boxes, which could be empty because some themes do not allow any boxes at all.

Unfortunately, if I may say, Cassandra does not support empty data. That is, if a cell is set to an empty string (""), it is the same as deleting that cell. The problem with that is that the cell disappears completely. So the only way is to have at least one byte of ...
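One way to always keep at least one byte in a cell is to prefix every stored value with a marker byte. The sketch below assumes that approach. `kPresentMarker`, `encode_cell()` and `decode_cell()` are hypothetical names, not part of Snap! or libQtCassandra, and the code requires C++17 for `std::optional`.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical marker byte prepended to every stored value.
constexpr char kPresentMarker = '\x01';

// Turn an application value (possibly empty) into what gets stored.
std::string encode_cell(const std::string& value)
{
    std::string stored(1, kPresentMarker);
    stored += value;
    assert(!stored.empty()); // the whole point: never store an empty cell
    return stored;
}

// Turn stored bytes back into the application value.
// Returns std::nullopt when the cell does not exist (nothing stored)
// or was not written with encode_cell().
std::optional<std::string> decode_cell(const std::string& stored)
{
    if (stored.empty() || stored[0] != kPresentMarker)
    {
        return std::nullopt;
    }
    return stored.substr(1);
}
```

With this, an empty list of boxes is stored as the single marker byte and decodes back to an empty string, which can be told apart from a cell that was never written.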

As I was working with libQtCassandra for the Snap! project, I noticed a problem in one of my queries. It is actually the only one where I used the reverse flag. This flag is used to ask Cassandra to return its data in reverse order. That works perfectly on Cassandra's side, but not so well in libQtCassandra...

In order to allow the C++ array operator (i.e. the square brackets ([]) are overloaded!) in libQtCassandra, I decided to make use of QMap to be able to quickly access the data. This means you can create a Cassandra context and then access data like this:

value = ...
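The excerpt does not spell out what goes wrong with the reverse flag. One plausible explanation, assuming the results are kept in a key-sorted map, is that a container such as QMap always iterates in ascending key order, so the order in which Cassandra returned the cells is lost. The sketch below shows that effect with `std::map` standing in for QMap. `Cell`, `keys_via_sorted_map()` and `keys_in_arrival_order()` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// A cell as it might arrive from a query: (column name, value).
using Cell = std::pair<std::string, std::string>;

// Store the cells in a key-sorted map, then list the keys.
// The map re-sorts them, whatever order they arrived in.
std::vector<std::string> keys_via_sorted_map(const std::vector<Cell>& arrival)
{
    std::map<std::string, std::string> cells;
    for (const Cell& c : arrival)
    {
        cells[c.first] = c.second;
    }
    assert(cells.size() <= arrival.size()); // duplicate keys collapse

    std::vector<std::string> keys;
    for (const auto& kv : cells)
    {
        keys.push_back(kv.first);
    }
    return keys;
}

// List the keys in the order they arrived (what a reversed query expects).
std::vector<std::string> keys_in_arrival_order(const std::vector<Cell>& arrival)
{
    std::vector<std::string> keys;
    for (const Cell& c : arrival)
    {
        keys.push_back(c.first);
    }
    return keys;
}
```

If the cells arrive as c, b, a, the sorted map hands them back as a, b, c. A caller that wants the reversed order would have to iterate the map backward or keep a separate list of keys in arrival order.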

As we are using Cassandra in our development process, we are encountering problems with it. Contrary to an SQL database, Cassandra is very lightweight and can at times fail to the point where you cannot use a node anymore.

When that happens on a development system, we can simply delete everything and restart fresh. Fine.

However, in a production system, you're going to run into some problems if you lose a node and... that's the only node you have. In that case, you'd need a backup. However, the idea of Cassandra is to run many nodes to have automatic backups. If one node causes ...

Snap! Websites
An Open Source CMS System in C++
