Search feature

Table of words

With a Cassandra environment, the system can simply save every word (3 letters or more, or whatever rule we settle on) in a table used as an index that references all the pages containing that word.

search_table[word][page] = 1;

The "[page]" key is specific to a website, but the "[word]" key is not. We could also make sure that the words inserted in this table are limited to a dictionary; however, that could be difficult if we support 200 languages. (We have to think of speed as well, and we may want to have one table per letter.)
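
As a rough illustration of the table described above, here is a minimal in-memory sketch in C++. It is not the Cassandra implementation: a std::map stands in for the table, and the names search_table_t, extract_words, index_page and find_pages are made up for this example. It also assumes plain ASCII text, so any non-ASCII byte is treated as a word separator, which would not be good enough for most of the 200 languages mentioned above.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// In-memory stand-in for the Cassandra table (assumption): word -> set of page keys.
using search_table_t = std::map<std::string, std::set<std::string>>;

// Split text into lowercase ASCII words and keep those with at least
// min_length letters (3 by default, as suggested in the text).
std::vector<std::string> extract_words(std::string const & text, std::size_t min_length = 3)
{
    std::vector<std::string> words;
    std::string current;
    auto flush = [&]() {
        if(current.length() >= min_length)
        {
            words.push_back(current);
        }
        current.clear();
    };
    for(unsigned char c : text)
    {
        if(std::isalpha(c))
        {
            current += static_cast<char>(std::tolower(c));
        }
        else
        {
            flush();
        }
    }
    flush();
    return words;
}

// Equivalent of search_table[word][page] = 1 for every word found in the page.
void index_page(search_table_t & table, std::string const & page, std::string const & text)
{
    for(auto const & word : extract_words(text))
    {
        table[word].insert(page);
    }
}

// Return the pages that contain the given word (the word must already be lowercase).
std::set<std::string> find_pages(search_table_t const & table, std::string const & word)
{
    auto const it = table.find(word);
    return it == table.end() ? std::set<std::string>() : it->second;
}
```

For example, after calling index_page() on two pages whose text both contain "team", find_pages(table, "team") returns both page keys, while a two-letter word such as "we" is never stored.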

Maintenance

When a page is deleted, we need to make sure to delete the corresponding entries in the search index.
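
One possible way to make this deletion cheap is to keep a second, reverse table that lists the words stored for each page, so that the index does not need to be scanned in full. The sketch below shows the idea in memory with C++. The reverse table is an assumption of this example and not something decided in this document, and the names search_index, pages_by_word and words_by_page are hypothetical.

```cpp
#include <map>
#include <set>
#include <string>

// Sketch of an index that can forget a page without scanning every word.
struct search_index
{
    // Forward table: word -> pages (same role as search_table[word][page]).
    std::map<std::string, std::set<std::string>> pages_by_word;

    // Reverse table (assumption): page -> words indexed for that page.
    std::map<std::string, std::set<std::string>> words_by_page;

    void add(std::string const & word, std::string const & page)
    {
        pages_by_word[word].insert(page);
        words_by_page[page].insert(word);
    }

    // Remove every entry that references the deleted page.
    void remove_page(std::string const & page)
    {
        auto const entry = words_by_page.find(page);
        if(entry == words_by_page.end())
        {
            return; // the page was never indexed
        }
        for(auto const & word : entry->second)
        {
            auto const row = pages_by_word.find(word);
            if(row == pages_by_word.end())
            {
                continue;
            }
            row->second.erase(page);
            if(row->second.empty())
            {
                // no page uses this word anymore, drop the row
                pages_by_word.erase(row);
            }
        }
        words_by_page.erase(entry);
    }
};
```

The cost of this design is one extra write per word when a page is saved, in exchange for a deletion that only touches the rows of the words that page actually used.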

Other Solutions

We want to look at Apache Solr and similar tools to see whether they could be of any help.

Extensions

Extensions would be systems that allow us to search all kinds of documents (PDF, MS Word, etc.).

See also: http://www.opensearch.org/Home

Snap! Websites
An Open Source CMS System in C++
