Tuesday, April 22, 2014

Getting started with Cassandra

Apache Cassandra™ is a massively scalable open source NoSQL database. Cassandra is well suited to managing large amounts of structured, semi-structured, and unstructured data across multiple data centers and the cloud. It delivers continuous availability, linear scalability, and operational simplicity across many commodity servers with no single point of failure, along with a dynamic data model designed for flexibility and fast response times.

For more information about Cassandra's architecture, see the DataStax documentation: http://www.datastax.com/documentation/cassandra/2.0/cassandra/gettingStartedCassandraIntro.html

If you want to try it without committing a lot of resources, you can use the pre-built Cassandra virtual machine, which can be downloaded from https://s3.amazonaws.com/planetcassandra-downloads/Cassandra-2.0.6.ova

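0. First, you need to know how the terms of the Cassandra data model compare with those of the relational model. For example, a keyspace plays the role of a database, and a column family (called a table in CQL) plays the role of a table.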

(src: http://www.ebaytechblog.com/2012/07/16/cassandra-data-modeling-best-practices-part-1/)

1. You need to install VirtualBox or VMware to run the virtual machine. I'm using an Ubuntu 14.04 machine, so I chose VirtualBox:

$ sudo apt-get install virtualbox

2. Double-click the Cassandra-2.0.6.ova file to import the virtual machine into VirtualBox, then start it.

3. Press Alt+F2 to open the root shell, then start the Cassandra service with this command:

# systemctl start cassandra.service

4. Type cqlsh to start the CQL shell.

Alternatively, you can open the web interface of the CQL shell at this URL:

http://<ip of the virtual machine>:8080/

(You may need to switch the virtual machine's network adapter to bridged mode, which VirtualBox calls "Bridged Adapter", so that the host can reach it.)

5. Trying some CQL commands:

a. Creating a keyspace (a database in CQL):

cqlsh> CREATE KEYSPACE geniusdb
      ...  WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'dc1' : 3 };
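
The statement above assumes a data center named dc1 with at least three nodes. On a single-node setup such as the virtual machine, the data center name reported by the node may differ (nodetool status shows it), and in that case writes to the keyspace may fail because no replicas are available. A minimal sketch of an alternative for single-node experiments follows. The keyspace name geniusdb_dev is hypothetical and not part of the original walkthrough:

```cql
-- Sketch for a single-node test setup: SimpleStrategy ignores data center
-- names, and a replication factor of 1 keeps a single copy of each row.
-- geniusdb_dev is a hypothetical name used only for this illustration.
CREATE KEYSPACE geniusdb_dev
  WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

-- cqlsh command (not a CQL statement) that prints the keyspace definition back.
DESCRIBE KEYSPACE geniusdb_dev;
```

SimpleStrategy is intended for a single data center or for testing. NetworkTopologyStrategy remains the usual choice when a cluster spans several data centers.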

b. Using the keyspace:

cqlsh> USE geniusdb;

c. Creating a table:

cqlsh:geniusdb> CREATE TABLE users (
            ... user_name varchar,
            ... password varchar,
            ... gender varchar,
            ... session_token varchar,
            ... state varchar,
            ... birth_year bigint,
            ... PRIMARY KEY (user_name));
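
Once the table exists, rows can be written and read back by primary key. The following is a minimal sketch, assuming the geniusdb keyspace is in use. The user jsmith and all column values are made-up sample data:

```cql
-- Insert one row; columns left out (here session_token) simply stay unset.
INSERT INTO users (user_name, password, gender, state, birth_year)
  VALUES ('jsmith', 'ch@ngem3a', 'male', 'TX', 1968);

-- Read the row back by its primary key.
SELECT * FROM users WHERE user_name = 'jsmith';

-- Change a single column. In Cassandra, an UPDATE on a key that
-- does not exist yet creates the row instead of failing.
UPDATE users SET state = 'CA' WHERE user_name = 'jsmith';

-- Remove the row.
DELETE FROM users WHERE user_name = 'jsmith';
```

Each statement identifies the row by user_name because it is the table's primary key. Filtering on other columns would generally need a secondary index or a different table design.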

Reference: http://www.datastax.com/documentation/cql/3.1/cql/cql_using/about_cql_c.html