Apache Cassandra is a highly-scalable partitioned row store. Rows are organized into tables with a required primary key.
https://cwiki.apache.org/confluence/display/CASSANDRA2/Partitioners[Partitioning] means that Cassandra can distribute your data across multiple machines in an application-transparent matter. Cassandra will automatically repartition as machines are added and removed from the cluster.
https://cwiki.apache.org/confluence/display/CASSANDRA2/DataModel[Row store] means that like relational databases, Cassandra organizes data by rows and columns. The Cassandra Query Language (CQL) is a close relative of SQL.
For more information, see http://cassandra.apache.org/[the Apache Cassandra web site].
Issues should be reported on https://issues.apache.org/jira/projects/CASSANDRA/issues/[The Cassandra Jira].
- Java: see supported versions in build.xml (search for property "java.supported").
- Python: for
cqlsh
, seebin/cqlsh
(search for function "is_supported_version").
This short guide will walk you through getting a basic one node cluster up and running, and demonstrate some simple reads and writes. For a more-complete guide, please see the Apache Cassandra website's https://cassandra.apache.org/doc/latest/cassandra/getting_started/index.html[Getting Started Guide].
First, we'll unpack our archive:
$ tar -zxvf apache-cassandra-$VERSION.tar.gz $ cd apache-cassandra-$VERSION
After that we start the server. Running the startup script with the -f argument will cause Cassandra to remain in the foreground and log to standard out; it can be stopped with ctrl-C.
$ bin/cassandra -f
Now let's try to read and write some data using the Cassandra Query Language:
$ bin/cqlsh
The command line client is interactive so if everything worked you should be sitting in front of a prompt:
Connected to Test Cluster at localhost:9160. [cqlsh 6.3.0 | Cassandra 5.0-SNAPSHOT | CQL spec 3.4.8 | Native protocol v5] Use HELP for help. cqlsh>
As the banner says, you can use 'help;' or '?' to see what CQL has to offer, and 'quit;' or 'exit;' when you've had enough fun. But lets try something slightly more interesting:
cqlsh> CREATE KEYSPACE schema1 WITH replication = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }; cqlsh> USE schema1; cqlsh:Schema1> CREATE TABLE users ( user_id varchar PRIMARY KEY, first varchar, last varchar, age int ); cqlsh:Schema1> INSERT INTO users (user_id, first, last, age) VALUES ('jsmith', 'John', 'Smith', 42); cqlsh:Schema1> SELECT * FROM users; user_id | age | first | last ---------+-----+-------+------- jsmith | 42 | john | smith cqlsh:Schema1>
If your session looks similar to what's above, congrats, your single node cluster is operational!
For more on what commands are supported by CQL, see http://cassandra.apache.org/doc/latest/cql/[the CQL reference]. A reasonable way to think of it is as, "SQL minus joins and subqueries, plus collections."
Wondering where to go from here?
- Join us in #cassandra on the https://s.apache.org/slack-invite[ASF Slack] and ask questions.
- Subscribe to the Users mailing list by sending a mail to user-subscribe@cassandra.apache.org.
- Subscribe to the Developer mailing list by sending a mail to dev-subscribe@cassandra.apache.org.
- Visit the http://cassandra.apache.org/community/[community section] of the Cassandra website for more information on getting involved.
- Visit the http://cassandra.apache.org/doc/latest/development/index.html[development section] of the Cassandra website for more information on how to contribute.