Apache Cassandra is a Java-based, open-source distributed database management system initially designed to power Facebook’s Inbox Search feature. Cassandra was released as an open-source project on Google Code in 2008, entered the Apache Incubator in 2009, and graduated to a top-level Apache project in 2010. Since then, Cassandra has become one of the most widely used NoSQL databases.
With no single point of failure, Cassandra has advantages over relational databases and other NoSQL databases, including continuous availability, linear scalability, and operational simplicity. All nodes play an identical role, and data is automatically replicated to multiple nodes for fault tolerance. Compared to relational databases, Cassandra handles high incoming data velocity, very high data volumes, decentralized deployments, and both structured and unstructured data.
Cassandra can be scaled without downtime or interruption to applications, and the lack of a single point of failure lets it offer continuous availability. When new nodes are added, or when existing nodes fail and must be repaired or replaced, the work is done on the specific node without taking the application out of service.
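To make the idea of identical nodes and automatic replication concrete, here is a minimal toy model of a replicated ring. It is a sketch under simplifying assumptions, not Cassandra's actual implementation: the `ToyRing` class, the node names, and the use of MD5 as the hash are all illustrative choices, and real clusters use their own partitioner and virtual nodes.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is used purely for illustration)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    """A toy peer-to-peer ring: every node is equal, and each key lives on `rf` nodes."""

    def __init__(self, nodes, rf=3):
        self.rf = rf
        # Each node owns one token; the ring is kept sorted by token.
        self.ring = sorted((token(n), n) for n in nodes)

    def add_node(self, node):
        """Add a node in place; the other nodes keep serving requests."""
        bisect.insort(self.ring, (token(node), node))

    def replicas(self, key):
        """Walk clockwise from the key's token and collect the next `rf` nodes."""
        tokens = [t for t, _ in self.ring]
        start = bisect.bisect_right(tokens, token(key))
        count = min(self.rf, len(self.ring))
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(count)]

    def live_replicas(self, key, down=()):
        """Replicas that can still answer when the nodes in `down` have failed."""
        return [n for n in self.replicas(key) if n not in down]


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"], rf=3)
owners = ring.replicas("user:42")
# Simulate the first replica failing: the remaining copies can still serve the key.
survivors = ring.live_replicas("user:42", down={owners[0]})
print("replicas:", owners)
print("still available on:", survivors)
```

In this model, losing one node leaves two copies of the key reachable, which is the intuition behind serving requests while a failed node is repaired.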
Cassandra can be deployed on a large number of nodes across multiple data centers with configurable replication strategies. This ability to scale across data centers makes Cassandra a popular choice for major corporations that store their data in the cloud; by 2014, Cassandra had become the 9th most popular database. Cassandra uses its own SQL-like language called CQL (Cassandra Query Language), so users with a background in relational databases will recognize the syntax, which makes the switch from a relational database easier.
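As a rough illustration of configurable, per-data-center replication expressed in CQL, the sketch below builds a `CREATE KEYSPACE` statement that uses `NetworkTopologyStrategy`. The helper `create_keyspace_cql`, the keyspace name `hotel_prices`, and the data center names `dc_east` and `dc_west` are hypothetical; the resulting statement would be run against a cluster, for example through `cqlsh`.

```python
def create_keyspace_cql(keyspace: str, replicas_per_dc: dict) -> str:
    """Build a CQL statement that sets a replication factor per data center."""
    parts = ["'class': 'NetworkTopologyStrategy'"]
    for dc, rf in sorted(replicas_per_dc.items()):
        if not isinstance(rf, int) or rf < 1:
            raise ValueError(f"replication factor for {dc} must be a positive integer")
        parts.append(f"'{dc}': {rf}")
    options = ", ".join(parts)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{options}}};"


# Hypothetical layout: three copies in one data center, two in another.
statement = create_keyspace_cql("hotel_prices", {"dc_east": 3, "dc_west": 2})
print(statement)
```

Readers coming from SQL will notice the familiar `CREATE ... WITH` shape; the replication map is the Cassandra-specific part.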
Current Cassandra users include many of the largest and most prominent cloud-based services in the world. Netflix uses Cassandra as its backend database to store customer viewing data and to support its streaming API. Expedia uses Cassandra to store over 2 billion historical hotel prices so that price data can be retrieved quickly. Many other well-known online businesses, such as Digg, Mahalo, and Reddit, also use Cassandra for various purposes.
Cassandra is a general-purpose non-relational database that also offers advantages to non-cloud companies such as Walmart, Australia Post, and InterContinental Hotels Group, all of which count themselves as users. Cassandra is suitable for a wide variety of applications, including Internet of Things apps, messaging, product catalogs, and other retail applications. In each of these cases, Cassandra's availability and ability to absorb large volumes of data make it a strong candidate.
To companies like the ones mentioned above, Cassandra offers a number of benefits, not the least of which is its ability to process very large amounts of data quickly. In Expedia’s case, this means billions of constantly updated price points from the 140,000 hotels that it works with. Cassandra can read and write data very quickly and handle volatile data that changes regularly. Cassandra is highly scalable and is easy to set up, administer, and use. Apple alone runs over 100,000 Cassandra nodes in production, illustrating this scalability. Cassandra’s operational costs are reasonable, and support is readily available when required.