How to Tune Cassandra Transaction Integrity for E-Commerce Applications

December 1, 2023

Transaction integrity refers to the way new transactions transform or update existing data: the result should accurately reflect the combination of the original data and the new transactions. Sometimes the term also refers to keeping an accurate record of each transaction.

Why Transaction Integrity is Crucial

Transaction integrity matters whenever the failure of any one command in a transaction at commit time would invalidate the entire transaction. It is an uncompromising proposition: either all of the transaction is written to the data source when you commit it, or all of it is rolled back.

Unlike RDBMS ACID transactions with rollback or locking mechanisms, Cassandra offers atomic, isolated, and durable transactions with eventual, tunable consistency, which lets the user decide how strong or eventual each transaction’s consistency should be. As a non-relational database, Cassandra does not support joins or foreign keys and consequently does not offer consistency in the ACID sense. It supports atomicity and isolation at the row level, but trades transactional isolation and atomicity for high availability and fast write performance. Cassandra writes are durable.

Cassandra is distributed, meaning that it is capable of running on multiple machines while appearing to users as a unified whole. Cassandra is also decentralized, meaning that every node is identical; no node performs organizing operations distinct from any other node. Instead, Cassandra features a peer-to-peer protocol. Because Cassandra is decentralized, there is no single point of failure. All the nodes in a Cassandra cluster function exactly the same, a property also known as server symmetry.

Understanding the Write Operation in Cassandra

In Cassandra, a write operation is atomic at the partition level, meaning the insertions or updates of two or more rows in the same partition are treated as one write operation.
A delete operation is also atomic at the partition level.

The implications for transaction integrity can be illustrated with an example. With a write consistency level of QUORUM and a replication factor of 3, Cassandra sends the write to all three replica nodes and waits for acknowledgement from two of them. If the write fails on one of the nodes but succeeds on another, Cassandra reports a failure to replicate the write on that node. However, the replicated write that succeeded on the other node is not automatically rolled back.

Cassandra uses client-side timestamps to determine the most recent update to a column, which matters a great deal for e-commerce applications and their users. The latest timestamp always wins when data is requested, so if multiple client sessions update the same columns in a row concurrently, the most recent update is the one users see.

Writes in Cassandra are durable. All writes to a replica node are recorded both in memory and in a commit log on disk before they are acknowledged as a success. If a crash or server failure occurs before the memtables are flushed to disk, the commit log is replayed on restart to recover any lost writes. In addition to this local durability (data immediately written to disk), the replication of data to other nodes strengthens durability. You can set the commitlog_sync option in the cassandra.yaml file to either “periodic” or “batch” to tune local durability to your consistency needs.

Cassandra write and delete operations are performed with full-row isolation. This means that a write to a row within a single partition on a single node is visible only to the client performing the operation, and it stays confined to this scope until it completes. All updates in a batch operation that belong to a given partition key have the same restriction. However, a batch operation is not isolated if it includes changes to more than one partition.
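
The QUORUM example and the latest-timestamp-wins rule described above can be sketched in a few lines of Python. This is a simplified in-memory model, not Cassandra's implementation and not a driver API: the Replica class and the quorum_write function are hypothetical names introduced only for illustration, and the tie-breaking that Cassandra applies to equal timestamps is left out.

```python
from dataclasses import dataclass, field


def quorum(replication_factor: int) -> int:
    """Acknowledgements needed for QUORUM: a majority of the replicas."""
    return replication_factor // 2 + 1


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for one replica node."""
    name: str
    up: bool = True
    cells: dict = field(default_factory=dict)  # column -> (timestamp, value)

    def write(self, column: str, value: str, timestamp: int) -> bool:
        if not self.up:
            return False  # a failed node sends no acknowledgement
        current = self.cells.get(column)
        # Last write wins: keep the cell with the higher timestamp.
        # (Simplification: an equal timestamp keeps the existing cell.)
        if current is None or timestamp > current[0]:
            self.cells[column] = (timestamp, value)
        return True


def quorum_write(replicas, column, value, timestamp) -> bool:
    """Send the write to every replica; succeed if a quorum acknowledges.

    Replicas that applied the write keep it even when the overall
    operation is reported as failed, because nothing is rolled back.
    """
    acks = sum(r.write(column, value, timestamp) for r in replicas)
    return acks >= quorum(len(replicas))
```

With three replicas of which one is down, quorum_write still returns True because two acknowledgements arrive. With two replicas down it returns False, yet the single healthy replica keeps the value it wrote, which mirrors the absence of rollback. A later call carrying an older timestamp is acknowledged but does not overwrite the newer cell.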