Apache Cassandra 3 storage system
Apache Cassandra 3 is an open source distributed key-value storage system. It was originally developed at Facebook to store very large amounts of data.
Main features
1. Distributed
The main feature of Cassandra is that it is not a single database server but a distributed network service made up of a group of database nodes. A write to Cassandra is replicated to other nodes, while a read is routed to a node that holds a copy of the requested data.
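The replication described above can be illustrated with a toy hash ring. This is a minimal sketch, not Cassandra's implementation: the node names, the `Ring` class, the use of MD5, and the single token per node are all assumptions made for illustration. A real cluster uses its configured partitioner and replication strategy.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice only)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy ring: each node owns one token; replicas are the next nodes clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        self.entries = sorted((token(n), n) for n in nodes)
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, key: str):
        # First node whose token is >= the key's token, wrapping around the ring.
        start = bisect.bisect_left(self.tokens, token(key))
        return [
            self.entries[(start + i) % len(self.entries)][1]
            for i in range(self.replication_factor)
        ]


# Hypothetical four-node cluster keeping three copies of each key.
ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
owners = ring.replicas("user:42")
print("write is replicated to:", owners)
print("read can be served by: ", owners[0])  # any single replica would do
```

Because the replica list depends only on the key and the ring, any node can work out where a key lives without consulting a central coordinator.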
2. Column-based data model
Cassandra uses a column-based data model, similar to Google's BigTable. This model allows users to store and query data as needed without having to define the entire data structure in advance.
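The idea that rows need not share a fixed set of columns can be pictured as a map of maps. The sketch below is only a mental model built from plain Python dictionaries. The `users` table, its column names, and the `get` helper are hypothetical and say nothing about how Cassandra lays data out on disk.

```python
from collections import defaultdict

# row key -> {column name -> value}; rows do not have to share the same columns
users = defaultdict(dict)

users["alice"]["email"] = "alice@example.com"
users["alice"]["city"] = "Lisbon"
users["bob"]["email"] = "bob@example.com"
users["bob"]["last_login"] = "2016-03-01"  # a column that alice does not have


def get(row_key, column, default=None):
    """Read one column; a column that was never written yields the default."""
    return users.get(row_key, {}).get(column, default)


print(sorted(users["alice"]))      # ['city', 'email']
print(get("alice", "last_login"))  # None
```

A column that is never written for a row takes no space in that row, which is why new columns can be introduced without rewriting existing data.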
3. High scalability
Cassandra is highly scalable and nodes can be easily added to expand the capacity of the cluster without restarting any processes, changing application queries, or manually migrating data.
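One reason a node can join without a manual migration is that, on a hash ring, only the keys falling into the new node's ranges change owner. The sketch below demonstrates this with an illustrative ring. The node names, the 64 virtual tokens per node, and the MD5 hash are assumptions, not Cassandra's actual settings.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice only)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def build_ring(nodes, vnodes=64):
    """Give each node several tokens (virtual nodes) so that load spreads evenly."""
    entries = sorted((token(f"{n}#{i}"), n) for n in nodes for i in range(vnodes))
    return [t for t, _ in entries], [n for _, n in entries]


def owner(ring, key):
    """Return the node owning the first token at or after the key's token."""
    tokens, owners = ring
    return owners[bisect.bisect_left(tokens, token(key)) % len(tokens)]


keys = [f"user:{i}" for i in range(1000)]
old_nodes = ["node-a", "node-b", "node-c", "node-d"]

before_ring = build_ring(old_nodes)
after_ring = build_ring(old_nodes + ["node-e"])  # a fifth node joins

before = {k: owner(before_ring, k) for k in keys}
after = {k: owner(after_ring, k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]

print(f"{len(moved)} of {len(keys)} keys changed owner")
print("all moved keys went to the new node:",
      all(after[k] == "node-e" for k in moved))
```

Keys that stay put keep their original owner, so existing nodes only hand data over to the newcomer and never shuffle it among themselves.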
Cassandra features compared to other databases
Cassandra is a hybrid non-relational database, similar to Google's BigTable. Its feature set is richer than that of Dynomite (a distributed key-value storage system) but not as rich as that of the document store MongoDB. (MongoDB is an open source product that sits between relational and non-relational databases; among non-relational databases it is the most feature-rich and the most like a relational database. The data structures it supports are very loose, in a JSON-like BSON format, so it can store more complex data types.)
Cassandra was originally developed by Facebook and later became an open source project. It is well suited to the database needs of social networking and cloud computing. It builds on the fully distributed design of Amazon's proprietary Dynamo and combines it with Google BigTable's column-family data model. With its decentralized peer-to-peer storage, it can in many respects be called Dynamo 2.0.
Compared with other databases, Cassandra has the following outstanding features:
1. Schema flexibility: With Cassandra, as with a document store, you don't have to decide on the fields in a record in advance. You can add or remove fields at will while the system is running, which is a huge efficiency gain in large deployments.
2. True scalability: Cassandra is horizontally scalable in the pure sense. To add more capacity to the cluster, just add another machine. You don't need to restart any processes, change application queries, or manually migrate any data.
3. Multi-data-center awareness: You can arrange the node layout so that the failure of one data center does not cause data loss. The backup data center holds at least one complete copy of every record, which keeps the data safe.
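The multi-data-center point in item 3 can be sketched as a placement rule: walk the ring and pick replicas until every data center holds its required number of copies. The code below is a simplified illustration in the spirit of Cassandra's NetworkTopologyStrategy, not a reproduction of it. The node names, data center names, and the `place_replicas` function are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is an illustrative choice only)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


# Hypothetical cluster layout: node name -> data center name.
NODES = {
    "a1": "dc-east", "a2": "dc-east", "a3": "dc-east",
    "b1": "dc-west", "b2": "dc-west", "b3": "dc-west",
}

RING = sorted((token(name), name) for name in NODES)
TOKENS = [t for t, _ in RING]


def place_replicas(key, replicas_per_dc):
    """Walk the ring clockwise, taking nodes until each data center has its quota."""
    remaining = dict(replicas_per_dc)
    chosen = []
    start = bisect.bisect_left(TOKENS, token(key))
    for step in range(len(RING)):
        node = RING[(start + step) % len(RING)][1]
        dc = NODES[node]
        if remaining.get(dc, 0) > 0:
            chosen.append(node)
            remaining[dc] -= 1
    return chosen


replicas = place_replicas("user:42", {"dc-east": 2, "dc-west": 1})
print(replicas)
print("copies left if dc-east fails:",
      [n for n in replicas if NODES[n] != "dc-east"])
```

Since every key is forced to have at least one copy in each data center, losing one data center still leaves a complete copy of the data in the other.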
Other features
In addition to the main features mentioned above, Cassandra also provides some other features:
1. Range query: If you don't want to look up every key individually, you can query a range of keys instead.
2. List data structure: In the hybrid model, super columns add a fifth dimension to the data structure. This is very convenient for per-user indexes.
3. Distributed write operations: Cassandra lets you read or write any data anywhere in the cluster at any time, and there is no single point of failure.
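The range query from item 1 relies on keys being kept in sorted order, so that a range maps to one contiguous slice. The sketch below shows that idea with an in-memory store. The `OrderedStore` class and the date-shaped keys are hypothetical, and whether keys are actually ordered in Cassandra depends on the partitioner and on how clustering columns are defined.

```python
import bisect


class OrderedStore:
    """Toy store that keeps its keys sorted so that ranges are contiguous."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)  # insert while preserving sort order
        self._values[key] = value

    def range(self, start, end):
        """Return (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._values[k]) for k in self._keys[lo:hi]]


store = OrderedStore()
for day in ("2016-01-03", "2016-01-01", "2016-02-10", "2016-01-20"):
    store.put(day, f"events for {day}")

january = store.range("2016-01-01", "2016-02-01")
print([k for k, _ in january])  # ['2016-01-01', '2016-01-03', '2016-01-20']
```

Two binary searches find the ends of the slice, so the cost of the query depends on the size of the result and not on the total number of keys.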
Summary
Apache Cassandra 3 is a powerful open source distributed Key-Value storage system that provides high scalability, schema flexibility, and reliability, making it ideal for applications that need to store and process large amounts of data.