Google Cloud/Bigtable
From charlesreid1
Overview
Bigtable features:
- sparsely populated table
- billions of rows, thousands of columns
- ideal data source for MapReduce operations
- mutable (existing data can be updated)
- TB to PB of data
- low-latency access to large amounts of single-keyed data
- high read and write throughput
- fully managed - design your schema and you're done
- example applications: marketing data, financial data, IoT data, time series data
- empty cells don't take up any space
From the original white paper: "A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted array of bytes."
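To make the "sorted map" description concrete, here is a minimal in-memory sketch. It is not the Bigtable API: the class ToySortedMap, its method names, and the example row and column names are all made up for illustration. It only shows the three-part key, the fact that unwritten cells take no space, and range scans over sorted row keys.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model (not the Bigtable API): a map keyed by
# (row key, column key, timestamp) whose values are raw bytes.
CellKey = Tuple[str, str, int]


class ToySortedMap:
    def __init__(self) -> None:
        # Only cells that were written exist here, so empty cells cost nothing.
        self._cells: Dict[CellKey, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def get_latest(self, row: str, column: str) -> Optional[bytes]:
        """Return the newest version of a cell, or None if it was never written."""
        versions = [(ts, value) for (r, c, ts), value in self._cells.items()
                    if r == row and c == column]
        if not versions:
            return None
        return max(versions, key=lambda pair: pair[0])[1]

    def scan(self, start_row: str, end_row: str) -> List[CellKey]:
        """Return keys with start_row <= row < end_row, sorted by row and
        column, newest timestamp first."""
        keys = [k for k in self._cells if start_row <= k[0] < end_row]
        return sorted(keys, key=lambda k: (k[0], k[1], -k[2]))


table = ToySortedMap()
table.put("com.cnn.www", "contents:", 3, b"<html>old")
table.put("com.cnn.www", "contents:", 5, b"<html>new")
table.put("com.example", "anchor:cnnsi.com", 4, b"CNN")

print(table.get_latest("com.cnn.www", "contents:"))
print(table.scan("com.cnn", "com.d"))
```

The scan only touches rows whose keys fall in the requested range, which is why row key design matters so much for a store that is sorted by row key.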
From the documentation: "A Cloud Bigtable table is sharded into blocks of contiguous rows, called tablets, to help balance the workload of queries. (Tablets are similar to HBase regions.) Tablets are stored on Colossus, Google's file system, in SSTable format."
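A rough sketch of what "blocks of contiguous rows" implies: if each tablet owns a contiguous range of the sorted row keys, finding the tablet for a key is a binary search over the split points. The TabletMap class and the split keys below are hypothetical; how Bigtable actually tracks tablet boundaries is not described in these notes.

```python
import bisect
from typing import List


class TabletMap:
    """Hypothetical lookup table: each tablet owns a contiguous row key range."""

    def __init__(self, split_keys: List[str]) -> None:
        # Each split key is the first row key of the next tablet,
        # so N split keys define N + 1 tablets.
        self.split_keys = sorted(split_keys)

    def tablet_for(self, row_key: str) -> int:
        """Return the index of the tablet whose key range contains row_key."""
        return bisect.bisect_right(self.split_keys, row_key)


tablets = TabletMap(["g", "n", "t"])
for key in ["apple", "g", "hello", "zebra"]:
    print(key, "-> tablet", tablets.tablet_for(key))
```

Row keys that sort next to each other land on the same tablet, so a scan over a key range usually touches only a few tablets.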
This means that Bigtable nodes don't store the data themselves - it all lives on Colossus, and each node just holds pointers to a set of tablets. So rebalancing tablets between nodes is fast, because only the pointers move - no need to copy data around.
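A toy illustration of why this is cheap, assuming a shared storage layer standing in for Colossus and nodes that hold only tablet ids. The node names, tablet ids, and the move_tablet helper are invented for this sketch.

```python
from typing import Dict, Set

# Shared storage (stands in for Colossus): tablet id -> tablet data.
storage: Dict[str, bytes] = {
    "tablet-a": b"rows a..g",
    "tablet-b": b"rows g..n",
    "tablet-c": b"rows n..z",
}

# Each node holds only tablet ids (pointers), never the data itself.
assignments: Dict[str, Set[str]] = {
    "node-1": {"tablet-a", "tablet-b", "tablet-c"},
    "node-2": set(),
}


def move_tablet(assignments: Dict[str, Set[str]], tablet_id: str,
                src: str, dst: str) -> None:
    """Reassign a tablet by moving its pointer; the stored bytes are untouched."""
    assignments[src].remove(tablet_id)
    assignments[dst].add(tablet_id)


storage_before = dict(storage)
move_tablet(assignments, "tablet-c", "node-1", "node-2")
print(assignments)
print(storage == storage_before)
```

The cost of the move does not depend on how much data the tablet holds, only on updating the assignment.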
When you update a row, the update is written sequentially as a new entry rather than overwriting the old one, so updates temporarily take up additional space. Bigtable compacts tables periodically, and compaction removes the superseded and deleted entries.
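The append-then-compact pattern can be sketched as below. This is a simplification under stated assumptions: the Mutation tuple, the sequence numbers, and the compact function are hypothetical, and this toy keeps only the newest entry per row, whereas how many old versions Bigtable really keeps depends on the table's garbage collection settings.

```python
from typing import Dict, List, Tuple

# Hypothetical mutation record: (row key, sequence number, value).
Mutation = Tuple[str, int, bytes]


def compact(log: List[Mutation]) -> List[Mutation]:
    """Keep only the newest mutation for each row key, sorted by row key."""
    latest: Dict[str, Mutation] = {}
    for mutation in log:
        row, seq, _ = mutation
        if row not in latest or seq > latest[row][1]:
            latest[row] = mutation
    return sorted(latest.values())


# Updates are appended, so the log grows even though only two rows exist.
log: List[Mutation] = []
log.append(("user#1", 1, b"a"))
log.append(("user#2", 2, b"b"))
log.append(("user#1", 3, b"c"))

compacted = compact(log)
print(len(log), "entries before compaction,", len(compacted), "after")
```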
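Access control is managed through IAM. It was originally available at project level only - not at table level; Bigtable IAM now also supports roles at the instance and table level.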
Comparison with other storage options
From Cloud Docs overview of Bigtable: https://cloud.google.com/bigtable/docs/overview
Cloud Bigtable is not a relational database; it does not support SQL queries or joins, nor does it support multi-row transactions. Also, it is not a good solution for less than 1 TB of data.
- If you need full SQL support for an online transaction processing (OLTP) system, consider Google Cloud SQL.
- If you need interactive querying in an online analytical processing (OLAP) system, consider Google BigQuery.
- If you need to store immutable blobs larger than 10 MB, such as large images or movies, consider Google Cloud Storage.
- If you need to store highly structured objects, or if you require support for ACID transactions and SQL-like queries, consider Cloud Datastore.
Resources
Bigtable paper (2006): http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf