Revision as of 19:16, 24 October 2017

Overview

Bigtable features:

sparsely populated table
billions of rows, thousands of columns
ideal data source for MapReduce operations
updatable/mutable
TB to PB of data
large amounts of single-keyed data with low latency
high read/write throughput at low latency
fully managed - design your schema and you're done
example applications: marketing data, financial data, IoT data, time series data
empty cells don't take up any space

From the original paper: "A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted array of bytes."
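The data model in that quote can be sketched as a toy in-memory structure. This is only an illustration of the "sparse sorted map" idea, not how Bigtable is implemented: the class `ToyTable` and its methods are hypothetical names, the row and column keys are illustrative, and timestamps are sorted ascending here for simplicity.

```python
from bisect import insort


class ToyTable:
    """Toy model of the (row key, column key, timestamp) -> bytes map."""

    def __init__(self):
        # Only cells that were actually written are stored, so empty
        # cells take up no space (the "sparse" part).
        self._cells = {}
        # The same keys, kept in sorted order (the "sorted map" part).
        self._keys = []

    def put(self, row, column, timestamp, value):
        key = (row, column, timestamp)
        if key not in self._cells:
            insort(self._keys, key)
        self._cells[key] = value

    def get(self, row, column):
        """Return the value with the highest timestamp, or None if the cell is empty."""
        versions = [k for k in self._keys if k[0] == row and k[1] == column]
        return self._cells[versions[-1]] if versions else None

    def scan(self, start_row, end_row):
        """Yield (key, value) pairs in key order for start_row <= row < end_row."""
        for key in self._keys:
            if start_row <= key[0] < end_row:
                yield key, self._cells[key]


table = ToyTable()
table.put("com.cnn.www", "contents:", 3, b"<html>v1")
table.put("com.cnn.www", "contents:", 5, b"<html>v2")
table.put("com.bbc.www", "anchor:cnnsi.com", 9, b"CNN")

print(table.get("com.cnn.www", "contents:"))         # newest version of the cell
print(table.get("com.cnn.www", "anchor:cnnsi.com"))  # never written, so None
```

Because the keys are kept sorted, a range scan over contiguous row keys only has to walk neighbouring entries, which is why row key design matters so much in practice.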

From the documentation: "A Cloud Bigtable table is sharded into blocks of contiguous rows, called tablets, to help balance the workload of queries. (Tablets are similar to HBase regions.) Tablets are stored on Colossus, Google's file system, in SSTable format."

This means that Bigtable nodes don't store the data themselves - it all lives on Colossus. So moving a tablet from one node to another is fast, because you just move pointers - no need to copy data around.
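A minimal sketch of the "just move pointers" idea, assuming tablets are identified by the first row key they cover. The split keys, node names and the functions `tablet_for`, `node_for` and `move_tablet` are all hypothetical and only illustrate the concept; they are not part of any Bigtable API.

```python
from bisect import bisect_right

# Each tablet covers a contiguous row range starting at its split key.
# The first tablet starts at "" so that every row key has an owner.
split_keys = ["", "g", "n", "t"]

# Tablet index -> node currently serving it. Only this mapping lives with
# the nodes; the rows themselves are assumed to sit in shared storage.
assignment = {0: "node-a", 1: "node-a", 2: "node-b", 3: "node-b"}


def tablet_for(row_key):
    """Return the index of the tablet whose row range contains row_key."""
    return bisect_right(split_keys, row_key) - 1


def node_for(row_key):
    """Return the node currently serving the tablet that holds row_key."""
    return assignment[tablet_for(row_key)]


def move_tablet(tablet, new_node):
    """Rebalance by changing which node serves a tablet; no rows are copied."""
    assignment[tablet] = new_node


print(node_for("house"))   # served by node-a
move_tablet(1, "node-b")   # rebalance: a single pointer update
print(node_for("house"))   # now served by node-b
```

The cost of `move_tablet` does not depend on how much data the tablet holds, which is the point the paragraph above makes.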

When you update a row, the update is written sequentially rather than in place, so each update takes up additional space. Compaction happens only periodically; when it does, the superseded data is eliminated.
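The update-then-compact behaviour can be sketched with an append-only log. This is a simplified illustration under the assumption that only the latest value per cell needs to survive compaction; the class `ToyRowLog` and its methods are hypothetical names, not Bigtable internals.

```python
class ToyRowLog:
    """Append-only store: every write adds an entry, compaction drops old ones."""

    def __init__(self):
        self.entries = []  # (row, column, value) tuples in write order

    def write(self, row, column, value):
        # Updates are appended, never applied in place, so they cost extra space.
        self.entries.append((row, column, value))

    def read(self, row, column):
        # The most recent entry for a cell wins, so search from the end.
        for r, c, v in reversed(self.entries):
            if (r, c) == (row, column):
                return v
        return None

    def compact(self):
        # Keep only the last value written for each (row, column) cell.
        latest = {}
        for r, c, v in self.entries:
            latest[(r, c)] = v
        self.entries = [(r, c, v) for (r, c), v in latest.items()]


log = ToyRowLog()
log.write("r1", "cf:a", b"1")
log.write("r1", "cf:a", b"2")   # update: a second entry for the same cell
log.write("r2", "cf:a", b"x")
print(len(log.entries))          # 3 entries before compaction
log.compact()
print(len(log.entries))          # 2 entries after compaction
```

Reads return the same values before and after `compact()`; only the space used changes.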

Access control happens at project level - not at table level.

Resources

Bigtable paper (2006): http://static.googleusercontent.com/media/research.google.com/en/us/archive/bigtable-osdi06.pdf

Flags


Google Cloud/Bigtable: Difference between revisions

From charlesreid1
