From charlesreid1

Line 17: Line 17:
</pre>
</pre>


==How many binary search function calls does it take to find one person in the million-person Manhattan phone book==
==Examples==
 
===How many binary search function calls does it take to find one person in the million-person Manhattan phone book===


The above lookup table can be used to answer this question.
The above lookup table can be used to answer this question.
Line 31: Line 33:
<math>
<math>
\log_{2}(1,000,000) \sim log_{2}( 2^{20} ) \sim 20
\log_{2}(1,000,000) \sim log_{2}( 2^{20} ) \sim 20
</math>
===How many strings with a specified length of 140 characters are in 10 terabytes?===
Rumor has it Twitter generates around 10 TB a day. How many 140 character strings would that be, in Python?
Using compact arrays (which Strings are, by default), storing characters as unicode (16 bits, or 2 bytes, per character).
Taking the overly simplistic view that there is around another 100 characters attached with each tweet to indicate username, dates, etc., would make each string:
<math>
256 \text{unicode characters} \times \dfrac{16 \text{bits}}{\text{unicode character}}
</math>
</math>



Revision as of 07:19, 29 May 2017

Estimating Bits and Bytes

Via Richard Feynman's Computing Heuristics lecture on Youtube

The following extremely useful relation, specified in IEEE 1541-2002

Powers of 2^10 (1024) are very close to powers of 1000.

        2^0 =                      1 ~ 1000^0
[kibi] 2^10 =                   1024 ~ 1000^1
[mebi] 2^20 =                1048576 ~ 1000^2
[gibi] 2^30 =             1073741824 ~ 1000^3
[tebi] 2^40 =          1099511627776 ~ 1000^4
[pebi] 2^50 =       1125899906842624 ~ 1000^5
[exbi] 2^60 =    1152921504606846976 ~ 1000^6

Examples

How many binary search function calls does it take to find one person in the million-person Manhattan phone book

The above lookup table can be used to answer this question.

The binary search is an O(log n) operation, assuming the array is sorted (like a phone book). In this case, we want log base 2 of one million,

$ \log_{2}(1,000,000) $

Log base 10 is easy enough, but we have log base 2. Using the above table, 1M ~ 2^20:

$ \log_{2}(1,000,000) \sim log_{2}( 2^{20} ) \sim 20 $

How many strings with a specified length of 140 characters are in 10 terabytes?

Rumor has it Twitter generates around 10 TB a day. How many 140 character strings would that be, in Python?

Using compact arrays (which Strings are, by default), storing characters as unicode (16 bits, or 2 bytes, per character).

Taking the overly simplistic view that there is around another 100 characters attached with each tweet to indicate username, dates, etc., would make each string:

$ 256 \text{unicode characters} \times \dfrac{16 \text{bits}}{\text{unicode character}} $

Flags





See also: