- 1 Notes
- 1.1 Goodrich Chapter 3 Notes
- 1.2 Skiena Chapter 1 and 2 Notes
- 1.2.1 Analyzing algorithms
- 1.2.2 Big O, Big Omega, Big Theta
- 1.2.3 Heuristics
- 1.2.4 List of functions
- 1.2.5 Analyzing two sort functions
- 1.2.6 String Pattern Matching
- 1.2.7 Matrix Multiplication
- 1.2.8 Logs and Binary Search
- 1.2.9 Logs and Bits
- 1.2.10 Logs and Multiplication
- 1.2.11 Fast exponentiation
- 2 Questions
- 2.1 Goodrich Chapter 3 Questions
- 2.2 Skiena Chapter 1 Questions
- 2.2.1 Proofs by Induction
- 2.2.2 Estimation
Goodrich Chapter 3 Notes
date structure - systematic way of organizing and accessing data
algorithm - step by step procedure for performing a task in a fixed amount of time
To time algorithms: distinguish between:
- wall time (less fair, dependent on environment) time.time
- number of cpu cycles used by algorithm (better) time.clock
- timeit/built in profiling tool (best)
Moving beyond experimental analysis:
- Independence from hardware/software
- Use higher level analysis
- Account for all possible inputs
Worse case versus average case
- Algorithm speed changes with different inputs
- Worst case is easiest to determine
- Average case requires probability distribution over inputs
Seven functions for algorithm analysis:
1. Constant function
2. Logarithmic function
3. Linear function
4. n log n function
5. Quadratic function
6. Cubic/polynomial function
7. Exponential function
Example: find_max function
- Runtime: 2 statements always run, plus the for loop
- Probabilit/worse case/average case;
- The worst case assumes we have to run biggest = k every time through the for loop
- Average case analysis:
- If there are n numbers total, and if we have just looked at j numbers so far, the probability that the jth number will be the new largest of j numbers is 1/j
- The number of times biggest number is updated, when we run n times, is the sum of each probability,
nth Harmonic number
Harmonic numbers increase as - sometimes this is the source of a "magic log" term that appears.
- For any integer n>= 0 and any real number a > 0, a neq 1,
This applies to finding the largest integer representable with n bits:
Big O notation:
- If we have two functions f(n) and g(n) mapping positive integers to positive real numbers, wthen we say f(n) is O(g(n)) if, for a real constnat,
"f(n) is gib-O of g(n)
Alt: "f(n) is order of g(n)"
Alt: "f(n) is O(g(n))"
Big o notation allows ignoring constant factors and all lower order terms
Big omega notatino:
- Big O means less than or equal to
- Big Omega means greater than or equal to
- Take the same two functions f(n) and g(n),
- We say that f(n) is Omega(g(n)) if
Example: 3 n log n - 2 n is Omega(n log n)
- To compare algorithms, look at two things:
- Growth rates of algorithms as n increases
- Maximum problem size (n) that can be solved as t increases
- Examples of algorithm analysis:
- Length of list - O(1)
- Accessing element of list - O(1)
- Finding maximum: worst case vs aerage case
- Prefix averages
- Three-way disjointness
- Element uniqueness
- If we want to find average of sequence S of n numbers, in another sequence A, such that A[j] is average of first j numbers,
- Implementation 1: nested for loops (poor performance, O(n^2))
- Implementation 2: uses colon indexing notation, like , but no better performance, still O(n^2)
- Implementation 3: keep a running total, and use a cumulative algorithm
Three way set disjointness:
- Finding items that are in all 3 sets (if not, 3 sets are disjoint)
- 2 implementations:
- Triple nested for loop, if a==b==c (wastes time, O(n^3))
- Double nested for loop, if a==b, then another for loop, if b==c, O(n^2)
Sorting as an algorithm tool:
- Sort before searching. Sorted function is O(n log n)
- Limits cost to cost ofsearch
Justifying algorithm cost:
- Proof by example/counter-examkple
- Proof by contradiction/contrapositive
- use of DeMorgan's Law
- Induction and loop invariants
- To prove Fibonacci number nl, F(n), is less than 2^n: First show
- Then show
- Then we know
- Prove a statement L about a loop
- Show L0 (initial claim) is true before the loop is run
- Show that if L(j-1) is true before an iteration, then L(j) is true after an iteration
- Final statement, Lk implies L is true.
Skiena Chapter 1 and 2 Notes
RAM model of computation - each simple operation takes exactly one time step. Loops and subroutines are not simple operations.
Algorithms can be understood and studied in a language- and machine-independent manner.
Three ways of looking at runtime:
- worst-case complexity
- average-case complexity
- best-case complexity
Why average case complexity so difficult? To illustrate: casino example.
Try to project what will happen if you bring n dollars into a casino to gamble. The best case, that you walk out owning the place, is possible but so unlikely that you should not even think about it. The worst case, that you lose all n dollars, is easy to calculate and distressingly likely to happen. The average case, that the typical bettor loses 87.32% of the money that he brings into the casino, is difficult to establish and its meaning subject to debate. What exactly does average mean? Stupid people lose more than smart people, so are you smarter or stupider than the average person, and by how much? Card counters at blackjack do better on average than customers who accept three or more free drinks. We avoid all these complexities and obtain very useful result by just considering the worst case.
That's concisely expressing in language the complexity of the parameter space . The average time is expressed mathematically as:
where is algorithm time as a function of the input parameters (this is the function that maps the input parameter values to the output, the computational cost. The probability distribution is the probability distribution that integrates to 1 when integrated over the entire input parameter domain :
Note: this comes from the definition of the average of a function, from calculus 2:
Big O, Big Omega, Big Theta
- means is an upper bound on . There exists a constant such that for
- means is a lower bound on . There exists constant c such that for all
- means the function is bounded by a function within two values of constants, forming a lower bound and forming an upper bound, for
Example question: is ?
Using exponent properties, and the definition of ,
and there are two constants and such that
Another question: is ?
By expanding the quantity on the left hand side, and determining the number of operations,
If then the term
If then the middle term
Algorithms, whatever their big Oh class, take generally the same time for n=10
Any algorithm with runtime is impractical for
Algorithms with a runtime of have a greater operating range, but become impractical for
Linear time and n log n time algorithms are practical for inputs of a billion items
An O(log n) algorithm hardly breaks a sweat for any imaginable value of n
List of functions
List of functions, in increasing order of dominance (more thorough than the Goodrich book):
- Constant functions
- Logarithmic functions
- Linear functions
- Superlinear functions
- Quadratic functions
- Cubic functions
- Exponential functions
- Factorial functions
The sum of two functions is governed by the dominant one:
The product of two functions is governed by the dominant one:
Analyzing two sort functions
String Pattern Matching
Pattern matching is the most fundamental algorithmic operation on text strings.
Problem: Substring pattern matching.
Input: A text string t and a pattern string p.
Output: Does t contain the pattern p as a substring, and if so, where?
Logs and Binary Search
Binary search is an O(log n) algorithm.
To find a given name in the one-million-person Manhattan phone book, you only need 20 comparisons.
How many lookups would it take to find one thing in 1000 items?
The height h of a rooted binary tree with n leaf nodes is log(n)
Logs and Bits
The number of bits we need to represent any one of n different possibilities (n items, or the n integers 1 to n), using bits, is log(n) bits. (Log base 2, of course.)
Logs and Multiplication
Logarithms can be used to multiply numbers (principle of the slide rule).
Now can use exponent and log functions to calculate any exponent , like so:
The simplest algorithm to raise some number a to the power of n would require n-1 multiplications.
However, can also observe that n = floor(n/2) + ceil(n/2)
Then we can split up the exponent as:
and if n is odd,
This eliminates half of the total multiplications, and only introduces two new multiplication operations.
func power(a,n): if(n==0) return(1) x = power(a, floor(n/2)) if(n is even) then return(x^2) else return(a*x^2)
Goodrich Chapter 3 Questions
Sum of first n even numbers
Question R-3.6: What is the sum of the first n even numbers, 0..2n, for n > 0?
Answer: Use the sum of integers formula,
Sum of first n even numbers is .
Solving 3 way disjointedness problem in n log n time
Question C-3.35: Assuming it is possible to sort n numbers in O(n log n) time, show it is possible to solve 3-way set disjointedness problem in O(n log n) time.
Answer: First, we sort 3 sets. That takes 3 * n log n units of time.
Then we can iterate over each element of a, which costs us n, and inside of that loop we can perform a binary search, which costs us log n, for an overall cost of n log n.
If we find b in a, then we perform another binary search for c in a. This will cost us log n.
Is the worst case then still O(n log n)? Worst case is, every element in A also in B, so we perform two binary searches each time. 2 * log n still does not cost us that much.
finding 10 largest elements
Question C-3.36: Describe an efficient algorithm for finding the 10 largest elements of a sequence of size n. What is the runtime of this algorithm?
Answer: sort n elements in O(n log n) time.
- Get 10 items from the back (biggest) - or if you want the smallest, from the front.
O(n) algorithm calling O(n2) algorithm
Question C-3.38: Show that is O(n^3).
Answer: This is the notation we use when we need to say, "we are running a loop, and inside the loop calling a function whose cost is O(i^2)."
We know (via the story about Carl Gauss - 5050):
We can also write:
(Derivation requires some series manipulation, starting with sum over k^3 and eventually getting to k^2
This shows that when an algorithm that runs n times, O(n), calls an algorithm that runs in O(i^2) time, it results in an O(n^3) algorithm overall
Run a scaling analysis to show Python sorted function is n log n
Question P-3.57: Perform experimental analysis to test hypothesis that Python's sorted() function runs in O(n log n) time.
Skiena Chapter 1 Questions
Proofs by Induction
Sum of integers
Question: Prove by induction that for
Answer: start by letting n = 1, then we see that the expression holds.
Then we let n = 2, and we see that the expression holds.
Finally, we assume that the expression is true for n, and see if that implies it is true for n+1. The strategy here is, peel off the one largest term, then substitute the expression (which we assumed is true) and see what shakes out.
Sum of squares of integers
Question: Prove by induction that for by induction.
Answer: same strategy as before: verify the case is true for n = 1, then verify the case is true for n = 2.
Then, we assume the nth case is true, and check if that implies the n+1th case is true:
For this one, it's a little bit easier to work backwards from the given formula, plugging in (n+1) for n,
Sum of cubes of integers
Question: Prove that for , by induction.
Answer: Same strategy as above.
Sum of product of neighbor numbers
Question: Prove that .
Let n=1, prove the case of n=1. Let n=2, prove the case of n=2.
Then, if the nth case is true, the n+1th case become:
The last two terms can be factored by grouping, to form a fourth linear term (n+4), so
which was the premise to be proved.
Sums of expression divisible by 3
Question: Prove by induction that is divisible by 3 for all .
Start with case of n = 1, and n = 2. Both divisible by 3.
Then we assume the nth case is true, and see if it implies the n+1th case:
The first term on the right hand side is divisible by 3 by our assumption, and the last term on the right is definitely divisible by 3.
Pages of books
Question 1-19 a) Do all of the combined the books you own have at least 1M pages?
Each book has O(100) pages - let's say, O(250) pages. Using either 1E2 or 2.5E2 as number of pages per book.
100 pages per book: 1 million pages is 1E6 pages x (1 book/1E2 pages) = 1E4 books = 10,000 books. Nope.
250 pages per book: 1 million pages is 1E6 pages x (1 book/2.5E2 pages) = 4E3 books = 4,000 books. Still nope.
Question 1-19 b) How many total page are in the school library?
Cascading estimations here.
1 book ~ 100 pages.
1 shelf ~ 10 books.
1 long bookcase ~ 100 shelves.
1 section ~ 10 long bookcases.
1 small library ~ 10 sections
1 big library ~ 100 sections
1 very large library ~ 1000 sections
(Note: can squeeze in additional factors of 10 in by turning one 1 into one 2.5 and one 1 into one 4.)
Number of pages in the library:
1 small library x (10 sections/1 lib) x (10 bookcases/1 section) x (100 shelves/1 bookcase) x (10 books/1 shelf) x (10 pages/1 book)
= 10 x 10 x 100 x 10 x 10
= 100 x 100 x 100
= 1 million pages for a small library (10k books)
= 10 million pages for a big library (100k books)
= 100 million pages for a very large library (1M books)
(This estimate does seem on the low side, probably additional factors of 10 missing. Plus or minus 2 orders of magnitude.)
Note: the NY Public Library is estimated to have somewhere around 3 million books. So, this seems to be in the right general ballpark.
Number of words in Skiena textbook
Question: Number of words in the Skiena textbook?
A single full page is about 15 words per line, about 40 lines per page, for a total of about 600 words per page.
The total number of pages is 700.
So the absolute upper bound, assuming every page is densely packed, is 420,000 words.
We can get a better estimate by assuming that 2/3 of the pages are full (600 words), and 1/3 of the pages are occupied by tables, figures, and white space (half full, or 300 words).
Do the half-full pages plus the full pages:
Nwords = 700 pages x (33 half-full pages/100 pages total) x ( .50(600) words / 1 half-full page) + 700 pages x (66 full pages/100 pages total) x (1.0(600) words / 1 full page) Nwords = (700)(600)(.33*.5 + .67) Nwords = 420000(.7) Nwords = 290,000
So the grand total is in the ballpark of 300,000 words. (Intuitively, this seems a bit on the long side.)
Number of hours in 1M seconds
How many hours are in 1M seconds? How many days? Do the math in your head.
Here we use that trick of using quarters when doing scale analysis.
3600 seconds per hour ~ 2500
24 hours per day ~ 25
1 M seconds = 1E6/2.5E3 = 4E(5-3) = 4E2 = 400 hours
1 M seconds = 400 hours/(25 hr/day) == 16 days
The actual value?
1E6 x (1/3600) x (1/24) = 11 days
Trick here is not to do just an order of magnitude estimate, but to make use of quarters as well. 1000/25, not just 1000/10. (If you don't do this, twice, you'll be off by an order of magnitude.)
Number of towns and cities in US
Question: Estimate the number of towns and cities in the United States.
Here we use the 80/20 heuristic. We assume that 80% of the population lives in large cities, the remaining 20% live in smaller cities.
Then, of that remaining 20%, we assume that 80% of them live in large cities, and 20% of them live in smaller cities.
Of that remaining 20%, etc.
Do this multiple times, splitting into groups that live in cities of 1M, 100k, 10k, 1k, 100, and 10. Then we get:
(Note: the drawing includes cities of >10M separately from 1M, but this throws the estimate off by an order of magnitude on the small side. Not enough megacities in the US.)
When we combine this with the total population of 3 M people, we get the total number of cities/towns that people live in (that is, if 80% of 300 M people live in cities of 10 M people or more, we can determine how many 10 M person cities there are; etc.)
Here's the nested expression (p = people):
Plugging this into Wolfram Alpha gives 17,040.
This is pretty damn close to the number of "incorporated places" listed by the census: 19,354 .
Sorting algorithm time
A sorting algorithm takes 1 second to sort 1000 items.
How long will it take to sort 10,000 items if:
a) O(n^2) algorithm?
b) O(n log n) algorithm?
a) If O(n^2) algorithm:
which gives us a time per operation of 1 microsecond:
Now, for 10,000 items,
and this gives a total time of
The quadratic algorithm will sort 10,000 items in 100 seconds.
For the superlinear algorithm,
This gives time per operation of
So for 10,000 items,
This gives a total time of
The superlinear algorithm will sort 10,000 items in 120 seconds.
Computer Sciencenotes on computer science topics on the wiki, for educational and learning purposes
Part of the 2017 CS Study Plan.
Flags · Template:CSFlag · e