From charlesreid1

Breakdown

This was an extremely interesting problem. When I first read through the problem, the solution came to me almost immediately - if we want to know how many divisors a number has, it relates to its prime factorization, and if we want a number with eight divisors, we want a number whose prime factors can be rearranged in exactly 4 distinct ways (leading to 8 pairings, or 8 divisors).

This is confirmed by examining the examples the problem gives. Several integers that are below 100 and have exactly 8 divisors are given in the problem. The prime factorizations of each number followed certain patterns. These patterns are:


p_1 p_2 p_3

and


p_A^3 p_B

Why these patterns? It's because these are the two kinds of groupings of objects that lead to exactly four ways of partitioning the objects. The first is if we have 3 distinct objects, which can be grouped in 4 distinct ways:

A B C |
A B | C
A | B C
A C | B

and if we have 3 identical items and one unique item, these can also be arranged in exactly four ways:

s s s T | 
s s s | T
s s | s T
s | s s T

The problem statement gives two cases where this happens, and leaves you stumped when your code gives you 179 numbers with exactly 8 divisors, instead of the information given in the problem, which is 180. I struggled over this for a while. I thought it might be a bug in my code, but typically these kinds of bugs are symmetric in some way, so I would be leaving out two or more numbers with 8 divisors.

The next thing I checked was the form of the numbers I was generating, which was either the product of 3 distinct primes:


p_1 p_2 p_3

and


p_A^3 p_B

I was generating every combination of these numbers for every prime below the maximum, in this case 1000, and things seemed to be working out okay, except that I was missing one. When I ran it on the case of a 1000000 maximum, I was off from the information given in the problem by four. The challenge, of course, is that when I spot checked my list of 179 numbers, all of them had exactly 8 divisors - it wasn't that I was generating too many numbers. It was that some number not included on my list had 8 divisors, and I had to find it.

This was a cryptic problem. I tried investigating whether the primes I was using could, in fact, be composite numbers. So, if I substitute 4 for pA and 6 for pB, does that result in a number with 8 factors? The answer, of course, is no. These definitely need to be primes. What if pA and pB, or p1 and p2 and p3, are not distinct primes? This led to more numbers with fewer or more than 8 factors.

I tried pA = 2 and pB = 2, which led to 2^4 or 16, which did not have 8 factors. I tried p1 = 4 and p2 = 2 and p3 = 4, totaling 2^5 or 32, which also did not have 8 factors. Then I tried pA = 4 and pB = 2, for a total of 2^7 or 128, and to my surprise, 128 had 8 factors. Here was the missing number! 128 was not in the list of numbers with eight divisors, and its prime factorization did not fit the two forms given above. The procedure for finding 2^7 and the rule for it are a bit different. The rule relates to how we can use the prime numbers to construct new numbers, as we saw in Problem 500, the prior problem.

(To give it away: I was missing a prime of the form p_r^7. More details below.)

Generating composites from primes

Let's go back to Problem 500 for a moment.

There, we were trying to construct numbers with a specified number of divisors. To do that, we constructed a table with the prime numbers, with columns arranged in a particular way: n, n^2, n^4, n^8, and so on. Then, to construct new numbers, we found new combinations of numbers from each column.

The reason we have only powers of 2 for these primes is, any number can be expressed as the sum of powers of 2 - i.e., can be represented in binary. If we had an odd power of n, say n^5, we could write in terms of powers of 1 and 2, like this: (n)(n^4). Likewise for n^13, we could write it as (n)(n^4)(n^8).

That is, we can take n raised to an integer power, and we can express that integer power as a binary number; that binary number yields the order of selection of the primes from the table.

Generating composites with 8 divisors

Back to our problem at hand. We are looking for a third combination of prime numbers that will give us numbers with 8 divisors. If, instead of using primes in the above expressions, we instead use even powers of primes, we get a whole new class of numbers with 8 divisors, which do not show up until at least 128.

For example, a number with 8 divisors can be formed by letting the primes be distinct powers of the prime 2:


\begin{align}
p_1 &=& 2 \\
p_2 &=& 2^2 \\
p_3 &=& 2^4
\end{align}

That forms the candidate number:


c = 2^{4+2+1} = 128

which, indeed, has 8 divisors:

WolframAlpha 128 8factors.png

We were only off by 1 (our program predicted 179 numbers with 8 factors below 1,000, instead of the problem statement's 180) below 1,000 because the next such candidate number is:


c = 3^{4+2+1} = 2187

which has 8 divisors:

WolframAlpha 2187 8factors.png

and here are the values of the seventh power of the first few primes:

2		128
3		2187
5		78125
7		823543
11		19487171
13		62748517
17		410338673
19		893871739

which all have 8 divisors:

WolframAlpha 893871739 8factors.png

All of which is to say, in the process of generating numbers with 8 divisors, we need to include numbers of the form p_r^7.

Procedure

Now, the procedure is as follows:

  • Generate various combinations of three primes p_1, p_2, p_3 whose product is less than the maximum; those numbers are numbers with 8 divisors of the form p_1 p_2 p_3. The range for these numbers should be all prime numbers from 2 to max/6.
  • Generate combinations of two primes p_A p_B, and generate the numbers p_A^3 p_B and p_A p_B^3. Keep the products that are less than the maximum; those numbers each have 8 divisors. The range for these numbers should be all prime numbers from 0 to the square root of max.
  • Generate primes p_r and corresponding numbers p_r^7. Keep results that are less than the maximum; those numbers each have 8 divisors. The range for these numbers should be from 0 to the seventh root of max (square root of square root, if you're lazy).

Once we do that, we're all set. Indeed, this procedure gets me correct answers for all numbers up to 1000 that have 8 divisors, and also gives me the correct answer for all numbers up to 1,000,000 that have 8 factors. It's also blazing fast, finding the correct number of composites up to 1 million with 8 divisors and returning it in less than half a second:

$ javac Problem501.java && time java Problem501
Finished with first part. On to second part...
Finished with second part. On to third part...

Number of integers with exactly 8 divisors:
224427

real	0m0.286s
user	0m0.512s
sys	0m0.052s

But when we raise the ceiling to 1 trillion, it chokes.

Not So Fast, Eratosthenes

So, what's the problem here?

The problem is that we are trying to count to a trillion - possibly several times.

Why are we trying to count to a trillion? We're trying to assemble numbers with 8 divisors, and one possible form of these numbers is:


p_1 p_2 p_3

where the p's are arbitrary prime numbers.

The best case scenario is that these three primes are all around the same size, the cube root of our max (the cube root of a trillion is only 1,000).

But worst case scenario is that we have two very small primes like 2 or 3, and the last prime is an integer near 1000000000000/6. That means that to FIND those primes, we're going to be running a number sieve and actually counting to almost a trillion.

To put a trillion into perspective, the typical computer can do a billion operations per second. Doing anything interesting takes around a hundred instructions, so that leaves us with around ten million operations per second. A simple back of the envelope tells you that you're going to be waiting


\dfrac{10^{12} \mbox{  ops}}{ 10^7 \mbox{  ops/s}} = 10^5 \mbox{  s} \sim 1 \mbox{  day} 4 \mbox{ hours} \dots

Keep in mind, this is optimistic. A factor of 10 could ruin your week.

This means our approach to the problem with a maximum of a million is much simpler than our approach with a maximum of a trillion, because suddenly we're doing everything we were doing before, a million times. Even the smallest inefficiencies will snowball.

Solving this problem requires a couple of tools, beginning with a prime number sieve that can start at an arbitrary location, instead of starting at 2. The term for this type of prime number sieve is a segmented Sieve of Eratosthenes; it is a prime number sieve taking lower and upper bound arguments.

Because this is a combinatorics search problem, we also need to spend additional time analyzing the problem space and finding some elegant shortcuts to save ourselves time.

Segmented Sieve

See Segmented Sieve

A Long Hiatus

(We let this problem sit and stew for a few years without coming back to it)

Redux

On revisiting Project Euler Problem 501, here's the crux of the problem as we left it:

  • We want to count the number of integers less than a trillion that have exactly 8 divisors.
  • We know that such numbers can take several forms, involving prime factors raised to various powers. Specifically:

Form 1:


p_A p_B p_C

Form 2:


p_1^3 p_2

Form 3:


p_r^7

  • A naive brute-force approach will work with a maximum around a million, but gets swamped by calculations as we move max into the billions range.
  • The real problem has a maximum of a trillion, so we have to re-think our approach.

We summarize the solution procedure for each of the three forms below.

Form 3

Numbers with eight divisors of Form 3 are of the form:


p_r^7

Form 3 is definitely the easiest case.

We start by using the fact that \sqrt[7]{10^{12}} \sim 51, which tells us that any prime number below 52 will generate a number of Form 3 that is less than a trillion.

Since each prime below 52 will yield one Form 3 number less than a trillion, we just need to count the number of prime numbers less than 52, which happens to be:


\pi(52) = 15

where \pi is the prime-counting function (the wikipedia article gives an excellent introduction, wolfram mathworld provides additional references).

We can do this in Python using Sympy:

$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> sympy.ntheory.generate.primepi(52)
15

Therefore there are exactly 15 numbers with eight divisors that have Form 3.

We're nearly done! :)

Just kidding. :(

Form 2

Numbers with eight divisors of Form 2 have the form:


p_A^3 p_B

We can put an upper bound on the cubed prime by noting that it MUST be less than the cube root of a trillion:


\sqrt[3]{10^{12}} = 10^4

Therefore, we know that the candidates for the cubed prime number must be below 10,000. In fact, because the second prime must be at least 2, we actually know the candidates for the cubed prime number must be below 5,000. This greatly restricts our search space.

However, the smaller our cubed prime gets, the larger the search space for the second prime gets, so if the cubed prime is very small we could end up needing to look for some really big prime numbers.

Here is an example of a best case scenario, where the cubed prime is in the vicinity of 5,000 and the non-cubed prime is a small number:


871,330,142,599 = 4,993^3 \times 7

and whose eight divisors are:


1 \quad 7 \quad 4993 \quad 34951 \quad 24930049 \quad 174510343 \quad 124475734657 \quad 871330142599

But here is an example of a worst case scenario, when the cubed prime is very small (2 cubed) and therefore leaves a huge search space for the remaining non-cubed prime, whose max is 125 billion:


999,999,999,848 = 2^3 \times 124,999,999,981

and whose eight divisors are:


1 \quad 2 \quad 4 \quad 8 \quad 124999999981 \quad 249999999962 \quad 499999999924 \quad 999999999848

While the search space for that last prime can be cut down in short order as the cubed prime is increased (the search space is a trillion cut down by the cubed prime, so 8, 27, 125, 373, etc., which are all still very very large numbers).

Form 1

Numbers with eight divisors are of Form 1 if they are of the form:


p_1 p_2 p_3

(the eight divisors are p1, p2, p3, p1*p2, p1*p3, p2*p3, p1*p2*p3, and 1.)

Form 1 presents the greatest challenge. To find numbers with eight divisors of Form 1, we have a multivariate problem space (3 variables) to search. Each value can only take on specific, unpredictable values (the primes). The product of these three variables must not exceed some maximum.

If we analyze the problem space, we can see it is enormous.

Our best case scenario is that these three numbers are all around \sqrt[3]{10^{12}} = 10^4, in which case each prime cuts down the problem size by as much as possible. Here's an example of a best case scenario, where each number is around the cube root of a trillion:


906,163,920,427 = 9,511 \times 9,719 \times 9,803

whose eight divisors are:


1 \quad 9511 \quad 9719 \quad 9803 \quad 92437409 \quad 93236333 \quad 95275357 \quad 906163920427

On the other hand, the worst case scenario is that the first two primes are very tiny, leaving an enormous search space for the last prime. Here is a worst case scenario, in which the last prime can be any prime number less than a trillion divided by six:


999,999,999,762 = 2 \times 3 \times 166,666,666,627

whose eight divisors are:


1 \quad 2 \quad 3 \quad 6 \quad 166666666627 \quad 333333333254 \quad 499999999881 \quad 999999999762

Some Pseudocode

First, let's state up front that this pseudocode is not going to work. (Note below that we have a triply-nested for loop, with at least one of the loops starting at 2, sooooooooooo yeah good luck with that one.)

The Form 3 pseudocode is easy, we already saw it above. This just counts the number of primes between 2 and the seventh root of our max (a trillion):

fun form_3(n):
    seventhRoot = n^(1/7)
    i = 0
    for p in primesBetween(2, seventhRoot):
        i += 1
    return i

The Form 2 pseudocode starts with the prime with the largest search space (note the p1 and p2 subscripts are interchangeable in the formula above). What is absolutely key here is that the second loop starts where the first loop ends - that is to avoid duplication of effort. This will be enormously helpful in helping to carve up the problem's search space and avoid duplicate efforts.

fun form_2(n):
    cubedRoot = n^(1/3)
    i = 0
    for p1 := primesBetween(2, n):
        for p2 := primesBetween(p1+2, cubedRoot):
            if (p1*p2*p2*p2 < n) && (p1 != p2):
                i += 1
    return i

The Form 1 pseudocode looks similar to the Form 2 pseudocode, but with an additional for loop. Again, the starting and ending points of each loop are linked to avoid duplication of efforts:

fun form_1(n):
    i = 0
    for pA := primesBetween(2,n):
        for pB := primesBetween(pA+2,n):
            for pC := primesBetween(pB+2,n):
                if (pA*pB*pC < n) && (pA != pB) && (pA != pC) && (pB != pC):
                    i += 1
    return i

Mashing Form 1 and Form 2 Together

These nested loops (particularly, the loops for Form 1 and Form 2) are totally infeasible to do separately, so we need to mash them together to do everything at once. (Since Form 3 is trivial to do separately, we ignore it here.)

Also, we need a really, really fast implementation of primesBetween if this code stands any chance of finishing...

We start with the outer loop going from 2 to n, since that is common to both forms.

fun forms_1_and_2(n):
    j_form1 = 0
    j_form2 = 0

    cubeRoot = n^(1/3)

    for p1 := primesBetween(2,n):
        for p2 := primesBetween(pA+2,n):

            # check if Form 2
            if (p2 < cubeRoot) && (p1 != p2):
                if (p1*p2*p2*p2 < n):
                    i += 1

            for pC := primesBetween(pB+2,n):
                # check if Form 1

                if (pA*pB*pC < n) && (pA != pB) && (pA != pC) && (pB != pC):
                    i += 1

            # check if Form 3


        for pB := primesBetween(pA+2,n):
    return i


<pre>
fun form_3(n):
    seventhRoot = n^(1/7)
    i = 0
    for p in primesBetween(2, seventhRoot):
        i += 1
    return i


fun form_2(n):
    cubedRoot = n^(1/3)
    i = 0
    for p1 := primesBetween(2, n):
        for p2 := primesBetween(p1+2, cubedRoot):
            if (p1*p2*p2*p2 < n) && (p1 != p2):
                i += 1
    return i


fun form_1(n):
    i = 0
    for pA := primesBetween(2,n):
        for pB := primesBetween(pA+2,n):
            for pC := primesBetween(pB+2,n):
                if (pA*pB*pC < n) && (pA != pB) && (pA != pC) && (pB != pC):
                    i += 1
    return i

Is It Feasible

If we check the Mathworld page for the prime counting function (http://mathworld.wolfram.com/PrimeCountingFunction.html) we see there are roughly 37 billion primes under a trillion. So even if we have a super efficient prime-finding function, we're going to be counting to 37 billion.

This solution (https://euler.stephan-brumme.com/501/) shows an alternative approach, which involves a prime counting function, but it still requires actually iterating over prime numbers, and it only computes/stores prime numbers up to sqrt(n), so I'm not sure how it deals with some of the corner cases above.

Side Notes

Sympy Prime Counting Function

Brief demo of how to compute the prime counting function using sympy:

$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38)
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sympy
>>> sympy.ntheory.generate.primepi(1e6)
78498
>>> sympy.ntheory.generate.primepi(1e7)
664579
>>> sympy.ntheory.generate.primepi(1e8)
5761455
>>> sympy.ntheory.generate.primepi(1e9)
50847534
>>> sympy.ntheory.generate.primepi(1e10)
455052511

Each takes mere seconds to evaluate. However, when we try

>>> sympy.ntheory.generate.primepi(1e11)

it takes a good 30 seconds or so.

Cpp Prime Counting Function

C++ implementation of a prime counting function: http://am-just-a-nobody.blogspot.com/2015/11/c-code-for-primepi-function.html

Helpful Facts

We know that if we split all the positive integers that have exactly 8 divisors into three classes, Form 1, Form 2, and Form 3, the number of positive integers in each class is:

Form 1: 190,614,467,420

Form 2: 7,297,845,280

Form 3: 15

If you're lame you can just add these and you're done.

Links

Prime Counting Functions:

Prime counting codes:

Flags