From charlesreid1

Mathematics related to cards.

This relies heavily on AOCP/Binomial Coefficients and the Hypergeometric Distribution (counts for sampling without replacement).

All code referenced on this page is available here: https://git.charlesreid1.com/cs/java/src/branch/master/combinatorics/cards

Combinatorics and Counting

Probability of 5-Card Hands

Here, we give two examples that illustrate the care that is needed when setting up combinatorics problems with decks of cards. How we decide to model the process has implications for what type of distributions we use.

We examine two procedures below: in the first procedure, we are looking for five-card hands with at least one particular face value - say, at least once ace - which we model as a single event. In the second procedure, we are looking for five-card hands with exactly one occurrence of a particular face value - exactly one ace - which requires us to think about each card that is dealt as a separate event.

Note that the total number of 5-card hands is given by .

Hands with at least one ace

Suppose we wish to know the probability that a given 5-card hand will contain at least once ace.

In this case, it is slightly easier to enumerate the number of hands that do not contain any aces. We re-pose the problem as counting the number of ways we can form 5-card hands from 48 cards (the 52 total minus the 4 aces). This probability is given by:

This gives the total number of 5-card hands that contain at least one ace as the total number of possible hands minus the number of hands without aces:

If we want a probability, we can compute the number of hands with aces over the number of total hands:

We can empirically verify this by writing up a simple computer program: it creates a deck of cards, shuffles it, picks 5 cards at random, and determines if there is an ace in the hand. It repeats this procedure millions of times. Here are the results of several runs of the program, showing good agreement with theory:

$ javac CountAces.java && java CountAces
N_aces / N_cards = 341129 / 1000000 = 0.3411
N_aces / N_cards = 341178 / 1000000 = 0.3412
N_aces / N_cards = 341416 / 1000000 = 0.3414

Now, you may be wondering why we use the binomial distribution, if we are trying to model a process with replacement? Isn't the hypergeometric distribution more appropriate? (Maybe not after seeing the numbers presented above.) Well, you would almost be correct - except for the fact that we are only analyzing a single event, so it doesn't matter whether we replace or not.

In other words, in this process, success is not defined as "the next card being dealt comes up an ace", with many trials and many outcomes. Success is defined as "one of five cards, pulled at random from the deck, will be an ace." That means that pulling five cards is a single event, a single independent trial, with a single outcome. The program above shuffles the deck between each hand, so each hand is a single, independent trial.

Another way to think about it is, our success criteria applies to all of the cards at once, and is only met or not when all five cards have been dealt.

We will see in the next section that if we are concerned with the actual process of dealing the hand (i.e., we check criteria as the deal is happening), then we must treat each card as a separate event - and then we no longer have independent sampling with replacement, and we cannot use the binomial distribution. Instead, we have sampling without replacement, and we must use the hypergeometric distribution.

Hands with exactly one X

Now suppose we wish to know the probability that a given 5-card hand contains exactly one ace. We must exclude some of the solutions above that contain multiple aces.

To do this, we need to treat each card as a separate event, and analyze the outcomes of each event to rule certain outcomes out. Each card is removed without replacement, so the distribution that we see for each card changes slightly. We deal with this by using a hypergeometric distribution.

We begin by defining the following parameters:

  • K = number of possible success events = 4 aces in deck
  • k = number of success events we want = 1 ace
  • N = size of population being sampled = 52 cards
  • n = size of sample = 5 cards

Then the number of outcomes with exactly k successes in n trials will be given by:

The probability divides outcomes with k successes by total number of outcomes:

(Note that there are other forms of this equation, but I find the form given above is the easiest to remember.)

If we were to change our requirement to hands with exactly two aces, our probability decreases to:

I followed up on this by modifying the card-shuffling program written above to count number of hands containing exactly 1 ace, and then exactly 2 aces. Here is some output from that program:

$ javac CountAces.java && java CountAces
N_aces / N_cards = 340906 / 1000000 = 0.3409
N_aces / N_cards = 341865 / 1000000 = 0.3419
N_aces / N_cards = 340290 / 1000000 = 0.3403
N_twoaces / N_cards = 39990 / 1000000 = 0.0400
N_twoaces / N_cards = 40175 / 1000000 = 0.0402
N_twoaces / N_cards = 39872 / 1000000 = 0.0399

This lists the number of hands with exactly two aces, out of a million trials, along with the corresponding probabilities.

Table of Poker Hand Probabilities

Consider the list of outcomes from a 5-card poker hand:

  • High Card: Highest value card.
  • One Pair: Two cards of the same value.
  • Two Pairs: Two different pairs.
  • Three of a Kind: Three cards of the same value.
  • Straight: All cards are consecutive values.
  • Flush: All cards of the same suit.
  • Full House: Three of a kind and a pair.
  • Four of a Kind: Four cards of the same value.
  • Straight Flush: All cards are consecutive values of same suit.
  • Royal Flush: Ten, Jack, Queen, King, Ace, in same suit.

We will demonstrate computation of probabilities of the occurrence of each hand in 5 random cards being drawn.

The Abbreviated Table

The abbreviated table below gives the probability of a hand falling into each category below. To compile this table, we do not include the higher-ranked hands in lower-ranked hands (e.g., a royal flush is not counted among the straight flush outcomes).

Poker Hand Probabilities


Outcome



Number of Outcomes/Hands



Probability of Outcome/Hand

Royal Flush 1 / 649,740 ~0.0000015391
Straight Flush 3 / 216,580 ~0.00001385
Four of a Kind 1 / 4165 ~0.000240
Full House 6 / 4165 ~0.001440
Flush 12,777 / 649,740 ~0.0019
Straight 5 / 1,274 ~0.003924
Three of a Kind 88 / 4,165 ~0.0211
Two Pairs 198 / 4,165 ~0.04753
One Pair 353 / 833 ~0.422569

The Full Table

In the full table below, we contrast some values from the abbreviated table with their values if we account for alternative, higher-ranked ways of getting that hand.

Poker Hand Probabilities


Outcome



Number of Outcomes/Hands



Probability of Outcome/Hand

Royal Flush
Straight Flush
Straight Flush (incl. royal flush)
Four of a Kind
Full House
Flush
Flush (incl. straight/royal flush/full house)
Straight
Straight (incl. straight flush/royal flush)
Three of a Kind
Three of a Kind (incl. full house/4OAK)
Two Pairs
Two Pairs (incl. full house/4OAK)
One Pair
One Pair (incl. full house/3OAK/4OAK)

Calculating the Probabilities

Note that even with a simple calculation, like how many one-pair hands are possible, gets tricky - some of the one-pair hands will result in three of a kind, some will result in a full house, some will result in two pair, and some will result in a flush.

To do the calculation, we really need to be able to say:

(Number of hands with exactly one-pair) = (Number of hands with at least one pair) - (Number of hands with 3 of a kind) - (Number of hands with full house) - (Number of hands with flush)

Therefore, we start by counting the number of outcomes of the rarest occurrences, and work our way backwards, which allows us to apply the principle of Inclusion-Exclusion to account for the duplicates.

Royal Flush

A royal flush is a 10, J, Q, K, A of the same suit. There are only 4 royal flushes possible (4 suits).

Letting C denote the COUNT of poker hands with this outcome, and P denote the PROBABILITY of a poker hand with this outcome, we have:

We construct this number by first selecting one suit from among 4 - that's - and once we do that, we have exactly one of each type of remaining card that we can select.

We can either think of the choice of suit as an independent choice, whereupon each position ion the hand has exactly 1 card that works, or we can think of it as choosing from among the 4 Aces, and that choice then determines the suit of the remaining cards. Etc.

Therefore the probability of a royal flush is

Here is sample output from my poker hand counting program, showing that after 10 million rounds of poker, we have only seen 13 royal flushes:

$ javac PokerGame.java && java PokerGame
Player 1 got royal flush 13 / 10000000 = 0.00000130

Straight Flush

A straight flush contains cards of consecutive values, all of the same suit.

Using the principle of inclusion-exclusion, we can write the number of hands with a straight flush outcome as:

We should exclude hands with a royal flush, since we already counted them in the royal flush outcomes above. If we included them, we would be over-counting straight flushes.

The procedure for a straight flush is similar to the procedure for a royal flush: we begin by selecting a suit, which is , then we go on down the line and select our cards. If a straight consists of 5 cards, the straight must start with a card between 2 and 10. If the straight starts with 10, it is a royal flush and we have already counted it. Therefore, the first card in the straight flush can be any one of the cards A through 9. That's 9 total choices of cards, from which we are selecting one, for .

Once we select a suit and a starting card, the rest of the cards are determined - each remaining position in the hand has only one card that can fill it.

Therefore the count of straight flush hands is:

and the probability of having a straight flush in one's hand (not counting royal flushes) is:

Notice that the use of the Principle of Inclusion/Exclusion here, while straightforward, was also somewhat subtle. As we accumulate a wider variety of hands, we will need more complicated inclusion/exclusion counts.

Here is output from the Monte Carlo poker hand counting program:

$ javac PokerGame.java && java PokerGame
Player 1 got straight flush 128 / 10000000 = 0.00001280
Player 1 got straight flush 131 / 10000000 = 0.00001310

Four of a Kind

Four of a kind hands have four cards of the same value. Any card may be part of a four-of-a-kind, since all cards appear four times in the deck, therefore our procedure is to (a) pick a face value as our four-of-a-kind face value; (b) multiply the number of possibilities for each remaining position in the hand.

If we can pick any face value, that means we are choosing 1 face value from 13 possible face values, or .

Once we choose the face value, we have fixed our choices for the next four cards, so we have 1*1*1*1.

The last card can be any card at all, except the face value we chose in step 1. That leaves 52-4=48 possible face values for the last remaining card, from which we choose 1, for a total number of combinations of:

There are 156 poker hands containing four-of-a-kind. Note that the inclusion-exclusion principle is not needed here, since four of a kind hands and straight hands are disjoint subsets of the set of all hands.

Here's the output of the poker program counting number of occurrences of four-of-a-kind hands in rounds of 1 million hands:

$ javac PokerGame.java && java PokerGame
Player 1 got four 254 / 1000000 = 0.00025400
Player 1 got four 257 / 1000000 = 0.00025700
Player 1 got four 233 / 1000000 = 0.00023300
Player 1 got four 252 / 1000000 = 0.00025200
Player 1 got four 224 / 1000000 = 0.00022400

Full House

A full house is a pair and three of a kind. To count the number of hands with a full house, we start by picking a face value for our pair, then picking a face value for our three of a kind set.

We start by selecting a face value for our pair cards, which is equivalent to selecting one face value from among 13 possible face values (2 thru Ace). That's outcomes. Once we've set the face value of the pair of cards, we have 4 possible cards (one of each suit), from which we are selecting 2. That's outcomes for the second card of the pair.

Next, we select a face value for the three of a kind subset. Note that this cannot be the same face value as the choice of face value for the pair, so we only have 12 face values left, from which we select 1 face value, for choices. Next, we select three cards. These cards must be the same face value as the first card in the three of a kind subset. There are only 4 occurrences of that face value to begin with, from which we are selecting 3, for outcomes.

This gives the final expression for number of full house combinations:

As before, no full house hands are subsets of any previously-appearing class of hands, so we do not need to use the inclusion-exclusion principle here. The probability of a full house is therefore given by:

We get good agreement with our empirical poker simulations as well:

$ javac PokerGame.java && java PokerGame
Player 1 got full house 1472 / 1000000 = 0.00147200
Player 1 got full house 1489 / 1000000 = 0.00148900
Player 1 got full house 1469 / 1000000 = 0.00146900

Flush

A flush is a run of 5 cards, all of the same suit.

Let F denote the set of all hands containing a flush but no higher outcome

Let A denote the set of all hands containing a flush

Let B denote the set of all hands containing a straight flush but winning due to a straight flush outcome

Let C denote the set of all hands containing a straight flush but winning due to a royal flush outcome

Then we have:

where denotes the cardinality of Z.

We can get the cardinality of A, and have already obtained B and C. A is the number of hands that contain a flush. To determine this:

Start by selecting one of four possible suits. That's .

Now the first card can be any card of that suit, so there are 13 possible cards to choose from. (Note that some of these will lead to a straight flush or a royal flush, but we will discount these later.) The second card cannot be identical to the first, but there are no other restrictions. The third, fourth, and fifth cards are all similar - there are no particular restrictions on these cards. Thus, we are actually choosing all five cards from 13 possible cards, for a number of outcomes equal to .

The total number of hands with flushes, and possibly something higher, is therefore:

But this is and not . Discounting the 40 outcomes that result in straight flushes or royal flushes yields:

for a total probability of:

Again, this number was verified empirically with the poker program:

$ javac PokerGame.java && java PokerGame
Player 1 got flush 2031 / 1000000 = 0.00203100
Player 1 got flush 1956 / 1000000 = 0.00195600
Player 1 got flush 2016 / 1000000 = 0.00201600

Straight

A straight is a run of cards that follow in sequence.

Five cards in sequence mean that we can start at any one of the cards A through 10, of any suit. That gives for the first card.

We assume that the selected card is the lowest card (we can do that because order is unimportant). Then the sequence can only increase, and the first card in the sequence determines the face value of all other cards in the sequence. But each card can be any suit, so the second, third, fourth, and fifth cards are each selected from a pool of 4 possible cards, for each. The total number of hands containing straights (including higher-ranked outcomes, royal and straight flushes) is:

However, this count includes royal flushes and straight flushes! Let us denote the sets

S = Set of hands containing straights and nothing higher

A = Set of hands containing straights+

B = Set of hands containing straight flushes

C = Set of hands containing royal flushes (which are straights)

The cardinality of set A was enumerated in the formula above; the sets B and C were enumerated earlier. These are:

This gives the desired count of hands with a straight:

and the corresponding probability is:

and the corresponding output from the poker program verifies this number:

$ javac PokerGame.java && java PokerGame
Player 1 got straight 3496 / 1000000 = 0.00349600
Player 1 got straight 3542 / 1000000 = 0.00354200
Player 1 got straight 3625 / 1000000 = 0.00362500

Three of a Kind

Next we have three of a kind. Three of a kind cards can be any face value, so we begin by selecting the first card's face value from a set of 13 possible:

The first three cards share a face value, and we are selecting three out of four possible cards with that face value:

Once we select those three cards, the remaining two can be any remaining face value (12 choose 2) and any suit (4 choose 2), for . This restricts some of the previously-occurring hands, however. Alternatively, we can say this is - selecting 2 cards from the 49 cards that remain once we've removed the 3 we chose. (This ends up being a bit more complicated than using the prior expression, but we'll suffer through it.)

But once again, we need to discount some of the three of a kind hands that will result in four of a kind, or in full houses. Let us denote the following sets:

T = set of all hands that win by three of a kind (contain nothing higher)

A = set of all hands that contain three of a kind and possibly something higher

B = set of all hands that contain four of a kind

C = set of all hands that contain full house

then we can write the quantity of interest, the cardinality of T, as:

The set A was enumerated above. This is:

B and C were also enumerated earlier. These are the four of a kind and full house enumerations, respectively:

So our final count of hands that win via three of a kind is:

That's why we started with the royal flush instead of pairs. Simplifying:

With a bit of algebra and the use of AOCP/Binomial Coefficients identities, the above expression can be written in a simpler form. This is what we would have arrived at, had we used the simpler term in place of .

This gives a probability of occurrence of

Verified with the poker program:

$ javac PokerGame.java && java PokerGame
Player 1 got three 21219 / 1000000 = 0.02121900
Player 1 got three 21346 / 1000000 = 0.02134600
Player 1 got three 21142 / 1000000 = 0.02114200
Player 1 got three 21037 / 1000000 = 0.02103700
Player 1 got three 21124 / 1000000 = 0.02112400

Two Pair

The application of the inclusion/exclusion principle is in full effect. Let's count the number of occurrences of hands that win with two pairs, denote this set WPP for winning with two pairs.

To find the set of hands that contain two pairs, start with the first pair. We select the face value of the first pair, and then select 2 cards from the 4 possible, for an overall .

To select the second pair, we have the same situation, but now with 12 cards, so we have .

These two terms can be combined to the simpler .

Now we have selected four cards, and the last remaining card can be any of 11 remaining face values and any of the remaining 4 suits, if we assume the pairs do not match and we do not have a full house. This gives us

Now, we have:

The total probability is then:

Verified with the poker program:

$ javac PokerGame.java && java PokerGame
Player 1 got two 47437 / 1000000 = 0.04743700
Player 1 got two 47804 / 1000000 = 0.04780400
Player 1 got two 47590 / 1000000 = 0.04759000
Player 1 got two 47714 / 1000000 = 0.04771400
Player 1 got two 47101 / 1000000 = 0.04710100

One Pair

Following the steps above, we start by selecting a single face value for the pair from among 13 possible, then chooses two suits from among 4 possible. We then chooses the remaining 3 cards from the 12 possible face values that remain, and select from among the 4 possible suits. Each card has 4 possible suits, so it is .

This gives the binomial sequence:

And the corresponding probability:

Here is the output from the poker program:

Player 1 got one 422599 / 1000000 = 0.42259900
Player 1 got one 422157 / 1000000 = 0.42215700
Player 1 got one 422446 / 1000000 = 0.42244600
Player 1 got one 422868 / 1000000 = 0.42286800
Player 1 got one 422627 / 1000000 = 0.42262700

Runs

Knuth mentions an anecdote about runs - sequences of increasing integers. There was a mathematician/scientist who would play a form of solitaire in which he would place cards on an existing pile as their face value increased, and start a new pile as their face values decreased. The question was, how many piles would result? This is a form of sampling a distribution - the distribution of number of increasing sequences in a randomly shuffled deck.

This problem can be described in more formal terms as searching for the longest fragments in a sequence of integers that appear in increasing, sorted order. For example, a permutation with 4 runs, separated by vertical bars

3 4 5 | 2 | 1 9 10 | 7 49 343

As it turns out, this problem can be formulated in terms of a bivariate exponential Generating Function, where the two variables count the size of the permutation n and the size of the run k.

If counts the number of permutations of length n that contain k runs, then we can write the bivariate generating function as:

Related