Project Euler/59
From charlesreid1
Contents
Problem Statement
This question is about a method of encrypting ASCII text.
Each character in ASCII has some representation (e.g., A=65, *=42, k=107)
Encryption technique is to take text file, convert bytes to ASCII, and XOR each byte with a character from a secret key
Password method: use a password in place of a secret one-time pad; shorter than message, so repeated cyclically throughout method.
You are given a file with encrypted ascii codes, knowledge that plain text must contain common English words, and told encryption key consists of 3 lowercase characters. Decrypt message, find sum of ascii values in original text.
Sample file looks like this: 36,22,80,0,0,4,23,...
Link: https://projecteuler.net/problem=59
Approach
Part 1:
- Create list of most common bigrams in English language
- Sliding window of 2 letters at a time
- For each pair of letters in the ciphertext, for each most common bigram, find the corresponding secret key at that position that would cause this bigram to become the most common bigram
- Keep track of weights too - not just which secret key value would lead to the given bigram, but how common that bigram is
- Keep track of potential secret key values and weights in 3 data structures, one for each potential secret key character; we know which one to use based on character position
Part 2:
- Create list of most common trigrams in English language
- Sliding window of 3 letters at a time
- For each triplet of letters, for each most common trigram, find the corresponding secret key at that position that would cause this trigram to become the most common trigram
Part 3:
- Create list of most common letters in English language, plus weights
- For each "letter" in the ciphertext, for each most common letter, find the corresponding secret key at that position that would cause the ciphertext to become the most common letter
(Part 1 should be enough to get us a key...)
Shorter Direct Approach
Attempting to short-circuit and simplify the problem:
- This problem uses the Vigenere cipher with a 3-letter key
- Generate bigrams, keeping track of relative positions separately (1-2, 2-3, 3-1)
- For each bigram data structure, for each most common English bigram, determine corresponding secret keys, add it to secret keys histogram
- List of most frequent bigrams: https://mathcenter.oxford.emory.edu/site/math125/englishLetterFreqs/
Shortest Brute Force Approach
Given that (a) the solution that the above approach yielded seemed correct, but slightly fishy (the plaintext obtained from the ciphertext had weird capitalization and punctuation issues), and (b) we know the size of the possible key space is only 26^3 = 17,576 (because the problem states the key consists of 3 lowercase letters), we can just use brute force to figure out which key is correct.
Brute force approach:
- Generate all possible 3-letter keys
- Use the generated 3-letter key to convert the ciphertext to plaintext
- Check if the resulting plaintext is English-like, add to list of possible secret keys
- Review output by hand to identify the correct secret key
Code
See https://git.charlesreid1.com/cs/euler/src/branch/main/java/Problem059.java
Flags