One-time pads and the Venona project


Although the two Mafia groups have supplied us with examples of encodings that were particularly trivial to crack, codemakers underestimating the ingenuity, tenacity and intuition of codebreakers is a recurring theme in the history of encryption. Perhaps human intelligence has evolved to be better suited to the goal-oriented challenge of finding patterns than to the open-ended task of hiding them.

Working without a computer may seem a surprising choice in the twenty-first century, but there is an encryption method that offers a remarkably high degree of secrecy without any special technology and without complex mathematics having to be performed by hand. If you distribute a book, for example this one, in advance to everyone who is to encode and decode messages, you can use each successive letter in the book as a separate key for each successive letter of the message to be encrypted.

This book begins with the word OUR, so the first letter in the message would be encrypted using the letter O, which is the fifteenth letter in the alphabet. If the message began with the letter B, it would be replaced by the letter fifteen places after it in the alphabet, which is Q. The person decoding the message would know to use the letter O from the book to decode the letter Q by going back fifteen letters to yield the original B.

The second letter in this book is U, the twenty-first letter in the alphabet. If the second letter in the message to be encrypted were R, the eighteenth letter, we would not be able to advance twenty-one positions from it without reaching the end of the alphabet first. We would then cycle back round to the beginning of the alphabet, treating A as the letter that follows Z. Eighteen plus twenty-one is thirty-nine, and thirty-nine minus twenty-six is thirteen, so the encrypted version of the letter would be M, the thirteenth letter. Again, the recipient of the encoded message would know from the U in the book to go back twenty-one places from the encrypted M, this time cycling round from A back to Z, to retrieve the original R.
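
The letter-by-letter procedure described above can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, letters are counted from A = 1 as in the text, and anything that is not one of the twenty-six letters A to Z is simply dropped.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def letter_number(ch):
    """Position of a letter in the alphabet, counting A as 1 (so O is 15)."""
    return ALPHABET.index(ch) + 1


def shift(ch, places):
    """Move a letter forward (or backward if places is negative), wrapping round."""
    return ALPHABET[(ALPHABET.index(ch) + places) % 26]


def letters_only(text):
    """Upper-case the text and keep only the letters A to Z."""
    return "".join(c for c in text.upper() if c in ALPHABET)


def encrypt(message, book_text):
    """Shift each message letter forward by the number of the matching book letter."""
    message, key = letters_only(message), letters_only(book_text)
    if len(key) < len(message):
        raise ValueError("the book must supply one letter per message letter")
    return "".join(shift(m, letter_number(k)) for m, k in zip(message, key))


def decrypt(ciphertext, book_text):
    """Shift each encrypted letter back by the number of the matching book letter."""
    key = letters_only(book_text)
    return "".join(shift(c, -letter_number(k)) for c, k in zip(ciphertext, key))


print(encrypt("BR", "OUR"))  # QM
print(decrypt("QM", "OUR"))  # BR
```

The two printed lines reproduce the worked examples: B becomes Q under the key letter O, and R becomes M under the key letter U.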

The code that results from using a book as a key cannot be broken by counting how often each letter of the alphabet appears in it, provided that the book is used as a key only once. If, on the other hand, the same book is used many times to encode different messages, statistical frequency analysis can be used to determine probable values for each letter in the book and then to decode the messages. If this book were used as a key source a million times, the relative frequencies of the first letters of the encrypted messages would parallel the relative frequencies with which letters occur in English, but shifted fifteen places along the alphabet, which would identify the first letter in the book as an O.
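
A rough sketch of that frequency analysis follows. It assumes, as the paragraph above does, that the letters found at one position across many messages follow the overall letter frequencies of English. The frequency table holds approximate textbook values, and the chi-squared score and the function names are choices made for this illustration rather than a description of any particular codebreaking tool.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

# Approximate relative frequencies of letters in English text, in percent.
ENGLISH_FREQ = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "R": 5.99, "D": 4.25, "L": 4.03, "C": 2.78,
    "U": 2.76, "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97,
    "P": 1.93, "B": 1.49, "V": 0.98, "K": 0.77, "J": 0.15, "X": 0.15,
    "Q": 0.10, "Z": 0.07,
}


def undo_shift(ch, places):
    """Move a letter back by the given number of places, wrapping round."""
    return ALPHABET[(ALPHABET.index(ch) - places) % 26]


def mismatch(letters):
    """Chi-squared distance between observed letter counts and English."""
    counts = Counter(letters)
    total = len(letters)
    return sum(
        (counts[ch] - total * pct / 100) ** 2 / (total * pct / 100)
        for ch, pct in ENGLISH_FREQ.items()
    )


def guess_key_letter(cipher_letters):
    """Try all 26 shifts and return the key letter whose removal looks most like English.

    cipher_letters holds the encrypted letter found at one fixed position
    in each of many messages that were encoded with the same book.
    """
    best_shift = min(
        range(1, 27),
        key=lambda s: mismatch([undo_shift(c, s) for c in cipher_letters]),
    )
    return ALPHABET[best_shift - 1]  # a shift of 15 corresponds to the letter O
```

Running `guess_key_letter` once per position would recover the book letter by letter, which is why reusing the same key text is so damaging.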

If a message were long enough – perhaps itself book-length – it is not unthinkable that an intelligence service with sufficient resources would be able to reveal the original text even if the book were only used a single time as a key source. Because the relative frequencies of letters in both the book and the original message would be uneven, the relative frequencies of letters in the encrypted message would themselves form a tell-tale signature betraying the fact that the key source must consist of a proper text rather than of random letters. For example, as the letter E (the fifth letter of the alphabet) occurs relatively frequently in English, the letter J (the tenth letter of the alphabet) would occur relatively frequently in the encrypted messages because it would result whenever an E in the message to be encrypted happened to coincide with an E in the book being used as a key.

If a computer had access to electronic versions of all the books that have ever been published in any language – between 100 million and 200 million – it could easily loop through them, trying to decrypt the code with each one. The program would recognise when it had hit on the right book because the relative letter frequencies in the decrypted text would correspond to the expected signature of a known language.
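
The thought experiment can be sketched as a loop over candidate texts. Everything here is an assumption made for illustration: the library is a small dictionary of titles and texts rather than every book ever published, the frequency table holds approximate values for English only, and a chi-squared score stands in for whatever test a real program might use to recognise a language.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

# Approximate relative frequencies of letters in English text, in percent.
ENGLISH_FREQ = {
    "E": 12.70, "T": 9.06, "A": 8.17, "O": 7.51, "I": 6.97, "N": 6.75,
    "S": 6.33, "H": 6.09, "R": 5.99, "D": 4.25, "L": 4.03, "C": 2.78,
    "U": 2.76, "M": 2.41, "W": 2.36, "F": 2.23, "G": 2.02, "Y": 1.97,
    "P": 1.93, "B": 1.49, "V": 0.98, "K": 0.77, "J": 0.15, "X": 0.15,
    "Q": 0.10, "Z": 0.07,
}


def letters_only(text):
    """Upper-case the text and keep only the letters A to Z."""
    return [c for c in text.upper() if c in ALPHABET]


def apply_key(text, key_text, sign):
    """Shift each letter by its key letter's number (A = 1).

    sign=+1 encrypts and sign=-1 decrypts. Output stops at the shorter input.
    """
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * (ALPHABET.index(k) + 1)) % 26]
        for t, k in zip(letters_only(text), letters_only(key_text))
    )


def mismatch(text):
    """Chi-squared distance from English; lower means more English-like."""
    counts, total = Counter(text), len(text)
    if total == 0:
        return float("inf")
    return sum(
        (counts[ch] - total * pct / 100) ** 2 / (total * pct / 100)
        for ch, pct in ENGLISH_FREQ.items()
    )


def find_book(ciphertext, library):
    """Return the title whose text yields the most English-like decryption.

    library maps each title to the full text of that book.
    """
    return min(
        library,
        key=lambda title: mismatch(apply_key(ciphertext, library[title], -1)),
    )
```

With short messages the scores of right and wrong candidates can be close, so a sketch like this is only reliable when the message runs to several dozen letters or more.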

Even though not all books are really available in electronic form, this thought experiment demonstrates the drawbacks of using an existing text as a key. These drawbacks can be avoided by using a random sequence of letters instead. When such a sequence is used only once, to encrypt a single message, it is known as a one-time pad. A one-time pad has the distinction of being the only encryption method that is unbreakable in theory, provided that the sequence is truly random, at least as long as the message, kept secret and never reused.
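
As a minimal sketch, a pad of random letters can be generated with Python's standard `secrets` module, which draws on the operating system's cryptographically strong random source. Whether that source is random enough for a real pad is beyond the scope of this illustration, and the function names are invented here. The shifting scheme is the same A = 1 scheme used with the book.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase


def generate_pad(length):
    """Draw each pad letter independently and uniformly from A to Z."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def letters_only(text):
    """Upper-case the text and keep only the letters A to Z."""
    return "".join(c for c in text.upper() if c in ALPHABET)


def encrypt(message, pad):
    """Shift each message letter forward by its pad letter's number (A = 1)."""
    message = letters_only(message)
    if len(pad) < len(message):
        raise ValueError("the pad must be at least as long as the message")
    return "".join(
        ALPHABET[(ALPHABET.index(m) + ALPHABET.index(k) + 1) % 26]
        for m, k in zip(message, pad)
    )


def decrypt(ciphertext, pad):
    """Shift each encrypted letter back by its pad letter's number."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k) - 1) % 26]
        for c, k in zip(ciphertext, pad)
    )


pad = generate_pad(12)           # a fresh pad for this one message only
secret = encrypt("ATTACK AT DAWN", pad)
print(secret)                    # different on every run
print(decrypt(secret, pad))      # ATTACKATDAWN
```

The pad must be discarded after a single use; generating a new one for each message is what distinguishes this from the book method.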

However, it does not represent a practical answer to most problems that encryption might actually be used to solve. For every secret message that is to be sent, a new random sequence of letters has to be generated. It has to be distributed to all the people who are intended to use it without being intercepted by anybody else. The logistical hurdles these requirements entail are identical to those that motivate using encryption in the first place. If you can send the one-time pad safely without encryption, why not just send the message safely without encryption? In most situations, a one-time pad merely shifts the challenges of secure transmission from one document (the secret message) to a second document (the one-time pad).

On the other hand, this can be exactly what is called for if the conditions for secure and trusted communication exist at one point in time but cannot be guaranteed at some future point. In 1963, following the Cuban missile crisis, the United States and the Soviet Union set up an emergency hotline between their respective leaders that was based on one-time pads. Each country generated pads and passed them to the other via diplomatic channels. This allowed the superpowers to build up a basis for communication gradually, during a period of low tension and high trust, so that it would be ready for use if the situation heated up in the future.

Publication of Cybertwists is planned for early 2018.

As soon as a one-time pad is used more than once, it loses its status as an unbreakable encryption method. The Soviet Union based their confidential communications during World War II on one-time pads. However, the unit generating the random pads was unable to meet the demand for them quickly enough, which led to pads being recycled. The authorities tried to prevent this lack of rigour from being exploited by never reusing the same pad in similar places or situations. These countermeasures proved insufficient to stop American analysts, who had obtained a large quantity of encoded messages, from taking advantage of the situation.
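
The general weakness of a reused pad can be shown with the letter-shifting scheme from earlier in this section. The sketch below is not a reconstruction of the analysts' actual methods; it only demonstrates that when two messages share a pad, subtracting one encrypted message from the other cancels the pad completely. The pad, the messages and the function names are all invented for the example.

```python
import string

ALPHABET = string.ascii_uppercase


def encrypt(message, pad):
    """Shift each message letter forward by its pad letter's number (A = 1)."""
    return "".join(
        ALPHABET[(ALPHABET.index(m) + ALPHABET.index(k) + 1) % 26]
        for m, k in zip(message, pad)
    )


def difference(text_1, text_2):
    """Letter-by-letter difference of two texts as numbers from 0 to 25."""
    return [
        (ALPHABET.index(a) - ALPHABET.index(b)) % 26
        for a, b in zip(text_1, text_2)
    ]


def apply_crib(diff, crib, position):
    """If message 2 contains `crib` at `position`, return what message 1 holds there."""
    return "".join(
        ALPHABET[(d + ALPHABET.index(ch)) % 26]
        for d, ch in zip(diff[position:], crib)
    )


pad = "XQZRTLMPKWAB"                  # hypothetical pad, wrongly used twice
cipher_1 = encrypt("ATTACKATDAWN", pad)
cipher_2 = encrypt("RETREATATSIX", pad)

# The pad drops out: the difference of the encrypted messages equals
# the difference of the original messages.
diff = difference(cipher_1, cipher_2)
print(diff == difference("ATTACKATDAWN", "RETREATATSIX"))  # True

# Guessing that message 2 starts with RETREAT exposes the start of message 1.
print(apply_crib(diff, "RETREAT", 0))  # ATTACKA
```

An analyst who can guess a likely word in one message can therefore read the corresponding stretch of the other without ever learning the pad.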

The project to analyse the Soviet messages was named Venona. Work on it continued until October 1980, which is testimony to the fact that, once adversaries are in possession of encrypted messages, they have all the time in the world to try to decipher them. The messages that were successfully decoded represented only a fraction of the total. However, they still contained a considerable amount of information considered important from an intelligence standpoint. Venona led to the definitive exposure of Klaus Fuchs, a physicist who had passed the Soviet Union top-secret information concerning the construction of the atomic bomb.

