The Sicilian Mafia and the Caesar cipher

One organisation that relatively recently tried to encrypt messages without any technical aids was the Sicilian Mafia. Traditionally, Mafiosi communicate using cryptic handwritten messages called pizzini. The mid-2000s saw a wave of high-profile Mafia arrests, the most notable of which was the capture of kingpin Bernardo Provenzano in April 2006 after more than forty years on the run. The investigations that made many of these arrests possible were greatly aided by the fact that much of the information the Mafiosi had tried to disguise in their pizzini, such as names, was easily decodable. It had been encrypted using a technique called the Caesar cipher.

The Caesar cipher, named after Julius Caesar, who is reported to have used it for his confidential correspondence, consists of replacing each letter in a text with its corresponding number – A with 1, B with 2, C with 3, and so on – and then adding a secret number, or key, to each resulting number. If the key is 5, A will become 6, B will become 7, C will become 8, and so on. The Sicilian Mafia formed their code directly out of the resulting numbers. The Romans, on the other hand, converted the numbers back into their corresponding letters, so that 6 became F, 7 became G, 8 became H, and so on. In the usual description of the cipher, numbers past the end of the alphabet wrap around to the beginning, so that 27 became A and 28 became B; the recipient undoes this by adding 26 whenever subtracting the key gives a number below 1.
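The numeric variant described above can be sketched in a few lines of Python. This is only an illustration: the function name `encode_numbers` is invented for this example, and it assumes the 26-letter English alphabet and simply skips anything that is not a letter.

```python
import string

ALPHABET = string.ascii_uppercase  # assumption: 26-letter English alphabet


def encode_numbers(message, key):
    """Replace each letter with its position (A=1 ... Z=26) plus the key."""
    codes = []
    for ch in message.upper():
        if ch in ALPHABET:  # spaces and punctuation are skipped
            codes.append(ALPHABET.index(ch) + 1 + key)
    return codes


print(encode_numbers("ABC", 5))  # [6, 7, 8]
```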

In ancient Rome, the recipient of the encoded message would first convert each letter in the code into its corresponding number; in modern Sicily, he would already have the numbers. With either variant, the idea is that he has already received the key from the sender on some previous occasion. He can then subtract the key from each number in the code and replace each resulting number with its corresponding letter to retrieve the original message.
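The recipient's side of the numeric variant might look like the following sketch. The name `decode_numbers` is hypothetical, and the code assumes the English alphabet and a key already agreed with the sender.

```python
def decode_numbers(codes, key):
    """Subtract the key from each number and map 1..26 back to A..Z."""
    letters = []
    for code in codes:
        position = code - key
        if not 1 <= position <= 26:
            # The key cannot be right if a number falls outside the alphabet.
            raise ValueError(f"{code} does not decode to a letter with key {key}")
        letters.append(chr(ord("A") + position - 1))
    return "".join(letters)


print(decode_numbers([6, 7, 8], 5))  # ABC
```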

The flaw in this method is that a third party who knows or suspects that a secret code is a Caesar cipher can easily reveal the original text without having to know the key in advance. If the lowest number in a secret code is 35 and the highest number is 60, and we know the original message is written in English with an alphabet of 26 letters, it follows that the key must be 34, because any higher key would cause the first letter A to be encoded as a number greater than 35 and any lower key would cause the last letter Z to be encoded as a number less than 60.
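The reasoning above amounts to two inequalities: the key plus 1 cannot exceed the lowest number, and the key plus 26 cannot be less than the highest. A rough sketch, with the hypothetical helper `candidate_keys` and the added assumption that the key is not negative:

```python
def candidate_keys(codes):
    """List every key consistent with the lowest and highest code numbers."""
    lowest, highest = min(codes), max(codes)
    first = max(highest - 26, 0)  # a lower key would put Z below `highest`
    last = lowest - 1             # a higher key would put A above `lowest`
    return list(range(first, last + 1))


print(candidate_keys([35, 47, 60]))  # [34]
```

With 35 as the lowest number and 60 as the highest, both bounds meet at 34, which is why the key is fully determined in that case.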

Even when a shorter message does not contain all the available letters in the alphabet and the difference between the lowest and the highest numbers in the corresponding code is less than 25, there will only be a handful of possible values for the key. Somebody trying to decode the message can simply try them out one by one and will know she has the right value when she ends up with a text that makes sense rather than a meaningless string of letters. Interestingly, whether by design or by chance, the Sicilian Mafia’s messages were encoded with the same key – 3 – that the Roman army was known to have employed some two thousand years earlier!
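Trying the remaining keys one by one is easy to automate. The sketch below is illustrative only: `try_all_keys` is an invented name, the intercepted numbers are made up for the example, and the key is assumed not to be negative.

```python
def try_all_keys(codes):
    """Decode the message with every key that keeps all numbers within A..Z."""
    lowest, highest = min(codes), max(codes)
    results = {}
    for key in range(max(highest - 26, 0), lowest):
        results[key] = "".join(chr(ord("A") + c - key - 1) for c in codes)
    return results


intercepted = [11, 8, 15, 15, 18]  # hypothetical message, not a real pizzino
for key, text in try_all_keys(intercepted).items():
    print(key, text)
```

Only one of the eight lines printed is a recognisable word (HELLO, for key 3); the rest are meaningless strings of letters.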

Provenzano and his accomplices may have counted on anyone who found their code being unable to decode it with such techniques, because the way it was written would stop it being recognized as a Caesar cipher in the first place. Firstly, they recorded it as numbers rather than the letters of ancient times. Secondly, the key value 3 had the special property that it allowed an encoded message to be written as a continuous chain of numbers even though the individual code numbers that made up the chain were partly single-digit and partly double-digit.

Publication of Cybertwists is planned for 2017.

The first letter A becomes 4 and the second letter B becomes 5: there are no letters that correspond to the single-digit numbers 1, 2 and 3. At the same time, the last letter Z becomes 29: there are no letters that correspond to double-digit numbers whose first digit is 3, 4, 5, 6, 7, 8 or 9. This means that even if the message is written without spaces between one encoded letter and the next, whoever is decoding it has a foolproof way of knowing whether each new number he encounters should be interpreted as a single-digit number or as the first digit of a double-digit number: a 1 or a 2 always begins a double-digit number, while any digit from 4 to 9 always stands alone.
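The splitting rule for key 3 can be sketched as follows. The function name `split_digits` is an assumption made for this example, and the rule only holds for this particular key and a 26-letter alphabet.

```python
def split_digits(chain):
    """Split a continuous digit string into code numbers, assuming key 3."""
    codes = []
    i = 0
    while i < len(chain):
        digit = chain[i]
        if digit in "12":
            # 1 or 2 can only be the first digit of a number from 10 to 29.
            if i + 1 >= len(chain):
                raise ValueError("chain ends in the middle of a number")
            codes.append(int(chain[i:i + 2]))
            i += 2
        elif digit in "456789":
            # 4 to 9 can only be a single-digit number (A to F).
            codes.append(int(digit))
            i += 1
        else:
            raise ValueError(f"unexpected character {digit!r} at position {i}")
    return codes


print(split_digits("4529"))  # [4, 5, 29]
```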

This allows a word like PROVENZANO to be written in code as 192118258172941718. The Mafia probably presumed such a contiguous string of numbers would be completely impenetrable to anyone not knowing what it was. In fact, it betrays an obvious pattern that almost jumps out of the page: there is a strong tendency for each second digit to be either 1 or 2. This quickly leads to the suspicion that the code is made up of smaller units of which at least some are double-digit, and from there it is only a small step to identifying and decoding the Caesar cipher. In any case, the Caesar cipher is so notorious that it is one of the first possibilities any professional codebreaker would consider.
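The example can be checked with a short sketch that encodes the name with key 3 and then looks at every other digit, starting with the first. The name `encode_chain` is invented for this illustration, and the 26-letter English alphabet is assumed.

```python
def encode_chain(word, key=3):
    """Encode a word as one continuous string of digits (A=1 ... Z=26, plus key)."""
    return "".join(
        str(ord(ch) - ord("A") + 1 + key)
        for ch in word.upper()
        if "A" <= ch <= "Z"
    )


chain = encode_chain("PROVENZANO")
print(chain)  # 192118258172941718

every_other = chain[0::2]  # first, third, fifth digit, and so on
print(every_other)  # 121287911
print(sum(d in "12" for d in every_other), "of", len(every_other))  # 6 of 9
```

In this short sample, six of the nine digits in those positions are a 1 or a 2, which is the kind of regularity that invites closer inspection.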
