The Sicilian Mafia and the Caesar cipher

One organisation that modern knowledge and technology seem to have largely passed by is the Sicilian Mafia. Traditionally, Mafiosi communicate using cryptic handwritten messages called pizzini. The mid-2000s saw a wave of high-profile Mafia arrests, the most notable of which was the capture of kingpin Bernardo Provenzano in April 2006 after more than forty years on the run. One of the main reasons he was caught was that the Mafiosi had tried to disguise information such as names within their pizzini using a technique called the Caesar cipher, which is actually trivial to decode.

The Caesar cipher, named after Julius Caesar, who used it in his correspondence, consists of replacing each letter in a text with its corresponding number (A with 1, B with 2, C with 3, and so on) and then adding a secret number, or key, to each resulting number. If the key is 5, A will become 6, B will become 7, C will become 8, and so on. The Sicilian Mafia used the resulting numbers directly as their code. The Romans, on the other hand, converted the numbers back into their corresponding letters, so that 6 became F, 7 became G, 8 became H, and so on. Numbers that ran past the end of the alphabet wrapped around to its beginning: with the 26 letters of the modern English alphabet, 27 would become A, 28 would become B, and so on.
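The two variants described above can be sketched in a few lines of Python. This is only an illustration: the function names encode_numbers and encode_letters are invented for the example, the 26-letter English alphabet is assumed, and the letter variant assumes the usual convention that a shift past the last letter wraps around to the start of the alphabet.

```python
import string

# Assumption: the 26-letter English alphabet, with A=1, B=2, ... Z=26.
ALPHABET = string.ascii_uppercase


def encode_numbers(text, key):
    """Mafia-style variant: keep the shifted numbers as the code."""
    return [ALPHABET.index(ch) + 1 + key for ch in text.upper() if ch in ALPHABET]


def encode_letters(text, key):
    """Roman-style variant: turn the shifted numbers back into letters."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            # The modulo makes shifts past Z wrap around to A.
            shifted.append(ALPHABET[(ALPHABET.index(ch) + key) % len(ALPHABET)])
    return "".join(shifted)


print(encode_numbers("ABC", 5))  # [6, 7, 8]
print(encode_letters("ABC", 5))  # FGH
```

Characters that are not letters are simply skipped in this sketch.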

In ancient Rome, whoever received the code would first convert each letter in it into its corresponding number; in modern Sicily, he would already have the numbers. With either variant, the idea is that he has already received the key from the sender on some previous occasion. He can then subtract the key from each number in the code and change each resulting number to its corresponding letter to retrieve the original message.
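For a recipient who already holds the key, decoding is the same arithmetic run backwards. The sketch below is illustrative only: decode_numbers and decode_letters are hypothetical names, the 26-letter English alphabet is assumed, and the input is assumed to be a valid code produced with the given key.

```python
import string

# Assumption: the 26-letter English alphabet, with A=1, B=2, ... Z=26.
ALPHABET = string.ascii_uppercase


def decode_numbers(numbers, key):
    """Modern Sicilian variant: the code is already a list of numbers."""
    # Subtract the key, then map 1 -> A, 2 -> B, and so on.
    return "".join(ALPHABET[n - key - 1] for n in numbers)


def decode_letters(code, key):
    """Ancient Roman variant: convert letters to numbers first, then subtract."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - key) % len(ALPHABET)] for ch in code.upper()
    )


print(decode_numbers([6, 7, 8], 5))  # ABC
print(decode_letters("FGH", 5))      # ABC
```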

The flaw in this method is that somebody else who knows or suspects that a secret code is a Caesar cipher can easily reveal the original text without having to know the key in advance. If the lowest number in a secret code is 35 and the highest number is 60, and we know the original message is written in English with an alphabet of 26 letters, it follows that the key must be 34. With any lower key, the number 60 would correspond to a position beyond Z, the 26th and last letter; with any higher key, the number 35 would correspond to a position before A, the first letter.
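The reasoning above amounts to two inequalities: every number minus the key must be at least 1 and at most 26. A minimal sketch, assuming a numeric code, a non-negative key and a 26-letter alphabet (the function name possible_keys is invented for the example):

```python
def possible_keys(numbers, alphabet_size=26):
    """Return every non-negative key consistent with the observed numbers."""
    lowest, highest = min(numbers), max(numbers)
    # highest - key <= alphabet_size  means  key >= highest - alphabet_size
    smallest_key = max(0, highest - alphabet_size)
    # lowest - key >= 1  means  key <= lowest - 1
    largest_key = lowest - 1
    return list(range(smallest_key, largest_key + 1))


# A code whose numbers span the full range from 35 to 60 leaves one option.
print(possible_keys([35, 47, 60]))  # [34]
```

When the observed numbers span the whole alphabet, the two bounds meet and only a single key survives.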

Even when a shorter message does not contain all the available letters in the alphabet and the difference between the lowest and the highest numbers in the corresponding code is less than 25, there will still only be a handful of possible values for the key. Somebody trying to decode the message can simply try these possible values out one by one and will know he has the right value when he ends up with a text that makes sense rather than with a meaningless string of letters. Interestingly, whether by design or by chance, the Sicilian Mafia’s messages were encoded with the same key, 3, that Julius Caesar is known to have employed some two thousand years earlier!
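Trying the remaining keys one by one is easy to automate. The sketch below is a hedged illustration rather than a reconstruction of any real investigation: the message ROMA is a made-up example, candidate_plaintexts is a hypothetical name, and a 26-letter alphabet with a non-negative key is assumed.

```python
import string

# Assumption: the 26-letter English alphabet, with A=1, B=2, ... Z=26.
ALPHABET = string.ascii_uppercase


def candidate_plaintexts(numbers):
    """Decode the numbers under every key that keeps all letters in range."""
    lowest, highest = min(numbers), max(numbers)
    candidates = {}
    for key in range(max(0, highest - len(ALPHABET)), lowest):
        candidates[key] = "".join(ALPHABET[n - key - 1] for n in numbers)
    return candidates


# Hypothetical code for the word ROMA encoded with key 3.
code = [21, 18, 16, 4]
for key, text in candidate_plaintexts(code).items():
    print(key, text)
# 0 URPD
# 1 TQOC
# 2 SPNB
# 3 ROMA
```

Only one of the four candidates is a recognisable word, which is how the codebreaker knows the key has been found.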

Provenzano and his accomplices may have speculated that anyone finding their code would not be able to decode it with such techniques because the way it was written would prevent it from being recognized as a Caesar cipher in the first place. Firstly, they recorded it as numbers rather than the letters of ancient times. Secondly, the key value 3 allowed an encoded message to be written as a continuous chain of numbers even though the individual code numbers that made up the sequence were partly single-digit and partly double-digit.

The first letter A becomes 4 and the second letter B becomes 5: there are no letters that correspond to the single-digit numbers 1, 2 and 3. At the same time, the last letter Z becomes 29: there are no letters that correspond to double-digit numbers whose first digit is 3, 4, 5, 6, 7, 8 or 9. This means that even if the message is written without spaces between one encoded letter and the next, whoever is decoding it has a foolproof way of knowing whether each new number he encounters should be interpreted as a single-digit number or as the first digit of a double-digit number: a 1 or a 2 always begins a double-digit number, while any digit from 4 to 9 always stands alone.
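The splitting rule can be written down directly. This is a minimal sketch that assumes key 3 and a 26-letter alphabet, so that valid numbers run from 4 to 29; the function name split_code is invented for the example.

```python
def split_code(digits):
    """Split a continuous digit string into the numbers 4..29 it was built from."""
    numbers = []
    i = 0
    while i < len(digits):
        # A leading 1 or 2 must open a two-digit number (10..29);
        # a leading 4..9 must be a one-digit number on its own.
        width = 2 if digits[i] in "12" else 1
        chunk = digits[i:i + width]
        if len(chunk) < width or digits[i] in "03":
            # 0 and 3 can never start a number, and a trailing 1 or 2 is incomplete.
            raise ValueError(f"unexpected digit at position {i}")
        numbers.append(int(chunk))
        i += width
    return numbers


# A, B and Z encode to 4, 5 and 29, written without spaces as "4529".
print(split_code("4529"))  # [4, 5, 29]
```

Because the decision depends only on the current digit, the string can be read in a single left-to-right pass with no backtracking.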

This allowed a word like GITALLI to be written in code as 1012234151512. The Mafia probably presumed such a contiguous string of numbers would be completely impenetrable to anyone not knowing what it was. In fact, it betrays an obvious pattern that almost jumps off the page: there is a strong tendency for every second digit to be either 1 or 2, and these two digits account for eight of the thirteen in this example. This quickly leads to the suspicion that the code is made up of smaller units of which at least some are double-digit, and from there it is only a small step to identifying and decoding the Caesar cipher. In any case, the Caesar cipher is so notorious that it is one of the first possibilities any codebreaker would consider.
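The GITALLI example can be checked with a short round trip, together with a digit count that makes the telltale pattern visible. The sketch assumes key 3 and the 26-letter English alphabet; encode_to_digits and decode_digits are hypothetical names, and the input is assumed to contain letters only.

```python
import string
from collections import Counter

# Assumption: the 26-letter English alphabet, with A=1, B=2, ... Z=26.
ALPHABET = string.ascii_uppercase
KEY = 3


def encode_to_digits(word, key=KEY):
    """Shift each letter's number by the key and join the results without spaces."""
    return "".join(str(ALPHABET.index(ch) + 1 + key) for ch in word.upper())


def decode_digits(digits, key=KEY):
    """Split the digit string (1 or 2 opens a two-digit number) and undo the shift."""
    letters = []
    i = 0
    while i < len(digits):
        width = 2 if digits[i] in "12" else 1
        number = int(digits[i:i + width])
        letters.append(ALPHABET[number - key - 1])
        i += width
    return "".join(letters)


code = encode_to_digits("GITALLI")
print(code)                          # 1012234151512
print(decode_digits(code))           # GITALLI
print(Counter(code).most_common(2))  # [('1', 5), ('2', 3)]
```

The digit count shows 1 and 2 making up eight of the thirteen digits, which is the kind of imbalance that gives the scheme away.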
