The Lorenz cipher was used by Germany in the Second World War. In this post I wanted to explore the general ideas behind it and see if I could implement a simple Python Lorenz cipher. I’m not trying to create an exact simulation of the Lorenz cipher here. For a Python Lorenz cipher simulation, I suggest you check out this project on GitHub or, if you would like to see a more graphical simulation, this project.
If you want to learn more about the actual Lorenz cipher and how it was used, I suggest here or here.
Go to the GitHub Gist of the code for this post
Lorenz Cipher Basics
The Lorenz cipher was based on the ‘Vernam Cipher’. The Vernam approach can in theory be completely secure if a number of conditions are met. One of the more challenging conditions is the pre-sharing of punch tapes containing truly random digits with which to encrypt and decrypt the message. A common way to get around this challenge was to use a pseudorandom number generator to produce a keystream, which is combined with the message text to produce the encrypted text. This is the approach taken by the Lorenz cipher.
In modern terminology it is known as an example of a stream cipher.
The general idea behind a stream cipher is that each character in the plain text message ‘stream’ is combined with a pseudorandom character from a cipher digit stream.
The Baudot-Murray code
The Lorenz cipher system was actually an extension to a teleprinter machine. The teleprinter would convert characters typed on a typewriter-style keyboard into a 5-bit code. The coding scheme typically used in teleprinters was the Baudot-Murray or ITA-2 code.
This is a way of encoding characters using 5 bits, which gives 2^5 = 32 distinct codes. It is similar to the 8-bit byte used in modern computing, but needs a shift system to access more than those 32 characters. Given that there are 26 letters in the English alphabet, 32 characters does not leave much to play with.
Although real messages would have included characters using the shift codes, for the sake of simplicity I have just used the unshifted codes. The last five codes in the following dictionary (keys ‘N’, ‘C’, ‘L’, ‘F’ and ‘T’) are for: NULL, carriage return, linefeed, shift-on, shift-off. The ‘0b’ at the start of each string is Python’s prefix for a binary literal.
baudot = {
    'a': '0b00011', 'b': '0b11001', 'c': '0b01110', 'd': '0b01001',
    'e': '0b00001', 'f': '0b01101', 'g': '0b11010', 'h': '0b10100',
    'i': '0b00110', 'j': '0b01011', 'k': '0b01111', 'l': '0b10010',
    'm': '0b11100', 'n': '0b01100', 'o': '0b11000', 'p': '0b10110',
    'q': '0b10111', 'r': '0b01010', 's': '0b00101', 't': '0b10000',
    'u': '0b00111', 'v': '0b11110', 'w': '0b10011', 'x': '0b11101',
    'y': '0b10101', 'z': '0b10001', ' ': '0b00100',
    'N': '0b00000', 'C': '0b01000', 'L': '0b00010', 'F': '0b11011',
    'T': '0b11111',
}
For a more comprehensive handling of Baudot in python, I suggest looking into this project.
The message
The message is just typed into the typewriter-style keyboard of the teleprinter. The message characters are converted into the corresponding Baudot-Murray codes. In this simple Python version this is done by looping through the message characters and looking up the corresponding dictionary values.
Original message: my secret message
Original Baudot: 11100 10101 00100 00101 00001 01110 01010 00001 10000 00100 11100 00001 00101 00101 00011 11010 00001
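The lookup itself could be sketched roughly as follows. To keep the snippet self-contained it only copies the entries this particular message needs from the dictionary above; the names `baudot_subset` and `message_codes` are just illustrative.

```python
# Only the entries needed for this message, copied from the baudot dictionary.
baudot_subset = {
    'a': '0b00011', 'c': '0b01110', 'e': '0b00001', 'g': '0b11010',
    'm': '0b11100', 'r': '0b01010', 's': '0b00101', 't': '0b10000',
    'y': '0b10101', ' ': '0b00100',
}

message = 'my secret message'

# Look up each character and drop the leading '0b' for display.
message_codes = [baudot_subset[letter][2:] for letter in message]

print('Original message:', message)
print('Original Baudot:', ' '.join(message_codes))
```

With the full dictionary in place of the subset, the same loop would work for any message made of the unshifted characters.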
The keystream
In the Lorenz cipher the keystream was generated by a complicated arrangement of rotating wheels. In this simple Python Lorenz cipher we can just use Python’s random library to generate a random selection of characters. It is worth keeping in mind, of course, that just as the original Lorenz system only used a pseudorandom keystream, Python’s random library is also only pseudorandom.
import random

# Pick one random key character for every character in the message
random_choice_characters = list(baudot.keys())
key_string = ''.join(random.choice(random_choice_characters) for x in range(len(message)))
From the original ‘my secret message’, one run of the code gave the corresponding key stream:
Key stream: CrwbFLFtoiibksNai
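One practical consequence of the keystream being pseudorandom is that it can be reproduced from a starting state. Here is a small sketch of that, assuming a seeded `random.Random` generator; the helper name `make_key_string`, the seed value and the `alphabet` string (the 32 keys of the dictionary above) are my own choices for illustration.

```python
import random

# Assumed alphabet: the 32 single-character keys of the baudot dictionary.
alphabet = 'abcdefghijklmnopqrstuvwxyz NCLFT'

def make_key_string(seed, length):
    """Return a pseudorandom key string; the same seed gives the same string."""
    rng = random.Random(seed)  # private generator, leaves the global one alone
    return ''.join(rng.choice(alphabet) for _ in range(length))

sender_key = make_key_string(seed=1234, length=17)
receiver_key = make_key_string(seed=1234, length=17)
print(sender_key == receiver_key)  # True: both sides derive the same key stream
```

This means a sender and receiver who agree on the starting state can both generate the same key stream without exchanging the stream itself.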
Exclusive OR (XOR)
An essential part of the Lorenz cipher process is carrying out an XOR operation on the message and key streams.
In Python we can carry out a bitwise exclusive OR operation with the ^ operator. A binary number in Python is indicated by putting ‘0b’ before the digits.
x = 0b01
y = 0b10
print(bin(x ^ y))
This gives the result ‘0b11’. We need to explicitly use the ‘bin’ function, otherwise Python prints the result as the decimal ‘3’.
The Lorenz cipher uses 5-bit character codes, and we can show all five bits by using string formatting to zero-pad the value and then slicing off the ‘0b’.
For example, the following
i = 0b00001
print(f"{i:#07b}"[2:7])
yields ‘00001’.
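As an aside, the same five-character result can be produced without slicing by leaving the ‘#’ out of the format specification, since it is the ‘#’ that asks for the ‘0b’ prefix. A short sketch of that alternative:

```python
i = 0b00001

# '05b' means: binary digits, zero-padded to a width of 5, no '0b' prefix.
padded = f"{i:05b}"
print(padded)                  # 00001
print(format(0b10100, "05b"))  # 10100

# Both approaches give the same string.
print(padded == f"{i:#07b}"[2:7])  # True
```

The ‘#07b’ form is still handy later in the post, because it produces strings that match the dictionary values exactly, including the ‘0b’.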
The Broadcast
The encrypted message that we will actually broadcast is found by XORing the code for each plaintext character with the code for the key stream character in the same position.
broadcast = []
# XOR each message code with the key code in the same position
for letter, key_letter in zip(message, key_string):
    broadcast.append(int(baudot[letter], 2) ^ int(baudot[key_letter], 2))
print('\nBroadcast signal: ')
for code in broadcast:
    print(f"{code:#07b}"[2:7])
Which for one run of the code gave:
Broadcast signal: 10100 11111 10111 11100 11010 01100 10001 10001 01000 00010 11010 11000 01010 00000 00011 11001 00111
Decoding The Message
One of the properties of the XOR operation is that it is reversible: XORing an encrypted code with the same key code a second time gives back the original code.
As we receive letter codes rather than letters, we need to do a reverse lookup in the dictionary to find each corresponding letter, i.e. find the key corresponding to a particular value.
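To make the reversibility concrete, here is a small worked example using the first character of the message (‘m’) and the first key character from the run above (‘C’). The variable names are only illustrative.

```python
m = 0b11100  # code for 'm' in the baudot dictionary
k = 0b01000  # code for 'C', the first key character in the run above

encrypted = m ^ k          # combine message code and key code
decrypted = encrypted ^ k  # apply the same key code again

print(format(encrypted, "05b"))  # 10100
print(format(decrypted, "05b"))  # 11100

# The same round trip works for every pair of 5-bit values.
round_trips = all((a ^ b) ^ b == a for a in range(32) for b in range(32))
print(round_trips)  # True
```

The encrypted value 10100 is the code for ‘h’, which is why the first broadcast code and the first received character come out the way they do.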
received_message = []
for i in broadcast:
    # Reverse lookup: find the dictionary key whose value matches this code
    received_message.append(list(baudot.keys())[list(baudot.values()).index(f"{i:#07b}")])
print('\nMessage Received: ')
print(''.join(received_message))
The received message is clearly very different from the original:
Message Received: hTqmgnzzCLgorNabu
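The reverse lookup above rebuilds two lists for every character. An alternative, sketched here with a handful of entries and the first four broadcast codes from the run above, is to invert the dictionary once; this works because every code in the table is unique. The names `baudot_subset`, `code_to_letter` and `first_codes` are my own.

```python
# A few entries from the baudot dictionary, enough for this illustration.
baudot_subset = {'h': '0b10100', 'q': '0b10111', 'm': '0b11100', 'T': '0b11111'}

# Swap keys and values once, so codes map straight back to letters.
code_to_letter = {code: letter for letter, code in baudot_subset.items()}

first_codes = [0b10100, 0b11111, 0b10111, 0b11100]  # first four broadcast codes
received = ''.join(code_to_letter[f"{code:#07b}"] for code in first_codes)
print(received)  # hTqm
```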
And then to decrypt the message we re-apply the XOR operation:
decrypted_message = []
# Decrypt signal
# Need to XOR with original key_string again
for i, letter in enumerate(received_message):
    decrypted_code = int(baudot[letter], 2) ^ int(baudot[key_string[i]], 2)
    decrypted_message.append(list(baudot.keys())[list(baudot.values()).index(f"{decrypted_code:#07b}")])
print('\nMessage Decrypted: ')
print(''.join(decrypted_message))
Which gets us back to the original message:
Message Decrypted: my secret message
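For reference, the whole round trip can be condensed into one self-contained sketch. This is a rearrangement rather than the code from the Gist: it stores the table as integers instead of ‘0b’ strings, fixes the key stream to the one from the run above so the output is repeatable, and uses names of my own (`BAUDOT`, `LETTER`, `xor_text`).

```python
# Same table as the baudot dictionary, stored as integers for brevity.
BAUDOT = {
    'a': 0b00011, 'b': 0b11001, 'c': 0b01110, 'd': 0b01001, 'e': 0b00001,
    'f': 0b01101, 'g': 0b11010, 'h': 0b10100, 'i': 0b00110, 'j': 0b01011,
    'k': 0b01111, 'l': 0b10010, 'm': 0b11100, 'n': 0b01100, 'o': 0b11000,
    'p': 0b10110, 'q': 0b10111, 'r': 0b01010, 's': 0b00101, 't': 0b10000,
    'u': 0b00111, 'v': 0b11110, 'w': 0b10011, 'x': 0b11101, 'y': 0b10101,
    'z': 0b10001, ' ': 0b00100, 'N': 0b00000, 'C': 0b01000, 'L': 0b00010,
    'F': 0b11011, 'T': 0b11111,
}
# Inverse table: code -> character (all 32 codes are distinct).
LETTER = {code: letter for letter, code in BAUDOT.items()}

def xor_text(text, key_string):
    """XOR two equal-length strings character by character via their codes."""
    if len(text) != len(key_string):
        raise ValueError("text and key string must be the same length")
    return ''.join(LETTER[BAUDOT[t] ^ BAUDOT[k]] for t, k in zip(text, key_string))

message = 'my secret message'
key_string = 'CrwbFLFtoiibksNai'  # the key stream from the run shown above

received = xor_text(message, key_string)     # encrypt
decrypted = xor_text(received, key_string)   # same operation decrypts

print(received)   # hTqmgnzzCLgorNabu
print(decrypted)  # my secret message
```

Because XOR is its own inverse, a single function covers both encryption and decryption.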
Going further
This has been a very simple Python Lorenz cipher exploration and there are doubtless ways that it could be taken further. One obvious way would be to more accurately represent how the Lorenz cipher’s pseudorandom generation worked. Alternatively, you could attempt to crack the resulting encrypted message with something like a simple Caesar cipher.