Python Lorenz Cipher

July 23, 2021

The Lorenz cipher was a cipher used by Germany in the Second World War. In this post I wanted to explore the general ideas behind it and see if I could implement a simple Python Lorenz cipher. I’m not trying to create an exact simulation of the Lorenz cipher here. For a Python Lorenz cipher simulation, I suggest you check out this project on GitHub or, if you would like to see a more graphical simulation, this project.

If you want to learn more about the actual Lorenz cipher and how it was used, I suggest here or here.

Go to the Github Gist of the code for this post

Lorenz Cipher Basics

The Lorenz cipher was based on an idea developed from the ‘Vernam Cipher’. The Vernam Cipher approach can in theory be completely secure if a number of conditions are met. One of the more challenging conditions is the pre-sharing of a number of punch tapes containing truly random digits with which to encrypt and decrypt the message. A common way to get around this challenge was to use a pseudorandom number generator to produce a keystream, which is combined with the message text to produce the encrypted text. This is the approach taken by the Lorenz cipher.

In modern terminology, it is an example of a stream cipher.

The general idea behind a stream cipher is that each character in the plain text message ‘stream’ is combined with a pseudorandom character from a cipher digit stream.

 

The Baudot-Murray code

The Lorenz cipher system was actually an extension to a teleprinter machine. The teleprinter would convert characters typed on a typewriter-style keyboard into a 5-bit code. The coding scheme typically used in teleprinters was the Baudot-Murray or ITA-2 code.

This is a way of encoding characters using 5 bits. It is similar to the 8-bit byte used in modern computing, but 5 bits only give 2^5 = 32 distinct codes, so it needs a shift system to access more than 32 characters. Given that there are 26 letters in the English alphabet, 32 codes do not leave much to play with.

Although real messages would have included characters reached through the shift codes, for the sake of simplicity I have just used the unshifted codes. The last five codes in the following dictionary are for: NULL, carriage return, linefeed, shift-on, shift-off. The ‘0b’ at the start of each string is Python’s prefix for a binary representation.

baudot = { 'a': '0b00011', 
           'b': '0b11001',
           'c': '0b01110',
           'd': '0b01001',
           'e': '0b00001',
           'f': '0b01101',
           'g': '0b11010',
           'h': '0b10100',
           'i': '0b00110',
           'j': '0b01011',
           'k': '0b01111',
           'l': '0b10010',
           'm': '0b11100',
           'n': '0b01100',
           'o': '0b11000',
           'p': '0b10110',
           'q': '0b10111',
           'r': '0b01010',
           's': '0b00101',
           't': '0b10000',
           'u': '0b00111',
           'v': '0b11110',
           'w': '0b10011',
           'x': '0b11101',
           'y': '0b10101',
           'z': '0b10001',
           ' ': '0b00100',
           'N': '0b00000',
           'C': '0b01000',
           'L': '0b00010',
           'F': '0b11011',
           'T': '0b11111'}
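Because the table is typed by hand, it can be worth checking that it really uses each of the 32 possible 5-bit codes exactly once. That matters later on: the XOR of any two codes is again a 5-bit code, so it can only be turned back into a character if every code has an entry. The sketch below is a compact restatement of the same table (the names LETTERS and CODES are my own shorthand for this check, not part of the original code).

```python
# Compact restatement of the table above: same characters, same 5-bit codes,
# written as integers to keep the example short.
LETTERS = "abcdefghijklmnopqrstuvwxyz NCLFT"
CODES = [3, 25, 14, 9, 1, 13, 26, 20, 6, 11, 15, 18, 28, 12, 24, 22,
         23, 10, 5, 16, 7, 30, 19, 29, 21, 17, 4, 0, 8, 2, 27, 31]

# Rebuild the dictionary in the same '0bxxxxx' string form used in the post.
baudot = {ch: f"{code:#07b}" for ch, code in zip(LETTERS, CODES)}

# Every value from 0 to 31 should appear exactly once.
values = sorted(int(code, 2) for code in baudot.values())
assert len(baudot) == 32
assert values == list(range(32))

print(baudot["a"])  # 0b00011
```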

For a more comprehensive handling of Baudot in python, I suggest looking into this project.

The message

This is just typed into the typewriter-style interface of the teleprinter. The message characters are converted into the corresponding Baudot-Murray codes. In this simple Python version it is done by looping through the message characters and finding the corresponding dictionary values.

Original message:
my secret message

Original Baudot:
11100
10101
00100
00101
00001
01110
01010
00001
10000
00100
11100
00001
00101
00101
00011
11010
00001
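The conversion loop itself is not shown above, so here is a minimal sketch of how it might look. The table is restated compactly so that the snippet runs on its own, and message_codes is a name I have made up for the result. Note that any character missing from the table (capital letters, digits, punctuation) would raise a KeyError in this simple version.

```python
# Compact restatement of the post's Baudot table (same codes, as integers).
LETTERS = "abcdefghijklmnopqrstuvwxyz NCLFT"
CODES = [3, 25, 14, 9, 1, 13, 26, 20, 6, 11, 15, 18, 28, 12, 24, 22,
         23, 10, 5, 16, 7, 30, 19, 29, 21, 17, 4, 0, 8, 2, 27, 31]
baudot = {ch: f"{code:#07b}" for ch, code in zip(LETTERS, CODES)}

message = "my secret message"

# Look up the 5-bit code for each character of the message.
message_codes = [baudot[letter] for letter in message]

print("Original message:")
print(message)
print("\nOriginal Baudot:")
for code in message_codes:
    print(code[2:])  # strip the leading '0b' for display
```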

 

The keystream

In the Lorenz cipher this was generated by a complicated arrangement of wheels. In this simple Python Lorenz cipher we can just use Python’s random library to generate a random selection of letters. It is worth keeping in mind, of course, that just as the original Lorenz system only used a pseudorandom keystream, Python’s random library is also only pseudorandom.

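import random

# Any character in the Baudot table can appear in the key stream
random_choice_characters = list(baudot.keys())

# One random key character for each character of the message
key_string = ''.join(random.choice(random_choice_characters) for _ in range(len(message)))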

From the original ‘my secret message’, one run of the code gives the corresponding key stream:

Key stream:
CrwbFLFtoiibksNai
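In practice the sender and the receiver both need the same key stream, which the real machines achieved by starting from the same settings. One way to imitate that here is to seed a dedicated random.Random instance, so that the same seed always reproduces the same key. This is only a sketch: make_key_string and the seed value 1234 are hypothetical names and values of my own, and Python’s random module is not designed for real cryptographic use.

```python
import random

# The 32 characters used as keys in the post's Baudot dictionary.
ALPHABET = "abcdefghijklmnopqrstuvwxyz NCLFT"


def make_key_string(length, seed):
    """Return a pseudorandom key string that is reproducible from the seed."""
    rng = random.Random(seed)  # private generator, unaffected by other code
    return "".join(rng.choice(ALPHABET) for _ in range(length))


message = "my secret message"

# Sender and receiver agree on the seed in advance (illustrative value only).
sender_key = make_key_string(len(message), seed=1234)
receiver_key = make_key_string(len(message), seed=1234)

print(sender_key)
print(sender_key == receiver_key)  # True
```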

Exclusive OR (XOR)

An essential part of the Lorenz cipher process is carrying out an XOR operation on the message and key streams.

In Python we can carry out a bit-wise exclusive OR operation with the ^ operator. A binary number in Python is indicated by putting ‘0b’ before the digits.

x = 0b01
y = 0b10
print(bin(x^y))

This gives the result ‘0b11’. We need to explicitly call the ‘bin’ function, otherwise Python prints the result as the decimal number ‘3’.
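To make the behaviour of XOR concrete, the short sketch below prints its truth table for single bits and then applies it to two 5-bit codes from the table in this post (‘m’ and the carriage return code ‘C’). The variable names are just for illustration.

```python
# Truth table for XOR on single bits: the result is 1 only when the bits differ.
truth_table = {}
for a in (0, 1):
    for b in (0, 1):
        truth_table[(a, b)] = a ^ b
        print(a, b, a ^ b)

# The same rule applied to each of the five bit positions at once.
m_code = 0b11100  # 'm' in the post's table
c_code = 0b01000  # 'C' (carriage return) in the post's table
print(bin(m_code ^ c_code))  # 0b10100
```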

The Lorenz cipher uses 5-bit character codes and we can represent those more completely using string formatting to zero pad the text and remove the ‘0b’.

For example, the following:

i = 0b00001
print(f"{i:#07b}"[2:7])

yields ‘00001’.
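As a side note, the same zero-padded string can be produced without slicing by leaving out the ‘#’ flag and asking for a width of 5. The sketch below compares the two forms for every 5-bit value. The prefixed form is still useful in this post, because the dictionary values include the ‘0b’.

```python
i = 0b00001

with_prefix = f"{i:#07b}"  # '0b00001' (width 7 includes the '0b')
sliced = with_prefix[2:7]  # '00001'
direct = f"{i:05b}"        # '00001' (no prefix, width 5)

print(with_prefix, sliced, direct)

# Both approaches agree for every possible 5-bit value.
all_match = all(f"{n:#07b}"[2:7] == f"{n:05b}" for n in range(32))
print(all_match)  # True
```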

The Broadcast

The encrypted message that we will actually broadcast is found by XORing the code of each plaintext character with the code of the key stream character at the same position.

broadcast = []
for i, letter in enumerate(message):
    broadcast.append(int(baudot[letter], 2) ^ int(baudot[key_string[i]], 2))

print('\nBroadcast signal: ')
for i in broadcast:
    print(f"{i:#07b}"[2:7])

Which for one run of the code gave:

Broadcast signal: 
10100
11111
10111
11100
11010
01100 
10001 
10001 
01000 
00010 
11010 
11000 
01010 
00000 
00011 
11001 
00111

Decoding The Message

One of the properties of the XOR operation is that it is reversible: XORing an encrypted code with the same key code a second time gives back the original code.
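Since there are only 32 possible codes, this property can be checked exhaustively. The sketch below tries every combination of message code and key code, then repeats the first character of the example from this post (‘m’ combined with the key character ‘C’).

```python
# (m ^ k) ^ k == m for every pair of 5-bit values.
reversible = all((m ^ k) ^ k == m for m in range(32) for k in range(32))
print(reversible)  # True

# First character of the example: 'm' (11100) with key 'C' (01000).
plain = 0b11100
key = 0b01000
cipher = plain ^ key
recovered = cipher ^ key

print(f"{cipher:05b}")     # 10100
print(f"{recovered:05b}")  # 11100
```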

Because we receive a letter code and need to find the corresponding letter, we have to do a reverse lookup in the dictionary, i.e. find the key corresponding to a particular value.

received_message = []
for i in broadcast:
    received_message.append(list(baudot.keys())[list(baudot.values()).index(f"{i:#07b}")])

print('\nMessage Received: ')
print(''.join(received_message))

The received message is clearly very different from the original:

Message Received:
hTqmgnzzCLgorNabu
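The reverse lookup above rebuilds two lists for every character, which is fine for a short message. An alternative, sketched here under the assumption that every code appears exactly once in the table, is to build an inverted dictionary once and reuse it. The name code_to_letter is my own, and the three codes are the first three broadcast values from the run shown above.

```python
# Compact restatement of the post's Baudot table (same codes, as integers).
LETTERS = "abcdefghijklmnopqrstuvwxyz NCLFT"
CODES = [3, 25, 14, 9, 1, 13, 26, 20, 6, 11, 15, 18, 28, 12, 24, 22,
         23, 10, 5, 16, 7, 30, 19, 29, 21, 17, 4, 0, 8, 2, 27, 31]
baudot = {ch: f"{code:#07b}" for ch, code in zip(LETTERS, CODES)}

# Swap keys and values once, so each lookup is a single dictionary access.
code_to_letter = {code: letter for letter, code in baudot.items()}

# First three codes of the broadcast signal from the example run.
broadcast = [0b10100, 0b11111, 0b10111]
received = "".join(code_to_letter[f"{i:#07b}"] for i in broadcast)

print(received)  # hTq
```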

And then to decrypt the message we re-apply the XOR operation:

decrypted_message = []
# Decrypt signal
# Need to XOR with original key_string again
for i, letter in enumerate(received_message):
    decrypted_code = int(baudot[letter], 2) ^ int(baudot[key_string[i]], 2)
    decrypted_message.append(list(baudot.keys())[list(baudot.values()).index(f"{decrypted_code:#07b}")])

print('\nMessage Decrypted: ')
print(''.join(decrypted_message))

Which gets us back to the original message:

Message Decrypted:
my secret message
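Since encryption and decryption are the same operation, the whole round trip can be expressed with a single helper that is applied twice. This is a sketch rather than the code from the Gist: xor_text and code_to_letter are names I have introduced, and the key string is the one from the example run above.

```python
# Compact restatement of the post's Baudot table (same codes, as integers).
LETTERS = "abcdefghijklmnopqrstuvwxyz NCLFT"
CODES = [3, 25, 14, 9, 1, 13, 26, 20, 6, 11, 15, 18, 28, 12, 24, 22,
         23, 10, 5, 16, 7, 30, 19, 29, 21, 17, 4, 0, 8, 2, 27, 31]
baudot = {ch: f"{code:#07b}" for ch, code in zip(LETTERS, CODES)}
code_to_letter = {code: letter for letter, code in baudot.items()}


def xor_text(text, key_string):
    """XOR each character of text with the key character at the same position."""
    if len(text) != len(key_string):
        raise ValueError("text and key_string must have the same length")
    result = []
    for letter, key_letter in zip(text, key_string):
        code = int(baudot[letter], 2) ^ int(baudot[key_letter], 2)
        result.append(code_to_letter[f"{code:#07b}"])
    return "".join(result)


message = "my secret message"
key_string = "CrwbFLFtoiibksNai"  # key stream from the example run

received = xor_text(message, key_string)     # encrypt
decrypted = xor_text(received, key_string)   # decrypt with the same key

print(received)   # hTqmgnzzCLgorNabu
print(decrypted)  # my secret message
```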

Going further

This has been a very simple Python Lorenz cipher exploration and there are doubtless ways that it could be taken further. One obvious way would be to represent more accurately how the Lorenz cipher’s pseudorandom generation worked. Alternatively, you could attempt to crack the resulting encrypted message, as you might with something like a simple Caesar cipher.
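As a first step towards the wheel-based approach, here is a heavily simplified sketch. It uses five wheels of 41, 31, 29, 26 and 23 positions (the sizes of the real machine’s chi wheels), each holding a pattern of 0s and 1s and each supplying one bit of the 5-bit key code. Everything else is an assumption made for illustration: the pin patterns are drawn from a seeded random generator, all wheels advance by one position per character, and the psi and motor wheels of the real machine are left out entirely. The names make_wheels and wheel_keystream are hypothetical.

```python
import random

# Number of positions on each of the five wheels in this toy model.
WHEEL_LENGTHS = (41, 31, 29, 26, 23)


def make_wheels(seed):
    """Create one pseudorandom 0/1 pin pattern per wheel (illustrative only)."""
    rng = random.Random(seed)
    return [[rng.randint(0, 1) for _ in range(n)] for n in WHEEL_LENGTHS]


def wheel_keystream(wheels, length):
    """Return `length` 5-bit key codes, one bit taken from each wheel per step."""
    codes = []
    for step in range(length):
        code = 0
        for wheel in wheels:
            # Each wheel has moved `step` positions; wrap around its length.
            code = (code << 1) | wheel[step % len(wheel)]
        codes.append(code)
    return codes


wheels = make_wheels(seed=42)  # the seed stands in for the agreed wheel settings
keystream = wheel_keystream(wheels, 17)

for code in keystream:
    print(f"{code:05b}")
```

Because the wheel lengths share no common factors, the combined pattern only repeats after 41 × 31 × 29 × 26 × 23 steps, which is one reason for using several wheels of different sizes.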