Cryptopals challenge 2: Fixed XOR in Python

This is my write-up of the second Cryptopals challenge, using Python 3 as my language of choice. The challenge:

Fixed XOR

Write a function that takes two equal-length buffers and produces their XOR combination.

If your function works properly, then when you feed it the string:

1c0111001f010100061a024b53535009181c

… after hex decoding, and when XOR’d against:

686974207468652062756c6c277320657965

… should produce:

746865206b696420646f6e277420706c6179

For some background on XOR, I’ll summarize Wikipedia: XOR (exclusive or) is a logical operation that outputs true only when its inputs differ (one is true, the other is false). It has many uses in computer science and is heavily used in cryptographic operations. Here is an XOR truth table, showing the result of XOR’ing input A with input B:

A  B  A XOR B
0  0  0
0  1  1
1  0  1
1  1  0

Let’s take a look at this in Python. First, you can get the integer value of a single character (or a single byte, in this case) using the ord() function:

>>> ord(b'A')
65
>>> ord(b'a')
97

To understand what happens when we XOR ‘A’ with ‘a’, we can break these integer values down into binary:

>>> bin(ord(b'A'))
'0b1000001'
>>> bin(ord(b'a'))
'0b1100001'

Python strips the leading zeroes, so I’ll add them back to make the bits easier to read. Let’s stack the bits and XOR them manually; the third row is the result.

01000001
01100001
00100000
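
If you would rather not pad the zeroes by hand, format() can do it. This is a minimal sketch; the helper name to_bits is my own and is not part of the challenge:

```python
def to_bits(value: int) -> str:
    """Return value as an 8-character binary string, keeping leading zeroes."""
    return format(value, "08b")


a = ord(b"A")  # 65
b = ord(b"a")  # 97

print(to_bits(a))      # 01000001
print(to_bits(b))      # 01100001
print(to_bits(a ^ b))  # 00100000
```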

Python confirms this. Converting the result back to a character with the chr() function (and then encoding it to a byte string) produces the space character, ‘ ’:

>>> ord(b'A') ^ ord(b'a')
32
>>> chr(ord(b'A') ^ ord(b'a')).encode()
b' '

Luckily, the Wikipedia article on the XOR cipher had an example implementation in Python 3, so my solution can mainly be attributed to that article.
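
A minimal sketch of a list-comprehension version along those lines, assuming the function is called xor_byte_strings (the name used in the rest of this post). The parameter names and the hex decoding and encoding around it are my own guesses and may not match the article's exact code:

```python
def xor_byte_strings(input_bytes_1: bytes, input_bytes_2: bytes) -> bytes:
    """XOR two byte strings together, byte by byte."""
    return bytes([b1 ^ b2 for b1, b2 in zip(input_bytes_1, input_bytes_2)])


# Hex-decode the two challenge inputs before XOR'ing them.
buffer_1 = bytes.fromhex("1c0111001f010100061a024b53535009181c")
buffer_2 = bytes.fromhex("686974207468652062756c6c277320657965")

result = xor_byte_strings(buffer_1, buffer_2)
print(result.hex())  # 746865206b696420646f6e277420706c6179
```

The hex string printed at the end is the one the challenge expects.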

Let’s explore the xor_byte_strings function to understand what is going on in the code.

Sometimes list comprehensions can make things look confusing, so let’s rewrite it so that it is easier to step through.
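
A sketch of what the expanded, loop-based version could look like, again assuming the xor_byte_strings name. The variable names and comments here are illustrative rather than the exact original listing:

```python
def xor_byte_strings(input_bytes_1: bytes, input_bytes_2: bytes) -> bytes:
    """XOR two byte strings together, one byte at a time."""
    xord_bytes = b""  # the result starts out empty

    # zip() pairs the i-th byte of each input; iterating bytes yields integers.
    for b1, b2 in zip(input_bytes_1, input_bytes_2):
        # XOR the two integers, wrap the result in a one-byte bytes object,
        # and append it to what has been built so far.
        xord_bytes += bytes([b1 ^ b2])

    return xord_bytes


print(xor_byte_strings(b"A", b"a"))  # b' '
```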

With these changes the function still returns the same result. The comments explain what is going on in the code, but I’ll explain in a bit more detail.

The zip() function returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables. For example:

>>> s1 = '123'
>>> s2 = '456'
>>> for i,j in zip(s1,s2):
...     print(i,j)
...
1 4
2 5
3 6

So in this case, I am taking one byte at a time from each byte string and XOR’ing the pair inside the bytes() call, bytes([b1 ^ b2]), where ^ is the XOR operator. Note that when you iterate over a byte string, the elements are integers, which is why you are able to XOR them. For example:

>>> name = b'Jake'
>>> for letter in name: print(letter)
...
74
97
107
101
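
Putting the two ideas together, here is a small sketch of zip() over two byte strings (the sample values are arbitrary). It also shows a caveat worth knowing for a challenge that asks for equal-length buffers: zip() stops at the shorter input, so unequal buffers are silently truncated rather than rejected.

```python
# zip() over bytes objects yields pairs of integers, not characters.
pairs = list(zip(b"AB", b"ab"))
print(pairs)  # [(65, 97), (66, 98)]

# Each pair can be XOR'd and the results collected into a new byte string.
xored = bytes([b1 ^ b2 for b1, b2 in pairs])
print(xored)  # b'  '

# With unequal lengths, the extra bytes of the longer input are dropped.
truncated = list(zip(b"abc", b"a"))
print(truncated)  # [(97, 97)]
```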

If you have difficulty understanding the code, it helps to add print() and input() calls so you can step through it. For example, printing the XOR’d bytes on each iteration shows how the byte string is built.
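
A minimal sketch of that idea, assuming a loop-based XOR function. The name xor_byte_strings_verbose is hypothetical, and I have left input() commented out so the example runs straight through; uncomment it to pause on each iteration.

```python
def xor_byte_strings_verbose(input_bytes_1: bytes, input_bytes_2: bytes) -> bytes:
    """XOR two byte strings, printing each step along the way."""
    xord_bytes = b""
    for b1, b2 in zip(input_bytes_1, input_bytes_2):
        xord_bytes += bytes([b1 ^ b2])
        # Show the two input integers, their XOR, and the result so far.
        print(b1, "^", b2, "=", b1 ^ b2, "->", xord_bytes)
        # input()  # uncomment to pause until Enter is pressed
    return xord_bytes


xor_byte_strings_verbose(b"AB", b"ab")
```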
