Cryptopals challenge 9: Implement PKCS#7 padding in Python

This is my write-up of the ninth Cryptopals challenge, using Python 3 as my language of choice. The challenge:

Implement PKCS#7 padding

A block cipher transforms a fixed-sized block (usually 8 or 16 bytes) of plaintext into ciphertext. But we almost never want to transform a single block; we encrypt irregularly-sized messages.

One way we account for irregularly-sized messages is by padding, creating a plaintext that is an even multiple of the blocksize. The most popular padding scheme is called PKCS#7.

So: pad any block to a specific block length, by appending the number of bytes of padding to the end of the block. For instance,

“YELLOW SUBMARINE”

… padded to 20 bytes would be:

“YELLOW SUBMARINE\x04\x04\x04\x04”

Padding background

Block cipher modes for symmetric-key encryption algorithms require plaintext input that is a multiple of the block size, so messages may have to be padded to bring them to this length. In PKCS#7 padding, padding is in whole bytes. The value of each added byte is the number of bytes that are added, i.e. N bytes, each of value N, are added. The number of bytes added depends on the block boundary to which the message needs to be extended.

The padding will be one of:
01
02 02
03 03 03
04 04 04 04
05 05 05 05 05
06 06 06 06 06 06
and so on, up to a full block of padding.
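
The rule above can be sketched in a few lines of Python. The helper name padding_for is hypothetical, made up purely for illustration; it returns only the bytes that would be appended to a message of a given length.

```python
def padding_for(message_length: int, block_size: int) -> bytes:
    """Return the PKCS#7 padding bytes for a message of the given length.

    padding_for is an illustrative helper, not part of any library.
    """
    # Bytes needed to reach the next block boundary: always 1..block_size.
    n = block_size - (message_length % block_size)
    # N bytes, each of value N.
    return bytes([n]) * n


# Padding for messages of 7, 6, 5 and 8 bytes with an 8-byte block size.
for length in (7, 6, 5, 8):
    print(length, padding_for(length, 8).hex())
```

Because the remainder is subtracted from the block size, the result is never zero: a message that already ends on a block boundary gets a whole block of padding.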

Let’s do this in Python…

The easy way

PyCryptodome includes both pad and unpad functions. To solve the challenge we can use the pad function:

>>> from Crypto.Util.Padding import pad
>>> block_size = 20
>>> data = b'YELLOW SUBMARINE'
>>> pad(data, block_size)
b'YELLOW SUBMARINE\x04\x04\x04\x04'

Of note, if the message length is already a multiple of the block size (here, exactly one 20-byte block), a full extra block of padding is added, with every byte set to the block size (20, i.e. \x14):

>>> data = b'YELLOW SUBMARINE1234'
>>> pad(data, block_size)
b'YELLOW SUBMARINE1234\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14'
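
The extra block is what keeps unpadding unambiguous: the receiver reads the last byte to learn how many bytes to strip, so there must always be padding to read. Here is a rough sketch of that idea. The strip_padding helper is hypothetical and written only for this illustration; it is not the library's unpad function.

```python
def strip_padding(padded: bytes) -> bytes:
    """Remove PKCS#7 padding by reading the final byte (illustrative only)."""
    n = padded[-1]  # the last byte says how many padding bytes were added
    # Every one of the last n bytes must have the value n.
    if n == 0 or n > len(padded) or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return padded[:-n]


# A message that exactly fills a 20-byte block carries a full block of \x14.
padded = b"YELLOW SUBMARINE1234" + b"\x14" * 20
print(strip_padding(padded))  # b'YELLOW SUBMARINE1234'

# If no padding were added to full blocks, an unpadded message that happens
# to end in \x01 would be mistaken for padded data and lose its last byte.
print(strip_padding(b"YELLOW SUBMARINE123\x01"))  # b'YELLOW SUBMARINE123'
```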

The harder way

The harder way is to recreate pad() myself, which isn’t too complicated (maybe…hopefully I got this right…). Here is what I did:
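
A minimal sketch of such a function, consistent with the test output below. The exact original listing may differ in detail, and the range check on block_size is an added assumption rather than something the challenge asks for.

```python
def pkcs7_pad(data: bytes, block_size: int) -> bytes:
    """Pad data to a multiple of block_size using PKCS#7 (sketch)."""
    # Assumption: a padding value must fit in one byte, so limit the block
    # size to 1..255.
    if not 1 <= block_size <= 255:
        raise ValueError("block_size must be between 1 and 255")
    # Distance to the next block boundary; a full block if already aligned.
    pad_len = block_size - (len(data) % block_size)
    # Append pad_len bytes, each with the value pad_len.
    return data + bytes([pad_len]) * pad_len
```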

I tested this the same way I tested the pad function in PyCryptodome:

>>> data = b'YELLOW SUBMARINE'
>>> block_size = 20
>>> pkcs7_pad(data, block_size)
b'YELLOW SUBMARINE\x04\x04\x04\x04'
>>> data = b'YELLOW SUBMARINE1234'
>>> pkcs7_pad(data, 20)
b'YELLOW SUBMARINE1234\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14\x14'

After writing and testing this function, it occurred to me that I could just look at the source of the PyCryptodome pad function, since it’s already on my machine (under C:\Python36\Lib\site-packages\Crypto\Util\ on my box). It turns out I got it right. I’m not sure why I didn’t think to look at the source earlier.
