Brandon T. Elliott
CTF Writeup: TAMUctf 2023 - "MD5"
The CTF
TAMUctf 2023 took place from April 28, 2023 to April 30, 2023 and was organized by the Texas A&M Cybersecurity Center.
The Challenge
This crypto challenge came with the following description and an archive file, md5.zip, to download:
Full transparency: although I did briefly take a look at this challenge while the event was running, I did not solve it during the event. However, after the conclusion of the event, I found the challenge interesting enough to tinker with and decided to explore it in depth, hence this writeup.
The Source Code
After extracting the contents of md5.zip, we are left with two files:
server.py, which is the source code of the server
solver-template.py, which is a template for connecting to the challenge instance server in order to attempt to solve the challenge
Taking a look at server.py, the code is fairly simple:
We first see a function called md5sum, which calculates the MD5 hash of the bytes object b. However, it then slices the result to return only the first 3 bytes.
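As a minimal sketch of what such a function could look like (this is a reconstruction from the description above using Python's hashlib, not necessarily the exact code in server.py):

```python
import hashlib


def md5sum(b: bytes) -> bytes:
    # The full MD5 digest is 16 bytes; keep only the first 3 (24 bits).
    return hashlib.md5(b).digest()[:3]


# Show the truncated hash of the command the server whitelists.
print(md5sum(b"echo lmao").hex())
```

Whatever the input, the return value is always 3 bytes long, which is what makes the rest of this writeup possible.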
MD5
An MD5 hash is 128-bit, so the full hash consists of 16 bytes.
Although MD5 is still widely used, practical collisions against the full MD5 hash have been demonstrated since 2004, and it is therefore considered insecure. A severely truncated version of the hash (in this case, only the first 3 bytes, or 24 bits) is even more susceptible to collisions.
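To put a rough number on that, here is a quick back-of-the-envelope calculation. It assumes the truncated hash behaves like a uniformly random 24-bit value, which is an idealization; the names SPACE and success_probability are just for illustration. Strictly speaking, we need an input that matches one fixed target hash, so on average that takes about as many attempts as there are possible values.

```python
# Size of the truncated hash space: 3 bytes = 24 bits.
TRUNCATED_BYTES = 3
SPACE = 2 ** (8 * TRUNCATED_BYTES)


def success_probability(attempts: int) -> float:
    """Chance that at least one of `attempts` random inputs hits a fixed target,
    assuming each attempt matches independently with probability 1/SPACE."""
    return 1 - (1 - 1 / SPACE) ** attempts


print(f"possible truncated hashes: {SPACE:,}")
for n in (1_000_000, SPACE, 5 * SPACE):
    print(f"{n:>11,} attempts -> {success_probability(n):.1%}")
```

A few tens of millions of MD5 computations is well within reach of an ordinary laptop, so brute force is a realistic approach here.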
The Source Code - continued…
Moving on in the code, we next see variables that define a whitelisted command, echo lmao, and a whitelisted hash, which is the truncated MD5 hash of that command.
Finally, we see the main function, which runs a while True loop. It prompts the user for a command to execute, converts it to bytes, and calculates the truncated md5sum of that input. If the result doesn't match the whitelisted hash, it prints an error and moves on to the next command input. Otherwise, if the hash matches that of the whitelisted command, the input is executed with /bin/bash -c.
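The check described above can be sketched roughly as follows. This is a reconstruction from the description, not the actual server.py: the names WHITELISTED_CMD, WHITELISTED_HASH, is_allowed and handle, as well as the error message, are assumptions made for illustration.

```python
import hashlib
import subprocess


def md5sum(b: bytes) -> bytes:
    # Truncated MD5: only the first 3 bytes of the digest.
    return hashlib.md5(b).digest()[:3]


# Hypothetical names; the writeup only says a whitelisted command and hash exist.
WHITELISTED_CMD = "echo lmao"
WHITELISTED_HASH = md5sum(WHITELISTED_CMD.encode())


def is_allowed(cmd: str) -> bool:
    # Only the truncated hash is compared, never the command text itself.
    return md5sum(cmd.encode()) == WHITELISTED_HASH


def handle(cmd: str) -> str:
    if not is_allowed(cmd):
        return "error: command not whitelisted"  # hypothetical message
    # Any input whose truncated hash matches gets executed by bash.
    result = subprocess.run(["/bin/bash", "-c", cmd], capture_output=True, text=True)
    return result.stdout
```

The weakness is visible in is_allowed: the server never checks that the input is actually echo lmao, only that its 3-byte hash matches.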
Therefore, the problem statement for this challenge is quite simple. In the context of the CTF, our goal is to get the contents of flag.txt, so ultimately we need a command such as cat flag.txt, or something to that effect, whose truncated hash collides with the whitelisted hash.
Finding a collision like this by hand is not feasible, so we will write a Python script that finds collisions for a requested input command.
Solver
We can first copy some of the lines from server.py, including the whitelisted command and hash and the md5sum function.
The command we want to run is read from the first command-line argument, as seen on line 15 above.
The main function then checks whether the md5sum of the cmd variable equals the whitelisted hash. As long as it doesn't, it generates a random string of 8 characters and places it inside an inconsequential echo command like this:
{command} && echo "{random_string}" > /dev/null
Once a hash collision is found, it will be printed out to the console.
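For readers who want to try this themselves, here is a minimal sketch of such a brute-force search. It is not the original solver.py: the names truncated_md5, find_collision and ALPHABET are hypothetical, the random filler is restricted to letters and digits (an assumption, to avoid characters the shell treats specially inside double quotes), and the nbytes parameter is an addition so the search can be tried quickly on a shorter hash.

```python
import hashlib
import random
import string

# Assumption: alphanumeric filler only, so the quoted string is always shell-safe.
ALPHABET = string.ascii_letters + string.digits


def truncated_md5(data: bytes, nbytes: int = 3) -> bytes:
    # First `nbytes` bytes of the MD5 digest (the challenge uses 3).
    return hashlib.md5(data).digest()[:nbytes]


def find_collision(command: str, whitelisted: str = "echo lmao", nbytes: int = 3) -> str:
    """Return a command that starts with `command` and whose truncated hash
    equals that of `whitelisted`."""
    target = truncated_md5(whitelisted.encode(), nbytes)
    candidate = command
    while truncated_md5(candidate.encode(), nbytes) != target:
        # Append a harmless echo with fresh random filler and try again.
        filler = "".join(random.choices(ALPHABET, k=8))
        candidate = f'{command} && echo "{filler}" > /dev/null'
    return candidate
```

Calling find_collision("cat flag.txt") performs the full 3-byte search, which needs around 16.7 million attempts on average; passing nbytes=2 finishes almost instantly and is handy for checking that the loop works.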
In my testing, this usually found a collision in under a minute, and it should in theory work for any command supplied as a command-line argument (for the challenge, however, we are only interested in cat flag.txt):
┌──(kali㉿kali)-[~/tamuctf/md5]
└─$ python3 solver.py "cat flag.txt"
success! collision found in 13 seconds
here's your command:
cat flag.txt && echo "lxkb@k~Y" > /dev/null
We can now add a few lines to solver-template.py to automatically initiate the remote connection to the challenge instance and send the calculated command to it:
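The general idea of sending the command over a connection can be sketched with Python's standard socket module. This is a stand-in rather than the contents of solver-template.py: the function name send_command is hypothetical, the real host, port and any TLS settings are not reproduced here, and the reply is simply read until the server closes the connection or goes quiet.

```python
import socket


def send_command(sock: socket.socket, command: str, timeout: float = 2.0) -> str:
    """Send one command line and return whatever the other side replies."""
    sock.sendall(command.encode() + b"\n")
    sock.settimeout(timeout)
    chunks = []
    try:
        while True:
            data = sock.recv(4096)
            if not data:  # connection closed by the server
                break
            chunks.append(data)
    except socket.timeout:
        pass  # no more output within the timeout
    return b"".join(chunks).decode(errors="replace")


# Usage sketch (HOST and PORT are placeholders for the challenge instance):
#   with socket.create_connection((HOST, PORT)) as s:
#       print(send_command(s, colliding_command))
```

Here colliding_command stands for the string printed by the solver in the previous step.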
After executing this, we get the flag: