04-30-2023

CTF Writeup: TAMUctf 2023 - "MD5"

The CTF

TAMUctf 2023 took place from April 28, 2023 to April 30, 2023 and was organized by the Texas A&M Cybersecurity Center.

The Challenge

This crypto challenge began with the following description and an archive file md5.zip to download:

[Image: challenge description for "md5"]

Full transparency: although I did briefly take a look at this challenge while the event was running, I did not solve it during the event. However, after the conclusion of the event, I found the challenge interesting enough to tinker with and decided to explore it in depth, hence this writeup.

The Source Code

After extracting the contents of md5.zip we are left with two files:

server.py, which is the source code of the server
solver-template.py, which is a template for connecting to the challenge instance server in order to attempt to solve the challenge

Taking a look at server.py, the code is fairly simple:

[Image: server.py source code]

We first see a function called md5sum, which calculates the MD5 digest of the bytes object b but then slices the result so that only the first 3 bytes are returned.
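A minimal sketch of such a truncated hash function, assuming it is built on Python's standard hashlib module (the original source is only shown as a screenshot, so the exact body may differ):

```python
import hashlib


def md5sum(b: bytes) -> bytes:
    # Compute the full 16-byte MD5 digest, then keep only the first 3 bytes.
    return hashlib.md5(b).digest()[:3]


full = hashlib.md5(b"echo lmao").digest()
short = md5sum(b"echo lmao")
print(len(full), len(short))  # 16 3
```

The truncated value is simply a prefix of the full digest, so any two inputs whose full digests share their first 3 bytes are indistinguishable to the server.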

MD5

An MD5 hash is 128 bits long, so the full digest consists of 16 bytes.

Although MD5 is still widely used, practical collisions against the full hash have been demonstrated since 2004, and it is therefore considered insecure. A severely truncated version of the hash (in this case, only the first 3 bytes, or 24 bits) is far more susceptible still. Strictly speaking, matching one fixed target hash is a second-preimage search rather than a free collision search, but with only 24 bits to match, brute force is more than enough.
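To put a number on how weak a 3-byte prefix is, the following back-of-the-envelope sketch counts the possible truncated values and estimates the chance of hitting one fixed target after a given number of random attempts. It assumes MD5 output behaves like a uniform random value, and the helper name success_probability is purely illustrative:

```python
TRUNCATED_BYTES = 3

# Number of distinct values a 3-byte (24-bit) prefix can take.
space = 2 ** (8 * TRUNCATED_BYTES)
print(space)  # 16777216


def success_probability(attempts: int) -> float:
    # Chance that at least one of `attempts` independent random inputs
    # matches one fixed target prefix.
    return 1 - (1 - 1 / space) ** attempts


for attempts in (1_000_000, space, 4 * space):
    print(attempts, round(success_probability(attempts), 3))
```

On average, about 16.7 million attempts are needed per target, which a single machine can hash within seconds to minutes.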

The Source Code - continued…

Next, the code defines two variables: a whitelisted command, echo lmao, and a whitelisted hash, which is the truncated MD5 hash of that command.

Finally, the main function runs a while True loop that prompts the user for a command. The input is converted to bytes and its truncated md5sum is calculated. If the result does not match the whitelisted hash, the server prints an error and moves on to the next input. Otherwise, the command is executed with /bin/bash -c.
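The logic described above can be sketched roughly as follows. This is a reconstruction from the description, not the challenge's actual source: the names WHITELISTED_CMD, WHITELISTED_HASH and is_allowed, the prompt, and the error message are all assumptions.

```python
import hashlib
import subprocess


def md5sum(b: bytes) -> bytes:
    # Truncated MD5: only the first 3 bytes of the digest are kept.
    return hashlib.md5(b).digest()[:3]


# Hypothetical names for the whitelisted command and its truncated hash.
WHITELISTED_CMD = "echo lmao"
WHITELISTED_HASH = md5sum(WHITELISTED_CMD.encode())


def is_allowed(cmd: str) -> bool:
    # A command passes if its truncated hash equals the whitelisted one.
    # Note that the command text itself is never compared.
    return md5sum(cmd.encode()) == WHITELISTED_HASH


def main() -> None:
    while True:
        cmd = input("> ")  # assumed prompt
        if not is_allowed(cmd):
            print("invalid command")  # assumed error message
            continue
        # Matching commands are handed to bash unchanged.
        subprocess.run(["/bin/bash", "-c", cmd])
```

Calling main() starts the loop. The weakness is visible in is_allowed: only 24 bits of the hash are checked, so any command with the right prefix is executed.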

The problem statement for this challenge is therefore quite simple. Our goal is to read the contents of flag.txt, so we need a command such as cat flag.txt (or anything to that effect) whose truncated hash collides with the whitelisted hash.

Finding such a collision by hand is not feasible, so we will write a Python script that finds one for any requested input command.

Solver

We can first copy some of the lines from server.py including the whitelisted command and hash, and the md5sum function.

[Image: solver.py source code]

The command we want to run is read from the first command-line argument, as seen on line 15 above.

The main function then checks whether the md5sum of the cmd variable equals the whitelisted hash. While it does not, it generates a random 8-character string and places it inside an inconsequential echo command appended to the original command, like this:

{command} && echo "{random_string}" > /dev/null

Once a hash collision is found, it will be printed out to the console.

In my testing, this usually found a collision in under a minute, and it should in theory work for any command you supply as a command-line argument (for the challenge, however, we are only interested in cat flag.txt):

┌──(kali㉿kali)-[~/tamuctf/md5]
└─$ python3 solver.py "cat flag.txt"
success! collision found in 13 seconds
here's your command: 
cat flag.txt && echo "lxkb@k~Y" > /dev/null
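For reference, the search loop described above can be sketched as follows. This is not the author's exact script. The function name find_collision and the parameter n (prefix length, 3 on the real server) are assumptions. The sketch also draws the random string from letters and digits only, a deliberate deviation that avoids characters bash treats specially inside double quotes.

```python
import hashlib
import random
import string

# Letters and digits only, so the padding is always safe inside "..." in bash.
ALPHABET = string.ascii_letters + string.digits


def md5sum(b: bytes, n: int = 3) -> bytes:
    # First n bytes of the MD5 digest (the challenge uses n = 3).
    return hashlib.md5(b).digest()[:n]


def find_collision(command: str, target: bytes, n: int = 3) -> str:
    # Keep appending a harmless echo with fresh random padding until the
    # truncated hash of the whole command line equals the target.
    candidate = command
    while md5sum(candidate.encode(), n) != target:
        padding = "".join(random.choices(ALPHABET, k=8))
        candidate = f'{command} && echo "{padding}" > /dev/null'
    return candidate


# Quick demonstration with a 1-byte prefix so it finishes instantly.
demo_target = md5sum(b"echo lmao", 1)
demo = find_collision("cat flag.txt", demo_target, 1)
print(demo)
```

For the real challenge, the call would be find_collision("cat flag.txt", md5sum(b"echo lmao")), which needs about 16.7 million attempts on average, in line with the runtime reported above.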

We can now extend solver-template.py with a few additional lines so that it automatically opens the remote connection to the challenge instance and sends the calculated command to it:

[Image: full solve script based on solver-template.py]

After executing this, we get the flag: