Folke Challenge

In 2005 Carsten Folke Møller gave me a USB memory stick with two encrypted files. There were 72'057'594'037'927'936 possible keys. In 2018 I found the correct one.

folke.txt.cef deciphers to:

Tillykke, du har nu adgang til det rene ingenting. Mvh Folke ;-)

getthis_morten.jpg.cef deciphers to (click for the real version):

I was working on solving the challenge for a while in 2005, but found other projects to work on. In early 2015 i was making my bucket list, and I put "solve the Folke Challenge" on the list. I started working on it again. The memory stick contains three files: CruzerLock.exe which is the decryption software, folke.txt.cef, and getthis_morten.jpg.cef, which are the encrypted files. I got reverse engineering help from Peter Wilhelmsen and Samuel Neves, taking the crypto code in CruzerLock.exe apart. It turns out that the developers had problems with endianness, but there is no obvious way to use that in an attack. I named the hash algorithm AlmostSHA1, and the cipher algorithm AlmostDES. I tried a few billion of the most common passwords, but didn't find the correct one.

So, if there is no easy way to guess the key, the only option is to use brute force - try every possible key.

If I ignore the password hash, I can use brute force directly on the crypto key. The downside is that I would not get the password itself, but it would be way faster, and I would be able to decipher the two .cef files. It's a good trade-off. On my PC I can test about 500'000 keys per second, so it is going to take around 4'500 years, worst case. That's too long. I could buy 900 PCs, and have them work for five years, worst case. That's too expensive. I could design my own IC like Deep Crack, but I have no experience with that, and it would probably be very expensive as well. So, not a regular CPU, and not a custom IC.

What's in between? A Field-Programmable Gate Array!

I did not want to develop my own printed circuit board, so I was looking for a simple board with just an FPGA, and the few other parts required. As the key space is easily devided into smaller chunks, it is easy to parallelize the brute force attack. If it is cheaper to buy two slow FPGAs than one fast FPGA, as long as they are not half speed, then do that. If the FPGA is double the size, but not double the prize, then buy that one. I ended up buying 32 CoreEP3C16 boards. They are relatively old, not super fast, but they are cheap and have enough logic elements to host four AlmostDES instances running at 100 MHz. That means each board can test 400'000'000 keys per second, a total of 12'800'000'000 keys per second with the 32 FPGAs. That's 25'600 times faster than my PC, and makes the worst case 65 days.

It took me a while to implement AlmostDES in the Very High Speed Integrated Circuit Hardware Description Language, working on-and-off on the project, and then I started overengineering. I 3D-printed mounting frames for the FPGA boards. I implemented I²C as the communications protocol, and added an Atmel AVR microcontroller to connect a regular computer with that bus. I added temperature sensors to the design, so I could automatically shut down the FPGAs, if the fans stopped running. I wrote a scheduler, which could keep track of which key spaces had been tested. I added support for "test vectors" so each job not only had the real target, but also one which the FPGA had to find the key for, in the key space covered by that job. I added jumpers, to physically configure each node's address on the I²C bus. I added CRC32 checksumming of all communication between the scheduler and the FPGAs.

At some point, I was ready to get the job done.

In the winter of 2017/2018 I bought a particle board. With the help of Jacob, Simon, and Lucas, we managed to cut it in proper dimentions, and have the cooling fans glue-gunned onto the board. The original plan was to put all the FPGAs in a rack mountable case, but I figured it was easier to just build something a bit larger, and put it in my dining room. I mounted an old power supply (which used to power a DEC Alpha machine, when I worked for Folke), the microcontroller, and a small ARM computer to do the scheduling. It took me some evenings to wire it all up.

Oh, and then I added a feature to the daemon, so you can tell how far it is, what job each node is handling, and what temperatures the different sensors see:

$ telnet ades.afdelingp.dk 1337
7481012 out of 16777216 jobs (44.59%) done.

7226C3 7226C0 7226C4 7226B7 7226B4 7226C9 7226CE 7226C1
7226C8 7226C7 7226BB 7226C6 7226CC 7226CF 7226CB 7226B8
7226D1 7226C2 7226D3 7226CD 7226D2 7226BE 7226BD 7226B5
7226BA 7226B6 7226CA 7226C5 7226D0 7226B9 7226BC 7226BF

19.7500 20.7500 19.8750 20.3125 19.7500 20.8125 20.6875 21.3125

In the end, it looks like this (click for larger version):

I started the FPGA cluster on 2018-04-03, and it ran for 55 days, before finding the key: 11010000010000000011100001111110000011010011011011000100.

$Id: index.xml,v 1.2 2018/07/29 06:35:57 moki Exp $