In order to actually get a start on the project I needed an FPGA, power, a method to program the FPGA, etc. Getting a board made for this takes time (and money), and I just wanted to get going, so I looked around for a device that already had an FPGA, microprocessor, glue logic, etc. After a while I thought of reusing a Proxmark3 -- I didn't want to (potentially) destroy an already working unit, so (3 days before I was scheduled to leave for IETF 71) I decided to quickly build a partial unit, with just the FPGA, processor, USB glue and JTAG stuff.
Unfortunately the FPGA on the Proxmark3 is much too small to actually implement any hash algorithms, but it did allow me to work out the general principles while sitting in some of the more boring working groups.
What I came up with was a system where the PC pushes general parameters down to the Atmel processor on the unit -- basically a list of candidate hashes, a device ID and where in the key-space to start. The microprocessor loads the FPGA bit image into memory and writes the work-unit parameters (Start, End, the salt and the target Hash) into the relevant bits of the image.
It then programs the (modified) image into the FPGA and signals it to start -- this somewhat kludgey system was chosen because I (a) had only a limited number of (connected) pins and (b) didn't want to develop and implement a whole protocol for communication.
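As a rough illustration of the image-patching idea, here is a minimal Python sketch. It assumes the values written are the Start, End, salt and Hash used by the search, that each one lives at a known byte offset in the image, and that the counters are 32-bit big-endian words. The function name `patch_image`, the `offsets` table and the field widths are all hypothetical, and the sketch ignores any checksum a real bitstream might carry.

```python
import struct


def patch_image(image, offsets, start, end, salt, target):
    """Return a copy of `image` with the work-unit parameters written in.

    `offsets` maps a field name to a byte offset in the image. These offsets
    are hypothetical; a real bit image would need them taken from the design.
    """
    patched = bytearray(image)  # work on a copy, keep the pristine image
    fields = {
        "start": struct.pack(">I", start),  # assumed 32-bit big-endian
        "end": struct.pack(">I", end),      # assumed 32-bit big-endian
        "salt": salt,
        "hash": target,
    }
    for name, value in fields.items():
        off = offsets[name]
        if off < 0 or off + len(value) > len(patched):
            raise ValueError(f"field {name!r} does not fit in the image")
        patched[off:off + len(value)] = value  # same-length overwrite
    return bytes(patched)
```

Keeping the original image untouched and patching a copy means the processor can build the next work unit's image from the same pristine template.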
The FPGA initializes a counter with Start, concatenates that with the salt and then calculates the hash. It compares this with Hash and, if they match, it signals that it found the result within this work unit. If they don't match, it increments the counter and rechecks until End is reached. If End is reached without finding the correct input, the FPGA signals this, the processor builds a new image and restarts the FPGA. In order to keep the communication between the FPGA and the microprocessor as simple as possible, the FPGA does not actually return which input to the hash function matched, but only that it lies within the current work unit -- this means that the host computer still has to perform some brute-force work, but only has to perform at most End - Start hashes. I chose a work unit of 2^22 hashes, which provides a reasonable trade-off between restarting the FPGA and search time on the PC.
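The division of labour described above can be sketched in Python. This is only a software model under stated assumptions: MD5 stands in for whatever hash is actually targeted, the counter is encoded as a 4-byte big-endian word placed before the salt, and the range is treated as half-open (Start included, End excluded). All function names here are made up for illustration.

```python
import hashlib
import struct

WORK_UNIT = 1 << 22  # work-unit size used when none is given


def digest(counter, salt):
    # Assumed encoding: 4-byte big-endian counter followed by the salt.
    # MD5 is a stand-in; the hash actually targeted is not specified.
    return hashlib.md5(struct.pack(">I", counter) + salt).digest()


def fpga_search(start, end, salt, target):
    """Model of the FPGA: reports only hit or miss for the work unit."""
    for counter in range(start, end):
        if digest(counter, salt) == target:
            return True
    return False


def host_recover(start, end, salt, target):
    """Host-side brute force over a work unit known to contain the hit."""
    for counter in range(start, end):
        if digest(counter, salt) == target:
            return counter
    return None


def crack(first_start, salt, target, unit=WORK_UNIT, max_units=4):
    """Walk work units until one reports a hit, then recover the counter."""
    start = first_start
    for _ in range(max_units):
        end = start + unit
        if fpga_search(start, end, salt, target):
            return host_recover(start, end, salt, target)
        start = end  # miss: build the next work unit and restart
    return None
```

In the real system the call to `fpga_search` would be the expensive part done in hardware, while `host_recover` is the at most End - Start hashes left for the PC once a work unit reports a hit.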
After I returned from the IETF I found my Xilinx development kit waiting for me -- unfortunately I haven't had much time recently, but hope to soon rework the communications bit and will then post some results.