Reverse Engineering keyboard firmware with Ghidra - Part 3

In which we succeed, and fail - and take a break to play through Half-Life: Alyx

At the end of Part 2, we’d found and extracted the “firmware blob”, which is the data that the updater sends over USB to the keyboard. The problem is that the data doesn’t look anything like Arm Cortex-M3 code.

00000000: 84be c2c7 450a 0879 6c0a d553 51ce 1efc  ....E..yl..SQ...
00000010: fe5b e848 e9c1 3c77 3b74 48b7 768c cbd9  .[.H..<w;tH.v...
00000020: c68b 8c2a a8ad 6709 5c0f 52d4 f666 c3d0  ...*..g.\.R..f..
00000030: a8d0 c0ea 7494 c2e7 7f0a 0879 7c0a d553  ....t......y|..S
00000040: 41ce 1efc e85b e848 fdc1 3c77 2174 48b7  A....[.H..<w!tH.
00000050: 05cd cbd9 b5ca 8c2a dbec 6709 2f4e 52d4  .......*..g./NR.
00000060: ee66 c3d0 b6d0 c0ea 07d5 c2e7 630a 0879  .f..........c..y
00000070: 7e0a d553 41ce 1efc e85b e848 fdc1 3c77  ~..SA....[.H..<w
00000080: 2174 48b7 05cd cbd9 b5ca 8c2a dbec 6709  !tH........*..g.
00000090: 2f4e 52d4 ee66 c3d0 b6d0 c0ea 07d5 c2e7  /NR..f..........
000000a0: 630a 0879 7e0a d553 328f 1efc e85b e848  c..y~..S2....[.H
000000b0: fdc1 3c77 2174 48b7 05cd cbd9 b5ca 8c2a  ..<w!tH........*
000000c0: dbec 6709 2f4e 52d4 ee66 c3d0 b6d0 c0ea  ..g./NR..f......
000000d0: 07d5 c2e7 104b 0879 0d4b d553 328f 1efc  .....K.y.K.S2...
000000e0: 9b1a e848 afc6 3c77 7772 48b7 05cd cbd9  ...H..<wwrH.....
000000f0: b5ca 8c2a dbec 6709 2f4e 52d4 ee66 c3d0  ...*..g./NR..f..

I spent a bunch of time failing to decode it with 4-byte XOR keys (like those used elsewhere), but after finally taking Sprite_tm’s clue about a 52-byte key instead, I made some progress.

I wrote a little program which would split the blob into 52-byte chunks, and generate a histogram of values for each position in the chunks. I figured ‘0’ is a common value, and ‘0’ XORed with the key would give the key value. So across the whole blob, the most common value in each position is reasonably likely to be the key value.

  0: 0x74 0x94 0xc2 0xe7 0x10 0x4b 0x08 0x79
  8: 0x0d 0x4b 0xd5 0x53 0x32 0x8f 0x1e 0xfc
 16: 0x9b 0x1a 0xe8 0x48 0x8e 0x80 0x3c 0x57
 24: 0x52 0x35 0x48 0xb7 0x76 0x8c 0xcb 0xf9
 32: 0xc6 0x8b 0x8c 0x2a 0xa8 0xad 0x67 0x29
 40: 0x5d 0x0f 0x52 0xd4 0x9d 0x27 0xc3 0xf0
 48: 0xc5 0x91 0xc0 0xea
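A minimal sketch of that histogram idea in Python (this is not the original program; `guess_key` and its `key_len` parameter are names made up for illustration). It assumes the blob is a plain repeating-key XOR and that 0x00 is the most common plaintext byte at every key position, which is not always true:

```python
from collections import Counter


def guess_key(blob, key_len=52):
    """Guess a repeating XOR key by taking the most common byte per position.

    Assumes 0x00 is the most common plaintext byte at each key position, so
    the most common ciphertext byte there equals the key byte.
    """
    if len(blob) < key_len:
        raise ValueError("blob is shorter than one key length")
    key = bytearray()
    for pos in range(key_len):
        # Every key_len-th byte, starting at pos, was XORed with key[pos].
        column = blob[pos::key_len]
        value, _count = Counter(column).most_common(1)[0]
        key.append(value)
    return bytes(key)
```

Positions where some other value is more common than zero come out wrong, so the result is best treated as a first guess at the key.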

Using this to XOR with each byte, we get a much better looking chunk of possible Arm code:

00000000: f0 2a 00 20 55 41 00 00 61 41 00 00 63 41 00 00  .*. UA..aA..cA..
00000010: 65 41 00 00 67 41 00 00 69 41 00 00 00 00 00 00  eA..gA..iA......
00000020: 00 00 00 00 00 00 00 00 00 00 00 00 6b 41 00 00  ............kA..
00000030: 6d 41 00 00 00 00 00 00 6f 41 00 00 71 41 00 00  mA......oA..qA..
00000040: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
00000050: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
00000060: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
00000070: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
00000080: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
00000090: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
000000a0: 73 41 00 00 73 41 00 00 00 00 00 00 73 41 00 00  sA..sA......sA..
000000b0: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
000000c0: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..
000000d0: 73 41 00 00 00 00 00 00 00 00 00 00 00 00 00 00  sA..............
000000e0: 00 00 00 00 21 46 00 00 25 47 00 00 73 41 00 00  ....!F..%G..sA..
000000f0: 73 41 00 00 73 41 00 00 73 41 00 00 73 41 00 00  sA..sA..sA..sA..

You might remember from Part 2 that Cortex-M series microcontroller firmware starts with a very specific pattern. Here we can see the initial stack pointer in the first 4 bytes (0x20002af0), followed by a very structured list of pointers to interrupt handlers (e.g. 0x00004173). This is encouraging, because we saw from the Wireshark packet capture that the code is programmed starting at address 0x4000, so 0x4173 is a very plausible location for an interrupt handler. 0x00004173 specifically repeats a lot, and so is likely a function which is used for all of the unused interrupts.
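To make that reading of the dump concrete, here is a small sketch that pulls the little-endian words out of a decoded image and picks the most repeated handler. The function names are hypothetical, and the "most common non-zero entry is the default handler" rule is just the heuristic from the paragraph above:

```python
import struct
from collections import Counter


def parse_vector_table(image, count=64):
    """Return (initial_sp, handlers) from the start of a Cortex-M image."""
    n = min(count, len(image) // 4)
    if n < 2:
        raise ValueError("image too short to hold a vector table")
    # All entries are 32-bit little-endian words.
    words = struct.unpack_from("<%dI" % n, image, 0)
    return words[0], list(words[1:])


def likely_default_handler(handlers):
    """Heuristic: the most repeated non-zero vector is the catch-all handler."""
    nonzero = [h for h in handlers if h != 0]
    return Counter(nonzero).most_common(1)[0][0]


# First 16 bytes of the decoded dump above.
header = bytes.fromhex("f02a0020 55410000 61410000 63410000")
sp, handlers = parse_vector_table(header)
print(hex(sp), [hex(h) for h in handlers])
```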

You’d be forgiven for thinking that an “odd” function pointer (ending with a ‘3’) is a little strange, as pointers tend to be better aligned than that. However, this is the way that Arm processors distinguish between “Arm” and “Thumb” code. Thumb is a 16-bit encoding of the most common 32-bit Arm instructions. This improves code density, which is useful for embedded applications. If the processor is told to jump to an “odd” address, it knows that it must execute that code in Thumb mode (though the instructions themselves are still aligned to multiples of two bytes). Cortex-M cores only execute Thumb code, so every handler address in the vector table is expected to be odd like this.
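As a quick illustration, this sketch turns a vector table entry into the offset of its first instruction inside the blob. The helper name is made up, and the 0x4000 load address is the one seen in the Wireshark capture:

```python
LOAD_ADDR = 0x4000  # where the updater programs the blob, per the capture


def handler_file_offset(vector, load_addr=LOAD_ADDR):
    """Map a Thumb vector table entry to an offset within the firmware blob."""
    if vector & 1 == 0:
        raise ValueError("bit 0 is clear, so this is not a Thumb pointer")
    # Clear the Thumb bit to get the real (2-byte aligned) instruction address.
    code_addr = vector & ~1
    if code_addr < load_addr:
        raise ValueError("address is below the blob's load address")
    return code_addr - load_addr


print(hex(handler_file_offset(0x4173)))
```

So the handler at 0x00004173 actually starts at address 0x4172, which is 0x172 bytes into the decoded blob.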

Now that we have a decoded chunk of Arm/Thumb code instead of nonsense, we can load that into Ghidra and start figuring out how the firmware itself works.

Investigating the firmware

At this point, I’m trying to find out how the In-Application Programming (IAP) part of the keyboard’s code works, which is the bit that reads and programs the flash. The IAP code doesn’t let us read out the code from the keyboard, so if we can identify how it prevents that, we might be able to bypass it and dump the whole of the keyboard’s flash.

My suspicion was that the actual read/erase/program code wouldn’t be in the blob I’d extracted. That blob is meant to be loaded at 0x4000, but the chip will reset to start running code from 0x0000, so there must be some other code permanently in the keyboard which isn’t touched by the firmware updater. That code is likely to be the thing handling IAP.

My simple approach to finding the 52-byte key did end up having a few bits which were incorrect, for example the 0x71 in the vector table is meant to be 0x73. This manifested as occasional little chunks of invalid code or bad addresses, and it took me a couple of iterations of identifying and fixing flipped bits in the key by hand to get it right.
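One way to do that kind of fix-up, sketched here with a hypothetical `fix_key_byte` helper: if you are confident what the plaintext at some offset should be, XORing it with the blob byte gives the key byte directly. As an example, in the listings above the blob byte at offset 0x17 is 0x77 and the decoded dump shows 0x00 there, which would imply a key byte of 0x77 rather than the 0x57 the histogram produced (a single flipped bit):

```python
# Key as printed by the histogram pass, transcribed from the listing above.
HISTOGRAM_KEY = bytes.fromhex(
    "7494c2e7104b0879" "0d4bd553328f1efc" "9b1ae8488e803c57"
    "523548b7768ccbf9" "c68b8c2aa8ad6729" "5d0f52d49d27c3f0" "c591c0ea"
)
# First 32 bytes of the encoded blob, transcribed from the first hex dump.
BLOB_HEAD = bytes.fromhex(
    "84bec2c7450a08796c0ad55351ce1efc" "fe5be848e9c13c773b7448b7768ccbd9"
)


def fix_key_byte(key, blob, offset, expected_plain):
    """Return a copy of key corrected so blob[offset] decodes to expected_plain."""
    fixed = bytearray(key)
    # key_byte = cipher_byte XOR plain_byte, at the key position for this offset.
    fixed[offset % len(key)] = blob[offset] ^ expected_plain
    return bytes(fixed)


fixed = fix_key_byte(HISTOGRAM_KEY, BLOB_HEAD, 0x17, 0x00)
print(hex(HISTOGRAM_KEY[0x17]), "->", hex(fixed[0x17]))
```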

When we loaded the updater .exe into Ghidra, the ISP_* symbols were left in, which helped a lot with the reverse engineering. In the firmware, there are no symbols at all, but we can get the datasheet for the chip, which tells us the addresses of the different peripherals, and that can help us figure out what’s going on. I did take a look at Thomas Roth’s (@StackSmashing) SVD loader, which is able to automatically label peripheral addresses in Ghidra for a whole bunch of Arm Cortex microcontrollers, but I didn’t look very hard for an SVD file for the HT32F1654. I think that’s something I’ll try in the future.

Working by hand, we can look in the datasheet to find that the base address of the USB peripheral is 0x400A8000, then we can search in Ghidra for any memory locations containing 0x400A80 and hopefully that will find us code which interacts with the USB peripheral (I’ve already labelled a bunch of them in this screenshot):

References to the USB peripheral
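The same kind of search can be sketched outside Ghidra. This is only an approximation of what the memory search does: it scans the decoded blob for 4-byte aligned little-endian words whose top 24 bits are 0x400A80, on the assumption that peripheral addresses sit in literal pools. The function name and the alignment assumption are mine, not Ghidra’s:

```python
import struct

USB_BASE = 0x400A8000  # USB peripheral base address from the datasheet


def find_peripheral_refs(image, base=USB_BASE, load_addr=0x4000):
    """List (address, value) pairs for words that look like base + small offset."""
    hits = []
    for off in range(0, len(image) - 3, 4):
        (word,) = struct.unpack_from("<I", image, off)
        # Match on the top 24 bits, i.e. anything of the form 0x400A80xx.
        if (word >> 8) == (base >> 8):
            hits.append((load_addr + off, word))
    return hits
```

Each hit is only a candidate: the code that loads that literal is what is worth labelling in Ghidra.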

Working through the disassembled and decompiled code is pretty tough, but it turns out that much of it uses Holtek’s libraries, which we can get the source code for from their website. Using this source code, we can compare it to the decompiled code side-by-side and it helps a lot with figuring out what bits of the code do.

After spending a few evenings working through the firmware, I was quite confident about a couple of things:

  • There are no flash programming routines in the code from the updater
  • There’s no CRC checking or blob decoding in the code from the updater

A brief detour into physical access

Throughout this whole thing, I’ve been trying to avoid physically soldering to the keyboard. However, failing to find what I wanted in the firmware blob, I thought I would at least try.

Taking the keyboard apart and probing around I was pretty pleased to find that the board actually has a 5-pin 0.1" header footprint, which breaks out the SWD debug pins and sits right under the space bar! This means you can solder a header onto it, and leave that in place without needing to modify the case or otherwise interfere with the operation of the keyboard!

SWD header

The pinout is:

 |_|  o  o  o  o
  |   |  |  |  `--- GND
  |   |  |  `------ nRST
  |   |  `--------- SWCLK
  |   `------------ SWDIO
  `---------------- 3v3

I soldered a header on, and used openocd on a Pi Zero in USB gadget mode to see if I could learn anything that way. I knew this was unlikely, due to the flash protection features of the microcontroller. If activated, this prevents any access to flash on the CPU’s data bus when a debugger is attached, which prevents the debugger from reading out the flash contents, but also prevents the code on the keyboard from reading flash, which crashes it pretty quickly.

I was able to hand-write some machine code into RAM and jump to it, but as suspected, I couldn’t read any flash locations that way (though I could dump peripheral registers, which isn’t terribly useful).

Still, hopefully in the future I’ll be able to dump the whole of flash, then mass-erase it to remove the flash protection, and then the SWD header will be very useful for debugging and developing new firmware.

Attempting to modify the firmware

Without a full flash dump to restore from, I was really hesitant to attempt to modify the firmware, as it risks bricking the keyboard with no way to restore working code. However, seeing that the IAP code wasn’t in the blob from the updater, I was fairly confident that it was in another part of flash which is protected from accidental breakage.

So, I made the smallest modification I could: In the product name string “USB-HID Keyboard”, I changed “HID” to “KID”, re-encoded the data, dumped it back into the right part of the .exe file, copied it over to Windows and ran it.

It ran as usual, and let me start the programming process - but then at the part where it normally resets back into “normal” operation mode, the keyboard LEDs stayed off and after around 10 seconds of waiting the updater said “Firmware Update Failed!”.

I was capturing the data with Wireshark, and could see that only my single byte was actually changed in the programming sequence - so my modification of the .exe was successful, it just didn’t work on the keyboard.
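The decode, patch and re-encode step can be sketched like this. Because the encoding is a byte-wise XOR, changing one plaintext byte changes exactly one encoded byte, which matches what showed up in the capture. The helper names are hypothetical, and the sketch assumes the string is stored as plain ASCII; if the firmware stores it as a UTF-16 USB string descriptor, the search pattern would need to be encoded to match:

```python
def xor_with_key(data, key):
    """Apply a repeating-key XOR (the same operation encodes and decodes)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def patch_encoded(blob, key, old, new):
    """Replace the first occurrence of old with new inside an encoded blob."""
    if len(old) != len(new):
        raise ValueError("replacement must be the same length")
    plain = xor_with_key(blob, key)
    idx = plain.find(old)
    if idx < 0:
        raise ValueError("pattern not found in decoded data")
    patched = plain[:idx] + new + plain[idx + len(old):]
    return xor_with_key(patched, key)
```

Something like `patch_encoded(blob, key, b"HID", b"KID")` would then return data ready to be written back into the .exe.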

Thankfully, running the non-modified updater, it was able to put “good” firmware back on, and I knew that I could be a bit more confident about hacking around because it seemed like I’d be able to recover with the original updater.

But why didn’t it work? After some more investigation, I found that the final operation, which writes the version string (V2.1.03) to address 0x3c00, didn’t work, and that memory was still erased. (Erased flash is full of 0xff values: programming flash can only clear bits, so to set them again the flash has to be erased back to 0xff.) I’ve no way to tell if the rest of flash was successfully written, but one of my theories is that if the version string isn’t programmed, the keyboard doesn’t jump from the “IAP” code to the “Keyboard” code.
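The “programming can only clear bits” behaviour can be modelled as a bitwise AND between what is already in flash and the data being written. This is a simplified model of typical NOR flash, not something taken from the HT32 documentation, and the function names are made up:

```python
ERASED = 0xFF


def program_flash(current, data):
    """Model a flash write: each bit can go from 1 to 0, never from 0 to 1."""
    return bytes(c & d for c, d in zip(current, data))


def is_erased(region):
    """True if every byte in the region still holds the erased value."""
    return all(b == ERASED for b in region)


page = bytes([ERASED] * 7)
written = program_flash(page, b"V2.1.03")
print(is_erased(page), written)
```

Writing to an erased region stores the data as-is, while writing over existing data without an erase first can only clear more bits.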

The key seems to be the ISP_CRCCheck() function, which runs after all of the firmware data is sent, but before the version is written. This takes a 16-bit value from one of the updater structures, and sends it with command type 02.

CRCCheck

This must represent some kind of CRC of the data, and with my modification of the firmware I made the CRC invalid. This either prevents any flash programming, or prevents the version string from being programmed.

Trying to figure out the CRC

I haven’t found out how the CRC is calculated yet. It isn’t a simple CRC of the firmware data (either encoded with the 52-byte key or not). There’s a tool called reveng, which Hackaday did a nice write-up on, that can be used to reverse-engineer CRCs, but giving it the three firmware versions I have didn’t give me any usable results.
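For reference, this is roughly what checking for a “simple CRC” looks like: a generic 16-bit CRC routine run with a handful of well-known parameter sets and compared against the value the updater sends. It is only a sketch with hypothetical helper names, the preset list is a tiny subset of what reveng knows about, and given the results above it would not be expected to match the real firmware:

```python
def reflect(value, width):
    """Reverse the bit order of a width-bit value."""
    out = 0
    for _ in range(width):
        out = (out << 1) | (value & 1)
        value >>= 1
    return out


def crc16(data, poly, init, refin, refout, xorout):
    """Bitwise CRC-16 parameterised in the usual Rocksoft-model style."""
    crc = init
    for byte in data:
        if refin:
            byte = reflect(byte, 8)
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    if refout:
        crc = reflect(crc, 16)
    return crc ^ xorout


# name: (poly, init, refin, refout, xorout)
PRESETS = {
    "CRC-16/CCITT-FALSE": (0x1021, 0xFFFF, False, False, 0x0000),
    "CRC-16/XMODEM": (0x1021, 0x0000, False, False, 0x0000),
    "CRC-16/ARC": (0x8005, 0x0000, True, True, 0x0000),
}


def matching_presets(data, target):
    """Return the names of presets whose CRC of data equals target."""
    return [name for name, p in PRESETS.items() if crc16(data, *p) == target]
```

Running `matching_presets` over both the encoded and decoded firmware, with the 16-bit value from the updater as `target`, is the kind of check that came up empty.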

What I do know is (from later experiments):

  • Just skipping the CRC check doesn’t work
  • Appending a zero byte to the firmware doesn’t work
  • Changing the CRC value breaks it
  • Writing only part of the firmware breaks it
  • Re-ordering the firmware writing (e.g. writing the second half first) breaks it
  • Putting other operations (like a read) in the middle of the firmware write is OK
  • Changing the packet transfer length breaks it
  • Simply erasing and re-writing the version, with no other changes, doesn’t work

I intend to keep using these clues to work at figuring out the CRC, then I should be able to modify the firmware in order to read out the contents of flash.

Succeeding and failing

Unable to modify the firmware, I took my various hacked-up scripts and wrote a much better/neater tool and a couple of libraries, which I’ve uploaded to GitHub:

https://github.com/usedbytes/ducky-tools/

Armed with this, and an updater .exe from Ducky, it should be possible to update the keyboard firmware on Linux or any other operating system supporting Go (and libusb). So that’s a success I guess? I’ve tested this on my UK-layout (I don’t know why the updater says ANSI not ISO) Ducky One TKL, with v1.03r firmware. I don’t know if it works beyond that, but I think if modifications are needed they should be pretty minor. If you do happen to try it, let me know how it goes on GitHub, I’m happy to accept improvements or make changes myself.

$ sudo ./ducky iap update ../One_TKL_EU_L_1.03r.exe
Firmware version: V2.1.03
Name:             KB Upgrade
IAP version:      v1.0.0
Layout:           ANSI 108 Keys
File Key:         ea 61 87 ed
Device Info:
Chip:         HT32F1654
Option Size:  0x0400 (1024)
Flash Size:   0x0000c000 (49152)
OB_PP Bits:   0x0030 (48)
Start Addr:   0x00004000
Version Addr: 0x00003c00

Device Version: V2.1.03
>>> Erase version...
>>> Erase program...
>>> Write program...
>>> Check CRC...
>>> Write version...
>>> Success!

However, through all of this there’s been an implicit goal that just updating with the factory firmware isn’t nearly as interesting as updating with our own code. There are already keyboard firmware projects, and it’s definitely possible to flash them by totally erasing the chip and starting from a clean slate. However, what I really want to do is get a full backup of the stock firmware as a starting point. This would then allow us to keep the stock IAP code, and allow us to switch back to Ducky’s firmware with their official updater.

So, I’m going to keep plugging away at the CRCCheck(), and maybe I’ll start taking a look at my Ducky One2 keyboard, too!