A small freedom area.

Scavenger hunt

Mon 28 Mar 2011

rom hacking, reverse, game, pokemon, pokanalysis, hidden, secret, gb

The article I wrote a while ago about reverse engineering the Pokémon hidden items was certainly the most interesting one on my old blog, and this is an updated rewrite.

Let's start from the beginning: I am fond of the first generation Pokémon games Red, Blue and later Yellow, although I recognize the second generation is way better (from a gamer's point of view at least).


The first generation really has myths and legends around it, and it always stirred my curiosity. So when I was technically able to hack it, I did, and started developing pokanalysis. While working on it, I ran into an interesting problem: how do you get the list of the hidden objects lying on the ground? It seems trivial once you have a map pointer, but in fact it isn't. Before I start, you have to know that a map pointer has an "object data" pointer, which focuses on 3 lists:

You may think the items are what we are looking for, but they are not: those are items you can actually see (they are generally represented by a pokéball on the ground).

Catch the event

So how do we hack those hidden items? We first have to locate one of them and trace what happens when we grab it. To do so, we need to break just after the A button is pressed. We could slow down the computer and hit escape quickly after hitting the A button, but this is neither efficient nor reliable. Let's do something smarter.

According to the GB CPU manual, button input is obtained by reading at 0xff00. Let's put a read access breakpoint on this address with BGB.

BGB access breakpoint

Almost immediately, we get stuck in the infinite loop that checks whether a key is pressed, so we can't manually press the A button (the debugger grabs the focus every time). So let's simulate the press in order to get out of this loop and see where it leads us.

Buttons I/O

This is how the byte located at 0xff00 is handled:

(diagram of the bits of the byte at 0xff00)

The neutral value of a bit here is 1; 0 means a button is pressed. On writing, only bit 4 and bit 5 are affected. Those two bits let you choose whether to monitor the arrows (set bit 4 to 0) or the buttons (set bit 5 to 0). Then, after a read at 0xff00, the four low bits hold the interesting stuff.

Let's see how it's done in Pokémon:

ROM0:015F 3E 20            ld a,20
ROM0:0161 0E 00            ld c,00
ROM0:0163 E0 00            ld (ff00+00),a   ; a = 0x20 = 0b00100000 → 0b..10.... → bit 4 set to 0 → Pin 14 so arrows
ROM0:0165 F0 00            ld a,(ff00+00)
ROM0:0167 F0 00            ld a,(ff00+00)
ROM0:0169 F0 00            ld a,(ff00+00)   ; a few reading cycles (this is where the debugger stopped)
ROM0:016B F0 00            ld a,(ff00+00)
ROM0:016D F0 00            ld a,(ff00+00)
ROM0:016F F0 00            ld a,(ff00+00)
ROM0:0171 2F               cpl              ; zeroes are becoming ones, and ones zeroes.
ROM0:0172 E6 0F            and a,0f         ; we only keep the first 4 bits
ROM0:0174 CB 37            swap a           ; and back them up in the high nibble
ROM0:0176 47               ld b,a           ; save the result in another temporary register

Now the high nibble of b contains a 1 for each arrow that is pressed. We are interested in the buttons, so let's continue.

ROM0:0177 3E 10            ld a,10
ROM0:0179 E0 00            ld (ff00+00),a   ; a = 0x10 = 0b00010000 → 0b..01.... → bit 5 set to 0 → Pin 15 so buttons
ROM0:017B F0 00            ld a,(ff00+00)
ROM0:017D F0 00            ld a,(ff00+00)
ROM0:017F F0 00            ld a,(ff00+00)
ROM0:0181 F0 00            ld a,(ff00+00)
ROM0:0183 F0 00            ld a,(ff00+00)   ; same as previously, a few reading cycles
ROM0:0185 F0 00            ld a,(ff00+00)
ROM0:0187 F0 00            ld a,(ff00+00)
ROM0:0189 F0 00            ld a,(ff00+00)
ROM0:018B F0 00            ld a,(ff00+00)
ROM0:018D F0 00            ld a,(ff00+00)
ROM0:018F 2F               cpl              ; complement, just as before.
ROM0:0190 E6 0F            and a,0f         ; still the same, we only need the 4 buttons flags
ROM0:0192 B0               or b             ; a now contains arrows (high nibble) and buttons (low nibble)
ROM0:0193 E0 F8            ld (ff00+f8),a   ; save result somewhere in memory
ROM0:0195 3E 30            ld a,30
ROM0:0197 E0 00            ld (ff00+00),a   ; a = 0x30 = 0b00110000 → 0b..11.... → bit 4 and 5 set to 1 → no more monitoring
ROM0:0199 C9               ret
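As a rough C model of the routine above (a sketch only: the function name and the idea of passing the two raw reads as parameters are mine, not the game's), the arithmetic boils down to complementing each read, keeping the low nibble, and packing both nibbles into one byte:

```c
#include <stdint.h>

/* raw_arrows:  value read at 0xff00 after writing 0x20 (arrows selected)
 * raw_buttons: value read at 0xff00 after writing 0x10 (buttons selected)
 * Returns the byte the routine stores at 0xfff8. */
uint8_t combine_joypad(uint8_t raw_arrows, uint8_t raw_buttons)
{
    uint8_t arrows  = (uint8_t)(~raw_arrows)  & 0x0f;  /* cpl ; and a,0f */
    uint8_t buttons = (uint8_t)(~raw_buttons) & 0x0f;  /* cpl ; and a,0f */

    /* swap a moves the arrows to the high nibble, or b merges both halves */
    return (uint8_t)((arrows << 4) | buttons);
}
```

With nothing pressed, both low nibbles read as 1111 and the function returns 0x00; a single 0 bit in the buttons read turns into a single 1 bit in the low nibble of the result.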

So now, we just need to change the content of the register a before the cpl instruction at ROM0:018F.

BGB Joypad hook

Here we have 0xdf in a, or 0b11011111. The four low bits are set to 1, which means all the buttons are up. If we set bit 0 to 0, we simulate a press of the A button. So we put 0xde (0b11011110) in the a register. Be careful to keep the content of the f register while changing this value. We can now go ahead and see what happens. Needless to say, we previously positioned the player in front of the hidden item.
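Since the register is active-low, "pressing" a button means clearing its bit. A minimal sketch of that manipulation, with helper names that are purely illustrative:

```c
#include <stdint.h>

#define BTN_A_BIT 0u  /* bit 0 is the A button when the buttons are selected */

/* Clear the given bit: in the raw register value, 0 means "pressed". */
uint8_t hold_down(uint8_t raw, unsigned bit)
{
    return raw & (uint8_t)~(1u << bit);
}

/* Non-zero when the given bit reads as pressed (i.e. is 0). */
int is_down(uint8_t raw, unsigned bit)
{
    return ((raw >> bit) & 1u) == 0;
}
```

Calling `hold_down(0xdf, BTN_A_BIT)` gives 0xde, the value patched into the a register in the debugger.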

Tracing

We can now get to the fun part. We previously noticed that the result of the joypad routine is stored at 0xfff8, and nowhere else. So we can safely ignore everything until this address is read again. Let's put an access breakpoint on 0xfff8 and hit F9 to run. We are then teleported here:

ROM3:4000 F0 F8            ld a,(ff00+f8)
ROM3:4002 FE 0F            cp a,0f          ; all buttons pressed
ROM3:4004 CA 3C 40         jp z,403c        ; reset?
ROM3:4007 47               ld b,a
ROM3:4008 F0 B1            ld a,(ff00+b1)
ROM3:400A 5F               ld e,a
ROM3:400B A8               xor b
ROM3:400C 57               ld d,a
ROM3:400D A3               and e
ROM3:400E E0 B2            ld (ff00+b2),a   ; first (modified) copy
ROM3:4010 7A               ld a,d
ROM3:4011 A0               and b
ROM3:4012 E0 B3            ld (ff00+b3),a   ; second (modified) copy
ROM3:4014 78               ld a,b
ROM3:4015 E0 B1            ld (ff00+b1),a   ; third (original) copy
ROM3:4017 FA 30 D7         ld a,(d730)
ROM3:401A CB 6F            bit 5,a
ROM3:401C 20 16            jr nz,4034
ROM3:401E F0 B1            ld a,(ff00+b1)
ROM3:4020 E0 B4            ld (ff00+b4),a   ; copy from the original copy…
ROM3:4022 FA 6B CD         ld a,(cd6b)
ROM3:4025 A7               and a
ROM3:4026 C8               ret z
ROM3:4027 2F               cpl
ROM3:4028 47               ld b,a
ROM3:4029 F0 B4            ld a,(ff00+b4)
ROM3:402B A0               and b
ROM3:402C E0 B4            ld (ff00+b4),a
ROM3:402E F0 B3            ld a,(ff00+b3)
ROM3:4030 A0               and b
ROM3:4031 E0 B3            ld (ff00+b3),a
ROM3:4033 C9               ret

It's not really important to get the meaning of this; we just need to observe that a few writes are based on the value at 0xfff8: 0xffb1, 0xffb2 and 0xffb3.

0xffb1 stores the exact same value as 0xfff8, and the others are "modified" values. While the original content of 0xfff8 could be deduced from 0xffb2 and 0xffb3, it is unlikely the game does so, so we will just ignore them.

Later, at ROM3:401E, 0xffb1 is read again… to be written at 0xffb4. I didn't get the meaning of all of this from the current context, but it does not matter: what we need to do is just watch over 0xfff8, 0xffb1, and now 0xffb4.
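For the curious, the arithmetic at ROM3:4007 to ROM3:4015 can be transcribed literally in C. This is only a sketch: the struct and the field names `held`, `released` and `pressed` are my own reading of the XOR/AND combination, not names taken from the game, and the later masking with 0xcd6b is left out.

```c
#include <stdint.h>

struct joypad_state {
    uint8_t held;      /* 0xffb1: plain copy of 0xfff8 */
    uint8_t released;  /* 0xffb2: bits that changed and were set before */
    uint8_t pressed;   /* 0xffb3: bits that changed and are set now */
};

/* now is the fresh value read from 0xfff8 (register b in the listing). */
void joypad_update(struct joypad_state *s, uint8_t now)
{
    uint8_t changed = s->held ^ now;   /* ld e,a ; xor b ; ld d,a */

    s->released = changed & s->held;   /* and e ; ld (ff00+b2),a */
    s->pressed  = changed & now;       /* ld a,d ; and b ; ld (ff00+b3),a */
    s->held     = now;                 /* ld a,b ; ld (ff00+b1),a */
}
```

Read this way, the two "modified" copies only carry the bits that differ from the previous call, which is consistent with ignoring them while tracing a single press.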

So let's continue…

At ROM0:0F68, a is fetched from 0xffb4. And this time it seems to test whether A or B is pressed:

ROM0:0F68 F0 B4            ld a,(ff00+b4)
ROM0:0F6A E6 F3            and a,f3         ; the low nibble of 0xf3 is 0011, the A and B buttons mask (the high nibble keeps the arrows as well)
ROM0:0F6C 20 04            jr nz,0f72

We are heading in the right direction, so let's continue. By the way, we can notice 0xffb3 is read again at ROM0:044D, but I previously decided to ignore it.

Now we are at ROM0:3EBA, where bit 0 is checked (button A, hell yeah!):

ROM0:3EBA CB 47            bit 0,a          ; check button A
ROM0:3EBC 28 2C            jr z,3eea        ; it is set to 1, so we don't jump
ROM0:3EBE 3E 11            ld a,11          ;
ROM0:3EC0 EA 00 20         ld (2000),a      ;    switch to bank 11
ROM0:3EC3 E0 B8            ld (ff00+b8),a   ;
ROM0:3EC5 CD A0 69         call 69a0        ; next episode :)

To confirm our hypothesis, we can put a breakpoint on ROM0:3EBE, and disable all the others. And indeed, we get here only when pressing the A button (even if there is nothing in front of the player).

Storage

We can assume the 0x69a0 function is more or less the handle_button_A() function, so we're finally out of the event maze.

This function starts with some kind of memset that doesn't seem interesting, but the following part is much more appealing:

RO11:69A0 21 EB FF         ld hl,ffeb   ; destination address
RO11:69A3 AF               xor a        ; filled with zeroes
RO11:69A4 22               ldi (hl),a   ;
RO11:69A5 22               ldi (hl),a   ;    unrolled "memset"
RO11:69A6 22               ldi (hl),a   ;
RO11:69A7 77               ld (hl),a    ;

; the interesting section:

RO11:69A8 11 00 00         ld de,0000   ; this register is incremented further, let's consider it's an index
RO11:69AB 21 40 6A         ld hl,6a40   ; an important base address
RO11:69AE 2A               ldi a,(hl)   ; loop start, reading byte per byte from this base address
RO11:69AF 47               ld b,a       ; ^
RO11:69B0 FE FF            cp a,ff      ; |
RO11:69B2 28 48            jr z,69fc    ; |   0xff marks the end of the array/list
RO11:69B4 FA 5E D3         ld a,(d35e)  ; |   Read at 0xd35e: this looks like the map id
RO11:69B7 B8               cp b         ; |   comparison with previous value: so this is a list of map IDs
RO11:69B8 28 04            jr z,69be    ; |
RO11:69BA 13               inc de       ; |
RO11:69BB 13               inc de       ; |   index += 2
RO11:69BC 18 F0            jr 69ae      ; `-  loop

We know there is a list of map IDs at 0x6a40 from bank 11, or at offset 0x46a40 in the ROM (0x11 * 0x4000 + 0x6a40 % 0x4000) which is terminated with a 0xff.
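The bank arithmetic in parentheses generalizes to any banked address. A small sketch, where the helper name `rom_offset` and the `BANK_SIZE` constant are mine:

```c
#include <stdint.h>

#define BANK_SIZE 0x4000u  /* size of one ROM bank */

/* Convert a (bank, address) pair as shown by the debugger into a file offset. */
uint32_t rom_offset(uint8_t bank, uint16_t addr)
{
    return (uint32_t)bank * BANK_SIZE + addr % BANK_SIZE;
}
```

For instance, `rom_offset(0x11, 0x6a40)` evaluates to 0x44000 + 0x2a40, that is 0x46a40.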

Let's see what happens when the current map is found in the list.

RO11:69BE 21 96 6A         ld hl,6a96   ; a new address
RO11:69C1 19               add hl,de    ; 0x6a96 + map index
RO11:69C2 2A               ldi a,(hl)   ;
RO11:69C3 66               ld h,(hl)    ;   dereference pointer
RO11:69C4 6F               ld l,a       ;
RO11:69C5 E5               push hl      ; save the read pointer on stack

OK, now we have a second list at 0x6a96, followed by another small kind of memset we can certainly ignore; what we are interested in is what comes out of this last saved pointer.

RO11:69C6 21 3D CD         ld hl,cd3d   ; "memset" address
RO11:69C9 AF               xor a        ; …with zeroes again
RO11:69CA 22               ldi (hl),a   ;
RO11:69CB 22               ldi (hl),a   ;    on 3 bytes
RO11:69CC 77               ld (hl),a    ;

RO11:69CD E1               pop hl       ; we get back our pointer
RO11:69CE 2A               ldi a,(hl)   ; this strange value looks like the Y position of the item
RO11:69CF FE FF            cp a,ff
RO11:69D1 28 29            jr z,69fc
RO11:69D3 EA 40 CD         ld (cd40),a
RO11:69D6 47               ld b,a
RO11:69D7 2A               ldi a,(hl)   ; mmmh, and this one looks like the X
RO11:69D8 EA 41 CD         ld (cd41),a
RO11:69DB 4F               ld c,a
RO11:69DC CD 01 6A         call 6a01    ; item Y is in b and item X is in c before calling this function

Great, we just figured out that immediately after dereferencing the pointer there are two bytes for the Y and X position of the item.

From a quick glance at this last function (11:6a01), it seems to read various information and write the return value at 0xffea. Testing was the most efficient thing to do: when there is a hidden item (already grabbed or not) the value is 0x00. If not, it is 0xff. Simple.
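Put together, the lookup traced so far (scan the map ID list, then dereference the matching pointer) could be sketched like this in C. The function name and the constants are mine; `rom` is assumed to be the whole ROM image loaded in memory.

```c
#include <stddef.h>
#include <stdint.h>

#define MAP_IDS_OFFSET  0x46a40u  /* bank 0x11, address 0x6a40 */
#define POINTERS_OFFSET 0x46a96u  /* bank 0x11, address 0x6a96 */
#define BANK_11_BASE    0x44000u  /* 0x11 * 0x4000 */

/* Returns the ROM offset of the data pointed to for map_id (where the
 * Y and X bytes start), or -1 if the map is not in the list. */
long first_entry_offset(const uint8_t *rom, uint8_t map_id)
{
    for (size_t i = 0; rom[MAP_IDS_OFFSET + i] != 0xff; i++) {
        if (rom[MAP_IDS_OFFSET + i] != map_id)
            continue;

        /* pointers are 2 bytes each, low byte first (ldi a,(hl) ; ld h,(hl)) */
        size_t p = POINTERS_OFFSET + 2 * i;
        uint16_t addr = (uint16_t)(rom[p] | (rom[p + 1] << 8));

        return (long)(BANK_11_BASE + addr % 0x4000u);
    }
    return -1;
}
```

The `2 * i` mirrors the two `inc de` instructions executed for every map ID that does not match.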

RO11:69DF F0 EA            ld a,(ff00+ea)   ; get 6a01 function return value (I don't know why it's there)
RO11:69E1 A7               and a
RO11:69E2 28 0C            jr z,69f0        ; we have an item (a == 0), so we jump

...

RO11:69F0 2A               ldi a,(hl)       ; if we change this value, the item changes, so this is the item id
RO11:69F1 EA 3D CD         ld (cd3d),a      ; saved for later usage
RO11:69F4 2A               ldi a,(hl)       ; another extra byte after the item id, I don't know what this is yet
RO11:69F5 EA 3E CD         ld (cd3e),a      ; saved too
RO11:69F8 2A               ldi a,(hl)       ;
RO11:69F9 66               ld h,(hl)        ;   and a pointer
RO11:69FA 6F               ld l,a           ;
RO11:69FB C9               ret              ; done.

We now have a first draft of the layout:

Y | X | item-id | unknown-byte | unknown-address

After some research, we can deduce that the byte following the item id looks like some kind of "type"; indeed, this table does not only contain hidden items, it also holds various other things such as generic signs (the same sign repeated in several places on the same map). Hidden items seem to be identified by a 0x1d. The address then makes sense (custom script, custom action).

Finally, we can write some sample code to extract the data:
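The layout can be written down as a C structure with a small decoder. This is a sketch under two assumptions that go beyond what was traced: that entries are packed back to back, 6 bytes each, and that a 0xff in the Y slot ends the list (which is what the check at RO11:69CF suggests). The struct and function names are mine.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRY_SIZE 6  /* Y, X, item id, type, 2-byte address (assumed packed) */

struct hidden_entry {
    uint8_t  y, x;
    uint8_t  item_id;
    uint8_t  type;     /* 0x1d seems to identify hidden items */
    uint16_t address;  /* stored low byte first */
};

/* Decode up to max entries starting at p; returns how many were read. */
size_t read_entries(const uint8_t *p, struct hidden_entry *out, size_t max)
{
    size_t n = 0;

    while (n < max && p[0] != 0xff) {  /* 0xff in the Y slot: end of list */
        out[n].y       = p[0];
        out[n].x       = p[1];
        out[n].item_id = p[2];
        out[n].type    = p[3];
        out[n].address = (uint16_t)(p[4] | (p[5] << 8));
        n++;
        p += ENTRY_SIZE;
    }
    return n;
}
```

A caller would typically filter on `type == 0x1d` to keep only the hidden items.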

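/* Assumed helpers: rom is the whole ROM image as an array of unsigned char,
 * and GET_ADDR(off) reads the little-endian 16-bit word at ROM offset off. */
#define GET_ADDR(off) (rom[(off)] | rom[(off) + 1] << 8)

int map_id_addr = 0x11 * 0x4000 + 0x6a40 % 0x4000;
int index = 0;

while (rom[map_id_addr] != 0xff) {
    int item_ptr_addr = 0x11 * 0x4000 + (0x6a96 + index)        % 0x4000;
    int item_addr     = 0x11 * 0x4000 + GET_ADDR(item_ptr_addr) % 0x4000;

    /* only the first entry of each map is inspected here */
    if (rom[item_addr + 3] == 0x1d)
        printf("[MAP #%03d] Y=%03d X=%03d item-id=%03d type=0x%02x address=0x%04x\n",
                        rom[map_id_addr],
                        rom[item_addr    ],
                        rom[item_addr + 1],
                        rom[item_addr + 2],
                        rom[item_addr + 3],
               GET_ADDR(item_addr + 4));
    index += 2;
    map_id_addr++;
}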

And then we have the coveted treasure:

[MAP #051] Y=018 X=001 item-id=020 type=0x1d address=0x0522
[MAP #061] Y=012 X=018 item-id=010 type=0x1d address=0x0522
[MAP #036] Y=003 X=038 item-id=080 type=0x1d address=0x0522
[MAP #020] Y=007 X=014 item-id=080 type=0x1d address=0x0522
[MAP #104] Y=001 X=003 item-id=018 type=0x1d address=0x0522
[MAP #199] Y=015 X=021 item-id=079 type=0x1d address=0x0522
[MAP #201] Y=017 X=027 item-id=049 type=0x1d address=0x0522
[MAP #202] Y=001 X=025 item-id=019 type=0x1d address=0x0522
[MAP #146] Y=012 X=004 item-id=082 type=0x1d address=0x0522
[MAP #024] Y=014 X=001 item-id=079 type=0x1d address=0x0522
[MAP #156] Y=001 X=010 item-id=049 type=0x1d address=0x0522
[MAP #219] Y=005 X=006 item-id=053 type=0x1d address=0x0522
[MAP #210] Y=003 X=012 item-id=082 type=0x1d address=0x0522
[MAP #233] Y=015 X=002 item-id=017 type=0x1d address=0x0522
[MAP #176] Y=001 X=001 item-id=049 type=0x1d address=0x0522
[MAP #228] Y=011 X=014 item-id=040 type=0x1d address=0x0522
[MAP #227] Y=003 X=027 item-id=002 type=0x1d address=0x0522
[MAP #083] Y=016 X=017 item-id=083 type=0x1d address=0x0522
[MAP #160] Y=015 X=015 item-id=049 type=0x1d address=0x0522
[MAP #162] Y=017 X=025 item-id=002 type=0x1d address=0x0522
[MAP #165] Y=016 X=008 item-id=010 type=0x1d address=0x0522
[MAP #215] Y=009 X=001 item-id=054 type=0x1d address=0x0522
[MAP #034] Y=044 X=009 item-id=016 type=0x1d address=0x0522
[MAP #194] Y=002 X=005 item-id=002 type=0x1d address=0x0522
[MAP #111] Y=011 X=014 item-id=083 type=0x1d address=0x0522
[MAP #001] Y=004 X=014 item-id=020 type=0x1d address=0x0522
[MAP #021] Y=017 X=009 item-id=019 type=0x1d address=0x0522
[MAP #022] Y=005 X=048 item-id=029 type=0x1d address=0x0522
[MAP #023] Y=063 X=002 item-id=018 type=0x1d address=0x0522
[MAP #216] Y=009 X=001 item-id=040 type=0x1d address=0x0522
[MAP #028] Y=014 X=015 item-id=040 type=0x1d address=0x0522
[MAP #119] Y=004 X=003 item-id=016 type=0x1d address=0x0522
[MAP #121] Y=002 X=012 item-id=049 type=0x1d address=0x0522
[MAP #006] Y=015 X=048 item-id=079 type=0x1d address=0x0522
[MAP #161] Y=016 X=009 item-id=083 type=0x1d address=0x0522
[MAP #005] Y=011 X=014 item-id=081 type=0x1d address=0x0522
[MAP #003] Y=008 X=015 item-id=040 type=0x1d address=0x0522
[MAP #015] Y=003 X=040 item-id=003 type=0x1d address=0x0522

To sum up how all of this is organized, here is a diagram:

(diagram summarizing the layout described above)

Of course, support for those special items has been added to pokanalysis (they are represented by a red square on the maps), along with the item names, if you want to have a look.

Pokanalysis hidden items
