The Pokémon hidden items reverse article I wrote a while ago was certainly the most interesting one in my old blog, and this is an updated rewrite.
Let's start from the beginning: I am fond of the first generation Pokémon games Red, Blue and later Yellow, while I recognize the second generation is way better (from a gamer's point of view at least).
The first generation really has myths and legends around it, and it always stirred my curiosity. So when I was technically able to hack it, I did, and started developing pokanalysis. While working on it, I ran into an interesting challenge: how to get the list of the hidden objects lying on the ground? While it seems trivial when you have a map pointer, in fact it isn't. Before I start, you have to know that a map pointer has an "object data" pointer, which leads to 3 lists:
- the warps (doors for instance)
- the signs
- the entities (normal people, trainers and items)
You may think the items are what we are looking for, but they are not: those are items you can actually see (they are generally represented by a pokéball on the ground).
Catch the event
So how do we hack those hidden items? We first have to locate one of them, and trace what happens when we grab it. But in order to do so, we need to break just after the A button is pressed. While we could slow down our computer and hit escape quickly after hitting the A button, this is neither efficient nor reliable. Let's do something smarter.
According to the GB CPU manual, button control is done by reading at 0xff00. Let's put a read access breakpoint on this address with BGB.
Almost immediately, we get stuck in the infinite loop that checks if a key is pressed and so we can't manually press the A button (the debugger is catching the focus every time). So let's simulate it in order to get out of this loop and see where we are driven.
Buttons I/O
This is how the byte located at 0xff00 is handled:
The neutral value here is 1; 0 is used when a button is pressed. On writing, only bit 4 and bit 5 are affected. Those two bits allow you to choose whether to monitor the arrows (set bit 4 to 0) or the buttons (set bit 5 to 0). Then, after reading at 0xff00, the low four bits are filled with the interesting stuff.
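To make the protocol concrete, here is a minimal C sketch of how such a register could behave. It is a model written for illustration, not code from the game or from an emulator: the function name `joypad_read` and its parameters are hypothetical, and bits 6 and 7 are assumed to always read as 1.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the joypad register at 0xff00 (not game code).
 * arrows / buttons: 4-bit masks where 1 means "held down".
 * written: the last byte written to 0xff00. */
static uint8_t joypad_read(uint8_t written, uint8_t arrows, uint8_t buttons)
{
    assert(arrows <= 0x0f && buttons <= 0x0f);

    uint8_t select = written & 0x30;  /* only bits 4 and 5 are writable */
    uint8_t low = 0x0f;               /* neutral value: every line reads 1 */

    if (!(select & 0x10))             /* bit 4 cleared: monitor the arrows */
        low &= (uint8_t)~arrows;
    if (!(select & 0x20))             /* bit 5 cleared: monitor the buttons */
        low &= (uint8_t)~buttons;

    /* Assumption: the two unused top bits read back as 1. */
    return (uint8_t)(0xc0 | select | low);
}
```

With this model, writing 0x10 and reading back with nothing pressed gives 0xdf, and the same read with bit 0 of `buttons` set gives 0xde.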
Let's see how it's done in Pokémon:
ROM0:015F 3E 20 ld a,20
ROM0:0161 0E 00 ld c,00
ROM0:0163 E0 00 ld (ff00+00),a ; a = 0x20 = 0b00100000 → 0b..10.... → bit 4 set to 0 → Pin 14 so arrows
ROM0:0165 F0 00 ld a,(ff00+00)
ROM0:0167 F0 00 ld a,(ff00+00)
ROM0:0169 F0 00 ld a,(ff00+00) ; a few reading cycles (this is where the debugger stopped)
ROM0:016B F0 00 ld a,(ff00+00)
ROM0:016D F0 00 ld a,(ff00+00)
ROM0:016F F0 00 ld a,(ff00+00)
ROM0:0171 2F cpl ; zeroes are becoming ones, and ones zeroes.
ROM0:0172 E6 0F and a,0f ; we only keep the first 4 bits
ROM0:0174 CB 37 swap a ; and back them up in the high nibble
ROM0:0176 47 ld b,a ; save the result in another temporary register
Now the high nibble of b contains a 1 for each pressed arrow. We are interested in the buttons, so let's continue.
ROM0:0177 3E 10 ld a,10
ROM0:0179 E0 00 ld (ff00+00),a ; a = 0x10 = 0b00010000 → 0b..01.... → bit 5 set to 0 → Pin 15 so buttons
ROM0:017B F0 00 ld a,(ff00+00)
ROM0:017D F0 00 ld a,(ff00+00)
ROM0:017F F0 00 ld a,(ff00+00)
ROM0:0181 F0 00 ld a,(ff00+00)
ROM0:0183 F0 00 ld a,(ff00+00) ; same as previously, a few reading cycles
ROM0:0185 F0 00 ld a,(ff00+00)
ROM0:0187 F0 00 ld a,(ff00+00)
ROM0:0189 F0 00 ld a,(ff00+00)
ROM0:018B F0 00 ld a,(ff00+00)
ROM0:018D F0 00 ld a,(ff00+00)
ROM0:018F 2F cpl ; complement, just as before.
ROM0:0190 E6 0F and a,0f ; still the same, we only need the 4 buttons flags
ROM0:0192 B0 or b ; a now contains arrows (high nibble) and buttons (low nibble)
ROM0:0193 E0 F8 ld (ff00+f8),a ; save result somewhere in memory
ROM0:0195 3E 30 ld a,30
ROM0:0197 E0 00 ld (ff00+00),a ; a = 0x30 = 0b00110000 -> 0b..11.... → bit 4 and 5 set to 1 → no more monitoring
ROM0:0199 C9 ret
So now, we just need to change the content of the a register before the cpl instruction at ROM0:018F.
Here we have 0xdf in a, or 0b11011111. The low 4 bits are set to 1, which means all the buttons are up. If we set bit 0 to 0, we simulate a press of the A button. So we put 0xde (0b11011110) in the a register. Be careful to keep the content of the f register while changing this value. We can now go ahead and see what happens. Needless to say, we previously placed the player in front of the hidden item.
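As a sanity check on the patched value, the tail of the routine (cpl, and 0f, swap, or b) can be replayed in C. This is only a sketch: `pack_joypad` and `check_patched_read` are hypothetical names, and the raw arrows read of 0xef (no arrow held) is an assumption, since the debugger session only shows the buttons read.

```c
#include <assert.h>
#include <stdint.h>

/* Replays the end of the routine on the two raw reads of 0xff00. */
static uint8_t pack_joypad(uint8_t arrows_raw, uint8_t buttons_raw)
{
    uint8_t b = (uint8_t)((~arrows_raw & 0x0f) << 4); /* cpl, and 0f, swap */
    uint8_t a = (uint8_t)(~buttons_raw & 0x0f);       /* cpl, and 0f */
    return (uint8_t)(a | b);                          /* or b */
}

/* Compares the untouched read with the patched one. */
static void check_patched_read(void)
{
    assert(pack_joypad(0xef, 0xdf) == 0x00); /* nothing pressed */
    assert(pack_joypad(0xef, 0xde) == 0x01); /* bit 0 set: button A */
}
```

So replacing 0xdf with 0xde makes the routine store 0x01, a state byte with only the A bit set.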
Tracing
We can now get to the fun part. We previously noticed the result of the joypad key events loop is stored at 0xfff8, and there is no other save. So we can safely ignore everything until this address is read again. Let's put an access breakpoint on 0xfff8 and hit F9 to run. And then we are teleported here:
ROM3:4000 F0 F8 ld a,(ff00+f8)
ROM3:4002 FE 0F cp a,0f ; all buttons pressed
ROM3:4004 CA 3C 40 jp z,403c ; reset?
ROM3:4007 47 ld b,a
ROM3:4008 F0 B1 ld a,(ff00+b1)
ROM3:400A 5F ld e,a
ROM3:400B A8 xor b
ROM3:400C 57 ld d,a
ROM3:400D A3 and e
ROM3:400E E0 B2 ld (ff00+b2),a ; first (modified) copy
ROM3:4010 7A ld a,d
ROM3:4011 A0 and b
ROM3:4012 E0 B3 ld (ff00+b3),a ; second (modified) copy
ROM3:4014 78 ld a,b
ROM3:4015 E0 B1 ld (ff00+b1),a ; third (original) copy
ROM3:4017 FA 30 D7 ld a,(d730)
ROM3:401A CB 6F bit 5,a
ROM3:401C 20 16 jr nz,4034
ROM3:401E F0 B1 ld a,(ff00+b1)
ROM3:4020 E0 B4 ld (ff00+b4),a ; copy from the original copy…
ROM3:4022 FA 6B CD ld a,(cd6b)
ROM3:4025 A7 and a
ROM3:4026 C8 ret z
ROM3:4027 2F cpl
ROM3:4028 47 ld b,a
ROM3:4029 F0 B4 ld a,(ff00+b4)
ROM3:402B A0 and b
ROM3:402C E0 B4 ld (ff00+b4),a
ROM3:402E F0 B3 ld a,(ff00+b3)
ROM3:4030 A0 and b
ROM3:4031 E0 B3 ld (ff00+b3),a
ROM3:4033 C9 ret
It's not really important to get the meaning of this; we just need to observe that a few writes are based on the value at 0xfff8:
- 0xffb1
- 0xffb2
- 0xffb3
0xffb1 stores the exact same value as 0xfff8, and the others are "modified" values. While the original content of 0xfff8 could be deduced from 0xffb2 and 0xffb3, it is unlikely the game does so, so we will just ignore them.
Later, at ROM3:401E, 0xffb1 is read again… to be written at 0xffb4. I didn't get the meaning of all of this from the current context, but it does not matter: what we need to do is just watch over 0xfff8, 0xffb1, and now 0xffb4.
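For the curious, the first half of the listing (up to ROM3:4015) can be transcribed literally into C. The sketch below is one plausible reading, not a confirmed description of the game's intent: the field names `held`, `released` and `pressed` are my own labels for 0xffb1, 0xffb2 and 0xffb3, and the reset check and the 0xd730 / 0xcd6b handling are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the three copies written by the routine. */
typedef struct {
    uint8_t held;      /* 0xffb1: plain copy of 0xfff8 */
    uint8_t released;  /* 0xffb2: bits that changed and were set before */
    uint8_t pressed;   /* 0xffb3: bits that changed and are set now */
} joypad_state;

/* raw is the byte read from 0xfff8. */
static void joypad_update(joypad_state *s, uint8_t raw)
{
    assert(s != 0);

    uint8_t changed = s->held ^ raw;          /* ld a,(ffb1) / xor b */
    s->released = (uint8_t)(changed & s->held); /* and e */
    s->pressed  = (uint8_t)(changed & raw);     /* ld a,d / and b */
    s->held     = raw;                          /* ld a,b / ld (ffb1),a */
}
```

Under this reading, the two "modified" copies would only be non-zero on the frame where a key goes down or up, which fits with tracking the plain copy instead.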
So let's continue…
At ROM0:0F68, a is fetched from 0xffb4. And this time it seems to test whether A or B is pressed:
ROM0:0F68 F0 B4 ld a,(ff00+b4)
ROM0:0F6A E6 F3 and a,f3 ; the low nibble of 0xf3 is 0011: the A and B buttons mask (the high nibble 1111 keeps the arrows too)
ROM0:0F6C 20 04 jr nz,0f72
We are heading in the right direction, so let's continue. By the way, we can notice 0xffb3 is read again at ROM0:044D, but I previously decided to ignore it. Now we are at ROM0:3EBA, where bit 0 is checked (button A, hell yeah!):
ROM0:3EBA CB 47 bit 0,a ; check button A
ROM0:3EBC 28 2C jr z,3eea ; it is set to 1, so we don't jump
ROM0:3EBE 3E 11 ld a,11 ;
ROM0:3EC0 EA 00 20 ld (2000),a ; switch to bank 11
ROM0:3EC3 E0 B8 ld (ff00+b8),a ;
ROM0:3EC5 CD A0 69 call 69a0 ; next episode :)
To confirm our hypothesis, we can put a breakpoint on ROM0:3EBE and disable all the others. And indeed, we get here only when pressing the A button (even if there is nothing in front of the player).
Storage
We can assume the 0x69a0 function is more or less the handle_button_A() function, so we're finally out of the events maze.
This function starts with some kind of memset that doesn't seem interesting, but the following part is much more appealing:
RO11:69A0 21 EB FF ld hl,ffeb ; destination address
RO11:69A3 AF xor a ; filled with zeroes
RO11:69A4 22 ldi (hl),a ;
RO11:69A5 22 ldi (hl),a ; unrolled "memset"
RO11:69A6 22 ldi (hl),a ;
RO11:69A7 77 ld (hl),a ;
; the interesting section:
RO11:69A8 11 00 00 ld de,0000 ; this register is incremented further, let's consider it's an index
RO11:69AB 21 40 6A ld hl,6a40 ; an important base address
RO11:69AE 2A ldi a,(hl) ; loop start, reading byte per byte from this base address
RO11:69AF 47 ld b,a ; ^
RO11:69B0 FE FF cp a,ff ; |
RO11:69B2 28 48 jr z,69fc ; | 0xff marks the end of the array/list
RO11:69B4 FA 5E D3 ld a,(d35e) ; | Read at 0xd35e: this looks like the map id
RO11:69B7 B8 cp b ; | comparison with previous value: so this is a list of map IDs
RO11:69B8 28 04 jr z,69be ; |
RO11:69BA 13 inc de ; |
RO11:69BB 13 inc de ; | index += 2
RO11:69BC 18 F0 jr 69ae ; `- loop
We now know there is a list of map IDs at 0x6a40 in bank 0x11, that is at offset 0x46a40 in the ROM (0x11 * 0x4000 + 0x6a40 % 0x4000), terminated with a 0xff.
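The bank-to-offset arithmetic used here comes back several times, so it can be wrapped in a small helper. This is a minimal sketch; `rom_offset` is a hypothetical name, and it simply applies the formula above with banks of 0x4000 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Converts a (bank, address) pair as shown by the debugger into an
 * offset in the ROM file, using bank * 0x4000 + address % 0x4000. */
static uint32_t rom_offset(uint8_t bank, uint16_t addr)
{
    assert(addr < 0x8000);  /* ROM is only mapped below 0x8000 */
    return (uint32_t)bank * 0x4000u + (uint32_t)(addr % 0x4000u);
}
```

For instance, `rom_offset(0x11, 0x6a40)` gives 0x46a40, and an address in the fixed ROM0 area with bank 0 maps to itself.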
Let's see what happens when the current map is found in the list.
RO11:69BE 21 96 6A ld hl,6a96 ; a new address
RO11:69C1 19 add hl,de ; 0x6a96 + map index
RO11:69C2 2A ldi a,(hl) ;
RO11:69C3 66 ld h,(hl) ; dereference pointer
RO11:69C4 6F ld l,a ;
RO11:69C5 E5 push hl ; save the read pointer on stack
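Put together, the two listings above amount to a search in one list followed by a lookup in a parallel table of 2-byte pointers. Here is a hedged C sketch of that logic; `lookup_map_pointer` and its parameters are hypothetical, and the offsets of the two tables are passed in rather than hard-coded so the function can be tried on any buffer.

```c
#include <assert.h>
#include <stdint.h>

/* Searches the 0xff-terminated list of map IDs at ids_off. On a match,
 * returns the little-endian pointer stored at ptrs_off + 2 * position.
 * Returns -1 when the map is not in the list. */
static int32_t lookup_map_pointer(const uint8_t *rom, uint32_t ids_off,
                                  uint32_t ptrs_off, uint8_t map_id)
{
    assert(rom != 0);

    uint32_t index = 0;  /* plays the role of the de register */
    for (uint32_t i = ids_off; rom[i] != 0xff; i++, index += 2) {
        if (rom[i] == map_id) {
            uint32_t p = ptrs_off + index;          /* add hl,de */
            return rom[p] | (rom[p + 1] << 8);      /* ldi a,(hl) / ld h,(hl) */
        }
    }
    return -1;
}
```

The returned value is an address inside the bank, not a file offset, so it still has to be converted before reading the ROM file.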
OK, now we have a second list at 0x6a96, followed by another small kind of memset we can certainly ignore; what we are interested in is what will come out of this last saved pointer.
RO11:69C6 21 3D CD ld hl,cd3d ; "memset" address
RO11:69C9 AF xor a ; …with zeroes again
RO11:69CA 22 ldi (hl),a ;
RO11:69CB 22 ldi (hl),a ; on 3 bytes
RO11:69CC 77 ld (hl),a ;
RO11:69CD E1 pop hl ; we get back our pointer
RO11:69CE 2A ldi a,(hl) ; this strange value looks like the Y position of the item
RO11:69CF FE FF cp a,ff
RO11:69D1 28 29 jr z,69fc
RO11:69D3 EA 40 CD ld (cd40),a
RO11:69D6 47 ld b,a
RO11:69D7 2A ldi a,(hl) ; mmmh, and this one looks like the X
RO11:69D8 EA 41 CD ld (cd41),a
RO11:69DB 4F ld c,a
RO11:69DC CD 01 6A call 6a01 ; item Y is in b and item X is in c before calling this function
Great, we just figured out that immediately after dereferencing the pointer there are two bytes for the Y and X position of the item.
From a quick glance at this last function (11:6a01), it seems to read various information and write its return value at 0xffea. Testing was the most efficient thing to do: when there is a hidden item (already grabbed or not) the value is 0x00. If not, it is 0xff. Simple.
RO11:69DF F0 EA ld a,(ff00+ea) ; get 6a01 function return value (I don't know why it's there)
RO11:69E1 A7 and a
RO11:69E2 28 0C jr z,69f0 ; we have an item (a == 0), so we jump
...
RO11:69F0 2A ldi a,(hl) ; if we change this value, the item changes, so this is the item id
RO11:69F1 EA 3D CD ld (cd3d),a ; saved for later usage
RO11:69F4 2A ldi a,(hl) ; another extra byte after the item id, I don't know what this is yet
RO11:69F5 EA 3E CD ld (cd3e),a ; saved too
RO11:69F8 2A ldi a,(hl) ;
RO11:69F9 66 ld h,(hl) ; and a pointer
RO11:69FA 6F ld l,a ;
RO11:69FB C9 ret ; done.
We now have a first draft of the layout:
Y | X | item-id | unknown-byte | unknown-address
After some research, we can deduce that the byte following the item id looks like some kind of "type"; indeed, there are not only hidden items in this table, there are also various other things like generic signs (the same sign repeated in various places in the same map). Hidden items seem to be identified by a 0x1d. And then the address would make sense (custom script, custom action).
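The layout can be written down as a C structure with a small decoder. This is a sketch under assumptions: the names `hidden_entry`, `read_entry` and `count_hidden_items` are mine, the trailing address is taken to be little-endian like the other pointers, and entries are assumed to be stored back to back (6 bytes each) until a 0xff, which the cp a,ff at RO11:69CF suggests but the listings do not fully show.

```c
#include <assert.h>
#include <stdint.h>

#define ENTRY_SIZE 6  /* Y, X, item id, type, 2-byte address */

typedef struct {
    uint8_t  y, x;
    uint8_t  item_id;
    uint8_t  type;     /* 0x1d seems to identify hidden items */
    uint16_t address;
} hidden_entry;

/* Decodes one 6-byte entry starting at p. */
static hidden_entry read_entry(const uint8_t *p)
{
    assert(p != 0);
    hidden_entry e;
    e.y = p[0];
    e.x = p[1];
    e.item_id = p[2];
    e.type = p[3];
    e.address = (uint16_t)(p[4] | (p[5] << 8));
    return e;
}

/* Counts the entries of type 0x1d in a 0xff-terminated run of entries. */
static int count_hidden_items(const uint8_t *p)
{
    int n = 0;
    for (; p[0] != 0xff; p += ENTRY_SIZE)
        if (read_entry(p).type == 0x1d)
            n++;
    return n;
}
```

Decoding into a structure first keeps the byte offsets in one place, which helps if the meaning of the type byte or the address turns out to be different.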
Finally, we can write some sample code to extract the oil:
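/* Assumed definition: reads a little-endian 16-bit value at ROM offset off. */
#define GET_ADDR(off) (rom[(off)] | (rom[(off) + 1] << 8))

int map_id_addr = 0x11 * 0x4000 + 0x6a40 % 0x4000; /* list of map IDs */
int index = 0;                                     /* offset in the pointer table */

while (rom[map_id_addr] != 0xff) {
    /* pointer table slot for this map, then the entry it points to */
    int item_ptr_addr = 0x11 * 0x4000 + (0x6a96 + index) % 0x4000;
    int item_addr = 0x11 * 0x4000 + GET_ADDR(item_ptr_addr) % 0x4000;

    if (rom[item_addr + 3] == 0x1d) /* type 0x1d: hidden item */
        printf("[MAP #%03d] Y=%03d X=%03d item-id=%03d type=0x%02x address=0x%04x\n",
               rom[map_id_addr],
               rom[item_addr    ],
               rom[item_addr + 1],
               rom[item_addr + 2],
               rom[item_addr + 3],
               GET_ADDR(item_addr + 4)); /* takes a ROM offset, not a byte value */
    index += 2;
    map_id_addr++;
}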
And then we have the coveted treasure:
[MAP #051] Y=018 X=001 item-id=020 type=0x1d address=0x0522
[MAP #061] Y=012 X=018 item-id=010 type=0x1d address=0x0522
[MAP #036] Y=003 X=038 item-id=080 type=0x1d address=0x0522
[MAP #020] Y=007 X=014 item-id=080 type=0x1d address=0x0522
[MAP #104] Y=001 X=003 item-id=018 type=0x1d address=0x0522
[MAP #199] Y=015 X=021 item-id=079 type=0x1d address=0x0522
[MAP #201] Y=017 X=027 item-id=049 type=0x1d address=0x0522
[MAP #202] Y=001 X=025 item-id=019 type=0x1d address=0x0522
[MAP #146] Y=012 X=004 item-id=082 type=0x1d address=0x0522
[MAP #024] Y=014 X=001 item-id=079 type=0x1d address=0x0522
[MAP #156] Y=001 X=010 item-id=049 type=0x1d address=0x0522
[MAP #219] Y=005 X=006 item-id=053 type=0x1d address=0x0522
[MAP #210] Y=003 X=012 item-id=082 type=0x1d address=0x0522
[MAP #233] Y=015 X=002 item-id=017 type=0x1d address=0x0522
[MAP #176] Y=001 X=001 item-id=049 type=0x1d address=0x0522
[MAP #228] Y=011 X=014 item-id=040 type=0x1d address=0x0522
[MAP #227] Y=003 X=027 item-id=002 type=0x1d address=0x0522
[MAP #083] Y=016 X=017 item-id=083 type=0x1d address=0x0522
[MAP #160] Y=015 X=015 item-id=049 type=0x1d address=0x0522
[MAP #162] Y=017 X=025 item-id=002 type=0x1d address=0x0522
[MAP #165] Y=016 X=008 item-id=010 type=0x1d address=0x0522
[MAP #215] Y=009 X=001 item-id=054 type=0x1d address=0x0522
[MAP #034] Y=044 X=009 item-id=016 type=0x1d address=0x0522
[MAP #194] Y=002 X=005 item-id=002 type=0x1d address=0x0522
[MAP #111] Y=011 X=014 item-id=083 type=0x1d address=0x0522
[MAP #001] Y=004 X=014 item-id=020 type=0x1d address=0x0522
[MAP #021] Y=017 X=009 item-id=019 type=0x1d address=0x0522
[MAP #022] Y=005 X=048 item-id=029 type=0x1d address=0x0522
[MAP #023] Y=063 X=002 item-id=018 type=0x1d address=0x0522
[MAP #216] Y=009 X=001 item-id=040 type=0x1d address=0x0522
[MAP #028] Y=014 X=015 item-id=040 type=0x1d address=0x0522
[MAP #119] Y=004 X=003 item-id=016 type=0x1d address=0x0522
[MAP #121] Y=002 X=012 item-id=049 type=0x1d address=0x0522
[MAP #006] Y=015 X=048 item-id=079 type=0x1d address=0x0522
[MAP #161] Y=016 X=009 item-id=083 type=0x1d address=0x0522
[MAP #005] Y=011 X=014 item-id=081 type=0x1d address=0x0522
[MAP #003] Y=008 X=015 item-id=040 type=0x1d address=0x0522
[MAP #015] Y=003 X=040 item-id=003 type=0x1d address=0x0522
To sum up a bit how all of this is organized, here is a diagram:
Of course, support for those special items has been added to pokanalysis (they are represented by a red square on the maps), along with the item names if you want to have a look.