A large amount of ASM hacking comes from trying to find where you need to hook from. This is very similar to the previous tutorial, in which we had to find where the trainer's name was being read from (hence why I picked this for tutorial #2), though luckily for us, this time it's much less of a hassle, since the coding around this area is much better :P
First open up your ROM in a hex editor (and pick your favorite item). CTRL + F the hex name of your item + the 0xFF suffix. I will be using a burn heal, so I'm going to search for "BCCFCCC800C2BFBBC6FF". If you're unsure how to convert ASCII -> Hex, look at my first tutorial for my python program + table file (or just the table file is fine). The conversion once you have that table file is quite straightforward. Please, for the love of god and all things holy, don't use an ASCII -> Hex converter from the internet. The game uses its own character table rather than standard ASCII, so a generic converter will give you the wrong bytes.
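If you'd like to see the conversion spelled out, here is a minimal Python sketch. It only covers the part of the table needed for a name like this one, and it assumes uppercase A-Z map to 0xBB-0xD4 in order, space is 0x00 and 0xFF is the terminator (that assumption matches the BURN HEAL bytes above). The function name is made up for this example, and anything else (lowercase, digits, punctuation) should come from the table file, not from this snippet.

```python
# Partial character table: uppercase letters and space only (assumption:
# A-Z occupy 0xBB-0xD4 consecutively, which matches the BURN HEAL bytes).
TABLE = {chr(ord("A") + i): 0xBB + i for i in range(26)}
TABLE[" "] = 0x00
TERMINATOR = 0xFF

def encode_name(name):
    """Return the hex search string for an uppercase name, terminator included."""
    encoded = bytes(TABLE[ch] for ch in name) + bytes([TERMINATOR])
    return encoded.hex().upper()

print(encode_name("BURN HEAL"))  # BCCFCCC800C2BFBBC6FF
```

A character missing from the table raises a KeyError, which is a handy reminder to go back to the full table file.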
Once you've found the offset to the start of your item's name, take note of it. Mine was burn heal, which ended up having its string located at 0x3DB2BC. OK, from now on, when I refer to offsets I will use the 08 prefix in place of 0x to mean it's in the ROM, and the 02 prefix to signify it's in RAM. So this burn heal offset would be 083DB2BC in our new notation.
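The 08 notation is just the hex editor offset plus 0x08000000, which is where the cartridge ROM is mapped on the GBA. As a rough sketch (the helper names and the tiny fake ROM are made up for illustration), the search and the conversion could look like this in Python:

```python
ROM_BASE = 0x08000000  # where the cartridge ROM is mapped on the GBA

def rom_address(file_offset):
    """Turn a hex-editor offset into the 08-prefixed address used from here on."""
    return "%08X" % (ROM_BASE + file_offset)

def find_name(rom, pattern):
    """Return the 08-prefixed address of the first match, or None if absent."""
    offset = rom.find(pattern)
    return None if offset < 0 else rom_address(offset)

# Tiny stand-in for a ROM image: 16 filler bytes, then the BURN HEAL bytes.
pattern = bytes.fromhex("BCCFCCC800C2BFBBC6FF")
fake_rom = bytes(16) + pattern
print(rom_address(0x3DB2BC))         # 083DB2BC
print(find_name(fake_rom, pattern))  # 08000010
```

With a real ROM you would read the file into bytes and pass that in place of fake_rom.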
As you can see this is a string which says "BURN HEAL" then ends in 0xFF. The rest of the name space is padded with 00s, though it can be padded with anything because generally when strings are read they're read using a while loop, like this:
Code:
while (current byte != 0xFF)
    copy this byte
    move to the next byte
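To show why the padding doesn't matter, here is a small Python sketch of that loop run over two 14-byte name fields, one padded with 00s and one padded with junk. The function name and the junk bytes are made up for the example.

```python
TERMINATOR = 0xFF

def read_string(memory, start):
    """Collect bytes from start up to, but not including, the 0xFF terminator."""
    out = bytearray()
    pos = start
    while memory[pos] != TERMINATOR:
        out.append(memory[pos])
        pos += 1
    return bytes(out)

name = bytes.fromhex("BCCFCCC800C2BFBBC6FF")
zero_padded = name + bytes([0x00] * 4)
junk_padded = name + bytes([0x12, 0x34, 0x56, 0x78])
print(read_string(zero_padded, 0) == read_string(junk_padded, 0))  # True
```

The loop never looks past the 0xFF, so whatever follows it is ignored.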
I suggest you just make an NPC give the player the item in his room or something, then have a save state to that part of the game (because we'll need to get here many more times).
Open VBA-SDL-H, and run your game. Make sure to get your item of choice. After you've received the item, you'll need to set a break point.
Set a break point at the start of the item's name for 14 bytes (or the item's name's length, doesn't matter). Unlike the regular bt [offset] break points we've been doing, this one is a little different. bt stands for break thumb, and it breaks when the instruction at that offset is executed. Here we just want to see when this text string starts being read. The text string isn't thumb code, it's plain data that never gets executed, so with "bt" the game isn't going to break. What you need to do is set a break upon read. This will break once the game starts reading data from that offset.
If you're using VBA-SDL-H the syntax of that is as follows:
bpr [offset] [length]
So for my Burn heal, that'll be, "bpr 083DB2BC 14"
Once you've done that, press "c + Enter" to continue running the game. Open the bag (in game) and locate your item. The game should break at this point.
Look at the underlined offset of the second picture. As you know, this is the previously executed command. It's loading into r2 a byte from r1, and from the circled blue part, we see that the value of r1 was 083DB2BC. By the way, did I mention 083DB2BC is the offset to the start of the string for "Burn Heal"? :P
OK, we've got a ROM address, time to break out our VBA emulator's nifty disassembler (or IDA, IDA is way better). Make sure you don't forget to open your ROM in it first though.
Once you've got VBA's disassembler opened (tools -> disassemble), in the Go box we're going to write the address of that ldrb r2, [r1, #0x0] command, which was 08008D90.
We're going to follow the same ritual we did last time. Keep on scrolling up until we find a push statement which pushes at least the link register.
A few scrolls later we've found the start of this function!
Let's see what this function is actually doing...
Code:
MAIN:
ROM:08008D84 PUSH {LR}
ROM:08008D86 MOV R3, R0
ROM:08008D88 B SECTION @label instead of offset for readability
SECTION2:
ROM:08008D8A STRB R2, [R3]
ROM:08008D8C ADDS R3, #1
ROM:08008D8E ADDS R1, #1
SECTION:
ROM:08008D90 LDRB R2, [R1]
ROM:08008D92 MOV R0, R2
ROM:08008D94 CMP R0, #0xFF
ROM:08008D96 BNE SECTION2 @I put a label here so it's easier to read
ROM:08008D98 MOV R0, #0xFF
ROM:08008D9A STRB R0, [R3]
ROM:08008D9C MOV R0, R3
ROM:08008D9E POP {R1}
ROM:08008DA0 BX R1
It's good practice to look at the code and try to make sense of what it's trying to do. I suggest that you look at it long and hard and try to come up with some pseudo code for what this is trying to accomplish. Once you've done that, look at my solution (in the spoiler tag).
As you can probably tell from my pseudo code, this is a function that copies an 0xFF terminated string from r1 into a destination defined by r0. For some reason GameFreak's code uses both r2 and r3 as well, which is inefficient, but in the end it gets the job done.
In other words, we've found the game's string copy function. Remember the freebie function I didn't use from last time? Well this is it :P
Now that we know how the function works, we can see that r0 contains the destination for the string and r1 contains the pointer to the string. I.e., r0 = destination, r1 = source.
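For anyone who wants to check their own pseudo code, here is one way to model the routine in Python, with the matching instruction noted beside each line. This is a sketch of my reading of the listing, not GameFreak's source. The memory is just a bytearray, so the indices are list positions rather than real GBA addresses.

```python
def str_copy(mem, r0, r1):
    """Model of the routine at 08008D84: r0 = destination, r1 = source."""
    r3 = r0                    # MOV R3, R0
    while True:
        r2 = mem[r1]           # LDRB R2, [R1]
        if r2 == 0xFF:         # CMP R0, #0xFF, then BNE falls through
            break
        mem[r3] = r2           # STRB R2, [R3]
        r3 += 1                # ADDS R3, #1
        r1 += 1                # ADDS R1, #1
    mem[r3] = 0xFF             # STRB R0, [R3]: terminate the copy
    return r3                  # MOV R0, R3: position of the new terminator

# Toy memory: source string at index 0, destination buffer at index 16.
mem = bytearray(32)
mem[0:10] = bytes.fromhex("BCCFCCC800C2BFBBC6FF")
end = str_copy(mem, 16, 0)
print(end, mem[16:end + 1].hex().upper())  # 25 BCCFCCC800C2BFBBC6FF
```

Note that the function hands back the position of the terminator it just wrote, which is what the final MOV R0, R3 does in the listing.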
Remember, r1 at this point contained the pointer to Burn Heal's name. That implies that before this function is even called the pointer to Burn Heal's name was already found. So we need to find the function that calls this string copying function.
Can you guess how we're going to find out where this string copy function is being called from? If you guessed that we're going to set a break point at the start of this function (08008D84) then you're right.
In VBA-SDL-H, type in "bprc" to clear all the break on read points we set up. We wouldn't want them interrupting us. After removing the break point, close the bag and hit F11 again to enter the debugger mode. We want to put our break point on string copy now.
Since this is a thumb function/instruction, we can do "bt 08008D84".
After doing that, "c + ENTER" to continue playing the game. Open your bag and navigate to the pocket your item is in, and the game should break.
Now here's the important part. We've discovered the game's string copy function. There is no doubt that this function will be called for all or most strings read directly from ROM into RAM (possibly even RAM to RAM). This means that it may break multiple times for different strings, not just our "BURN HEAL".
But FBI, how will we know when it's finally on our item? Easy. Remember that R1 contains the pointer to the source string. In my case, burn heal's string is located at 083DB2BC. I will hit c + ENTER until I see that R1 is 083DB2BC. Depending on how many items are in your bag, this may take you a few tries. For me it takes 2 c + ENTER cycles because burn heal is the only item in my bag (the other string, if you're curious, is "CANCEL").
So you see that the first break for me was on a pointer to 08452F60, which is definitely not burn heal. The second one (underlined in red) was a success! Now we want to find which function called the str copy function for the success case, so we will look at the previously run instruction (underlined in pink) in the above picture.
We've run into a problem: the previously run instruction is "blh $0fcc", which is not the instruction we're looking for! If you recall from last time, I said that this is actually a branch with link instruction whose first two bytes haven't been interpreted by the debugger. So the real instruction is at "08008DB6" minus 2, i.e. 08008DB4.
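If you want to convince yourself that "blh $0fcc" really is the tail end of a bl to the string copy function, the arithmetic can be sketched in Python. A thumb bl is two 16-bit halves: the first carries the upper 11 bits of the offset, the second carries the lower 11 bits, and the debugger prints that lower part shifted left by one. The function names are made up, and the two halfword values are reconstructed from the addresses in this tutorial rather than dumped from a ROM, so treat them as an assumption.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a signed number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def decode_thumb_bl(address, first, second):
    """Return the target of a thumb bl whose first halfword sits at address."""
    high = sign_extend(first & 0x7FF, 11) << 12   # upper part of the offset
    low = (second & 0x7FF) << 1                   # the part shown after blh
    return address + 4 + high + low

blh_address = 0x08008DB6          # where the debugger reported blh $0fcc
bl_address = blh_address - 2      # the real start of the instruction
# Halfwords reconstructed from the tutorial's addresses (assumption).
target = decode_thumb_bl(bl_address, 0xF7FF, 0xFFE6)
print("%08X -> %08X" % (bl_address, target))  # 08008DB4 -> 08008D84
```

The low 11 bits of 0xFFE6 are 0x7E6, and 0x7E6 shifted left by one is 0xFCC, which is where the $0fcc comes from.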
Jump back to the VBA emulator's disassembler and go to 08008DB4. Scroll up until you can see the whole function. Here we'll find a rather small function which calls our string copy function.
Code:
ROM:08008DA4 PUSH {LR}
ROM:08008DA6 MOV R2, R0
ROM:08008DA8 B 08008DAC
ROM:08008DAA ADDS R2, #1
ROM:08008DAC LDRB R0, [R2]
ROM:08008DAE CMP R0, #0xFF
ROM:08008DB0 BNE 08008DAA
ROM:08008DB2 MOV R0, R2
ROM:08008DB4 BL 08008D84 <-- str copy function we found is called here
ROM:08008DB8 POP {R1}
ROM:08008DBA BX R1
Try on your own to make sense of what's going on here. Try to develop some pseudo code to match, then look at my solution.
This is also just another while loop, but what it does is a little different. It reads an 0xFF terminated string, and finds the end. It then feeds a pointer to the end of that string (where the 0xFF is) to our string copy function as the destination. So basically, this function is concatenating two 0xFF terminated strings into one string. For example, it takes "play" and another string "ground" and turns it into "playground". A pretty neat function. It would probably mainly be used to attach a color label to strings. Like you've seen in scripting, you can add colors to strings by adding special characters to the start of the string.
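Here is the "playground" example as a small Python sketch, again modelling memory as a bytearray. To keep it readable the text is stored as plain ASCII plus an 0xFF terminator instead of the game's own character table, and the load helper exists only for this example.

```python
def str_copy(mem, dest, src):
    """Copy an 0xFF-terminated string; return the index of the new terminator."""
    while mem[src] != 0xFF:
        mem[dest] = mem[src]
        dest += 1
        src += 1
    mem[dest] = 0xFF
    return dest

def str_cat(mem, r0, r1):
    """Model of the routine at 08008DA4: append the string at r1 to the one at r0."""
    r2 = r0                       # MOV R2, R0
    while mem[r2] != 0xFF:        # LDRB / CMP / BNE: walk to the terminator
        r2 += 1                   # ADDS R2, #1
    return str_copy(mem, r2, r1)  # MOV R0, R2 then BL 08008D84

def load(mem, index, text):
    """Helper for this example: store text as raw bytes plus an 0xFF terminator."""
    data = text.encode("ascii") + b"\xff"
    mem[index:index + len(data)] = data

mem = bytearray(64)
load(mem, 0, "play")       # destination string, with free room after it
load(mem, 32, "ground")    # source string
end = str_cat(mem, 0, 32)
print(mem[0:end].decode("ascii"))  # playground
```

Notice that str_cat never touches r1 before handing it on, which matters for the back tracking that follows.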
Through my explanation of the function, surely you must've noticed. When str_copy (I'm going to use that name when referring to the string copy function) was called at the time of back tracking, it already had r1 as the source string, "BURN HEAL". This small str_cat (short for string concatenator) function also doesn't modify r1 in any way. So it's obvious then that our burn heal's string pointer was derived before the calling of this function. This means we have to back track a little further...
Again, delete all break points you might have (using "bprc" and "bd 0") in VBA-SDL-H. We're going to set a new break point at the start of the str_cat function (08008DA4). Make sure that before you set this break point, you've already obtained the item and it's in your bag.
Set the break point and try to view your item in your bag again. If you break before seeing your item (quite likely), then take a look at r1. If r1 isn't the pointer to your item's string then it's safe to skip. Skip using c + Enter, as mentioned before.
Once you get the right break, take a look at the previously executed command again. It's "blh $080a", but this time we know how to deal with that! Since this seemingly odd command happens at 08108598, then bl must've been 2 bytes prior.
Open up VBA's disassembler and go to 08108596.
We've found ourselves in a pretty big function, comparatively speaking.
Code:
ROM:08108560 PUSH {R4,R5,LR}
ROM:08108562 MOV R4, R0
ROM:08108564 LSL R1, R1, #0x10
ROM:08108566 LSR R5, R1, #0x10
ROM:08108568 LDR R0, =0xFE940000
ROM:0810856A ADDS R1, R1, R0
ROM:0810856C LSR R1, R1, #0x10
ROM:0810856E CMP R1, #1
ROM:08108570 BHI SECTION
ROM:08108572 LDR R1, =a489
ROM:08108574 MOV R0, R4
ROM:08108576 BL 08008D84 <---- STRING COPY FUNCTION
ROM:0810857A B 0810858C
----------------------- Some pointer data here
SECTION:
ROM:08108584 LDR R1, =a423
ROM:08108586 MOV R0, R4
ROM:08108588 BL 08008D84 <---- STRING COPY FUNCTION
ROM:0810858C MOV R0, R5
ROM:0810858E BL 0809A8BC <---- UNKNOWN FUNCTION
ROM:08108592 MOV R1, R0
ROM:08108594 MOV R0, R4
ROM:08108596 BL 08008DA4 <-------HERE'S WHERE OUR BREAK HAPPENED (STR CONCATENATE)
ROM:0810859A POP {R4,R5}
ROM:0810859C POP {R0}
ROM:0810859E BX R0
Alright, just from intuition, by looking at this function I can tell you that the function called at 0810858E (the one at 0809A8BC) is the one which retrieves the pointer to the string "BURN HEAL". While that may seem like a big jump in logic and rather rash without examining the rest of the function, I assure you that this is 100% the case. Here's the reasoning:
Remember when I was talking about parameters to ASM functions? I said that parameters, by ASM standards, are defined to be the first four low registers. If there are more than four parameters, that's a different story (the extra parameters are written to the stack). Output from a function works in a similar way. Generally, if a function outputs values or pointers for other functions to use (these are often called helper functions in other programming languages), the outputs are stored into r0-r3. They are always filled in consecutive order. So if some function outputted one value, that value would be in r0. Never will you see the value in r1, r2, or r3 and not in r0. Hopefully that makes sense to you, as it's important.
As you can see near the bottom, I've marked in caps where we broke from in our VBA-SDL-H session. The function we broke from is the str_cat function, which if you remember takes a destination in r0 and a source in r1. The source is obviously a pointer to your item's name. But if you look a couple of lines up you'll see "mov r1, r0" right after the "bl" to our unknown function. What this implies is that this unknown function outputted the pointer to Burn heal's string. If you don't believe me, set a break point before and after the unknown function (so at 0810858C and 08108592) and check the value in R0. In the case that I'm right, you'll notice that r0 will contain the pointer to your item's string after the call and not before.
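To make the register shuffle concrete, here is a rough Python trace of the last few instructions before the str_cat call. Everything about the unknown function is a stand-in: the lookup table, the item id of 1 and the 02000000 destination are placeholders picked for the example, and the idea that r5 holds an item id is my assumption. The only part taken from the listing is the order in which r0 and r1 get written.

```python
# Hypothetical stand-ins: the table contents and the lookup are assumptions.
ITEM_NAME_POINTERS = {1: 0x083DB2BC}   # item id 1 is an arbitrary placeholder

def unknown_function(r0):
    """Stand-in for 0809A8BC: takes a value in r0, returns a pointer in r0."""
    return ITEM_NAME_POINTERS[r0]

def trace_tail(r4, r5):
    """Follow r0/r1 through 0810858C-08108596 and report what str_cat receives."""
    r0 = r5                      # MOV R0, R5   (argument for the unknown function)
    r0 = unknown_function(r0)    # BL 0809A8BC  (result comes back in r0)
    r1 = r0                      # MOV R1, R0   (result becomes the source)
    r0 = r4                      # MOV R0, R4   (destination buffer)
    return r0, r1                # the values str_cat is called with

dest, src = trace_tail(r4=0x02000000, r5=1)
print("%08X %08X" % (dest, src))  # 02000000 083DB2BC
```

The only way r1 can end up holding the string pointer here is through the value that came back in r0, which is the whole argument in a nutshell.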
Remember this concept, as it's quite useful and it WILL save you a large chunk of work.