Oh, you’ve not hacked yet?

Newcomers to the HL2SDK often have the same question. They come from the world of HL1 where simple things, such as changing velocities and making players glow, were trivial. Not only trivial, in fact, but mod independent. This isn’t the case in HL2.

Instead, you’re faced with a nightmare: you must use CBaseEntity from a mod, but that mod has changed the virtual table layout. Calling CBaseEntity::SetHealth might work on, say, HL2MP — but not on CS:S or DoD:S. Some functions will crash, others simply won’t work. The function you need can’t be done through datamaps. What do you do?

You need to (a) find the function, (b) work out a way to find it at runtime, and (c) call it. In this article, I’ll concentrate on the latter two. Finding the function is often very difficult, but it can be done if you learn the tricks. Without going into details, it requires a bit of reverse engineering. I recommend getting a copy of OllyDbg (free) or IDA (not free) and disassembling the binary containing the function. Using literal strings and following the flow of execution, you can often find what you’re looking for quite quickly.

As an example, I wanted the function in CS:S that terminates the round. I know from experience that it displays the translated message for “#Round_Draw”. After mucking around in IDA, I find a function at 0x2205A240.
Because of the other strings it uses, it looks like a good candidate! (In fact, this is something CS:S DM uses.) I’ve completed the hardest part: finding the function!

However, it’s quite likely that 0x2205a240 will change next time CS:S is recompiled. I need a way to find this function at runtime – something fysh and lance have called “signature scanning”. For today, I’ll cover only “stupid” signature scanning – a very primitive way of storing the function header and scanning for it. The idea is that every time your plugin loads, you will search the entirety of the DLL’s memory for a certain sequence of bytes (the first ~32 bytes of the function). If you find it, then you have found the address of the function. There are two important things to consider: Is the sequence unique, and could the sequence itself change at runtime?
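As a rough illustration of the “stupid” scan described above, here is a minimal sketch. It assumes you already have a pointer to the start of the DLL’s memory and its length (finding those is not covered here). The name FindExact and its parameters are made up for this example.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: returns the first address in [base, base + len)
// whose next sig_len bytes equal sig exactly, or nullptr if there is none.
const unsigned char *FindExact(const unsigned char *base, std::size_t len,
                               const unsigned char *sig, std::size_t sig_len)
{
    if (sig_len == 0 || sig_len > len)
        return nullptr;

    // Try every offset at which the whole signature still fits.
    for (std::size_t i = 0; i + sig_len <= len; ++i)
    {
        if (std::memcmp(base + i, sig, sig_len) == 0)
            return base + i;
    }
    return nullptr;
}
```

This only returns the first match, so it says nothing about whether the sequence is unique. Checking that would mean continuing the scan after a hit.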

The answer to both is usually yes. Not because the code is self-modifying (although CS:S DM has a bug where it edits a function, doesn’t restore it, and then cannot find it again), but because code is relocatable: offsets to other sections will definitely change. Linux is even more random in this regard: ELF libraries rarely load at the same address each time. First, let’s do the simple part: store the first 32 bytes of the function.

#define MKSIG(name) name##_Sig, name##_SigBytes
#define RoundEnd_Sig "\x83\xEC\x18\x53\x55\x8B\xE9\x8B\x4C\x24\x28\x56\x57\x33\xFF\x8D"
#define RoundEnd_SigBytes 16

I’ve only done 16 bytes but you get the idea. Say we extend this signature to 64 bytes because it’s not unique. A common problem occurs: there is an instruction whose bytes have a high chance of changing at any time because of relocation:

jmp     ds:off_2205A4A0[eax*4]

A simple investigation of this offset shows it looks like an array of addresses. Jumping to an array of addresses? This must be a case table for a switch statement. Anyway, the bytes of this instruction in the disassembly look like:

FF 24 85 A0 A4 05 22
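To make the byte layout concrete, here is a small sketch that pulls the address back out of those seven bytes. The names ReadLE32 and kJmpTable are invented for this example. The first three bytes are the opcode, ModRM and SIB bytes of the jmp; the last four are the 32-bit address, least significant byte first.

```cpp
#include <cstdint>

// Hypothetical helper: reads a 32-bit little-endian value starting at p
// (least significant byte first).
std::uint32_t ReadLE32(const unsigned char *p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// The instruction bytes from the disassembly:
//   FF 24 85     -> jmp through a table indexed by eax*4
//   A0 A4 05 22  -> the table address, stored little-endian
static const unsigned char kJmpTable[] = {0xFF, 0x24, 0x85, 0xA0, 0xA4, 0x05, 0x22};

// Skipping the three opcode bytes recovers the address from the listing.
inline std::uint32_t JmpTableAddress()
{
    return ReadLE32(kJmpTable + 3);   // 0x2205A4A0
}
```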

Since we know 0x2205A4A0 could easily change from a recompile or base relocation*, let’s introduce a wildcard character: ‘*’ (0x2A). The address is stored in little-endian order, so it occupies the last four bytes of the instruction, least significant byte first. Our signature for this instruction would look like:

#define gaben "\xFF\x24\x85\x2A\x2A\x2A\x2A"
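To show how a scanner might honour that wildcard, here is a minimal sketch of a matcher. It assumes the convention above, where a 0x2A byte in the signature matches any byte in memory. FindSig and its parameters are hypothetical names, and base/len are assumed to describe the DLL’s memory.

```cpp
#include <cstddef>

// Hypothetical helper: like a plain byte search, except that a 0x2A ('*')
// in the signature matches any byte. The length is passed explicitly so
// that signatures containing \x00 still work.
const unsigned char *FindSig(const unsigned char *base, std::size_t len,
                             const char *sig, std::size_t sig_len)
{
    if (sig_len == 0 || sig_len > len)
        return nullptr;

    for (std::size_t i = 0; i + sig_len <= len; ++i)
    {
        std::size_t j = 0;
        // Advance while each byte is either a wildcard or an exact match.
        while (j < sig_len &&
               (sig[j] == '\x2A' ||
                static_cast<unsigned char>(sig[j]) == base[i + j]))
        {
            ++j;
        }
        if (j == sig_len)
            return base + i;
    }
    return nullptr;
}
```

With the macros defined earlier, a call could look like FindSig(base, len, MKSIG(RoundEnd)), since MKSIG expands to the signature and its length. One trade-off of this scheme: a genuine 0x2A byte in the function is also treated as a wildcard, which makes the match slightly looser than intended.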

Now, I have a complete signature. Tomorrow: how to find the DLL in memory, and how to scan for the address!

*– For those paying attention, that address is actually within the code. It has no chance of being relocated. However, it could definitely change with a recompile. A better example would be an instruction referencing the .data section, an E8 eip-relative call, or one of those string offsets.
