Archive for October, 2005

Decompiling AMX Plugins, Part 7

Sunday, October 30th, 2005

Let’s continue decompiling client_prethink of hackmod.amx. If you recall, last time we were able to glean code something like this:

new g_PlayerArray[33]

public client_prethink(id)
{
   if (get_cvar_num("hackmod_active") != 0)
      return PLUGIN_CONTINUE
   if (!is_user_alive(id))
      return PLUGIN_CONTINUE

   if (g_PlayerArray[id])
   {
      //Backtracking a bit, we have two mysterious vars created before 
      //the other two.  I excluded these because they weren't used last time:
      //0xC178      PUSH.C               0x270F
      //0xC18C      PUSH.C                  0x0
      //0xC1A0      STACK                 -0x80
      //0xC1BC      PUSH.C                  0x0

      new var3 = 9999 //FRM-0x04
      new var4        //FRM-0x08
      new var1[32]    //FRM-0x88
      new var2        //FRM-0x8C

      get_players(var1, var2, "ae", get_user_team(id)==1 ? "TERRORIST" : "CT")

      for (new i=0; i<var2; i++)
      {
      }
   }
}

Let’s continue with the delicious, meaty internals of this for loop:

 0xC2CC      ADDR.alt              -0x88
 0xC2D4      LOAD.S.pri            -0x90
 0xC2DC      BOUNDS                 0x1F

FRM-0x88 is the address of the var1 array. FRM-0x90 is the i variable. Note that the compiler is indexing this array without temporarily storing it. That means the author did this:
var1[i]

Rather than this:
new player
...
player = var1[i]

What’s the result of not storing it in a temporary variable? Because the compiler optimizes so poorly, it reindexes the array on every use. Each use costs three to five instructions for something that should take zero or one.

if (can_see(id, var1[i]))
{
   //This was mapped as func_04, which is clearly a stock.
   //disassembling it and looking in VexdUM_stoc.inc will show you that it's
   //get_entity_distance()
   //note that this branches to jump_0421 as well as if the parent if case fails.
   //this must mean that the if case has no alternative branches,
   //otherwise it would jump somewhere else (Small has no gotos).
   if (get_entity_distance(id, var1[i]) <= var3)
   {
      //Note this time that var1[i] is recalculated, as is the entire
      // get_entity_distance function! for performance reasons, this should always
      // be saved in a temporary variable.
      //0xC3B0      STOR.S.pri             -0x4
      //Also note, STOR.S.pri is storing the result back into var3.
      var3 = get_entity_distance(id, var1[i])
      //Finally, we see what the mysterious var4 is for... it's storing a player index 
      //guess? the aimbot is storing the closest player.
      var4 = var1[i]
   }
}
//0xC3E8      JUMP              jump_0423 ; target:jump_0421
//This is the target of both cases of both if statements.  
//Everything ends up here, and jumps back to jump_0423, which is...
//0xC294      LINE             line 0x543 ; target:jump_0423
//The start of the for loop! We're finished.

A final look at the for loop:

public client_prethink(id)
{
   if (g_PlayerArray[id])
   {
      new var3 = 9999 //FRM-0x04
      new var4        //FRM-0x08
      new var1[32]    //FRM-0x88
      new var2        //FRM-0x8C

      get_players(var1, var2, "ae", get_user_team(id)==1 ? "TERRORIST" : "CT")

      for (new i=0; i<var2 ; i++)
      {
         if (!can_see(id, var1[i]))
            continue
         if (get_entity_distance(id, var1[i]) <= var3)
         {
            var3 = get_entity_distance(id, var1[i])
            var4 = var1[i]
         }
      }
      //CODE
   }
}
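
To make the cost of the repeated get_entity_distance() call concrete, here is a minimal Python sketch of the same search with the distance kept in a temporary. The name closest_visible and the can_see/distance callables are stand-ins invented for illustration; they are not part of the plugin or of any AMX API.

```python
def closest_visible(me, players, can_see, distance, limit=9999):
    """Return (best_player, best_distance); best_player is 0 when nobody qualifies."""
    best_player = 0          # plays the role of var4
    best_distance = limit    # plays the role of var3
    for other in players:    # 'other' is the temporary the plugin never stored
        if not can_see(me, other):
            continue
        d = distance(me, other)   # computed once per candidate, then reused
        if d <= best_distance:
            best_distance = d
            best_player = other
    return best_player, best_distance
```

With the temporary in place, each candidate costs one distance call instead of two, and the array is indexed once per iteration.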

Let’s continue on, from where I wrote “CODE”:

if (var4 <= 0)
{
   //I marked these as Float: because of the natives they're used in
   new Float:local1[3]   //FRM-0x98
   new Float:local2[3]   //FRM-0xA4
   
   //Next we encounter 0x1B being passed to entity_get_int.
   //Which number in the enum set is this?  27, or EV_INT_flags
   //Next we have flags 0x4000, which is (1<<14), or FL_DUCKING
   //Furthermore, note how the compiler trickily compiles this if block.
   //It can be split into: if ( (EXPR) && (EXPR) )
   //The compiler makes each EXPR into a mini-if case.  If all cases are correct,
   // it sets PRI to 1.  Otherwise, it breaks out, setting PRI to 0.
   //A final case checks whether PRI is 1 or 0 to make the statement work.
   if ((entity_get_int(var4, EV_INT_flags) & FL_DUCKING) && (entity_get_int(var4, EV_INT_bInDuck)))
   {
      //Here, we see the compiler calculating the address of local1[2],
      // saving the address on the stack, adding the value to 0x40E00000 (7.0),
      // then popping the address and storing the result into it.
      local1[2] += 7.0
      //Here, we just have simple storage
      //Note the compiler calculates the address in PRI because of ADD.C
      // only being able to work with one register.  It then moves PRI to ALT
      // in order to use STOR.I.
      local2[1] = 0.2
      //BRANCH TO jump_0429
   } else {  //BRANCH TO jump_0428
      if (var3 < 800)
         local1[2] += 2.0
   }
   //jump_0429
   if (var3 >= 800)
   {
      //Make sure to keep track of the stack on this one!
      //With only two registers, the compiler constantly saves
      //PRI and ALT, often simply to switch them around later.
      local1[2] -= var3/1600
   }
   set_aim(id, var4, local2)
   if (g_PlayerArray2[id])
   {
      new clip     //FRM-0xA8
      new ammo     //FRM-0xAC
      new weapon     //FRM-0xB0
      
      weapon = get_user_weapon(id, ammo, clip)
      
      //Note the same trick is used as in the previous double if statement.
      //g_PlayerArray3 is 0x3D4
      if (clip <= 0 && !g_PlayerArray3[id])
      {
         //Yet _another_ global array is referenced here.
         //0x140
         //Note the indexing...
         //PRI = &g_PlayerArray[id]
         //ALT = PRI
         //PRI = *PRI
         //PRI += ALT
         //That is how multi-dimensional arrays work in Small/Pawn;
         // the sub dimension is offset from the first, so g[A][B]
         // is located at (&g[A] + g[A] + B)
         //There is no B here, so B=0
         g_PlayerArray4[id][0] = var4
         g_PlayerArray4[id][1] = 0
         g_PlayerArray3[id] = 1
         //this is probably an array of weapons...
         //maybe reload times for each one?
         set_task(g_WeaponArray[weapon], "reset_wait_shoot", 1000+id)
      }
   }
}

Phew! That was a lot. We couldn't quite make it to CASETBL today, but we covered multi-part if statements and multidimensional arrays. A good portion of it made little to no sense. What's the point of local1 and local2? What are reset_wait_shoot and set_aim()? My guess is the aimbot turns off temporarily while you are reloading. I can't quite fathom what the unused local variable is for, but it's possible I made an error in the disassembly.
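
The multidimensional indexing rule, g[A][B] located at (&g[A] + g[A] + B), can be modeled in a few lines of Python. This is only a sketch: the 33x2 shape, the base address, and the assumption that the sub-arrays sit directly after the offset cells are mine, chosen to mirror the disassembly rather than taken from the compiler's documentation.

```python
CELL = 4  # bytes per cell in the 32-bit AMX

def build_2d(base, rows, cols):
    """Lay out a rows x cols array at byte address `base`.
    Returns a dict mapping byte address -> cell value."""
    mem = {}
    data_start = base + rows * CELL           # assumed: sub-arrays follow the offset cells
    for a in range(rows):
        slot = base + a * CELL                # &g[a]
        row = data_start + a * cols * CELL    # where g[a][0] lives
        mem[slot] = row - slot                # g[a] holds an offset relative to its own slot
        for b in range(cols):
            mem[row + b * CELL] = 0
    return mem

def address_of(mem, base, a, b):
    """Byte address of g[a][b], following the instruction sequence from the disassembly."""
    slot = base + a * CELL                    # PRI = &g[a]
    return slot + mem[slot] + b * CELL        # ALT = PRI; PRI = *PRI; PRI += ALT; then add b cells
```

For example, with a 33x2 array at 0x140, address_of(mem, 0x140, 5, 1) resolves to 0x1F0: the slot at 0x154 holds the offset 0x98, and one more cell is added for B=1.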

Next time: Since this portion of client_PreThink is finished, we'll disassemble reset_wait_shoot and set_aim. The full function so far is below.

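new g_PlayerArray[33]
new g_PlayerArray2[33]
new g_PlayerArray3[33]
new g_PlayerArray4[33][2]   //indexed as [id][0] and [id][1] below; the real second dimension may be larger
new g_WeaponArray[30]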

public client_prethink(id)
{
   if (get_cvar_num("hackmod_active") != 0)
      return PLUGIN_CONTINUE
   if (!is_user_alive(id))
      return PLUGIN_CONTINUE

   if (g_PlayerArray[id])
   {
      new var3 = 9999, var4, var1[32], var2

      get_players(var1, var2, "ae", get_user_team(id)==1 ? "TERRORIST" : "CT")
      for (new i=0; i<var2 ; i++)
      {
         if (!can_see(id, var1[i]))
            continue
         if (get_entity_distance(id, var1[i]) <= var3)
         {
            var3 = get_entity_distance(id, var1[i])
            var4 = var1[i]
         }
      }
      if (var4 <= 0)
      {
         new Float:local1[3], Float:local2[3]
   
         if ((entity_get_int(var4, EV_INT_flags) & FL_DUCKING) && (entity_get_int(var4, EV_INT_bInDuck)))
         {
            local1[2] += 7.0
            local2[1] = 0.2
         } else {
            if (var3 < 800)
               local1[2] += 2.0
         }
         if (var3 >= 800)
         {
            local1[2] -= var3/1600
         }
         set_aim(id, var4, local2)
         if (g_PlayerArray2[id])
         {
            new clip, ammo, weapon
      
            weapon = get_user_weapon(id, ammo, clip)
      
            if (clip <= 0 && !g_PlayerArray3[id])
            {
               g_PlayerArray4[id][0] = var4
               g_PlayerArray4[id][1] = 0
               g_PlayerArray3[id] = 1
               set_task(g_WeaponArray[weapon], "reset_wait_shoot", 1000+id)
            }
         }
      }
   }
}

HLDM Match

Friday, October 28th, 2005

I have been playing CS/HLDM for the past day or two for the first time in quite a while – I forgot how addicting games could be, as I’ve not had time to play them. When I made note of this in #amxmodx, hullu (aka “evilspy”, the man behind Metamod-P) challenged me to an HLDM match.

HLDM was the first real FPS I played online, and from the first day until now my favorite weapon of any game has been the tau cannon, and my favorite map of any game has been crossfire. Naturally, I couldn’t refuse a battle of the Metamod developers. I put up a server (nope – not even running Metamod!) and we played.

Hullu was pretty good. On crossfire he narrowly outplayed me and on stalkyard he mopped the terrain with my corpses. All in all he out shotgun’d, crossbow’d, MP5’d, and tau’d me. Obviously, this means Metamod-P is better than Metamod.

At various points a few other #amxmodx idlers joined in – Bloodmist and Greentryst. Screenshots:
[Screenshots: Crossfire, Crossfire, Stalkyard]

I have to admit, the best part was HUD messages like these:

Decompiling AMX Plugins, Part 6

Friday, October 28th, 2005

Two days late, here is the next installment of the decompiling/disassembly article. Today I’ve chosen to decompile a portion of a real-life closed-source plugin. This plugin is “Hack Mod” by [email protected]. I chose this plugin because a user who requested the source code as required by the GPL was denied access. The original link is here, or you can download my mirror of the binary:

www.bailopan.net/hackmod.amx

The first thing you should notice when loading this plugin into the disassembler is that it’s large: 9,200 lines of disassembly. Given about 5-15 lines of disassembly per actual line of source, it’s safe to estimate this plugin at roughly 1,000 lines of code.

The second thing you should notice is the set of odd, unnamed functions at the top of the script. The first one is this:

 0x8         PROC                        ; func_00
 0x18        PUSH.S                 0x10  ;2nd parameter of function
 0x20        PUSH.C                  0x4  ;4bytes=1parameter
 0x28        SYSREQ.C              float
 0x30        STACK                   0x8
 0x38        PUSH.pri                     ;push result of float()
 0x3C        PUSH.S                  0xC  ;push first parameter
 0x44        PUSH.C                  0x8  ;bytes=2parameters
 0x4C        SYSREQ.C           floatmul
 0x54        STACK                   0xC
 0x5C        RETN                         ;return floatmul() result
; natives used:
native Float:float(value);
native Float:floatmul(Float:oper1, Float:oper2);

This is a stock function. Stocks are only compiled if they are used, and since they live in the .inc files, they generally come first in the code. But which stock is it? Using our native definitions, we can decompile this to:

stock stock001(param1, param2)
   return floatmul(param1, float(param2))

Let’s correct this by adding in the tags these natives return/require:

stock Float:stock001(Float:param1, param2)
   return floatmul(param1, float(param2))

This function takes a float and a non-float and multiplies them. Which stock is this? Look in float.inc:

stock Float:operator*(Float:oper1, oper2)
	return floatmul(oper1, float(oper2)); /* "*" is commutative */

This is the code for multiplying a float with a non-float. Although the x87 FPU or SSE can do this in a few very fast instructions, Small requires a stock and two natives. Floats in Small are expensive. Small has no datatypes other than the cell, so tags do not enforce any type rules; rather, they let you overload operators and simulate type checking with “tag checking”. Since floats are not native, and simply wrap around tagging, you unfortunately cannot optimize them at the VM level unless new instructions are added and the compiler has support for them. (Since the compiler already has special support for floats, I do not think this would break the Small design.)
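
Since a float in Small is just a cell carrying an IEEE 754 bit pattern, the stock can be imitated in Python with the struct module. This is a rough sketch: native_float, native_floatmul and operator_mul are hypothetical stand-ins for the float() and floatmul() natives and the operator* stock, not their real implementations.

```python
import struct

def float_to_cell(value):
    """Reinterpret a float as the 32-bit integer pattern a cell would hold."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

def cell_to_float(cell):
    """Reinterpret a 32-bit cell as a float."""
    return struct.unpack("<f", struct.pack("<I", cell & 0xFFFFFFFF))[0]

def native_float(int_cell):
    """Stand-in for the float() native: integer value -> float bit pattern."""
    return float_to_cell(float(int_cell))

def native_floatmul(cell_a, cell_b):
    """Stand-in for floatmul(): both operands are float bit patterns."""
    return float_to_cell(cell_to_float(cell_a) * cell_to_float(cell_b))

def operator_mul(float_cell, int_cell):
    """Mirrors the stock: floatmul(oper1, float(oper2))."""
    return native_floatmul(float_cell, native_float(int_cell))
```

The same reinterpretation is what makes constants such as 0x40E00000 readable in a disassembly: float_to_cell(7.0) produces exactly that value.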

Now, let’s take a quick stroll through the strings in this plugin. Most of these things look pretty generic; maxspeeds, gravity, hit boxes, respawning, et cetera. However, this plugin also does some aimbot tricks I’d like to know about. From browsing the public functions, the author made the standard mistake of naming a private function as public: “set_aim”. This is a red flag, and it’s called from client_prethink. The client_PreThink forward was added in AMX Mod X 0.10, and presumably duplicated in a later version of VexD (with different capitalization, of course, to distort origins). This forward is very expensive – it’s called once per server frame for each client. That means, 32 times per frame, your server could be jumping from native code to a Small plugin.

At COD address 0xC0A0 you will see the PROC for client_prethink in hackmod.amx. It starts out fairly simply:

public client_prethink(id)
{
   if (get_cvar_num("hackmod_active") != 0)
      return PLUGIN_CONTINUE
   if (!is_user_alive(id))
      return PLUGIN_CONTINUE
}

Next, we come across a new instruction:

 0xC148      CONST.alt             0x2CC
 0xC150      LOAD.S.pri              0xC
 0xC160      LIDX      
 0xC164      JZER              jump_0416

LIDX is one of the instructions used to access arrays. Using ALT as the base address, it gets the cell located at the address specified at ALT+(PRI*cellsize). The address 0x2CC is clearly something the disassembler couldn’t quite figure out how to analyze: it sees it as an 825 cell array. Since this array is indexed by the “id” parameter, let’s assume it’s just an array for players. Then we have:
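
As a small illustration of the LIDX/JZER pair, here is a Python model of the load and the branch decision. The memory dictionary, the 33-cell size and the flag stored for player 5 are all made up for the example; only the address formula comes from the description of the instruction.

```python
CELL = 4  # bytes per cell on the 32-bit AMX

def lidx(memory, alt, pri):
    """LIDX: load the cell at byte address ALT + PRI * cellsize."""
    return memory[alt + pri * CELL]

def branch_taken(memory, base, player_id):
    """True when JZER would jump, i.e. when the loaded cell is zero."""
    return lidx(memory, base, player_id) == 0

# Hypothetical DAT contents: a 33-cell array at 0x2CC with only player 5 flagged.
BASE = 0x2CC
memory = {BASE + i * CELL: 0 for i in range(33)}
memory[BASE + 5 * CELL] = 1
```

Here branch_taken(memory, BASE, 5) is False, so execution falls through into the if body, while any other id jumps to jump_0416.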

   if (g_PlayerArray[id])
   {
      new var1[32], var2

      //Note that the branch taken by get_user_team() looks like:
      // 0xC224      JZER              jump_0417
      // 0xC22C      CONST.pri            0xA44C
      // 0xC234      JUMP              jump_0418
      // 0xC23C      CONST.pri            0xA458 ; target:jump_0417
      //Since it only serves to switch between two PRI values, my conclusion
      // is that it was embedded in a ternary operator.  This makes sense because
      // the value of PRI is immediately pushed into the next function call.
      get_players(var1, var2, "ae", get_user_team(id)==1 ? "TERRORIST" : "CT")
      //CODE
   } else {  //BRANCHED to jump_0416
   }

When we get to the marker where I wrote “CODE”, there are some tricky jumps. It jumps past a single opcode, then jumps back to before this opcode. I.e., an initial value is set, then the code iterates until a condition is no longer met. Each iteration, that originally skipped block of code is run. We’ve encountered a for-loop. To make this easier to follow, I’ve broken up the loop specifics:

 
for (
   // 0xC284      PUSH.C                  0x0
   // 0xC28C      JUMP              jump_0419
 new i = 0;
   // 0xC2A8      LOAD.S.pri            -0x90 ; target:jump_0419
   // 0xC2B0      LOAD.S.alt            -0x8C
   // 0xC2B8      JSGEQ             jump_0420
 i < var2;
   // 0xC294      LINE             line 0x543 ; target:jump_0423
   // 0xC2A0      INC.S                 -0x90
 i++)
{
  //LOOP CODE
}

What is the compiler doing? It's pushing a 0 onto the stack (this is the same as STACK -4, except it doesn't have to explicitly FILL and ZERO it). This puts the loop value at address FRM-0x90. Then it jumps past the increment step and does the comparison step. If the value of FRM-0x90 is less than FRM-0x8C (i, var2), then it will continue. It continues by jumping back to the increment+comparison step, which we can see will be a jump to jump_0423.
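
The jump layout is easier to follow when traced by hand. The Python sketch below walks the same three labels (increment, compare, body) as a tiny state machine; run_loop and the label names are my own, and the body is reduced to recording the value of i.

```python
def run_loop(var2):
    """Trace the control flow emitted for: for (new i = 0; i < var2; i++)."""
    trace = []
    i = 0                          # PUSH.C 0 creates i at FRM-0x90
    label = "compare"              # JUMP jump_0419 skips the increment the first time
    while True:
        if label == "increment":   # jump_0423: INC.S -0x90
            i += 1
            label = "compare"
        elif label == "compare":   # jump_0419: JSGEQ jump_0420 leaves when i >= var2
            if i >= var2:
                break
            label = "body"
        else:                      # loop body, which ends with JUMP jump_0423
            trace.append(i)
            label = "increment"
    return trace
```

Calling run_loop(3) visits the body with i = 0, 1, 2, and run_loop(0) never enters it, which is the behavior expected of a for loop.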

Next time: we'll continue to disassemble client_prethink and encounter the hideous CASETBL instruction.

Stupid Crash Bug 2

Tuesday, October 25th, 2005

I spent a good two hours trying to narrow down a crash in CS:S DM on Windows. Unhelpfully, it would only crash in Release mode, making debugging very difficult.

Finally I narrowed it down to the value of the esi register being corrupted, instead of being saved, by a call. Why? Dig up the “dropgate.asm” file I posted in the first Non-Virtual Function Hooking article:

	push	edi
	push	esi
; [...]
	pop	edi
	pop	esi

Oops. Right intention, but wrong order. The registers were essentially being swapped, and after the call, their values were completely wrong.

Lesson learned from this bug: Always, always, verify that you are restoring registers in the correct, reverse order.
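
The swap is easy to reproduce with a plain list standing in for the machine stack. This is a toy Python model, not the dropgate code: call_with_saved_registers and the register values are invented for the demonstration.

```python
def call_with_saved_registers(regs, restore_order):
    """Push edi then esi, clobber both, then pop into the registers named in restore_order."""
    stack = []
    stack.append(regs["edi"])          # push edi
    stack.append(regs["esi"])          # push esi
    regs = dict(regs, edi=0, esi=0)    # the routine body is free to clobber them
    for name in restore_order:         # each pop takes the most recently pushed value
        regs[name] = stack.pop()
    return regs
```

Restoring with ["esi", "edi"] (reverse of the push order) returns the original values; restoring with ["edi", "esi"], as the buggy code did, hands each register the other one's value.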

Stupid Crash Bug

Monday, October 24th, 2005

I spent a few hours debugging this line of code. It was crashing, and I was debugging it at the assembly level assuming it was a SIGSEGV. I was so confused I began to think it was a code generation error.

int r = rand() % gpGlobals->maxClients;

Of course, everything looked fine. Finally, I realized it wasn’t a SIGSEGV, but a divide by zero. The modulus operator compiles to an “IDIV” instruction, and when there are zero players the divisor is zero and this will crash.

Take two lessons from this: always read the error messages, and if you’re making mistakes like this, it’s time to stop coding for a few hours.
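
For completeness, the guard that would have avoided the crash looks something like the following Python sketch. The function name pick_random_client and the choice to return None are assumptions for illustration; the original code is C++ and would need its own convention for "no client".

```python
import random

def pick_random_client(max_clients, rng=random):
    """Return an index in [0, max_clients), or None when there is nothing to pick from."""
    if max_clients <= 0:
        return None                    # the unguarded version divides by zero here
    return rng.randrange(1 << 15) % max_clients
```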

Disassembly series will return on Wednesday.

Decompiling AMX Plugins, Part 5

Monday, October 17th, 2005

I’m skipping order a bit to take a break. First, Wraith has released a shiny new version of AMXReader! This version is more modular and organized, supports new Small 3.0 opcodes, and can read AMX Mod X 1.60 plugins. Lastly, there’s a binary only download now.

Today I’d like to look back on what we’ve accomplished so far, and note some rules from places where optimization was obvious.

Rule #1: Only calculate something once!
We saw in sample.amx that get_user_team was called in each if case. We also saw in the disassembly that this added a good deal of extra instructions, not counting the hundreds of actual processor instructions needed for SYSREQ.C to jump from the AMX to the C native, run the native itself, and return to the caller. It is an expensive operation, and get_user_team itself could have been very expensive (although it’s not).

The rule of thumb is to calculate once and save the result, because reading a stored value back from memory is always faster than recomputing it through a native call:

new team = get_user_team(id)
if (team == 1) 
//...
else if (team == 2)
//...

Rule #2: Watch array limits!
Small itself checks array bounds for direct accesses, but natives don’t have to, and usually don’t even try. For example:

new str[]="60"
new array[32]
array[str_to_num(str)] = 5

Will generate an AMX_ERR_BOUNDS error. However, this will not:

new gaben[60]
new array[1]
copy(array, 60, "Hello!")

Obviously, I meant to copy to gaben, but instead I’ve copied into array. Remember the stack layout before? Each variable is laid on top of the other on the stack. The copy function will simply copy up to 60 cells with no bounds checking, writing an ‘H’ into array, but filling gaben with the rest. This leads to all sorts of wacky errors and often crashes. The lesson is: always pass a length at least one less than the buffer’s actual size! An example of this problem in action is a mistakenly filed bug report from a user seeing a totally different variable being clipped by a string terminator.
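
Here is a toy Python reconstruction of that overflow. The 61-cell list models the two adjacent stack variables, and unchecked_copy is a made-up stand-in that behaves the way the text describes the native (it trusts the length it is given); it is not the real copy() implementation.

```python
def unchecked_copy(stack, dest, max_len, text):
    """Write up to max_len characters plus a terminator starting at index dest.
    Like the native, it trusts max_len and never checks the real buffer size."""
    chars = [ord(c) for c in text[:max_len]] + [0]
    for offset, value in enumerate(chars):
        stack[dest + offset] = value
    return len(chars) - 1

# Cell indices grow upward; `array` (1 cell) sits directly below `gaben` (60 cells).
ARRAY_AT, GABEN_AT = 0, 1
stack = [0] * 61
unchecked_copy(stack, ARRAY_AT, 60, "Hello!")
array = stack[ARRAY_AT:GABEN_AT]
gaben = stack[GABEN_AT:]
```

After the call, array holds only the 'H' and gaben starts with "ello!" followed by the terminator, even though gaben was never named in the call.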

Rule #3: If you’re going to repeat something, cache it.
This rule extends from #1. For example, we noticed that the DAT section is not optimized. You can repeat a string 800 times and the DAT section will have an entry for each repetition. The solution?

//BAD
#define COMMON_STRING   "Gaben"
//GOOD
new COMMON_STRING[] = "Gaben"

This way, DAT will only have one reference. Furthermore, it might even be faster because in the case of non-const the compiler won’t have to copy it into the heap first. In many cases you know it will never be modified, and the native declaration simply forgot a ‘const’ keyword.
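
To put a number on the savings, here is a back-of-the-envelope Python sketch. It assumes unpacked strings (one cell per character plus a terminator cell) and models the DAT section as nothing more than the sum of its literals; dat_cells is a hypothetical helper, not something the compiler exposes.

```python
def dat_cells(literals, dedupe=False):
    """Cells needed to store string literals unpacked (one cell per char plus terminator)."""
    seen = set()
    total = 0
    for text in literals:
        if dedupe and text in seen:
            continue                   # a shared global is stored only once
        seen.add(text)
        total += len(text) + 1
    return total
```

Under those assumptions, 800 uses of "Gaben" cost 4,800 cells as repeated literals and 6 cells as a single global.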

Another example:

//BAD
stock GetWeaponName(id, name[], max)
{
   if (id==0)
      copy(name, max, "weapon_none")
   else if (id==1)
      copy(name, max, "weapon_gaben")
   //....
}
//GOOD
new g_WeaponNames[][] = {
    "weapon_none",  //0
    "weapon_gaben"  //1
}

A lookup table executes near-instantaneously. The compiler simply has to add an integer to a base memory offset. Calling a function is expensive: branch prediction, cache misses, and extra instructions can add up very quickly. If you’re using the function very often, it’s a good idea to take the time to make these trivial and worthwhile optimizations. You can do lookup tables in Small for anything which inputs an integer and outputs any other data type, as long as the set of inputs has fixed limits. It is okay to build the lookup table at runtime, as long as you don’t need to rebuild it so often that rebuilding costs more than computing the entries dynamically.
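
The same idea in a few lines of Python, as a sketch only: WEAPON_NAMES and weapon_name are illustrative names, and the explicit range check is my addition to show where the "fixed limits" on the input come in.

```python
WEAPON_NAMES = ("weapon_none", "weapon_gaben")   # index is the weapon id

def weapon_name(weapon_id):
    """Table lookup; returns None for ids outside the table instead of guessing."""
    if 0 <= weapon_id < len(WEAPON_NAMES):
        return WEAPON_NAMES[weapon_id]
    return None
```

The if/else chain grows by one comparison per weapon, while the table costs one bounds check and one index no matter how many entries it holds.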

Rule #4: You don’t need to make everything public.
Many people simply prepend “public” to every single function. You don’t need to do this. While it only adds a few bytes (it must store the name of the function), it’s bad design. Public means “external,” or “visible to everyone”. Private functions should not have this keyword. In the “sample.amx” example, “func_00” was private, as it should have been.

Rule #5: You don’t need to return PLUGIN_CONTINUE and PLUGIN_HANDLED every time.
The compiler will automatically return 0 for you. However, because people tend to not obey #4, they also tend to randomly return either value at the end of every single function. This is not necessary. The rules for returning are as follows:
Explicit return – a return where you return a value (return X;)
Implicit return – a return where you specify no value (return;)
If you specify an explicit return, you must specify an explicit return for each control path.
If you specify an implicit return, the compiler will return 0 for every control path.
If you specify no returns, the compiler will return 0 for every control path.

Lastly, the majority of public forwards do nothing with the return value. For example, returning an explicit value in client_disconnect, a set_task, or plugin_init will do absolutely nothing.

Decompiling AMX Plugins, Part 4

Sunday, October 16th, 2005

So far, we know two things: there is a registered cvar called pl_enabled, and if it’s set to 1 during client_authorized, a “display” timer is set for 10 seconds.

Luckily, the next function is the “display” function:

 0x178       PROC                        ; display
 0x188       PUSH.S                  0xC
 0x190       PUSH.C                  0x4
 0x198       CALL                func_00

We’ve come across our first local procedure call – a private function whose name was not stored, only identifiable by “func_00” and its address. We know from the final PUSH.C that this function is only receiving one parameter. That one parameter is the first variable on the stack – the id passed to the display() function. Remember that if set_task does not have an array set, it will simply pass its task id to the timed function.

Let’s jump down to func_00 and see what it does:

 0x1E4       PROC                        ; func_00
 0x1F4       STACK                 -0x80
 0x1FC       ZERO.pri  
 0x200       ADDR.alt              -0x80
 0x208       FILL                   0x80
 0x210       STACK                 -0x30
 0x218       ZERO.pri  
 0x21C       ADDR.alt              -0xB0
 0x224       FILL                   0x30

We already know this function takes at least one parameter (since that’s how many were pushed to it). So, what’s up with these weird instructions?

The STACK instruction simply moves the stack pointer – negative (downwards) effectively allocates, and positive (upwards) deallocates. Each STACK instruction is meant to either reserve or unreserve local memory. -0x80 is allocating 0x80 bytes on the stack. That’s 128 in decimal, or divided by the cellsize (4), 32 cells. The next three instructions are important. ZERO.pri is obvious; the PRI register is zeroed. ADDR.alt is our second FRM relative instruction. It sets the ALT register to an address on the local stack frame. In this case, the stack pointer has been lowered by 0x80 bytes, so it makes sense that the starting address of the reserved block is FRM-0x80. The FILL instruction then fills the memory at ALT with the number in PRI, for N bytes. So, in summary, a variable of 32 cells has been created on the stack, then overwritten with the number in PRI, which was zero. The concept of stack storage is made difficult to understand because it’s top-down, so I made a quick diagram:

The next four instructions are exactly the same, essentially. This time, only 12 cells are reserved (0x30, or 48 bytes). Since the stack pointer is now at FRM-0x80, it must be moved down to (FRM-0x80)-0x30, or FRM-0xB0. Thus, ADDR.alt sets the FILL address to -0xB0, then FILL zeroes the 12 cells. We can now write a prologue to this function:

func_00(id)
{
   new var1[32]
   new var2[12]
}
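
The two STACK/FILL pairs behind that prologue can be replayed with a toy Python model of one stack frame. The Frame class is purely illustrative and ignores everything except the stack pointer offset and the cells that FILL touches.

```python
CELL = 4  # bytes per cell

class Frame:
    """Toy model of one AMX stack frame; offsets are bytes relative to FRM."""
    def __init__(self):
        self.stk = 0       # stack pointer as an offset from FRM (0 = frame base)
        self.cells = {}    # offset -> value for every filled cell

    def stack(self, delta):
        """STACK: a negative delta reserves space, a positive one releases it."""
        self.stk += delta
        return self.stk

    def fill(self, alt, nbytes, pri=0):
        """FILL: store PRI into nbytes worth of cells starting at offset alt."""
        for off in range(alt, alt + nbytes, CELL):
            self.cells[off] = pri

frame = Frame()
var1_at = frame.stack(-0x80)    # new var1[32] lands at FRM-0x80
frame.fill(var1_at, 0x80)
var2_at = frame.stack(-0x30)    # new var2[12] lands at FRM-0xB0
frame.fill(var2_at, 0x30)
```

Because the stack grows downward, the second variable ends up at the lower offset (-0xB0), and the 44 zeroed cells cover exactly the range FRM-0xB0 up to FRM-0x04.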

Note – we didn’t need an ={0,...} after each variable. The compiler zeroes out variables automatically. For local variables, this is a large performance hit, but it makes scripting much easier! Onto the rest of the function:

 0x238       PUSH.C                 0x1F
 0x240       PUSHADDR              -0x80
 0x248       PUSH.S                  0xC
 0x250       PUSH.C                  0xC
 0x258       SYSREQ.C      get_user_name
 0x260       STACK                  0x10
 0x274       PUSH.C                  0x0
 0x27C       CONST.pri             0x110
 0x284       HEAP                    0x4
 0x28C       MOVS                    0x4
 0x294       PUSH.alt  
 0x298       PUSH.S                  0xC
 0x2A0       PUSH.C                  0xC
 0x2A8       SYSREQ.C      get_user_team
//get_user_name(index, name[], len)
//get_user_team ( index, team[]="", len=0)

The first native is pretty straightforward. PUSH.C is passing 0x1F or 31, the number of characters the string can hold. PUSHADDR is another FRM relative instruction, and it’s pushing the address for FRM-0x80, which conveniently is var1[32]. PUSH.S, like we’ve seen before, is pushing a passed parameter; again this is the first one given to the function (at this point we’re assuming it’s the player id). The second native clearly uses the familiar trick for temporary storage. Now we have:

func_00(id)
{
   new var1[32]
   new var2[12]

   get_user_name(id, var1, 31)
   get_user_team(id)
}

Directly after, we get a surprise:

 0x2C0       EQ.C.pri                0x1
 0x2C8       JZER              jump_0001

EQ.C.pri checks if the value in PRI is equal to the opcode parameter (in this case, 0x1). If true, PRI is set to 1, otherwise, PRI is set to 0. The branch occurs if PRI is zero. The result from SYSREQ.C (get_user_team) was stored in PRI, not in a temporary variable, so the return value of the function must be used directly in the branch. In summary, if the return value is not equal to 1, the branch will be taken. Therefore:

func_00(id)
{
   new var1[32]
   new var2[12]

   get_user_name(id, var1, 31)
   if (get_user_team(id) == 1)
   {
      //conditional code
   }
   //branched code (jump_0001)
}

Reading on for the conditional code, which is everything up to jump_0001:

 0x2DC       PUSH.C                0x114
 0x2E4       PUSH.C                  0xB
 0x2EC       PUSHADDR              -0xB0
 0x2F4       PUSH.C                  0xC
 0x2FC       SYSREQ.C               copy
 0x304       STACK                  0x10
 0x30C       JUMP              jump_0002

This is a simple call to copy() – we’re passing in DAT address 0x114, which AMXReader tells us is the string “T”. 0xB is 11 characters, and FRM-0xB0 is the address of var2. Then, we’re jumping to jump_0002. Wait, jumping? Wouldn’t we simply move on to the branched code? In fact, that is exactly what’s happening. In an IF…ENDIF block one conditional is checked, and at most one branch is taken. In an IF/ELSE/ENDIF block, each conditional block of code must branch past the remaining conditional checks in order to move on. So, at this point, we can guess the function is this:

func_00(id)
{
   new var1[32]
   new var2[12]

   get_user_name(id, var1, 31)
   if (get_user_team(id) == 1)
   {
      copy(var2, 11, "T")
   } else[?] { //branched code (jump_0001)
      //[un?]conditional code
   }
   //branched code (jump_0002)
}

Skipping past that branch, we arrive at the location for jump_0001:

 0x354       SYSREQ.C      get_user_team
 0x36C       EQ.C.pri                0x2
 0x374       JZER              jump_0002

Here we see the same exact code as above for get_user_team, and another conditional branch. If the result is not 0x2, the branch will be taken. This is another conditional block, meaning an ‘else if’ rather than an ‘else’ statement. Furthermore, since the result of the function was recalculated again, we know the comparison occurred inside the statement. The code continuing on until jump_0002 is trivial, so I’ve simply included it.

func_00(id)
{
   new var1[32]
   new var2[12]

   get_user_name(id, var1, 31)
   if (get_user_team(id) == 1)
   {
      copy(var2, 11, "T")
      //branch to jump_0002 to skip next block
   } else if (get_user_team(id) == 2) { //branched code (jump_0001)
      copy(var2, 11, "CT")
   }
   //branched code (jump_0002)
}

Finally, all branches have been resolved, and we arrive at jump_0002:

 0x3C4       PUSHADDR              -0x80
 0x3CC       PUSHADDR              -0xB0
 0x3D4       PUSH.C                0x128
 0x3DC       PUSH.S                  0xC
 0x3E4       PUSH.C                 0x10
 0x3EC       SYSREQ.C         client_cmd

The only difficult part of this is remembering that PUSHADDR is local to FRM, and thus -0x80 is var1 and -0xB0 is var2. The string itself you can find in AMXReader’s DAT viewer. Note the function cleans up its stack by resetting the stack pointer – it adds (unreserving) 0xB0, which is the number of bytes it allocated. Lastly, note that the function uses RETN, which cleans up its own caller’s stack usage for parameters.

We’ll actually return to the display function now, which we see contains a trivially decodable client_print function. It did not use STACK after CALL because the local call cleaned up automatically. That is one of the reasons the number of parameters is pushed before each procedure call: RETN can pop that value, and use it to correct the stack before returning.
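
A simplified Python model shows why pushing the byte count lets RETN tidy up on its own. It deliberately leaves out the saved frame pointer and represents the return address as a string marker, so treat it as a sketch of the bookkeeping rather than of the real AMX stack contents.

```python
CELL = 4  # bytes per cell

def call_local(stack, args):
    """Push args right to left plus the byte count, as the caller does before CALL."""
    for value in reversed(args):
        stack.append(value)
    stack.append(len(args) * CELL)     # PUSH.C <bytes of parameters>
    stack.append("return address")     # CALL records where to resume
    return stack

def retn(stack):
    """RETN: pop the return address, then the byte count, then that many bytes of args."""
    stack.pop()                        # return address
    nbytes = stack.pop()               # value pushed by the caller's PUSH.C
    for _ in range(nbytes // CELL):
        stack.pop()
    return stack
```

After call_local(stack, [id]) followed by retn(stack), the stack is back to its state before the call, which is why display() needs no STACK instruction after CALL.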

Finally, we have the entire source code to this plugin:

public plugin_init()
{
   register_plugin("Hello", "1.0", "BAILOPAN")
   register_cvar("pl_enabled", "1")
}

public client_authorized(id)
{
   if (!get_cvar_num("pl_enabled"))
      return
   set_task(10.0, "display", id)
   return
}

public display(id)
{
   func_00(id)
   //Note - we saw PUSH.C 3, the constant 3 is 'print_chat'
   client_print(id, print_chat, "[HELLO] Hello.")
}

func_00(id)
{
   new var1[32]
   new var2[12]

   get_user_name(id, var1, 31)
   if (get_user_team(id) == 1)
   {
      copy(var2, 11, "T")
   } else if (get_user_team(id) == 2) { //branched code (jump_0001)
      copy(var2, 11, "CT")
   }
   client_cmd(id, "[%s] %s", var2, var1)
}

Finally, we’ve disassembled a rather simple and trivial plugin. While it seems like a long process, eventually it becomes very easy to see patterns compilers use. Like anything, decompiling takes practice. For more practice, I’ll be disassembling a small portion of a real closed source plugin in the next article.

Bonus question: Why does the stack grow downwards? Isn’t it easier and less confusing for it to grow upwards?

Decompiling AMX Plugins, Part 3

Saturday, October 15th, 2005

Last time, we saw how to decompile a simple native back to source code. Let’s finish up the first procedure we were working on, then move on.

The next mini-block of disassembly in the plugin_init function:

 0x54        PUSH.C                  0x0
 0x5C        PUSH.C                  0x0
 0x64        PUSH.C                 0x78
 0x6C        PUSH.C                 0x4C
 0x74        PUSH.C                 0x10
 0x7C        SYSREQ.C      register_cvar

Looking at the native prototype:

register_cvar(const name[],const string[],flags = 0,Float:fvalue = 0.0);

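The two optional parameters are always pushed – they’re not excluded even though they’re optional. Since it pushed two zeroes, we can guess that the author used the defaults. Checking the DAT section for the remaining offsets, 0x4C is “pl_enabled” and 0x78 is “1”. The final push, 0x10, is the parameter width in bytes (four parameters of four bytes each).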

So, plugin_init looks like:

public plugin_init()
{
    register_plugin("Hello", "1.0", "BAILOPAN")
    register_cvar("pl_enabled", "1")
}

The rest is the epilogue:

 0x8C        ZERO.pri  
 0x90        RETN

Since the PRI register handles return values, this means that the function is returning 0.

Now, let’s take the next public function:

 0x94        PROC                        ; client_authorized
 0xA4        PUSH.C                 0x80
 0xAC        PUSH.C                  0x4
 0xB4        SYSREQ.C       get_cvar_num

In our string table, 0x80 is “pl_enabled”. But wait – this string already occurred earlier! It’s clear that Small 2.7.3 does not optimize memory usage, and it simply repeats strings in memory rather than combining them. The result of get_cvar_num, like all procedure calls, will be stored in PRI. Next:

 0xC4        NOT       
 0xC8        JZER              jump_0000
 0xDC        ZERO.pri 

The compiler is inverting the boolean value of PRI (PRI = !PRI), then testing if it’s zero (Jump on ZERo). If the value is zero, it will branch off into another direction. Otherwise, it will continue. NOT+JZER exists as one instruction, JNZ — more evidence that the compiler only does trivial optimization. So the code should be read as ‘Jump if the return value of get_cvar_num() is not zero’. The branch then looks like this:

public client_authorized(id)
{
   if (get_cvar_num("pl_enabled") == 0)
   {
       //Conditional code
   }
   //branch location
}

From the disassembler’s lower right pane, we can see that jump_0000 is location 0xE4. Up to that, we have:

 0xDC        ZERO.pri  
 0xE0        RETN      

Therefore, this procedure probably says:

public client_authorized(id)
{
   if (!get_cvar_num("pl_enabled"))
      return PLUGIN_CONTINUE
   //branched code
}

Now, let’s finish this function off by looking at the branched code.

 0xF0        PUSH.C                  0x0
 0xF8        CONST.pri              0xD0
 0x100       HEAP                    0x4
 0x108       MOVS                    0x4
 0x110       PUSH.alt  
 0x114       PUSH.C                  0x0
 0x11C       CONST.pri              0xCC
 0x124       HEAP                    0x4
 0x12C       MOVS                    0x4
 0x134       PUSH.alt  
 0x138       PUSH.S                  0xC
 0x140       PUSH.C                 0xAC
 0x148       PUSH.C           0x41200000
 0x150       PUSH.C                 0x1C
 0x158       SYSREQ.C           set_task
//set_task reference:
set_task(Float:time,const function[],id = 0,parameter[]="",len = 0,flags[]="", repeat = 0)

The first push is 0. That takes care of repeat.

Next, we have 0xD0 being stored in PRI. The disassembler seems confused at this: it has a mysterious ‘Array[2]’ entry in DAT, which starts at 0xCC. Therefore, 0xD0 is Array[1]. The array is ‘empty’. Next, the HEAP 4 instruction is allocating four bytes on the heap and storing the address in ALT. MOVS 4 is copying those four bytes from the address in PRI (the DAT address) to the address in ALT (the heap address). The address held in ALT is then pushed, which can only be the flags parameter. What just happened?

The array that the disassembler couldn’t figure out was the compiler storing two strings of zero length (empty strings). The default parameter to flags is empty, but we can’t pass in a DAT offset to something non-const, as it could be modified. So the compiler is allocating temporary memory in the heap, copying the string, and passing the address. At the end of the function, the heap will be restored. This trick is used another time, and we can eliminate parameter and len, until this appears:

0x138       PUSH.S                  0xC

This is the first FRM offset instruction we’ve seen. The FRM is relative to each procedure, so each procedure has a baseline for where its local stack starts and the parent’s stack ends. CALL+PROC add two entries onto the stack, and the final PUSH.C which declares the parameter width is a third entry. Therefore, parameters always start at the 12th byte (0xC) for local calls. This instruction is pushing the first parameter, which for client_authorized is ‘id’. To try to make this clearer, I made a diagram:

Next is simple – look up the offset 0xAC for string “display”. The next number is Float:time – but it’s disassembled as 0x41200000! Floats are stored in the IEEE 754 single-precision format, so the disassembler shows the raw 32-bit pattern. To decode it, I’ve used a simple C program:

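#include <stdio.h>

typedef union
{
        unsigned int num;
        float f;
} floatswp;

int main()
{
        floatswp swp;
        //read the raw 32-bit pattern as hex, then reinterpret it as a float
        if (scanf("%x", &swp.num) != 1)
                return 1;
        printf("%f\n", swp.f);
        return 0;
}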

The output is:

~ $ cc float.c -ofloat
~ $ ./float
0x41200000
10.000000
10.000000

So, we finally have all the parameters, and most are defaults. Our function probably looks like this:

public client_authorized(id)
{
   if (!get_cvar_num("pl_enabled"))
      return
   set_task(10.0, "display", id)
   return
}

As you can probably see by now, decompiling is a very time-consuming task, especially step by step, and this article was quite long. Tomorrow we’ll finish up this plugin. After that, we’ll take a quick look at an actual closed source plugin currently in use, and then finish the series off with a 64-bit discussion.