Archive for the ‘Cool Idea’ Category

Nightly Builds

Monday, February 12th, 2007

SourceMod has a nightly build system that will be used for both AMX Mod X and Metamod:Source nightly builds in the coming weeks. This build system took a decent bit of effort to set up, so I felt I’d explain it for people running into similar problems.

First, there’s the problem of hardware. To conserve hardware and processing power, all our builds are done on one machine. We don’t use cross compiling, as compilers usually don’t jump platforms well (or, in C++’s case, you lose ABI compatibility). We use VMware Server to host each platform.

The host operating system in our case is Linux; the guest is Windows XP. The “order” is irrelevant, but we ran into a few roadblocks that took a while to resolve. Next time I would probably run Windows as the host operating system instead — Linux doesn’t require a GUI to do builds, and VMware seems to work better on Windows.

Second, there’s two problems of automation. You need a good cross-platform way to build and package, and you need a good cross-platform way to automate the actual nightly build process itself. For our package building tools, we use C# (.NET 1.1) for both AMX Mod X and SourceMod. For nightly build scripts, we use Perl.

The package build tool has functions for copying files and folders, compressing folders, building libraries, and compiling plugins. For Linux, it runs ‘make‘ and ‘make clean‘ on each target library. For Windows, it runs ‘devenv.com /useenv /Rebuild Release project.vcproj‘. All of these commands work from a command line; no IDE involved.

The build tool also looks for any file called ‘svn_version.h‘. If found, it is edited to contain the current build number of that package. All subversion information is detected using svnversion. Once it’s done, the output folder is compressed into the final .tar.gz or .zip file.

The nightly build tool is written in Perl. Its job is to run the main build tool every night at 12AM CST. It does the following steps:

  • Checks if there have been any changes since the previous night (if none, the script dies).
  • Updates the local copy of the source tree. This is done using SSH Keys. On Windows, we use svn‘s --config-dir option and set the ssh config file option to point to TortoisePlink.exe, which has to be in the PATH environment variable.
  • Builds the C# packaging tool using devenv.com (Windows) or mcs (Linux, Mono c sharp compiler)
  • For Windows, the INCLUDE, PATH, and LIB environment variables are preset to all of SourceMod’s local folders. This is so we don’t have to use its built-in path features, which are inherited from the IDE, rather clunky, and don’t play well in a multi-project environment. These also have to contain Visual Studio’s own include, binary, and library folders respectively.
  • The build tool is executed. On Linux, this requires mono for running .NET applications.
  • If the build tool fails, an e-mail is sent to all developers with a compilation log and revision #.
  • Each package is transferred to a final public location via FTP.

Finally, these scripts are slapped into crontab for Linux and Scheduled Tasks for Windows. Both the host and guest operating systems stay powered on 24/7.

As mentioned before, we will be using this system soon to have nightly builds for both AMX Mod X and Metamod:Source. However, since both of those projects are open source, it will be solely for user convenience.

Text Parsing in SourceMod – Out-performing Valve

Monday, January 15th, 2007

For SourceMod, we decided to use Valve’s KeyValues file format for simplicity and familiarity. However, we didn’t use the provided KeyValues class. Why? To answer this question, let’s first take a look at the two major ways of writing text parsing API. There are “event based” parsers, and “tree based” parsers. Specifically, these are terms most commonly used with XML — but the KeyValues format has similar properties, such as nested sections, and so these parser patterns apply here.

A tree-based parser creates a big tree data structure out of the input text stream. The programmer then receives a pointer to the root node, and he can use this to recursively enumerate the tree (or just search for specific entries, depending on the contents). KeyValues is a tree based parser.

An event-based parser has no data structures. It reads the input character stream and fires user-provided callbacks for each major “event” encountered. It is then up to the programmer how to use these events. For example, events will be fired for:

  • Finding a new section starting or ending.
  • Find a new line.
  • Finding a key and value pair.

There are serious disadvantages to tree-based parsers. The immediate red flag you should see is that the tree structure must be allocated and built. If you don’t have a good allocator, you will wind up with a ton of small, unnecessary allocations for each node. On top of that, you waste memory. While both of these problems can be solved with a decent allocator, you also run into the problem of: what exactly does the programmer do with the tree structure? There are usually a few cases:

  1. You iterate over the tree and do a one-time operation/computation on all its contents. Then you delete the tree.
  2. You cache the tree and peek at its contents at a later time.
  3. You use the contents immediately by looking for a specific node, then delete the tree.
  4. You use the contents immediately by storing the data into a secondary structure, then delete the tree.

An event based parser is usually more ideal in all of these situations. Observe how for each method:

  1. All you’ve done is built a giant temporary structure that you discard. You could simply perform the operations as you receive the events, and remove the entire middle step of creating the structure. Iterating over the tree is fast, but creating the tree is unnecessary
  2. An event based parser isn’t always better here. However, if you are using a generic tree structure, peeking at it for a single entry may be very slow.
  3. This is even a bigger waste than #1. With an event based parser, you can stop parsing once you’ve hit the node you want. On top of that, you’ve removed the unnecessary tree structure AND eliminated a one-time recursive lookup in the tree.
  4. This is the worst of all. If you are going to use your own data structure, why bother building the tree in the first place? A better idea is to build the structure as you receive events.

So, why didn’t we use Valve’s KeyValues class? It’s a recursive tree parser. Worse yet, it’s generic. Everything and its mother in the SDK uses KeyValues, so it cannot possibly be optimized for each usage. A good example of this is in the Game Events system, which has name based lookup on event properties (a bad idea, as indexing them is not only more logical, but much, much faster).

At SourceMod, we wanted to show that not only were event based parsers better, but they allow you to quickly code alternative, faster data structures. At our arsenal, we had:

  • A highly optimized lookup class called a “double array trie.”
  • An optimized event-based parser for KeyValue-formatted files (source code provided below).
  • A fast lump allocator.

We took modevents.res from the cstrike folder of Counter-Strike: Source and wrote up a quick data structure using our pre-written, custom code library. Then we performed three benchmarks:

  1. Time to parse the file and load it into the respective data structure.
  2. Time to lookup two events in the structure – one in the middle, one at the beginning.
  3. Time to lookup a property in the event.

Our parser API: ITextParsers.h, CTextParsers.h, CTextParsers.cpp
Benchmark: benchmark.cpp

What happened?

[SOURCEMOD] Valve Read benchmark: 746622 clock ticks
[SOURCEMOD] SourceMod Read benchmark: 333864 clock ticks
[SOURCEMOD] Valve Lookup benchmark 1: 7740 clock ticks
[SOURCEMOD] SourceMod Lookup benchmark 1: 3339 clock ticks
[SOURCEMOD] Valve Lookup benchmark 2: 5391 clock ticks
[SOURCEMOD] SourceMod Lookup benchmark 2: 1827 clock ticks

In all three benchmarks, our code was more than twice as fast as Valve’s. Lesson: Always tune your data structures. One size does not fit all. Event-based parsers give you this full flexibility. As a bonus, you can implement a tree parser using an event parser (by building the tree during the events). Likewise, you can build an event parser using a tree parser (by firing an event for each iterated node), but that doesn’t make much sense.

Hooking Non-Virtual Functions, Part 4

Thursday, August 11th, 2005

As a wrap-up to this article, I’ll show a very hacky variable-arguments version that, while untested, should work. The trouble with varargs is that we simply don’t know how many bytes were passed, and format routines simply rely on the number of ‘%’ characters to get parameters. This means the safest way, rather than to format things ourselves, is to hack the stack frame.

vararg_gate_open:
  ;push the old return value
  push dword [esp]
.stack_save
  ;call our 'stack save' function
  db 0e8h, 000h, 000h, 000h, 000h
  ;Note that the stack will now be at the first parameter
  sub esp, 8
;This is an E8 call, we'll replace the last four bytes with the address later
.call
  db 0e8h, 000h, 000h, 000h, 000h
;Now we have called the function with the original stack and returned
;Unfortunately, we've destroyed the return eip.  Restore it.
.stack_restore
  db 0e8h, 000h, 000h, 000h, 000h
  mov [esp], eax
  ret

What happened here? We’re taking the stack and realigning it, directly using it to call the handler without pushing any arguments. This is because we can’t copy, as we don’t know the byte size. However, calling it overwrites the old return eip on the stack, so we’ve saved it with two unimplemented functions. We can’t save the values on our own stack because, quite simply, we don’t have one — using the caller’s stack has destroyed our chance at having one, because the callee could corrupt our own. We could have used something callee-saved like edi, but we can’t save it ourselves. You can implement these two functions with a simple heap or data stack implementation.

Lastly, in case you were wondering how to write in an E8 call dynamically, you must use relative eips. Here is an example:

make_gate:
  ;..code..
  push myfunc.end-myfunc.start
  call malloc
  mov edi, eax
  mov esi, myfunc
  mov ecx, myfunc.end-myfunc.start
  ;Copy the 'stock' gate into our buffer
  rep movsb
  ;Move the pointer to the offset of the 4 bytes after E8
  add eax, myfunc.call - myfunc.start + 1
  mov edi, eax
  ;edi will be what eip is DURING the call
  add edi, 4
  ;the function we will be calling will now be in esi
  ;[ebp+8] is just an example of where you might be storing it
  mov esi, [ebp+8] 
  ;make the address relative to the eip
  sub esi, edi
  ;store the relative address into the call
  mov [eax], esi

myfunc:
.start
  ;...code...
  .call
  db 0e8h, 000h, 000h, 000h, 000h
  ;...code...
.end

Hooking Non-Virtual Functions, Part 2

Tuesday, August 9th, 2005

The second method (this one PM gave me the idea for) for hooking non-virtual functions is with the jmp operator, rather than a call. This has a few immediate bonuses: the stack frame is left in tact, and we have more control over program flow. In other terms, we gain re-entrancy, and we don’t have to keep repatching code. In this article we’ll work on a generic hook, rather than a hardcoded one.

For beginners, let’s say we disassemble any function. In the HL2SDK for Win32, most non-virtual member fuctions will look like this (compiled with MSVC):

push ebx
push esi
push edi
...

Generally, the first four-five bytes of the function are simple pushes, and unless the function is less than five bytes, we won’t have a problem caching and overwriting the old code. So, let’s begin.

First, let’s make an imaginary data structure to hold all the information we’ll need for this. We’ll need to save the first six bytes of the function (that is, the amount of bytes a far, mem32 jmp takes). We’ll also need the eip to return to (function addr + 6), the function address, the calling convention, the stack size for the parameter list, and a few more little things we’ll get to. Since MSVC + HL2SDK compiles thiscalls with the __stdcall calling convention, I’ll demonstrate both GCC and MSVC options. Note you’d need a third to deal with varargs (maybe this will be a 3rd article).

enum CallConvention
{
   Call_cdecl=0,
   Call_stdcall=1,
};
//You must call mprotect on the address beforehand
//Returns a block of memory identifying the hook
//Pass a static address that contains the address of the function
void *hook_function(void *function, void **handler, int stacskize, int calltype);

Now we get to the assembly. As always, we are going to have at least two functions. The basic function gate that is never called, and the gate assembler that copies the basic gate into memory, then alters it.

cdecl_gate_open:
  push esi
  push edi
.params1
  mov ecx, 0 ;We will replace this with the number of bytes in the parameter stack
  lea esi, [esp+12] ;Get address of last stack frame
  lea edi, [esp-ecx] ;Point edi to bottom of the next call stack
  rep movsb
  pop edi
  pop esi
.params2
  sub esp, 0 ;Modify the stack frame to be N+8, again this will be copied
;The parameter stack has now been copied.  We can call our handler
;This is an E8 call, we'll replace the last four bytes with the address later
.call
  db 0e8h, 000h, 000h, 000h, 000h 
  add esp, 0; Another copy, for restoring the stack frame to N+8
  ret

That was easy enough. Now all hook_function has to do is copy correct values into the function. How to call the original, however? We need another, quick function assembled in memory.

cdecl_gate_close:
   ;Copy exactly the first six bytes of the original function into here
   times 6 db 0
   ;Construct a jump call.  Fill it with an address containing the eip (function addr + 6).
   db 0FFh, 025h, 000h, 000h, 000h, 000h

And how to keep track of all this information? Simple, we allocate a nice looking struct:

struct callgate
{
   void *function;
   void *orig_eip;
   void *gate_open;
   void *gate_close;
   int call_style;
   int num_params;
};

What’s going on here? We’re intercepting the function by telling it to jump somewhere else. That other location pushes the stack parameters, calls the handler, then returns. The second function simply executes the original 6 bytes, then jumps back to the original position.

If you get a little more creative, you can include handler lists (i.e., plugins) and a system for overriding return values. Tomorrow I’ll wrap this up with stdcall versions. To anyone actually reading this, you can make the actual gate creation function as an exercise. And, lastly, see if you can find how this will utterly fail! This will also be corrected in the next article.

HINT to the first question – All you need to do is allocate memory for those two code blocks, then overwrite the first six bytes of the target function with 0xFF 0×25 <4 bytes>, where the 4bytes hold the address of the callgate::gate_open pointer.

Hooking Non-Virtual Functions

Sunday, August 7th, 2005

While writing CS:S DM, I had to hook a non-virtual function in the HL2 engine. Unfortunately, SourceMM isn’t so magical that it can do this. I decided to try a first attempt on my own; cautiously, I’d never even done manual vtable hooking before.

The concept for virtual functions is easy and straight forward. Find the offset in the virtual function table and overwrite it with a “call gate”. The call gate calls a different function which decides whether to call the original. This is very easy since only one entry point needs to be modified.

However, non-virtual functions are a pain. They’re static blocks of code, and the only viable way to scan for their references is by walking the code section and calculating possible eip values. So, you have to edit the code.

The solution I came up with was quite hardcoded to both the calling convention and the actual function in question. But basically, it goes like this:

  1. Identify the address of the function you want to hook.
  2. Assembly a “call gate” in memory. This call gate should look something like:
    push ebp
    mov ebp, esp
    mov eax, [ebp+8]
    mov ecx, &lt;address>
    call [ecx]
    pop ebp
    retn

    Since you assemble this code in memory, you can move the address of your call handler into the code at runtime, by calculating the offset of “mov, ecx [ASDF]“.
  3. Assembly a “stub gate” in memory, and save it. This would look something like:
    push esp ;push the stack pointer so we can get parameters
    call callgate
    ret
  4. Calculate the length of the stubgate. Save this many bytes of the original function’s code.
  5. Copy the stubgate over the original code. Don’t forget a call to page-aligned mprotect() or VirtualQuery()!
  6. Create another function called “force_original”. This function should unpatch the original function by copying the saved bytes, call it, then restore the stub gate. Obviously, this is not re-entrant at all and a serious flaw.
  7. Return, return, return.
  8. What has happened? The original function was edited to call a gateway function. This gateway function takes the original stack pointer and gives it to your handler. Your handler decides whether to force the original call (which requires two repatchings), and then returns to the caller.
  9. Using this you can make a hacky, hardcoded, and awkward callgate for a non-virtual, static function. I have an example for hooking a four parameter, void function on linux here: dropgate.asm

    I’ll make a “Part 2″ post later: how to do this with a much nicer reentrant jmp. Using this method, combined with a system of specifying stack width and calling convention, SourceMM might have non-virtual function hooking in the future. Mmmm!

New MetaEng, Part 2

Sunday, May 29th, 2005

With core2, many object types have been shifted around, phased out, or renamed. One of the most important types that has arisen in new form is “IUnloadable”. Since SourceMod allows you to remove plugins at any time, nightmares would ensue if things like cvars and console commands were not properly removed on plugin unload.

An IUnloadable defines an object that, at any point in time, could be unloaded from memory. Back in core1, an Unloadable was the opposite – something that couldn’t be unloaded. Slight mistake.

So, now what can be unloaded? The core types essential to SourceMod – IModule and IScript – are both IUnloadables. An Unloadable must also have a parent Unloadable – that is, if it is attached to a related piece of unloadable memory, it must specify this through its GetOwner() property.

In this manner, an IScript is an IUnloadable whose owner is the IUnloadable of the IModule who loaded it. Furthermore, any ICallable from that IScript will be an IUnloadable owned by the script. When the base module is uploaded to CVS, classes like ITimer and IMenu will also be IUnloadables.

So, how does this solve anything? PluginManager and ModuleManager will define two new interfaces: IPluginListener and IModuleListener. A manager class, such as CMenuManager or CTimerManager, can implement these interfaces and register themselves with both the Plugin and Module managers. The manager will then be notified of every plugin and module that is loaded and unloaded.

When the unload event is called (OnScriptUnload or OnModuleUnload). the manager can check all of its internal structures. Since an IMenu is an IUnloadable, it should have a valid parent, which can be compared to the pointer of the module or script. If they match, the manager must remove the callbacks from its internal lists, which will prevent it from accessing deleted memory.

With this design, SourceMod now has complete plugin/module/metaeng control. Everything will be dynamically loadable and unloadable at runtime, and unloading the root object will cause a chain reaction of unloading everything.

Metamod: Source

Friday, May 6th, 2005

What we have been preparing for some time now is finally finished.

Valve wanted the server plugins interface to be a replacement for the old metamod. But as you might know, it is a very poor replacement – you can only supercede three functions. Plugin developers wanted more power, so they started hacking the vtables. But everyone did it his own way, the plugins did not like each other, and were often not portable. We wanted to provide a unified interface, so we’ve decided to create SourceMM (Metamod: Source).

SourceMM is a gamedll wrapper, just like metamod was. You specify SourceMM as your gamedll, and it then loads the original gamedll. SourceMM contains a plugin manager, a log subsystem, and a hook manager.

SourceMM plugins have much more power than original server plugins. They can easily hook any virtual function from any class instance by name (of course, if you have the class definition ;) ). A easy to use metamod-like interface (with META_RETURN, etc.) is provided for the hook handlers.

We hope that as many plugin developers as possible will convert their plugins to SourceMM in the future. SourceMod will also be a SourceMM plugin.

So visit sourcemm.net and download the Release Candidate 1 now!