SourcePawn, Part 1 – Features Added

Whoa, an article! Yes, it’s been a while. After our huge forums overhaul and a number of other server moves, this section of the site was mostly forgotten during the chaos and aftermath. However, the JIT series is not dead and will return with some delicious goodies.

During the past two months, faluco and I have worked very diligently on a finely tuned, heavily optimized JIT for Pawn (which we’re tentatively renaming SourcePawn for SourceMod). As part of the next installments in the JIT series, I will be open-sourcing our library code for implementing a quick and dirty JIT.

During our writing of the new SourcePawn JIT, we also changed a few things in the language. Features were added and oddities were removed. I’d like to take a moment to explain those changes in this article. This first part will cover the three major additions we’ve made. They are: Dynamic Arrays, Function Pointers, and Fast Declarations.

Dynamic Local Arrays
This one is a biggie that has been requested over and over again for AMX Mod X. So, we took the liberty of adding “Dynamic Local Arrays.” Normally in Pawn, arrays have statically defined dimensions: you must use a constant value. Thus, creating an array based on a string length, or on the number of connected players, is impossible. Dynamic Local Arrays let you declare local arrays with a variable dimension size for any or all dimensions. An example of the syntax:

new string[] = "Gaben"   //statically dimensioned array, 6 cells
new len = StringLength(string);
new better_string[len+1]  //dynamically dimensioned array
new weird_var[len][len] //dynamically dimensioned 2D array

string[0] = weird_var[0][0] = better_string[0] = 0

There is a caveat: creating a dynamic array is considerably more expensive than creating a static one. It is always on the heap, rather than the stack, and thus it requires a special tracking mechanism in the JIT. Worse, a multi-dimensional array has to have “indirection tables” generated. For a large multi-dimensional array, I won’t lie: this process should be considered extremely expensive (the compiler’s generation of these tables is recursive, but luckily tail-recursive, and thus optimizable). So if you can make the dimensions constant, declare the array with constant sizes instead, as the performance will be better.
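To make the cost of those indirection tables a little more concrete, here is a minimal sketch in C (not Pawn, and not the actual SourcePawn implementation). It assumes a 32-bit cell and lays a two-dimensional array out as one block: a table with one entry per row, followed by the row data. The names cell, make_2d, and at_2d are hypothetical, and the table stores plain indices from the start of the block, which is a simplification chosen for readability.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef int32_t cell;   /* assumption: one cell is 32 bits wide */

/* Hypothetical helper: lay out a rows x cols array in one zeroed block.
 * Cells [0, rows) form the indirection table; entry r holds the index,
 * counted from the start of the block, where the data of row r begins. */
cell *make_2d(size_t rows, size_t cols)
{
    cell *block = calloc(rows + rows * cols, sizeof(cell));
    if (block == NULL)
        return NULL;
    for (size_t r = 0; r < rows; r++)
        block[r] = (cell)(rows + r * cols);
    return block;
}

/* Every element access pays one extra load to go through the table. */
cell *at_2d(cell *block, size_t r, size_t c)
{
    return &block[(size_t)block[r] + c];
}
```

Filling the table is a loop over every row, and each extra dimension needs its own level of tables, so the setup work grows with the size of the array before a single element has been stored.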

Astute readers will also note the emphasis I put on “local.” These arrays cannot be global, obviously, because global instantiation would require a constructor of some sort, otherwise dynamic values would not make sense.

Another fine point: these arrays are not references. Meaning, you cannot do:

new array[];
array = new magicref[strlen("hello")];

Though, this functionality will definitely be added in some form, someday.

Function Pointers and Function Type Enumeration
This was another biggie seen as a flaw in AMX Mod X. Pawn coders will remember the pain of tracking down an error like this:

native register_event(const event_name[], const hook_name[]);

public plugin_init()
{
   register_event("DeathMsg", "hookdeath")
}

public hook_death()
{
}

Oops! The hook name is misspelled: we registered “hookdeath,” but the function is called hook_death. Now, our code will never run, and since it was registered in init, it is likely we won’t notice the error message. A few minutes of hair pulling later, the programmer curses: “Oh, if only the compiler could have caught that!” Now, it can.

(For those wondering why it couldn’t before: Consider that the native is only asking for an array. This means you could pass in any sort of variable, constant, or string. The compiler has absolutely no way of magically deducing that you meant a function. There might be some way to hack this with a special tag, but it would be very difficult, and would only work for string literals. It also would not allow for any of the features that will be described below.)

The first major change is that you can now pass functions as values, instead of by a string containing their name. For example, one might have this:

native register_event(const event_name[], EventHook:event);

OnDamage()
{
}

public plugin_init()
{
   register_event("Damage", OnDamage);
}

This should strike you with two implications. The first is that the compiler can now typo check in case of a silly error, and even better, it can also type check, meaning it can tell you if your function is potentially declared the wrong way. The second implication is that functions no longer have to be public. One of the annoying things about AMX Mod X is that the keyword “public” came to mean “you must return something,” even if the return value was ignored! That was partially a side effect of improper tagging, but that’s a different rant.

Worse, everyone started making every function public just in case it needed to be public. This is bad. Public functions get exported to a table; a table which takes memory and adds to the relocation work of the JIT and VM. Though it’s not harmful, it’s bad practice for that reason. And worst of all is when people would combine all of these misunderstandings, and return random values in a stock-appropriate function which was declared as public.

But, I digress! We’ve only covered the “typo” benefits of this feature. The type-checking part is where the magic happens. In AMX Mod X, the “set_task()” call supports two types of function prototypes:

native set_task(Float:time, const function_name[], timer_type);
forward TimerThatHasData(const data[], timer_id);
forward TimerThatHasNoData(timer_id);

However, the compiler lets you define a public function in any way you want. This means you can declare an array where one shouldn’t exist, or a float that should be a boolean, et cetera. As soon as the host app calls into your bad function, it will crash, or at best, not work.

This was addressed in Pawn 3.1, which forced all public functions to have a pre-defined (“forwarded”) declaration. But this is a design flaw: it assumes that the user will only have one hook, since the named function has to be pre-defined. This is very bad for a lot of reasons – for example, timers and per-event hooks work most efficiently when assigned to unique functions. How can you get around that? Of course, you declare your own forward for your uniquely named function. But this defeats the purpose, since you can simply declare your forward wrong as well!

To get around this, we have introduced function types. Function types create a list of function prototypes that can be associated with a given tag. Revisiting our previous example:

funcenum Timer
{
   public(const data[], timer_id),
   public(timer_id)
}

native set_task(Float:time, Timer:timer_function, timer_method);

In this revised method, the compiler will check the function you are trying to pass in. If it doesn’t match one of the given rules, you will get a tag mismatch warning (which may end up being moved to an error). The rules are strict: Your parameter count, dimension sizes, tags, return tags, and references must all match. Although you can change names, any inconsistency that could cause a potential crash or run-time error is removed.

There is one extra case, though. A common practice is to remove parameters which you won’t be using. Thus, we introduced “optional” parameters which can be omitted from your declaration. This looks roughly like C++’s “pure virtual” syntax:

funcenum Timer
{
   public(const data[], timer_id=0)
}

Now, the timer_id parameter can be removed.
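For readers more at home in C, a loose analogy may help show what a function type buys you. The sketch below is C, not Pawn, and every name in it (timer_fn, fire_timer, on_tick, on_data) is hypothetical. It models the Timer type as a closed set of two prototypes; storing a function with any other signature in either slot draws a diagnostic from the C compiler, much as a funcenum mismatch draws one from ours. Unlike a funcenum, the C version has to be told which prototype it holds.

```c
#include <assert.h>
#include <stddef.h>

/* The two prototypes the Timer type admits, written as C pointer types. */
typedef int (*timer_with_data)(const int data[], int timer_id);
typedef int (*timer_no_data)(int timer_id);

enum timer_kind { TIMER_WITH_DATA, TIMER_NO_DATA };

/* A callback is exactly one of the allowed prototypes, plus a tag. */
struct timer_fn {
    enum timer_kind kind;
    union {
        timer_with_data with_data;
        timer_no_data   no_data;
    } fn;
};

/* Hypothetical host-side dispatch: call through whichever prototype
 * the plugin registered. Returns -1 for an unknown tag. */
int fire_timer(struct timer_fn t, const int data[], int timer_id)
{
    switch (t.kind) {
    case TIMER_WITH_DATA: return t.fn.with_data(data, timer_id);
    case TIMER_NO_DATA:   return t.fn.no_data(timer_id);
    }
    return -1;
}

/* Two example callbacks, one per prototype. */
int on_tick(int timer_id) { return timer_id + 1; }
int on_data(const int data[], int timer_id) { return data[0] + timer_id; }
```

The point of the analogy is only that the set of legal shapes is declared once, next to the native, instead of being re-declared (and possibly mis-declared) by every plugin author.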

“Fast” Garbage Declarations
Lastly, as far as major features go, we’ve introduced a new variable declaration keyword: “decl.” decl can be used in lieu of new, and it has one caveat: your variable won’t be automatically zeroed.

Automatically zeroing variables is considered very expensive. For example, code like this in AMX Mod X could easily increase CPU usage:

public server_frame()
{
   for (new i=1; i<=maxplayers; i++)
   {
      new some_array[256]
   }
}

Imagine: every frame, you are zeroing 256*4 bytes (1KB) of memory! Writing to memory is expensive, and wasteful if you're (a) simply going to overwrite it again and (b) not going to use all of it.

In AMX Mod X, the dirty hack was to use "static" instead, which makes the array pseudo-global. But this removes re-entrancy. So for SourcePawn, the new decl keyword will skip the zeroing step, allowing you to opt into fine-tuned, optimized, but unsafe storage.
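The trade-off is the same one C programmers make with uninitialized locals, so here is a small C sketch of it (C, not Pawn; the function names and the 256-byte scratch buffer are made up for illustration, and the buffer is counted in bytes rather than cells). The first function behaves like new and clears the whole buffer on every call. The second behaves like decl: it skips the clearing, so it has to write every byte it later reads, including the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SCRATCH_LEN 256

/* Analogue of `new`: all 256 bytes are cleared before use, so the
 * string terminator comes for free. */
size_t name_length_zeroed(const char *name)
{
    char scratch[SCRATCH_LEN] = {0};
    size_t n = strlen(name);
    if (n > SCRATCH_LEN - 1)
        n = SCRATCH_LEN - 1;
    memcpy(scratch, name, n);
    return strlen(scratch);
}

/* Analogue of `decl`: the buffer starts out as garbage, so the code
 * must store the terminator itself before reading the string back. */
size_t name_length_garbage(const char *name)
{
    char scratch[SCRATCH_LEN];
    size_t n = strlen(name);
    if (n > SCRATCH_LEN - 1)
        n = SCRATCH_LEN - 1;
    memcpy(scratch, name, n);
    scratch[n] = '\0';
    return strlen(scratch);
}
```

Both return the same answer; the second simply never touches the bytes it does not need. Forget the explicit terminator, though, and you are reading garbage, which is exactly the "unsafe" part.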

Bonus Question:
Why didn't we implement copying array references, or global array creation?
Hint: Dynamic arrays go on the heap, but in SourcePawn, the heap is just another stack. Since you can't copy their references, they are simply "popped" off the heap once their scope dies.
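One way to picture the hint is a bump allocator with scope marks. The following is a rough C sketch under stated assumptions, not the SourcePawn heap itself: the struct, the 1024-cell capacity, and the names heap_push, scope_enter, and scope_leave are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t cell;          /* assumption: one cell is 32 bits wide */

#define HEAP_CELLS 1024        /* arbitrary capacity for the sketch */

struct heap {
    cell   cells[HEAP_CELLS];
    size_t top;                /* index of the first free cell */
};

/* Push: reserve `count` cells, or return NULL when they do not fit. */
cell *heap_push(struct heap *h, size_t count)
{
    if (count > HEAP_CELLS - h->top)
        return NULL;
    cell *p = &h->cells[h->top];
    h->top += count;
    return p;
}

/* Entering a scope just remembers where the top currently is... */
size_t scope_enter(const struct heap *h)
{
    return h->top;
}

/* ...and leaving it pops everything allocated since, in one step. */
void scope_leave(struct heap *h, size_t mark)
{
    h->top = mark;
}
```

Freeing is a single assignment, with no per-array bookkeeping. The price is that nothing may outlive its scope: a copied reference, or a global, would still point at cells that the next scope is free to reuse.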
