OMGOMGOGM TEH PM R TALKIN ABOUT HIS SOURCEHOOK AGAIN STFU OMGOM BANANA BAN1111111
Yeah, it’s me again. I’ve decided to optimize SourceHook a bit. What I wanted to optimize is:
1) Execution time
2) Compilation time
3) Executable size
—
1) Execution time
One can see that SourceHook’s original function calling mechanism is a bit weird:
Change vtable entry to original function
Execute the function
Change vtable entry to hooked function
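The three steps above can be pictured with a hand-rolled function-pointer slot standing in for a real vtable entry. This is only a sketch: Fn, Original, Hooked and CallOriginal are hypothetical names, not SourceHook code, and a real vtable is of course not a plain variable.

```cpp
#include <cassert>

// Hypothetical stand-in for one slot of a compiler-generated vtable.
using Fn = int (*)(int);

int Original(int x) { return x + 1; }   // the function that was hooked
int Hooked(int) { return -1; }          // the hook handler sitting in the slot

// Calls the original function by temporarily patching the slot.
int CallOriginal(Fn *vtableSlot, Fn original, int arg) {
    assert(vtableSlot != nullptr);
    Fn hooked = *vtableSlot;            // remember the hook handler
    *vtableSlot = original;             // change the entry to the original function
    int result = (*vtableSlot)(arg);    // execute the function through the slot
    *vtableSlot = hooked;               // change the entry back to the hooked function
    return result;
}
```

Every call pays for two writes to the slot on top of the call itself, which is the overhead the call gates are meant to remove.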
This swap-and-restore approach has the advantage that it can execute the original function with the correct this pointer and without the use of inline assembly. But we can optimize it. If SH_SAFE is not defined, SourceHook now generates another call class for each function: its vtable points to call gates that alter the this pointer and call the original function.
mov ecx, [ecx+4]
mov eax, (original function address)
jmp eax
(msvc)
As you can see, the original this pointer is stored as a member of the call class (at [ecx+4], where ecx = this). The original function address is assigned at runtime.
Because this generates code at runtime, it will not work with PaX. It is also not compatible with gcc or amd64 at the moment, so you can disable it by defining SH_SAFE.
Another problem is the list lookup in the hook handlers:
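To make the call gate a bit more concrete, here is a rough sketch of how its ten bytes could be put together for a given function address. BuildCallGate is a hypothetical helper, not the actual SourceHook code; it only fills a buffer and assumes a 32-bit x86 target. Making that buffer executable and wiring it into a vtable is left out.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Emits the x86 call gate shown above for one original function address.
// Assumption: 32-bit target, so the address fits into a 4-byte immediate.
std::vector<unsigned char> BuildCallGate(std::uint32_t originalFunction) {
    std::vector<unsigned char> code;

    // mov ecx, [ecx+4]  -> replace this with the pointer stored in the call class
    code.push_back(0x8B);
    code.push_back(0x49);
    code.push_back(0x04);

    // mov eax, imm32    -> the immediate is written in little-endian order
    code.push_back(0xB8);
    for (int i = 0; i < 4; ++i)
        code.push_back(static_cast<unsigned char>((originalFunction >> (8 * i)) & 0xFF));

    // jmp eax           -> tail-jump into the original function
    code.push_back(0xFF);
    code.push_back(0xE0);

    assert(code.size() == 10);
    return code;
}
```

The bytes have to live in memory that is both writable and executable, which is exactly what PaX forbids.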
# define SH_FIND_ENTRY() \
    SH_ACI(SH_HookedIfaceList)::iterator fetmpiter = \
        std::find(SH_ACI(g_SH_HookedIfaceList_).begin(), SH_ACI(g_SH_HookedIfaceList_).end(), this); \
    if (fetmpiter == SH_ACI(g_SH_HookedIfaceList_).end()) \
        SH_CF_NoEntryFound_FatalErr(); \
    SH_ACI(SH_HookedIface_) &entry = *fetmpiter;
Imagine this situation:
There are many plugins, and 30 instances of IGaben are hooked. The EatYams function is called on the one that was registered last. The hook handler has a pointer to the instance at this moment (the “this-pointer”). In order to get the original vtable, the pre and post function lists, etc., it needs to find the corresponding entry in g_SH_HookedIfaceList_IGaben. Because our instance is 30th in the list, every time a hooked function is called on it, std::find has to step through 30 entries before it finds the right one.
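The cost is easy to see in a small stand-alone sketch. HookedIface and ComparisonsToFind below are hypothetical stand-ins for the real entry type and lookup; the only thing they do is count how many comparisons std::find makes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical, stripped-down entry: only the hooked instance pointer.
struct HookedIface {
    const void *iface;
    static std::size_t comparisons;

    // std::find compares each entry against the this pointer it is given.
    bool operator==(const void *other) const {
        ++comparisons;
        return iface == other;
    }
};
std::size_t HookedIface::comparisons = 0;

// Returns how many entries std::find had to look at to locate iface.
std::size_t ComparisonsToFind(const std::list<HookedIface> &entries, const void *iface) {
    HookedIface::comparisons = 0;
    auto it = std::find(entries.begin(), entries.end(), iface);
    assert(it != entries.end() && "instance was never hooked");
    (void)it;  // keeps NDEBUG builds free of unused-variable warnings
    return HookedIface::comparisons;
}
```

With 30 registered instances, looking up the first one costs one comparison and looking up the last one costs 30, on every single hooked call.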
But what else could we do? We can hide the entry pointer in the instance somehow. We are modifying the vtable all the time anyway, so we can use it for this too. Here is what I did:
When an instance is added, its first vtable entry is replaced with code generated at runtime. This code only jumps to the original vtable entry. But we can hide something behind the code! The code looks like this:
mov eax, (original vtable entry)
jmp eax
(ENTRY POINTER HERE!)
When the instance is removed from the hooked instances, the vtable entry is reset to its original value.
This way, we can replace the SH_FIND_ENTRY macro with this:
inline void *SH_CF_GetPointer(const void *iface)
{
    const unsigned char *jmpcode = *(*(const unsigned char***)iface);
    return *(void**)(jmpcode + SH_CF_POINTEROFFSET);
}

# define SH_FIND_ENTRY() \
    SH_ACI(SH_HookedIface_) &entry = *(SH_ACI(SH_HookedIface_)*)SH_CF_GetPointer(this);
This again generates x86 code at runtime, so the old method is used when SH_SAFE is defined.
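Here is a small sketch of the whole trick with a byte buffer in place of real executable memory. Everything in it is hypothetical: BuildStub, FakeObject and GetPointer are made-up names, and kPointerOffset = 7 is my assumption for the offset (1 opcode byte + 4 address bytes + 2 bytes for jmp eax); the post does not give the actual value of SH_CF_POINTEROFFSET.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed offset of the hidden pointer: "mov eax, imm32" (5 bytes) + "jmp eax" (2 bytes).
const std::size_t kPointerOffset = 7;

// Builds the jump stub and appends the raw bytes of the entry pointer behind it.
std::vector<unsigned char> BuildStub(std::uint32_t originalEntry, void *entry) {
    std::vector<unsigned char> code(kPointerOffset + sizeof(void *));
    code[0] = 0xB8;                                   // mov eax, imm32
    for (int i = 0; i < 4; ++i)                       // little-endian immediate
        code[1 + i] = static_cast<unsigned char>((originalEntry >> (8 * i)) & 0xFF);
    code[5] = 0xFF;                                   // jmp eax
    code[6] = 0xE0;
    std::memcpy(&code[kPointerOffset], &entry, sizeof(void *));  // hidden data, never executed
    return code;
}

// Fake object whose first member is a vtable pointer, like a real polymorphic instance.
struct FakeObject {
    const unsigned char *const *vtable;
};

// Follows instance -> vtable -> first entry and reads the pointer stored behind the code.
void *GetPointer(const FakeObject *obj) {
    assert(obj != nullptr && obj->vtable != nullptr);
    const unsigned char *jmpcode = obj->vtable[0];
    void *entry;
    std::memcpy(&entry, jmpcode + kPointerOffset, sizeof entry);
    return entry;
}
```

The sketch uses memcpy instead of a direct cast because the pointer sits at an odd offset; x86 tolerates the unaligned read, but memcpy keeps the example portable. The lookup is now two dereferences and one read, no matter how many instances are hooked.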
—
Compilation time and executable size
I was trying to optimize these too, but I don’t have any impressive results so far.