X86 Compiler Optimization: Parameter Reuse


This article discusses a compiler optimization that causes parameter values on the stack to differ from the values the caller actually passed. Since it is common practice during debugging to read parameter values directly from X86 call stacks, this behavior can be misleading. Although the assembly language snippets used in this article are from Windows XP running on X86, the content discussed here applies to all versions of Windows.

Invalid Parameters on Stack

On the X86 CPU, barring a few exceptions such as functions using the fastcall calling convention, parameter values read from the call stack are assumed to be accurate. However, the example shown below seems to imply otherwise. The following kernel-mode stack shows a waiting thread on Windows XP:

ChildEBP RetAddr  Args to Child              
fc0b6c08 804dc6a6 ff8e3e18 ff8e3da8 804dc6f2 nt!KiSwapContext+0x2e
fc0b6c14 804dc6f2 00000103 ff95c16c 00000000 nt!KiSwapThread+0x46
fc0b6c3c 80616c2b 00000000 00000000 00000000 nt!KeWaitForSingleObject+0x1c2
fc0b6c68 805e6f85 0095c16c ff8e39a8 ff8e3a3c nt!IopCancelAlertedRequest+0x68
fc0b6c84 8057a510 ff953680 00000103 ff95c110 nt!IopSynchronousServiceTail+0xe1
fc0b6d38 804df06b 000007b0 00000770 00000000 nt!NtWriteFile+0x602
fc0b6d38 7c90eb94 000007b0 00000770 00000000 nt!KiFastCallEntry+0xf8
007cff30 00000000 00000000 00000000 00000000 ntdll!KiFastSystemCallRet

In the above stack, the first parameter to KeWaitForSingleObject(), i.e. the first value in the "Args to Child" column of the nt!KeWaitForSingleObject+0x1c2 frame, is NULL. KeWaitForSingleObject() is a kernel function that is documented on MSDN, and its prototype is as follows:

NTSTATUS 
KeWaitForSingleObject(
  __in      PVOID Object,
  __in      KWAIT_REASON WaitReason,
  __in      KPROCESSOR_MODE WaitMode,
  __in      BOOLEAN Alertable,
  __in_opt  PLARGE_INTEGER Timeout );

As seen from the prototype, the first parameter, i.e. the pointer to the object to wait on, is probably the most important parameter to KeWaitForSingleObject(). And yet the parameter appearing as 0x00000000 does not cause the function to crash. The rest of the article investigates why this mandatory parameter to KeWaitForSingleObject() appears as NULL on the call stack.

Parameters passed in by the caller

The call stack shown above indicates that the caller of KeWaitForSingleObject() is the function IopCancelAlertedRequest(). Examining the assembler code of IopCancelAlertedRequest() shows that the value of the first parameter passed to KeWaitForSingleObject() is read from the ESI register by the push esi instruction at 80616c25, the last push before the call.

nt!IopCancelAlertedRequest+0x53
80616c21 53               push    ebx
80616c22 53               push    ebx
80616c23 53               push    ebx
80616c24 53               push    ebx
80616c25 56               push    esi
80616c26 e8255becff       call    nt!KeWaitForSingleObject (804dc750)
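
The push sequence above can be replayed on a toy stack to see why the last value pushed ends up as the first parameter. This is only an illustrative sketch: ToyStack and its helper functions are hypothetical names invented here, and the register values are taken from the dumps in this article.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of a 32-bit stack that grows toward lower indices. */
typedef struct {
    uint32_t slots[16];
    size_t   top;              /* index of the most recently pushed dword */
} ToyStack;

static void toy_push(ToyStack *s, uint32_t value)
{
    assert(s->top > 0);
    s->slots[--s->top] = value;
}

/* Read the dword at [esp + 4*n]; n = 0 is the top of the stack. */
static uint32_t toy_peek(const ToyStack *s, size_t n)
{
    assert(s->top + n < 16);
    return s->slots[s->top + n];
}

/* Replays the five pushes and the call at IopCancelAlertedRequest+0x53 and
   returns what the callee finds at [esp+4] on entry (its first parameter). */
uint32_t first_param_seen_by_callee(uint32_t ebx, uint32_t esi,
                                    uint32_t return_address)
{
    ToyStack s;
    s.top = 16;                     /* empty stack */

    toy_push(&s, ebx);              /* parameter #5 = Timeout    */
    toy_push(&s, ebx);              /* parameter #4 = Alertable  */
    toy_push(&s, ebx);              /* parameter #3 = WaitMode   */
    toy_push(&s, ebx);              /* parameter #2 = WaitReason */
    toy_push(&s, esi);              /* parameter #1 = Object     */
    toy_push(&s, return_address);   /* pushed implicitly by call */

    return toy_peek(&s, 1);
}
```

With EBX = 0, ESI = 0xff95c16c and the return address 0x80616c2b, the function returns the ESI value, i.e. the pointer that the caller supplied as Object.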

The assembler code for the initial part of KeWaitForSingleObject(), i.e. the function prolog, shows the non-volatile registers (ebp, ebx, esi, edi) being saved on the stack. The compiler saves and later restores these registers because their values must be preserved across function calls. Of the four, the one of interest is the ESI register, saved by the push esi instruction at 804dc759:

nt!KeWaitForSingleObject:
804dc750 8bff             mov     edi,edi
804dc752 55               push    ebp
804dc753 8bec             mov     ebp,esp
804dc755 83ec14           sub     esp,0x14
804dc758 53               push    ebx
804dc759 56               push    esi
804dc75a 57               push    edi
804dc75b 64a124010000     mov     eax,fs:[00000124]
804dc761 8b5518           mov     edx,[ebp+0x18]
804dc764 8b5d08           mov     ebx,[ebp+0x8]
.
.
.

Examining the contents of the stack frame for KeWaitForSingleObject() makes it possible to deduce the values of the parameters, return address, frame pointer, local variables and saved non-volatile registers.

fc0b6c1c  00000103 ; Saved EDI
fc0b6c20  ff95c16c ; Saved ESI
fc0b6c24  00000000 ; Saved EBX
fc0b6c28  00000000 ; Start of Local Variable Area
fc0b6c2c  00000246
fc0b6c30  804e6b73
fc0b6c34  fc0b6c4c 
fc0b6c38  00000000 ; End of Local Variable Area
fc0b6c3c  fc0b6c68 ; Saved EBP (frame pointer)
fc0b6c40  80616c2b nt!IopCancelAlertedRequest+0x68 ; Return address into the caller of KeWaitForSingleObject()
fc0b6c44  00000000 ; Parameter #1 = Object
fc0b6c48  00000000 ; Parameter #2 = WaitReason
fc0b6c4c  00000000 ; Parameter #3 = WaitMode
fc0b6c50  00000000 ; Parameter #4 = Alertable
fc0b6c54  00000000 ; Parameter #5 = Timeout

The saved value of the ESI register is at stack address 0xfc0b6c20, and the content of that location is 0xff95c16c. This is the same value that would have been passed as the first parameter to KeWaitForSingleObject() by the caller, IopCancelAlertedRequest(), as explained earlier.
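
The slot addresses in the frame dump follow directly from the prolog: parameters sit above the saved EBP, and the registers pushed after sub esp,0x14 sit below the local variable area. The following minimal sketch reproduces that arithmetic; the helper names are hypothetical, and it assumes a standard EBP-based frame with 4-byte slots, as in the prolog shown earlier.

```c
#include <assert.h>
#include <stdint.h>

/* Address of the n-th stack parameter (n starts at 1): [ebp + 8 + 4*(n-1)]. */
uint32_t param_slot(uint32_t ebp, unsigned n)
{
    assert(n >= 1);
    return ebp + 8 + 4 * (n - 1);
}

/* The return address sits just above the saved EBP: [ebp + 4]. */
uint32_t return_address_slot(uint32_t ebp)
{
    return ebp + 4;
}

/* Address of the k-th register pushed after "sub esp, locals_size"
   (k starts at 1). For this prolog: 1 = EBX, 2 = ESI, 3 = EDI. */
uint32_t saved_reg_slot(uint32_t ebp, uint32_t locals_size, unsigned k)
{
    assert(k >= 1);
    return ebp - locals_size - 4 * k;
}
```

For the frame in this article, EBP is 0xfc0b6c3c and the local area is 0x14 bytes, so param_slot(ebp, 1) gives 0xfc0b6c44 and saved_reg_slot(ebp, 0x14, 2) gives 0xfc0b6c20, matching the Object and saved ESI entries in the dump.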

Furthermore, the details of the waiting thread reveal that it is indeed waiting on a valid notification event whose address, 0xff95c16c, matches the saved ESI value observed above. This confirms that when KeWaitForSingleObject() was called, the first parameter was a valid pointer, i.e. 0xff95c16c.

THREAD ff8e3da8  Cid 0514.06cc  Teb: 7ffde000 Win32Thread: 00000000 WAIT: (Executive) KernelMode Non-Alertable
    ff95c16c  NotificationEvent

Parameter space reuse by compiler

Examining the assembler code of the function body of KeWaitForSingleObject() shows an instruction at address 0x804fbe3c that writes the value 0x0 to the stack location at ebp+0x08. Based on the layout of the X86 call stack, this is also the location containing the first parameter passed to KeWaitForSingleObject().

nt!KeWaitForSingleObject+0x97:
804fbe3c 83650800         and     dword ptr [ebp+0x8],0x0
804fbe40 e9b609feff       jmp     nt!KeWaitForSingleObject+0x9b (804dc7fb)
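
The effect of that instruction on what a debugger later reads can be modeled in a few lines. This is a simplified sketch, not a reconstruction of the kernel code: FrameView is a hypothetical structure that tracks only the two frame slots relevant here, and it assumes, as shown earlier, that ESI held the Object pointer when the prolog saved it.

```c
#include <stdint.h>

/* Hypothetical view of two slots in the KeWaitForSingleObject frame. */
typedef struct {
    uint32_t saved_esi;   /* slot written by "push esi" in the prolog */
    uint32_t param1;      /* slot at [ebp+0x8], written by the caller */
} FrameView;

/* State of the two slots right after the prolog has run. */
FrameView frame_at_entry(uint32_t object)
{
    FrameView f;
    f.saved_esi = object; /* the caller pushed ESI as parameter #1 */
    f.param1    = object;
    return f;
}

/* Models "and dword ptr [ebp+0x8],0x0": ANDing with zero clears the slot. */
void reuse_param_slot(FrameView *f)
{
    f->param1 &= 0x0u;
}
```

After reuse_param_slot() runs, param1 reads as zero while saved_esi still holds the original pointer, which is why the saved register slot remains a usable source for the real argument in this particular frame.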

The above assembly sequence is generated by the compiler to perform a particular type of optimization that reduces the amount of stack space used by a function. When the compiler notices a code pattern similar to the one shown in the following 'C' code, it concludes that the uses of a particular parameter and a particular local variable are mutually exclusive within the function.

Function ( PVOID Parameter1 )
{
    PVOID Local1;
    
    if ( . . . ) 
    {
        . . . 
        // Access Parameter1;
        // No access to Local1;
        . . .
    } else 
    {
        . . .
        // Access to Local1;
        // No access to Parameter1;
        . . .
    }
}

Since the compiler observes that Parameter1 and Local1 are not used at the same time, it repurposes the memory location containing the parameter "Parameter1" to store the local variable "Local1". Using parameter memory space on the stack to store local variables is common practice on the X86 CPU, since there are not many registers the compiler can use during program execution to hold temporary/local variables. On CPU architectures like X64, where more registers are available (R8 through R15 in addition to the existing ones), the need for this optimization diminishes, as the extra registers can be used to cache temporary/local variables, which is much more efficient than storing them on the stack.
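
The same idea can be written out by hand at the source level, which also shows why the reuse is safe. The sketch below is an analogy under stated assumptions, not the code of any real kernel function: wait_on() and caller_copy_after_call() are hypothetical names, and whether a given compiler actually shares the stack slot depends on the compiler and its optimization settings.

```c
#include <stdint.h>

/* Hypothetical callee that mirrors the pattern by hand: one branch uses the
   parameter, the other treats the parameter's storage as a scratch local. */
uint32_t wait_on(uint32_t object, int take_other_path)
{
    if (!take_other_path) {
        return object;      /* branch that reads the parameter */
    }
    object = 0;             /* branch that reuses the slot as a local... */
    object += 1;            /* ...and works with it as a counter */
    return object;
}

/* C passes arguments by value, so the caller's own variable is untouched
   no matter what the callee does with its copy. */
uint32_t caller_copy_after_call(uint32_t event_address)
{
    (void)wait_on(event_address, 1);
    return event_address;
}
```

Because the callee only ever modifies its own copy of the argument, the caller still sees 0xff95c16c in its variable after the call, even though the callee's parameter slot no longer contains that value.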

To sum up the observations above, the first parameter was passed in correctly at the time of the call to KeWaitForSingleObject(), but KeWaitForSingleObject() subsequently overwrote it by zeroing the stack location where the parameter was stored. Once the function body had finished using the parameter value, it reused the stack space occupied by the parameter to store a local variable. This is safe because, under the calling convention, the caller of KeWaitForSingleObject() does not depend on the parameter values upon return.