PE File Format Foundations - Gr00t's Security Research Lab

# Portable Executables: Foundations

Ready to dive into the fascinating world of PE files? Whether you're just starting out in malware analysis or want to understand how Windows executables work, you're in the right place. We'll keep things simple as we build our foundation, but don't worry - we'll branch out into more complex territory in future posts.

## What's a PE File Anyway?

Think of a PE (Portable Executable) as Windows' favourite way to package programs. It's a standardised container that holds everything Windows needs to run your program - the code, resources, and references to other bits and pieces it might need. The "portable" part means the format isn't tied to one CPU architecture: the same layout is used for x86, x64, and ARM builds of Windows, which is pretty clever when you think about it.

You'll run into three main types of PE files:

- **EXE**: Your everyday programs that you double-click to run
- **DLL**: Shared code libraries that multiple programs can use
- **SYS**: System files (usually drivers) that chat directly with the Windows kernel. These operate at ring 0. (If you've been following along, you might recognise this term from [[Windows API Basics]], where we talked about the various rings.)

## From Human to Machine

Here's something cool - when developers write programs, they usually use languages like C or C++ that humans can understand. But your computer? It needs everything broken down into much simpler instructions. Let's look at a dead simple example:

![[Pasted image 20250104121652.png | 450]]

Look at that assembly code on the right. Doesn't look very friendly, does it? But it's doing exactly the same thing as our nice, readable C code on the left. Let's break down what's happening:

- `push`: Puts the address of our "Hello, World!" string on the stack, where the print function expects to find its argument
- `call`: Tells the computer to run our print function
- `add`: Does some housekeeping with the stack, cleaning up the argument we pushed
- `xor`: Gets ready to say "all done!" by zeroing a register (sets our return value)
- `ret`: Wraps everything up and hands control back

Don't stress if some of these terms feel like they're from another planet right now - we were all beginners once! We'll dive deeper into assembly language in future posts.

## Headers: The Roadmap to Your Program

Every PE file starts with a set of headers that tell Windows what it's dealing with. Think of them as the table of contents and instruction manual rolled into one:

### The DOS Header: A Blast from the Past

Remember MS-DOS? Even though we're well past those days, every PE file still starts with the two bytes "MZ" (the initials of Mark Zbikowski, the MS-DOS developer who designed the original executable format). It's like a little tip of the hat to computing history! The most important bit here is a field called `e_lfanew`, stored at offset 0x3C, which holds the file offset of the real PE header.

### Section Headers: Where's What

These headers tell Windows what goes where when loading your program. Common sections include:

|Section|What's In It|Example Stuff|
|---|---|---|
|.text|Your program's actual code|CPU instructions|
|.data|Writable data your program needs|Initialised global and static variables|
|.rdata|Read-only information|Import/export info, constants, strings|
|.rsrc|Resources|Icons, menus, dialog boxes|

## Want to Learn More?

We've just scratched the surface of PE files here. If you're keen to dive deeper, check out [[Windows API Basics]] where we talk about how programs actually interact with Windows. In our next post, we'll explore more advanced topics like Rich Headers, overlay data, and some sneaky tricks malware authors use to hide their code.

Remember, understanding PE files is like learning to read a new language - it takes time and practice. Keep at it, and before you know it, you'll be reading binaries like a pro! 🎯
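
### Bonus: Following the Roadmap in Code

If you'd like to see the header walk from the sections above in action, here's a minimal Python sketch. It checks for "MZ", reads `e_lfanew` at offset 0x3C, confirms the `PE\0\0` signature, and then lists the section headers. The function names `build_toy_pe` and `parse_pe_headers` are made up for this post, and the toy file is hand-built purely for illustration (Windows would refuse to load it), so treat this as a rough sketch rather than a proper parser.

```python
import struct


def build_toy_pe() -> bytes:
    """Hypothetical helper: hand-build just enough bytes to look like a PE.

    Not loadable by Windows (no optional header, no section contents).
    """
    e_lfanew = 0x80  # arbitrary offset chosen for this toy file
    dos = bytearray(e_lfanew)
    dos[0:2] = b"MZ"                              # DOS signature
    struct.pack_into("<I", dos, 0x3C, e_lfanew)   # pointer to the PE header

    # COFF file header: machine 0x14C (x86), 2 sections, optional header size 0.
    coff = struct.pack("<HHIIIHH", 0x14C, 2, 0, 0, 0, 0, 0x0102)

    def section(name, vsize, vaddr, raw_size, raw_ptr, flags):
        # One 40-byte section header; the name is padded to 8 bytes with NULs.
        return struct.pack("<8sIIIIIIHHI", name, vsize, vaddr,
                           raw_size, raw_ptr, 0, 0, 0, 0, flags)

    text = section(b".text", 0x1000, 0x1000, 0x200, 0x200, 0x60000020)
    data = section(b".data", 0x1000, 0x2000, 0x200, 0x400, 0xC0000040)
    return bytes(dos) + b"PE\x00\x00" + coff + text + data


def parse_pe_headers(data: bytes) -> dict:
    """Walk from the DOS header to the section table."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")

    # e_lfanew is a 4-byte little-endian value at offset 0x3C.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    # The 20-byte COFF file header follows the 4-byte signature.
    coff = e_lfanew + 4
    machine, num_sections, _, _, _, opt_size, _ = struct.unpack_from(
        "<HHIIIHH", data, coff)

    # The section table starts right after the optional header.
    table = coff + 20 + opt_size
    sections = []
    for i in range(num_sections):
        raw_name, vsize, vaddr, raw_size, raw_ptr = struct.unpack_from(
            "<8sIIII", data, table + i * 40)
        sections.append({
            "name": raw_name.rstrip(b"\x00").decode("ascii", "replace"),
            "virtual_size": vsize,
            "virtual_address": vaddr,
            "raw_size": raw_size,
            "raw_pointer": raw_ptr,
        })
    return {"e_lfanew": e_lfanew, "machine": machine, "sections": sections}


if __name__ == "__main__":
    info = parse_pe_headers(build_toy_pe())
    print(f"PE header at offset {info['e_lfanew']:#x}")
    for s in info["sections"]:
        print(f"{s['name']:<8} rva={s['virtual_address']:#x} raw={s['raw_pointer']:#x}")
```

To try it on a real binary, pass the file's bytes instead of the toy data, for example `parse_pe_headers(open(path, "rb").read())`, where `path` points at an EXE or DLL you're happy to poke at.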