
Learning Binary Exploitation — Part 1

Published: at 01:39 PM

Hello there! I was having a lot of difficulty properly understanding binary exploitation, so I thought it might help to explain it in a blog while trying to understand it myself. So, here’s the first one…

Understanding Memory Diagram

[Figure: Memory Diagram]

This is what the memory diagram of a 32-bit executable looks like. Let me walk you through the function of each one of these:

  1. Command line arguments and environment variables: This one is pretty obvious: it stores the arguments we pass when executing the binary, along with the environment variables.

  2. Stack: It works like a regular stack (last in, first out) and holds the local variables of each function call, along with bookkeeping such as return addresses, BUT this stack is in reverse order. In fact, the whole memory diagram is drawn so that the highest memory address is at the top and the lowest is at the bottom. So, the stack actually grows downwards, towards lower memory addresses.

  3. Heap: It stores the dynamically allocated memory returned by the *alloc functions (malloc, calloc, realloc). It grows upwards, i.e. towards higher memory addresses.

  4. Uninitialized Data (BSS): This section holds the global and static variables that were declared in the code without an initial value. They are set to 0 by default when the program is loaded.

  5. Initialized Data: Yes, you guessed it, this part stores the global and static variables that were given an initial value.

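  6. Read-only data + code: This part stores the compiled machine code of the binary (not the source code) along with read-only data. It is normally mapped without write permission, so it can’t be modified while the program runs.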
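
To see a layout like this for a real process, here is a minimal Python sketch that parses the text format of /proc/<pid>/maps on Linux and lists the regions from the highest address to the lowest, the same way the diagram is drawn. The function names (parse_maps_line, regions_high_to_low) are my own, made up for illustration. Also note that it inspects the Python interpreter itself, which is most likely a 64-bit process, so the addresses will not match a 32-bit diagram.

```python
import os


def parse_maps_line(line):
    """Split one /proc/<pid>/maps line into address range, permissions and name."""
    # Format: start-end perms offset dev inode [pathname]
    fields = line.split(maxsplit=5)
    start_text, end_text = fields[0].split("-")
    return {
        "start": int(start_text, 16),
        "end": int(end_text, 16),
        "perms": fields[1],
        # Anonymous mappings have no sixth field.
        "name": fields[5].strip() if len(fields) > 5 else "",
    }


def regions_high_to_low(maps_text):
    """Return the mapped regions ordered like the diagram: highest address first."""
    regions = [parse_maps_line(line)
               for line in maps_text.splitlines() if line.strip()]
    return sorted(regions, key=lambda region: region["start"], reverse=True)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    # Linux only: show the layout of the running Python interpreter itself.
    with open("/proc/self/maps") as maps_file:
        for region in regions_high_to_low(maps_file.read()):
            print(f'{region["start"]:#014x}-{region["end"]:#014x} '
                  f'{region["perms"]} {region["name"]}')
```

On a typical Linux system the [stack] entry should show up near the top of the output and the [heap] entry well below it, with the code of the binary (the r-xp mapping) lower still.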

Now, to exploit the binary we need to know a bit more than just the functions of each component… I think (-_-;)

So, I learnt a bit about the ELF file format.

ELF File Format

It stands for Executable and Linkable Format (formerly Extensible Linking Format), because an ELF file can be a linkable object file, an executable or a shared library. ELF files store the program and its data. An ELF file starts with a file header and usually also contains program headers and section headers.

  1. File header: This identifies the file as ELF and records whether it is a 32-bit or 64-bit binary, its byte order and its target architecture. It also contains the entry point, which is the address where program execution begins.

  2. Program headers: These describe how the program should be loaded into memory, i.e. which parts of the file are mapped at which addresses and with which permissions.

  3. Section headers: These contain metadata describing the various components of the program. They store a lot of semantic information that is not needed to load the program, so they are sometimes stripped out or ignored.

    Some important sections described by the section headers:

    • .text - the executable code of your program
    • .plt and .got - used to resolve and dispatch library calls
    • .data - used for pre-initialized global writable data
    • .rodata - used for global read-only data
    • .bss - used for uninitialized global writable data
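
The file header described above has a fixed layout, so it can be decoded with nothing more than Python's struct module. The following is a minimal sketch, not a full ELF parser: the function name parse_elf_header and the keys of the returned dictionary are my own choices, and only the fields mentioned in this post (32-bit vs 64-bit, entry point, and where the program and section headers live) are kept.

```python
import struct
import sys

ELF_MAGIC = b"\x7fELF"
FILE_TYPES = {1: "relocatable (linkable)", 2: "executable", 3: "shared object"}


def parse_elf_header(data):
    """Decode the ELF file header from the first bytes of a file."""
    if data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    elf_class, byte_order = data[4], data[5]
    if elf_class not in (1, 2) or byte_order not in (1, 2):
        raise ValueError("unknown ELF class or byte order")
    endian = "<" if byte_order == 1 else ">"
    # Addresses and file offsets are 4 bytes in ELF32 and 8 bytes in ELF64.
    word = "I" if elf_class == 1 else "Q"
    layout = f"{endian}HHI{word}{word}{word}IHHHHHH"
    # The 16-byte e_ident block comes first, so the other fields start at 16.
    (e_type, e_machine, _version, e_entry, e_phoff, e_shoff, _flags,
     _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = struct.unpack_from(layout, data, 16)
    return {
        "bits": 32 if elf_class == 1 else 64,
        "little_endian": byte_order == 1,
        "type": FILE_TYPES.get(e_type, f"other ({e_type})"),
        "machine": e_machine,
        "entry": e_entry,
        "program_header_offset": e_phoff,
        "program_headers": e_phnum,
        "section_header_offset": e_shoff,
        "section_headers": e_shnum,
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python elf_header.py /path/to/binary
    with open(sys.argv[1], "rb") as binary:
        for key, value in parse_elf_header(binary.read(64)).items():
            print(f"{key}: {value}")
```

Running it against a binary whose section headers were stripped should report 0 section headers while the program headers are still there, which matches the idea that only the program headers are needed for loading.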

Thank you for reading.

