In the previous post, we explored the observer effect in IT by writing a program that behaved differently under a debugger than standalone. In this post, we’ll extend selfmod1_amd64.S and selfmod1_i386.S so that they no longer crash, regardless of the environment they run in.
Analyzing the problem
Why did we witness a different behavior inside and outside of a debugging environment?
Code pages are read-only by default
The programs selfmod1_amd64.S and selfmod1_i386.S crashed, because they tried to modify counter, a memory location that was stored in the read-only .text section. The operating system effectively prevented this by killing the processes with a SIGBUS signal.
This is perfectly desirable behavior of Unix systems: code pages are usually shared across many processes (imagine many shells, httpd processes etc. running concurrently and executing the same binary), and if all those pages are mapped read-only in the processes’ memory address space, they won’t have to be physically duplicated, thus saving a lot of RAM.
Furthermore, write-protecting code pages by default is also a safeguard against buggy programs: self-modifying code is usually unintended, and it is an excellent way to shoot yourself in the foot, so it is disabled by default.
So is self-modifying code impossible?
While write-protecting code pages is considered best practice, some processes do need to modify code pages (our program, for one, needs to do just that). Debuggers are notorious for this: in order to implement breakpoints and single stepping, they need to inject special opcodes into the code pages of the debugged processes. The operating system obviously needs to provide a mechanism that lets debuggers (temporarily) lift the read-only protection of code pages.
In practice, debuggers on Unix systems would use the venerable ptrace(2) system call (it appeared as early as in Version 7 AT&T Unix!). While it is available on virtually all Unix systems, it is notoriously unportable, and has an ugly interface.
Some Unix systems that rely on Mach’s VM subsystem or a variant thereof provide an additional way to set the permissions of pages. FreeBSD is one such system, and it provides the mprotect(2) system call. With the following pseudo-code, we can effectively turn a read-only code page into a read-write page:
#include <sys/mman.h>
void *addr = get_address_of_instruction_pointer();
size_t length = 4096; /* size of a page; mprotect(2) takes a size_t */
if (mprotect(addr, length, PROT_READ | PROT_WRITE | PROT_EXEC) == 0) {
/* current code page is now writable */
}
Solving the challenge
So, to solve the challenge of the previous post, we simply need to translate the C call to mprotect(2) into the corresponding assembly code. Concretely, we’ll call mprotect(2) with appropriate arguments to set the PROT_WRITE bit on the .text page that contains the variable counter. As soon as that happens (i.e. if the FreeBSD kernel lets us change the protection at all), the counter should decrease even outside a debugging session, without causing a SIGBUS.
What do we need precisely? We have to
- detect the address of the current code page, and
- execute the mprotect(2) syscall.
The first point is not as obvious as one may think. We can’t simply stuff the address of counter into mprotect(2)’s addr argument. Why not? Because the assembler doesn’t know, at this point, the address where the program will be loaded! counter, being at the very beginning of the .text section, has the address 0 at assembly time. Obviously, at run time, that address will be something else, determined by the linker and by the kernel.
It all boils down to this: we need to detect at run time where counter is located. We use the following observation to achieve this feat:
Since counter is very close to the rest of the code (certainly no farther away than, say, 4K), we use the value of the instruction pointer %rip (or %eip) instead!
If we call mprotect(2) with the value of %rip or %eip, the address will not be the very beginning of the page where counter lives. But since this system call works at page granularity on FreeBSD (you can’t change the protection of only part of a page), unprotecting the page that %rip or %eip points into unprotects counter too, as long as both lie in the same page.
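The page arithmetic behind this observation can be sketched in a few lines of C. The function names page_start and same_page are invented for this illustration, and the sketch assumes a power-of-two page size such as the 4096 bytes used throughout this post.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to the start of the page that contains it.
 * page_size must be a power of two (4096 in this post). */
uintptr_t page_start(uintptr_t addr, uintptr_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return addr & ~(page_size - 1);
}

/* Nonzero if the addresses a and b lie in the same page. */
int same_page(uintptr_t a, uintptr_t b, uintptr_t page_size)
{
    return page_start(a, page_size) == page_start(b, page_size);
}
```

For instance, page_start(0x4000c1, 4096) yields 0x400000, so any byte between 0x400000 and 0x400fff shares its protection with the address 0x4000c1. Two addresses that are close together can still straddle a page boundary: same_page(0x400fff, 0x401000, 4096) is 0.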
There’s a little technical issue to resolve though:
Suppose we want to pass %rbx or %ebx as the addr argument of the mprotect macro. Unfortunately, we can execute neither movq %rip, %rbx nor movl %eip, %ebx: the instruction pointer is not a valid operand of mov in the i386 or amd64 instruction set! Fortunately, we can do it via the stack. Consider the following code on FreeBSD/amd64:
/* simulating movq %rip, %rbx */
call next_location
next_location:
popq %rbx
This strange-looking idiom transfers %rip (which contains the address following the call next_location instruction) into %rbx! How? Remember that the call instruction is normally meant to call a subroutine. As part of that instruction, the instruction pointer %rip is implicitly pushed onto the stack, so that a ret in the called subroutine returns to the address following the call instruction. Instead of executing ret, we can just as well popq the return address off the stack into a register of our choice (here: %rbx).
This idiom on FreeBSD/i386 looks exactly the same, except that we use different registers and pop a long, instead of a quad:
/* simulating movl %eip, %ebx */
call next_location
next_location:
popl %ebx
The other arguments to mprotect(2) are easy: the length can be constant, i.e. 4K (4096), and the protection bits will be PROT_READ, PROT_WRITE and PROT_EXEC or-ed together.
We’ll implement the invocation of the mprotect(2) syscall as a macro, just as we did for other syscalls.
Now, we’re ready to modify selfmod1_amd64.S and selfmod1_i386.S accordingly, and test our hypothesis.
The FreeBSD/amd64 version
The program for the self-modifying .text code section on FreeBSD/amd64 is thus:
// selfmod2_amd64.S -- self modifying code, FreeBSD/amd64 version
// Runs under gdb(1) AND standalone.
/* Some constants */
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4 /* WRITE syscall */
.equiv SYS_EXIT, 1 /* EXIT syscall */
.equiv SYS_MPROTECT, 74 /* MPROTECT syscall */
// These constants for mprotect(2) are from <sys/mman.h>
.equiv PROT_READ, 0x01 /* pages can be read */
.equiv PROT_WRITE, 0x02 /* pages can be written */
.equiv PROT_EXEC, 0x04 /* pages can be executed */
.equiv PROT_ALL, PROT_READ | PROT_WRITE | PROT_EXEC
.equiv PAGE_SIZE, 4096
/*
* Syscalls implemented as macros for max performance.
*/
.macro write fd, buf, len
movq \fd, %rdi
movq \buf, %rsi
movq \len, %rdx
movq $SYS_WRITE, %rax
syscall
.endm
.macro exit retcode
movq \retcode, %rdi
movq $SYS_EXIT, %rax
syscall
.endm
.macro mprotect addr, len, prot
movq \addr, %rdi
movq \len, %rsi
movq \prot, %rdx
movq $SYS_MPROTECT, %rax
syscall
.endm
/* The .data section contains our output buffer */
.section .data
buffer:
.byte 'X', '\n'
/* Code section */
.text
// The counter is in the .text code section!
counter:
.byte 0
.align 8
.global _start
_start:
/* set up a stack frame */
pushq %rbp
movq %rsp, %rbp
/* simulating movq %rip, %rbx */
call next_location
next_location:
popq %rbx
/* make current code/text page writable */
mprotect %rbx, $PAGE_SIZE, $PROT_ALL
jc bye /* silently fail if we can't */
init_counter:
movb $6, (counter) /* initial value of counter, + 1 */
loop_counter:
movb (counter), %al
decb %al
je end_loop
movb %al, (counter)
call write_counter
jmp loop_counter
end_loop:
movb %al, (counter) /* one last time to print 0 */
call write_counter
bye:
exit $0
/* NOT REACHED */
popq %rbp
write_counter:
pushq %rbp
movq %rsp, %rbp
movb (counter), %al /* convert counter to ASCII */
addb $0x30, %al
movb %al, (buffer)
write $1, $buffer, $2 /* write 2 bytes of buffer: value and \n */
popq %rbp
ret
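To make the control flow of the listing above easier to follow, here is a C model of the countdown loop. It is a sketch, not a translation: the function name run_countdown and the output buffer are invented for this example, and the write(2) syscall is replaced by appending two bytes to a caller-supplied buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Model of init_counter / loop_counter / end_loop. Appends one ASCII
 * digit and a newline per iteration to out and returns the number of
 * bytes produced. Stops early if out (capacity cap) would overflow. */
size_t run_countdown(unsigned char initial, char *out, size_t cap)
{
    assert(out != NULL);

    unsigned char counter = initial;       /* movb $6, (counter)        */
    size_t n = 0;

    for (;;) {
        unsigned char al = counter;        /* movb (counter), %al       */
        al--;                              /* decb %al                  */
        counter = al;                      /* movb %al, (counter)       */
        if (n + 2 > cap)
            break;                         /* guard added for the model */
        out[n++] = (char)('0' + counter);  /* addb $0x30, %al           */
        out[n++] = '\n';                   /* second byte of buffer     */
        if (al == 0)
            break;                         /* je end_loop               */
    }
    return n;
}
```

Calling run_countdown(6, buf, sizeof buf) with a 32-byte buffer produces the 12 bytes "5\n4\n3\n2\n1\n0\n", which is why the counter is initialised to one more than the first digit we want to print.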
The only additions to the previous version are the PROT_* and PAGE_SIZE constants, the mprotect macro, and our neat little trick to discover an address inside the page that holds the counter variable, followed by the call to the mprotect macro:
/* simulating movq %rip, %rbx */
call next_location
next_location:
popq %rbx
/* make current code/text page writable */
mprotect %rbx, $PAGE_SIZE, $PROT_ALL
jc bye /* silently fail if we can't */
Will the program run without crashing? Let’s test it!
% as --64 -o selfmod2_amd64.o selfmod2_amd64.S
% ld -o selfmod2_amd64 selfmod2_amd64.o
% ./selfmod2_amd64
5
4
3
2
1
0
Not bad at all. But we’re curious, so we’ll also ktrace(1) it to see whether the mprotect(2) syscall was executed and what it returned. After executing ktrace ./selfmod2_amd64, we can kdump(1) it (output truncated):
% kdump
3341 ktrace RET ktrace 0
3341 ktrace CALL execve(0x7fffffffe94f,0x7fffffffe690,0x7fffffffe6a0)
3341 ktrace NAMI "./selfmod2_amd64"
3341 selfmod2_amd64 RET execve 0
3341 selfmod2_amd64 CALL mprotect(0x4000c1,0x1000,PROT_READ|PROT_WRITE|PROT_EXEC)
3341 selfmod2_amd64 RET mprotect 0
3341 selfmod2_amd64 CALL write(0x1,0x500150,0x2)
3341 selfmod2_amd64 GIO fd 1 wrote 2 bytes
"5
"
3341 selfmod2_amd64 RET write 2
(...)
3341 selfmod2_amd64 CALL write(0x1,0x500150,0x2)
3341 selfmod2_amd64 GIO fd 1 wrote 2 bytes
"0
"
3341 selfmod2_amd64 RET write 2
3341 selfmod2_amd64 CALL exit(0)
As we can see, mprotect(2) returned 0, i.e. the call was successful.
The FreeBSD/i386 version
The 32-bit version is just as simple, and you should understand it immediately now:
// selfmod2_i386.S -- self modifying code, FreeBSD/i386 version
// Runs under gdb(1) AND standalone.
/* Some constants */
// The following syscall IDs are from /usr/src/sys/kern/syscalls.master
.equiv SYS_WRITE, 4 /* WRITE syscall */
.equiv SYS_EXIT, 1 /* EXIT syscall */
.equiv SYS_MPROTECT, 74 /* MPROTECT syscall */
// These constants for mprotect(2) are from <sys/mman.h>
.equiv PROT_READ, 0x01 /* pages can be read */
.equiv PROT_WRITE, 0x02 /* pages can be written */
.equiv PROT_EXEC, 0x04 /* pages can be executed */
.equiv PROT_ALL, PROT_READ | PROT_WRITE | PROT_EXEC
.equiv PAGE_SIZE, 4096
/*
* Syscalls implemented as macros for max performance.
*/
.macro write fd, buf, len
sub $0x10, %esp
movl \fd, (%esp)
movl \buf, 0x4(%esp)
movl \len, 0x8(%esp)
movl $SYS_WRITE, %eax
call do_syscall
add $0x10, %esp
.endm
.macro exit retcode
sub $0x10, %esp
movl \retcode, (%esp)
movl $SYS_EXIT, %eax
call do_syscall
add $0x10, %esp /* NOTREACHED */
.endm
.macro mprotect addr, len, prot
sub $0x10, %esp
movl \addr, (%esp)
movl \len, 0x4(%esp)
movl \prot, 0x8(%esp)
movl $SYS_MPROTECT, %eax
call do_syscall
add $0x10, %esp
.endm
/* The .data section contains our output buffer */
.section .data
buffer:
.byte 'X', '\n'
/* Code section */
.text
// The counter is in the .text code section!
counter:
.byte 0
.align 8
.global _start
_start:
/* set up a stack frame */
pushl %ebp
movl %esp, %ebp
/* simulating movl %eip, %ebx */
call next_location
next_location:
popl %ebx
/* make current code/text page writable */
mprotect %ebx, $PAGE_SIZE, $PROT_ALL
cmpl $0, %eax
js bye /* silently fail if we can't */
init_counter:
movb $6, (counter) /* initial value of counter, + 1 */
loop_counter:
movb (counter), %al
decb %al
je end_loop
movb %al, (counter)
call write_counter
jmp loop_counter
end_loop:
movb %al, (counter) /* one last time to print 0 */
call write_counter
bye:
exit $0
/* NOT REACHED */
popl %ebp
write_counter:
pushl %ebp
movl %esp, %ebp
movb (counter), %al /* convert counter to ASCII */
addb $0x30, %al
movb %al, (buffer)
write $1, $buffer, $2 /* write 2 bytes of buffer: value and \n */
popl %ebp
ret
do_syscall:
int $0x80
jnc ret_from_syscall
neg %eax
ret_from_syscall:
ret
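The error handling in do_syscall deserves a short illustration. On FreeBSD/i386, a failed syscall sets the carry flag and leaves a positive errno value in %eax; do_syscall negates that value so that callers can test the sign, which is what cmpl $0, %eax followed by js bye does after the mprotect macro. The C model below sketches this convention; both function names are invented for the example.

```c
#include <assert.h>

/* Model of the tail of do_syscall: eax is the raw value the kernel
 * left in %eax, carry_flag is 1 if the kernel signalled an error. */
int normalize_syscall_result(int eax, int carry_flag)
{
    assert(carry_flag == 0 || carry_flag == 1);
    return carry_flag ? -eax : eax;  /* jnc ret_from_syscall / neg %eax */
}

/* Model of "cmpl $0, %eax; js bye": a negative result means failure. */
int syscall_failed(int normalized_result)
{
    return normalized_result < 0;
}
```

With this model, a successful write of 2 bytes stays 2, while an error such as EINVAL (22) becomes -22 and is caught by the sign test. The amd64 version skips the negation and tests the carry flag directly with jc bye.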
As a little exercise, try to spot the changes w.r.t. the previous version selfmod1_i386.S.
How about testing? Sure, why not?
$ as --32 -o selfmod2_i386.o selfmod2_i386.S
$ ld -o selfmod2_i386 selfmod2_i386.o
$ ./selfmod2_i386
5
4
3
2
1
0
No crashes. The program runs just fine. A kdump(1) shows no surprises:
73799 ktrace RET ktrace 0
73799 ktrace CALL execve(0xbfbfedc7,0xbfbfeca4,0xbfbfecac)
73799 ktrace NAMI "./selfmod2_i386"
73799 selfmod2_i386 RET execve 0
73799 selfmod2_i386 CALL mprotect(0x8048088,0x1000,PROT_READ|PROT_WRITE|PROT_EXEC)
73799 selfmod2_i386 RET mprotect 0
73799 selfmod2_i386 CALL write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO fd 1 wrote 2 bytes
"5
"
73799 selfmod2_i386 RET write 2
73799 selfmod2_i386 CALL write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO fd 1 wrote 2 bytes
"4
"
73799 selfmod2_i386 RET write 2
73799 selfmod2_i386 CALL write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO fd 1 wrote 2 bytes
"3
"
73799 selfmod2_i386 RET write 2
73799 selfmod2_i386 CALL write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO fd 1 wrote 2 bytes
"2
"
73799 selfmod2_i386 RET write 2
73799 selfmod2_i386 CALL write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO fd 1 wrote 2 bytes
"1
"
73799 selfmod2_i386 RET write 2
73799 selfmod2_i386 CALL write(0x1,0x8049130,0x2)
73799 selfmod2_i386 GIO fd 1 wrote 2 bytes
"0
"
73799 selfmod2_i386 RET write 2
73799 selfmod2_i386 CALL exit(0)
Conclusion
Even though self-modifying code (or, more precisely, code that modifies the .text section) is rarely a good idea, it is possible nonetheless if the operating system provides a way to remove the read-only protection. FreeBSD provides the mprotect(2) system call to do just that, and we’ve made use of it, so that our counter variable could be modified, even though it was located in the .text section.
This was not really an example of self-modifying code, but since we’ve modified bytes in the .text section of a running process, we can still claim to be able to modify code on the fly (at run time) by injecting bytes into the code pages, should the need arise.
I want to quote your post in my blog. May I?
And do you have an account on Twitter?
Sure, why not? I don’t have an account on Twitter.
Thank you for these nice articles. I would love to see more articles about assembly!
For the Linux users out there: on Linux, the address needs to be a multiple of the page size, so we can’t use %rip directly, but we can calculate the page start address using: andq $0xfffffffffffff000, %rbx.
Happy hacking!