Posts Tagged ‘IDA-Pro’

# IDA-Pro steals RIP — introduction in relative addressing

intraarterial injection: i was involved into a project on design a software-level protection, based on anti-dbg tricks that should work in 32- and 64-bit environment causing no conflict with legal apps. also, my shell-code locator has to learn how to recognize x86-64 exploits, so… I took a deep breath and dived into 64-bit word. well, I’m not newbit here, but digging up the anti-dbg tricks working everywhere sounds like a challenge. ok, anti-dbg tricks, shell-codes… good point to begin with.

kotal: x86 does not allow to address EIP register directly (PDP-11 does), but supports relative addressing in the flow control commands (“the” means “all”), for example: CALL L1 it’s a relative call. in the machine representation it looks like: E8 61 06 00 00, where E8h is opcode of CALL and 61 06 00 00 – a relative 32bit signed offset of the target, calculated from the _end_ of the CALL.

it’s very important for shell-codes, because it gives them ability to work being loaded at any offset. for protections it’s useful well. to prevent dumping – just allocate the memory on the heap and copy your procedure there. no dumper is able to create a workable PE image out of heap!

drawbacks: aside of benefits of relative addressing it has its own disadvantages. guess, what happens if we copy our function which calls the function we can’t copy (for example, API). the delta between CALL and the target will be changed, forcing us to recalculate all relative addresses, or… (turn your mind on) start to use absolute addressing, for example: mov eax, offset API_func/CALL eax;

home and dry: x86-64 does not allow to use RIP (former EIP) as a general purpose register (MOV RAX, RIP does not work), but it supports relative addressing almost everywhere (let me to quite the Intel manuals:”RIP-relative addressing allows specific ModR/M modes to address memory relative to
the 64-bit RIP using a signed 32-bit displacement. This provides an offset range of -/+2GB from the RIP
“). what it does mean?! for shell-code writers it means a lot!!! from now on we don’t need in GetPC subroutine (usually, CALL L1/L1:POP r32) and can use RIP directly. and this is the part where we meet the problem of the stolen RIP.

anaphylactic shock: please, consider the following code. this is how IDA-Pro 5.5 disassembles it. remember: it’s a piece of a real shell-code, so, concentrate your mind into fuming acid and do not miss the point (see the picture bellow as well):

.code:0000000000401000 start proc near
.code:0000000000401000 mov ecx, 69h
.code:0000000000401005 jmp short loc_40100C
.code:0000000000401007
.code:0000000000401007 loc_401007:
.code:0000000000401007 nop
.code:0000000000401008 xor [eax+ecx], cl
.code:000000000040100C
.code:000000000040100C loc_40100C:
.code:000000000040100C lea rax, loc_401013
.code:0000000000401013
.code:0000000000401013 loc_401013:
.code:0000000000401013 loop loc_401007
.code:0000000000401015 mov r9d, 0

how do you like it?! ok, let me to be more specific. how do you like the line: “lea rax, loc_401013″?! what?! did you say: “looks clear!” hello no!!! look closely!!! Option -> Text representation -> Number of opcode bytes -> 9. do you see _now_ what IDA-Pro hides from us?!

.code:000000000040100C 48 8D 05 00 00 00 00 lea rax, loc_401013
.code:0000000000401013 loc_401013:

oh, my unholy cow!!! “LEA RAX, loc_401013” turns out to be “LEA RAX, [RIP]“, thus we’re dealing with position-independent code. in a way, IDA-Pro is correct. she is just calculates RIP on the fly and replaces it by the effective offset. but, we – hackers – want to know if the code is position independent or not!!!

breakdown: HIEW also replaces RIP by effective offset. please consider the following line: 0040100C: 488D05000000001 LEA RAX, [000401013]

ok, do you want to get high? well, let’s do it, ppl!

00000000: 488D0500000000 lea rax,[7]
00000007: 488D0500000000 lea rax,[00000000E]
0000000E: 488D0500000000 lea rax,[000000015]

the same opcodes produce different targets, how funny!!! of course, it’s an opcode of LEA RAX, [RIP] command and I would like to have an option which enables/disables showing RIP, because I do need in very much!!!

updated: Igor Skochinsky pointed out (see his comment below) that IDA-Pro allows us to show RIP (Options -> Analysis options -> Processor specific analysis options -> Explicit RIP-addressing). ok, lets enable it and see what happens:

.code:000000000040100C loc_40100C: ; CODE XREF: start+5j
.code:000000000040100C lea rax, [rip+0]

well, say hello to “RIP”! it’s explicated now, but… the rest of the code is almost damaged and unvoyageable (means: inconvenient for navigation):

.code:000000000040101B lea r8, [rip+0FDEh] ; “x86-64 program!”
.code:0000000000401022 lea rdx, [rip+0FEEh] ; “hello world!”
.code:0000000000401029 mov rcx, 0 ; hWnd
.code:0000000000401030 call qword ptr [rip+2016h]
.code:0000000000401036 mov ecx, eax ; uExitCode

we see relative offsets like 0FDEh, 0FEEh, 2016h, etc. they’re red colored (means: IDA-Pro does not recognize these offsets) and if we move cursor to the constant – we can’t jump by ENTER and we need to calculate the target address manually. so, the problem is still unsolved.

in passing: look at the encoder again. don’t you think that it damages the loop?! ok, lets trace the code with any debugger or with our own mind if we have no 64-bit debugger under our hands.

“loop loc_401007″ has E2h F2h opcode. in binary representation F2h is “011110010″, so the lowest bit is zero, thus, when ECX = = 1, the target of loop will change from 401007h to 401008h (401007h ^ 1 = 401008h). as result – NOP will be skipped. of course, it might be INC EBX (opcode 43h) – in that case, EBX would be increased not by ECX (as it’s expected), but by (ECX – 1). how interesting…

well, when ECX = = 0, LOOP just does not pass the control to the target, so everything works fine.

updated: Sol_Ksacap (from www.wasm.ru) pointed out that (let me to quote him): “the target of loop will indeed change, but there won’t be any loop – “loop” instruction first decreases RCX, and only then checks if it’s zero“. and he is definitely right. this post was written in hurry. sorry for the mistakes I made and big thanks all guys who pointed it out.

off the record: in normal shell-codes you probably meet something like LEA EAX, [RIP-1] (opcode: 8B05FFFFFFFF), since commands with the positive offsets have zeros in opcodes and shell-codes do not like zeros very much (because of ASCIIZ, where Zero is a string terminator).

updated on:
Wed, Juli-15: enable-RIP option in IDA-Pro, loop patching bugs;

an example of real 64bit shell-code with hidden RIP

an example of real 64bit shell-code with hidden RIP

 

# IDA-Pro 5.5 has been updated, fixed — Bochs plug-in unaligned PE bug

in a nutshell: IDA-Pro 5.5.0.925 has been updated on July-01/2009 in order to fix a bug in BOCHSDBG plug-in. from now on it supports unaligned PE files (see definition below). if you want to get the updated version, send your identification (the ida.key) to support@hex-rays.com.

nude statement: I don’t like IDA-Pro Debugger. it’s very limited, devilish uncomfortable and embarrassing. it has its own benefits, none the less, but for me OllyDbg is much better. every man has his taste – opinions differ.

death notice: OllyDbg/Soft-Ice (like any other x86 debugger) is very limited. it could be detected, it could be broken. it does not support tracing of a self-traced program and there is no workaround — no script nor plug-in to fix it. it’s nature of x86 CPU. the same story with DRx registers. virtualization and emulation is the only way to hack strong protections (oh, come on! as if you can’t break an emulator, whose behavior is pretty different from native CPU).

brutal facts: what do we have?! is there any decent emulator?! well, x86emu (plug-in for IDA-Pro) is extremely limited. BOCHSDBG is good enough to debug MBR or OS loader, but… how we’re supposed to debug applications/drivers, working _below_ the Operation System?! the same story with VMWare/WinDbg and QEMU/GDB. so, in essence there is no decent emulators, except for internal products like McAfee EDebug (very good tool, but only for home consumption, “home” means “McAfee”).

beam of hope: IDA-Pro debugger had been significantly improved since 5.4 and the most dramatical change is BOCHSDBG plug-in supporting win32 PE debugging. what does it mean and how it works? well, to answer the first question: we got what we were waiting for a looong time. yeah, there was BOCHS, but it’s impossible to debug code snippets directly into BOCHS. the only way to do it – create an image of a tiny operation system and put a snippet there. CPU starts in real 16 bits mode, while win32 programs expect to see 32-bit protected mode with flat address space – the minimum requirements to debug code snippets, but it’s not enough to hack real applications!

the next step is to create win32-like environment. at least we need to emulate fundamental system structures (like PEB) and engines (SEH for example) not mention basic API set. it has been hell of a job (or, may be, a job in hell). and this job has been done by Elias Bachaalany, he is our hell-guy – very talented brick from the eastern shore of the Mediterranean Sea. not wonder that he is a clever cat!

a mint of intrigue: it was excitement from the first sight when I read Ilfak’ post “Bochs Emulator and IDA?“. it was just awesome! at that moment my company provided me a license for IDA-Pro 5.3, but it was too old to be updated for a free, so I kicked up my heels. McAfee provided me a license as well (Danke schön to the director of IPS research of Avert Labs, it was very kind of him and maybe I will find myself in his team based in Santa-Clara). but… it was IDA-Pro 5.3 – the original CD, shipped to Moscow McAfee office, hosted in the biggest building in Europe – Naberezhnaya Towers, but I was unable to get the updates because of security policies of McAfee. I had no access to the internal network and the sardonic Firewall did not allow to go outside. what’s a piggishly! only when IDA-Pro 5.5 has been released, I got the updates directly from hex-rays.com, sitting in Macrovision office and thinking that even in my village Internet is faster (I own 10 Mbytes link, but an average speed is 2 Mbytes, but it’s more that enough to fit my needs).

collapse of plans: I started testing the new plug-in, trying to debug programs (malware mostly) that I was unable to debug with OllyDbg and the old IDA-Pro debugger. the first impression was: wow! it’s cool! it’s easy to trace self-tracing code, “software” breakpoints do not changed the content… well, I felt like I reached the golden gates (or it was Golden Bridge?) and was about to dine with Mohammed, happy hunting ground – Elias made the best of both worlds, but… better to reign in hell than serve in heaven. BOCHSDBG plug-in is a great tool, yet it’s all wrong. the whole design is wrong. it’s easy to break the debugger.

for example: it traces programs by BOCHS virtual CPU engine. the very engine is used by the debugged program, so… no problem to detect the debugger, yet it’s harder than beat a non-virtual one. (now I’m working on anti-debugging tricks some of them will be posted here, some – for commercial purposes).

the facts: when you choose Loader Type -> PE in BOCHS configuration message box, the plug-in prepares a virtual image and loads PE file there. so, we should expect a lot of problems, because it’s almost impossible to design a decent 3rd party PE Loader. the problem is: MS PE Specification is not accurate and MS does not follow it. take Section Alignment for example. according to MS PE specification, the minimal Section Alignment == PAGE_SIZE, but win/32 supports much smaller values as well (win/64 does not) and the smallest alignment, supported by standard MS Linker, is 10h. lets come to terms to call these files “unaligned PE” – it’s not a good term, because the files are still aligned, yet the align value is much smaller than the specification requires, but it’s just a term :-)

IDA-Pro 5.5.0.95 BOCHSDBG Plug-in does not support unaligned PE files, and generates an exception on any writing attempt, even if the section is writable (in fact, even the section is not writable, the system PE loader makes _all_ sections writable, regardless of the attributes – but I will keep this feature to another post). just a few people have a chance to meet an unaligned PE, because these files are not common for commercial applications, but malware use this trick quite often in order to be smaller, and I met the problem on the second day of using IDA-Pro BOCHSDBG Plug-in.

in the can: to demonstrate the problem I created a very simple file. download it or see the source below:

int mem; char *txt=”[OK]“;
__declspec(naked) nezumi()
{
__asm{ mov [mem], eax }
MessageBox(0, txt, txt, 0); ExitProcess(0);
}

to make it, run “nmake make” or use the following command lines:

$cl.exe /c /Ox ida-bug_bochsdbg-16.c
$Link.exe ida-bug_bochsdbg-16.obj /ALIGN:16 /ENTRY:nezumi /SUBSYSTEM:WINDOWS KERNEL32.lib USER32.lib

ignore the linker warning “LiLNK4108: /ALIGN specified without /DRIVER or /VXD; image may not run” — image works fine on 32-bits editions of NT, W2K, XP, S2K3/S2K8, Vista (64-bits editions probably will not run it, but I have not checked it by myself, if you have 64-bits editions of Windows under your hand, please test it and post your comment here, thanks!)

as you can see, unaligned exe works fine, OllyDbg and local IDA-Pro debugger have no problem with it, but… go to Debugger menu, click “Switch Debugger”, select “Local Bochs debugger” and run it by F9 or try to trace step-by-step.

ops!!! an exception on a write attempt (see the pic below), accessing “mem” variable, which belongs to .data section, which is writable. remove “/ALIGN:16” key from the linker arguments, rebuild the program and try to debug it again. now it works fine! but… we can’t rebuild closed source program!!! so, it’s a problem and now it’s fixed. just ask the support for the updated version.

updated on: Sun, July 05, 04:44: grammar fix

IDA-Pro 5.5.0.925, BOCHSDBG, unaligned pe, impact area

IDA-Pro 5.5.0.925, BOCHSDBG, unaligned pe, impact area