
# faked Adobe PDF.SWF exploit on milw0rm

on July-23, milw0rm uploaded “Adobe Flash (Embedded in PDF) LIVE VIRUS/MALWARE Exploit” written by @hdmoore who states that it’s (I quote) “live exploit sample for the new Flash bug (embedded in PDF)“, which is far from the truth.

the truth is – it’s the old getIcon exploit, which has nothing to do with the new vulnerability in the ActionScript Virtual Machine. the real worm (described here) uses a PDF with two embedded SWF files: one triggers the bug, the other performs heap-spraying and generates the shell-code on the fly! yeah! it uses ActionScript byte-code (which is not plain text like JavaScript, it’s more like Java byte-code) to generate the shell-code, so there are no unescape strings, and my shell-detector fails to find it (of course it fails, it does not support ActionScript byte-code, at least not yet).

I will write about the real SWF exploit tomorrow. today we’re going to talk about that faked exploit. it’s pretty interesting as well. the first thing we have to do is to decompress all streams. it’s easy. zlib supports that format, we just have to write a PDF parser… do we?! oh, not really!!!

according to RFC-1950 a zlib stream has the following structure: CMF_FLG (more–>). so, we can just look for the CMF_FLG header, trying to decompress every stream we meet – a very useful universal decompressor, supporting not only PDF, but much more (HTTP streams for example).

the FLG field has a 5-bit FCHECK checksum and the header itself is quite predictable, so it’s easy to find a potential ZLIB header inside a byte stream. how to defeat false positives? (a 2-byte header is too short to be reliable enough). well, no problem guys! if we find something that looks like CMF_FLG, just try to unpack the first 512 bytes with the zlib inflate() function. if it fails, it’s a false positive, otherwise we call it again to unpack the rest.
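the header test described above can be sketched in a few lines of C. this is only a minimal sketch of the first stage (finding candidates): the function names are made up for illustration, and the second stage (confirming each candidate with a trial inflate() call) is left out. the rules come from RFC-1950: the low nibble of CMF must be 8 (deflate), the high nibble must not exceed 7, and CMF*256 + FLG must be a multiple of 31.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the two bytes look like a zlib (RFC 1950) header:
   CM (low nibble of CMF) must be 8 (deflate), CINFO (high nibble) must be
   at most 7, and CMF*256 + FLG must be a multiple of 31 (the FCHECK rule). */
static int looks_like_zlib_header(uint8_t cmf, uint8_t flg)
{
    if ((cmf & 0x0F) != 8) return 0;
    if ((cmf >> 4) > 7) return 0;
    return ((unsigned)cmf * 256u + flg) % 31u == 0;
}

/* Scans buf for candidate headers, starting at offset `from`.
   Returns the offset of the next candidate, or len if there is none. */
static size_t next_zlib_candidate(const uint8_t *buf, size_t len, size_t from)
{
    for (size_t i = from; i + 1 < len; i++)
        if (looks_like_zlib_header(buf[i], buf[i + 1]))
            return i;
    return len;
}
```

a caller would loop over next_zlib_candidate(), try to inflate from each returned offset, and resume the scan one byte further whenever the trial decompression fails.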

ok, all streams of hereEvil.pdf are decompressed. the 15th stream is JavaScript with a large Array containing an escaped string. looks like shell-code, but hell no! decode it with a simple deURI converter and… ops!!! another JavaScript!!! yes!!! exploit inside exploit, nested obfuscation. can you believe it?! I just improved my shell-code locator, adding recursive filtering support (the zlib-decompressor and the unescape decoder are basically external filters for the locator engine). I have not released the new version yet, I was just testing it and… wow!!! I met an exploit that really uses nested JavaScripts for better obfuscation! well, just in time, just in time…

NOTE: if you have no idea how to write a deURI decoder, download ECMA-262.pdf (ECMAScript Language Specification) and go to section “B.2.2 unescape (string)“. there you will find the unescape decoder, written in pseudo-code.
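for the impatient, here is a rough C rendering of that pseudo-code. it is a sketch, not the code of my locator: the function names are invented for this example, and the output is kept as 16-bit code units because %uXXXX sequences carry two shell-code bytes each. as in the spec, a “%” that is not followed by valid hex digits is copied unchanged.

```c
#include <stddef.h>
#include <stdint.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Parses n hex digits at s; returns the value, or -1 if any char is not hex. */
static long hex_run(const char *s, size_t n)
{
    long v = 0;
    for (size_t i = 0; i < n; i++) {
        int d = hex_val(s[i]);
        if (d < 0) return -1;
        v = v * 16 + d;
    }
    return v;
}

/* Decodes %uXXXX and %XX sequences from in[0..len) into 16-bit code units.
   out must have room for len entries. Returns the number of units written. */
static size_t js_unescape(const char *in, size_t len, uint16_t *out)
{
    size_t k = 0, n = 0;
    while (k < len) {
        long v;
        if (in[k] == '%' && k + 6 <= len && in[k + 1] == 'u' &&
            (v = hex_run(in + k + 2, 4)) >= 0) {
            out[n++] = (uint16_t)v;      /* %uXXXX -> one 16-bit unit */
            k += 6;
        } else if (in[k] == '%' && k + 3 <= len &&
                   (v = hex_run(in + k + 1, 2)) >= 0) {
            out[n++] = (uint16_t)v;      /* %XX -> one 8-bit value */
            k += 3;
        } else {
            out[n++] = (uint8_t)in[k];   /* anything else is copied as-is */
            k++;
        }
    }
    return n;
}
```

to peel nested layers, feed the decoded output back into the same filter until no more escape sequences are found.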

the second (underlined) layer is not interesting. it’s just an Array with an unescape string that contains the real shell-code and includes the well-known ["doc"]["Collab"]["getIcon"]. does it look familiar?! of course it does!!! it’s the old getIcon exploit, just more obfuscated.

now, about the shell-code. it’s very simple, not even encrypted. this is what my shell-code locator said:

+KRNL32 BASE ADDR PEB FINDER @ 00000019h
XOR key: 00 00 00 00 (00000000h)
#GREEN

ok, open the file with HIEW, go to 19h offset and see:

00000019: mov eax,[eax][0C]
0000001C: mov esi,[eax][1C]
0000001F: lodsd
00000020: mov eax,[eax][08]
00000023: jmps 00000002E —

yep, a typical KERNEL32 base address finder. what else?! the most interesting thing is – the shell-code has text strings. just look at them:

URLMON.DLL, URLDownloadToFileA, update.exe, crash.php, http://viorfjoj-2.com/2/update.php?id=0

wow!!! the domain name!!! I checked it and found out that viorfjoj-2.com is down, so I went to a whois service and… ops! surprise!!!

WHOIS information for viorfjoj-2.com:
* Registration Service Provided By: DOMAIN NAMES REGISTRAR REG.RU LTD.
* Contact: +7.4955801111
* Domain Name: VIORFJOJ-2.COM

Registrant:
Private person
Dmitry Ostupin (conroetxwelc@gmail.com)
ul. Malaya Semenovskaya, d.5, kv. 28
g. Moskva
g. Moskva,107023
RU
Tel. +7.4952240537

Creation Date: 08-Jul-2009
Expiration Date: 08-Jul-2010

Russian guy! that’s a deal! I have no idea whether he is the author of the exploit or maybe his server was used by another person, but I wonder… I wonder… going to give him a call tomorrow just out of curiosity.

well, maybe I should not publish his contact info here because of etiquette, but… why not?! the exploit was taken from a public source, the hard-coded domain name was found, so… everyone can use the whois service to get this contact info.

well, what are we going to do on the ISP side? if you meet a packet from/to viorfjoj-2.com it means the host is infected and the packet should be blocked. well, since the server is down – obviously all major ISPs have blocked it already.

faked exploit on milw0rm - it has nothing to do with the real SWF security hole


 

# weakness of PAGE_GUARD or new Windows bug (XP/Vista 32/64 SP1)

a simple, but effective system-independent anti-debug trick, based on well-documented APIs, that does not involve inline assembly (means: it could be implemented in pure C). it also works as an anti-dump sensor.

caution: I would recommend _not_ using this trick in production code, because it’s based on a bug (two bugs actually: one in Windows, another in OllyDbg), which could be fixed at any moment. however, nothing terrible happens if the bug gets fixed – the application just would not be able to detect the debugger/dumper.

in passing: I found this bug working on the project for a spectrography cherry group, well, not a cherry actually, but I prefer to keep the real name if it under the mat, anyway it’s all about Ciscar Fon – my first love, a gothic type, very kinky and yet creative.

in a nutshell: the whole idea is based on the PAGE_GUARD attribute. the SDK says: “any attempt to access a guard page causes the system to raise a STATUS_GUARD_PAGE (80000001h) exception and turn off the guard page status… if a guard page exception occurs during a system service, the service typically returns a failure status indicator“. wow! how I like these words: “typically”, “usually”, “normally”… they say nothing, but at the same time they say everything!!! just read between the lines…

ReadProcessMemory: normally, /* I mean _normally_ */ ReadProcessMemory() returns an error if it meets a page with the PAGE_GUARD attribute. does it make sense? of course! but, _normally_ does not mean “every time”. Windows has a bug (I tested W2K SP4, XP SP3, Vista SP0 and Vista 64bit SP1 – they are all affected).

the bug: if a PAGE_GUARD page is created by a VirtualAlloc() call, ReadProcessMemory() turns off the guard page status without any exception and returns a failure status indicator. however, the second ReadProcessMemory() call returns a positive status (because PAGE_GUARD was turned off), so when the application tries to access that page – there will be no exception (as there is supposed to be), because there is no guard anymore.

the sensor: it’s easy to create a sensor to detect dumpers. allocate a page with PAGE_GUARD attribute and check it from time to time: just install SEH handler and read the content. no exception means we’re fucked, oh, sorry, dumped. I tested PE-TOOLS and other popular dumpers and they all were detected.

demo: to demonstrate the bug, I wrote two simple applications. one – “protected-like” application, another – dumper-like application. please download the sources and binaries.

“protected” application (PAGE_GUARD_BUG.c) is very simple, basically it just calls VirtualAlloc(,,,PAGE_READWRITE | PAGE_GUARD), displays the address/PID, waits for pressing ENTER and attempts to read the content of the allocated block. there is no SEH handler, so if an exception happens you will see the standard Windows message box.

p = VirtualAlloc(0, 0x1000, MEM_COMMIT, PAGE_READWRITE | PAGE_GUARD);
printf("run turnoff.exe %d %d twice and press enter", GetCurrentProcessId(), p);
gets(buf); printf("result: %x\n", *p);

and the “dumper” (turnoff.c ) just calls ReadProcessMemory() and displays the result:

h = OpenProcess(PROCESS_ALL_ACCESS, 0, atol(arg_id));
x = ReadProcessMemory(h, (void*)atol(arg_addr), &buf, 0x1, &n);

oh, here we go. follow me, please!

1) run the protected app (“$start PAGE_GUARD_BUG.exe“);
2) it displays ID/addr, like: id:1216 addr:4325376;
3) press ENTER right now;
4) ops! exception! this means: PAGE_GUARD works!!!
5) run the protected app again (“$start PAGE_GUARD_BUG.exe“);
6) it displays ID/addr, like: id:1212 addr:4325376;
7) run the dumper, passing ID and addr (“$turnoff.exe 1212 4325376“);
8) it says: “satus:0, bytes read: 0″ (means: ReadProcMem failed);
9) but! if you switch to PAGE_GUARD_BUG.exe and press ENTER you will see no exception (means: PAGE_GUARD was turned off);
10) if you run the dumper twice (of course without pressing ENTER) it will display: “satus:1, bytes read: 1″ (means: there is no PAGE_GUARD anymore);

nice trick, isn’t it? but actually it was just a little warm-up. the real tricks are coming.

NOTE: if PAGE_GUARD attribute is assigned by VirtualProtect(), Windows respects the attribute and ReadProcessMemory() fails, leaving PAGE_GUARD turned on.

debuggers: what happens if a debugger meets a PAGE_GUARD page? the answer is: there will be no exception, the debugger just turns PAGE_GUARD off, processes the content and forgets to put PAGE_GUARD back.

demonstration: to demonstrate this nasty behavior I wrote a simple program PAGE_GUARD_DBG.c, download it, please. and follow me. the source code is easy to understand:

push 0x102 ; PAGE_READONLY | PAGE_GUARD
push MEM_COMMIT
push 0x1000
push 0
call ds:[VirtualAlloc]
mov eax, [eax]

execute it step-by-step, make step over the VirtualAlloc() call and display the content of the allocated memory block (for example, in IDA-Pro press , eax, ENTER and to go back). continue tracing, and… ops! where is our exception?! there is no one!!!

OllyDbg is even worse. it automatically resolves memory references displaying the content in the right column, so we don’t need to go to the dump window nor pressing CTRL-G… just trace it and the debugger will be detected, since there will be no exception!!!

IDA-Pro: well, what about if we just run the program under debugger? just run, no trace! IDA-Pro triggers an exception: “401035: A page of memory that marks the end of a data structure such as a stack or an array has been accessed (exc. code 80000001, TID 1312)” and offers to pass the exception to the application. in this case the debugger will be _not_ detected.

OllyDbg: the de-facto standard debugger stops when the application accesses PAGE_GUARD, giving the message “Break-on-access when reading” in the status bar, but Olly does not offer us to pass the exception to the application. even if we go to options->exceptions and add the 0x80000001 (STATUS_GUARD_PAGE) exception to the list, Olly will ignore it! I guess PAGE_GUARD is just a part of the “memory-breakpoint” engine, so there is no way to pass the PAGE_GUARD exception to the application, and it’s easy to detect the debugger. (I tested OllyDbg 1.10).

Soft-Ice: it does not display the content of PAGE_GUARDED pages, so it could not be detected this way. on the other hand, keeping the important content under PAGE_GUARD makes debugging much harder. we can’t perform a full memory search, we can’t find cross references… we’re blind.

the love triangle: PAGE_GUARD, Windows and OllyDbg: Windows has a bug, Olly has a bug, so... how we're supposed to debug?!

 

# Xcon2009: passive non-resident root-kits

I was invited to Xcon 2009 Security Conference (Beijing, China, 18th-19th August 2009) where I’m going to talk about a new generation of passive non-resident win32/Linux root-kits. the brief introduction follows below:

In the dark…
…I heard your voice: “hey, you on the other side! In this dark and rainy night, we come out of the shadows just to finish what we began a thousand years ago. my gun is pumping, you’re down on your knees. a closer step to death. I think I’m coming, are you ready to receive? I spray you full with my killer disease! now life is death and light is dark!

there is a full-scale subterranean war being waged for every shred of information, there are things that go bump in the night, everybody knows about it and nobody says anything about it. they don’t intend to upset the balance of the war. I will. I will open a portal… and awaken the Ogdru Jahad. behind this door, a dark entity. evil, ancient and hungry.

The Seven Gods of Chaos turn out to be a new kind of root-kits. non-resident passive Ring-3 Root-Kits affect Windows and Linux. sounds boring, doesn’t it? but hold a candle to the sun and listen. they’re coming inside to break you down, they hide exe/dll modules, using only well-documented win32 APIs, working _everywhere_ from 9x to Vista, they don’t request administrator rights and every known AV fails to find the hidden modules as well as to detect the root-kits, because there is nothing to detect — thanks to the passive non-resident nature of them! your favorite tool — the manual detection (“hands-n-brain”) fails to detect them as well! soft-ice, syser, and root-kit finders show us nothing! what the hell is this — science or black magic?! I don’t know, I just hear how your PC box is crying: what’s happening to me? everything is so cold! everything is so dark! what is this pain I feel, why does it hurt? please no, let me die… let me die… let me die… hey! don’t you know it is supposed to work? you always get what you deserve! there is no cure. there is no solution. in death and dark we are all alone.

facts: This is not something absolutely new. this is what the hacker community started to talk about a year ago. it was a part of my Reverse Engineering Course lectured to Sec++ Group (Israel), Sense Post company (South Africa) and many others. at that moment we considered it a win32 bug, allowing us to infect running EXEs and loaded DLLs.
Discussing this stuff with the Apple Panda and Soft Forum guys (Seoul, South Korea) suddenly we realized — this is much more than just infection, the same trick might be used for hiding and there is no way to find the coffined modules. it was supposed to be a part of my speech on CodeGate-2009 conference, but for some reasons this topic was removed and suspended for a while.
There were some (just a few) internal reports that I sent to my company (McAfee, Avert Labs), but the wide public had no idea about what was going on till now, and from now till doomsday you will know for sure what this is all about. this is a new threat, spotlighting maladjustment of three major Windows engines – file system, virtual memory manager and object manager. Linux boxes are not affected. well, in fact, they’re affected, but for them there is a solution — a cure. but not for Windows! we’re all waiting for an official patch, fixing the problem.

/* snippets from New Rose Hotel, Queen Of The Damned, Hell-boy, BlutEngel, Pain were used */

updated on: Jul-09, grammar fix (thanks to Ben Layer, McAfee)

new generation of passive non-resident win32/Linux root-kits

Xcon - one of the most authoritative and famous information security conferences in China

 

# JL/JGE Intel CPU bug as anti-reversing trick

months ago Bow Sineath (a very clever reverser!) asked me: “does the JL [jump if less] instruction check the ZF flag?” I said: “well, give me a second to think, well, it’s supposed to check it, otherwise it would act like JLE [jump if less or equal] and besides, JL is a synonym of JNGE (jump if not greater or equal), so JL should check ZF!“.

but, according to Intel’s manuals JL and JNGE check only if SF != OF. CMOVL/CMOVNGE work the same way. at that time I thought that it was just a documentation bug and even pointed this out in my presentation at the HITB 2008 conference.

fragment of Intel manual


but I was wrong!!! I have checked it and found out that JL/JNGE does not check ZF flag!!! to do this I wrote extremely simple POC (if you’re too lazy to type, download source and binary):

__asm
{
mov eax, 002C2h ; S = 1, O = 0, Z = 1
push eax
popfd
jl jump_is_taken ; ==>
mov p, offset noo
jump_is_taken:
}

mov eax, 2C2h/push eax/popfd set SF with ZF and clear OF. so, SF != OF, but ZF is set. what CPU is going to do? easy to check with Olly! just load the program and start tracing. ops!!! JL is taken!!! JL ignores ZF!!! x86emu (plug-in for IDA-Pro) acts the same. didn’t check other emulators yet.

well, it’s interesting. why JL (and similar commands) ignores ZF?! guess, normal CPU command (like TEST/CMP/XOR/etc) never set ZF if result is less, so JL just ignores it. but… if we set flags manually or use other tricks… it becomes a real trap!!! consider the listing above and ask your co-worker: is the jump taken or not? I’m sure, some of them will answer: of course, the jump is not going to be taken! a good anti-reversing trick!!! I just wonder – how software is still working on buggy hardware.
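to make the trap easy to play with, here is a tiny C model of the two conditions exactly as the Intel manual words them: JL/JNGE is taken when SF != OF, JLE/JNG is taken when ZF = 1 or SF != OF. the function names are invented for this sketch and it models only the documented rule, it does not execute anything on the CPU. the flag bit positions (ZF = bit 6, SF = bit 7, OF = bit 11) are the standard EFLAGS layout.

```c
#include <stdint.h>

#define FLAG_ZF 0x0040u   /* bit 6  */
#define FLAG_SF 0x0080u   /* bit 7  */
#define FLAG_OF 0x0800u   /* bit 11 */

/* JL / JNGE: taken when SF != OF; ZF is not consulted. */
static int jl_taken(uint32_t eflags)
{
    int sf = (eflags & FLAG_SF) != 0;
    int of = (eflags & FLAG_OF) != 0;
    return sf != of;
}

/* JLE / JNG: taken when ZF = 1 or SF != OF. */
static int jle_taken(uint32_t eflags)
{
    return (eflags & FLAG_ZF) != 0 || jl_taken(eflags);
}
```

feeding it the value from the POC, jl_taken(0x2C2) reports that the jump is taken, because SF = 1 and OF = 0 no matter what ZF says. with flags as CMP leaves them for equal operands (ZF = 1, SF = OF = 0) JL is not taken, which is why normal code never notices the difference.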

JL does not check ZF flag as it is supposed to do!!!


 

# Olly loads Olly to bypass anti-attach tricks /* Clerk’s trick */

the problem of anti-anti-attaching came up in conversation on the legendary wasm.ru site. Clerk (a very clever guy carrying a heavy plasma gun, loaded with rounds of brilliant ideas) as always offered a very elegant, yet bizarre solution (ru). I wonder – what kind of Rasta stuff makes him so creative! well, enough expatiating, back to business.

previous posts demonstrate numerous anti-attach tricks and most of them are based on the system thread created by the OS during attaching. here they are (the tricks): BaseThreadStartThunk => NO_ACCESS, NtRequestWaitReplyPort, DbgBreakPoint

the question is – how to ask the OS not to create the system thread? to do that we should know the OS internals. IDA-Pro/Soft-Ice shows us that KERNEL32!DebugActiveProcess leads to NTDLL!DbgUiDebugActiveProcess, which calls NTDLL!ZwDebugActiveProcess / NTDLL!DbgUiIssueRemoteBreakin | NTDLL!DbgUiStopDebugging (just disassemble NTDLL!DbgUiDebugActiveProcess to see it with your own eyes).

the point is – NTDLL!ZwDebugActiveProcess does the whole job of attaching a debugger to the process. as soon as NTDLL!ZwDebugActiveProcess returns status ok, the process has been attached and can be debugged. but! the operating system calls NTDLL!DbgUiIssueRemoteBreakin just to notify the debugger by generating a breakpoint exception, however, we don’t need it!!!

so, what are we going to do? I prefer to use old soft-ice with global breakpoints support. just set a HW or software breakpoint on NTDLL!DbgUiDebugActiveProcess or NTDLL!ZwDebugActiveProcess and skip the rest of the function. it’s easy, but soft-ice does not work with the newest operating systems.

Clerk found the way how to do this with Olly. the idea is: load Olly into Olly. yeah! right!

1) load Olly into Olly /* to avoid a mess lets call the first Olly (I) and the loaded copy – Olly (II) */;
2) Olly (I): Set breakpoint on NTDLL!DbgUiDebugActiveProcess: View\Executable Modules\NTDLL.DLL, CTRL-N, “DbgUiDebugActiveProcess”, F2, ENTER;
3) Olly (I): run Olly (II): press F9 several times until “paused” in the right corner changes to “running”, meaning that Olly (II) is still being debugged but it’s running now;
4) ALT-TAB to switch to Olly (II);
5) Olly (II): File\Attach\name_of_the_trickily_process to attach (for example: to_attach_36.exe);
6) Olly (I) pops up, the breakpoint has been triggered;
7) to_attach_36.exe is still running;
8) Olly (I): press F8 several times until NTDLL!ZwDebugActiveProcess is executed;
9) to_attach_36.exe has been stopped, Olly (II) has been attached to it, Olly (II) is stopped as well;
10) Olly (I): move the cursor to the next command after NTDLL!DbgUiStopDebugging, right click for the context menu and choose “new origin here”, or simply press CTRL+Gray * (“gray” means the small numeric keypad);
11) Olly (I): press F9 to run Olly(II);
12) ALT-TAB to switch to Olly (II);
13) Olly (II) shows naked screen w/o any info, to_attach_36.exe is running;
14) Olly (II): View\Threads. do you see the only one thread? the main thread of the app?! wow!
15) Olly (II): press “pause” to stop to_attach_36.exe;
16) Olly (II) updates CPU window and from that moment we can trace to_attach_36.exe as usual;

well, you got it. a nice trick to bypass anti-attaches. it’s very powerful and universal, but does not work with PEB=>LdrData . um, every technique has its own limitations.

meanwhile, as you probably know, if a process is already being debugged by another process, we can’t attach our debugger to it. many protections use this trick – they create a child process (packed) and attach to it for dynamic unpacking. but there is a loophole. we can attach to the parent process unless the child is attached to the parent. yes!!! a debugged process can attach to its debugger!!! it looks like (parent <== attach ==> child)

I’ll write about it later, showing you how to break this chain. for now, you can play with Clerk’s trick!

Olly loaded into Olly attached to to_attach_36.exe


 

# Process Explorer – bloody hell of indefinite waiting bugs

a long time ago I found a bug in Process Explorer: it uses the wrong algorithm for retrieving the Thread Start Address. all threads created with CreateRemoteThread have a Thread Start Address pointing to KERNEL32.DLL. it’s true, but help- and sense-less. I sent my bug report to Mark, but got no answer and posted it on my old blog: “bug in Process Explorer (a gift for malware)”.

it means that Process Explorer is not reliable enough, we can’t trust it anymore and yesterday I found more bugs. they’re very common and almost _every_ application has numerous bugs like this. are you intrigued? well, lets begin!!!

download to_attach_ldr.exe (anti-attach trick, based on damaged PEB=>PPEB_LDR_DATA – read this post for more info), run it and… ask Process Explorer to display properties of “to_attach_ldr.ex” process (yeah! “ex”, not “exe” – another bug?!). now, go to “Threads” tab and…

ops!!! Process Explorer freezes, falling into an infinite loop with 100% CPU load. ~60% is to_attach_ldr.exe and ~40% is CSRSS.EXE. ok, close to_attach_ldr.exe and Process Explorer immediately wakes up. this means it was just waiting for an event that was not going to happen.

ok, another example. download to_attach_33.exe (anti-attach based on intercepted NtRequestWaitReplyPort, described here ). run it and ask Process Explorer to show properties, go to the “Threads” tab and… you know what will happen in the next moment. freezing!!!

freezing Process Explorer


we just found out how buggy Process Explorer is. but why? what’s the issue? the answer: NT has numerous functions like WaitForSingleObject, WaitForDebugEvent, etc. they all take a dwMilliseconds argument that specifies the time to wait for an event. if the parameter is INFINITE, the function does not return until an event has occurred. the problem is – almost every programmer uses INFINITE and does not handle the time-out error. admit it, you did this too?

Ilfak does for sure and I pointed it out before (“another EnableTracing() bug“).

ok, we got two points. first – never use INFINITE unless you’re 100% sure it works. the second point: if malware creates a malicious thread inside a trusted application – anti-attach tricks help it to survive. many anti-viruses are unable to enumerate threads in this case and some of them just freeze. a very effective DoS attack against pro-active protections!!!

note: I tested Process Explorer 11.4 under W2K SP4, not tested other versions yet, but guess the bug is there.

 

# NtRequestWaitReplyPort abuses IDA-Pro

good news first. my simple anti-anti-attaching plug-in is coming soon and it works very well. meanwhile, I’m experimenting with different anti-attaching technologies and wish to share a few new (old?) tricks with you.

well, the method based on NTDLL!DbgBreakPoint (see “try to attach to me: if you can!” ) is not good enough to hurt IDA-Pro. like Ilfak said – just set the “Stop on debugging start” checkbox in the debugger options to stop IDA-Pro _before_ NTDLL!DbgBreakPoint. how is it going to help?! it isn’t! but we will try. press F7 several times. go to Debug\Open subviews\Open threads. do you see two threads there? one is the main thread of the app, the other is the system thread, created by the DebugActiveProcess() API. the point is – we don’t need the system thread anymore. we should kill it. why? the debugged application might inject bad code into it. but IDA-Pro does not allow us to kill threads.

OllyDbg does. ok, run Olly, go to Options\Debugging options\Events\Break on new thread [x]. attach to to_attach_31.exe. ok, OllyDbg has stopped. now, View\Threads. do you see two threads there? the current thread is the system thread. kill it! (context menu\kill thread). um, the thread does not want to die. don’t worry, it’s almost dead. now, click the other thread (the main one), context menu, actualize it and start tracing the main thread step-by-step or just press F9 to run. the system thread has disappeared. there was injected code displaying “shit happens”, but since the thread has been killed – no shit!!! everything is just fine!!!

this is universal technology. I tested it on large malware/protectors collection and it works well! at least for Olly. for IDA-Pro we need to write a script or plug-in, killing unwanted threads.

ok, forget about NTDLL!DbgBreakPoint. back to IDA-Pro. “Stop on debugging start” is set, IDA-Pro attaches to the process (any process you want) and stops. where it stops? let me see…

NTDLL!77F88B6C ZwRequestWaitReplyPort proc near
NTDLL!77F88B6C mov eax, 0B0h ; NtRequestWaitReplyPort
NTDLL!77F88B71 lea edx, [esp+arg_0]
NTDLL!77F88B75 int 2Eh
NTDLL!77F88B77 retn 0Ch ; << here
NTDLL!77F88B77 ZwRequestWaitReplyPort endp

so, IDA-Pro stops at NTDLL!77F88B77, when the NtRequestWaitReplyPort NTCALL has already been executed, so NtRequestWaitReplyPort (called by CsrClientCallServer) is executed _before_ the stop. thus, if we intercept NtRequestWaitReplyPort – it will be easy for us to abuse IDA-Pro or do something unexpected. and IDA-Pro can do nothing about it.

the problem is: NtRequestWaitReplyPort is very popular function and it’s used not only by debugger. so, we can’t just intercept it. we have to check the caller – the thread ID.

for example:

mov eax, fs:[18h] ; // *TIB
mov eax, [eax+24h] ; // CurrentThreadId
sub eax, [our_tid] ; // ?another Thread
jz to_old ; // => no dbg

; // perform stack overflow
die: push eax
jmp die

; // all ok, passing control to the old func
to_old: jmp ds:[old_NtRequestWaitReplyPort]

I wrote a simple POC, abusing IDA-Pro and OllyDbg. download it, run the exe. do you see the message – “attach to me”? well, ask Olly to attach. Olly attaches without any problems, but… the string changes to “debugger is detected” and the process is still running. wow!!! of course, we can stop the process and continue tracing, but… the point is – the debugger has been detected. how? I just injected my code into NtRequestWaitReplyPort to set a global flag if we’re under a debugger. the main thread checks this flag and changes its behavior if we’re under a debugger. of course, in this simple case we can fix it after attaching, but imagine what happens if the injected code wipes out all the code of the app, or destroys the critical structures, or just frees a few memory blocks to cause random crashes?

OllyDbg stops too late. our code injected into NtRequestWaitReplyPort executes before that, and the debugger has no control over it. what about IDA-Pro? try to attach to to_attach_33.exe (all check boxes in the debug options are set). what do we see? the protection tells us “debugger is detected”, the process is running, but IDA-Pro… freezes. what is she waiting for? and how to save our database? we don’t want to kill IDA-Pro, do we? right! kill to_attach_33.exe with Process Explorer. IDA-Pro will come back from the dead.

IDA-Pro 5.3 fails to attach


well, who will break this simple crack-me? who will find a way to attach to the process without disturbing the protection?

 

# PRNG based on REP STOS

unlike many others, REP STOS/MOVS instruction is very hard to emulate (see “self-overwritten REP STOS/MOVS, IDA-Pro 5.4 and Ko” post). at first sight it’s easy! live CPU continues executing overwritten command, right? so, take OllyDbg or IDA-Pro and write a plug-in to redesign the trace engine. if the tracer meets self-overwritten REP STOS/MOVS it memorizes it and continues executing like the real CPU does. it’s easy! what’s the problem?!

but, it’s more than meets the eye. CPU stops executing over-written REP STOS/MOVS in arbitrary moments. well, almost arbitrary. hardware interrupts force CPU to clean pipeline, stopping executing the overwritten REP STOS/MOVS command. internal CPU events also stop REP executing. different CPUs have different behavior (and different bugs :=).

this gives us… a very interesting pseudo-random number generator. if initial ECX is quite big, REP loop finishes with almost unpredictable ECX and the most interesting part is – our PRNG does not look like normal PRNG!!!

to prove it, I wrote a simple POC (plz, do not use it in commercial software, coz, due to CPU bugs sometimes it crashes with access violation or another exception, but in general it works, I’m working on the stable version now). download it and run.

the program allocates memory on the heap, copies the self-overwritten REP STOSB code there, initializes ECX with a 16 MB value, passes control to it and displays ECX. the point is – ECX is different on every run.

my favorite Pentium-III 733 MHz Coppermine (my base PC) copies at least 64 KB on average, see:

16685683 (91533 copied)
16575731 (201485 copied)
16755560 (21656 copied)
16770446 (6770 copied)
16670677 (106539 copied)
16759763 (17453 copied)

Pentium-4 3.2 GHz HT (my home file-server) shows pretty different result:

16044939 (00732277 bytes copied)
16777211 (00000005 bytes copied)
16777210 (00000006 bytes copied)
16777206 (00000010 bytes copied)
16777209 (00000007 bytes copied)
15788810 (00988406 bytes copied)

damn! sometimes CPU copies only 5 bytes!!! well, not “sometimes” – very often!!! it makes this trick not just PRNG, but… kind of fingerprints of CPU.

 

# attach to me… if you can (part II)

the previous post describes how to intercept attaching, but that way does not prevent the attaching itself. as it happens, there is a simple and elegant way to block any attaching attempts. just wipe out the PEB=>PPEB_LDR_DATA field. the application is running well, the process is present in the process list of Task Manager/Process Explorer, but… it’s not listed in the Olly 1.10/Olly 2.00i attach windows!!!

to_attach_ldr.exe is not present in the attach windows!


ok, guys another plan! load the file directly into OllyDbg 1.10 in order to debug it. can we debug it? well, yes, but… no. OllyDbg 1.10 does not show us the module list (so, how we’re supposed to set breakpoints on API?) and the map window is empty as well. OllyDbg 2.00i and IDA-Pro 5.3 have no such problem.

IDA-Pro 5.3 can’t attach to the process as well, she just freezes!!! and there is nothing to do but terminate IDA-Pro with all changes we have made. a very nasty bug!

the source code is extremely simple. see it below or download.

__asm{
mov eax, fs:[30h] ; // PEB
mov [eax + 0xC], eax ; // damage LdrData to prevent attaching
}
// do something
while(1) printf("\rattach to me [%c]",x[++a % (sizeof(x)-1)]), Sleep(100);
}

so, you get it. any ideas how to hack it? does anybody know the way how to attach to the process?

note: Elias Bachaalany checked IDA-Pro 5.4 (both with the WinDbg plug-in and the built-in win32 debugger). it does not freeze, but the attached code crashes inside NTDLL.DLL and IDA catches a lot of exceptions, so this trick works for IDA-Pro 5.4 as well. it’s not an IDA-Pro bug! and there is nothing IDA-Pro can do to fix it.

 

# self-overwritten REP STOS/MOVS, IDA-Pro 5.4 and Co

once upon a time there was MS-DOS and ancient debuggers like Turbo-Debugger, Soft-Ice and many others. and there were anti-debug tricks. one of them was based on a self-overwritten REP STOS/MOVS instruction. it worked great against all existing debuggers, including CUP 386 (an exe unpacker with a built-in CPU emulator).

I used this trick for years. I would have almost forgotten about it if Silvio Cesare had not posted the “Anti-debugging prefetch tricks and single stepping through a rep stos/movs” article on his blog (very nice blog, btw).

I was interested: what about modern debuggers? what about emulators like BOCHS? what about IDA-Pro 5.4 with the BOCHS-based debugger? imagine how surprised I was when I found out that IDA-Pro 5.3, Olly 1.10 and Soft-Ice not only can be detected this way, but also lose control during Step Over tracing! the debugged code just escapes out of the debugger!!! IDA-Pro 5.4 with the BOCHS module fails to emulate the self-overwritten REP STOS/MOVS instruction (so it can be detected as well) and loses control on Step Over tracing. only Olly 2.00i recognizes the attempts to escape and blocks them, however it can be detected the same way.

for testing purposes I wrote a simple program with a self-overwritten REP STOSB instruction (see the source below).

std                     ; // DF = 1: EDI moves downwards
xor ebx, ebx
mov al, 43h             ; // INC EBX
mov edi, offset end_of
mov ecx, 6              ; // 4 NOPs + the 2 bytes of REP STOSB itself
REP STOSB
NOP
NOP
NOP
end_of: NOP
cld

download it, load it into IDA-Pro 5.3 and start tracing the REP STOSB instruction (F7 hot key). what do we see? REP STOSB changes NOP to INC EBX: it overwrites the four NOPs and then overwrites itself (the STOSB opcode). since during tracing the CPU generates a single-step exception on every iteration, the instruction is fetched from memory again each time, so REP STOSB becomes REP INC EBX. as we all know, REP works only with string instructions, so REP INC EBX does not loop and the REP loop finishes with ECX = 1.

now, run the program without tracing. the CPU pipelines REP STOSB and keeps executing it while ECX > 0. REP STOSB modifies only the data cache, while the instruction is already executing in the pipeline and the CPU does not notice the modification of the code, so the REP loop finishes with ECX = 0.
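the difference between the two runs can be sketched with a toy model in C. this is not a CPU emulator, just an illustration under simple assumptions: the six code bytes of the test program live in an array (offsets and the name run_rep_stosb are made up for this sketch), "traced" means the instruction bytes are read from memory again before every iteration, and "not traced" means the already fetched instruction runs to the end. a real CPU may stop earlier, for example on an interrupt.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the test program's code bytes, NOT a real CPU emulator.
   Hypothetical layout: REP STOSB at offsets 0..1, three NOPs at 2..4,
   end_of: NOP at offset 5. */
enum { REP_PREFIX = 0xF3, STOSB = 0xAA, NOP = 0x90, INC_EBX = 0x43 };

/* Runs "REP STOSB" with DF = 1 (EDI decrements) and returns the final ECX.
   refetch != 0 models single-stepping: the instruction bytes are re-read
   from memory before every iteration.
   refetch == 0 models a free run: the fetched instruction keeps going
   no matter what lands in memory. */
uint32_t run_rep_stosb(int refetch)
{
    uint8_t code[6] = { REP_PREFIX, STOSB, NOP, NOP, NOP, NOP };
    uint32_t edi = 5;               /* offset of end_of */
    uint32_t ecx = 6;
    const uint8_t al = INC_EBX;

    while (ecx > 0) {
        if (refetch && !(code[0] == REP_PREFIX && code[1] == STOSB))
            break;                  /* bytes no longer decode as REP STOSB */
        assert(edi < sizeof code);  /* stay inside the modelled bytes */
        code[edi] = al;             /* STOSB: store AL at [EDI] */
        edi--;                      /* DF = 1: move downwards */
        ecx--;
    }
    return ecx;
}
```

in this model run_rep_stosb(1) stops as soon as the STOSB opcode is gone, and run_rep_stosb(0) burns through all six bytes, which matches the ECX = 1 / ECX = 0 results described above.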

Olly 1.10/2.00i and Soft-Ice also fail to trace the self-overwritten REP STOSB instruction. of course, if we trace the program by hand, it’s easy to set a breakpoint _after_ it and run the code without tracing, but!!! many plug-ins use the trace engine for their needs, so the trace engine should work fine. and it is possible to fix the debuggers – just before executing REP STOS/MOVS we have to perform some checks, and if the instruction overwrites itself we either set a breakpoint or emulate the CPU behavior.
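the check itself could look something like the sketch below. it is only an assumption about how a trace engine might do it, not code from any real debugger: the function name and all parameters are hypothetical, the values would come from the thread context and the disassembler, and address wrap-around at the ends of the 32-bit address space is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if a REP STOS/MOVS that is about to execute would write over
   its own instruction bytes, 0 otherwise. Hypothetical parameters:
   eip, insn_len - address and length of the REP instruction
   edi, ecx      - destination pointer and repeat count
   elem          - element size in bytes (1, 2 or 4)
   df            - direction flag (0 = upwards, 1 = downwards) */
int rep_overwrites_itself(uint32_t eip, uint32_t insn_len,
                          uint32_t edi, uint32_t ecx,
                          uint32_t elem, int df)
{
    uint64_t lo, hi;                        /* written range is [lo, hi) */
    uint64_t total = (uint64_t)ecx * elem;  /* bytes written in total */

    assert(elem == 1 || elem == 2 || elem == 4);
    if (ecx == 0 || insn_len == 0)
        return 0;
    if (df) {                               /* first element at EDI, then down */
        hi = (uint64_t)edi + elem;
        lo = hi > total ? hi - total : 0;
    } else {                                /* first element at EDI, then up */
        lo = edi;
        hi = lo + total;
    }
    /* two half-open ranges overlap if each one starts before the other ends */
    return lo < (uint64_t)eip + insn_len && (uint64_t)eip < hi;
}
```

for the test program (a 2-byte REP STOSB, EDI four bytes above it, ECX = 6, DF = 1) the function reports an overlap; with ECX = 4 only the NOPs are hit and it reports none.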

ok, run the program under BOCHS (IDA-Pro 5.4 supports a special plug-in, allowing us to debug code on the fly). regardless of whether we trace the program or not, the REP loop finishes with ECX = 1. the well-known x86-emu plug-in gives us the same result, and this result is definitely wrong.

by the way, did I hear a question: for how long does the CPU keep executing the overwritten REP STOS/MOVS instruction? there is no universal answer. it depends on the CPU’s internal behavior and on external events like interrupts. when an interrupt is generated, the CPU stops executing the overwritten REP STOS/MOVS. a good way to create a pseudo-random generator! I’m going to write about it in the next post.

meanwhile, some CPUs have a bug. they execute the CLD instruction _before_ the overwritten REP STOS/MOVS is stopped or finishes. as a result, REP STOS/MOVS changes direction and hits memory that was not supposed to be written. I’m investigating this case now and will publish the result soon.

well, let’s return to our muttons. load the program into IDA-Pro 5.3/5.4 and perform Step Over tracing. move the cursor to REP STOSB, press F8 and… the debugger loses control!!! why? the answer is: to gain control back after the REP STOSB instruction, IDA-Pro sets a software breakpoint on the next instruction. we all know that a software breakpoint is just an INT 03 (CCh) instruction, and in our case this instruction is overwritten by REP STOSB. thus the breakpoint is wiped out and the process runs until another breakpoint is triggered. if there are no other breakpoints – the debugged code escapes out of the debugger!

what about Olly 1.10 and Soft-Ice?! a quick check shows that they do not lose control and stop after REP STOSB. but… how do they do it? well, they just use hardware breakpoints!!! and if all four hardware breakpoints are in use, the debuggers set a software breakpoint like IDA-Pro does, so they lose control as well. not good!

Olly 2.00i (I didn’t check other versions) is the only debugger that is able to detect that the breakpoint was wiped out. if that happens we get a warning message. impressive! Olly 2.00 is a great debugger, no doubt!!!
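how Olly 2.00i actually detects the wiped breakpoint is not known to me, but the idea is easy to show on a toy model: remember where the CCh was planted and check that it is still there when control comes back. everything in the sketch below (soft_bp, bp_plant, bp_intact, stosb_down) is a hypothetical name invented for the illustration, and the "process memory" is just a byte array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { INT3 = 0xCC };

/* A planted software breakpoint in the toy model. */
typedef struct {
    size_t  offset;   /* where the CCh byte was written */
    uint8_t saved;    /* original byte, needed to remove the breakpoint */
} soft_bp;

/* Plants INT 03 at code[offset] and remembers the byte it replaced. */
soft_bp bp_plant(uint8_t *code, size_t offset)
{
    soft_bp bp = { offset, code[offset] };
    code[offset] = INT3;
    return bp;
}

/* Returns 1 if the CCh byte is still in place, 0 if something wiped it. */
int bp_intact(const uint8_t *code, const soft_bp *bp)
{
    return code[bp->offset] == INT3;
}

/* Models REP STOSB with DF = 1: writes ecx bytes of al from edi downwards. */
void stosb_down(uint8_t *code, size_t edi, size_t ecx, uint8_t al)
{
    assert(ecx <= edi + 1);   /* do not run below the start of the array */
    while (ecx-- > 0)
        code[edi--] = al;
}
```

plant the breakpoint on the byte right after REP STOSB (offset 2 in a six-byte array laid out like the test program), call stosb_down(code, 5, 6, 0x43), and bp_intact() returns 0: the CCh is gone, which is the moment to warn the user instead of waiting for a breakpoint that will never fire.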