在B站,公眾號上發了一篇 AOT 的文章後,沒想到反響還是挺大的,都稱讚這個東西能抗反編譯,可以讓破解難度極大提高,可能有很多朋友對逆向不瞭解,以為用 ILSpy
,Reflector
,DnSpy
這些工具打不開就覺得很安全,其實不然,在 OllyDbg
,IDA
,WinDBG
這些逆向工具面前一樣是裸奔。
既然大家都很感興趣,那這篇就和大家聊一聊。
修改程式資料在逆向中再正常不過了,由於目前的 AOT 只能釋出成 x64 ,這裡就用 WinDbg
做下演示,首先看下例子。
internal class Program
{
static void Main(string[] args)
{
while (true)
{
Console.WriteLine("hello world!");
Thread.Sleep(1000);
}
}
}
程式在不斷的輸出,接下來我們將 hello world
中的 world
給去掉,操作手法非常簡單,先記憶體搜尋找到 hello world
,然後修改 length=5 即可。
0:005> lm
start end module name
00007ff7`95b70000 00007ff7`95e5d000 ConsoleApp1 C (private pdb symbols)
0:005> s-u 00007ff7`95b70000 L?0x00007ff7`95e5d000 hello
00007ff7`95e1c41c 0068 0065 006c 006c 006f 0020 0077 006f h.e.l.l.o. .w.o.
0:000> dp 00007ff795e1c41c-0x4 L1
00007ff7`95e1c418 00650068`0000000c
0:000> eq 00007ff7`95e1c418 00650068`00000005
0:000> g
AOT 再怎麼牛,它終還是個託管程式,既然是託管程式自然就有託管堆,託管堆中就有所有的託管資料,玩過 SOS.dll
朋友應該知道,用 !eeheap -gc
就能把託管堆給顯示出來,比如下面這樣。
0:022> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x000002414D891030
generation 1 starts at 0x000002414D891018
generation 2 starts at 0x000002414D891000
ephemeral segment allocation context: none
segment begin allocated committed allocated size committed size
000002414D890000 000002414D891000 000002414D8D1FE8 000002414D8D2000 0x40fe8(266216) 0x41000(266240)
Large object heap starts at 0x000002415D891000
segment begin allocated committed allocated size committed size
000002415D890000 000002415D891000 000002415D891018 000002415D892000 0x18(24) 0x1000(4096)
Pinned object heap starts at 0x0000024165891000
0000024165890000 0000024165891000 0000024165899C10 00000241658A2000 0x8c10(35856) 0x11000(69632)
Total Allocated Size: Size: 0x49c10 (302096) bytes.
Total Committed Size: Size: 0x42000 (270336) bytes.
------------------------------
GC Allocated Heap Size: Size: 0x49c10 (302096) bytes.
GC Committed Heap Size: Size: 0x42000 (270336) bytes.
雖然目前的 AOT 不支援 SOS 擴充套件,無法顯示出託管堆,但一點關係都沒有,SOS 是通過 DataAccess 去挖的,大不來我手工挖一下就好了哈,接下來就是怎麼挖的問題了,熟悉 CLR 的朋友應該知道所謂的託管堆在內部用的是 generation_table[]
一維資料來維護的,以 代
的方式來劃分,代的落地是用 heap_segment
來表示的, 參考程式碼如下:
generation gc_heap::generation_table [total_generation_count];
enum gc_generation_num
{
// small object heap includes generations [0-2], which are "generations" in the general sense.
soh_gen0 = 0,
soh_gen1 = 1,
soh_gen2 = 2,
max_generation = soh_gen2,
// large object heap, technically not a generation, but it is convenient to represent it as such
loh_generation = 3,
// pinned heap, a separate generation for the same reasons as loh
poh_generation = 4,
uoh_start_generation = loh_generation,
// number of ephemeral generations
ephemeral_generation_count = max_generation,
// number of all generations
total_generation_count = poh_generation + 1
};
接下來用 x 命令看下陣列內容,程式碼如下:
0:000> x ConsoleApp1!WKS::gc_heap::generation_table
00007ff7`95e25010 ConsoleApp1!WKS::gc_heap::generation_table = class WKS::generation [5]
0:000> dx -r1 (*((ConsoleApp1!WKS::generation (*)[5])0x7ff795e25010))
(*((ConsoleApp1!WKS::generation (*)[5])0x7ff795e25010)) [Type: WKS::generation [5]]
[0] [Type: WKS::generation]
...
[4] [Type: WKS::generation]
0:000> dx -r1 (*((ConsoleApp1!WKS::generation *)0x7ff795e25010))
(*((ConsoleApp1!WKS::generation *)0x7ff795e25010)) [Type: WKS::generation]
[+0x038] start_segment : 0x25100000000 [Type: WKS::heap_segment *]
[+0x040] allocation_start : 0x25100001030 : 0x38 [Type: unsigned char *]
[+0x048] allocation_segment : 0x25100000000 [Type: WKS::heap_segment *]
[+0x0d0] allocation_size : 0x0 [Type: unsigned __int64]
[+0x100] gen_num : 0 [Type: int]
...
0:000> dx -r1 ((ConsoleApp1!WKS::heap_segment *)0x25100000000)
((ConsoleApp1!WKS::heap_segment *)0x25100000000) : 0x25100000000 [Type: WKS::heap_segment *]
[+0x000] allocated : 0x25100001048 : 0x90 [Type: unsigned char *]
[+0x008] committed : 0x25100012000 : Unable to read memory at Address 0x25100012000 [Type: unsigned char *]
[+0x010] reserved : 0x25110000000 : 0x18 [Type: unsigned char *]
[+0x018] used : 0x25100009fe0 : 0x0 [Type: unsigned char *]
[+0x020] mem : 0x25100001000 : 0x38 [Type: unsigned char *]
[+0x028] flags : 0x0 [Type: unsigned __int64]
[+0x030] next : 0x0 [Type: WKS::heap_segment *]
...
上面的這些欄位就描述出了 !eeheap -gc
的結果,接下來想挖什麼,提取什麼我就不過多介紹了。
提取 託管執行緒
列表也是非常重要的, 它能指示出很多資訊,一般用 !t
命令就能顯示,輸出如下:
0:022> !t
ThreadCount: 17
UnstartedThread: 0
BackgroundThread: 6
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
Lock
DBG ID OSID ThreadOBJ State GC Mode GC Alloc Context Domain Count Apt Exception
0 1 4128 000002414BDB8C70 2a020 Preemptive 000002414D8C6108:000002414D8C8000 000002414bdaf8f0 -00001 MTA
6 2 4458 000002414BDE5EB0 2b220 Preemptive 0000000000000000:0000000000000000 000002414bdaf8f0 -00001 MTA (Finalizer)
7 4 23e8 000002416DDB15C0 102b220 Preemptive 000002414D8C9250:000002414D8CA000 000002414bdaf8f0 -00001 MTA (Threadpool Worker)
...
20 17 50a8 000002416DE43DD0 102b220 Preemptive 000002414D8BC2D0:000002414D8BDFD0 000002414bdaf8f0 -00001 MTA (Threadpool Worker)
21 18 57d4 000002416DE628E0 8029220 Preemptive 000002414D8CC2A8:000002414D8CE000 000002414bdaf8f0 -00001 MTA (Threadpool Completion Port)
既然目前的 SOS 不支援,同樣可以手工到 CLR 中去挖,熟悉的朋友應該知道 !t
的資料來源來自於 ThreadStore::s_pThreadStore
下的 m_ThreadList
集合,它以連結串列的形式串聯了每個執行緒的 LinkPtr
欄位,但可惜的是,在 AOT 中,這一塊已經重寫了,由 g_pTheRuntimeInstance
全域性變數下的 m_ThreadList
來維護了。
為了方便觀察,多生成幾個 Thread。
static void Main(string[] args)
{
Debugger.Break();
var tasks = Enumerable.Range(0, 10).Select(m => new Thread(() =>
{
Console.WriteLine($"tid={Thread.CurrentThread.ManagedThreadId} 已執行!");
Console.ReadLine();
}));
foreach (var item in tasks)
{
item.Start();
}
Console.ReadLine();
}
程式跑起來後,深挖 g_pTheRuntimeInstance
全域性變數即可。
0:015> x ConsoleApp1!g_pTheRuntimeInstance
00007ff7`0155ee20 ConsoleApp1!g_pTheRuntimeInstance = 0x00000291`cb5b9300
0:015> dx -r1 ((ConsoleApp1!RuntimeInstance *)0x291cb5b9300)
((ConsoleApp1!RuntimeInstance *)0x291cb5b9300) : 0x291cb5b9300 [Type: RuntimeInstance *]
[+0x000] m_pThreadStore : 0x291cb5b9390 [Type: ThreadStore *]
...
0:015> dx -r1 ((ConsoleApp1!ThreadStore *)0x291cb5b9390)
((ConsoleApp1!ThreadStore *)0x291cb5b9390) : 0x291cb5b9390 [Type: ThreadStore *]
[+0x000] m_ThreadList [Type: SList<Thread,DefaultSListTraits<Thread,DoNothingFailFastPolicy> >]
[+0x008] m_pRuntimeInstance : 0x291cb5b9300 [Type: RuntimeInstance *]
[+0x010] m_Lock [Type: ReaderWriterLock]
0:015> dx -r1 (*((ConsoleApp1!SList<Thread,DefaultSListTraits<Thread,DoNothingFailFastPolicy> > *)0x291cb5b9390))
(*((ConsoleApp1!SList<Thread,DefaultSListTraits<Thread,DoNothingFailFastPolicy> > *)0x291cb5b9390)) [Type: SList<Thread,DefaultSListTraits<Thread,DoNothingFailFastPolicy> >]
[+0x000] m_pHead : 0x291ed366240 [Type: Thread *]
0:015> dx -r1 ((ConsoleApp1!Thread *)0x291ed366240)
((ConsoleApp1!Thread *)0x291ed366240) : 0x291ed366240 [Type: Thread *]
...
[+0x058] m_pNext : 0x291cb6aeb60 [Type: Thread *]
[+0x060] m_hPalThread : 0x204 [Type: void *]
[+0x068] m_ppvHijackedReturnAddressLocation : 0x0 [Type: void * *]
[+0x070] m_pvHijackedReturnAddress : 0x0 [Type: void *]
[+0x078] m_uHijackedReturnValueFlags : 0x0 [Type: unsigned __int64]
[+0x080] m_pExInfoStackHead : 0x0 [Type: ExInfo *]
[+0x088] m_threadAbortException : 0x0 [Type: Object *]
[+0x090] m_pThreadLocalModuleStatics : 0x291cb6aee90 [Type: void * *]
[+0x098] m_numThreadLocalModuleStatics : 0x1 [Type: unsigned int]
[+0x0a0] m_pGCFrameRegistrations : 0x0 [Type: GCFrameRegistration *]
[+0x0a8] m_pStackLow : 0xf754100000 [Type: void *]
[+0x0b0] m_pStackHigh : 0xf754200000 [Type: void *]
[+0x0b8] m_pTEB : 0xf7533ba000 : 0x0 [Type: unsigned char *]
[+0x0c0] m_uPalThreadIdForLogging : 0x2044 [Type: unsigned __int64]
[+0x0c8] m_threadId [Type: EEThreadId]
[+0x0d0] m_pThreadStressLog : 0x0 [Type: void *]
[+0x0d8] m_interruptedContext : 0x0 [Type: _CONTEXT *]
[+0x0e0] m_redirectionContextBuffer : 0x0 [Type: unsigned char *]
0:015> dx -r1 (*((ConsoleApp1!EEThreadId *)0x291ed366308))
(*((ConsoleApp1!EEThreadId *)0x291ed366308)) [Type: EEThreadId]
[+0x000] m_uiId : 0x2044 [Type: unsigned __int64]
從CLR 的 Thread 維護的資訊來看,這個結構體已經很小了,也說明 AOT 在Thread資訊維護上做了很多的精簡。
總的來說,AOT 確實能加速程式的初始啟動,一體化的打包機制也非常方便部署,但怎麼變終究還是一個託管程式,需要底層的 C++ 託著它,扛 反編譯 無從談起,所以防小人的話,該加殼的加殼,該混淆的混淆。